Artificial intelligence (AI)

OISMA: On-the-Fly In-Memory Stochastic Multiplication Architecture for Approximate Matrix Multiplication

OISMA: On-the-Fly In-Memory Stochastic Multiplication Architecture for Approximate Matrix Multiplication 150 150

Abstract:

Artificial intelligence (AI) models are currently driven by a significant upscaling of their complexity, with massive matrix-multiplication workloads representing the major computational bottleneck. In-memory computing (IMC) architectures are proposed to avoid the von Neumann bottleneck. However, both digital/binary-based and analog IMC architectures suffer from various limitations, which significantly degrade …

View on IEEE Xplore

A 28-nm FD-SOI CMOS Analog-IMC Core Based on PCM Featuring 8 512 × 512-Weight Layers and 28M Weights×TOPs/W/mm2

A 28-nm FD-SOI CMOS Analog-IMC Core Based on PCM Featuring 8 512 × 512-Weight Layers and 28M Weights×TOPs/W/mm2 150 150

Abstract:

In-memory computing (IMC) hardware accelerators for deep neural networks (DNNs) require storing a massive number of coefficients within a single computing macro to avoid performance degradation in multicore clusters. This aspect, often overlooked by common figures of merit (FoMs), can be effectively addressed by phase-change memory (PCM) technology, thanks to …

View on IEEE Xplore

A 28-nm Digital Transpose SRAM Compute-in-Memory Macro With Accurate/Approximate Dual Mode for Floating-Point Edge Training and Inference

A 28-nm Digital Transpose SRAM Compute-in-Memory Macro With Accurate/Approximate Dual Mode for Floating-Point Edge Training and Inference 150 150

Abstract:

Static random-access memory (SRAM)-based computing-in-memory (CIM) macros have been widely studied to improve the energy efficiency of edge artificial intelligence (AI) inference tasks. However, less attention has been given to AI training, which requires CIM macros to not only perform matrix multiply-accumulate (MAC) operations but also support matrix transposition. …

View on IEEE Xplore

A 14-nm Nonvolatile-Volatile-Fused Compute-In-Memory Macro Based on Logic-Compatible Flash for Plastic Neural Networks

A 14-nm Nonvolatile-Volatile-Fused Compute-In-Memory Macro Based on Logic-Compatible Flash for Plastic Neural Networks 150 150

Abstract:

Designing computing-in-memory (CIM) chips with synaptic plasticity can potentially support energy-efficient on-chip learning in edge devices for rapid local task adaptation. Its silicon implementation is challenging as it requires hybridizing nonvolatile and volatile memory (VM) and customized computational operations. In this work, we propose a plastic CIM (P-CIM) macro featuring: 1) …

View on IEEE Xplore

A Microscaling Multi-Mode Gain-Cell Computing-in-Memory Macro for Advanced AI Edge Device

A Microscaling Multi-Mode Gain-Cell Computing-in-Memory Macro for Advanced AI Edge Device 150 150

Abstract:

The microscaling (MX) format is an emerging data representation that quantizes high-bitwidth floating-point (FP) values into low-bitwidth FP-like values with a shared-scale (SS) exponent. When implemented with computing-in-memory (CIM), MX allows an attractive tradeoff between accuracy and hardware efficiency for specific neural network (NN) workloads. This work presents the first …

View on IEEE Xplore