Computing-in-memory (CIM)

A 28-nm Digital Transpose SRAM Compute-in-Memory Macro With Accurate/Approximate Dual Mode for Floating-Point Edge Training and Inference

Abstract:

Static random-access memory (SRAM)-based computing-in-memory (CIM) macros have been widely studied to improve the energy efficiency of edge artificial intelligence (AI) inference tasks. However, less attention has been given to AI training, which requires CIM macros to not only perform matrix multiply-accumulate (MAC) operations but also support matrix transposition. …

View on IEEE Xplore

EMO-CIM: An Input/Stationary-Data Similarity-Aware Computing-In-Memory Design for Variable Vector-Wise Computation in Edge Multioperator AI Acceleration

Abstract:

We propose an edge multioperator computing-in-memory (EMO-CIM) design that supports variable vector-wise multiply-and-accumulate (MAC) in CNN, Depthwise (DW)-Convolution, and Attention operators. It features: 1) a single EMO-CIM bank (ECB) excels in variable vector-wise MAC (V-MAC) for multioperators; 2) merging local input-shared compute units (LISCUs) with a decode-unit and adder-tree (DUAT) facilitates …

View on IEEE Xplore

A 14-nm Nonvolatile-Volatile-Fused Compute-In-Memory Macro Based on Logic-Compatible Flash for Plastic Neural Networks

Abstract:

Designing computing-in-memory (CIM) chips with synaptic plasticity can potentially support energy-efficient on-chip learning in edge devices for rapid local task adaptation. Its silicon implementation is challenging as it requires hybridizing nonvolatile and volatile memory (VM) and customized computational operations. In this work, we propose a plastic CIM (P-CIM) macro featuring: 1) …

View on IEEE Xplore

MixCIM: A Hybrid Computing-in-Memory Macro With Less Data-Movement and Better Memory-Reuse for Depthwise Separable Neural Networks

Abstract:

Computing-in-memory (CIM) architectures have demonstrated strong potential for edge artificial intelligence (AI) devices due to their enhanced parallelism and energy efficiency. With the growing complexity of AI tasks and the rapid increase in model size, computation and deployment costs have surged. Depthwise separable neural networks (DSNNs) have attracted interest for …

View on IEEE Xplore

AACIM: A 2785-TOPS/W, 161-TOP/mm2, <1.17%-RMSE, Analog-In Analog-Out Computing-In-Memory Macro in 28 nm

Abstract:

This article presents an analog-in analog-out CIM macro (AACIM) for use in analog deep neural network (DNN) processors. Our macro receives analog inputs, performs a 64-by-32 vector–matrix multiplication (VMM) with a current-discharging computation mechanism, and produces analog outputs. It stores a 4-bit weight as an analog voltage in the …

View on IEEE Xplore

A Microscaling Multi-Mode Gain-Cell Computing-in-Memory Macro for Advanced AI Edge Device

Abstract:

The microscaling (MX) format is an emerging data representation that quantizes high-bitwidth floating-point (FP) values into low-bitwidth FP-like values with a shared-scale (SS) exponent. When implemented with computing-in-memory (CIM), MX allows an attractive tradeoff between accuracy and hardware efficiency for specific neural network (NN) workloads. This work presents the first …

View on IEEE Xplore

A 28-nm Computing-in-Memory Processor With Zig-Zag Backbone-Systolic CIM and Block-/Self-Gating CAM for NN/Recommendation Applications

Abstract:

Computing-in-memory (CIM) chips have demonstrated promising energy efficiency for artificial intelligence (AI) applications such as neural networks (NNs), Transformer, and recommendation system (RecSys). However, several challenges still exist. First, a large gap between the macro and system-level CIM energy efficiency is observed. Second, several memory-dominate operations, such as embedding in …

View on IEEE Xplore