In-memory computing

DPe-CIM: A 4T-1C Dual-Port eDRAM-Based Compute-in-Memory for Simultaneous Computing and Refresh With Adaptive Refresh and Data Conversion Reduction Scheme

Abstract:

This article presents DPe-CIM, a 4T-1C dual-port embedded dynamic random access memory (eDRAM)-based compute-in-memory (CIM) macro with adaptive refresh and data conversion reduction. DPe-CIM proposes four key features that improve area and energy efficiency: 1) dual-port eDRAM cell (DPC) separates the multiply-and-accumulate (MAC) and refresh ports, enabling simultaneous MAC …

AACIM: A 2785-TOPS/W, 161-TOP/mm2, <1.17%-RMSE, Analog-In Analog-Out Computing-In-Memory Macro in 28 nm

Abstract:

This article presents an analog-in analog-out CIM macro (AACIM) for use in analog deep neural network (DNN) processors. Our macro receives analog inputs, performs a 64-by-32 vector–matrix multiplication (VMM) with a current-discharging computation mechanism, and produces analog outputs. It stores a 4-bit weight as an analog voltage in the …

A Microscaling Multi-Mode Gain-Cell Computing-in-Memory Macro for Advanced AI Edge Device

Abstract:

The microscaling (MX) format is an emerging data representation that quantizes high-bitwidth floating-point (FP) values into low-bitwidth FP-like values with a shared-scale (SS) exponent. When implemented with computing-in-memory (CIM), MX allows an attractive tradeoff between accuracy and hardware efficiency for specific neural network (NN) workloads. This work presents the first …
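
The shared-scale idea in this abstract can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the macro's actual data path: the elements here are signed integers rather than FP-like values, the shared scale is a single power-of-two exponent per block, and the function names `mx_quantize` and `mx_dequantize` are hypothetical.

```python
import math


def mx_quantize(block, elem_bits=8):
    """Quantize a block of floats to signed integers that share one
    power-of-two scale (assumption: integer elements, not FP-like ones)."""
    qmax = 2 ** (elem_bits - 1) - 1
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        # An all-zero block needs no scaling.
        return 0, [0] * len(block)
    # Smallest exponent for which the largest magnitude still fits in qmax.
    shared_exp = math.ceil(math.log2(amax / qmax))
    scale = 2.0 ** shared_exp
    # Round each element to the nearest step and clamp to the element range.
    q = [max(-qmax, min(qmax, round(x / scale))) for x in block]
    return shared_exp, q


def mx_dequantize(shared_exp, q):
    """Reconstruct approximate floats from the shared exponent and elements."""
    scale = 2.0 ** shared_exp
    return [v * scale for v in q]


block = [0.5, -1.0, 0.25, 0.75]
shared_exp, q = mx_quantize(block)
restored = mx_dequantize(shared_exp, q)
```

Because every element in the block reuses the same exponent, only the low-bitwidth elements need per-value storage, which is the accuracy versus hardware-efficiency tradeoff the abstract refers to.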

A 28-nm Computing-in-Memory Processor With Zig-Zag Backbone-Systolic CIM and Block-/Self-Gating CAM for NN/Recommendation Applications

Abstract:

Computing-in-memory (CIM) chips have demonstrated promising energy efficiency for artificial intelligence (AI) applications such as neural networks (NNs), Transformers, and recommendation systems (RecSys). However, several challenges remain. First, there is a large gap between macro-level and system-level CIM energy efficiency. Second, several memory-dominated operations, such as embedding in …

FIMA: A Scalable Ferroelectric Compute-in-Memory Annealer for Accelerating Boolean Satisfiability

Abstract:

In-memory compute kernels present a promising approach for addressing data-centric workloads. However, their scalability—particularly for computationally intensive tasks solving combinatorial optimization problems such as Boolean satisfiability (SAT), which are inherently difficult to decompose—remains a significant challenge. In this work, we propose a ferroelectric nonvolatile memory (NVM)-based compute-in-memory …

MIX-ACIM: A 28-nm Mixed-Precision Analog Compute-in-Memory With Digital Feature Restoration for Vector-Matrix Multiplication

Abstract:

A mixed-precision analog compute-in-memory (Mix-ACIM) is presented for mixed-precision vector-matrix multiplication (VMM). The design features an all-analog current-domain fixed-point (FxP) VMM with floating-point conversion and feature restoration. A 28 nm CMOS test chip shows 41 TOPS/W and 24 TOPS/mm2 for FxP (8-bit input/weight and 12-bit output) and 24.18 TFLOPS/W and 3.3 …

Device Nonideality-Aware Compute-in-Memory Array Architecting: Direct Voltage Sensing, I–V Symmetric Bitcell, and Padding Array

Abstract:

A voltage-sensing compute-in-memory (CIM) architecture has been designed to improve analog computing accuracy, and a chip has been successfully fabricated on a 90-nm flash platform, with bidirectional operation enabled by the symmetric bitcell structure. By padding the weight sum to a global value for all bit lines (BLs), …
