compute-in-memory (CIM)

A 28-nm PVT Inner-Tracking Time-Domain Compute-In-Memory Macro for Edge-AI Devices

A 28-nm PVT Inner-Tracking Time-Domain Compute-In-Memory Macro for Edge-AI Devices 150 150

Abstract:

This article presents an energy-efficient and process-, voltage-, and temperature (PVT)-robust time-domain (TD) compute-in-memory (CIM) macro for edge artificial intelligence (AI) devices. It features: 1) a PVT inner-tracking (PIT) technique that aligns the PVT responses of TD computation and TD quantization, delivering inherent robustness without incurring extra power or circuit …

View on IEEE Xplore

Denim: Heterogeneous Compute-in-Memory Accelerator Exploiting Denoising–Similarity for Diffusion Models

Denim: Heterogeneous Compute-in-Memory Accelerator Exploiting Denoising–Similarity for Diffusion Models 150 150

Abstract:

Diffusion models have recently revolutionized the field of image synthesis due to their ability to generate photorealistic images. However, one of the main drawbacks of diffusion models is that the image generation process is expensive. Large image-to-image networks have to be applied multiple times in order to iteratively optimize the …

View on IEEE Xplore

PERCEL: A Rewritable NVM CIM Incorporating a CTT-Based Per-Cell DAC

PERCEL: A Rewritable NVM CIM Incorporating a CTT-Based Per-Cell DAC 150 150

Abstract:

Compute-in-memory (CiM) accelerators perform matrix vector multiplications (MVMs) directly inside memory arrays, reducing data movement and improving both energy efficiency and throughput for artificial intelligence (AI) workloads. To reduce the number of conversions, recent designs use multibit compute cells. Nevertheless, practical multibit CiM still faces a tension among accuracy, efficiency, …

View on IEEE Xplore

A Folded-Differential Switched-Capacitor SRAM CIM Macro With Scalable MAC Sizes for TinyML Inference

A Folded-Differential Switched-Capacitor SRAM CIM Macro With Scalable MAC Sizes for TinyML Inference 150 150

Abstract:

This letter presents a switched-capacitor SRAM compute-in-memory macro optimized for TinyML inference. Key features include: 1) an area-efficient folded-differential multiply-and-accumulate (FD-MAC) scheme to double the signal margin; 2) a closed-loop floating-inverter amplifier (FIA)-based charge accumulation technique for signal-to-noise ratio enhancement and multiply-and-accumulate (MAC) voltage integration; and 3) a sparsity-aware multistep MAC method …

View on IEEE Xplore

An Approximate Digital CIM Macro With Low-Power Multiply-Add Units and Dynamic Sparse-Adaptive Configuring for Edge AI Inference

An Approximate Digital CIM Macro With Low-Power Multiply-Add Units and Dynamic Sparse-Adaptive Configuring for Edge AI Inference 150 150

Abstract:

This letter presents an approximate digital compute-in-memory (CIM) macro for low-power edge AI inference. It introduces three hierarchical innovations: 1) novel fused approximate multiply-add units (FAMUs) that reduces power and area consumption; 2) a bit-critical weight allocation architecture that optimally balances accuracy and hardware cost; and 3) a dynamic sparsity-adaptive configuration method to …

View on IEEE Xplore

A 28-nm FeFET Compute-in-Memory Macro With 64×64 Array Size and On-Chip 4-Bit Flash ADC

A 28-nm FeFET Compute-in-Memory Macro With 64×64 Array Size and On-Chip 4-Bit Flash ADC 150 150

Abstract:

Compute-in-memory (CIM) using emerging nonvolatile memory devices is a promising candidate for energy-efficient deep neural network (DNN) inference at the edge. Ferroelectric field-effect transistors (FeFETs) have recently gained attention as nonvolatile, CMOS-compatible devices with a higher on/off ratio and lower read and write energy compared to resistive random-access memory (…

View on IEEE Xplore

DPe-CIM: A 4T-1C Dual-Port eDRAM-Based Compute-in-Memory for Simultaneous Computing and Refresh With Adaptive Refresh and Data Conversion Reduction Scheme

DPe-CIM: A 4T-1C Dual-Port eDRAM-Based Compute-in-Memory for Simultaneous Computing and Refresh With Adaptive Refresh and Data Conversion Reduction Scheme 150 150

Abstract:

This article presents DPe-CIM, a 4T-1C dual-port embedded dynamic random access memory (eDRAM)-based compute-in-memory (CIM) macro with adaptive refresh and data conversion reduction. DPe-CIM proposes four key features that improve area and energy efficiency: 1) dual-port eDRAM cell (DPC) separates the multiply-and-accumulate (MAC) and refresh ports, enabling simultaneous MAC …

View on IEEE Xplore

1.58-b FeFET-Based Ternary Neural Networks: Achieving Robust Compute-In-Memory With Weight-Input Transformations

1.58-b FeFET-Based Ternary Neural Networks: Achieving Robust Compute-In-Memory With Weight-Input Transformations 150 150

Abstract:

Ternary weight neural networks (TWNs), with weights quantized to three states (−1, 0, and 1), have emerged as promising solutions for resource-constrained edge artificial intelligence (AI) platforms due to their high energy efficiency with acceptable inference accuracy. Further energy savings can be achieved with TWN accelerators utilizing techniques such as compute-in-memory (CiM) and …

View on IEEE Xplore

Understanding Reliability Trade-Offs in 1T-nC and 2T-nC FeRAM Designs

Understanding Reliability Trade-Offs in 1T-nC and 2T-nC FeRAM Designs 150 150

Abstract:

Ferroelectric random access memory (FeRAM) is a promising candidate for energy-efficient nonvolatile memory, particularly for logic-in-memory and compute-in-memory (CIM) applications. Among the available cell architectures, One-Transistor–n-Capacitor (1T-nC) and two-transistor–n-capacitor (2T-nC) FeRAMs each offer distinct trade-offs in density, scalability, and reliability. In this work, we present a comparative study …

View on IEEE Xplore