Accuracy

An Approximate Digital CIM Macro With Low-Power Multiply-Add Units and Dynamic Sparse-Adaptive Configuring for Edge AI Inference

Abstract:

This paper presents an approximate digital compute-in-memory (CIM) macro for low-power edge AI inference. It introduces three hierarchical innovations: 1) novel fused approximate multiply-add units (FAMUs) that reduce power and area consumption; 2) a bit-critical weight allocation architecture that optimally balances accuracy and hardware cost; and 3) a dynamic sparsity-adaptive configuration method to …

View on IEEE Xplore

A 16 V-Output Switched-Capacitor Sigma Converter With 180 ns Transient Response and 94% Efficiency for LiDAR Receivers

Abstract:

This article presents a step-up switched-capacitor (SC) sigma converter with fast transient, high efficiency, and high accuracy for light detection and ranging (LiDAR) receiver applications. The proposed sigma converter combines an unregulated SC converter in the high-voltage (HV) domain and a low-dropout (LDO) regulator in the low-voltage (LV) domain, achieving …

View on IEEE Xplore

A Cryo-CMOS Smart Temperature Sensor for the Ultrawide Temperature Range From 5 K to 296 K

Abstract:

This work presents a cryo-CMOS smart temperature sensor operating from room temperature down to 5 K. By adopting sensing elements (CMOS bulk diodes, pMOS/DTMOS in weak inversion) that circumvent the poor cryogenic performance of Si BJTs, a robust switched-capacitor second-order sigma–delta readout and cryogenic-aware design techniques, the sensor achieves …

View on IEEE Xplore

A 3 nm FinFET 125 TOPS/W-29 TFLOPS/W, 90 TOPS/mm²-17 TFLOPS/mm² SRAM-Based INT8, and FP16 Digital-CIM Compiler With Support for Multi-Weight Update/Cycle

Abstract:

This article presents a static random-access memory (SRAM)-based digital compute-in-memory (CIM) compiler implemented with 3 nm high-$\kappa$ metal gate (HKMG) FinFET technology, supporting flexible INT8 and FP16 formats for weight and activation multiply-accumulate (MAC) operations, offering configuration flexibility, high accuracy, and improved area and power efficiency. The FP16 digital …

View on IEEE Xplore

A 28-nm FeFET Compute-in-Memory Macro With 64×64 Array Size and On-Chip 4-Bit Flash ADC

Abstract:

Compute-in-memory (CIM) using emerging nonvolatile memory devices is a promising candidate for energy-efficient deep neural network (DNN) inference at the edge. Ferroelectric field-effect transistors (FeFETs) have recently gained attention as nonvolatile, CMOS-compatible devices with a higher on/off ratio and lower read and write energy compared to resistive random-access memory (…

View on IEEE Xplore

SparseCol: A 1320 BTOPS/W Precision-Scalable NPU Exploiting Training-Free Structured Bit-Level Sparsity and Dynamic Dataflow

Abstract:

Bit-serial computation enables sequential processing of data at the bit level, providing several advantages, such as scalable computational precision. This approach has gained significant attention, especially for exploiting bit-level sparsity (BLS) in AI workloads. While current bit-serial processors leverage BLS to eliminate the computation associated with zero bits, they face …

View on IEEE Xplore

Advancing On-Cell Near-Field Monitoring for Thermal Runaway Detection in EV Batteries

Abstract:

A cell monitoring system for performance and safety enhancement is presented. It is the first commercially available single-chip-on-cell near-field contactless solution for automotive battery management, simplifying pack interconnect and reducing points of failure. This letter is a companion paper to the earlier ISSCC paper. It provides further details on the …

View on IEEE Xplore

MEGA.mini: An Energy-Efficient NPU Leveraging a Novel Big/Little Core With Hybrid Input Activation for Generative AI Acceleration

Abstract:

This article presents a processor for the acceleration of generative AI (GenAI) based on a novel heterogeneous core architecture called MEGA.mini. The processor introduces three algorithmic features: 1) fixed-point (FXP) and floating-point (FP) hybrid input activation (IA) representation; 2) a delayed-statistics-based normalization (NORM); and 3) conditional polynomial-based nonlinear activation (NLA) approximation. These …

View on IEEE Xplore

DPe-CIM: A 4T-1C Dual-Port eDRAM-Based Compute-in-Memory for Simultaneous Computing and Refresh With Adaptive Refresh and Data Conversion Reduction Scheme

Abstract:

This article presents DPe-CIM, a 4T-1C dual-port embedded dynamic random access memory (eDRAM)-based compute-in-memory (CIM) macro with adaptive refresh and data conversion reduction. DPe-CIM proposes four key features that improve area and energy efficiency: 1) dual-port eDRAM cell (DPC) separates the multiply-and-accumulate (MAC) and refresh ports, enabling simultaneous MAC …

View on IEEE Xplore