Logic

An Approximate Digital CIM Macro With Low-Power Multiply-Add Units and Dynamic Sparse-Adaptive Configuring for Edge AI Inference

An Approximate Digital CIM Macro With Low-Power Multiply-Add Units and Dynamic Sparse-Adaptive Configuring for Edge AI Inference 150 150

Abstract:

This paper presents an approximate digital compute-in-memory (CIM) macro for low-power edge AI inference. It introduces three hierarchical innovations: 1) novel fused approximate multiply-add units (FAMUs) that reduces power and area consumption; 2) a bit-critical weight allocation architecture that optimally balances accuracy and hardware cost; and 3) a dynamic sparsity-adaptive configuration method to …

View on IEEE Xplore

A 3-nm FinFET 563-kbit 35.5-Mbit/mm2 Dual-Rail SRAM With 3.89-pJ/Access High Energy Efficient and 27.5-μW/Mbit One-Cycle Latency Low-Leakage Mode

A 3-nm FinFET 563-kbit 35.5-Mbit/mm2 Dual-Rail SRAM With 3.89-pJ/Access High Energy Efficient and 27.5-μW/Mbit One-Cycle Latency Low-Leakage Mode 150 150

Abstract:

This article presents a high-density (HD) 6T SRAM macro designed in 3-nm FinFET technology with an extended dual-rail (XDR) architecture, addressing active energy and leakage for mobile applications. Two key innovations are introduced: the delayed-wordline in write operation (DEWL) technique and a one-cycle latency low-leakage access mode (1-CLM). The XDR …

View on IEEE Xplore

A 500 MS/s Robust 2b/cycle Pipelined-SAR ADC Achieving 64.6-dB SNDR and 82.6-dB SFDR With Linearity Enhancement Techniques

A 500 MS/s Robust 2b/cycle Pipelined-SAR ADC Achieving 64.6-dB SNDR and 82.6-dB SFDR With Linearity Enhancement Techniques 150 150

Abstract:

This letter presents a 14-bit 500-MS/s 3-stage pipelined successive approximation register (SAR) analog-to-digital converter (ADC). By exploiting robust 2b/cycle SAR ADCs, this ADC incorporates significant voltage and time redundancy. High SFDR is achieved through several linearity enhancement techniques. First, a DAC splitting technique addresses the common-mode voltage matching …

View on IEEE Xplore

A High-Speed D-FF and a 11-Bit Up-Down Counter Using Unipolar Oxide TFTs on a Flexible Foil

A High-Speed D-FF and a 11-Bit Up-Down Counter Using Unipolar Oxide TFTs on a Flexible Foil 150 150

Abstract:

This manuscript presents an experimental characterization of a novel high speed D flip-flop (D-FF). The circuit was fabricated on a $27\mu $ m thick flexible polyimide substrate using a nMOS only, single gate amorphous Indium-Gallium-Zinc-Oxide (a-IGZO) thin-film transistor (TFT) technology. Reliable response of the D-FF was noticed from measurements up to …

View on IEEE Xplore

Energy-Efficient Logic-in-Memory and Neuromorphic Computing in Raised Source and Drain MOSFETs

Energy-Efficient Logic-in-Memory and Neuromorphic Computing in Raised Source and Drain MOSFETs 150 150

Abstract:

This work highlights the potential application of raised source and drain (RSD) MOSFETs-based charge trapping memory (CTM) for next-generation computing applications. This simulation study presents a double-gate (DG)-RSD MOSFET technology with a short gate length (50 nm) to significantly improve the performance of logic-in-memory (LIM) and neuromorphic computing (NC) systems. …

View on IEEE Xplore

DPe-CIM: A 4T-1C Dual-Port eDRAM-Based Compute-in-Memory for Simultaneous Computing and Refresh With Adaptive Refresh and Data Conversion Reduction Scheme

DPe-CIM: A 4T-1C Dual-Port eDRAM-Based Compute-in-Memory for Simultaneous Computing and Refresh With Adaptive Refresh and Data Conversion Reduction Scheme 150 150

Abstract:

This article presents DPe-CIM, a 4T-1C dual-port embedded dynamic random access memory (eDRAM)-based compute-in-memory (CIM) macro with adaptive refresh and data conversion reduction. DPe-CIM proposes four key features that improve area and energy efficiency: 1) dual-port eDRAM cell (DPC) separates the multiply-and-accumulate (MAC) and refresh ports, enabling simultaneous MAC …

View on IEEE Xplore

A 0.8-μm 32-Mpixel Always-On CMOS Image Sensor With Windmill-Pattern Edge Extraction and On-Chip DNN

A 0.8-μm 32-Mpixel Always-On CMOS Image Sensor With Windmill-Pattern Edge Extraction and On-Chip DNN 150 150

Abstract:

This letter presents a CMOS image sensor (CIS) that integrates two operation modes: 1) a high-resolution viewing mode with $0.8~\mu $ m 32 Mpixels and 2) a low-power always-on object recognition mode consuming 2.67 mW at 10 frames/s. The CIS features a unique windmill-pattern analog edge extraction circuit that is resilient to illumination variations. An …

View on IEEE Xplore

A RISC-V SoC With Reconfigurable Custom Instructions on a Synthesized eFPGA Fabric in 22nm FinFET

A RISC-V SoC With Reconfigurable Custom Instructions on a Synthesized eFPGA Fabric in 22nm FinFET 150 150

Abstract:

This letter presents a flexible and energy-efficient RISC-V system-on-chip (SoC) in 22nm FinFET technology, achieving state-of-the-art performance by tightly integrating the CPU with a synthesized embedded FPGA (embedded field programmable gate array (eFPGA)), enabling the implementation of reconfigurable custom instructions. The tight integration of the eFPGA with SoC scratchpad memory …

View on IEEE Xplore

3-D Stacked HBM and Compute Accelerators for LLM: Optimizing Thermal Management and Power Delivery Efficiency

3-D Stacked HBM and Compute Accelerators for LLM: Optimizing Thermal Management and Power Delivery Efficiency 150 150

Abstract:

Advanced packaging is becoming essential for designing hardware accelerators for large language models (LLMs). Different architectures, such as 2.5-D integration of memory with logic, have been proposed; however, the bandwidth limits the throughput of the complete system. Recent works have proposed memory on logic systems, where high bandwidth memory (HBM) …

View on IEEE Xplore