Random access memory

An Electrophysiology-Optogenetics Closed-Loop Bi-Directional Neural Interface for Sleep Regulation With 0.2-μJ/class Multiplexer-Based Neural Network

Abstract:

This work proposed a multiplexer-based neural network (MUXnet), a multiplier-free neural network (NN) structure applicable to the implementation of all inner-product-based NN layers. An on-chip MUXnet-based neural signal processing unit (NSPU) was designed, achieving a state-of-the-art accuracy of 82.4% on a public human sleep staging dataset, with the lowest …

View on IEEE Xplore

A 3-D HBI Compliant 1.536 TB/s/mm² Bandwidth Scalable Attention Accelerator With 22.5-GOPS Throughput High Speed SoftMax for Quantized Transformers in Intel 3

Abstract:

This letter presents a novel hardware accelerator compatible with <3-μm pitch 3-D Cu-Cu hybrid bonding interconnect (HBI) technology, particularly designed to efficiently execute multihead attention (MHA) of encoder transformer models. We present an accelerator that addresses performance losses due to low precision models by incorporating specialized hardware optimizations …

View on IEEE Xplore

A 14-nm Nonvolatile-Volatile-Fused Compute-In-Memory Macro Based on Logic-Compatible Flash for Plastic Neural Networks

Abstract:

Designing computing-in-memory (CIM) chips with synaptic plasticity can potentially support energy-efficient on-chip learning in edge devices for rapid local task adaptation. Its silicon implementation is challenging as it requires hybridizing nonvolatile and volatile memory (VM) and customized computational operations. In this work, we propose a plastic CIM (P-CIM) macro featuring: 1) …

View on IEEE Xplore

A Standalone-in-Memory Voltage Crossover-Based Assist Switching Circuit for Reliable and Efficient Process Tracking Memory Vmin Improvement in Intel 18A-RibbonFET Technology

Abstract:

Advanced CMOS memory requires voltage biasing assist techniques to achieve low operating voltages (Vmin), which must be deactivated at higher voltages for high electric field reliability. Centralized power management unit (PMU) control signals face timing synchronization and process tracking challenges when distributed across cores to activate assist circuits in various …

View on IEEE Xplore

A 3 nm FinFET 125 TOPS/W-29 TFLOPS/W, 90 TOPS/mm²-17 TFLOPS/mm² SRAM-Based INT8, and FP16 Digital-CIM Compiler With Support for Multi-Weight Update/Cycle

Abstract:

This article presents a static random-access memory (SRAM)-based digital compute-in-memory (CIM) compiler implemented with 3 nm high-κ metal gate (HKMG) FinFET technology, supporting flexible INT8 and FP16 formats for weight and activation multiply-accumulate (MAC) operations, offering configuration flexibility, high accuracy, and improved area and power efficiency. The FP16 digital …

View on IEEE Xplore

An Energy-Efficient CNN Processor Supporting Bi-Directional FPN for Small-Object Detection on High-Resolution Videos in 16-nm FinFET

Abstract:

The capability to detect small objects precisely in real time is essential for intelligent systems, particularly in advanced driver assistance systems (ADASs), as it ensures continuous awareness of distant obstacles for enhanced safety. However, achieving high detection precision for small objects requires high-resolution input inference on deep convolutional neural network (…

View on IEEE Xplore

A Multicore Programmable Variable-Precision Near-Memory Accelerator for CNN and Transformer Models

Abstract:

Convolutional neural networks (CNNs) and transformers are the most popular neural network models in computer vision (CV) and natural language processing (NLP). It is quite common to use both models in multimodal scenarios, such as text-to-image generation. However, these two models have very different memory mappings, dataflows and …

View on IEEE Xplore

SHINSAI: A 586 mm² Reusable Active TSV Interposer With Programmable Interconnect Fabric and 512 Mb Underdeck Memory

Abstract:

This article presents SHINSAI, a 586 mm² reusable active through-silicon via (TSV) interposer addressing key challenges in multi-chiplet integration (MCI) architectures. While active interposers overcome fundamental limitations of passive counterparts by integrating functional circuitry, existing solutions face three critical constraints: 1) non-recurring engineering (NRE) costs from application-specific interposers negating chiplet reuse benefits; 2) …

View on IEEE Xplore

A 0.8-μm 32-Mpixel Always-On CMOS Image Sensor With Windmill-Pattern Edge Extraction and On-Chip DNN

Abstract:

This letter presents a CMOS image sensor (CIS) that integrates two operation modes: 1) a high-resolution viewing mode with 0.8-μm 32 Mpixels and 2) a low-power always-on object recognition mode consuming 2.67 mW at 10 frames/s. The CIS features a unique windmill-pattern analog edge extraction circuit that is resilient to illumination variations. An …

View on IEEE Xplore