Modeling

STAR-SRAM: 16-bit Floating-Point SRAM-Based Digital Computing-in-Memory Macro in a 28 nm

Abstract:

A digital computing-in-memory (DCIM) macro emerges as a promising building block in a deep neural network (DNN) accelerator. To better support DNN workloads, circuit designers aim to improve three main metrics for macros: energy efficiency, compute density, and weight density. Improvements in those metrics directly translate into reduced energy consumption, …

View on IEEE Xplore

A Cryogenic Superconducting Quantum Computing Unit Interface Chipset With Phase-Detection-Based Readout and Phase-Shifter-Based Controller

Abstract:

This article presents a cryogenic quantum interface chipset at 3.5 K for superconducting transmon qubit operations. The chipset comprises a phase-detection readout and a phase-shifter-based polar-modulation controller with flexible scalability. With the proposed phase-detection readout scheme, a 9-bit time-to-digital converter (TDC)-based state detector is used to read out the qubit …

View on IEEE Xplore

ASAP: A 28-nm Transformer Training Accelerator With Alternating Sparsity and Asymmetrical Microscaling Precision

Abstract:

This work presents ASAP, a 28-nm transformer-training accelerator that combines N:M structured sparsity with asymmetric microscaling floating-point (MXFP) precision through a unified algorithm–hardware co-design. ASAP introduces a progressive sparsity schedule in which pruned compute resources are reassigned to increase numerical precision for important weights and activations, stabilizing optimization …

View on IEEE Xplore

A BEV Perception Transformer Accelerator With Saliency-Driven Image/Point Cloud Fusion and Phase-Linked Dataflow in 28 nm CMOS

Abstract:

Deploying advanced Transformer-based models for real-time, high-accuracy multimodal bird’s-eye-view (BEV) perception in autonomous driving imposes substantial hardware demands. To address this, we propose a low-cost, low-power image/point-cloud fusion Transformer accelerator that supports two modes: high-performance driving and ultra-low-power sentry operation. We first propose a cross-modal saliency evaluation mechanism …

View on IEEE Xplore

A 11.0-TOPS/W Diffusion Accelerator With Temporal Data Reuse for Real-Time Text-to-Motion Generation

Abstract:

Text-to-motion models are AI systems that generate human motion sequences directly from natural language descriptions, serving as key enablers for immersive virtual avatars and interactive digital humans in AR/VR ecosystems. However, state-of-the-art text-to-motion diffusion models suffer from substantial computational costs due to their iterative nature, making them ill-suited for …

View on IEEE Xplore