Training

A 7.5-μW 35-Keyword End-to-End Keyword Spotting System With Random Augmented On-Chip Training

A 7.5-μW 35-Keyword End-to-End Keyword Spotting System With Random Augmented On-Chip Training 150 150

Abstract:

Fully integrated keyword spotting (KWS) systems designed for low-power operation face two major challenges. First, increasing the number of supported keywords significantly raises system complexity and power consumption. Second, most existing systems are not personalized to individual users, as they are trained on data from native English speakers, leading to …

View on IEEE Xplore

Chameleon: A Multiplier-Free Temporal Convolutional Network Accelerator for End-to-End Few-Shot and Continual Learning from Sequential Data

Chameleon: A Multiplier-Free Temporal Convolutional Network Accelerator for End-to-End Few-Shot and Continual Learning from Sequential Data 150 150

Abstract:

On-device learning at the edge enables low-latency, private personalization with improved long-term robustness and reduced maintenance costs. Yet, achieving scalable, low-power (LP) end-to-end on-chip learning, especially from real-world sequential data with a limited number of examples, is an open challenge. Indeed, accelerators supporting error backpropagation optimize for learning performance at …

View on IEEE Xplore

SparseCol: A 1320 BTOPS/W Precision-Scalable NPU Exploiting Training-Free Structured Bit-Level Sparsity and Dynamic Dataflow

SparseCol: A 1320 BTOPS/W Precision-Scalable NPU Exploiting Training-Free Structured Bit-Level Sparsity and Dynamic Dataflow 150 150

Abstract:

Bit-serial computation enables sequential processing of data at the bit level, providing several advantages, such as scalable computational precision. This approach has gained significant attention, especially for exploiting bit-level sparsity (BLS) in AI workloads. While current bit-serial processors leverage BLS to eliminate the computation associated with zero bits, they face …

View on IEEE Xplore

Antiferromagnetic Programmable Neuron: Structure, Training, and Pattern Recognition Applications

Antiferromagnetic Programmable Neuron: Structure, Training, and Pattern Recognition Applications 150 150

Abstract:

Artificial neurons based on antiferromagnetic (AFM) spin Hall oscillators (SHOs) are promising elements for creating ultrafast, energy-efficient neuromorphic computing systems. These structures can generate picosecond spikes in response to dc and ac electric currents, thereby mimicking the reaction of biological neurons to an external stimulus. However, conventional AFM neurons have …

View on IEEE Xplore

Space-Mate: A 303.5-mW Real-Time Sparse Mixture-of-Experts-Based NeRF-SLAM Processor for Mobile Spatial Computing

Space-Mate: A 303.5-mW Real-Time Sparse Mixture-of-Experts-Based NeRF-SLAM Processor for Mobile Spatial Computing 150 150

Abstract:

Simultaneous localization and mapping (SLAM) provides crucial ego-pose information and 3-D maps of the user environment, which are fundamental to emerging mobile spatial computing devices. Dense 3-D mapping and accurate pose estimation are particularly necessary for applications like augmented reality (AR) and autonomous navigation. However, existing SLAM processors are typically …

View on IEEE Xplore

Energy-Efficient Reconfigurable XGBoost Inference Accelerator With Modular Unit Trees via Selective Node Execution and Data Movement

Energy-Efficient Reconfigurable XGBoost Inference Accelerator With Modular Unit Trees via Selective Node Execution and Data Movement 150 150

Abstract:

The extreme gradient boosting (XGBoost) has emerged as a powerful AI algorithm, achieving high accuracy and winning multiple Kaggle competitions in various tasks including medical diagnosis, recommendation systems, and autonomous driving. It has great potential for running on edge devices due to its binary tree-based simple computing kernel, offering unique …

View on IEEE Xplore

An 8.62-μW 75-dB DRSoC Fully Integrated SoC for Spoken Language Understanding

An 8.62-μW 75-dB DRSoC Fully Integrated SoC for Spoken Language Understanding 150 150

Abstract:

We present a sub-10- $\mu $ W fully integrated SoC for on-device spoken language understanding (SLU). Its analog feature extractor (FEx) applies global and per-channel automatic gain control (AGC) to extend the system’s dynamic range (DR)—a critical requirement for real-world scenarios, including far-field operations. The on-chip streaming-mode recurrent …

View on IEEE Xplore

LIF Neuron Based on a Charge-Powered Ring Oscillator in Weak Inversion Achieving 201 fJ/SOP

LIF Neuron Based on a Charge-Powered Ring Oscillator in Weak Inversion Achieving 201 fJ/SOP 150 150

Abstract:

This letter presents the experimental results of a leaky-integrate-and-fire neuron (LIF) neuron based on time-domain analog circuitry. This kind of neuron is the core of spiking neural network (SNN) used in edge applications. Edge applications require power-efficient neuron designs whose power consumption is extremely low when idle, and low when …

View on IEEE Xplore

A 160-Gb/s D-Band Bi-Directional CMOS Mixer Covering 112–170 GHz for 6G Transceivers

A 160-Gb/s D-Band Bi-Directional CMOS Mixer Covering 112–170 GHz for 6G Transceivers 150 150

Abstract:

This work presents a D-band bi-directional CMOS double-balanced mixer (DBM) supporting data rates over 160 Gb/s with a 58-GHz RF bandwidth (112–170 GHz). The mixer employs four identical NMOS passive switches ( $12~\mu $ m/60 nm) in a DBM topology, providing the isolation between RF, LO, and IF ports. Both IF and RF …

View on IEEE Xplore