Convolutional neural networks

SparseCol: A 1320 BTOPS/W Precision-Scalable NPU Exploiting Training-Free Structured Bit-Level Sparsity and Dynamic Dataflow

Abstract:

Bit-serial computation enables sequential processing of data at the bit level, providing several advantages, such as scalable computational precision. This approach has gained significant attention, especially for exploiting bit-level sparsity (BLS) in AI workloads. While current bit-serial processors leverage BLS to eliminate the computation associated with zero bits, they face …

An Energy-Efficient CNN Processor Supporting Bi-Directional FPN for Small-Object Detection on High-Resolution Videos in 16-nm FinFET

Abstract:

The capability to detect small objects precisely in real time is essential for intelligent systems, particularly in advanced driver assistance systems (ADASs), as it ensures continuous awareness of distant obstacles for enhanced safety. However, achieving high detection precision for small objects requires high-resolution input inference on deep convolutional neural network (…

A Multicore Programmable Variable-Precision Near-Memory Accelerator for CNN and Transformer Models

Abstract:

Convolutional neural networks (CNNs) and transformers are the most popular neural network models in computer vision (CV) and natural language processing (NLP). It is common to use both models together in multimodal scenarios, such as text-to-image generation. However, the two models have very different memory mappings, dataflows and …
