quantization

ASAP: A 28-nm Transformer Training Accelerator With Alternating Sparsity and Asymmetrical Microscaling Precision

Abstract:

This work presents ASAP, a 28-nm transformer-training accelerator that combines N:M structured sparsity with asymmetric microscaling floating-point (MXFP) precision through a unified algorithm–hardware co-design. ASAP introduces a progressive sparsity schedule in which pruned compute resources are reassigned to increase numerical precision for important weights and activations, stabilizing optimization …

View on IEEE Xplore

DPIM: A 2T1C eDRAM Transformer-in-Memory Chip With Sparsity-Aware Quantization and Heterogeneous Dense–Sparse Core

Abstract:

Transformer models have revolutionized artificial intelligence (AI) applications across various domains, but their increasing complexity poses significant challenges in terms of computational and memory demands. While processing-in-memory (PIM) paradigms have been adopted to address these limitations, existing PIM-based transformer accelerators still face hurdles such as: 1) focusing solely on optimizing attention …

View on IEEE Xplore