ASAP: A 28-nm Transformer Training Accelerator With Alternating Sparsity and Asymmetrical Microscaling Precision https://sscs.ieee.org/wp-content/themes/movedo/images/empty/thumbnail.jpg 150 150 https://secure.gravatar.com/avatar/8fcdccb598784519a6037b6f80b02dee03caa773fc8d223c13bfce179d70f915?s=96&d=mm&r=g
Abstract:
This work presents ASAP, a 28-nm transformer-training accelerator that combines N:M structured sparsity with asymmetric microscaling floating-point (MXFP) precision through a unified algorithm–hardware co-design. ASAP introduces a progressive sparsity schedule in which pruned compute resources are reassigned to increase numerical precision for important weights and activations, stabilizing optimization …