MITTA: A Multi-Task Transformer Accelerator With Mixed Precision Structured Sparsity and Hierarchical Task-Adaptive Power Management https://sscs.ieee.org/wp-content/themes/movedo/images/empty/thumbnail.jpg 150 150 https://secure.gravatar.com/avatar/8fcdccb598784519a6037b6f80b02dee03caa773fc8d223c13bfce179d70f915?s=96&d=mm&r=g
Abstract:
This article presents MITTA, the first silicon-proven transformer accelerator optimized for multi-task inference across both natural language processing (NLP) and image processing domains. MITTA accelerates a task-sharing algorithm that minimizes sub-task computation by reusing both activations and weights from a shared base task, requiring only sparse delta computation for sub-tasks. …