Abstract:
Compute-in-memory (CiM) accelerators perform matrix-vector multiplications (MVMs) directly inside memory arrays, reducing data movement and improving both energy efficiency and throughput for AI workloads. To reduce the number of conversions, recent designs use multi-bit compute cells. Nevertheless, practical multi-bit CiM still faces a tension among accuracy, efficiency, …