Virtual Event

Abstract: Computing-in-memory (CiM), in conjunction with ultra-low precision (ULP) inputs and weights, promises a pathway towards the deployment of DNN workloads on resource-constrained edge platforms. For instance, CiM-enabled binary neural network (BNN) accelerators (with inputs and weights quantized to two states: {−1, 1}) have shown potential by significantly reducing storage and energy costs while maintaining acceptable inference accuracy on certain tasks. More complex workloads, however, mandate some increase in weight/input precision. To that end, ternary weight neural networks (TWNs), with weights quantized to three states: {−1, 0, 1}, have been shown to achieve acceptable accuracy while still maintaining the benefits of high energy efficiency. However, memory arrays designed for CiM are typically plagued by hardware non-idealities (such as parasitic resistances) and device non-linearities that significantly impair inference accuracy, especially in scaled technologies. Therefore, translating the promise of high resource efficiency of ULP-DNNs into practical edge computing systems requires alleviating such non-idealities through cross-layer design.
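For readers unfamiliar with ternary quantization, the short Python/NumPy sketch below shows one common threshold-based scheme for mapping real-valued weights to {−1, 0, 1} (the 0.7·mean(|W|) threshold is a heuristic from the early TWN literature); the exact quantizer used in the speakers' work is not specified in this abstract, so treat this purely as an illustrative assumption:

    import numpy as np

    def ternarize(W, delta_scale=0.7):
        # Threshold-based ternarization to {-1, 0, 1}. The 0.7 scale is a
        # common heuristic from the TWN literature; the talk's actual
        # quantizer may differ.
        delta = delta_scale * np.mean(np.abs(W))
        T = np.zeros_like(W)
        T[W > delta] = 1.0
        T[W < -delta] = -1.0
        return T

    W = np.random.randn(64, 64)
    print(np.unique(ternarize(W)))  # -> [-1.  0.  1.]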

In this talk, we present non-ideality mitigation techniques for both BNNs and TWNs, broadly based on weight and input transformations (WIT). We begin by analyzing the impact of hardware non-idealities on the inference accuracy of CiM-BNNs, establishing that the unique properties of CiM-BNNs make them more prone to non-idealities than higher-precision DNNs. To mitigate these effects, we propose a training-free technique called TWINN that reduces the average crossbar-array current generated during CiM by statically and dynamically flipping the BNN weights and inputs, respectively, while preserving the matrix-vector-multiplication (MVM) functionality. This mitigates the adverse impact of parasitic resistances on inference accuracy. Our analyses of ResNet-18 and VGG-small CiM-BNNs at the 7nm technology node show that TWINN recoups the inference accuracy to near-ideal (software) levels at several of the design points considered. These benefits are accompanied by an energy reduction, albeit at the cost of a mild latency/area increase.
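The abstract does not spell out TWINN's flip-selection policy (chosen to reduce array current), but the sign bookkeeping that makes such flipping transparent to the MVM is easy to verify. Here is a minimal NumPy sketch, assuming hypothetical per-row static weight flips s and per-input dynamic flips d (both names are ours, not the speakers'):

    import numpy as np

    rng = np.random.default_rng(0)
    M, N = 8, 16
    W = rng.choice([-1.0, 1.0], size=(M, N))   # binary weights
    x = rng.choice([-1.0, 1.0], size=N)        # binary inputs

    s = rng.choice([-1.0, 1.0], size=M)   # static per-row weight flips
    d = rng.choice([-1.0, 1.0], size=N)   # dynamic per-input flips

    W_flipped = (s[:, None] * W) * d[None, :]  # weights stored pre-flipped
    x_flipped = d * x                          # inputs flipped on the fly

    # Since d*d = 1 elementwise, the input flips cancel the column flips,
    # and the row flips are undone by one sign correction at the output:
    assert np.allclose(W @ x, s * (W_flipped @ x_flipped))

The point of the identity is that the flip pattern is a free parameter: it can be chosen to shape the currents flowing in the array without changing the computed result.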

In the second part of the talk, we present the application of WIT to CiM-enabled TWNs utilizing emerging technologies such as ferroelectric transistors (FeFETs). As FeFETs demonstrate multi-level weight storage capabilities, they can potentially enable highly scalable TWN-CiM acceleration. However, standard 3-level FeFET-based CiM is severely vulnerable to hardware non-idealities at deeply scaled technology nodes. To address this, we employ WIT to increase computational robustness. Using our rigorous cross-layer simulation framework based on phase-field models for FeFETs, we evaluate the proposed design and benchmark it against existing FeFET solutions. We show that WIT recovers inference accuracy by >50% for 1T-FeFET-based ResNet-18 TWN-CiM on CIFAR-100. Compared to existing FeFET-based CiM designs, the proposed solution drastically reduces CiM macro area and energy at similar inference accuracy.
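For intuition on how a 3-level device can carry a signed ternary weight, the sketch below uses one textbook offset-column mapping: conductance g = G0·(w + 1), giving levels {0, G0, 2·G0}, with a mid-level reference column subtracted at the output. This is an illustrative assumption (G0 and the encoding are ours), not necessarily the FeFET scheme evaluated in the talk, and it deliberately ignores the parasitic non-idealities the talk addresses:

    import numpy as np

    G0 = 1e-6            # assumed unit conductance per level (siemens)
    rng = np.random.default_rng(1)

    w = rng.choice([-1, 0, 1], size=32)    # ternary weight column
    v = rng.uniform(0.0, 0.2, size=32)     # input voltages (volts)

    g = G0 * (w + 1)                       # 3 device levels: 0, G0, 2*G0
    i_col = g @ v                          # column current (Kirchhoff sum)
    i_ref = (G0 * np.ones_like(g)) @ v     # reference column at mid level

    # Subtracting the reference current recovers the signed dot product:
    assert np.isclose((i_col - i_ref) / G0, w @ v)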

Register here: https://ieee.webex.com/weblink/register/rc4a9706dcc3900ebc420111788dba4ea

Bios:

Sumeet Kumar Gupta is an Elmore Associate Professor of Electrical and Computer Engineering at Purdue University. Prior to this, he was an Assistant Professor of Electrical Engineering at The Pennsylvania State University from 2014 to 2017 and a Senior Engineer at Qualcomm Inc. from 2012 to 2014. Dr. Gupta received the B.Tech. degree in Electrical Engineering from the Indian Institute of Technology, Delhi, India in 2006, and the M.S. and Ph.D. degrees in Electrical and Computer Engineering from Purdue University, West Lafayette, IN in 2008 and 2012, respectively. His research interests include low-power VLSI circuit design, in-memory computing, AI hardware design, nano-electronics and spintronics, device-circuit co-design, and nano-scale device modeling and simulation. He has published over 150 articles in refereed journals and conferences and is a senior member of IEEE and EDS. Dr. Gupta was the recipient of the DARPA Young Faculty Award in 2016, Early Career Professorships from Purdue and Penn State in 2021 and 2014, respectively, and the 6th TSMC Outstanding Student Research Bronze Award in 2012. He also received the Magoon Award and the Outstanding Teaching Assistant Award from Purdue University in 2007 and an Intel PhD Fellowship in 2009.

Imtiaz Ahmed completed his B.Sc. in Electrical and Electronic Engineering at Bangladesh University of Engineering and Technology (BUET), Dhaka, Bangladesh in 2019, with a concentration in Electronics. Currently, he is pursuing a Ph.D. in Electrical and Computer Engineering at Purdue University, conducting research at the Integrated Circuits & Devices Lab (ICDL). His research revolves around efficient AI hardware design, with a keen focus on emerging memory technologies, in-memory computing, and robust deep neural network accelerator design. His work aims to bridge the gap between semiconductor innovation and next-generation AI accelerators, developing energy-efficient, high-performance, and scalable AI hardware. Additionally, Imtiaz is an SRC Research Scholar affiliated with CoCoSys: Center for the Co-Design of Cognitive Systems, under the Joint University Microelectronics Program 2.0 (JUMP 2.0), a Semiconductor Research Corporation (SRC) program sponsored by the Defense Advanced Research Projects Agency (DARPA). Prior to joining Purdue University, Imtiaz worked as a Lecturer at BRAC University, Bangladesh, for 2.5 years, where he taught and coordinated courses on Electromagnetic Waves and Fields, Microprocessors, Embedded Systems, and VLSI Design Lab. He was also actively involved in curriculum redesign and the integration of Outcome-Based Education (OBE) frameworks to align coursework with industry and research advancements.

Organizer

  • SSCS Technical Webinars