IEEE Solid-State Circuits Directions Series: Think Impact with ICs Workshop on System, Circuit, Device, and Packaging Co-Optimization for Next Generation AI Systems

7 August @ 7:00 pm – 9:30 pm EDT

Organizers: Atsutake Kosuge, Kyuho Lee

Wednesday, August 7th at 7:00 PM ET



Starting from UTC 23:00 on August 7th until 02:00 on August 8th
Korea and Japan: 08:00 – 11:00, August 8th
SF: 16:00 – 19:00, August 7th
NYC: 19:00 – 22:00, August 7th


  1. Kunle Olukotun (7:00pm-7:30pm, NYC)
  2. Rio Yokota (7:30pm-8:00pm)
  3. Jinwook Oh (8:00pm-8:30pm)
  4. Pritish Narayanan (8:30pm-9:00pm)
  5. Mihai Dragos Rotaru (9:00pm-9:30pm)

Workshop Abstract:

In the past decade, Artificial Intelligence (AI) has made remarkable progress, from image processing to large language models. This rapid advancement has led to a significant increase in the complexity of AI models, which now require computational capabilities exceeding hundreds of peta-FLOPS. Consequently, there is a growing demand for advancements in AI hardware, encompassing systems, circuits, devices, and packaging. This demand highlights the essential contributions of the Solid-State Circuits Society (SSCS).

In response to these global needs, the upcoming SSCS workshop will focus on the co-optimization of systems, circuits, devices, and packaging for next-generation AI systems. The program features five distinguished speakers.

Firstly, Professor Kunle Olukotun, co-founder of SambaNova Systems and a professor at Stanford University, will share his insights on optimizing computing systems for training. Secondly, Professor Rio Yokota from the Global Scientific Information and Computing Center will discuss low-precision techniques and distributed training methods using supercomputers for large language model training. Thirdly, Dr. Jinwook Oh, CTO of Rebellions, will introduce AI accelerators for latency-critical Machine Learning applications. Fourthly, Dr. Pritish Narayanan from IBM will present on the optimization of Analog Compute-in-Memory accelerators. Lastly, Dr. Mihai Dragos Rotaru from A*STAR will discuss advanced packaging for the efficient implementation of heterogeneous chiplet-based systems.

We anticipate that this workshop will provide valuable insights into the comprehensive contributions and future directions of SSCS in realizing the potential of AI.

Kunle Olukotun

Title: Computing Systems and Dataflow Processor in the Foundation Model Era

Abstract: Generative AI applications, with their ability to produce natural language, computer code, and images, are transforming all aspects of society. These applications are powered by huge foundation models such as GPT-4, which have tens of billions of parameters, are trained on trillions of tokens, and have obtained state-of-the-art quality in natural language processing, vision, and speech applications. These models are computationally challenging because they require hundreds of petaFLOPS of computing capacity for training and inference. Future foundation models will have even greater capabilities provided by more complex model architectures with longer sequence lengths, irregular data access (sparsity), and irregular control flow. In this talk I will describe how the evolving characteristics of foundation models will impact the design of the optimized computing systems required for training and serving these models. A key element of improving the performance and lowering the cost of deploying future foundation models will be optimizing the data movement (Dataflow) within the model using specialized hardware. In contrast to human-in-the-loop applications such as conversational AI, an emerging application of foundation models is in real-time processing applications that operate without human supervision. I will describe how continuous real-time machine learning can be used to create an intelligent network data plane.

Bio: Kunle Olukotun is the Cadence Design Professor of Electrical Engineering and Computer Science at Stanford University. Olukotun is a pioneer in multicore processor design and the leader of the Stanford Hydra chip multiprocessor (CMP) research project. He founded Afara Websystems to develop high-throughput, low-power multicore processors for server systems. The Afara multi-core multi-thread processor, called Niagara, was acquired by Sun Microsystems and now powers Oracle’s SPARC-based servers. Olukotun co-founded SambaNova Systems, a Machine Learning and Artificial Intelligence company, and continues to lead as their Chief Technologist. Olukotun is a member of the National Academy of Engineering, an ACM Fellow, and an IEEE Fellow for contributions to multiprocessors on a chip design and the commercialization of this technology. He received the 2023 ACM-IEEE CS Eckert-Mauchly Award.

Rio Yokota

Title: Large Language Model Training with Low-precision Arithmetic in HPC Systems (Tentative)

Abstract: Pre-training with large datasets is crucial for improving the performance of LLMs. This talk will present the latest techniques in LLM training, focusing on low-precision training with formats such as FP16, BF16, and FP8, and on distributed training using supercomputers. We will also discuss the latest research results, including a model pre-trained with only 1.58 bits.

Bio: Rio Yokota is a Professor at the Global Scientific Information and Computing Center, Tokyo Institute of Technology. His research interests lie at the intersection of high performance computing, linear algebra, and machine learning. He is the developer of numerous libraries for fast multipole methods (ExaFMM), hierarchical low-rank algorithms (Hatrix), and information matrices in deep learning (ASDFGHJKL) that scale to the full system on the largest supercomputers today. He has been optimizing algorithms on GPUs since 2006, and was part of a team that received the Gordon Bell prize in 2009 using the first GPU supercomputer. Rio is a member of ACM, IEEE, and SIAM.

Jinwook Oh

Title: A Versatile AI Accelerator for Latency Critical ML Applications

Abstract: The growing computational demands of AI inference have led to widespread use of hardware accelerators for different platforms, spanning from the edge to the datacenter/cloud. Certain AI application areas have a hard inference latency deadline for successful execution. We present our new AI accelerator, which achieves high inference capability with outstanding single-stream responsiveness for demanding service-level objective (SLO)-based AI services and pipelined inference applications, including large language models (LLMs). Thanks to the low thermal design power (TDP) of our chip, the scale-out solution can effectively support multi-stream applications as well as total cost of ownership (TCO)-centric systems.

Our chip, ATOM, has been commercialized for the latest LLM/GenAI models (e.g., LLaMA3 and SDXL-Turbo) while delivering outstanding power efficiency.

Bio: Jinwook Oh received the B.S. degree from Seoul National University, Seoul, South Korea, in 2008, and the M.S. and Ph.D. degrees from the Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea, in 2010 and 2013, respectively. He is currently the Lead Architect with Rebellions Inc., an artificial intelligence (AI) startup based in Seongnam, South Korea, focusing on building AI solutions for the global finance market. From 2014 to 2020, he was with the IBM T. J. Watson Research Laboratory, Yorktown Heights, NY, USA, where he worked on different hardware explorations while conducting research/engineering for new AI hardware, algorithms, and systems.

Pritish Narayanan

Title: Device-circuit-system aspects of Analog AI Accelerator Design

Abstract: Recent demonstrations of Analog Compute-in-Memory at ~10M weights have shown near-software equivalence on small but commercially relevant inference tasks. Continued scale-up will require additional device, circuit optimizations trading off error and performance, together with system design that organizes the right number of macros with digital compute and on-chip memory, using fast and energy-efficient mechanisms for on-chip data transport. This talk will provide an overview of recent progress made at IBM towards such Analog CIM accelerators.

Bio: Pritish Narayanan received the Ph.D. degree in ECE from the University of Massachusetts Amherst in 2013, and joined IBM Research – Almaden, San Jose, CA, USA, as a Research Staff Member. His research interests include emerging technologies for logic, nonvolatile memory, and cognitive computing. He was a recipient of the Best Paper Awards at ISVLSI 2008, IEEE DFT 2010, 2011, and NanoArch 2013, and has reviewed for the IEEE Transactions on Very Large Scale Integration (VLSI) Systems, the IEEE Transactions on Nanotechnology, the ACM Journal on Emerging Technologies in Computing Systems, and several IEEE conferences.

Mihai Dragos Rotaru

Title: 2.5D/3D Advanced packaging for heterogeneous system integration

Abstract: The seminar on advanced packaging technologies presents a critical exploration of the evolving landscape in semiconductor packaging, highlighting the increasing importance of 2.5D and 3D hybrid bonding, silicon photonics, and other emerging solutions. This discussion will delve into the fundamentals of advanced packaging, focusing on fan-out wafer level packaging (FOWLP), integrated bridge technology, and 3D integration technology. The chiplets are integrated using a 2.5D FOWLP technology, which enables high-density interconnects and improved performance.

Bio: Dr. Mihai Dragos Rotaru has been a Senior Scientist at the Institute of Microelectronics (IME), A*STAR, in Singapore since September 2019. Between 2007 and 2019, Mihai was an Associate Professor with the School of Electronics and Computer Science at the University of Southampton, UK. Prior to that, he was with the Institute of Microelectronics, leading the Electrical and Optical Package design team for 6 years. From 2000 to 2001, he also worked as a Research Assistant with the School of Science and Technology at City University London.

Mihai earned Bachelor of Engineering and Master of Science degrees from the Technical University of Cluj-Napoca, Romania, in 1996 and 1997, respectively. He was awarded a Ph.D. in electrical engineering in 2000 from the University of Southampton, UK.

His research interests include signal and power integrity of heterogeneous integrated systems on advanced packaging technology platforms for high performance computing applications. Dr. Rotaru has authored and co-authored more than 120 peer-reviewed journal and conference papers. He has served as a TPC member of the Electronics Packaging Technology Conference (EPTC).






Danielle Marinese