A 11.0-TOPS/W Diffusion Accelerator With Temporal Data Reuse for Real-Time Text-to-Motion Generation
Abstract:
Text-to-motion models are AI systems that generate human motion sequences directly from natural language descriptions, serving as key enablers for immersive virtual avatars and interactive digital humans in AR/VR ecosystems. However, state-of-the-art text-to-motion diffusion models suffer from substantial computational costs due to their iterative nature, making them ill-suited for …