“Bringing Adventure Gaming to Life Using Real-time Generative AI on Your PC” by Long, Kumar, Cheruvu, Ramos, Pastushenkov, et al. …
Title:
- Bringing Adventure Gaming to Life Using Real-time Generative AI on Your PC
Session/Category Title: XR in Practice
Abstract:
Imagine a new kind of tabletop gaming experience in which a narrator describes a complex fantasy world and the players gathered around the table watch the events of that world unfold in real time on their PCs. In this talk, we show attendees how to run multi-modal Generative AI (Gen AI) models on a PC in real time to create immersive scenery, followed by an interactive live demonstration. The talk walks through the optimization of the Gen AI modalities, chaining audio transcription and diffusion models together in real time. We compress these models with the OpenVINO™ Toolkit and use the Intel® Core™ Ultra processor to split the workloads across the CPU, integrated GPU, and Neural Processing Unit (NPU) to get the best performance for each model. We also cover new developments in temporally consistent generation and depth estimation, aimed at high-resolution, 3D, pop-up scenery generation. The talk equips participants with hands-on tools for resolving the challenges of real-time, high-quality gaming scenery generation on their PCs.
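The chaining described in the abstract can be illustrated with a minimal Python sketch. It is an assumption-laden outline, not the presenters' implementation: `transcribe`, `generate_image`, `run_pipeline`, and `DEVICE_PLAN` are hypothetical names, the two model functions are stand-ins that do no real inference, and the stage-to-device mapping is only an example of how the workloads might be split. "CPU", "GPU", and "NPU" are the device names OpenVINO uses, but the sketch does not call OpenVINO.

```python
import queue
import threading

# Hypothetical plan: which device each stage would be compiled for.
# The assignment below is an assumption for illustration only.
DEVICE_PLAN = {"transcribe": "NPU", "generate": "GPU"}


def transcribe(audio_chunk):
    """Stand-in for a speech-to-text model; returns the narrator's words."""
    return audio_chunk["spoken_text"]


def generate_image(prompt):
    """Stand-in for a diffusion model; returns a placeholder, not pixels."""
    return f"<image for: {prompt}>"


def run_pipeline(audio_chunks):
    """Chain transcription and image generation through a queue so that
    the two stages can run concurrently."""
    prompts = queue.Queue()
    images = []

    def transcription_worker():
        for chunk in audio_chunks:
            prompts.put(transcribe(chunk))
        prompts.put(None)  # sentinel: no more prompts

    def generation_worker():
        while True:
            prompt = prompts.get()
            if prompt is None:
                break
            images.append(generate_image(prompt))

    workers = [
        threading.Thread(target=transcription_worker),
        threading.Thread(target=generation_worker),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return images


if __name__ == "__main__":
    narration = [
        {"spoken_text": "a dragon circles the tower"},
        {"spoken_text": "the cave glows blue"},
    ]
    for image in run_pipeline(narration):
        print(image)
```

Decoupling the stages with a queue lets transcription of the next utterance proceed while the previous prompt is still being rendered, which is the reason for placing the two models on different devices in the first place.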
References:
[1]
Jose Ma. Santiago III, Richard Lance Parayno, Jordan Aiko Deja, and Briane Paul V. Samson. 2023. Rolling the Dice: Imagining Generative AI as a Dungeons & Dragons Storytelling Companion. arXiv:2304.01860. Retrieved from https://arxiv.org/abs/2304.01860
[2]
Chris Callison-Burch, Gaurav Singh Tomar, Lara Martin, Daphne Ippolito, Suma Bailis, and David Reitter. 2022. Dungeons and Dragons as a Dialog Challenge for Artificial Intelligence. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 9379–9393, Abu Dhabi, United Arab Emirates. Association for Computational
[3]
Tuul Triyason. 2023. Exploring the Potential of ChatGPT as a Dungeon Master in Dungeons & Dragons tabletop game. In 13th International Conference on Advances in Information Technology (IAIT 2023), December 06–09, 2023, Bangkok, Thailand. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3628454.3628457
[4]
Omer Bar-Tal, Hila Chefer, Omer Tov, Charles Herrmann, Roni Paiss, Shiran Zada, Ariel Ephrat, Junhwa Hur, Guanghui Liu, Amit Raj, Yuanzhen Li, Michael Rubinstein, Tomer Michaeli, Oliver Wang, Deqing Sun, Tali Dekel, Inbar Mosseri, et al. 2024. Lumiere: A Space-Time Diffusion Model for Video Generation. arXiv:2401.12945. Retrieved from https://arxiv.org/abs/2401.12945
[5]
Tim Brooks, Bill Peebles, Connor Holmes, Will DePue, Yufei Guo, Li Jing, David Schnurr, Joe Taylor, Troy Luhman, Eric Luhman, Clarence Ng, Ricky Wang, Aditya Ramesh. 2024. Video generation models as world simulators. (Feb 2024). Retrieved Feb 21, 2024 from https://openai.com/research/video-generation-models-as-world-simulators
[6]
Zhuo Wu and Raymond Lo. 2023. Connecting the Real-World and Metaverse via Real-Time Digitization with Intel Arc. In SIGGRAPH Asia 2022 Courses (SA ’22). Association for Computing Machinery, New York, NY, USA, Article 3, 1–14. https://doi.org/10.1145/3550495.3558224
[7]
Simian Luo, Yiqin Tan, Longbo Huang, Jian Li, and Hang Zhao. 2023. Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference. arXiv:2310.04378. Retrieved from https://arxiv.org/abs/2310.04378
[8]
Sanchit Gandhi, Patrick von Platen, and Alexander M. Rush. 2023. Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling. arXiv:2311.00430. Retrieved from https://arxiv.org/abs/2311.00430
[9]
Alexander Kozlov, Ivan Lazarevich, Vasily Shamporov, Nikolay Lyalyushkin, and Yury Gorbachev. 2021. Neural Network Compression Framework for Fast Model Inference. In Intelligent Computing, K. Arai (Ed.). Lecture Notes in Networks and Systems, vol. 285. Springer, Cham. https://doi.org/10.1007/978-3-030-80129-8_17
[10]
Nikita Savelyev, Alexander Kozlov, Ekaterina Aidova, and Maxim Proshin. “Optimizing Whisper and Distil-Whisper for Speech Recognition with OpenVINO™ and NNCF,” Intel Corporation, 29 Jan. 2024. Available online: Intel Community Blog. [Accessed 19 February 2024].
[11]
Liubov Talamanova, Ekaterina Aidova, and Alexander Kozlov. “Optimizing Latent Consistency Model for Image Generation with OpenVINO™ and NNCF,” Intel Corporation, 23 Nov. 2023. Available online: Intel Community Blog. [Accessed 19 February 2024].
[12]
Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, and Hengshuang Zhao. 2024. Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. arXiv:2401.10891. Retrieved from https://arxiv.org/abs/2401.10891