Physical Non-inertial Poser (PNP): Modeling Non-inertial Effects in Sparse-inertial Human Motion Capture
Abstract:
We propose a novel approach to human motion capture from sparse inertial sensors that accounts for the non-inertial behavior of the root frame as it accelerates and rotates. By explicitly modeling the resulting fictitious forces and synthesizing IMU measurements with a novel technique, our method achieves accurate motion capture with improved robustness.
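For readers unfamiliar with non-inertial dynamics, the short Python sketch below illustrates, under simplified assumptions, the two ingredients named above: the fictitious acceleration terms (linear, Euler, centrifugal, Coriolis) that appear when motion is expressed in an accelerating, rotating root frame, and the synthesis of accelerometer (specific-force) readings from a position/orientation trajectory by finite differences. This is an illustrative sketch, not the paper's implementation; all function names and the differencing scheme are assumptions.

    # Illustrative sketch only (assumed names and conventions, not the paper's code).
    import numpy as np

    GRAVITY = np.array([0.0, 0.0, -9.81])  # world-frame gravity, m/s^2

    def fictitious_acceleration(a_root, omega, domega, r, v):
        """Fictitious acceleration felt at a point expressed in a root frame that
        translates with linear acceleration a_root and rotates with angular
        velocity omega (rad/s) and angular acceleration domega. r and v are the
        point's position and velocity measured in the root frame."""
        return -(a_root
                 + np.cross(domega, r)                  # Euler term
                 + np.cross(omega, np.cross(omega, r))  # centrifugal term
                 + 2.0 * np.cross(omega, v))            # Coriolis term

    def synthesize_accelerometer(p_world, R_world_sensor, dt):
        """Synthesize specific-force readings f_t = R_t^T (p_t'' - g) from a
        world-frame position trajectory p_world of shape (T, 3) and sensor
        orientations R_world_sensor of shape (T, 3, 3), using central second
        differences at the interior time steps."""
        a_world = (p_world[2:] - 2.0 * p_world[1:-1] + p_world[:-2]) / dt ** 2
        R_T = R_world_sensor[1:-1].transpose(0, 2, 1)   # world -> sensor rotation
        return np.einsum('tij,tj->ti', R_T, a_world - GRAVITY)

Synthetic readings of this kind can be compared against (or stand in for) real IMU measurements, and the fictitious terms make explicit why the same body motion yields different sensor readings once the root frame itself accelerates or rotates.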