“Learning predict-and-simulate policies from unorganized human motion data” by Park, Ryu, Lee, Lee and Lee
Session/Category Title: Learning to Move
Abstract:
The goal of this research is to create physically simulated biped characters equipped with a rich repertoire of motor skills. The user can control the characters interactively by modulating their control objectives, and the characters can interact physically with each other and with the environment. We present a novel network-based algorithm that learns control policies from unorganized, minimally labeled human motion data. The network architecture for interactive character animation incorporates an RNN-based motion generator into a DRL-based controller for physics simulation and control. The motion generator guides the forward dynamics simulation by feeding it a sequence of future motion frames to track; this rich future prediction facilitates policy learning from large training data sets. We demonstrate the effectiveness of our approach with biped characters that learn a variety of dynamic motor skills from large, unorganized data and react to unexpected perturbations beyond the scope of the training data.
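The abstract's predict-and-simulate loop can be pictured as follows. This is a minimal sketch, not the paper's implementation: the "motion generator" here is a toy autoregressive predictor standing in for the RNN, the "tracking policy" is a hand-tuned PD-style rule standing in for the learned DRL controller, and the "simulator" is a unit-mass point model rather than a full biped. All function names, constants, and gains are illustrative assumptions.

```python
import numpy as np

DOF = 4      # degrees of freedom of the toy character (assumption)
HORIZON = 3  # number of future frames the generator feeds the controller

def motion_generator(past_frame, rng):
    """Predict the next HORIZON frames (stand-in for the RNN generator)."""
    drift = 0.05 * rng.standard_normal((HORIZON, DOF))
    return past_frame + np.cumsum(drift, axis=0)

def tracking_policy(state, future_frames):
    """PD-style action toward the first predicted frame (stand-in for the
    learned DRL policy, which would also see the rest of the window)."""
    kp, kd = 8.0, 1.0
    pos, vel = state
    target = future_frames[0]
    return kp * (target - pos) - kd * vel

def simulate_step(state, action, dt=0.01):
    """Forward dynamics of a unit-mass point model (stand-in for physics)."""
    pos, vel = state
    vel = vel + dt * action
    pos = pos + dt * vel
    return pos, vel

def rollout(steps=100, seed=0):
    """Run the predict -> control -> simulate loop and return the final pose."""
    rng = np.random.default_rng(seed)
    pos, vel = np.zeros(DOF), np.zeros(DOF)
    frame = np.zeros(DOF)
    for _ in range(steps):
        future = motion_generator(frame, rng)         # predict future frames
        action = tracking_policy((pos, vel), future)  # control toward them
        pos, vel = simulate_step((pos, vel), action)  # advance the simulation
        frame = future[0]                             # feed prediction back
    return pos

final_pose = rollout()
print(final_pose.shape)  # (4,)
```

The point of the structure is the one emphasized in the abstract: the controller never tracks a single fixed clip; it tracks a continuously regenerated window of predicted frames, which is what lets a single policy cover large, unorganized training data.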


