Online control of simulated humanoids using particle belief propagation

We present a novel, general-purpose Model-Predictive Control (MPC) algorithm that we call Control Particle Belief Propagation (C-PBP). C-PBP combines multimodal, gradient-free sampling and a Markov Random Field factorization to effectively perform simultaneous path finding and smoothing in high-dimensional spaces. We demonstrate the method in online synthesis of interactive and physically valid humanoid movements, including balancing, recovery from both small and extreme disturbances, reaching, balancing on a ball, juggling a ball, and fully steerable locomotion in an environment with obstacles. Such a large repertoire of movements has not been demonstrated before at interactive frame rates, especially considering that all our movement emerges from simple cost functions. Furthermore, we abstain from using any precomputation to train a control policy offline, reference data such as motion capture clips, or state machines that break the movements down into more manageable subtasks. Operating under these conditions enables rapid and convenient iteration when designing the cost functions.
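For readers unfamiliar with gradient-free, sampling-based MPC in general, the following minimal sketch may help convey the basic loop of sampling control sequences, simulating them forward, scoring them with a simple cost function, and applying only the first control before replanning. It is not C-PBP: it has no Markov Random Field factorization, no belief propagation, and no multimodal resampling, and is closer to plain random shooting. The 1D point-mass dynamics, the cost weights, the horizon, and all function names (`step`, `rollout_cost`, `plan`, `run`) are assumptions made purely for illustration.

```python
import numpy as np

DT = 0.1  # assumed simulation time step in seconds


def step(state, u):
    """Advance a hypothetical 1D point mass (pos, vel) by one step under acceleration u."""
    pos, vel = state
    vel = vel + u * DT
    pos = pos + vel * DT
    return (pos, vel)


def rollout_cost(state, controls, target):
    """Simulate a control sequence and sum a simple quadratic cost (assumed weights)."""
    cost = 0.0
    for u in controls:
        state = step(state, u)
        pos, vel = state
        cost += (pos - target) ** 2 + 0.1 * vel ** 2 + 0.001 * u ** 2
    return cost


def plan(state, target, rng, horizon=15, n_samples=128, u_max=2.0):
    """Gradient-free planning: sample random control sequences, keep the cheapest one."""
    candidates = rng.uniform(-u_max, u_max, size=(n_samples, horizon))
    costs = [rollout_cost(state, seq, target) for seq in candidates]
    best = candidates[int(np.argmin(costs))]
    return float(best[0])  # MPC applies only the first control, then replans


def run(n_steps=80, target=1.0, seed=0):
    """Closed-loop MPC: replan from the current state at every step."""
    rng = np.random.default_rng(seed)
    state = (0.0, 0.0)
    for _ in range(n_steps):
        u = plan(state, target, rng)
        state = step(state, u)
    return state


if __name__ == "__main__":
    final_pos, final_vel = run()
    print(f"final position {final_pos:.3f}, final velocity {final_vel:.3f}")
```

Because the planner only ever evaluates costs of simulated rollouts, changing the behavior amounts to editing `rollout_cost`, which is the kind of quick cost-function iteration the abstract refers to. Independent random shooting like this scales poorly to long horizons and high-dimensional control spaces such as a full humanoid.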

SIGGRAPH 2015: Technical Papers

Session/Category Title: Taking Control
