“Learning to fly: computational controller design for hybrid UAVs with reinforcement learning” by Xu, Du, Foshey, Li, Zhu, et al. …

  • Jie Xu, Tao Du, Michael Foshey, Beichen Li, Bo Zhu, Adriana Schulz, and Wojciech Matusik

Title:

    Learning to fly: computational controller design for hybrid UAVs with reinforcement learning

Session/Category Title:   Capture Control



Abstract:


    Hybrid unmanned aerial vehicles (UAVs) combine the advantages of multicopters and fixed-wing planes: vertical take-off and landing, and low energy use. However, hybrid UAVs are rarely used because controller design is challenging due to their complex, mixed dynamics. In this paper, we propose a method to automate this design process by training a mode-free, model-agnostic neural network controller for hybrid UAVs. We present a neural network controller design with a novel error convolution input, trained by reinforcement learning. Our controller exhibits two key features: first, it does not distinguish among flying modes, and the same controller structure can be used for copters with various dynamics; second, it works on real models without any additional parameter tuning, closing the gap between virtual simulation and real fabrication. We demonstrate the efficacy of the proposed controller both in simulation and on our custom-built hybrid UAVs (Figures 1 and 8). Flight tests show that the controller is robust enough to exploit the complex dynamics that arise when both rotors and wings are active.
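    As a rough illustration of the "error convolution input" mentioned above, the sketch below convolves a fixed-length history of tracking errors along the time axis and maps the result through a small MLP to normalized actuator commands. Every dimension, layer size, and name here is a hypothetical assumption made for this sketch; the paper's actual architecture and its reinforcement-learning training loop are not reproduced.

    ```python
    import numpy as np

    class ErrorConvPolicy:
        """Minimal sketch (not the paper's architecture): a depthwise 1-D
        convolution over a history of tracking errors, followed by a small
        MLP that outputs normalized actuator commands in (-1, 1)."""

        def __init__(self, err_dim=6, history=16, kernel=4, hidden=64,
                     act_dim=4, seed=0):
            rng = np.random.default_rng(seed)
            self.kernel = kernel
            # per-channel (depthwise) convolution weights over the time axis
            self.conv_w = rng.standard_normal((kernel, err_dim)) * 0.1
            flat = (history - kernel + 1) * err_dim
            self.w1 = rng.standard_normal((flat, hidden)) * 0.1
            self.b1 = np.zeros(hidden)
            self.w2 = rng.standard_normal((hidden, act_dim)) * 0.1
            self.b2 = np.zeros(act_dim)

        def act(self, err_hist):
            """err_hist: (history, err_dim) stack of recent tracking errors
            (e.g. position/attitude errors vs. the target trajectory)."""
            T, _ = err_hist.shape
            # slide the kernel over time; each output row sums kernel rows
            conv = np.stack([
                np.sum(err_hist[t:t + self.kernel] * self.conv_w, axis=0)
                for t in range(T - self.kernel + 1)
            ])
            h = np.tanh(conv.ravel() @ self.w1 + self.b1)
            return np.tanh(h @ self.w2 + self.b2)

    policy = ErrorConvPolicy()
    u = policy.act(np.zeros((16, 6)))  # zero error history -> zero commands
    ```

    In an RL setting, the weights of such a policy would be optimized end to end (e.g. with a policy-gradient method) against a trajectory-tracking reward, so that one network structure covers hover, transition, and forward flight without explicit mode switching.
    
    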


