“A GPU-based multilevel additive schwarz preconditioner for cloth and deformable body simulation” by Wu, Wang and Wang

  • ©Botao Wu, Zhendong Wang, and Huamin Wang




    A GPU-based multilevel additive schwarz preconditioner for cloth and deformable body simulation



    In this paper, we wish to push the limit of real-time cloth and deformable body simulation to a higher level with 50K to 500K vertices, based on the development of a novel GPU-based multilevel additive Schwarz (MAS) pre-conditioner. Similar to other preconditioners under the MAS framework, our preconditioner naturally adopts multilevel and domain decomposition concepts. But contrary to previous works, we advocate the use of small, non-overlapping domains that can well explore the parallel computing power on a GPU. Based on this idea, we investigate and invent a series of algorithms for our preconditioner, including multilevel domain construction using Morton codes, low-cost matrix precomputation by one-way Gauss-Jordan elimination, and conflict-free symmetric-matrix-vector multiplication in runtime preconditioning. The experiment shows that our preconditioner is effective, fast, cheap to precompute and scalable with respect to stiffness and problem size. It is compatible with many linear and nonlinear solvers used in cloth and deformable body simulation with dynamic contacts, such as PCG, accelerated gradient descent and L-BFGS. On a GPU, our preconditioner speeds up a PCG solver by approximately a factor of four, and its CPU version outperforms a number of competitors, including ILU0 and ILUT.


    1. Christie Alappat, Achim Basermann, Alan R. Bishop, Holger Fehske, Georg Hager, Olaf Schenk, Jonas Thies, and Gerhard Wellein. 2020. A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication. ACM Trans. Parallel Comput. 7, 3, Article 19 (jun 2020), 37 pages.Google ScholarDigital Library
    2. David Baraff and Andrew Witkin. 1998. Large Steps in Cloth Simulation. In Proceedings of SIGGRAPH 98 (Computer Graphics Proceedings, Annual Conference Series), Eugene Fiume (Ed.). Association for Computing Machinery, New York, NY, USA, 43–54.Google ScholarDigital Library
    3. Jernej Barbič and Yili Zhao. 2011. Real-Time Large-Deformation Substructuring. ACM Trans. Graph. (SIGGRAPH) 30, 4, Article 91 (jul 2011), 8 pages.Google ScholarDigital Library
    4. Nathan Bell and Michael Garland. 2008. Efficient Sparse Matrix-Vector Multiplication on CUDA. NVIDIA Technical Report NVR-2008-004. NVIDIA Corporation.Google Scholar
    5. Miklos Bergou, Max Wardetzky, David Harmon, Denis Zorin, and Eitan Grinspun. 2006. A Quadratic Bending Model for Inextensible Surfaces. In Proceedings of SGP (Cagliari, Sardinia, Italy). 227–230.Google Scholar
    6. Sofien Bouaziz, Sebastian Martin, Tiantian Liu, Ladislav Kavan, and Mark Pauly. 2014. Projective Dynamics: Fusing Constraint Projections for Fast Simulation. ACM Trans. Graph. (SIGGRAPH) 33, 4, Article 154 (July 2014), 11 pages.Google ScholarDigital Library
    7. Robert Bridson, Ronald Fedkiw, and John Anderson. 2002. Robust Treatment of Collisions, Contact and Friction for Cloth Animation. ACM Trans. Graph. (SIGGRAPH) 21, 3 (July 2002), 594–603.Google ScholarDigital Library
    8. Jed Brown and Peter Brune. 2013. Low-Rank Quasi-Newton Updates for Robust Jacobian Lagging in Newton Methods. International Conference on Mathematics and Computational Methods Applied to Nuclear Science and Engineering (2013).Google Scholar
    9. Xiao-Chuan Cai and Yousef Saad. 1996. Overlapping Domain Decomposition Algorithms for General Sparse Matrices. Numerical Linear Algebra with Applications 3, 3 (1996), 221–237.Google ScholarCross Ref
    10. Xiao-Chuan Cai and Marcus Sarkis. 1999. A Restricted Additive Schwarz Preconditioner for General Sparse Linear Systems. SIAM Journal on Scientific Computing 21, 2 (1999), 792–797.Google ScholarDigital Library
    11. Ali Charara, David E. Keyes, and Hatem Ltaief. 2017. A Framework for Dense Triangular Matrix Kernels on Various Manycore Architectures. Concurrency and Computation: Practice and Experience 29, 15 (2017).Google Scholar
    12. Jiong Chen, Florian Schäfer, Jin Huang, and Mathieu Desbrun. 2021. Multiscale Cholesky Preconditioning for Ill-Conditioned Problems. ACM Trans. Graph. (SIGGRAPH) 40, 4, Article 81 (July 2021), 13 pages.Google ScholarDigital Library
    13. Kwang-Jin Choi and Hyeong-Seok Ko. 2002. Stable but Responsive Cloth. ACM Trans. Graph. (SIGGRAPH) 21, 3 (July 2002), 604–611.Google ScholarDigital Library
    14. Victorita Dolean, Pierre Jolivet, and Frédéric Nataf. 2015. An Introduction to Domain Decomposition Methods. Society for Industrial and Applied Mathematics, Philadelphia, PA.Google Scholar
    15. Maksymilian Dryja, Marcus V. Sarkis, and Olof B. Widlund. 1996. Multilevel Schwarz Methods for Elliptic Problems with Discontinuous Coefficients in Three Dimensions. Numer. Math. 72, 3 (01 Jan 1996), 313–348.Google Scholar
    16. Maksymilian Dryja and Olof B Widlund. 1989. Some Domain Decomposition Algorithms for Elliptic Problems. In Iterative methods for large linear systems. Elsevier, San Diego, CA, 273–291.Google Scholar
    17. Maksymilian Dryja and Olof B. Widlund. 1991. Multilevel Additive Methods for Elliptic Finite Element Problems. In Parallel Algorithms for Partial Differential Equations, Proceedings of the Sixth GAMM-Seminar. Friedr. Vieweg and Sohn, Braunschweig, Germany, 58–69.Google Scholar
    18. Essex Edwards and Robert Bridson. 2015. The Discretely-Discontinuous Galerkin Coarse Grid for Domain Decomposition. CoRR abs/1504.00907 (2015). arXiv:1504.00907Google Scholar
    19. Charbel Farhat, Michel Lesoinne, Patrick LeTallec, Kendall Pierson, and Daniel Rixen. 2001. FETI-DP: A Dual-Primal Unified FETI Method—Part I: A Faster Alternative to the Two-Level FETI Method. Internat. J. Numer. Methods Engrg. 50, 7 (2001), 1523–1544.Google ScholarCross Ref
    20. Charbel Farhat and Francois-Xavier Roux. 1991. A Method of Finite Element Tearing and Interconnecting and Its Parallel Solution Algorithm. Internat. J. Numer. Methods Engrg. 32, 6 (1991), 1205–1227.Google ScholarCross Ref
    21. Marco Fratarcangeli, Valentina Tibaldo, and Fabio Pellacini. 2016. Vivace: A Practical Gauss-Seidel Method for Stable Soft Body Dynamics. ACM Trans. Graph. (SIGGRAPH Asia) 35, 6, Article 214 (Nov. 2016), 9 pages.Google Scholar
    22. Andreas Frommer and Daniel B. Szyld. 1999. Weighted Max Norms, Splittings, and Overlapping Additive Schwarz Iterations. Numer. Math. 83, 2 (01 Aug 1999), 259–278.Google Scholar
    23. Martin J. Gander. 2008. Schwarz Methods over the Course of Time. Electronic Transactions on Numerical Analysis 31 (2008), 228–255.Google Scholar
    24. Theodoros Gkountouvas, Vasileios Karakasis, Kornilios Kourtis, Georgios Goumas, and Nectarios Koziris. 2013. Improving the Performance of the Symmetric Sparse Matrix-Vector Multiplication in Multicore. In Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing. 273–283.Google ScholarDigital Library
    25. Gundolf Haase, Ulrich Langer, and Arnd Meyer. 1991. The Approximate Dirichlet Domain Decomposition Method. Part I: An Algebraic Approach. Computing 47, 2 (01 Jun 1991), 137–151.Google Scholar
    26. Liliya Kharevych, Weiwei Yang, Yiying Tong, Eva Kanso, Jerrold E. Marsden, Peter Schröder, and Mathieu Desbrun. 2006. Geometric, Variational Integrators for Computer Animation. In Proceedings of SCA (Vienna, Austria). 43–51.Google Scholar
    27. Theodore Kim and Doug L. James. 2011. Physics-Based Character Skinning Using Multi-Domain Subspace Deformations. In Proceedings of SCA (Vancouver, British Columbia, Canada). 63–72.Google Scholar
    28. Dilip Krishnan, Raanan Fattal, and Richard Szeliski. 2013. Efficient Preconditioning of Laplacian Matrices for Computer Graphics. ACM Trans. Graph. 32, 4, Article 142 (July 2013), 15 pages.Google ScholarDigital Library
    29. Christian Lauterbach, Michael Garland, Shubhabrata Sengupta, David Luebke, and Dinesh Manocha. 2009. Fast BVH Construction on GPUs. Computer Graphics Forum 28, 2 (2009), 375–384.Google ScholarCross Ref
    30. Yongjoon Lee, Sung-Eui Yoon, Seungwoo Oh, Duksu Kim, and Sunghee Choi. 2010. Multi-Resolution Cloth Simulation. Comput. Graph. Forum (Pacific Graphics) 29, 7 (2010), 2225–2232.Google ScholarCross Ref
    31. Minchen Li, Zachary Ferguson, Teseo Schneider, Timothy Langlois, Denis Zorin, Daniele Panozzo, Chenfanfu Jiang, and Danny M. Kaufman. 2020. Incremental Potential Contact: Intersection-and Inversion-Free, Large-Deformation Dynamics. ACM Trans. Graph. (SIGGRAPH) 39, 4, Article 49 (jul 2020), 20 pages.Google Scholar
    32. Minchen Li, Ming Gao, Timothy Langlois, Chenfanfu Jiang, and Danny M. Kaufman. 2019. Decomposed Optimization Time Integrator for Large-Step Elastodynamics. ACM Trans. Graph. (SIGGRAPH) 38, 4, Article 70 (jul 2019), 10 pages.Google ScholarDigital Library
    33. Ruipeng Li and Yousef Saad. 2017. Low-Rank Correction Methods for Algebraic Domain Decomposition Preconditioners. SIAM J. Matrix Anal. Appl. 38, 3 (2017), 807–828.Google ScholarDigital Library
    34. Tiantian Liu, Adam W. Bargteil, James F. O’Brien, and Ladislav Kavan. 2013. Fast Simulation of Mass-Spring Systems. ACM Trans. Graph. (SIGGRAPH Asia) 32, 6, Article 214 (Nov. 2013), 7 pages.Google Scholar
    35. Tiantian Liu, Sofien Bouaziz, and Ladislav Kavan. 2017. Quasi-Newton Methods for Real-Time Simulation of Hyperelastic Materials. ACM Trans. Graph. 36, 3, Article 23 (may 2017), 16 pages.Google ScholarDigital Library
    36. Jan Mandel. 1993. Balancing Domain Decomposition. Communications in Numerical Methods in Engineering 9, 3 (1993), 233–241.Google ScholarCross Ref
    37. Sebastian Martin, Bernhard Thomaszewski, Eitan Grinspun, and Markus Gross. 2011. Example-Based Elastic Materials. ACM Trans. Graph. (SIGGRAPH) 30, 4, Article 72 (July 2011), 8 pages.Google ScholarDigital Library
    38. Matthias Müller, Bruno Heidelberger, Matthias Teschner, and Markus Gross. 2005. Meshless Deformations Based on Shape Matching. ACM Trans. Graph. (SIGGRAPH) 24, 3 (July 2005), 471–478.Google ScholarDigital Library
    39. Rajib Nath, Stanimire Tomov, Tingxing “Tim” Dong, and Jack Dongarra. 2011. Optimizing Symmetric Dense Matrix-Vector Multiplication on GPUs. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (Seattle, Washington). Article 6, 10 pages.Google ScholarDigital Library
    40. Maxim Naumov, Marat Arsaev, Patrice Castonguay, Jonathan Cohen, Julien Demouth, Joe Eaton, Simon Layton, Nikolay Markovskiy, Istvan Reguly, Nikolai Sakharnykh, Vijay Sellappan, and Robert Strzodka. 2015. AmgX: A Library for GPU Accelerated Algebraic Multigrid and Preconditioned Iterative Methods. SIAM Journal on Scientific Computing 37, 5 (2015), 602–626.Google ScholarDigital Library
    41. Roy A. Nicolaides. 1987. Deflation of Conjugate Gradients with Applications to Boundary Value Problems. SIAM J. Numer. Anal. 24, 2 (1987), 355–365.Google ScholarDigital Library
    42. NVIDIA. 2021. AmgX. https://developer.nvidia.com/amgx.Google Scholar
    43. Michael Ortiz and Laurent Stainier. 1999. The Variational Formulation of Viscoplastic Constitutive Updates. Computer Methods in Applied Mechanics and Engineering 171, 3 (1999), 419–444.Google ScholarCross Ref
    44. Garry Rodrigue, Kang LiShan, and Liu Yu-Hui. 1989. Convergence and Comparison Analysis of Some Numerical Schwarz Methods. Numer. Math. 56, 2 (01 Feb 1989), 123–138.Google Scholar
    45. Yousef Saad. 1994. ILUT: A Dual Threshold Incomplete LU Factorization. Numerical Linear Algebra with Applications 1, 4 (1994), 387–402.Google ScholarCross Ref
    46. Hermann A. Schwarz. 1870. Ueber Einen Grenz Übergang Durch Alternirendes Verfahren. Wolf J. XV. 272–286.Google Scholar
    47. N. Spillane and D.J. Rixen. 2013. Automatic Spectral Coarse Spaces for Robust Finite Element Tearing and Interconnecting and Balanced Domain Decomposition Algorithms. Internat. J. Numer. Methods Engrg. 95, 11 (2013), 953–990.Google ScholarCross Ref
    48. Rasmus Tamstorf, Toby Jones, and Stephen F. McCormick. 2015. Smoothed Aggregation Multigrid for Cloth Simulation. ACM Trans. Graph. (SIGGRAPH Asia) 34, 6, Article 245 (Oct. 2015), 13 pages.Google Scholar
    49. Min Tang, Zhongyuan Liu, Ruofeng Tong, and Dinesh Manocha. 2018. PSCC: Parallel Self-Collision Culling with Spatial Hashing on GPUs. Proc. ACM Comput. Graph. Interact. Tech. 1, 1, Article 18 (July 2018), 18 pages.Google ScholarDigital Library
    50. Joseph Teran, Silvia Blemker, Victor Ng-Thow-Hing, and Ronald Fedkiw. 2003. Finite Volume Methods for the Simulation of Skeletal Muscle. In Proceedings of SCA 2003 (San Diego, California). 68–74.Google Scholar
    51. Joseph Teran, Eftychios Sifakis, Geoffrey Irving, and Ronald Fedkiw. 2005. Robust Quasistatic Finite Elements and Flesh Simulation. In Proceedings of SCA (Los Angeles, California). 181–190.Google ScholarDigital Library
    52. Demetri Terzopoulos, John Platt, Alan Barr, and Kurt Fleischer. 1987. Elastically Deformable Models. Computer Graphics (SIGGRAPH 87) 21, 4 (Aug. 1987), 205–214.Google Scholar
    53. Pascal Volino, Nadia Magnenat-Thalmann, and Francois Faure. 2009. A Simple Approach to Nonlinear Tensile Stiffness for Accurate Cloth Simulation. ACM Trans. Graph. 28, 4, Article 105 (Sept. 2009), 16 pages.Google ScholarDigital Library
    54. Huamin Wang. 2015. A Chebyshev Semi-Iterative Approach for Accelerating Projective and Position-Based Dynamics. ACM Trans. Graph. (SIGGRAPH Asia) 34, 6, Article 246 (Oct. 2015), 9 pages.Google Scholar
    55. Huamin Wang. 2021. GPU-Based Simulation of Cloth Wrinkles at Submillimeter Levels. ACM Trans. Graph. 40, 4, Article 169 (July 2021), 14 pages.Google ScholarDigital Library
    56. Huamin Wang and Yin Yang. 2016. Descent Methods for Elastic Body Simulation on the GPU. ACM Trans. Graph. (SIGGRAPH Asia) 35, 6, Article 212 (Nov. 2016), 10 pages.Google Scholar
    57. Zhendong Wang, Longhua Wu, Marco Fratarcangeli, Min Tang, and Huamin Wang. 2018. Parallel Multigrid for Nonlinear Cloth Simulation. Comput. Graph. Forum (Pacific Graphics) 37, 7 (2018), 131–141.Google ScholarCross Ref
    58. Joerg Willems. 2013. Spectral Coarse Spaces in Robust Two-Level Schwarz Methods. In Numerical Solution of Partial Differential Equations: Theory, Algorithms, and Their Applications. Springer Proceedings in Mathematics & Statistics, Vol. 45. Springer, 303–326.Google Scholar
    59. Longhua Wu, Botao Wu, Yin Yang, and Huamin Wang. 2020. A Safe and Fast Repulsion Method for GPU-Based Cloth Self Collisions. ACM Trans. Graph. 40, 1, Article 5 (Dec. 2020), 18 pages.Google Scholar
    60. Xiaofeng Wu, Rajaditya Mukherjee, and Huamin Wang. 2015. A Unified Approach for Subspace Simulation of Deformable Bodies in Multiple Domains. ACM Trans. Graph. (SIGGRAPH Asia) 34, 6, Article 241 (Oct. 2015), 9 pages.Google Scholar
    61. Zangyueyang Xian, Xin Tong, and Tiantian Liu. 2019. A Scalable Galerkin Multigrid Method for Real-Time Simulation of Deformable Objects. ACM Trans. Graph. 38, 6, Article 162 (Nov. 2019), 13 pages.Google ScholarDigital Library
    62. Yin Yang, Weiwei Xu, Xiaohu Guo, Kun Zhou, and Baining Guo. 2013. Boundary-Aware Multi-Domain Subspace Deformation. IEEE Transactions on Visualization and Computer Graphics 19, 10 (2013), 1633–1645.Google ScholarCross Ref
    63. Xuejun Zhang. 1992. Multilevel Schwarz Methods. Numer. Math. 63, 1 (01 Dec 1992), 521–539.Google Scholar
    64. Yongning Zhu, Eftychios Sifakis, Joseph Teran, and Achi Brandt. 2010. An Efficient Multigrid Method for the Simulation of High-resolution Elastic Solids. ACM Trans. Graph. 29, 2, Article 16 (April 2010), 18 pages.Google ScholarDigital Library

ACM Digital Library Publication:

Overview Page: