Opt: A Domain Specific Language for Non-Linear Least Squares Optimization in Graphics and Imaging

Many graphics and vision problems can be expressed as non-linear least squares optimizations of objective functions over visual data, such as images and meshes. The mathematical descriptions of these functions are extremely concise, but their implementation in real code is tedious, especially when optimized for real-time performance on modern GPUs in interactive applications. In this work, we propose a new language, Opt,¹ for writing these objective functions over image- or graph-structured unknowns concisely and at a high level. Our compiler automatically transforms these specifications into state-of-the-art GPU solvers based on Gauss-Newton or Levenberg-Marquardt methods. Opt can generate different variations of the solver, so users can easily explore tradeoffs in numerical precision, matrix-free methods, and solver approaches.

In our results, we implement a variety of real-world graphics and vision applications. Their energy functions are expressible in tens of lines of code and produce highly optimized GPU solver implementations. These solvers are competitive in performance with the best published hand-tuned, application-specific GPU solvers, and orders of magnitude beyond a general-purpose auto-generated solver.

References:

Sameer Agarwal, Keir Mierle et al. 2010. Ceres Solver. Retrieved from http://ceres-solver.org.
Sameer Agarwal, Noah Snavely, Ian Simon, Steven M. Seitz, and Richard Szeliski. 2009. Building Rome in a day. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision. IEEE, 72–79.
Marc Alexa, Daniel Cohen-Or, and David Levin. As-rigid-as-possible shape interpolation. In Proceedings of the Special Interest Group on Computer Graphics (SIGGRAPH’00). New York.
Stefania Bellavia, Jacek Gondzio, and Benedetta Morini. 2013. A matrix-free preconditioner for sparse symmetric positive definite systems and least-squares problems. SIAM J. Sci. Comput. 35, 1 (2013), A192–A211.
Gilbert Louis Bernstein, Chinmayee Shah, Crystal Lemire, Zachary Devito, Matthew Fisher, Philip Levis, and Pat Hanrahan. 2016. Ebb: A DSL for physical simulation on CPUs and GPUs. ACM Trans. Graph. (TOG) 35, 2 (2016), 21.
Ake Björck. 1996. Numerical Methods for Least Squares Problems. Siam.
Sergey Bochkanov. 1999. ALGLIB. Retrieved from http://www.alglib.net.
Nicolas Bonneel, Kalyan Sunkavalli, James Tompkin, Deqing Sun, Sylvain Paris, and Hanspeter Pfister. 2014. Interact. Intrinsic Video Edit. 33, 6 (Nov. 2014), 197:1–10. DOI:https://doi.org/10.1145/2661229.2661253
Stephen Boyd and Lieven Vandenberghe. 2004. Convex Optimization. Cambridge University Press, New York, NY.
Martine Ceberio and Vladik Kreinovich. 2004. Greedy algorithms for optimizing multivariate Horner schemes. SIGSAM Bull. 38, 1 (March 2004), 8–15. DOI:https://doi.org/10.1145/980175.980179
Angela Dai, Matthias Nießner, Michael Zollhöfer, Shahram Izadi, and Christian Theobalt. 2017. BundleFusion: Real-time globally consistent 3D reconstruction using on-the-fly surface reintegration. ACM Trans. Graph. (TOG) 36, 3 (2017), 24.
Frank Dellaert. 2012. GTSAM. Retrieved from https://collab.cc.gatech. edu/borg/gtsam.
Ron S. Dembo, Stanley C. Eisenstat, and Trond Steihaug. 1982. Inexact Newton methods. SIAM J. Numer. Anal. 19, 2 (1982), 400–408.
Ron S. Dembo and Trond Steihaug. 1983. Truncated-Newtono algorithms for large-scale unconstrained optimization. Math. Program. 26, 2 (1983), 190–212.
Mathieu Desbrun, Mark Meyer, Peter Schröder, and Alan H. Barr. 1999. Implicit fairing of irregular meshes using diffusion and curvature flow. In Proceedings of the Special Interest Group on Computer Graphics (SIGGRAPH’99). New York.
Zachary DeVito, James Hegarty, Alex Aiken, Pat Hanrahan, and Jan Vitek. 2013. Terra: A multi-stage language for high-performance computing. In Proceedings of the Conference on Programming Language, Design, and Implementation (PLDI’13). New York, 105–116.
Marek Dvoroznak. 2014. Interactive as-rigid-as-possible image deformation and registration. In Proceedings of the 18th Central European Seminar on Computer Graphics.
R. Fourer, D. M. Gay, and B. Kernighan. 1989. Algorithms and model formulations in mathematical programming. Springer-Verlag New York, Inc., New York, Chapter AMPL: A Mathematical Programming Language, 150–151. Retrieved from http://dl.acm.org/citation.cfm?id=107479.107491
Michael Grant and Stephen Boyd. 2008. Graph implementations for nonsmooth convex programs. In Recent Advances in Learning and Control. Springer-Verlag Limited, 95–110.
Michael Grant and Stephen Boyd. 2014. CVX: Matlab Software for Disciplined Convex Programming, version 2.1. Retrieved from http://cvxr.com/cvx.
Serge Gratton, Amos S. Lawless, and Nancy K. Nichols. 2007. Approximate Gauss-Newton methods for nonlinear least squares problems. SIAM J. Optim. 18, 1 (2007), 106–132.
Andreas Griewank and Andrea Walther. 2008. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation (2nd ed.). SIAM, Philadelphia, PA.
Marcus J. Grote and Thomas Huckle. 1997. Parallel preconditioning with sparse approximate inverses. SIAM J. Sci. Comput. 18, 3 (1997), 838–853.
F. Grund. 1982. Automatic differentiation: Techniques and applications. Lecture notes in computer science 120. ZAMM 62, 7 (1982), 355–355. DOI:https://doi.org/10.1002/zamm.19820620735
Gaël Guennebaud, Benoît Jacob, and others. 2010. Eigen v3. Retrieved from http://eigen.tuxfamily.org.
Brian Guenter. 2007. Efficient symbolic differentiation for graphics applications. In Proceedings of the Special Interest Group on Computer Graphics (SIGGRAPH’07). Article 108.
Brian Guenter, John Rapp, and Mark Finch. 2011. Symbolic Differentiation in GPU Shaders. Technical Report MSR-TR-2011-31. Retrieved from http://research.microsoft.com/apps/pubs/default.aspx?id=146019.
Felix Heide, Steven Diamond, Matthias Nießner, Jonathan Ragan-Kelley, Wolfgang Heidrich, and Gordon Wetzstein. 2016. ProxImaL: Efficient image optimization using proximal algorithms. ACM Trans. Graph. (TOG) 35, 4 (2016), 84.
Berthold K. P. Horn and Brian G. Schunck. 1981. Determining optical flow. Artific. Intell. 17 (1981), 185–203.
Matthias Innmann, Michael Zollhöfer, Matthias Nießner, Christian Theobalt, and Marc Stamminger. 2016. VolumeDeform: Real-time volumetric non-rigid reconstruction. In Proceedings of the European Conference on Computer Vision. Springer, 362–379.
Liliya Kharevych, Boris Springborn, and Peter Schröder. 2006. Discrete conformal mappings via circle patterns. ACM Trans. Graph. (TOG) 25, 2 (2006), 412–438.
Fredrik Kjolstad, Shoaib Kamil, Jonathan Ragan-Kelley, David I. W. Levin, Shinjiro Sueda, Desai Chen, Etienne Vouga, Danny M. Kaufman, Gurtej Kanwar, Wojciech Matusik, and others. 2016. Simit: A language for physical simulation. ACM Trans. Graph. (TOG) 35, 2 (2016), 20.
Dana A. Knoll and David E. Keyes. 2004. Jacobian-free Newton–Krylov methods: A survey of approaches and applications. J. Comput. Phys. 193, 2 (2004), 357–397.
Rainer Kummerle, Giorgio Grisetti, Hauke Strasdat, Kurt Konolige, and Wolfram Burgard. 2011. g2o: A general framework for graph optimization. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA’11). IEEE, 3607–3613.
Edwin H. Land and John J. McCann. 1971. Lightness and retinex theory. J. Optic. Soc. Amer. 61, 1 (Jan. 1971), 1–11. DOI:https://doi.org/10.1364/JOSA.61.000001
Kenneth Levenberg. 1944. A method for the solution of certain non-linear problems in least squares. Quart. Appl. Math. 2, 2 (1944), 164–168.
Zohar Levi and Denis Zorin. 2014. Strict minimizers for geometric optimization. ACM Trans. Graph. (TOG) 33, 6 (2014), 185.
Hao Li, Bart Adams, Leonidas J. Guibas, and Mark Pauly. 2009. Robust single-view geometry and motion reconstruction. In ACM Trans. Graph. (TOG), Vol. 28. ACM, 175.
Donald W. Marquardt. 1963. An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Industr. Appl. Math. 11, 2 (1963), 431–441.
Hermann Matthies and Gilbert Strang. 1979. The solution of nonlinear finite element equations. Int. J. Numer. Methods Eng. 14, 11 (1979), 1613–1626.
Abhimitra Meka, Michael Zollhöfer, Christian Richardt, and Christian Theobalt. 2016. Live intrinsic video. ACM Trans. Graph. (Proceedings SIGGRAPH) 35, 4 (2016). DOI:https://doi.org/10.1145/2897824.2925907
Mark Meyer, Mathieu Desbrun, Peter Schrder, and Alan H. Barr. 2003. Discrete differential-geometry operators for triangulated 2-manifolds. In Visualization and Mathematics III, Hans-Christian Hege and Konrad Polthier (Eds.). Springer, Berlin, 35–57. DOI:https://doi.org/10.1007/978-3-662-05105-4_2
Stephen G. Nash. 2000. A survey of truncated-Newton methods. J. Comput. Appl. Math. 124, 1 (2000), 45–59.
Jorge Nocedal and Stephen Wright. 2006a. Numerical Optimization. Springer Science 8 Business Media.
J. Nocedal and S. J. Wright. 2006b. Numerical Optimization (2nd ed.). Springer, New York.
NVIDIA 2012. CUDA CUSPARSE Library. NVIDIA.
Patrick Pérez, Michel Gangnet, and Andrew Blake. 2003. Poisson image editing. In Proceedings of the Special Interest Group on Computer Graphics (SIGGRAPH’03). ACM, New York, NY, 313–318. DOI:https://doi.org/10.1145/1201775.882269
Jonathan Ragan-Kelley, Andrew Adams, Sylvain Paris, Marc Levoy, Saman Amarasinghe, and Frédo Durand. 2012. Decoupling algorithms from schedules for easy optimization of image processing pipelines. ACM Trans. Graph. 31, 4, Article 32 (July 2012), 12 pages. DOI:https://doi.org/10.1145/2185520.2185528
Noah Snavely, Steven M. Seitz, and Richard Szeliski. 2006. Photo tourism: Exploring photo collections in 3D. In ACM Trans. Graph. (TOG) 25, 835–846.
Olga Sorkine and Marc Alexa. 2007. As-rigid-as-possible surface modeling. In Proceedings of the Symposium on Geometry Processing, Vol. 4.
Robert W. Sumner, Johannes Schmid, and Mark Pauly. 2007. Embedded deformation for shape manipulation. In Proceedings of the Special Interest Group on Computer Graphics (SIGGRAPH’07). New York, Article 80.
SymPy Development Team. 2014. SymPy: Python Library for Symbolic Mathematics. Retrieved from http://www.sympy.org.
The MathWorks Inc. 2015. MATLAB. https://de.mathworks.com/products/matlab.html.
J. Thies, M. Zollhöfer, M. Nießner, L. Valgaerts, M. Stamminger, and C. Theobalt. 2015. Real-time expression transfer for facial reenactment. ACM Trans. Graph. (TOG) 34, 6 (2015), 183:1–183:14.
J. Thies, M. Zollhöfer, M. Stamminger, C. Theobalt, and M. Nießner. 2016a. Face2Face: Real-time face capture and reenactment of RGB videos. In Proceedings of the Computer Vision and Pattern Recognition (CVPR’16). IEEE.
Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, and Matthias Nießner. 2016b. FaceVR: Real-time facial reenactment and eye gaze control in virtual reality. arXiv preprint arXiv:1610.03151 (2016).
Michael D. Tiemann. 1989. The GNU Instruction Scheduler. Technical Report. Cambridge, MA.
Bill Triggs, Philip F. McLauchlan, Richard I. Hartley, and Andrew W. Fitzgibbon. 1999. Bundle adjustment—a modern synthesis. In Vision Algorithms: Theory and Practice. Springer, 298–372.
Daniel Vlasic, Ilya Baran, Wojciech Matusik, and Jovan Popović. 2008. Articulated mesh animation from multi-view silhouettes. In ACM Trans. Graph. (TOG), Vol. 27. ACM, 97.
Daniel Vlasic, Pieter Peers, Ilya Baran, Paul Debevec, Jovan Popović, Szymon Rusinkiewicz, and Wojciech Matusik. 2009. Dynamic shape capture using multi-view photometric stereo. ACM Trans. Graph. (TOG) 28, 5 (2009), 174.
A. Walther and A. Griewank. 2012. Getting started with ADOL-C. In Combinatorial Scientific Computing, U. Naumann and O. Schenk (Eds.). Chapman-Hall CRC Computational Science, Chapter 7, 181–202.
Michael Wand, Philipp Jenke, Qixing Huang, Martin Bokeloh, Leonidas Guibas, and Andreas Schilling. 2007. Reconstruction of deforming geometry from time-varying point clouds. In Proceedings of the Symposium on Geometry Processing. Citeseer, 49–58.
Daniel Weber, Jan Bender, Markus Schnoes, André Stork, and Dieter Fellner. 2013. Efficient GPU data structures and methods to solve sparse linear systems in dynamics applications. In Proceedings of the Computer Graphics Forum, Vol. 32. Wiley Online Library, 16–26.
Cornelius Wefelscheid and Olaf Hellwich. 2013. OpenOF: Framework for sparse non-linear least squares optimization on a GPU. In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP’13).
Wolfram Research. 2000. Mathematica. (2000).
S. J. Wright and John Norman Holt. 1985. An inexact Levenberg-Marquardt method for large sparse nonlinear least squres. J. Austral. Math. Soc. Ser. B. Appl. Math. 26, 04 (1985), 387–403.
Changchang Wu, Sameer Agarwal, Brian Curless, and Steven M. Seitz. 2011. Multicore bundle adjustment. In Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’11). IEEE, 3057–3064.
Chenglei Wu, Michael Zollhöfer, Matthias Nießner, Marc Stamminger, Shahram Izadi, and Christian Theobalt. 2014. Real-time shading-based refinement for consumer depth cameras. ACM Trans. Graph. (TOG) 33, 6 (2014), 200:1–200:10.
Wei Xu, Ning Zheng, and Ken Hayami. 2016. Jacobian-free implicit inner-iteration preconditioner for nonlinear least squares problems. Journal of Scientific Computing 68, 3 (2016), 1055–1081.
Yuting Yang, Sam Prestwood, and Connelly Barnes. 2016. VizGen: Accelerating visual computing prototypes in dynamic languages. ACM Trans. Graph. (TOG) 35, 6 (2016), 206.
Christopher Zach. 2014. Robust bundle adjustment revisited. In Computer Vision–ECCV 2014. Springer, 772–787.
Michael Zollhöfer, Angela Dai, Matthias Innmann, Chenglei Wu, Marc Stamminger, Christian Theobalt, and Matthias Nießner. 2015. Shading-based refinement on volumetric signed distance functions. ACM Trans. Graph. 34, 4, Article 96 (July 2015), 14 pages. DOI:https://doi.org/10.1145/2766887
Michael Zollhöfer, Matthias Nießner, Shahram Izadi, Christoph Rhemann, Christopher Zach, Matthew Fisher, Chenglei Wu, Andrew Fitzgibbon, Charles Loop, Christian Theobalt, and Marc Stamminger. 2014. Real-time non-rigid reconstruction using an RGB-D camera. ACM Trans. Graph. (TOG) 33, 4 (2014).