“Coupled Structure-From-Motion and 3D Symmetry Detection for Urban Façades” by Ceylan, Mitra, Zheng and Pauly

  • ©Duygu Ceylan, Niloy J. Mitra, Youyi Zheng, and Mark Pauly





Session/Category Title:   Depth for All Occasions




    Repeated structures are ubiquitous in urban facades. Such repetitions lead to ambiguity in establishing correspondences across sets of unordered images. A decoupled structure-from-motion reconstruction followed by symmetry detection often produces errors: outputs are either noisy and incomplete, or even worse, appear to be valid but actually have a wrong number of repeated elements. We present an optimization framework for extracting repeated elements in images of urban facades, while simultaneously calibrating the input images and recovering the 3D scene geometry using a graph-based global analysis. We evaluate the robustness of the proposed scheme on a range of challenging examples containing widespread repetitions and nondistinctive features. These image sets are common but cannot be handled well with state-of-the-art methods. We show that the recovered symmetry information along with the 3D geometry enables a range of novel image editing operations that maintain consistency across the images.


