Action-driven 3D indoor scene evolution

We introduce a framework for action-driven evolution of 3D indoor scenes. The goal is to simulate how human actions alter a scene, specifically through the object placements those actions require. To this end, we develop an action model in which each type of action combines one or more human poses, one or more object categories, and spatial configurations of objects in those categories that summarize the object-object and object-human relations for the action. Importantly, all of this information is learned from annotated photos. We analyze correlations between the learned actions to guide the construction of an action graph. Starting from an initial 3D scene, we probabilistically sample a sequence of actions from the action graph to drive progressive scene evolution. Each action triggers appropriate object placements based on the object co-occurrences and spatial configurations learned for the action model. We show scene evolution results that are realistic, messy 3D scenes, and report user studies that compare our method with manual scene creation and with state-of-the-art data-driven methods in terms of scene plausibility and naturalness.
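The sampling loop described in the abstract can be illustrated with a minimal sketch. It assumes the action graph can be treated as a weighted directed graph whose outgoing edge weights act as transition probabilities, and that each action carries per-category co-occurrence probabilities. The action names, object categories, probabilities, and function names below are all hypothetical and invented for illustration; the abstract gives no such values. Spatial placement from the learned configurations is omitted, so a "scene" here is only a list of object category names.

```python
import random

# Hypothetical action graph: each action maps to successor actions with
# made-up transition probabilities (not taken from the paper).
ACTION_GRAPH = {
    "use_laptop": {"drink_coffee": 0.6, "read_book": 0.4},
    "drink_coffee": {"use_laptop": 0.7, "read_book": 0.3},
    "read_book": {"use_laptop": 0.5, "drink_coffee": 0.5},
}

# Hypothetical object categories per action, each with an assumed
# probability of being placed when the action occurs.
ACTION_OBJECTS = {
    "use_laptop": {"laptop": 1.0, "mouse": 0.5},
    "drink_coffee": {"cup": 1.0, "plate": 0.2},
    "read_book": {"book": 1.0, "lamp": 0.3},
}


def sample_next_action(current, rng):
    """Pick a successor of `current` in proportion to its edge weight."""
    successors = ACTION_GRAPH[current]
    names = list(successors)
    weights = [successors[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def apply_action(scene, action, rng):
    """Append each of the action's object categories with its probability."""
    for category, prob in ACTION_OBJECTS[action].items():
        if rng.random() < prob:
            scene.append(category)


def evolve_scene(initial_scene, start_action, num_steps, seed=0):
    """Sample `num_steps` actions and apply their placements in order."""
    rng = random.Random(seed)
    scene = list(initial_scene)
    actions = [start_action]
    apply_action(scene, start_action, rng)
    for _ in range(num_steps - 1):
        next_action = sample_next_action(actions[-1], rng)
        actions.append(next_action)
        apply_action(scene, next_action, rng)
    return actions, scene


if __name__ == "__main__":
    actions, scene = evolve_scene(["desk", "chair"], "use_laptop", num_steps=5)
    print("actions:", actions)
    print("scene:", scene)
```

Treating the graph as a first-order random walk is a simplification chosen for brevity; a fuller version would also position each sampled object relative to the human pose and to existing objects.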

SIGGRAPH Asia 2016: Technical Papers

ACM SIGGRAPH HISTORY ARCHIVES

“Action-driven 3D indoor scene evolution” by Ma, Li, Zou, Liao, Tong, et al. …
