“PopStage: The Generation of Stage Cross-Editing Video Based on Spatio-Temporal Matching” by Lee, Yoo, Cho, Kim, Im, et al. …

Conference:

    SIGGRAPH Asia 2022


Type(s):

    Technical Papers

Title:

    PopStage: The Generation of Stage Cross-Editing Video Based on Spatio-Temporal Matching

Session/Category Title:   Image Generation


Presenter(s)/Author(s):

    Lee, Yoo, Cho, Kim, Im, et al.

Abstract:


    A StageMix is a mixed video created by concatenating segments from multiple performance videos of the same song in a visually seamless manner, matching the main subject’s silhouette across the cut. We introduce PopStage, which allows users to generate a StageMix automatically. PopStage is designed around the StageMix Editing Guideline that we established by interviewing creators and observing their workflows. PopStage consists of two main steps: finding an editing path across the videos and generating a transition effect at each transition point. Using a reward function that favors visual connection and well-timed transitions across the videos, we obtain the optimal path that maximizes the sum of rewards through dynamic programming. Given the optimal path, PopStage then aligns the silhouettes of the main subject in each transitioning video pair to strengthen the visual connection at the transition point. The virtual camera view is next optimized to remove the black regions that the silhouette-alignment transformation often introduces, while minimizing pixel loss. In this process, we enforce the view to be as large as possible while maintaining temporal continuity across frames. Experimental results show that PopStage can generate a StageMix of quality comparable to those produced by professional creators, in a greatly reduced production time.
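
    The editing-path search described above lends itself to a compact dynamic-programming formulation. The Python sketch below is a minimal illustration under assumed inputs, not the paper’s implementation: the names visual_connection, timing, and stay_reward are hypothetical stand-ins for the reward terms in the abstract, and time is assumed to be discretized into steps already synchronized across the input videos.

        import numpy as np

        def optimal_editing_path(visual_connection, timing, stay_reward=1.0):
            """Pick which video to show at each time step.

            visual_connection: (T, V, V) array; quality of cutting from
                video i to video j at step t (hypothetical reward term).
            timing: (T,) array; suitability of step t as a cut point
                (hypothetical reward term).
            stay_reward: constant reward for not cutting, which favors
                longer uninterrupted segments.
            """
            T, V, _ = visual_connection.shape
            score = np.zeros((T, V))               # best total reward ending at (t, video)
            choice = np.zeros((T, V), dtype=int)   # best predecessor, for backtracking

            for t in range(1, T):
                for j in range(V):
                    # Either cut from some video i to video j at step t,
                    # or stay on video j and collect the stay reward.
                    candidates = score[t - 1] + timing[t] * visual_connection[t, :, j]
                    candidates[j] = score[t - 1, j] + stay_reward
                    choice[t, j] = int(np.argmax(candidates))
                    score[t, j] = candidates[choice[t, j]]

            # Backtrack from the best final state to recover the full path.
            path = [int(np.argmax(score[-1]))]
            for t in range(T - 1, 0, -1):
                path.append(int(choice[t, path[-1]]))
            return path[::-1]

        # Toy usage: 3 performance videos over 200 synchronized steps.
        rng = np.random.default_rng(0)
        path = optimal_editing_path(rng.random((200, 3, 3)), rng.random(200))

    Dynamic programming fits here because the reward at each step depends only on the previous video choice, so the globally optimal path, rather than a greedy one, can be recovered by backtracking, consistent with the abstract’s claim of maximizing the sum of rewards.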

References:


    1. Ido Arev, Hyun Soo Park, Yaser Sheikh, Jessica Hodgins, and Ariel Shamir. 2014. Automatic editing of footage from multiple social cameras. ACM Transactions on Graphics (TOG) 33, 4 (2014), 1–11.
    2. Jiamin Bai, Aseem Agarwala, Maneesh Agrawala, and Ravi Ramamoorthi. 2014. User-Assisted Video Stabilization. In Computer Graphics Forum, Vol. 33. Wiley Online Library, 61–70.
    3. Sophia Bano and Andrea Cavallaro. 2016. ViComp: composition of user-generated videos. Multimedia Tools and Applications 75, 12 (2016), 7187–7210.
    4. Thaddeus Beier and Shawn Neely. 1992. Feature-based image metamorphosis. ACM SIGGRAPH Computer Graphics 26, 2 (1992), 35–42.
    5. Dimitri P Bertsekas. 2000. Dynamic Programming and Optimal Control: Vol. 1. Athena Scientific, Belmont, MA.
    6. Seunghoon Cha, Jungjin Lee, Seunghwa Jeong, Younghui Kim, and Junyong Noh. 2020. Enhanced Interactive 360° Viewing via Automatic Guidance. ACM Transactions on Graphics (TOG) 39, 5 (2020), 1–15.
    7. Jianhui Chen, Lili Meng, and James J Little. 2018. Camera selection for broadcasting soccer games. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 427–435.
    8. Abe Davis and Maneesh Agrawala. 2018. Visual rhythm and beat. ACM Transactions on Graphics (TOG) 37, 4 (2018), 1–11.
    9. Vineet Gandhi, Remi Ronfard, and Michael Gleicher. 2014. Multi-clip video editing from a single viewpoint. In Proceedings of the 11th European Conference on Visual Media Production. 1–10.
    10. Michael L Gleicher and Feng Liu. 2008. Re-cinematography: Improving the camerawork of casual video. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 5, 1 (2008), 1–28.
    11. Matthias Grundmann, Vivek Kwatra, and Irfan Essa. 2011. Auto-directed video stabilization with robust L1 optimal camera paths. In CVPR 2011. IEEE, 225–232.
    12. Jianzhu Guo, Xiangyu Zhu, Yang Yang, Fan Yang, Zhen Lei, and Stan Z Li. 2020. Towards fast, accurate and stable 3D dense face alignment. In Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XIX. Springer, 152–168.
    13. Rachel Heck, Michael Wallick, and Michael Gleicher. 2007. Virtual videography. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 3, 1 (2007), 4–es.
    14. Ronald A Howard. 1960. Dynamic Programming and Markov Processes. (1960).
    15. Eakta Jain, Yaser Sheikh, Ariel Shamir, and Jessica Hodgins. 2015. Gaze-driven video re-editing. ACM Transactions on Graphics (TOG) 34, 2 (2015), 1–12.
    16. Kyoungkook Kang and Sunghyun Cho. 2019. Interactive and automatic navigation for 360° video playback. ACM Transactions on Graphics (TOG) 38, 4 (2019), 1–11.
    17. Cigdem Koçberber and Albert Ali Salah. 2014. Video retargeting: video saliency and optical flow based hybrid approach. In Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence.
    18. Lucas Kovar, Michael Gleicher, and Frédéric Pighin. 2008. Motion graphs. In ACM SIGGRAPH 2008 classes. 1–10.
    19. Dieter Kraft. 1994. Algorithm 733: TOMP-Fortran modules for optimal control calculations. ACM Transactions on Mathematical Software (TOMS) 20, 3 (1994), 262–281.
    20. Dieter Kraft. 1988. A software package for sequential quadratic programming. (1988).
    21. Moneish Kumar, Vineet Gandhi, Rémi Ronfard, and Michael Gleicher. 2017. Zooming on all actors: Automatic focus+context split screen video generation. In Computer Graphics Forum, Vol. 36. Wiley Online Library, 455–465.
    22. Mackenzie Leake, Abe Davis, Anh Truong, and Maneesh Agrawala. 2017. Computational video editing for dialogue-driven scenes. ACM Transactions on Graphics (TOG) 36, 4 (2017), Article 130.
    23. Di Liu, Zhaogai Wu, Xianming Lin, and Rongrong Ji. 2016. Towards perceptual video cropping with curve fitting. Multimedia Tools and Applications 75, 20 (2016), 12465–12475.
    24. Feng Liu and Michael Gleicher. 2006. Video retargeting: automating pan and scan. In Proceedings of the 14th ACM international conference on Multimedia. 241–250.
    25. Qiong Liu, Don Kimber, Jonathan Foote, Lynn Wilcox, and John Boreczky. 2002. FLYSPEC: A multi-user video camera system with hybrid human and automatic control. In Proceedings of the tenth ACM international conference on Multimedia. 484–492.
    26. KL Bhanu Moorthy, Moneish Kumar, Ramanathan Subramanian, and Vineet Gandhi. 2020. GAZED-Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–11.
    27. Anyi Rao, Jiaze Wang, Linning Xu, Xuekun Jiang, Qingqiu Huang, Bolei Zhou, and Dahua Lin. 2020. A unified framework for shot type classification based on subject centric lens. In European Conference on Computer Vision. Springer, 17–34.
    28. Yong Rui, Anoop Gupta, Jonathan Grudin, and Liwei He. 2004. Automating lecture capture and broadcast: technology and videography. Multimedia Systems 10, 1 (2004), 3–15.
    29. Mukesh Kumar Saini, Raghudeep Gadde, Shuicheng Yan, and Wei Tsang Ooi. 2012. Movimash: online mobile video mashup. In Proceedings of the 20th ACM international conference on Multimedia. 139–148.
    30. Mohamed Sayed, Robert Cinca, Enrico Costanza, and Gabriel Brostow. 2022. LookOut! Interactive Camera Gimbal Controller for Filming Long Takes. ACM Transactions on Graphics (TOG) 41, 3 (2022), 1–16.
    31. Fuhao Shi, Sung-Fang Tsai, Youyou Wang, and Chia-Kai Liang. 2019. Steadiface: Real-Time Face-Centric Stabilization On Mobile Phones. In 2019 IEEE International Conference on Image Processing (ICIP). IEEE, 4599–4603.
    32. Prarthana Shrestha, Peter HN de With, Hans Weda, Mauro Barbieri, and Emile HL Aarts. 2010. Automatic mashup generation from multiple-camera concert recordings. In Proceedings of the 18th ACM international conference on Multimedia. 541–550.
    33. Yu-Chuan Su and Kristen Grauman. 2017. Making 360° video watchable in 2D: Learning videography for click free viewing. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 1368–1376.
    34. Yu-Chuan Su, Dinesh Jayaraman, and Kristen Grauman. 2016. Pano2Vid: Automatic Cinematography for Watching 360° Videos. In Asian Conference on Computer Vision. Springer, 154–171.
    35. Min Sun, Ali Farhadi, Ben Taskar, and Steve Seitz. 2014. Salient montages from unconstrained videos. In European Conference on Computer Vision. Springer, 472–488.
    36. Yoshinao Takemae, Kazuhiro Otsuka, and Naoki Mukawa. 2004. Impact of video editing based on participants’ gaze in multiparty conversation. In CHI ’04 Extended Abstracts on Human Factors in Computing Systems. 1333–1336.
    37. Adobe Inc. 2022. Adobe Photoshop. https://www.adobe.com/products/photoshop
    38. Adobe Inc. 2022. Adobe Premiere Pro. https://www.adobe.com/products/premiere
    39. Adrien Treuille, Yongjoon Lee, and Zoran Popović. 2007. Near-optimal character animation with continuous control. In ACM SIGGRAPH 2007 papers. 7–es.
    40. Anh Truong, Floraine Berthouzoz, Wilmot Li, and Maneesh Agrawala. 2016. Quickcut: An interactive tool for editing narrated video. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology. 497–507.
    41. Yu-Shuen Wang, Hui-Chih Lin, Olga Sorkine, and Tong-Yee Lee. 2010. Motion-based video retargeting with optimized crop-and-warp. In ACM SIGGRAPH 2010 papers. 1–9.
    42. Jung Eun Yoo, Kwanggyoon Seo, Sanghun Park, Jaedong Kim, Dawon Lee, and Junyong Noh. 2021. Virtual Camera Layout Generation using a Reference Video. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–11.
    43. Fang-Lue Zhang, Jue Wang, Han Zhao, Ralph R Martin, and Shi-Min Hu. 2015. Simultaneous camera path optimization and distraction removal for improving amateur video. IEEE Transactions on Image Processing 24, 12 (2015), 5982–5994.
    44. Lei Zhang, Xiao-Quan Chen, Xin-Yi Kong, and Hua Huang. 2017. Geodesic video stabilization in transformation space. IEEE Transactions on Image Processing 26, 5 (2017), 2219–2229.
    45. Xuaner Zhang, Kevin Matzen, Vivien Nguyen, Dillon Yao, You Zhang, and Ren Ng. 2019. Synthetic defocus and look-ahead autofocus for casual videography. ACM Transactions on Graphics (TOG) 38, 4 (2019), 1–16.


ACM Digital Library Publication:



Overview Page:



Submit a story:

If you would like to submit a story about this presentation, please contact us: historyarchives@siggraph.org