“Stylizing video by example” by Jamriška, Sochorová, Texler, Lukáč, Fišer, et al. …

  • ©Ondřej Jamriška, Šárka Sochorová, Ondřej Texler, Michal Lukáč, Jakub Fišer, Jingwan Lu, and Daniel Sýkora

Session Title:

    Video

Title:

    Stylizing video by example

Presenter(s)/Author(s):

    Ondřej Jamriška, Šárka Sochorová, Ondřej Texler, Michal Lukáč, Jakub Fišer, Jingwan Lu, and Daniel Sýkora

Abstract:


    We introduce a new example-based approach to video stylization that focuses on preserving the visual quality of the style, giving the user control, and remaining applicable to arbitrary video. Our method takes as input one or more keyframes that the artist stylizes with standard painting tools, then automatically propagates the stylization to the rest of the sequence. To do this while preserving visual quality, we developed a new type of guidance for state-of-the-art patch-based synthesis that can be applied to any kind of video content and requires no information beyond the video itself and a user-specified mask of the region to be stylized. We further present a temporal blending approach for interpolating style between keyframes that preserves texture coherence, contrast, and high-frequency details. We evaluate our method on various scenes from real production settings and provide a thorough comparison with prior art.
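
    The paper's actual temporal blending operator is not reproduced here. As a rough illustration of the problem it addresses, the NumPy sketch below cross-dissolves two stylized versions of the same frame (one propagated from the previous keyframe, one from the next) and then rescales the result's statistics, since a naive linear blend tends to wash out texture contrast. The function name and the statistics-matching step are illustrative assumptions, not the authors' method.

    ```python
    import numpy as np

    def blend_stylized_frames(from_prev, from_next, t, contrast_boost=True):
        """Blend two stylized renditions of the same frame at weight t in [0, 1].

        `from_prev` / `from_next` are the stylizations propagated from the
        surrounding keyframes. With `contrast_boost`, the blend's standard
        deviation is matched to the interpolated statistics of the inputs,
        counteracting the contrast loss of a plain cross-dissolve.
        """
        a = np.asarray(from_prev, dtype=np.float64)
        b = np.asarray(from_next, dtype=np.float64)
        out = (1.0 - t) * a + t * b  # naive cross-dissolve
        if contrast_boost:
            # Re-expand dynamic range toward the interpolated input contrast.
            target_std = (1.0 - t) * a.std() + t * b.std()
            if out.std() > 1e-8:
                out = (out - out.mean()) / out.std() * target_std + out.mean()
        return np.clip(out, 0.0, 255.0)
    ```

    At t = 0 or t = 1 the operator reduces to the respective input; in between, the rescaling keeps the blend's contrast at the interpolated level rather than letting it collapse toward the mean.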


