Bringing portraits to life

We present a technique to automatically animate a still portrait, making it possible for the subject in the photo to come to life and express various emotions. We use a driving video (of a different subject) and develop means to transfer the expressiveness of the subject in the driving video to the target portrait. In contrast to previous work that requires an input video of the target face to reenact a facial performance, our technique uses only a single target image. We animate the target image through 2D warps that imitate the facial transformations in the driving video. As warps alone do not carry the full expressiveness of the face, we add fine-scale dynamic details which are commonly associated with facial expressions such as creases and wrinkles. Furthermore, we hallucinate regions that are hidden in the input target face, most notably in the inner mouth. Our technique gives rise to reactive profiles, where people in still images can automatically interact with their viewers. We demonstrate our technique operating on numerous still portraits from the internet.

References:

1. Jiamin Bai, Aseem Agarwala, Maneesh Agrawala, and Ravi Ramamoorthi. 2013. Automatic cinemagraph portraits. In Computer Graphics Forum, Vol. 32. Wiley Online Library, 17–25.
2. Volker Blanz, Curzio Basso, Tomaso Poggio, and Thomas Vetter. 2003. Reanimating faces in images and video. In Computer graphics forum, Vol. 22. Wiley Online Library, 641–650.
3. Volker Blanz and Thomas Vetter. 1999. A morphable model for the synthesis of 3D faces. In Proceedings of the 26th annual conference on Computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co., 187–194.
4. Jean-Yves Bouguet. 2001. Pyramidal implementation of the affine lucas kanade feature tracker description of the algorithm. Intel Corporation 5, 1–10 (2001), 4.
5. Pia Breuer, Kwang-In Kim, Wolf Kienzle, Bernhard Scholkopf, and Volker Blanz. 2008. Automatic 3D face reconstruction from single images or video. In Automatic Face & Gesture Recognition, 2008. FG’08. 8th IEEE International Conference on. IEEE, 1–8. Cross Ref
6. Chen Cao, Derek Bradley, Kun Zhou, and Thabo Beeler. 2015. Real-time high-fidelity facial performance capture. ACM Transactions on Graphics (TOG) 34, 4 (2015), 46.
7. Chen Cao, Yanlin Weng, Shun Zhou, Yiying Tong, and Kun Zhou. 2014. Faceware-house: A 3d facial expression database for visual computing. IEEE Transactions on Visualization and Computer Graphics 20, 3 (2014), 413–425.
8. Chen Cao, Hongzhi Wu, Yanlin Weng, Tianjia Shao, and Kun Zhou. 2016. Real-time facial animation with image-based dynamic avatars. ACM Transactions on Graphics (TOG) 35, 4 (2016), 126.
9. Erika Chuang and Christoph Bregler. 2005. Mood swings: expressive speech animation. ACM Transactions on Graphics (TOG) 24, 2 (2005), 331–347.
10. T. F. Cootes. Talking face video. http://www-prima.inrialpes.fr/FGnet/data/01-TalkingFace/talking_face.html (????).
11. Kevin Dale, Kalyan Sunkavalli, Micah K Johnson, Daniel Vlasic, Wojciech Matusik, and Hanspeter Pfister. 2011. Video face replacement. ACM Transactions on Graphics (TOG) 30, 6 (2011), 130.
12. Changxing Ding and Dacheng Tao. 2016. A comprehensive survey on pose-invariant face recognition. ACM Transactions on Intelligent Systems and Technology (TIST) 7, 3 (2016), 37.
13. Ohad Fried, Eli Shechtman, Dan B Goldman, and Adam Finkelstein. 2016. Perspective-aware Manipulation of Portrait Photos. (2016).
14. Yaroslav Ganin, Daniil Kononenko, Diana Sungatullina, and Victor Lempitsky. 2016. DeepWarp: Photorealistic image resynthesis for gaze manipulation. In European Conference on Computer Vision. Springer, 311–326. Cross Ref
15. Pablo Garrido, Levi Valgaerts, Ole Rehmsen, Thorsten Thormahlen, Patrick Perez, and Christian Theobalt. 2014. Automatic face reenactment. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4217–4224.
16. Pablo Garrido, Levi Valgaerts, Hamid Sarmadi, Ingmar Steiner, Kiran Varanasi, Patrick Perez, and Christian Theobalt. 2015. Vdub: Modifying face video of actors for plausible visual alignment to a dubbed audio track. In Computer Graphics Forum, Vol. 34. Wiley Online Library, 193–204.
17. Tal Hassner, Shai Harel, Eran Paz, and Roee Enbar. 2015. Effective face frontalization in unconstrained images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4295–4304. Cross Ref
18. Alexander Hornung, Ellen Dekkers, and Leif Kobbelt. 2007. Character animation from 2D pictures and 3D motion data. ACM Transactions on Graphics (TOG) 26, 1 (2007).
19. Masahide Kawai, Tomoyori Iwao, Daisuke Mima, Akinobu Maejima, and Shigeo Morishima. 2013. Photorealistic inner mouth expression in speech animation. In ACM SIGGRAPH 2013 Posters. ACM, 9.
20. Masahide Kawai, Tomoyori Iwao, Daisuke Mima, Akinobu Maejima, and Shigeo Morishima. 2014. Data-driven speech animation synthesis focusing on realistic inside of the mouth. Journal of information processing 22, 2 (2014), 401–409. Cross Ref
21. Ira Kemelmacher-Shlizerman, Aditya Sankar, Eli Shechtman, and Steven M Seitz. 2010. Being john malkovich. In European Conference on Computer Vision. 341–353.
22. Davis E King. 2009. Dlib-ml: A machine learning toolkit. J. Mach. Learning Research 10 (2009), 1755–1758.
23. Iryna Korshunova, Wenzhe Shi, Joni Dambre, and Lucas Theis. 2016. Fast face-swap using convolutional neural networks. arXiv preprint arXiv:1611.09577 (2016).
24. Claudia Kuster, Tiberiu Popa, Jean-Charles Bazin, Craig Gotsman, and Markus Gross. 2012. Gaze correction for home video conferencing. ACM Transactions on Graphics (TOG) 31, 6 (2012), 174.
25. Tommer Leyvand, Daniel Cohen-Or, Gideon Dror, and Dani Lischinski. 2008. Data-driven enhancement of facial attractiveness. In ACM Transactions on Graphics (TOG), Vol. 27. ACM, 38.
26. Kai Li, Feng Xu, Jue Wang, Qionghai Dai, and Yebin Liu. 2012. A data-driven approach for facial expression synthesis in video. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 57–64.
27. Zicheng Liu, Ying Shan, and Zhengyou Zhang. 2001. Expressive expression mapping with ratio images. In Proceedings of the 28th annual conference on Computer graphics and interactive techniques. ACM, 271–276.
28. Iacopo Masi, Anh Tuan Tran, Jatuporn Toy Leksut, Tal Hassner, and Gérard G. Medioni. 2016. Do We Really Need to Collect Millions of Faces for Effective Face Recognition? CoRR abs/1603.07057 (2016). http://arxiv.org/abs/1603.07057
29. Maja Pantic, Michel Valstar, Ron Rademaker, and Ludo Maat. 2005. Web-based database for facial expression analysis. In 2005 IEEE international conference on multimedia and Expo. IEEE, 5–pp. Cross Ref
30. Patrick Pérez, Michel Gangnet, and Andrew Blake. 2003. Poisson image editing. In ACM Transactions on Graphics (TOG), Vol. 22. ACM, 313–318.
31. Marcel Piotraschke and Volker Blanz. 2016. Automated 3d face reconstruction from multiple images using quality measures. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3418–3427. Cross Ref
32. Carsten Rother, Vladimir Kolmogorov, and Andrew Blake. 2004. Grabcut: Interactive foreground extraction using iterated graph cuts. In ACM transactions on graphics (TOG), Vol. 23. ACM, 309–314.
33. Shunsuke Saito, Tianye Li, and Hao Li. 2016. Real-time facial segmentation and performance capture from rgb input. In European Conference on Computer Vision. Springer, 244–261. Cross Ref
34. Jason M Saragih, Simon Lucey, and Jeffrey F Cohn. 2011. Real-time avatar animation from a single image. In Automatic Face & Gesture Recognition and Workshops (FG 2011), 2011 IEEE International Conference on. IEEE, 117–124.
35. Xiaoyong Shen, Aaron Hertzmann, Jiaya Jia, Sylvain Paris, Brian Price, Eli Shechtman, and Ian Sachs. 2016. Automatic Portrait Segmentation for Image Stylization. In Computer Graphics Forum, Vol. 35. Wiley Online Library, 93–102.
36. Zhixin Shu, Eli Shechtman, Dimitris Samaras, and Sunil Hadap. 2016. EyeOpener: Editing Eyes in the Wild. ACM Transactions on Graphics (TOG) 36, 1 (2016), 1.
37. Yaniv Taigman, Adam Polyak, and Lior Wolf. 2016. Unsupervised Cross-Domain Image Generation. arXiv preprint arXiv:1611.02200 (2016).
38. Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, and Matthias Nießner. 2016. Face2face: Real-time face capture and reenactment of rgb videos. Proc. Computer Vision and Pattern Recognition (CVPR), IEEE 1 (2016). Cross Ref
39. Michel Valstar and Maja Pantic. 2010. Induced disgust, happiness and surprise: an addition to the mmi facial expression database. In Proc. 3rd Intern. Workshop on EMOTION (satellite of LREC): Corpora for Research on Emotion and Affect. 65.
40. Daniel Vlasic, Matthew Brand, Hanspeter Pfister, and Jovan Popović. 2005. Face transfer with multilinear models. In ACM Transactions on Graphics (TOG), Vol. 24. ACM, 426–433.
41. Fei Yang, Lubomir Bourdev, Eli Shechtman, Jue Wang, and Dimitris Metaxas. 2012. Facial expression editing in video using a temporally-smooth factorization. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 861–868.
42. Fei Yang, Jue Wang, Eli Shechtman, Lubomir Bourdev, and Dimitri Metaxas. 2011. Expression flow for 3D-aware face component transfer. In ACM Transactions on Graphics (TOG), Vol. 30. ACM, 60.
43. Raymond Yeh, Ziwei Liu, Dan B Goldman, and Aseem Agarwala. 2016. Semantic Facial Expression Editing using Autoencoded Flow. arXiv preprint arXiv:1611.09961 (2016).
44. Shizhe Zhou, Hongbo Fu, Ligang Liu, Daniel Cohen-Or, and Xiaoguang Han. 2010. Parametric reshaping of human bodies in images. ACM Transactions on Graphics (TOG) 29, 4 (2010), 126.

ACM Digital Library Publication:

Overview Page:

SIGGRAPH Asia 2017: Technical Papers

Submit a story:

If you would like to submit a story about this presentation, please contact us: historyarchives@siggraph.org

ACM SIGGRAPH HISTORY ARCHIVES

“Bringing portraits to life” by Averbuch-Elor, Cohen-Or, Kopf and Cohen

Conference:

Type(s):

Title:

Session/Category Title:

Presenter(s)/Author(s):

Abstract:

References:

ACM Digital Library Publication:

Overview Page:

Submit a story:

Sponsored by: