“CariGANs: unpaired photo-to-caricature translation” – ACM SIGGRAPH HISTORY ARCHIVES

  • 2018 SA Technical Papers_Cao_CariGANs: unpaired photo-to-caricature translation

Conference:

    SIGGRAPH Asia 2018

Type(s):

    Technical Papers

Title:

    CariGANs: unpaired photo-to-caricature translation

Session/Category Title:   Image processing


Presenter(s)/Author(s):

    Kaidi Cao, Jing Liao, Lu Yuan



Abstract:


    Facial caricature is an art form that draws faces in an exaggerated way to convey humor or sarcasm. In this paper, we propose the first Generative Adversarial Network (GAN) for unpaired photo-to-caricature translation, which we call “CariGANs”. It explicitly models geometric exaggeration and appearance stylization using two components: CariGeoGAN, which models only the geometry-to-geometry transformation from face photos to caricatures, and CariStyGAN, which transfers the appearance style from caricatures to face photos without any geometric deformation. In this way, a difficult cross-domain translation problem is decoupled into two easier tasks. The perceptual study shows that caricatures generated by our CariGANs are closer to hand-drawn ones, and at the same time better preserve the identity, compared to state-of-the-art methods. Moreover, our CariGANs allow users to control the degree of shape exaggeration and change the color/texture style by tuning parameters or providing an example caricature.
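The decoupling described in the abstract can be illustrated with a toy sketch: geometry is handled as a landmark vector whose deviation from a mean face is amplified (a stand-in for the learned CariGeoGAN mapping), while appearance stylization is a separate stage that leaves geometry untouched (the role of CariStyGAN). All function names, the `degree` parameter, and the `style_code` tint are illustrative assumptions, not the paper's actual models, which are learned GANs.

```python
import numpy as np

def exaggerate_landmarks(landmarks, mean_landmarks, degree=1.5):
    """Toy stand-in for CariGeoGAN: amplify a face's deviation from the
    mean landmark configuration. `degree` mimics the user-controllable
    exaggeration parameter mentioned in the abstract."""
    return mean_landmarks + degree * (landmarks - mean_landmarks)

def stylize_appearance(image, style_code):
    """Toy stand-in for CariStyGAN: change appearance without moving any
    geometry (here, just a per-channel tint chosen by `style_code`)."""
    return np.clip(image * style_code, 0.0, 1.0)

def photo_to_caricature(image, landmarks, mean_landmarks,
                        degree=1.5, style_code=(1.1, 0.9, 1.0)):
    """Decoupled pipeline: exaggerate geometry, stylize appearance.
    A real system would warp `image` toward the exaggerated landmarks;
    the warp is omitted in this sketch."""
    geo = exaggerate_landmarks(landmarks, mean_landmarks, degree)
    styled = stylize_appearance(image, np.asarray(style_code))
    return styled, geo
```

The point of the sketch is the factorization: the geometry stage never touches pixels and the style stage never touches landmarks, which is why each sub-problem is easier than the joint cross-domain translation.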




Submit a story:

If you would like to submit a story about this presentation, please contact us: historyarchives@siggraph.org