EyeOpener: Editing Eyes in the Wild

Closed eyes and look-aways can ruin precious moments captured in photographs. In this article, we present a new framework for automatically editing eyes in photographs. We leverage a user’s personal photo collection to find a “good” set of reference eyes and transfer them onto a target image. Our example-based editing approach is robust and effective for realistic image editing. A fully automatic pipeline for realistic eye editing is challenging due to the unconstrained conditions under which the face appears in a typical photo collection. We use crowd-sourced human evaluations to understand the aspects of the target-reference image pair that will produce the most realistic results. We subsequently train a model that automatically selects the top-ranked reference candidate(s) by narrowing the gap in terms of pose, local contrast, lighting conditions, and even expressions. Finally, we develop a comprehensive pipeline of three-dimensional face estimation, image warping, relighting, image harmonization, automatic segmentation, and image compositing in order to achieve highly believable results. We evaluate the performance of our method via quantitative and crowd-sourced experiments.
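
The reference-selection step can be illustrated with a minimal sketch. The abstract says only that a model trained on crowd-sourced evaluations picks the top-ranked reference candidate(s) by narrowing the gap in pose, local contrast, lighting, and expression. The sketch below is not that trained model: it substitutes a hand-weighted distance as a stand-in, and every name in it (FaceFeatures, compatibility_gap, rank_references, the weights, the file names, and the feature values) is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class FaceFeatures:
    """Hypothetical per-image descriptors; the field names are illustrative only."""
    pose: Sequence[float]        # e.g. (yaw, pitch, roll) in degrees
    lighting: Sequence[float]    # e.g. a low-dimensional lighting descriptor
    contrast: float              # local contrast around the eye region
    expression: Sequence[float]  # e.g. expression coefficients


def _l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def compatibility_gap(target: FaceFeatures, ref: FaceFeatures,
                      weights: Dict[str, float]) -> float:
    """Weighted sum of per-attribute gaps (assumed stand-in for a learned score)."""
    return (weights["pose"] * _l2(target.pose, ref.pose)
            + weights["lighting"] * _l2(target.lighting, ref.lighting)
            + weights["contrast"] * abs(target.contrast - ref.contrast)
            + weights["expression"] * _l2(target.expression, ref.expression))


def rank_references(target: FaceFeatures,
                    candidates: Dict[str, FaceFeatures],
                    weights: Dict[str, float],
                    top_k: int = 1) -> List[Tuple[str, float]]:
    """Return the top_k candidates with the smallest gap to the target."""
    scored = [(name, compatibility_gap(target, feats, weights))
              for name, feats in candidates.items()]
    scored.sort(key=lambda item: item[1])  # smallest gap first
    return scored[:top_k]


# Toy usage with made-up feature values.
weights = {"pose": 1.0, "lighting": 1.0, "contrast": 1.0, "expression": 1.0}
target = FaceFeatures(pose=(0.0, 0.0, 0.0), lighting=(1.0, 0.0),
                      contrast=0.5, expression=(0.0,))
candidates = {
    "a.jpg": FaceFeatures(pose=(3.0, 4.0, 0.0), lighting=(1.0, 0.0),
                          contrast=0.5, expression=(0.0,)),
    "b.jpg": FaceFeatures(pose=(0.0, 0.0, 0.0), lighting=(1.0, 0.0),
                          contrast=0.75, expression=(0.0,)),
}
best = rank_references(target, candidates, weights)
```

In this toy case the candidate with matching pose and a small contrast difference ranks first. In a learned variant, the fixed weights would presumably be replaced by a model fit to human realism ratings of target-reference pairs.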

SIGGRAPH 2016: Technical Papers

“EyeOpener: Editing Eyes in the Wild” by Shu, Shechtman, Samaras and Hadap

Session/Category Title: PHOTO ORGANIZATION & MANIPULATION
