PatchNet: a patch-based image representation for interactive library-driven image editing

We introduce PatchNets, a compact, hierarchical representation describing structural and appearance characteristics of image regions, for use in image editing. In a PatchNet, an image region with coherent appearance is summarized by a graph node, associated with a single representative patch, while geometric relationships between different regions are encoded by labelled graph edges giving contextual information. The hierarchical structure of a PatchNet allows a coarse-to-fine description of the image. We show how this PatchNet representation can be used as a basis for interactive, library-driven image editing. The user draws rough sketches to quickly specify editing constraints for the target image. The system then automatically queries an image library to find semantically compatible candidate regions to meet the editing goal. Contextual image matching is performed using the PatchNet representation, allowing suitable regions to be found and applied in a few seconds, even from a library containing thousands of images.
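The structure described in the abstract (region nodes carrying one representative patch, labelled edges for geometric context, and a coarse-to-fine hierarchy) could be sketched roughly as below. This is only an illustrative reading of the abstract, not the authors' implementation: the class names `RegionNode` and `PatchGraph`, the edge labels "above" and "inside", and the tiny 2x2 grayscale patch are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a pixel patch: a tiny grayscale grid.
Patch = List[List[float]]


@dataclass
class RegionNode:
    """One coherent image region, summarized by a single representative patch."""
    name: str
    patch: Patch
    child: Optional["PatchGraph"] = None  # finer-level description, if any


@dataclass
class PatchGraph:
    """One level of the hierarchy: region nodes plus labelled relations."""
    nodes: Dict[str, RegionNode] = field(default_factory=dict)
    edges: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def add_node(self, node: RegionNode) -> None:
        self.nodes[node.name] = node

    def add_edge(self, src: str, dst: str, label: str) -> None:
        # The label encodes the geometric relation from src to dst.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges[(src, dst)] = label

    def context(self, name: str) -> List[Tuple[str, str]]:
        """Return sorted (neighbour, label) pairs for edges leaving `name`."""
        return sorted(
            (dst, label)
            for (src, dst), label in self.edges.items()
            if src == name
        )

    def depth(self) -> int:
        """Number of levels in the coarse-to-fine hierarchy rooted here."""
        deepest = max(
            (n.child.depth() for n in self.nodes.values() if n.child is not None),
            default=0,
        )
        return 1 + deepest


def build_example() -> PatchGraph:
    """Build a two-level toy graph for a made-up seaside image."""
    flat = [[0.5, 0.5], [0.5, 0.5]]

    # Finer level: structure inside the "sea" region.
    fine = PatchGraph()
    fine.add_node(RegionNode("wave", flat))
    fine.add_node(RegionNode("foam", flat))
    fine.add_edge("foam", "wave", "inside")

    # Coarse level: three large regions and their relations.
    coarse = PatchGraph()
    coarse.add_node(RegionNode("sky", flat))
    coarse.add_node(RegionNode("sea", flat, child=fine))
    coarse.add_node(RegionNode("beach", flat))
    coarse.add_edge("sky", "sea", "above")
    coarse.add_edge("sea", "beach", "above")
    return coarse
```

In this toy version, `build_example().context("sky")` lists the labelled relations leaving the sky region, which is the kind of contextual information a library query would compare; the actual matching procedure is not specified in the abstract and is not modelled here.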

Overview Page:

SIGGRAPH Asia 2013: Technical Papers
