“Generating photo manipulation tutorials by demonstration” by Grabler, Agrawala, Li, Dontcheva and Igarashi

  • ©Floraine Grabler, Maneesh Agrawala, Wilmot Li, Mira Dontcheva, and Takeo Igarashi




    We present a demonstration-based system for automatically generating succinct step-by-step visual tutorials of photo manipulations. An author first demonstrates the manipulation using an instrumented version of GIMP that records all changes in interface and application state. From the example recording, our system automatically generates tutorials that illustrate the manipulation using images, text, and annotations. It leverages automated image labeling (recognition of facial features and outdoor scene structures in our implementation) to generate more precise text descriptions of many of the steps in the tutorials. A user study comparing our automatically generated tutorials to hand-designed tutorials and screen-capture video recordings finds that users are 20–44% faster and make 60–95% fewer errors using our tutorials. While our system focuses on tutorial generation, we also present some initial work on generating content-dependent macros that use image recognition to automatically transfer selection operations from the example image used in the demonstration to new target images. Although these macros are limited to transferring selection operations, we demonstrate automatic transfer of several common retouching techniques, including eye recoloring, teeth whitening, and sunset enhancement.
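
    The abstract does not say how a selection is transferred from the example image to a target image, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes that image labeling yields an axis-aligned bounding box for the same feature (for example, an eye) in both images, and it re-projects each selection point by its normalized position inside that box. The names `Box` and `transfer_selection` and all coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a labeled region (hypothetical format)."""
    x: float
    y: float
    width: float
    height: float


def transfer_selection(points, source_box, target_box):
    """Map selection points from the example image onto a target image.

    Each point is expressed as a fraction of the labeled region in the
    example image, then re-projected into the matching labeled region of
    the target image. This is an assumed scheme for illustration only.
    """
    if source_box.width <= 0 or source_box.height <= 0:
        raise ValueError("source box must have positive size")
    transferred = []
    for px, py in points:
        # Normalized position of the point inside the source region.
        u = (px - source_box.x) / source_box.width
        v = (py - source_box.y) / source_box.height
        # Same relative position inside the target region.
        transferred.append((target_box.x + u * target_box.width,
                            target_box.y + v * target_box.height))
    return transferred


# Made-up coordinates: an eye region labeled in both images.
example_eye = Box(100, 50, 40, 20)
target_eye = Box(300, 200, 80, 40)
selection = [(100, 50), (120, 60), (140, 70)]
result = transfer_selection(selection, example_eye, target_eye)
```

    A box-to-box mapping handles only translation and scale. A feature that is rotated or differently shaped in the target image would need a richer correspondence, such as matched landmark points.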


