“Visual transcripts: lecture notes from blackboard-style lecture videos”
Title:
- Visual transcripts: lecture notes from blackboard-style lecture videos
Session/Category Title: Cinematography and Video Processing
Abstract:
Blackboard-style lecture videos are popular, but learning from them with existing video player interfaces can be challenging. Viewers cannot consume the lecture material at their own pace, and the content is difficult to search or skim. For these reasons, some people prefer lecture notes to videos. To address these limitations, we present Visual Transcripts, a readable representation of lecture videos that combines visual information with transcript text. To generate a Visual Transcript, we first segment the visual content of a lecture into discrete visual entities that correspond to equations, figures, or lines of text. Then, we analyze the temporal correspondence between the transcript and the visuals to determine how sentences relate to visual entities. Finally, we arrange the text and visuals in a linear layout based on these relationships. We compare our result with a standard video player and with a state-of-the-art interface designed specifically for blackboard-style lecture videos. A user evaluation suggests that users prefer our interface for learning and that it is effective in helping them browse or search through lecture videos.
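The abstract does not specify how the temporal correspondence or the layout is computed, so the following Python sketch is only an illustration of the second and third steps under simple assumptions. It assumes each transcript sentence and each visual entity already carries a start and end time in seconds, matches a sentence to the entity whose drawing interval overlaps it most (a greedy heuristic chosen for this sketch, not the authors' method), and then orders everything chronologically. All names (`Sentence`, `VisualEntity`, `match_sentences`, `linear_layout`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sentence:
    text: str
    start: float  # seconds from the beginning of the video (assumed input)
    end: float


@dataclass
class VisualEntity:
    label: str    # e.g. "equation 1" or "figure 2" (assumed input)
    start: float  # time at which drawing of the entity begins
    end: float    # time at which drawing of the entity ends


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def match_sentences(sentences: List[Sentence],
                    entities: List[VisualEntity]) -> List[Optional[int]]:
    """For each sentence, return the index of the entity with the largest
    temporal overlap, or None if the sentence overlaps no entity.
    This maximum-overlap rule is an assumption of the sketch."""
    matches: List[Optional[int]] = []
    for s in sentences:
        best, best_overlap = None, 0.0
        for i, e in enumerate(entities):
            o = overlap(s.start, s.end, e.start, e.end)
            if o > best_overlap:
                best, best_overlap = i, o
        matches.append(best)
    return matches


def linear_layout(sentences: List[Sentence],
                  entities: List[VisualEntity]) -> List[Tuple[Optional[str], List[str]]]:
    """Arrange entities and text chronologically. Each block is
    (entity label or None, sentences attached to that block)."""
    matches = match_sentences(sentences, entities)
    items = []
    for i, e in enumerate(entities):
        texts = [s.text for s, m in zip(sentences, matches) if m == i]
        items.append((e.start, e.label, texts))
    # Sentences that match no entity become text-only blocks.
    for s, m in zip(sentences, matches):
        if m is None:
            items.append((s.start, None, [s.text]))
    items.sort(key=lambda item: item[0])
    return [(label, texts) for _, label, texts in items]


if __name__ == "__main__":
    entities = [VisualEntity("equation 1", 0, 10), VisualEntity("figure 1", 20, 30)]
    sentences = [
        Sentence("Let us start with the definition.", 0, 4),
        Sentence("Before we draw anything, a quick aside.", 12, 18),
        Sentence("This figure shows the same idea.", 19, 28),
    ]
    for label, texts in linear_layout(sentences, entities):
        print(label, texts)
```

A real system would also need the first step, segmenting the frames into visual entities, which this sketch takes as given.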


