“StructInbet: Integrating Explicit Structural Guidance Into Inbetween Frame Generation” by Pan and Xie
Conference:
Type(s):
Title:
- StructInbet: Integrating Explicit Structural Guidance Into Inbetween Frame Generation
Session/Category Title:
- Images, Video & Computer Vision
Presenter(s)/Author(s):
Abstract:
StructInbet is an skeleton-based inbetweening system that achieves controllable, structure-aware interpolation generation with improved pose clarity and motion alignment to user intent, surpassing prior point-based methods in reducing ambiguity.
References:
[1] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. 2021. High-resolution image synthesis with latent diffusion models, 2021.
[2] Yujun Shi, Chuhui Xue, Jun Hao Liew, Jiachun Pan, Hanshu Yan, Wenqing Zhang, Vincent YF Tan, and Song Bai. 2024. Dragdiffusion: Harnessing diffusion models for interactive point-based image editing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8839–8849.
[3] Wen Wang, Qiuyu Wang, Kecheng Zheng, Hao Ouyang, Zhekai Chen, Biao Gong, Hao Chen, Yujun Shen, and Chunhua Shen. 2025. Framer: Interactive frame interpolation. International Conference on Learning Representations (ICLR) (2025).
[4] Jinbo Xing, Hanyuan Liu, Menghan Xia, Yong Zhang, Xintao Wang, Ying Shan, and Tien-Tsin Wong. 2024. Tooncrafter: Generative cartoon interpolation. ACM Transactions on Graphics (TOG) 43, 6 (2024), 1–11.
[5] Lvmin Zhang, Anyi Rao, and Maneesh Agrawala. 2023. Adding conditional control to text-to-image diffusion models. In Proceedings of the IEEE/CVF international conference on computer vision. 3836–3847.


