Summary of Open-Sora Plan: Open-Source Large Video Generation Model, by Bin Lin et al.
Open-Sora Plan: Open-Source Large Video Generation Model
by Bin Lin, Yunyang Ge, Xinhua Cheng, Zongjian Li, Bin Zhu, Shaodong Wang, Xianyi He, Yang Ye, Shenghai Yuan, Liuhan Chen, Tanghui Jia, Junwu Zhang, Zhenyu Tang, Yatian Pang, Bin She, Cen Yan, Zhiheng Hu, Xiaoyi Dong, Lin Chen, Zhang Pan, Xing Zhou, Shaoling Dong, Yonghong Tian, Li Yuan
First submitted to arXiv on: 28 Nov 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper’s original abstract |
Medium | GrooveSquid.com (original content) | The Open-Sora Plan is an open-source project that aims to generate high-resolution, long-duration videos based on user inputs. It consists of several components: a Wavelet-Flow Variational Autoencoder, a Joint Image-Video Skiparse Denoiser, and a set of condition controllers. To make training and inference efficient, the team designed a number of auxiliary strategies and proposed a multi-dimensional data curation pipeline for obtaining high-quality data. The summary reports impressive video generation results in both qualitative and quantitative evaluations, and presents the project as a contribution to the video generation research community. |
Low | GrooveSquid.com (original content) | The Open-Sora Plan is an open-source project for making high-quality videos that can run for a long time. It’s like having a super smart movie maker! The team combined several AI components to make it work, and they figured out how to train the system so it can generate videos quickly and efficiently. Because the project is open source, other people can build on it to make their own videos. |
Keywords
» Artificial intelligence » Inference » Variational autoencoder