StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation

[Paper] [Project Page] [Jittor Version] [🤗 Comic Generation Demo]

Official implementation of StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation.

Demo Video

[Embedded demo video]

Update History

The update history is available here.

🌠 Key Features:

StoryDiffusion can create a magic story by generating consistent images and videos. Our work mainly has two parts:

Consistent self-attention for character-consistent image generation over long-range sequences. It is hot-pluggable and compatible with all SD1.5 and SDXL-based image diffusion models. In the current implementation, the user needs to provide at least 3 text prompts for the consistent self-attention module. We recommend at least 5-6 text prompts for better layout arrangement.
Motion predictor for long-range video generation, which predicts motion between condition images in a compressed image semantic space, enabling prediction of larger motions.
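
The prompt-count guidance above can be checked before a generation run. The following is a minimal sketch, not part of the StoryDiffusion code base: the helper name `check_prompts` and its return values are hypothetical, and only the thresholds (at least 3 prompts, 5-6 recommended) come from the text above.

```python
from typing import List

MIN_PROMPTS = 3          # hard minimum stated in the feature description
RECOMMENDED_PROMPTS = 5  # lower end of the recommended 5-6 range


def check_prompts(prompts: List[str]) -> str:
    """Return "ok" or "few"; raise ValueError below the hard minimum.

    Hypothetical helper for illustration only.
    """
    # Ignore blank entries so an empty line does not count as a prompt.
    cleaned = [p.strip() for p in prompts if p.strip()]
    if len(cleaned) < MIN_PROMPTS:
        raise ValueError(
            f"need at least {MIN_PROMPTS} prompts, got {len(cleaned)}"
        )
    if len(cleaned) < RECOMMENDED_PROMPTS:
        # Allowed, but below the count recommended for layout arrangement.
        return "few"
    return "ok"
```

Calling `check_prompts` on the list of story prompts before starting the pipeline would surface a too-short list early instead of partway through generation.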

🔥 Examples

Comics generation

Image-to-Video generation (Results are HIGHLY compressed for speed)

Leveraging the images produced through our Consistent Self-Attention mechanism, we can extend the process to create videos by seamlessly transitioning between these images. This can be considered as a two-stage long video generation approach.

Note: results are highly compressed for speed; visit our website for the high-quality version.

Two-stage Long Videos Generation (New Update)

Combining the two parts, we can generate very long and high-quality AIGC videos.

Video1	Video2	Video3

Long Video Results using Condition Images

Given a sequence of user-provided condition images, our Image-to-Video model can generate a video.

Video1	Video2	Video3

Video4	Video5	Video6

Short Videos

Video1	Video2	Video3

Video4	Video5	Video6

🚩 TODO/Updates

Comic Results of StoryDiffusion.
Video Results of StoryDiffusion.
Source code of Comic Generation
Source code of gradio demo
Source code of Video Generation Model
Pretrained weight of Video Generation Model

🔧 Dependencies and Installation

Python >= 3.8 (Anaconda or Miniconda recommended)
PyTorch >= 2.0.0

conda create --name storydiffusion python=3.10
conda activate storydiffusion
pip install -U pip
# Install requirements
pip install -r requirements.txt
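
After installation, the stated minimums (Python >= 3.8, PyTorch >= 2.0.0) can be verified with a short script. This is a sketch under assumptions: the function names are hypothetical and not part of the repository, and the check only compares major and minor version numbers.

```python
import sys
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.1.0+cu118'."""
    parts = text.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def installed_torch_version() -> Optional[str]:
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


def environment_ok(python_version: Tuple[int, int],
                   torch_version: Optional[str]) -> bool:
    """Check the minimums listed above: Python >= 3.8 and PyTorch >= 2.0."""
    if python_version < (3, 8):
        return False
    if torch_version is None:
        return False
    return parse_version(torch_version) >= (2, 0)


if __name__ == "__main__":
    ok = environment_ok(sys.version_info[:2], installed_torch_version())
    print("environment ok" if ok else "environment below the stated minimums")
```

Running the script inside the activated `storydiffusion` environment reports whether both minimums are met.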

How to use

Currently, we provide two ways for you to generate comics.

Use the jupyter notebook

Open Comic_Generation.ipynb and run the code.

Start a local gradio demo

Run the following command:

(Recommended) We provide a low-GPU-memory version. It was tested on a machine with 24 GB of GPU memory (Tesla A10) and 30 GB of RAM, and is expected to work well with more than 20 GB of GPU memory.

python gradio_app_sdxl_specific_id_low_vram.py
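
Before launching the demo, the GPU memory guidance above can be checked programmatically. The sketch below is an illustration only: the helper names are hypothetical, the threshold treats "20 G" as 20 GiB (an assumption), and the query relies on `torch.cuda.get_device_properties`, returning `None` when PyTorch or a CUDA device is unavailable.

```python
from typing import Optional

GIB = 1024 ** 3
MIN_VRAM_GIB = 20  # threshold from the note above, read as GiB (assumption)


def enough_vram(total_bytes: int, minimum_gib: float = MIN_VRAM_GIB) -> bool:
    """True if the device has strictly more memory than the threshold."""
    return total_bytes / GIB > minimum_gib


def detect_vram_bytes() -> Optional[int]:
    """Total memory of CUDA device 0 in bytes, or None if it cannot be queried."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    total = detect_vram_bytes()
    if total is None:
        print("could not query a CUDA device")
    elif enough_vram(total):
        print(f"{total / GIB:.1f} GiB available: above the suggested threshold")
    else:
        print(f"{total / GIB:.1f} GiB available: below the suggested threshold")
```

The check inspects total device memory only; memory already in use by other processes is not accounted for.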

Contact

If you have any questions, you are very welcome to email ypzhousdu@gmail.com and zhoudaquan21@gmail.com.

Disclaimer

This project strives to impact the domain of AI-driven image and video generation positively. Users are granted the freedom to create images and videos using this tool, but they are expected to comply with local laws and utilize it responsibly. The developers do not assume any responsibility for potential misuse by users.

Related Resources

The following are some third-party implementations of StoryDiffusion.

API

runpod.io serverless worker provided by BeS.
Replicate worker provided by camenduru.

BibTeX

If you find StoryDiffusion useful for your research and applications, please cite using this BibTeX:

@article{zhou2024storydiffusion,
  title={StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation},
  author={Zhou, Yupeng and Zhou, Daquan and Cheng, Ming-Ming and Feng, Jiashi and Hou, Qibin},
  journal={NeurIPS 2024},
  year={2024}
}

