Creating Videos with Stable Video Diffusion

Jeremy Morgan - Feb 12 - Dev Community

If you're interested in generative AI, you've likely seen tools like Runway that generate video from an image. They're cool tools, and some of us want to run them ourselves, on our own machines.

If that's you, you've come to the right place. We're going to install a tool to generate AI videos today. Stable Video Diffusion was just released, and we're going to try it out.

What we'll do:

  • Install some prerequisites
  • Configure our system for Stable Video Diffusion
  • Install the Stable Video Diffusion tools and checkpoints, and run it all with Streamlit.
  • Generate a short video from an image

In this tutorial, I'm using Linux and Anaconda. The directions should be largely the same on Windows or macOS.

If this is your first time doing this kind of thing, don't worry. We'll start by setting up your environment with the right tools and dependencies and installing a few things. We aren't going to dive too deep into how this product works. We'll just install it and try it out.
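
Before we start, it's worth making sure Conda is available and that your NVIDIA driver can see the GPU (this assumes you're running an NVIDIA card):

# Confirm Conda is on your PATH
conda --version

# Confirm the GPU and driver are visible
nvidia-smi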

Step 1: Install Dependencies

We're going to grab this repo from Stability AI.

git clone https://github.com/Stability-AI/generative-models.git && cd generative-models

Next, we'll create a virtual environment. It's essential to use Python 3.10; for some reason, the install was fussy with newer versions.

conda create --name stablevideodiff python=3.10.0
conda activate stablevideodiff
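
Since the repo is picky about the Python version, it's worth confirming that the new environment actually picked up 3.10:

python --version   # should report Python 3.10.x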

And then install all the requirements:

pip3 install -r requirements/pt2.txt

And you'll see all the dependencies get installed.

"How to make AI generated Videos"

But you may see an error like this:

ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects

"How to make AI generated Videos"

This is sometimes due to the lack of a Rust compiler. You can install it with this command (on Linux):

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
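
The rustup installer will tell you how to configure your current shell; once it's done, load the new toolchain and re-run the requirements install:

source "$HOME/.cargo/env"
pip3 install -r requirements/pt2.txt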

Once it's done installing, you should see no errors.

"How to make AI generated Videos"

Step 2: A couple of finishing touches

First, in your application folder (generative-models), type the following.

pip3 install .

Then we want to install sdata for training:

pip3 install -e git+https://github.com/Stability-AI/datapipelines.git@main#egg=sdata
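
If you want to confirm the editable install registered correctly, pip can show you where it landed:

pip3 show sdata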

And now we're just about ready to let this thing cook.

Step 3: Running the Demo

Ok, now you want a demo.

Log in to Hugging Face. You can create an account for free if you don't have one.

You will need to download svd.safetensors into the checkpoints directory. If it doesn't exist, create a directory named checkpoints at the root of the repo.
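
If you prefer the command line, here's one way to grab it, assuming you've installed the Hugging Face CLI (pip3 install -U huggingface_hub) and that the weights live in the stabilityai/stable-video-diffusion-img2vid model repo:

# Log in with your Hugging Face token (the model page may require accepting its terms first)
huggingface-cli login

# Create the checkpoints directory and download svd.safetensors into it
mkdir -p checkpoints
huggingface-cli download stabilityai/stable-video-diffusion-img2vid svd.safetensors --local-dir checkpoints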

Note: I am running an RTX 4090 with 24 GB of VRAM and have 64 GB of RAM in the PC. I still had to do this:

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512

This helps with memory allocation as you run the demo, so you're less likely to run out of memory. I ran out a few times before playing with this setting.
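
If you want to keep an eye on VRAM while you generate, running nvidia-smi on a loop in a second terminal is handy:

watch -n 1 nvidia-smi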

Now we can fire up the web app:

PYTHONPATH=. streamlit run scripts/demo/video_sampling.py --server.port 8080

In your browser, open http://localhost:8080 and you should see the demo's interface.

"How to make AI generated Videos"

So let's try it out. Click "load model".

It will start running.

Then you'll see a message indicating the model has loaded.

"How to make AI generated Videos"

Awesome.

Now I'll load up a sample image using the "Browse files" button.

I also want to set the frame-decoding option to a lower value of 2, so I don't run out of VRAM. Again, I have 24 GB to work with, but I must be careful; PyTorch is happy to use all of it.

"How to make AI generated Videos"

Then click "Sample" and you will see it start to work.

"How to make AI generated Videos"

And there we have it! We produced a video.

"How to make AI generated Videos"

It's only 2 seconds long, but pretty awesome. You can download it here.

I love it and can't wait to do more of these.

What did we make?

Here, I generated exactly 14 frames from an image: 2 seconds at 7 frames per second.

You can change the number of frames in the demo's settings.

"How to make AI generated Videos"

You can set how many frames to generate in total. I've managed to get up to 3 seconds but ran out of VRAM beyond that.
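
The math is simple: duration is frames divided by frames per second, so 14 frames at 7 fps gives 2 seconds, and 21 frames at 7 fps would give 3 seconds.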

Your results may vary.

Conclusion

Congratulations, you did it! In this tutorial, we set up an environment for Stable Video Diffusion, installed it, and ran it. This is an excellent way to get familiar with generative AI models and how to tune Stable Video Diffusion. You can produce some awesome stuff once you know your way around. Experiment with different settings, frame counts, and images to see how far you can push it. Your results may vary based on hardware limitations; I'm running a 4090, as mentioned, and it still gets taxed. But like with all these things, it's about the learning experience.

What are you doing with Stable Diffusion or Generative AI today? Let me know! Let's talk.

Also, if you have any questions or comments, feel free to reach out.

Happy hacking!
