Creating AI Videos with Mov2Mov using Stable Diffusion
Note: This guide assumes you’re using AUTOMATIC1111 and have the required software installed.
Step 1: Installation
Start by installing the Mov2Mov extension. In AUTOMATIC1111, go to the “Extensions” tab, select “Install from URL”, and enter the URL of the extension’s Git repository. Additionally, make sure you have FFmpeg installed on your system.
Step 2: Video Downscaling (if needed)
If your input video has a high resolution that might strain your GPU, consider downscaling it. A resolution of 720p is recommended if your system has sufficient VRAM and processing power. You can use FFmpeg to do this with the following command:
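The original command is not reproduced in this text, so here is a typical FFmpeg downscaling invocation. The exact filter settings (scale=-2:720, which targets 720p while preserving the aspect ratio and keeping the width even) are my assumption, not taken from the guide:

```shell
# Downscale to 720p: height 720, width computed automatically (kept even);
# the audio stream is copied through unchanged.
ffmpeg -i input.mp4 -vf "scale=-2:720" -c:a copy output.mp4
```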
Replace input.mp4 and output.mp4 with your actual file names.
Step 3: Creating the Flipbook
- Drag and drop your video into the Mov2Mov tab in AUTOMATIC1111.
- Set the width and height to match the resolution of your input video.
- Configure the sampler and specify the number of steps you want for your AI video.
- Set prompts/negative prompts. While you can use LoRAs and embeddings as usual, please note that the interface doesn’t display them on this tab. You may need to switch between the txt2img and mov2mov tabs to copy the model names.
Optional (but recommended): Canny ControlNet for Image Generation
If you want to improve the quality and consistency of your AI-generated video, you can use the Canny ControlNet model to guide image generation.
- ControlNet Settings (for a 1280×720 input video):
  - Enable: Checked
  - Guess Mode: Checked
  - Preprocessor: Canny
  - Model: control_canny-fp16 (available for download)
  - Annotator Resolution: 768
  - Canvas Width: 1024 (use 720 for a 720×1280 video)
  - Canvas Height: 720 (use 1024 for a 720×1280 video)
Other Canny settings are beyond the scope of this guide. Note that the denoising strength slider controls how strongly the prompt influences each frame: higher values follow the prompt more closely and the original video less.
Step 4: Generate the Flipbook
Click the “Generate” button. A preview of each frame will be generated and saved in the specified directory, typically under \stable-diffusion-webui\outputs\mov2mov-images\<date>. If you interrupt the generation process, a video will be created with the progress made up to that point.
Be prepared for the generation process to take a while, especially if you’re working with a high-resolution video, even on a powerful GPU. A 30-second video at 720p can take several hours to complete.
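As a rough back-of-the-envelope estimate (the per-frame time of 10 seconds is a hypothetical figure; actual speed depends on your GPU, resolution, and step count):

```shell
# 30-second clip at 30 fps
frames=$((30 * 30))                              # 900 frames to diffuse
seconds_per_frame=10                             # hypothetical per-frame GPU time
total_hours=$((frames * seconds_per_frame / 3600))
echo "$frames frames, roughly $total_hours+ hours"
```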
Step 5: Final Video Render
Once the flipbook generation is complete, you can view the final results without sound. The Mov2Mov video without sound is typically saved in \stable-diffusion-webui\outputs\mov2mov-videos.
To add audio to your generated video, you can use FFmpeg with the following command:
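The command itself is missing from this text, so here is an invocation reconstructed from the explanation that follows. The -map flags are my addition to make the stream selection explicit; the file names match those described below:

```shell
# Mux: video stream from the AI-generated file, audio stream from the original source
ffmpeg -i generatedVideo.mp4 -i originalVideo.mp4 \
  -map 0:v:0 -map 1:a:0 -c:v copy -c:a aac output.mp4
```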

Here’s what each part of this command does:
- Take the video stream from the first input (generatedVideo.mp4) and the audio stream from the second input (originalVideo.mp4).
- -c:v copy tells FFmpeg to copy the video stream directly from the source without re-encoding.
- -c:a aac instructs FFmpeg to encode the audio stream in AAC format.
- output.mp4 is the name of the output file.
Now you can enjoy your AI-generated video with audio!