ffmpeg commands for image sequences
Scenario: I have a canvas animation which I'd like to convert to a gif or video file. While Canvas does have a native captureStream() method which enables you to convert directly to video (more about recording Canvas animations with captureStream() here), I found that the quality was pretty limited if the animation required a bunch of calculations and things happening in the background: once I started recording the stream, the page slowed down a whole lot, and so did the video. On my slow laptop, at least. So instead I wrote some code that saved one frame of the drawing at a time as an image, then let me download the whole series of images at once.
Once I had this series of images, I used ffmpeg to pack them all together into a gif or a video. I have the sense that ffmpeg is quite a powerful tool, but it's a really complex one. Most of my use of it has been limited to trial & error with answers I dig up on stackoverflow. So this was a bit of a challenge but I managed to figure some things out!
images to video
A couple variations on a command to convert a series of numbered images to video:
ffmpeg -r 60 -f image2 -i image-%d.png -crf 23 -r 30 -pix_fmt yuv420p video.mp4
ffmpeg -framerate 30 -i image-%d.png -pix_fmt yuv420p video.mp4
What the options mean:
- `-r` or `-framerate`: frame rate (fps). These don't seem to mean the exact same thing - recent ffmpeg docs say "If in doubt use -framerate instead of the input option -r."
- In the first example (using `-r`): the first instance refers to the input fps and the second to the output. If the output rate is higher than the input rate, ffmpeg will duplicate frames to create that frame rate. Probably not ideal.
- `-f`: input format. `image2` is ffmpeg's image-sequence demuxer; it's usually detected automatically, so this is often optional.
- `-i`: input. `image-%d.png` means just look for a series of images with ascending digits (`image-1.png`, `image-2.png`, etc). There are other ways to write this depending on your filenames. For example, if you have a list of images padded with zeroes to 4 digits (`img0001.png`, `img0002.png`...), use `img%04d.png`.
- `-crf`: quality. 0 is lossless, max is 51. I believe the default is 23, and most examples I found online kept the number somewhere around there.
- `-pix_fmt`: pixel format. I found that without this option and `yuv420p`, the video was created okay and worked online, but QuickTime couldn't open it.
- The last argument is the output: `video.mp4`.
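The `%d` and `%04d` conversions are the same ones `printf` uses, so a quick shell check (the filenames here are made up) shows what each pattern expands to:

```shell
# plain ascending digits, as matched by image-%d.png
for i in 1 2 10; do printf 'image-%d.png ' "$i"; done; echo
# zero-padded to 4 digits, as matched by img%04d.png
for i in 1 2 10; do printf 'img%04d.png ' "$i"; done; echo
```

As far as I can tell, the patterns don't mix: `%d` won't find zero-padded files like `img0001.png`, and `%04d` won't find `img1.png`.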
adjustments: selecting images
use every 3rd image: `not(mod(n,3))` evaluates to 1 (keep) when the frame number n is divisible by 3, and 0 (discard) otherwise - so the other two out of every three frames get dropped
ffmpeg -framerate 30 -i Image%d.jpg -vf "select='not(mod(n,3))',setpts=N/30/TB" -crf 23 Output.mp4
This stackexchange question explains that the `setpts` part retimes the output: `select` drops frames but leaves the survivors' original timestamps, so without that part ffmpeg just duplicates frames to fill the gaps and get to the necessary 30 fps. `N/30/TB` gives each surviving frame a fresh timestamp at 30 fps.
See ffmpeg filters documentation: select. (there are examples below the list of options)
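To convince myself which frames `not(mod(n,3))` actually keeps, here's a throwaway shell loop mimicking the expression (frame numbers start at 0 in the select filter):

```shell
# mimic select='not(mod(n,3))': keep a frame when n mod 3 == 0
kept=""
for n in 0 1 2 3 4 5 6 7 8; do
  if [ $((n % 3)) -eq 0 ]; then
    kept="$kept $n"
  fi
done
echo "kept frames:$kept"   # kept frames: 0 3 6
```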
use only images numbered greater than 2000 (without `setpts` here, you just get a long black screen at the beginning):
ffmpeg -framerate 40 -i image-%d.png -vf "select='gt(n,2000)',setpts=N/30/TB" -crf 23 -pix_fmt yuv420p output.mp4
For frames numbered greater than 50, only use every other frame; otherwise use all frames (I think this is what's happening here - I tested a bunch of commands and it seems to be working...):
ffmpeg -framerate 30 -i lines-%d.png -vf "select='if(gt(n,50),not(mod(n,2)),1)',setpts=N/30/TB" -pix_fmt yuv420p output.mp4
Generally, use n to refer to the nth frame.
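The nested `if(gt(n,50),not(mod(n,2)),1)` expression can be mimicked the same way - a shell sketch over a few sample frame numbers, showing everything kept through frame 50 and only even frames after:

```shell
# mimic select='if(gt(n,50),not(mod(n,2)),1)'
for n in 49 50 51 52 53 54; do
  if [ "$n" -gt 50 ]; then
    keep=$((n % 2 == 0))   # past frame 50: keep only even frame numbers
  else
    keep=1                 # up to and including frame 50: keep everything
  fi
  echo "frame $n keep=$keep"
done
```

Note that frame 50 itself is kept, since `gt(n,50)` is strictly greater-than.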
images to gif
Base ffmpeg command to convert a series of numbered images to gif:
ffmpeg -f image2 -r 60 -i image-%d.png output.gif
Many of these options are similar to those above.
using filters
The command above on its own creates a file that's completely unoptimized and HUGE. So you can add a bunch of other adjustments through a special filter command.
The filter command is preceded by `-filter_complex` or `-vf`. The difference (I think?) is about how many inputs/outputs you have, with `-filter_complex` being used for multiple inputs/outputs with complex filter chains, vs `-vf`, which is a simpler, linear filter chain. (reddit comment about this).
ffmpeg -f image2 -r 60 -i image-%d.png -filter_complex "FILTER_HERE" output.gif
The filter was by far the most confusing part for me, so I'm going to break it into pieces.
ffmpeg filter syntax
Filters are separated with commas and semicolons: commas separate filters, semicolons separate chains of filters. Between semicolons you can also specify the names of inputs & outputs to the chain. If you don't specify a name, it's assumed that the input is just the output of the preceding item.
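As a made-up illustration of that syntax (the filters and `[a]`/`[b]`/`[r]` labels here are arbitrary), you can see the structure by assembling a filtergraph string in the shell:

```shell
# one chain: three filters joined by commas
chain_a='fps=10,scale=320:-1:flags=lanczos'
# two chains joined by a semicolon; [a] and [b] name the split outputs,
# and [b] feeds reverse, whose output is labeled [r]
chain_b='split[a][b];[b]reverse[r]'
graph="$chain_a,$chain_b"
echo "$graph"
```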
set scale & frame rate
fps=10,scale=320:-1:flags=lanczos,
- `fps=10` sets the output frame rate to 10fps
- `scale=320:-1` scales images down to 320px wide; -1 keeps the same aspect ratio
- `flags=lanczos` scales using the Lanczos scaling algorithm. There are a bunch of other scaling algorithms you can experiment with if you want.
add a boomerang effect
split[main][back];[back]reverse[r];[main][r]concat=n=2:v=1:a=0,
For this particular project I wanted the animation to play forward and then backward. If you don't want this effect, you can leave this part out.
- `split` at the beginning copies the input into 2 segments, one named `main` and one named `back`
- `reverse` takes the `back` segment as input, reverses it, and outputs it as another item labeled `r`
- `concat` concatenates `main` (the normal, forward sequence) and `r` (the reversed one)
  - `n=2`: there are 2 input segments
  - `v=1`: one video stream per segment
  - `a=0`: 0 audio streams per segment
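Conceptually, for a 4-frame input, the boomerang ends up playing the frames in this order (a plain shell sketch, not ffmpeg):

```shell
# [main] is the forward sequence
frames="1 2 3 4"
# build [r], the reversed copy, by prepending each frame
reversed=""
for f in $frames; do
  reversed="$f $reversed"
done
# concat=n=2 plays the two segments back to back: forward, then backward
echo "boomerang order: $frames ${reversed% }"   # boomerang order: 1 2 3 4 4 3 2 1
```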
use a custom palette
split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse
GIF is limited to 256 colors. By default ffmpeg will just use a generic palette that attempts to work with a variety of content. In my case, I was working with black & white images, so a palette attempting to cover the entire space would be mostly wasted, and I'd be missing lots of nuance in the shades.
This filter step creates a palette specifically tailored to your content.
- `split` again copies your input into 2 segments, this time named `s0` and `s1`
- `palettegen` creates a palette from the `s0` segment; we designate the created palette as `p`
- `paletteuse` tells ffmpeg that when building our output gif from `s1` (the sequence with all our previous filters performed on it), it should use palette `p` instead of the default
If you want, you can separate these steps out and generate a palette from your images, save the palette as a png, and then use that saved png as a second input when you create the gif. That looks something like:
filters=fps=10,scale=320:-1:flags=lanczos # or whatever other filters you're using
ffmpeg -f image2 -i image-%d.png -vf "$filters,palettegen" palette.png # first command to create the palette
ffmpeg -i image-%d.png -i palette.png -filter_complex "$filters[x];[x][1:v] paletteuse" output.gif
But with the first way, you get it all done at once without having to deal with an extra palette file.
There are a bunch of options and details here if you want to go into them; this blog post discusses palettes in more detail.
example commands
Using boomerang effect, generating a palette, using `-filter_complex`:
ffmpeg -f image2 -r 60 -i image-%d.png -filter_complex "fps=10,scale=320:-1:flags=lanczos,split[main][back];[back]reverse[r];[main][r]concat=n=2:v=1:a=0,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" output.gif
Split into variables, no boomerang, using `-vf`:
filters="fps=10,scale=320:-1:flags=lanczos"
palette="split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse"
ffmpeg -f image2 -r 60 -i image-%d.png -vf "$filters,$palette" output.gif
You can also add an output frame rate (a trailing `-r`) when using `-vf`. I sort of fiddle with this vs the input rate to make the speed different, which, erm... sometimes works?
ffmpeg -f image2 -r 60 -i image-%d.png -vf "$filters,$palette" -r 30 output.gif
resources
- pkh.me: high quality gif with ffmpeg
- ffmpeg wiki: filtering guide
- code project: how to use ffmpeg filters to jazz up your audio and video (explains some of the filter syntax details)
- hammad mazhar: using ffmpeg to convert a set of images into video
- a bunch of stackexchange type posts: