I can’t imagine the scale that FFmpeg operates at. Even a small improvement has to mean thousands and thousands of hours of compute saved. Insanely useful project.
prisenco · 45m ago
Their commitment to performance is a beautiful thing.
Imagine all projects were similarly committed.
byteknight · 34m ago
Seems so easy! You only need the entire world even tangentially related to video to rely solely on your project for a task and you too can have all the developers you need to work on performance!
ackfoobar · 2m ago
I seem to recall that they lamented on twitter the low amount of (monetary or code) contribution they got, despite how heavily they are used.
zahlman · 44m ago
It'd be nice, though, to have a proper API (in the traditional sense, not SaaS) instead of having to figure out these command lines in what's practically its own programming language....
codys · 40m ago
FFmpeg does have an API. It ships a few libraries (libavcodec, libavformat, and others) which expose a C API that is used in the ffmpeg command line tool.
What is the actual process of identifying hotspots caused by suboptimal compiler-generated assembly?
Would it ever make sense to write handwritten compiler intermediate representation like LLVM IR instead of architecture-specific assembly?
molticrystal · 12m ago
It would be interesting to look into this to see if anybody has ever hand-tuned LLVM IR.
My best guess is that it would make sense if you were doing codegen for several different instruction sets, and the optimization or side-channel prevention is too difficult or specialized to automate, so you have to do it by hand.
Alifatisk · 55m ago
How do they make these assembly instructions portable across different cpus?
CannotCarrot · 34m ago
I think there's a generic C fallback, which can also serve as a baseline. But for the big (targeted) architectures, there's one handwritten assembly version per arch.
faluzure · 4m ago
Yup.
On startup, it runs cpuid and assigns each operation the most optimal function pointer for that architecture.
In addition to things like ‘supports avx’ or ‘supports sse4’ some operations even have more explicit checks like ‘is a fifth generation celeron’. The level of optimization in that case was optimizing around the cache architecture on the cpu iirc.
Source: I did some dirty things with Chrome's Native Client and ffmpeg 10 years ago.
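For anyone who hasn't seen the pattern, here is a minimal sketch of the "detect once, then call through a function pointer" idea described above. This is not FFmpeg's actual code: `sad_fn`, `sad_c`, `sad_avx2_stub`, `cpu_has_avx2` and `sad_init` are all made-up names, the "AVX2" routine is only a stand-in that calls the C version, and the feature check assumes GCC or Clang on x86 (it falls back to plain C everywhere else).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation: sum of absolute differences over n bytes. */
typedef unsigned (*sad_fn)(const uint8_t *a, const uint8_t *b, size_t n);

/* Portable C fallback; also the baseline a fast path would be checked against. */
static unsigned sad_c(const uint8_t *a, const uint8_t *b, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
    return sum;
}

/* Stand-in for a handwritten AVX2 routine (assumption: a real project
 * would link an assembly implementation here instead). */
static unsigned sad_avx2_stub(const uint8_t *a, const uint8_t *b, size_t n)
{
    return sad_c(a, b, n);
}

/* Runtime feature check. Uses the GCC/Clang x86 builtins; reports
 * "no AVX2" on any other compiler or architecture. */
static int cpu_has_avx2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return 0;
#endif
}

/* The pointer callers go through; defaults to the C version. */
static sad_fn sad = sad_c;

/* Run once at startup to pick the best available implementation. */
static void sad_init(void)
{
    sad = cpu_has_avx2() ? sad_avx2_stub : sad_c;
}
```

Callers would run `sad_init()` once and then just call `sad(a, b, n)`, so the per-call cost is a single indirect call no matter how many variants exist.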
They publish doxygen generated documentation for the APIs, available here: https://ffmpeg.org/doxygen/trunk/
https://github.com/FFmpeg/FFmpeg/blob/master/libavutil/x86/x...