I stumbled on this while using Whisper in my app and found out that ffmpeg is already installed on the Anvil server. You don’t actually need ffmpeg to use Whisper, but I needed it to convert between different audio file formats.
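For anyone wanting to try the conversion step, here is a rough sketch of calling ffmpeg from Python with `subprocess`. The function names and file names are made up for illustration, and it assumes `ffmpeg` is on the PATH of the machine running the code (in my case the Anvil server); ffmpeg picks the output format from the destination file's extension.

```python
import shutil
import subprocess


def build_ffmpeg_cmd(src, dst):
    # -y overwrites dst if it already exists; -i names the input file.
    # ffmpeg infers the output format from dst's extension (e.g. .wav, .mp3).
    return ["ffmpeg", "-y", "-i", src, dst]


def convert_audio(src, dst):
    # Hypothetical helper: fail early with a clear message if ffmpeg is missing.
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True, capture_output=True)
    return dst
```

Something like `convert_audio("recording.m4a", "recording.wav")` would then produce a WAV file next to the original, assuming the input is a format ffmpeg can read.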
You can also use the Whisper API by just installing the openai package:
https://platform.openai.com/docs/guides/speech-to-text
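As a rough sketch of what the API route can look like, assuming version 1.x or later of the `openai` package and an API key in the `OPENAI_API_KEY` environment variable. The `transcribe` helper and its `client` parameter are my own naming, not part of the package:

```python
def transcribe(path, client=None):
    # Build a default client only when none is passed in, so the
    # openai package is imported lazily and only when actually needed.
    if client is None:
        from openai import OpenAI
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
    # The audio file must be opened in binary mode for upload.
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )
    # With the default response format, the result carries the text in .text.
    return result.text
```

Calling `transcribe("recording.mp3")` should return the transcript as a string. Passing your own `client` is optional and mostly useful if you already have one configured elsewhere.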
Here’s a post that is helpful when using ffmpeg: