Processing audio on the server

peter.bleackley · February 28, 2022, 2:44pm

I’m currently building an app that records audio from the user’s microphone and sends it to the server, where it is analysed. The results of the analysis are then returned to the client.

So far, I can capture the audio in a Media object (16 kHz, 16 bits/sample), send it to the server, store it in a DataTable, and play the audio from the DataTable. What I’m stuck on, however, is how to use the audio on the server side.

What I need is an iterable of raw audio samples. If the audio were WAV, i.e. uncompressed, this would be simple. However, the media objects stored in my DataTable appear to be Ogg Opus, which suggests that I would need to decode them first. How would I do this? I can see that the PyAudio library is available, but I’m not very familiar with it and don’t know whether it can decode this format.
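
For the uncompressed case, a minimal sketch using only the Python standard library might look like the following. The function name `wav_samples` is hypothetical, and the code assumes 16-bit PCM data, matching the 16 bits/sample mentioned above:

```python
import array
import io
import sys
import wave


def wav_samples(wav_bytes):
    """Return the 16-bit PCM samples of a WAV file as an array of ints.

    For multi-channel audio the samples are interleaved
    (left, right, left, right, ...).
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")  # "h" = signed 16-bit integer
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return samples
```

The returned `array` is iterable, so it can be passed directly to code that expects a sequence of integer samples. This does not work for Ogg Opus, which has to be decoded first.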

ianb · February 28, 2022, 3:54pm

This thread ended up with many solutions, some of which may be what you are looking for.
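
Independently of that thread, one common approach is to hand the compressed bytes to an external decoder. PyAudio is a wrapper around PortAudio for recording and playback, so it is probably not the right tool for decoding. The sketch below assumes that the `ffmpeg` command-line tool is installed on the server and was built with Opus support, which may not be true in every environment. The function names are hypothetical:

```python
import array
import subprocess
import sys


def ffmpeg_command(sample_rate=16000):
    """Build an ffmpeg command that reads audio from stdin and writes
    mono, signed 16-bit little-endian PCM to stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-i", "pipe:0",                         # input: stdin
        "-f", "s16le", "-acodec", "pcm_s16le",  # output: raw 16-bit PCM
        "-ac", "1", "-ar", str(sample_rate),    # mono, requested sample rate
        "pipe:1",                               # output: stdout
    ]


def pcm_to_samples(pcm_bytes):
    """Convert raw 16-bit little-endian PCM bytes to an array of ints."""
    samples = array.array("h")
    samples.frombytes(pcm_bytes)  # length must be a multiple of 2 bytes
    if sys.byteorder == "big":
        samples.byteswap()
    return samples


def decode_to_samples(audio_bytes, sample_rate=16000):
    """Decode compressed audio (e.g. Ogg Opus) with ffmpeg.

    Assumes ffmpeg is on the PATH; raises CalledProcessError if it fails.
    """
    result = subprocess.run(
        ffmpeg_command(sample_rate),
        input=audio_bytes,
        stdout=subprocess.PIPE,
        check=True,
    )
    return pcm_to_samples(result.stdout)
```

With an Anvil Media object, the input would be something like `decode_to_samples(media.get_bytes())`. If ffmpeg is not available, a Python package that wraps an Opus decoder would be needed instead.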