Using Pydub to make mp3 file from audio file

koontz2k4 · November 30, 2021, 10:04pm

What I’m trying to do:

I’m trying to convert an audio file that has been (successfully) uploaded to my data table to mp3 format. I’d like to convert “call_audio.mp3” to an mp3. Despite the name, it is not being recognized as a .mp3 file upstream. I believe it is still being treated as a wav file.

What I’ve tried and what’s not working:
I’m using pydub, and so far I’ve tried:

Code Sample:

call_recording = AudioSegment.from_wav(app_tables.recorded_audio.get(audio=q.not_(None)))

The error I am getting is:

anvil.tables.TableError: Cannot use a Media object as a search query

My question is: If I cannot use a Media object as a search query, how would pydub know what file to transfer to mp3?


jshaffstall · November 30, 2021, 10:33pm

You would need to search by something other than the media field. You have a field named audio_uuid that looks like a unique id. You could search by the audio_uuid for the file you want to convert.

koontz2k4 · November 30, 2021, 11:08pm

Thanks Jeff. I tried using search on the UUID with a full_text_match, and it gave me an attribute error: “seek”.

Code Sample:

@anvil.server.callable
def get_blobmedia(b64str,mediatype):
    binary_content = base64.standard_b64decode(b64str)
    my_media = anvil.BlobMedia(content_type="audio/mp3", content=binary_content, name="call_audio.mp3")
    #mp3_recording = AudioSegment.from_wav(app_tables.recorded_audio.search(audio=q.full_text_match("call_audio.mp3")))
    #mp3_recording = AudioSegment.from_wav(app_tables.recorded_audio.get(audio=q.not_(None)))
    mp3_recording = AudioSegment.from_wav(app_tables.recorded_audio.search(audio_uuid=q.full_text_match("c2023cc2-ab9e-4516-84cc-789e49aff4b6")))
    mp3_recording.export("call_audio.mp3", format="mp3")
    return my_media


jshaffstall · December 1, 2021, 12:02am

Just use the value, e.g.:

app_tables.recorded_audio.search(audio_uuid="c2023cc2-ab9e-4516-84cc-789e49aff4b6")

You should also be using get instead of search, since you know you’re only going to get one result out of it.

Your larger issue, though, is that you’re calling AudioSegment.from_wav, which expects a file name, and passing it a data table row.

I’ve never had to deal with getting media from a data table into pydub, so can’t help you with that, but a lot of other folks on the forum have done similar things. Do some searching around about how to create a temporary file from a media object, or maybe see if pydub has a function that will take the actual bytes of the file instead of the file name.
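For the temporary-file route mentioned above, a minimal sketch using only the standard library might look like the following. It assumes you already have the raw bytes in hand (for example from the Media object's get_bytes()). The helper name write_temp_audio and the default suffix are made up for illustration.

```python
import os
import tempfile


def write_temp_audio(data: bytes, suffix: str = ".wav") -> str:
    """Write raw bytes to a named temporary file and return its path.

    Hypothetical helper: the caller is responsible for deleting the file.
    """
    # delete=False keeps the file on disk after the handle is closed,
    # so a library that wants a file name can open it afterwards.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
        return tmp.name


if __name__ == "__main__":
    path = write_temp_audio(b"not real audio, just placeholder bytes")
    try:
        with open(path, "rb") as f:
            print(len(f.read()), "bytes written to", path)
    finally:
        os.remove(path)  # clean up the temporary file
```

The returned path could then be handed to anything that expects a file name.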

ianb · December 1, 2021, 8:00pm

A quick check on StackOverflow for a similar pydub question suggests you should be able to pass a file-like object to the .from_wav() method. This means you could use BytesIO.

So putting everything so far together:

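import io

from anvil.tables import app_tables
from pydub import AudioSegment

row = app_tables.recorded_audio.get(audio_uuid="c2023cc2-ab9e-4516-84cc-789e49aff4b6")

if row is not None:
  # Wrap the stored Media object's bytes in an in-memory file-like object
  f_stream = io.BytesIO(row['audio'].get_bytes())

  call_recording = AudioSegment.from_wav(f_stream)

  #etc etc
  ...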
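To illustrate what "file-like object" means here, this is a small standard-library sketch (no pydub or Anvil involved). It builds a tiny WAV in memory with the wave module and reads it back through io.BytesIO, which supports the read and seek calls a decoder needs. The function names and the sample parameters are arbitrary choices for the example.

```python
import io
import wave


def make_silent_wav(n_frames: int = 8000, rate: int = 8000) -> bytes:
    """Return the bytes of a mono, 16-bit WAV containing silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)      # 2 bytes per sample = 16-bit audio
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def wav_duration_seconds(data: bytes) -> float:
    """Open WAV bytes through a file-like object and compute the duration."""
    stream = io.BytesIO(data)  # behaves like an open binary file
    with wave.open(stream, "rb") as w:
        return w.getnframes() / w.getframerate()


if __name__ == "__main__":
    wav_bytes = make_silent_wav()
    print(wav_bytes[:4])                    # a real WAV starts with b"RIFF"
    print(wav_duration_seconds(wav_bytes))
```

A table row or a search result has no seek method, which would fit the attribute error mentioned earlier in the thread.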

koontz2k4 · December 3, 2021, 4:37am

Thank you for that @ianb. The code returned the below error. I’m not sure if it doesn’t like the use of io, or if it’s a problem with my media file.

  
@anvil.server.callable
def get_blobmedia(b64str,mediatype):
    binary_content = base64.standard_b64decode(b64str)
    my_media = anvil.BlobMedia(content_type="audio/mp3", content=binary_content, name="call_audio111.mp3")
    row = app_tables.recorded_audio.get(audio_uuid="db3bec0c-9bc9-48a9-8eb2-9e20d0d25b0d")
    if row is not None:
      f_stream = io.BytesIO(row['audio'].get_bytes())
      call_recording = AudioSegment.from_wav(f_stream)
      call_recording.export("call_audio.mp3", format="mp3")
      #app_tables.recorded_audio.add_row(timestamp=now, audio_uuid=my_uuid, audio="test.mp3")
    return my_media

CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Output from ffmpeg/avlib:

ffmpeg version 4.1-static John Van Sickle - FFmpeg Static Builds Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 6.3.0 (Debian 6.3.0-18+deb9u1) 20170516
configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc-6 --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg
libavutil 56. 22.100 / 56. 22.100
libavcodec 58. 35.100 / 58. 35.100
libavformat 58. 20.100 / 58. 20.100
libavdevice 58. 5.100 / 58. 5.100
libavfilter 7. 40.101 / 7. 40.101
libswscale 5. 3.100 / 5. 3.100
libswresample 3. 3.100 / 3. 3.100
libpostproc 55. 3.100 / 55. 3.100
[wav @ 0x61cbcc0] invalid start code [26]E[223][163] in RIFF header
[cache @ 0x61cc540] Statistics, cache hits:0 cache misses:1
cache:pipe:0: Invalid data found when processing input

at /usr/local/lib/python3.7/site-packages/pydub/audio_segment.py, line 725
called from /usr/local/lib/python3.7/site-packages/pydub/audio_segment.py, line 750
called from AudioServer, line 44
called from AudioRecorder, line 24


ianb · December 3, 2021, 2:28pm

So it looks like the error is with whatever you are passing to pydub. I think earlier in this thread you mentioned not being sure whether it was a wav or not.

It looks like the Anvil part is working ok.

Maybe check the pydub docs? I’m not sure why a library is trying to decode audio data using ffmpeg, but I don’t know that much about what pydub does.

The error mentions [wav @ 0x61cbcc0] invalid start code [26]E[223][163] in RIFF header, so I’m guessing your assumption that you are passing it a full .wav file, and not just the audio data from that file, may be incorrect?

I think you are going to have to revisit your code that leads you here.

koontz2k4 · December 4, 2021, 4:36pm

I think I see the issue. When I print the audio type, I get “audio/webm;codecs=opus”. So it doesn’t appear to be a .wav file like I thought. Now I think I just need to figure out how to convert from “opus” to mp3. I’ll post again when I figure it out.
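As a cross-check, the start code in the ffmpeg output above ([26]E[223][163], i.e. bytes 0x1A 0x45 0xDF 0xA3) is the EBML signature that WebM/Matroska files begin with, which is consistent with the audio/webm type. A rough sketch for sniffing the container from the first bytes follows. The function name is hypothetical and the checks are deliberately minimal, so treat the result as a hint rather than a guarantee.

```python
def sniff_audio_container(data: bytes) -> str:
    """Guess the container format from the leading magic bytes."""
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"\x1a\x45\xdf\xa3":   # EBML header: WebM / Matroska
        return "webm"
    if data[:4] == b"OggS":               # Ogg page marker
        return "ogg"
    if data[:3] == b"ID3":                # MP3 file starting with an ID3v2 tag
        return "mp3"
    return "unknown"


if __name__ == "__main__":
    # The four bytes reported in the ffmpeg message, followed by padding
    header = bytes([26, ord("E"), 223, 163])
    print(sniff_audio_container(header + b"\x00" * 8))
```

Something like this could be run on row['audio'].get_bytes() before deciding which format to tell pydub to expect.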

koontz2k4 · December 4, 2021, 7:21pm

A couple of simple tweaks got me past that issue. For example, it needed to be “AudioSegment.from_file”, instead of “AudioSegment.from_wav”. I was incorrectly assuming it was a wav file.

And I needed to pass the format explicitly:

AudioSegment.from_file(f_stream, "webm")

Instead of:

AudioSegment.from_file(f_stream)

However, when I try to save the newly converted mp3 file to the data table, I get the below error:

anvil.server.SerializationError: Cannot serialize arguments to function. Cannot serialize <class '_io.BufferedRandom'> object at msg['kwargs']['audio']

Any idea how I should save the mp3 file to the data table if “add_row” won’t work?

@anvil.server.callable
def get_blobmedia(b64str,mediatype):
    binary_content = base64.standard_b64decode(b64str)
    my_media = anvil.BlobMedia(content_type="audio/mp3", content=binary_content, name="call_audio.mp3")
    row = app_tables.recorded_audio.get(audio_uuid="19126b6c-30d7-4118-8488-aad316666f57")
    if row is not None:
      f_stream = io.BytesIO(row['audio'].get_bytes())
      call_recording = AudioSegment.from_file(f_stream, "webm")
      call_recording = call_recording.export("call_audio.mp3", format="mp3")
      print(call_recording.name)
      app_tables.recorded_audio.add_row(timestamp=datetime.datetime.now(), audio_uuid=str(uuid.uuid4()), audio=call_recording)
    return my_media

jshaffstall · December 4, 2021, 8:31pm

add_row works fine; you’re just passing it something it isn’t prepared to deal with (whatever the return from call_recording.export is).

You need to take that return value and build a Media object from it, and put the Media object into the table. Look at the BlobMedia example in the docs: Anvil Docs | Files, Media and Binary Data

If the return from call_recording.export is a byte stream, then you should be good just following the BlobMedia example. If it’s something else, you’ll need to convert it to a byte stream.
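To make the "convert it to a byte stream" step concrete: the error message names _io.BufferedRandom, which is the type Python returns for a file opened in a binary read/write mode such as "wb+". A hedged sketch of pulling the bytes back out of such an object is below, using only the standard library. read_all_bytes is a made-up helper name, and the commented-out BlobMedia line is untested here because it needs the Anvil runtime.

```python
import io
import os
import tempfile


def read_all_bytes(file_obj) -> bytes:
    """Return the full contents of a seekable binary file object."""
    file_obj.seek(0)          # rewind, in case the position is at the end
    return file_obj.read()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "call_audio.mp3")
    with open(path, "wb+") as f:
        print(type(f))        # <class '_io.BufferedRandom'>
        f.write(b"placeholder, not real mp3 data")
        data = read_all_bytes(f)
    print(len(data))
    # In an Anvil server module the bytes could then be wrapped, e.g.:
    # media = anvil.BlobMedia("audio/mpeg", data, name="call_audio.mp3")
```

The same helper works on an io.BytesIO, so it should not matter whether the export target was a file on disk or an in-memory buffer.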

smuts1989 · December 4, 2021, 9:27pm

You could try the code in the following anvil thread:

I would suggest saving the audio files to Dropbox then reading them from there.

ianb · December 6, 2021, 3:41pm

koontz2k4:

@anvil.server.callable
def get_blobmedia(b64str,mediatype):
    binary_content = base64.standard_b64decode(b64str)
    my_media = anvil.BlobMedia(content_type="audio/mp3", content=binary_content, name="call_audio.mp3")
    row = app_tables.recorded_audio.get(audio_uuid="19126b6c-30d7-4118-8488-aad316666f57")
    if row is not None:
      f_stream = io.BytesIO(row['audio'].get_bytes())
      call_recording = AudioSegment.from_file(f_stream, "webm")
      call_recording = call_recording.export("call_audio.mp3", format="mp3")
      print(call_recording.name)
      app_tables.recorded_audio.add_row(timestamp=datetime.datetime.now(), audio_uuid=str(uuid.uuid4()), audio=call_recording)
    return my_media

I’m guessing this would also work (keeping everything in memory; why not keep the same theme all the way through):

@anvil.server.callable
def get_blobmedia(b64str,mediatype):
    binary_content = base64.standard_b64decode(b64str)
    my_media = anvil.BlobMedia(content_type="audio/mp3", content=binary_content, name="call_audio.mp3")
    row = app_tables.recorded_audio.get(audio_uuid="19126b6c-30d7-4118-8488-aad316666f57")
    if row is not None:
      f_stream = io.BytesIO(row['audio'].get_bytes())
      call_recording = AudioSegment.from_file(f_stream, "webm")
      with io.BytesIO() as f_out_stream:
        call_recording.export(f_out_stream, format="mp3")
        # As far as I can tell an AudioSegment has no .name attribute, so name the Media object explicitly
        mp3_media = anvil.BlobMedia("audio/mpeg", f_out_stream.getvalue(), name="call_audio.mp3")

        app_tables.recorded_audio.add_row(timestamp=datetime.datetime.now(), audio_uuid=str(uuid.uuid4()), audio=mp3_media)

    return my_media