How to split a Media object in two equal parts?

crul · August 26, 2023, 10:07pm

I’m trying to build a YouTube transcription app that delivers its results by email: the user enters a YouTube video link and, once the transcription is ready, the app mails them the transcript of the video.

To do that, as soon as the user presses Enter in the text box, a background task is launched that streams the YouTube video to a buffer and returns it as a Media object. I then read that object using PyDub’s AudioSegment and transcribe it.

The issue I’m facing is that the server uses too much memory when it loads the Media object as an AudioSegment object. A possible solution would be to split the Media object into two or more parts, but I don’t know how to do that.
Can someone please help me with this?

stefano.menci · August 26, 2023, 11:43pm

I haven’t tried it, but after a quick googling I would try moviepy (pip install moviepy).

crul · August 27, 2023, 8:09am

Would you mind telling me what you searched?

stefano.menci · August 28, 2023, 1:29pm

Sure! It wasn’t Google, it was ChatGPT, but googling sounded better than chatgpting. I’m attaching a snapshot rather than sharing the link because I will delete that chat from my chat history. You can right-click and open it in a new tab to read the content.

That’s the second attempt at the second question. The first attempt didn’t have the final “in python” and was talking about editing software rather than Python libraries.

ianb · August 28, 2023, 1:59pm

If you are only using the audio from a YouTube video, you can use youtube-dl, which supports splitting the streams and fetching just an audio stream in one of several formats.

Do not use youtube-dl if your end product is going to go directly back onto YouTube; they don’t like that it exists. There isn’t anything wrong or illegal about using it for anything else, though.

Install it with pip install youtube-dl; it’s just a Python library. Google how to gather the stream info from the YouTube URL and then pick an audio stream out of the results. It’s quite easy to do programmatically.
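A minimal sketch of gathering the stream info and picking out the audio-only entries, assuming youtube-dl's embedding API (`youtube_dl.YoutubeDL` and `extract_info` with `download=False`). The helper names `audio_only_formats` and `list_audio_streams` are hypothetical, and sorting by the `abr` field (lowest bitrate first) is an assumption about what is wanted when memory is the concern.

```python
def audio_only_formats(formats):
    """Keep formats that carry audio but no video, lowest audio bitrate first."""
    audio = [
        f for f in formats
        if f.get("vcodec") == "none" and f.get("acodec") not in (None, "none")
    ]
    # Some entries may lack "abr"; treat a missing bitrate as 0 for sorting.
    return sorted(audio, key=lambda f: f.get("abr") or 0)


def list_audio_streams(url):
    """Fetch stream metadata for a video without downloading the media itself."""
    import youtube_dl  # lazy import; requires `pip install youtube-dl`

    with youtube_dl.YoutubeDL({"quiet": True}) as ydl:
        info = ydl.extract_info(url, download=False)
    return audio_only_formats(info.get("formats", []))
```

Each returned entry is a dict describing one stream; the first one should be the lightest audio-only option, if the video offers any.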

crul · August 28, 2023, 5:02pm

Thanks for sharing. The reason I asked is that moviepy did work (it’s slow, but at least it works). I just wanted to know what you searched for because I spent 3 days searching on Google, but it seems like you found it quickly.

crul · August 28, 2023, 5:03pm

I’m using PyTube because youtube-dl doesn’t work on my computer, so I can’t test the code locally.

ianb · August 28, 2023, 5:07pm

So are you already getting the audio-only versions (if available)?
https://pytube.io/en/latest/user/streams.html#filtering-for-audio-only-streams

They should be much smaller than the entire video.
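For reference, filtering for audio-only streams with pytube can be sketched roughly as follows. `streams.filter(only_audio=True)`, `filesize` and `stream_to_buffer` are pytube calls; `smallest_stream` and `fetch_audio_to_buffer` are hypothetical helper names, and choosing the smallest file is an assumption that memory matters more than audio quality here.

```python
import io


def smallest_stream(streams):
    """Return the stream with the smallest reported file size."""
    return min(streams, key=lambda s: s.filesize)


def fetch_audio_to_buffer(url):
    """Stream the smallest audio-only track of a video into an in-memory buffer."""
    from pytube import YouTube  # lazy import; requires `pip install pytube`

    # Only audio tracks are considered; reading `filesize` may trigger
    # a network request per stream.
    audio_streams = YouTube(url).streams.filter(only_audio=True)
    stream = smallest_stream(audio_streams)

    buffer = io.BytesIO()
    stream.stream_to_buffer(buffer)
    buffer.seek(0)  # rewind so the caller can read from the start
    return buffer
```

Note that `min` raises `ValueError` if the video has no audio-only streams, so a real app would probably want to handle that case.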

crul · August 28, 2023, 5:08pm

Yeah, I was using the audio streams only. moviepy worked because it has a class called “AudioFileClip”.
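For anyone landing here later, a rough sketch of splitting an audio file into equal parts with AudioFileClip. It assumes moviepy 1.x, where the method is called `subclip` (moviepy 2.x renames it to `subclipped` and drops `moviepy.editor`), and it assumes the audio has already been written to a file on disk. `split_points`, `split_audio_file` and the `.partN.mp3` naming are hypothetical choices for illustration, not the code actually used in this thread.

```python
def split_points(duration, parts=2):
    """Return (start, end) times in seconds that cut `duration` into equal parts."""
    step = duration / parts
    return [
        # The last part ends exactly at `duration` to avoid rounding drift.
        (i * step, duration if i == parts - 1 else (i + 1) * step)
        for i in range(parts)
    ]


def split_audio_file(path, parts=2):
    """Write each equal-length part of an audio file to its own MP3 file."""
    from moviepy.editor import AudioFileClip  # lazy import; moviepy 1.x layout

    out_paths = []
    clip = AudioFileClip(path)
    try:
        for index, (start, end) in enumerate(split_points(clip.duration, parts)):
            out_path = f"{path}.part{index}.mp3"
            clip.subclip(start, end).write_audiofile(out_path)
            out_paths.append(out_path)
    finally:
        clip.close()  # release the underlying ffmpeg reader
    return out_paths
```

Each part can then be loaded and transcribed on its own, so only one piece needs to be in memory at a time. With an Anvil Media object, the data would first have to be written to a temporary file; `anvil.media.TempFile` looks like one way to do that.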