I’m trying to build a YouTube transcription app that delivers by email: the user enters a YouTube video link and, once the transcription is ready, the app emails them the transcript of the video.
To do that, as soon as the user presses Enter in the text box, a background task is launched that streams the YouTube video into a buffer and returns it as a Media object. I then read that object with PyDub’s AudioSegment and transcribe it.
The issue I’m facing is that the server uses too much memory when it loads the Media object as an AudioSegment object. A possible solution would be to split the Media object into two or more parts, but I don’t know how to do that.
Can someone please help me regarding this?
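For anyone landing here with the same memory problem, here is a minimal sketch of the "split it into parts" idea. It assumes the audio has first been written to a file on disk and that the total length in seconds is already known (for example from the video metadata). The names chunk_windows and iter_audio_chunks are made up for illustration. It relies on the start_second and duration arguments of AudioSegment.from_file, which recent PyDub versions accept, so only one window is decoded at a time.

```python
from typing import Iterator, Tuple


def chunk_windows(total_seconds: float, chunk_seconds: float) -> Iterator[Tuple[float, float]]:
    """Yield (start, duration) pairs that cover total_seconds without overlap."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    start = 0.0
    while start < total_seconds:
        # The last window is shortened so it does not run past the end.
        yield start, min(chunk_seconds, total_seconds - start)
        start += chunk_seconds


def iter_audio_chunks(path: str, total_seconds: float, chunk_seconds: float = 300.0):
    """Decode one window at a time (hypothetical helper, not part of PyDub)."""
    # Imported lazily so the window arithmetic above works without PyDub installed.
    from pydub import AudioSegment

    for start, duration in chunk_windows(total_seconds, chunk_seconds):
        # start_second/duration ask ffmpeg to decode only this slice of the file.
        yield AudioSegment.from_file(path, start_second=start, duration=duration)
```

Each yielded chunk can be transcribed and then dropped before the next one is decoded, so peak memory depends on chunk_seconds rather than on the length of the whole video. The 300-second default is an arbitrary choice, not a recommendation.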
Sure! It wasn’t Google, it was ChatGPT, but “googling” sounded better than “chatgpting”. I’m attaching a screenshot rather than sharing the link because I will delete that chat from my chat history. You can right-click and open it in a new tab to read the content.
That’s the second attempt at the second question. The first attempt didn’t have the final “in Python”, so the answer talked about editing software rather than Python libraries.
If you only need the audio from a YouTube video, you can use youtube-dl, which supports splitting the streams and fetching just the audio stream in one of several formats.
Do not use youtube-dl if your end product is going to go directly back into a YouTube video. They don’t like that it exists, but there isn’t anything wrong or illegal about using it for anything else.
Install it with pip install youtube-dl; it’s just a Python library. Google how to gather the stream info from the YouTube URL and then pick an audio stream out of the results. It’s quite easy to do programmatically.
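A rough sketch of what "gather the stream info and pick an audio stream" could look like. The helper names pick_audio_format and fetch_formats are made up for this example. It assumes the usual youtube-dl metadata layout, where each entry in the formats list carries vcodec, acodec and abr fields and audio-only entries report vcodec as "none".

```python
from typing import Dict, List, Optional


def pick_audio_format(formats: List[Dict]) -> Optional[Dict]:
    """Return the audio-only format with the highest audio bitrate, if any."""
    audio_only = [
        f for f in formats
        # Audio-only entries have no video codec but do have an audio codec.
        if f.get("vcodec") == "none" and f.get("acodec") not in (None, "none")
    ]
    if not audio_only:
        return None
    # abr (audio bitrate) may be missing or None, so treat that as 0.
    return max(audio_only, key=lambda f: f.get("abr") or 0)


def fetch_formats(url: str) -> List[Dict]:
    """Ask youtube-dl for metadata only; nothing is downloaded."""
    # Imported lazily so pick_audio_format can be used without youtube-dl installed.
    from youtube_dl import YoutubeDL

    with YoutubeDL({"quiet": True}) as ydl:
        info = ydl.extract_info(url, download=False)
    return info.get("formats", [])
```

Something like pick_audio_format(fetch_formats(video_url)) would then give a single format entry, where video_url is whatever link the user submitted. Picking the highest bitrate is just one possible policy. For transcription a lower bitrate may be enough and would use less bandwidth and memory.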
Thanks for sharing. The reason I asked was that moviepy did work (it’s slow, but at least it works). I just wanted to know what you searched for, because I spent 3 days searching on Google, and it seems you found it quickly.