Unzipping Files

A client of mine recently sent a zip archive of files that needed to be extracted into a data table.

I didn’t want to use the file system, and I also didn’t want every file in the archive extracted into memory at once. Here’s what I came up with: a generator function that yields each zipped file as a media object, one at a time.

import io
import zipfile

import anvil


def get_zipped_files(media):
    """A generator of files contained within a zip archive

    Parameters
    ----------
    media : anvil.Media instance
        The zip archive

    Yields
    ------
    anvil.BlobMedia instance
        One per file in the archive
    """
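    with zipfile.ZipFile(io.BytesIO(media.get_bytes())) as zipped:
        # Skip directory entries; only real files are yielded
        for zipinfo in [zi for zi in zipped.infolist() if not zi.is_dir()]:
            # Keep only the base file name, dropping any directory path
            name = zipinfo.filename.split("/")[-1]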
            with zipped.open(zipinfo) as file:
                yield anvil.BlobMedia(
                    content_type="application/pdf", content=file.read(), name=name
                )

In my case, each of the files was a PDF, so I could hard-code the content type. They were also in a directory structure, but I only wanted the files themselves, which is why the name keeps just the last part of each path.
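If the archive holds mixed file types, the hard-coded content type could be swapped for a guess based on the file extension. Here is a rough sketch of that idea using only the standard library, so it runs outside Anvil too. The function name `iter_zip_members` and the `default_type` fallback are my own illustrative choices, not part of the original function:

```python
import io
import mimetypes
import posixpath
import zipfile


def iter_zip_members(zip_bytes, default_type="application/octet-stream"):
    """Yield (name, content_type, content) for each file in a zip archive.

    Hypothetical helper: a stdlib-only variation on get_zipped_files.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zipped:
        for zipinfo in zipped.infolist():
            if zipinfo.is_dir():
                continue
            # Zip member paths use "/" separators, so posixpath works here
            name = posixpath.basename(zipinfo.filename)
            # guess_type returns (type, encoding); type is None if unknown
            content_type = mimetypes.guess_type(name)[0] or default_type
            # Only this one member is decompressed at a time
            yield name, content_type, zipped.read(zipinfo)
```

Each tuple could then be passed on as `anvil.BlobMedia(content_type=content_type, content=content, name=name)`, called as `iter_zip_members(media.get_bytes())`. Note that the guess depends only on the extension, so a misnamed file will get the wrong type.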


Very useful! Thanks for sharing!
