How to manipulate files and folders?

tommy1 · October 15, 2023, 2:31pm

I have an app that edits a file inside a directory containing many subfolders and files, and then needs to zip the entire directory.

What is the best way to do this?
At first I thought this needed to be done using Assets, but that doesn’t seem to work(?).
The best bet I found from reading some posts and docs is to use Data Files (Table), but how can I then zip the dir and all its subfolders and files?

Any pointers would be greatly appreciated.

david.wylie · October 15, 2023, 8:53pm

Your post is not very specific, so here’s some general info …

Your server functions have ephemeral access to the /tmp directory, whose contents may or may not survive between server calls. You should be able to create subdirectories there and zip them, so long as that all happens within one server call. A search of this forum will point you to various posts using zip on /tmp.

An alternative is to use the uplink mechanism with server functions running on your own servers. There you have unlimited access to run anything you require. Again, searching this forum and the docs will give you lots of info on that.

p.colbert · October 16, 2023, 1:11pm

Where is this directory? On which machine? The Client’s PC? An Anvil Server? Some other PC?

tommy1 · October 16, 2023, 2:27pm

Sorry for not being clear. All the data is located in Anvil, and will be processed by my server-function.
I’ve uploaded the directory to Assets, and I’ve also added it to Data files thereby creating a table consisting of 3 columns: path (text) showing the dir-structure, file (media) containing the files themselves, and file_version (text).
So the directory is now located in 2 places as I’m not sure where it would be best to store it for my purpose.
My server-function will grab one file in the directory, edit it, and put it back in the correct place in the directory. Then the entire directory will be zipped, renamed, and returned for download by the client.

@david, thanks, I’ll look more into /tmp.
Would this be the best location to store the directory as well?

p.colbert · October 16, 2023, 3:07pm

The contents of /tmp can be erased between server calls. Moreover, subsequent server calls might be executed on completely different hardware (for load-balancing). Treat /tmp not as storage but as a throwaway workspace, valid only during the execution of a single server call.

Conversely, several server calls may be executing at the same time, in different server instances, but happen to share the same /tmp folder. If you’re not careful to use unique file/folder names within /tmp, you can easily have these different executions step on each other’s data.
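One way to get a unique, throwaway folder per call is Python’s standard tempfile module. The following is only a minimal sketch in plain Python, not Anvil-specific code: run_in_private_workspace and the job callback are hypothetical names, and it assumes the system temp directory (usually /tmp) is writable.

```python
import tempfile
from pathlib import Path


def run_in_private_workspace(job):
    """Run job(workspace) inside a uniquely named temporary directory.

    `job` is a hypothetical callback that receives the workspace as a Path.
    The directory and everything in it is deleted when the call finishes,
    so nothing here should be treated as storage.
    """
    # TemporaryDirectory picks a random, unused name under the system
    # temp directory, so concurrent calls do not share a folder.
    with tempfile.TemporaryDirectory(prefix="job_") as workspace:
        return job(Path(workspace))
```

Anything that has to outlive the call (for example the finished zip) would need to be read back into memory or saved elsewhere before the function returns.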

tommy1 · October 16, 2023, 6:51pm

Conversely, several server calls may be executing at the same time, in different server instances, but happen to share the same /tmp folder. If you’re not careful to use unique file/folder names within /tmp, you can easily have these different executions step on each other’s data.

Thanks, I thought about this. I’ll make sure to implement unique identifiers during the manipulation-sequence if this turns out to be the best solution.

hugetim · October 16, 2023, 9:21pm

I’d remove them from Assets and use Data Files/Tables for the long-term storage. Beyond that, your question is:

how can I then zip the dir and all its subfolders and files?

These docs should help with the beginning and end of the process:

For the middle, creating a zip file with Python code on the server, that’s not so much an Anvil question as a more general Python question, but it looks like you will find some discussion on this forum if you search.
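For the zipping step itself, a minimal sketch using only the Python standard library might look like the following. It assumes the directory has already been written to disk (for example under /tmp); zip_directory is a hypothetical helper name, and getting the files out of the Data Files table and handing the result back to the client are not shown.

```python
import zipfile
from pathlib import Path


def zip_directory(src_dir, zip_path):
    """Zip every file under src_dir, keeping the subfolder structure.

    zip_path should live outside src_dir, otherwise the archive could
    end up containing itself. Empty folders are not stored.
    """
    src_dir = Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir so the archive unpacks
                # into the same layout rather than an absolute /tmp path.
                archive.write(path, arcname=path.relative_to(src_dir).as_posix())
    return zip_path
```

Since /tmp is only a workspace, the zip’s bytes would have to be read back before the server call ends, e.g. with Path(zip_path).read_bytes().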

tommy1 · November 5, 2023, 7:39pm

I managed to make this work using server.callable, but when switching to a background_task I constantly get:

An error occurred: 'NoneType' object has no attribute 'get'

Can I still use the tmp-dir when running it as a background_task?
Could this line be the problem?

zip_path = f"/tmp/{uuid.uuid4().hex}.zip"

david.wylie · November 5, 2023, 8:03pm

My guess is that the background task is its own process (or thread, I never really know which one), so if you write to /tmp in your main process and then fire off a background task, the /tmp data may no longer be there.

That’s all speculation, by the way. I’ve not tested any of that.

tommy1 · November 5, 2023, 8:04pm

All the zipping-logic is part of the background-task though.

david.wylie · November 5, 2023, 8:04pm

Ah, in which case what I said is probably nonsense, sorry.

hugetim · November 5, 2023, 8:27pm

Do you have any lines in the code with a ‘get’ call like some_dict.get('some_key')? That’s what I would check first.

The more general version of david.wylie’s point is just to keep in mind that the background task doesn’t know anything about the server call context beyond what you explicitly pass to it. In particular, anvil.users.get_user() returns None unless you explicitly force-login a user in the background task. That’s something that’s tripped me up before.
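To illustrate the kind of failure described here, the plain-Python sketch below reproduces the error message from the thread and shows one possible guard. The names reproduce_error and read_setting are hypothetical, and the "email" key is only an example; whether this is the actual cause in the original app is not confirmed.

```python
def reproduce_error():
    """Call .get on None, as happens when a lookup silently returned None."""
    user = None  # stands in for any call that returned None unexpectedly
    try:
        user.get("email")
    except AttributeError as err:
        return str(err)


def read_setting(source, key, default=None):
    """Fail early with a clearer message if the dict-like source is missing."""
    if source is None:
        raise ValueError(f"expected a dict-like object for {key!r}, got None")
    return source.get(key, default)
```

A guard like this does not fix the missing context, but it points at the object that was None instead of surfacing a generic AttributeError later on. Passing the required values to the background task as explicit arguments avoids relying on the server call context at all.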