Retain pandas in server

I have a background task that prepares data into a pandas dataframe. I need to re-use the dataframe on later instructions from the UI.

I know that I can’t send the dataframe to the client. What is the best way to handle this?

If you have a dedicated plan, you can enable “persistent server”, which will keep the server running most of the time, but not 100%.

You can search for it in the forum and see other discussions talking about it.

If you don’t have it, then you will need to reload it at every request, either from a file or datatable.

Or you could use a different server, with Uplink for example.

All of those are great ideas. One more suggestion would be to pickle the dataframe using pandas .to_pickle() and save it using the new data files service.

When you need to re-use it, use the data files service to unpickle the file directly back into a pandas dataframe. If you are using this frequently without changing it, the data files service will cache the file, making access quicker.
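To make the pickle round trip concrete, here is a minimal sketch using only pandas. The helper names (pickle_frame, unpickle_frame) and the temp-file path are assumptions for illustration; in the app, the path would be whatever location the data files service gives you, which is not shown here.

```python
import os
import tempfile

import pandas as pd


def pickle_frame(df, path):
    """Write the dataframe to disk in pandas' pickle format."""
    df.to_pickle(path)


def unpickle_frame(path):
    """Read a pickled dataframe back from disk."""
    return pd.read_pickle(path)


# Hypothetical path: a temp directory is used so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "prepared.pkl")

df = pd.DataFrame({"x": [1, 2, 3], "label": ["a", "b", "c"]})
pickle_frame(df, path)
restored = unpickle_frame(path)
```

Only unpickle files that your own code wrote, since loading a pickle can execute arbitrary code.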

Thanks for this idea. On investigation, it seems I can’t create a new file from data_file object. I can see methods to open and methods to edit. How can I create a file from within the server?

It’s not supported yet, but I created a workaround. It’s in the modules found in the clone link in this post below:
(This may or may not be an advisable practice, so YMMV.)

Creating new files in data files from code all the time is probably not the best practice; its primary purpose is for large files that are used many times and do not change frequently.

You could simply read from / write to a text column. Use a table with two columns: one with the id (i.e. the file name) and one with the file content.
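A rough sketch of the two-column idea, assuming the dataframe is serialized to JSON text. An in-memory SQLite table stands in for the app's table here, and the names stored_frames, save_frame and load_frame are hypothetical; the same pattern should carry over to any table with an id column and a text column.

```python
import io
import sqlite3

import pandas as pd

# Assumption: in-memory SQLite stands in for the app's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stored_frames (id TEXT PRIMARY KEY, content TEXT)")


def save_frame(frame_id, df):
    """Serialize the dataframe to JSON text and store it under frame_id."""
    text = df.to_json(orient="split")
    conn.execute(
        "INSERT OR REPLACE INTO stored_frames (id, content) VALUES (?, ?)",
        (frame_id, text),
    )
    conn.commit()


def load_frame(frame_id):
    """Return the stored dataframe, or None if the id is unknown."""
    row = conn.execute(
        "SELECT content FROM stored_frames WHERE id = ?", (frame_id,)
    ).fetchone()
    if row is None:
        return None
    # Wrap the text in StringIO so read_json treats it as file-like input.
    return pd.read_json(io.StringIO(row[0]), orient="split")


df = pd.DataFrame({"x": [1, 2, 3], "score": [0.5, 1.25, 2.0]})
save_frame("task-1", df)
restored = load_frame("task-1")
```

JSON text keeps the column readable, but it may not preserve every dtype exactly, so check the round trip against your own data.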

That’s essentially what I did in the end. I stored the task_id, task_date, result (dataframe) in a table. Then at the right point in the UI, when I no longer needed the results, I deleted them from the table.

In some apps I don’t even bother deleting these rows from the table. Instead I have a nightly scheduled task that deletes rows with a timestamp older than x days.
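The cleanup job could look something like the sketch below. SQLite again stands in for the app's table, timestamps are stored as Unix epoch seconds, and the table layout and the name delete_old_rows are assumptions; how the function gets scheduled nightly is left out.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def delete_old_rows(conn, max_age_days, now=None):
    """Delete rows whose task_date is older than max_age_days; return the count."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=max_age_days)).timestamp()
    cur = conn.execute("DELETE FROM results WHERE task_date < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


# Assumption: task_date is stored as Unix epoch seconds.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (task_id TEXT PRIMARY KEY, task_date REAL, result TEXT)"
)

now = datetime(2024, 1, 10, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [
        ("old", (now - timedelta(days=9)).timestamp(), "{}"),
        ("new", (now - timedelta(days=1)).timestamp(), "{}"),
    ],
)

deleted = delete_old_rows(conn, max_age_days=7, now=now)
remaining = [r[0] for r in conn.execute("SELECT task_id FROM results")]
```

Passing now explicitly makes the function easy to test; the scheduled task can simply call it without that argument.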
