Can I use Anvil for an app that processes large amounts of data?

Hi everyone, I’m trying to understand if I can use Anvil to develop my personal project. It’s about building an app that processes large amounts of data (we’re talking .csv files in the GB range, with millions of rows), without having to spend a fortune on a custom plan. However, I have seen that there are limits on the size and number of rows of Data Tables. Is there any solution that lets me stay on the basic plan (Personal Plan)? If so, what solutions could I adopt? For example, could I carry out the processing in Jupyter notebooks on my personal PC, or should I find a hosting site where I can host the files and scripts and then connect them to the Anvil app? What is the best way?

Thanks in advance to anyone who can help me!

Both of those options are available via Uplink.

However, “best” always depends on the circumstances: goals plus constraints. Usually, we don’t know enough about your constraints to make that decision for you.

Hi @Finoz,

Yeah, if you’re shipping gigabytes of data around in your Server Modules, a Personal Plan isn’t going to cut it.

Your alternative proposal (writing code with the Uplink and hosting the crunchy compute outside Anvil) is a reasonable one! You probably won’t want to ship too many gigs of data through anvil.server.call(). If you’re summarising a large dataset for clients, that won’t be a problem, but if you’re doing upload/download you might want to look at going directly into/out of S3 or something like that. boto3 is fairly straightforward to use - the only really tricky bit is uploading from untrusted clients to S3, and we’ve got example code for that in our uppy example.

Of course this approach requires more work and attention from you than having us handle it. There’s more dev complexity to maintain (keeping your Uplink code in sync with your Anvil app, possibly managing data in S3, etc), and a bit of system administration effort (you’ll need to make that machine available, keep your code running, patch/update it, etc). But if you want to make the “more of my time, less of my money” tradeoff, that’s the way to go!
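To make the Uplink idea concrete, here is a minimal sketch of a script that could run on your own machine. It streams the CSV from local disk and sends only a small summary back to the app. The names DATA_PATH, summarise_csv, get_summary and serve, and the column names in the usage comment, are made up for illustration, and the Uplink key is a placeholder; treat this as one possible shape, not the official way to do it.

```python
import csv
from collections import defaultdict

# Hypothetical location of the big file on the Uplink machine.
DATA_PATH = "data.csv"


def summarise_csv(path, group_col, value_col):
    """Stream a CSV row by row and return {group: {"count": n, "total": x}}."""
    totals = defaultdict(lambda: {"count": 0, "total": 0.0})
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            bucket = totals[row[group_col]]
            bucket["count"] += 1
            bucket["total"] += float(row[value_col])
    # A plain dict of small dicts is cheap to send back to the app.
    return dict(totals)


def serve(uplink_key):
    """Connect to Anvil, expose get_summary, and wait for calls."""
    # Imported here so the summary logic above can be tested
    # without the anvil-uplink package installed.
    import anvil.server

    anvil.server.connect(uplink_key)

    @anvil.server.callable
    def get_summary(group_col, value_col):
        # The file path stays on this machine; callers only pick columns.
        return summarise_csv(DATA_PATH, group_col, value_col)

    anvil.server.wait_forever()


# On the machine holding the data (placeholder key):
#   serve("YOUR-UPLINK-KEY")
# In the Anvil app (hypothetical column names):
#   summary = anvil.server.call("get_summary", "region", "amount")
```

Only the summary dictionary crosses anvil.server.call(), so the gigabytes never leave the machine running the script.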

Thanks to both of you for the replies. I have one last question: regarding Anvil Data Files, is there a limit on the number of rows or the file size, like there is for Data Tables? Or could I upload the files and then process them with pandas? I should point out that the data is quite static: it would be updated roughly once a month.
I apologize if the question seems stupid, but I’m new to Anvil.
Thank you

No worries! Anvil’s Data Files have the same limits as Data Tables (in fact, they use Data Tables under the hood!). So your best bet is to put the files wherever your Uplink code is, and read them straight off the disk there.
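Since you mentioned pandas, here is a rough sketch of reading a large CSV off the local disk in chunks, so the whole file never has to fit in memory at once. The function name grouped_totals and its parameters are hypothetical; the only pandas feature it relies on is the chunksize option of read_csv.

```python
import pandas as pd


def grouped_totals(path, group_col, value_col, chunksize=100_000):
    """Sum value_col per group_col, reading the CSV in chunks."""
    partials = []
    # With chunksize set, read_csv yields DataFrames of at most that many rows.
    for chunk in pd.read_csv(path, usecols=[group_col, value_col],
                             chunksize=chunksize):
        partials.append(chunk.groupby(group_col)[value_col].sum())
    if not partials:
        return {}
    # Combine the per-chunk sums into one total per group.
    combined = pd.concat(partials).groupby(level=0).sum()
    # Convert to plain Python types so the result is easy to return to the app.
    return {str(k): float(v) for k, v in combined.items()}
```

Because the data only changes about once a month, it may be worth computing results like this once after each update and reusing them, rather than rescanning the file on every call.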