I need to store a few million rows of postcode reference data simple key:value format
e.g. Postcode(String):Distance(Integer)
I hit my row limit on Anvil data tables, so went to look at storing in an external mongo database, but can’t find any precedence of people doing this on the forum…
So - how would you do this? Is the only option an external Postgres database?
I’m a big fan of sql databases so I nearly always recommend them for things like this.
But if you’re using key:value, have you considered something like Redis? I guess it depends on how you want to retrieve the data.
Just FYI, one of my clients inserts up to 1 million rows per hour into a 3 node MySQL cluster (well, Galera so MariaDB techincally) but I extract several hundred thousand bits of commonly accessed data into redis for fast lookups (basically a cache). As long as your lookups are simple it works pretty well.
If the data is “small” (meaning key value pairs are just postcodes and itegers as you describe, not long text fields), you can save them as a JSON.txt file and have them loaded when the form opens. I have used this in an Anvil application- if I have time I might try and put together a “sanitized” version open for distribution.
As you may already know: You can use any Python database API in a Server Module. For Mongo, you would use pymongo.
pymongo is not currently installed in the Server Module environment. We like to install whatever anybody needs, so if you do want to try using pymongo, just ask and we’ll install it.
For Postgres, we have psycopg2 installed. Again, just ask if you need something different.
In terms of hosting, AWS Relational Database Service provides a free tier, it may be worth checking if that’s sufficient for your needs (or perhaps a paid RDS instance is within your budget).