I am very new to Anvil and created my first app which is a Name Generator.
On a local Python Notebook on my computer I uploaded different csv-files with names. After cleaning the data eventually there is a DataFrame with over 700’000 names. With a function the user gets a selection of names. With Anvil uplink I connected the Python notebook to the anvil app and it works fine as long as my notebook on my computer is running. However, since I want to share this app with others I need it to work permanently.
How could I have my app running all the time so it does not depend on my local Python notebook? Would I need to create a table in Anvil with a paid plan or is there another option?
This would not work if you need to sort or search the names, but if you need random names an ugly trick could be to create a table with one simple object column, then add 1000 rows, each with a list of 700 names.
…just to be clear, grouping them by first letter is actually not a good way to get a randomized sample, because it will over-sample the names with the smallest first letter frequency, instead of each name having a 1 in 700000 chance to show up.
@stefano.menci 's answer is the real one if you just wanted a low number of rows and just want random names. But if you are doing this, I would have pandas create the 1000 (non overlapping) random samples to be put into the rows, and not just put them in sorted in some ordinal way.
I have a little typing tutor app with a list of ~54000 words. I just store and load that list in the code of a server module - it stays in memory, and I access it with server calls. Is there anything keeping you from uploading your file and keeping the dataframe in memory on the server (you have to run pandas on the server anyway)?
My only concern, would be that this could put some undue strain on memory and CPU use in a shared account.
Using 100% of the 50 k limit is a really bad idea. You will not be able to use data tables for anything else on the free plan across any other apps that you create on your account in totality which is probably really bad for developing anything, including even having a secure users service. (It uses a data table)
This code is asking that you have the server digest the entirety of a 50,000 row data table and turn it into a data frame (I’m not sure where you are doing this actually, but if it’s in the global scope and not a function that’s also not great).
The way this is coded highlights all of the weaknesses of a data table, and uses none of its strengths.
To use the data table strengths you should be using the python random module to pick an index number from 0 to the length of your search iterator, then pull a sample name from that iterators index.
Like:
from random import randint
# Yes, You can refactor this for brevity, I wrote it with clarity in mind.
def get_random_name_from_rows():
rows = app_tables.names_table.search()
random_index_int = randint(0, len(rows) - 1)
return rows[random_index_int]['name_of_name_column']
@nickantonaccio 's method is probably the best way to go if you are always going to use these names in your app.
If you don’t like the clutter of a 50k word list in your server module code, create a second server module containing just this list of names and import it into your main server module code like explained here:
The loading and unloading through the iteration of 50k data table rows every time the function is called is most likely what is timing out, for such a (relatively) small amount of memory to load.
If all you need is one random name, then you could add one name_id column with a sequential number, then pick a random row, so you only read one row:
name = app_tables.names_table.get(name_id=random.randint(0, 50000))
Or, if you have one table with 1000 rows, each with 700(-ish) names:
name_list = app_tables.names_table.get(name_id=random.randint(0, 1000))
name = name_list[random.randint(0, len(name_list)]
EDIT
I just realized that his technique is not really random if the number of names on each row is different.
For example, if you have one row with 700 names and one row with 350 names, each name on the former row has half the chances of being picked than each of the names on the second row.
The solution is simple: just make sure each row has the same number of names (350 vs 700 generates a noticeable bias, but 699 vs 700 is virtually the same number).
Now there are only 179 rows. As you can see there are lists of names in the column Name.
My goal is now to build an app that returns random names based on the parameters gender, length, amount of names and the starting letter. These parameters sould be chosen by the user with a drop down:
How should I write my code in the Server Module to access the Data Table so that I achieve this most efficiently? I tried creating a Pandas DataFrame and return a sample of it but this attempt failed.