In case anyone else has tried to do this (or it comes up for someone later), I thought I'd share. I'm building a mass snake importer for my platform so people can bring their data over from the competition.
The data was a mess and wasn't even formatted correctly, and it was full of NaN values I didn't want to deal with. After testing a few approaches, I found the following works well and is pretty fast:
import pandas as pd

def to_dict_dropna(data):
    # Convert the DataFrame to a list of record dicts, dropping any NaN/None values
    return [{k: v for k, v in m.items() if pd.notnull(v)} for m in data.to_dict(orient='records')]

# The file has an .xls extension but is really just tab-delimited text
reptile_df = pd.read_csv('/tmp/reptiscan_import/reptiles.xls', delimiter='\t')
reptiles_data = to_dict_dropna(reptile_df)
That gives me nice, clean data out of an old Windows-era tab-delimited "Excel" file, lol.
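To show what the helper actually does, here's a minimal sketch with made-up sample data (the columns and values are hypothetical, not from the real import):

import pandas as pd
import numpy as np

def to_dict_dropna(data):
    # Drop any key whose value is NaN/None from each record dict
    return [{k: v for k, v in m.items() if pd.notnull(v)} for m in data.to_dict(orient='records')]

df = pd.DataFrame({'name': ['boa', 'ball python'], 'weight': [1.2, np.nan]})
records = to_dict_dropna(df)
# records == [{'name': 'boa', 'weight': 1.2}, {'name': 'ball python'}]

Note the second record simply has no 'weight' key at all, rather than carrying a NaN, which is exactly what you want when inserting rows into a database.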
Are you deleting the tmp dir each time, or how do you think that should work? My concern is that if I have a bunch of people importing data at once, I don't want them all sharing the same folders. The data typically arrives as a zip with the same file names every time (reptiles.xls, etc.).
So I'm thinking of either cleaning up after each import finishes, or using a UUID (or something similar) to give each import its own folder inside /tmp, if that makes sense.
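The UUID idea could be sketched like this. Everything here is an assumption on my part (the base path and helper name are hypothetical), but it shows one way each import job gets its own collision-free folder and gets cleaned up afterwards:

import shutil
import uuid
from pathlib import Path

def make_import_dir(base='/tmp/reptiscan_import'):
    # Hypothetical helper: give each import job its own uuid-named folder
    # so simultaneous users extracting files like reptiles.xls don't collide.
    job_dir = Path(base) / uuid.uuid4().hex
    job_dir.mkdir(parents=True, exist_ok=False)
    return job_dir

job_dir = make_import_dir()
try:
    # ... extract the uploaded zip and run the import against job_dir ...
    pass
finally:
    shutil.rmtree(job_dir)  # clean up when the import finishes (or fails)

The try/finally makes sure the folder disappears even when an import blows up halfway through.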
with anvil.media.TempFile(media_object) as file_name:
    # Now there is a file in the filesystem containing the contents of media_object.
    # The file_name variable is a string of its full path.
Then file_name will be some random string like /tmp/4x1fqtt89ktff0gjtwqwimktbkldmtsr, so concurrent imports never see each other's files.
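TempFile gives you one file, but since the uploads are zips containing identically named members, you may want a whole unique directory instead. A sketch using only the standard library (the function name, prefix, and return value are my own assumptions, not Anvil API):

import tempfile
import zipfile
from pathlib import Path

def import_zip(zip_path):
    # Extract into a unique, auto-deleted directory so concurrent imports
    # whose zips contain identical member names (e.g. reptiles.xls) never collide.
    with tempfile.TemporaryDirectory(prefix='reptiscan_') as tmp_dir:
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(tmp_dir)
        # ... run the actual import against the extracted files here ...
        return sorted(p.name for p in Path(tmp_dir).iterdir())
    # tmp_dir and everything in it is deleted as soon as the with-block exits

TemporaryDirectory behaves like TempFile: it picks a random path under /tmp and removes it (contents included) when the context manager exits, so there's nothing to clean up manually.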