Data tables stripping leading zeros

jason.perez · December 4, 2022, 2:00am

As part of my app I have a file uploader that uploads an excel file and then puts the contents of that file into a data table. The problem is that it is dropping the leading zeros of of a particular column called (SP_ID).
When creating the columns, I set the data type for SP_ID to text but Anvil seems to want to automatically convert the data for that column to an integer.
When I set the data type to string while reading the file, I get null values for all columns.

Code Sample:

@anvil.server.callable
def store_data(file):
  with anvil.media.TempFile(file) as file_name:
    if file.content_type == 'text/csv':
      df = pd.read_csv(file_name, dtype={'SP_ID': str})
      df['SP_ID'] = df['SP_ID'].apply(lambda x:x.zfill(10))
  
    else:
       df = pd.read_excel(file_name, dtype={'SP_ID': str})
       df['SP_ID'] = df['SP_ID'].apply(lambda x:x.zfill(10))
    for d in df.to_dict(orient="records"):
      # d is now a dict of {columnname -> value} for this row
      # We use Python's **kwargs syntax to pass the whole dict as
      # keyword arguments
      app_tables.csv_input.add_row(**d) ```

Clone link:
share a copy of your app

jshaffstall · December 4, 2022, 5:17am

If you print out your d dictionary before adding the row, you should see that the leading zeroes have already been stripped. A text field in a data table will definitely not strip anything. You can verify that yourself by adding a row without going through the CSV import and making sure that there are leading zeroes in the text field.

The stripping has to be happening during the CSV import. This seems to be a known issue with Pandas, that it treats the field as an integer. This post suggests that the syntax for your dtype argument isn’t correct: Keeping leading zeroes when using read_csv in Pandas

jason.perez · December 4, 2022, 11:41pm

Correct, the stripping is definitely happening during the file import, which seems to happen whether using pandas in Anvil or Jupyter Lab/Notebook. However, in Jupyter I can use zfill to add the zeros back in, but in Anvil when I use the exact same code, I get null values for all columns

jshaffstall · December 4, 2022, 11:55pm

According to the post I linked, you shouldn’t need the zfill, just fix the dtype argument. I don’t work with Pandas, though.

jason.perez · December 5, 2022, 12:03am

Yeah, for me it it’s stripping the zeroes even when setting the dtype to “string”. However, I figured out how to make it work with using zfill. I just had to delete the table and add it back in and then run the program. Now it’s working. Thanks for the help.