Ok, my solution doesn't work. I'm running out of memory while performing this loop; it happens after 20,000-22,000 rows.
I split my list of dicts into chunks of 2,000.
I do some processing and create a new dict to pass the selected data to add_rows (adding a unique ID from a different function, converting the string date into a date).
My list of dicts holds even more data that I will later add to a different database and link to the rows created with the code below, so I already fail in this first phase.
```
import math
from datetime import datetime

# --- start of the import preparation ---
id_result_index = 0  # index to track the current result ID

# Determine the number of chunks needed
chunk_size = 2000
num_chunks = math.ceil(len(list_of_dicts) / chunk_size)

# Process the list in chunks
for i in range(num_chunks):
    start_index = i * chunk_size
    end_index = min(start_index + chunk_size, len(list_of_dicts))
    chunk = list_of_dicts[start_index:end_index]

    # Final changes per chunk: add the result index, link the application
    # to the databank table, and convert the string date into a date
    final_result_list = []  # the rows to import for this chunk
    for row in chunk:  # iterate over the list of dictionaries
        application_value = row.get('application')
        if application_value is not None:
            # One database query per row -- this is the part that ran out of memory
            found_row = app_tables.assay_databank.get(application_code=application_value)
            row['result_id'] = result_ids[id_result_index]  # assign the generated ID
            id_result_index += 1  # move to the next ID
            if found_row:
                row['application'] = found_row  # overwrite the value with the row for linking
            else:
                some_failed.append(f"- application not found {application_value}")
                row['application'] = None  # clear the missing application to avoid a data mismatch

        # Convert the string date into a date object
        measurement_date_str = row.get('measurement_date')
        if measurement_date_str is not None:
            row['measurement_date'] = datetime.strptime(measurement_date_str, '%Y-%m-%d').date()

        # Build the final result row: unique sample ID, result ID, user and datetime
        if application_value is not None:
            new_item = {
                'sample_id': row.get('sample_id'),
                'result_id': row.get('result_id'),
                'application': row.get('application'),  # already a row of the databank table
                'concentration': row.get('concentration'),
                'unit': row.get('unit'),
                'device': row.get('device'),
                'measurement_date': row.get('measurement_date'),  # date object
                'updated_by': user,
                'updated_date': datetime.now()
            }
            final_result_list.append(new_item)

    # Import the results for the current chunk
    app_tables.results_database.add_rows(final_result_list)
```
Ok, I rewrote the whole code just to improve the performance. What was causing the out-of-memory error was the single database query for each row.

My solution was:
- using slicing to create the chunks
- removing all the per-row database queries in favour of one single query, built as a dict of keys and values, where each key is a column name in the database and each value is the searched unique ID generated in code: `app_tables.database.search(**dict_search)` (a minimal sketch of this follows below)
- then, in the final preparation of the list of rows for the import (`final_sample_list`), assigning the link to the other data table based on the new list of rows from that single query
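Here is a minimal sketch of that single query, assuming Anvil's `anvil.tables.query` helpers; `results_database`, `sample_id` and the barcode values just mirror the snippet below, so swap in your own table and column names:

```
import anvil.tables.query as q
from anvil.tables import app_tables

# One query for the whole import instead of one query per row.
# The unique IDs were generated in code, so collect them all first...
searched_ids = [row['barcode'] for row in list_of_dicts]
# ...then search with a dict of column name -> query, as described above
dict_search = {'sample_id': q.any_of(*searched_ids)}
found_rows = list(app_tables.results_database.search(**dict_search))
```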
```
for row in chunk:  # assign the results to the sample
    # Find the matching rows from the single query based on row['barcode']
    matching_rows = [found_row for found_row in found_rows
                     if found_row['sample_id'] == row['barcode']]
    # Build the sample row, linking the matching result rows
    new_item = {
        'barcode': row['barcode'],
        'study_id': row['study_id'],
        'name': row['name'],
        'status': row['status'],
        'matrix': row['matrix'],
        'ext_barcode': row['ext_barcode'],
        'volume': row['volume'],
        'comment': row['comment'],
        'order_number': row['order_number'],
        'location_storage': row['location_storage'],
        'box_bag': row['box_bag'],
        'position': str(row['position']),
        'results': matching_rows,
        'created_by': user,
        'created_date': datetime.now(),
        'updated_by': user,
        'updated_date': datetime.now()
    }
    quick_link_samples.append(row['barcode'])  # quick links for the quick search in samples
    final_sample_list.append(new_item)

# Import the samples for the current chunk
app_tables.sample_database.add_rows(final_sample_list)
```
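A possible further refinement (just a sketch on top of the code above, not something I have benchmarked): the list comprehension rescans `found_rows` once per sample, so grouping the queried rows by `sample_id` up front turns each lookup into a single dict access. `index_by_sample` is just an illustrative helper name:

```
from collections import defaultdict

def index_by_sample(found_rows):
    """Group the queried result rows by sample_id for O(1) lookup per barcode."""
    results_by_sample = defaultdict(list)
    for found_row in found_rows:
        results_by_sample[found_row['sample_id']].append(found_row)
    return results_by_sample

# inside the chunk loop, replacing the list comprehension:
# matching_rows = results_by_sample.get(row['barcode'], [])
```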