Error with in_transaction decorator

hello,

I have a table with a few thousand rows: each row is more or less a key/value pair.
In production, the key/value pairs will need to be updated every quarter.
I have a method that updates a chunk of rows (800 each time it is called), and multiple calls may happen in parallel, hence I used the in_transaction decorator. However, the method randomly returns an error about a conflicting transaction (even when the method is not called in parallel). I removed the decorator and it seems to work, but I think it might fail in production if multiple users call it at the same time.
The method that updates the data is a simple for-loop (see below). Is there a better way than a for-loop for updating multiple rows? Is there a reason why I can't use in_transaction in this case? Would it be better to delete all the rows before starting the update process, so that in my method I could simply do table.add instead of checking if the row exists? Would that be more efficient?

Thank you in advance for your help

for item in data:
    key = int(item["_key"])
    value = int(item["_value"])
    row = table.get(key=key)
    if row is None:
        table.add_row(key=key, value=value)
    else:
        row["value"] = value  # update the existing row's value

This is the code I use in a similar case:

  n_updated = n_deleted = 0
  for row in app_tables.parameters.search():
    name = row['name']
    if name in parameters:
      if row['value'] != parameters[name]:
        row['value'] = parameters[name]
        n_updated += 1
      del parameters[name]
    else:
      row.delete()
      n_deleted += 1
  
  n_added = len(parameters)
  for name, value in parameters.items():
    app_tables.parameters.add_row(name=name, value=value, n_uses={}, tot_n_uses=0)
    
  return f'OK:\nUpdated {n_updated} existing parameters\nDeleted {n_deleted} old parameters\nCreated {n_added} new parameters'

It’s surprisingly fast, perhaps because usually only a few rows are updated and this code doesn’t waste time updating rows that don’t change.

I use the in_transaction decorator only in very simple cases.

If I remember correctly, I had an app that would never finish the update when something unrelated to the transaction wasn’t working.
So now I use a hand-made try/except block and manage the transaction errors myself.
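As a minimal sketch of that hand-made approach (TransactionConflict here is a stand-in class for the real anvil.tables.TransactionConflict exception, and run_with_retries is an invented helper name; in an Anvil app you would catch the real exception and wrap the work in anvil.tables.Transaction()):

```python
class TransactionConflict(Exception):
    """Stand-in for anvil.tables.TransactionConflict (assumption)."""

def run_with_retries(fn, max_attempts=5):
    """Call fn, retrying when a transaction conflict occurs.

    Unlike @tables.in_transaction, the retry count and the error
    handling stay under our control; after the last attempt the
    conflict is re-raised so the caller can report the failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransactionConflict:
            if attempt == max_attempts:
                raise
```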

EDIT

I just checked, and this is one of those simple cases where I have used @tables.in_transaction 🙂


Hi @thisisakasolace - welcome to the forum!

I have a request - can you provide a bit more detail for us? I have some ideas about what’s going wrong here, but you could really help us out by providing a bit more information. For example:

  • When you say “randomly returns an error about a conflicting transaction” - what’s the error message? Can you provide a copy-and-paste from the Output window?

  • When you provide a code sample, can you provide enough context to show how you’re using the decorator? When you’re asking about the interaction between your code and a decorator, providing just a snippet from the inside of the function might miss something important!

  • Is it possible for you to provide a Clone link? (It’s OK if not - I understand app source code is often confidential - but if you can it really helps!)

You can find some more suggestions about asking a good question on the forum here and here.


In this particular case, my guess (pending enough information to investigate further) is that your particular access pattern is tripping a lot of transaction conflicts in our Postgres backend. In particular, the access pattern from that repeated get() probably makes Postgres more likely to think you’re conflicting with transactions run by other users. @anvil.tables.in_transaction attempts to retry, but after enough conflicts it will give up. @stefano.menci’s approach does a single scan, which is probably why it works better in a transaction.

Hi @meredydd

When you say “randomly returns an error about a conflicting transaction” - what’s the error message? Can you provide a copy-and-paste from the Output window?

The following error was thrown at different stages (i.e. usually between the 50th and 100th row update): Another transaction has changed this data; aborting

  • When you provide a code sample, can you provide enough context to show how you’re using the decorator? When you’re asking about the interaction between your code and a decorator, providing just a snippet from the inside of the function might miss something important!
  • Is it possible for you to provide a Clone link? (It’s OK if not - I understand app source code is often confidential - but if you can it really helps!)

https://anvil.works/build#clone:UNF6X7OJB7PFLVEQ=YV743TQISY6WOUNHHW4ROD3Z

The issue is in the server code, in the tablewrite/storeBlock() method.
In order to test it, you would need a sample request. Should I share a Postman request sample with you?

Thank you for looking into it !

Thank you @stefano.menci. In my case, all the rows are being updated and @tables.in_transaction throws errors …

@meredydd

I could remove the get() part from the loop completely and only push new rows instead of updating them. However, the writes would still be pushed one after the other; is there a way to write to the database only once (i.e. outside of the loop)?
Thank you

One way to write only once is to write only a single value, i.e., a single Simple Object – a dict – instead of multiple rows.

thank you @p.colbert I think this is what I am looking for !

Edit: I actually misread; I thought you said it was possible to update multiple rows at once by pushing a dictionary …

When you described the table as containing key-value pairs, that really does fit the definition of a Python dict. And a dict can (usually) be stored as a single value, in a single row, in a column of type “Simple Object”. (See https://anvil.works/docs/data-tables#column-types)

Currently, you’re storing each pair in its own table row. But you’re also updating the entire table as a unit, that is, as if it were a single dict.

Of course, if they’re stored as a Simple Object value, then they would also be retrieved as a unit.
This may or may not fit with the way the updated values are ultimately used. So it may or may not be appropriate for your case.
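As a sketch of the idea (plain dicts stand in for the data table and the Simple Object column; the "key"/"value" names mirror the thread's example but are assumptions about the actual schema), packing the rows into one dict and unpacking them again could look like:

```python
def rows_to_blob(rows):
    """Condense key/value rows into one dict, which could then be
    stored as a single value in a Simple Object column."""
    return {row["key"]: row["value"] for row in rows}

def blob_to_rows(blob):
    """Expand the stored dict back into key/value rows."""
    return [{"key": k, "value": v} for k, v in blob.items()]
```

With this layout, the whole quarterly update becomes a single write of one value instead of thousands of per-row operations.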

@p.colbert What I actually have is the following:
A table per country.
For each country, a couple of thousand rows.
Each row has an id and two values, let’s say v1 and v2.

Four times a year, the tables are updated in two steps:

Step 1) id and v1 are pushed per block of 800 key/value pairs (no more, because of computation requirements on the client side).

Step 2) Once all ids and v1 values have been populated for a table, a background task is launched that calculates v2 for each id and updates the table.

Once step 2 is finished, the tables will be used for read requests.

The thing is that step 1 might be done in parallel by different clients, for both:
- different countries (should not be an issue because I have one table per country)
- different chunks of data (i.e. id/v1) for a given table => this is why I thought using the @in_transaction decorator would make sense

Maybe the design/logic of the app can be changed and improved … this is beta development!
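Step 1's blocks could be produced with a small chunking helper on the client side; the 800-pair limit comes from the post above, everything else here is an assumption:

```python
def chunked(items, size=800):
    """Yield consecutive blocks of at most `size` items, matching the
    800-pair blocks pushed in step 1."""
    for i in range(0, len(items), size):
        yield items[i:i + size]
```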

Is it really necessary to disassemble a block into individual rows? (It might be; it depends on how values are looked up later.)

I just shrunk a table with 3000 rows to 6 rows by condensing tons of stuff into a Simple Object column. One of the values contains a 150 kb JSON object. That JSON object is a list of dictionaries, which in another world would have been a linked table with hundreds of rows. I almost always needed to scan the whole table, so I decided to put it in one single value. It is much faster to read and write 150 kb from/to one value than reading/writing the same 150 kb by scanning hundreds of rows.

I use columns for the values I need to sort and filter by, then I put everything else inside a list of dictionaries in one (sometimes more) Simple Object column. It’s a hybrid approach, half SQL and half MongoDB.
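For illustration only (every field name here is invented), a row in that hybrid layout might look like:

```python
# Real columns hold the values you sort and filter by; everything
# else is packed into one dict that would live in a Simple Object
# column -- the "MongoDB half" of the hybrid.
row = {
    "name": "job-42",          # real column: searchable/sortable
    "created": "2024-01-15",   # real column: searchable/sortable
    "details": {               # Simple Object column
        "parameters": [
            {"key": "speed", "value": 3},
            {"key": "mode", "value": "auto"},
        ],
        "notes": "anything that never needs indexing",
    },
}
```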

I also try to stay away from linked tables and, again, put as much info as I can inside Simple Object columns.

Apps that would have used 20-30 tables are down to 3-4 tables, and each table has a fraction of the rows.

Obviously I can’t always do that, and sometimes I even use SQL (available with dedicated plans) and play with complex queries. But for most apps I use and abuse Simple Object columns and my life is much easier.

Following @meredydd’s and @stefano.menci’s suggestions, I removed the get() from my for-loop and simply push the new rows instead of updating them.
With that simple change, the @in_transaction decorator is not causing errors anymore.
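In plain Python terms (a list stands in for the data table; the real server code would call table.add_row inside the transaction, and store_block is an assumed name based on the tablewrite/storeBlock() mentioned earlier), the change amounts to dropping the per-row get():

```python
table = []  # stand-in for the Anvil data table

def store_block(data):
    """Add every pair as a new row. Assumes the old rows were cleared
    beforehand, so no get()/exists check is needed per row."""
    for item in data:
        table.append({"key": int(item["_key"]), "value": int(item["_value"])})
```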

Thank you both for your help
