So I added more print statements, but I still don’t know the cause of this issue. Is there no way to get more detail, like a stack trace, for “Server code exited unexpectedly”?
Try with smaller files. Perhaps it’s being killed because it’s using too much memory. No, I have no clue how much is too much, but testing with smaller files may help you understand what’s going on.
So I changed the code to batch inserts instead and flush the SQL session in between to prevent memory build-up. It lasts much longer but still gets killed. There are 7 tables being produced (inserted into), and it got through the first (and biggest) one but then died on the second (about 40 minutes in).
Is there a limit to the length of a background task? I thought there wasn’t for personal plans. Is there a way to monitor memory usage?
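In the meantime, one rough workaround I might try is logging the process’s own memory usage from inside the task. A minimal sketch using the standard-library resource module (note that ru_maxrss is reported in kilobytes on Linux but bytes on macOS):

import resource

def log_memory(label: str):
    # Peak resident set size of this process so far (kilobytes on Linux, bytes on macOS)
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print(f"{label}: peak RSS so far ~ {peak}")

Calling log_memory() just before and after each flush would at least show whether memory keeps climbing between batches.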
So I added even more logs (too many, frankly) and suddenly it runs fine, but because of the amount of logging it is now a very slow script.
I reduced the logging and it started crashing again. I’ve had messages before saying it may be due to memory, but now that I’ve chunked what’s written to the db, that message is gone and instead I randomly get the message in the title, sometimes after a few seconds, sometimes after 30 minutes.
Is there no stack-trace equivalent I can use to investigate further? The error code is really quite unhelpful.
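One thing I might try in the meantime is wrapping the task body so that any ordinary Python exception at least prints its own traceback before the task dies. Just a sketch (run_import is a stand-in for my task’s entry point, and this obviously won’t help if the process is being killed from outside rather than raising):

import traceback

try:
    run_import()  # stand-in for the actual background task body
except BaseException:
    # Print the full traceback ourselves in case the platform only reports a generic error
    print(traceback.format_exc())
    raise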
OK - upon searching the forums, it appears this error code is an Anvil-specific error.
def xml_to_sql(root: ET.Element, model, element_name: str, data_elements: list, session: SqlAlchemySession):
    transaction = session.begin_nested()
    batch_size = 1000
    print("transactions started")
    try:
        session.query(model).delete()  # Clears out the table
        print("table cleared")
        column_types = {column.name: column.type for column in model.__table__.columns}
        probability_threshold = 0.001
        instances_added = 0
        element_count = 0
        for element in root.findall(f".//{element_name}"):
            data = {}
            element_count += 1
            if random.random() < probability_threshold:
                print(f"data element {element_count} - {element_name}")
            for data_element in data_elements:
                text_value = element.findtext(data_element, default=None)  # Extract text value for the data element
                # Convert the string to the appropriate type based on the model's column
                if data_element in column_types:
                    column_type = column_types[data_element]
                    if isinstance(column_type, Integer) and text_value is not None and text_value.isdigit():
                        data[data_element] = int(text_value)
                    elif isinstance(column_type, String):
                        data[data_element] = text_value
                    elif isinstance(column_type, Date) and text_value is not None:
                        try:
                            # Assuming the date format in your XML is 'YYYY-MM-DD'
                            data[data_element] = datetime.strptime(text_value, '%Y-%m-%d').date()
                        except ValueError:
                            # In case the date format is incorrect or the field is empty
                            print(f"Date conversion error for value '{text_value}'")
                            data[data_element] = None
                    else:
                        # For any other type that doesn't require special handling
                        data[data_element] = text_value
                else:
                    # If the element is not found in column types, use the default value
                    data[data_element] = text_value
            instance = model(**data)
            session.add(instance)
            instances_added += 1
            batch_size += 1
            if instances_added % batch_size == 0:
                session.flush()  # Flush the batch to the database but do not commit yet
                print(f"Flushed a batch of {batch_size} records.")
        # Commit the transaction only if all batches are processed successfully
        transaction.commit()
        print("All batches committed successfully.")
    except SQLAlchemyError as e:
        transaction.rollback()
        print(f"Error occurred: {e}. Rolled back the transaction.")
        raise
    finally:
        session.close()
The code above is always where the issue is.
The output looks like this:
trying xml_to_sql
transactions started
table cleared
data element 365 - AMP
data element 1018 - AMP
data element 2869 - AMP
data element 3375 - AMP
data element 3998 - AMP
data element 6758 - AMP
data element 7195 - AMP
data element 7447 - AMP
data element 8814 - AMP
data element 9447 - AMP
data element 10511 - AMP
data element 11408 - AMP
data element 14226 - AMP
data element 14367 - AMP
data element 14900 - AMP
data element 15357 - AMP
data element 16497 - AMP
data element 16995 - AMP
data element 17371 - AMP
data element 17735 - AMP
data element 18066 - AMP
anvil.server.ExecutionTerminatedError: Server code exited unexpectedly: 652e9c8fbf
It fails at a random point during the data element count. I’m trying very hard to make this platform work. I even signed up to the professional plan in the hope that increasing CPU/RAM would help, but it makes no difference.
Have you contacted Anvil support? Error codes like that are typically something that they need to look up in their logs to see what might be causing it.
Can I make a suggestion? If you haven’t already, try running this locally using Uplink and (a) see if it errors out, and (b) see whether it’s an outside-Anvil problem triggering an Anvil error or, indeed, an Anvil-specific issue. The Anvil team are regularly in the forum, so I wouldn’t worry too much about the paid-support issue.
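Roughly, the Uplink side would look something like this (just a sketch: you’d enable the Uplink for your app, paste in your own key, and move the heavy parsing/insert code into the local script so you can watch memory locally and get a full traceback):

import anvil.server

anvil.server.connect("YOUR-UPLINK-KEY")  # key from the app's Uplink settings

@anvil.server.callable
def xml_to_sql_local(xml_content, element_name):
    # Run the same parsing/insert logic here, outside Anvil's server environment,
    # so a crash gives you a local traceback (or an OS-level kill you can observe).
    ...

anvil.server.wait_forever()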
Without a paid support plan, you don’t get a guaranteed response time, but I would personally email them about an unhelpful error message like this (especially after you’ve made a good faith attempt to find the issue yourself and asked for help in the forum).
I don’t think so, not for paid plans. But I also haven’t heard of many 40+ minute background tasks. I may be wrong, though.
I’ve now managed to chunk the XML, which seems to have shifted the issue along:
def xml_to_sql(xml_content: str, model, element_name: str, data_elements: list, session: SqlAlchemySession):
    transaction = session.begin_nested()
    batch_size = 5000
    instances_added = 0
    record_count_before_clearing = session.query(model).count()
    current_record_count = copy(record_count_before_clearing)
    try:
        session.query(model).delete()  # Clears out the table
        print(f"table cleared - {element_name}")
        column_types = {column.name: column.type for column in model.__table__.columns}
        for event, element in iterparse(io.StringIO(xml_content), events=("start", "end")):
            if event == "end" and element.tag == element_name:
                data = {}
                for data_element in data_elements:
                    child = element.find(data_element)
                    text_value = element.findtext(data_element, default=None)  # Extract text value for the data element
                    # Convert the string to the appropriate type based on the model's column
                    if data_element in column_types:
                        column_type = column_types[data_element]
                        if isinstance(column_type, Integer) and text_value is not None and text_value.isdigit():
                            data[data_element] = int(text_value)
                        elif isinstance(column_type, String):
                            data[data_element] = text_value
                        elif isinstance(column_type, Date) and text_value is not None:
                            try:
                                # Assuming the date format in your XML is 'YYYY-MM-DD'
                                data[data_element] = datetime.strptime(text_value, '%Y-%m-%d').date()
                            except ValueError:
                                # In case the date format is incorrect or the field is empty
                                print(f"Date conversion error for value '{text_value}'")
                                data[data_element] = None
                        else:
                            # For any other type that doesn't require special handling
                            data[data_element] = text_value
                    else:
                        # If the element is not found in column types, use the default value
                        data[data_element] = text_value
                instance = model(**data)
                session.add(instance)
                instances_added += 1
                element.clear()
                if instances_added % batch_size == 0:
                    session.flush()  # Flush the batch to the database but do not commit yet
                    current_record_count -= batch_size
                    add_to_logs_table(f"Flushed a batch of {batch_size} records - {100 - round((current_record_count * 100) / record_count_before_clearing)}% done of {model.__tablename__}")
                    batch_size -= 1
        # Commit the transaction only if all batches are processed successfully
        session.flush()
        print(f"last flush for {element_name} completed")
        session.commit()
        print(f"All batches committed successfully - {element_name}")
    except SQLAlchemyError as e:
        transaction.rollback()
        print(f"Error occurred: {e}. Rolled back the transaction.")
        raise
    finally:
        session.close()
Now when I run this, I get the “Server code exited unexpectedly” error at the point of session.commit().
I don’t believe it’s a memory issue, as I have seen Anvil’s separate memory error before, but the fact that chunking the XML shifted the issue implies that it may be memory-related.
Can anyone at Anvil look at the logs and point out the issue? I’d also strongly suggest making the errors more specific than this, so that users can self-diagnose issues.
I just finished reading this thread from top to bottom, and you have done everything that I would do, including moving to iterparse().
I am not entirely sure, but my sneaking suspicion falls on SQLAlchemy here.
Since you are creating a new variable called instance with its own Python reference count, then adding it to the session, it is possible that SQLAlchemy’s .flush() is not de-referencing the variable. This keeps it in memory in Python even after it is sent to your database, even though SQLAlchemy would not normally do this.
Normally it would only retain links to the data in the DB for a rollback or a failure to commit, or its row statuses, etc.
If you want, you could try abandoning the batching altogether and, since you are already doing iterparse(), try flushing an unreferenced copy of the data to see if that makes a difference.
I can’t say what it will do to overall speed with so many DB insert calls, but you will at least know if it is possible / what the memory problem is.
like:
## Instead of:
# instance = model(**data)
# session.add(instance)
# instances_added += 1
# element.clear()
## Do in each iterated loop:
session.add(model(**data))
session.flush()
element.clear()
This is not ideal, but it might get you further along with the problem solving.
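Another variant you could try (again just a sketch, based on assumptions about your models rather than anything I’ve tested): explicitly drop the session’s references after each flush with expunge_all(), or skip ORM instances entirely and insert the plain dicts through Core, which never enter the identity map:

# Option 1: keep batching, but detach flushed objects so the session stops referencing them
if instances_added % batch_size == 0:
    session.flush()
    session.expunge_all()  # the rows stay in the open transaction; only the Python objects are released

# Option 2: bypass ORM instances and insert the dicts you already build via Core
batch = []
# ... inside the loop, instead of model(**data): ...
batch.append(data)
if len(batch) >= batch_size:
    session.execute(model.__table__.insert(), batch)  # executemany of plain dicts
    batch.clear()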
I would add that before I was using flush(), I was getting memory issues from Anvil on smaller XMLs before processing even 50% of the data. Once flush() was implemented, the issue shifted to only happening on load of the largest XML, which is where I introduced iterparse(), and the problem changed to being generic (i.e. no mention of memory) and happened on the commit.
This only happens with the largest of the 7 XMLs that get processed, and only at the commit().
The point I was trying to make was that when the session.add() instances were not being dereferenced, the issue happened much earlier and on smaller XMLs.
Now you have me wondering whether your issue really is the Anvil server swallowing up an error from SQLAlchemy.
Could there just be too much data held in memory by your database that has not been written to disk (i.e. before the commit command is sent), with Anvil misreporting the error that gets raised?
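If that’s what’s happening, one crude way to test it (just a sketch, and it does change the semantics: a failure part-way through would leave a partially rebuilt table instead of rolling back cleanly) would be to commit each batch instead of holding everything in one transaction:

if instances_added % batch_size == 0:
    session.commit()  # write the batch out and let the database release the pending transaction
    print(f"Committed a batch of {batch_size} records - {element_name}")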
The only reason I know anything about this at all is that at a previous employer I ended up automating a pipeline to ETL large (sometimes several GB) XML files of automotive fitment data, doing essentially what you are trying to do.
I also remember it being a pain, with memory and row-commit problems when upserting it externally…
@ianb - The db is fine; I’m barely touching its capabilities. I tried it locally with larger files. In addition, the db performance scales automatically, and it has never had to scale at all.
I tried your approach; it takes about 50% longer to reach the same point, where again it crashes, this time with the out-of-memory message. However, I realised it never reached the print just before the commit. When I went back to my previous iteration, again it never reached the print before the commit.
What this tells me is that there is a problem after batch_size -= 1, where it is exiting this loop:
for event, element in iterparse(io.StringIO(xml_content), events=("start", "end")):
I’m not sure why that would cause an issue, or why the out-of-memory error only happens if I flush for every instance.
Hmm, I am out of ideas, I think. I’m guessing your XML is relatively uniform, so I doubt it’s one particularly large element tree to parse somewhere towards the end of the file… Even if it were, you should see problems with other files, regardless of size, if they had particularly large elements somewhere within.
So I don’t have anything else. I don’t actually know what your data looks like, but I also try to use a different library to handle XML wherever I can, since I strongly dislike the Python built-in one.
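For example, lxml’s iterparse lets you clear not just the current element but also already-processed siblings hanging off the root, which is the usual trick for keeping memory flat on big files. A sketch, assuming lxml is available in your environment and reusing your xml_content / element_name names:

import io
from lxml import etree

# lxml wants bytes rather than str, hence the encode
for event, element in etree.iterparse(io.BytesIO(xml_content.encode("utf-8")),
                                      events=("end",), tag=element_name):
    # ... build the data dict and session.add(...) exactly as before ...
    element.clear()
    # Drop already-processed siblings that the root element still references
    while element.getprevious() is not None:
        del element.getparent()[0]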