[RESOLVED] My app is no longer loading :S

Aaron, did you do the reimage version on your production version of the app?

No, not with the app I am working on right now. I checked the others - none of them are working. I will try it now on the production version.

Thx Connor - that did the job - but we need to know what is going on.

I haven’t used 3.10 or Persistent Server yet, but this suggests a problem with some of the entries in Anvil’s Virtual Environment cache. IIRC, Anvil tries to reuse existing, already-built Virtual Environments rather than rebuilding them every time.

I am also unable to connect to my Anvil repositories:

Connection reset by 35.177.218.83 port 2222
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.

Is anyone having the same problem?

Hi folks,

We’ve just resolved a deadlock affecting some applications, matching the symptoms you’re reporting, and @davidtopf2’s app is now loading again.

Can others in this thread confirm we’re back moving again?

EDIT: Yeah, pretty sure that’s what it was. It’s nearly 1am here, so I’m going to grab some shut-eye before working out (a) why that failure happened (spoiler: it’s a deadlock in our intra-cluster communication code, which recently received an upgrade), and (b) why that failure didn’t page us immediately.

A little bit of detail in the meantime: The deadlock effectively took out a node from our central “platform server” cluster. That node got dropped rapidly from the load balancing rotation, so most traffic was routed around it, but if your server code happened to be attached to that machine your app was having a bad day: your server code still registered as “up” and reachable (even if it had actually timed out, responsibility for marking it as dead belonged to…the server that owns it, which was juuust alive enough to keep our failsafes from shooting down the record of that server code, but dead enough to not do anything with its incoming traffic). This is the sort of failure that should set off all the alarms in the world immediately; it didn’t, and a more-awake dev team is going to have to work out how to make that not happen. (And, of course, debug the cause of that deadlock that only occurred after a ~week of production load…).
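
For readers trying to picture this failure mode, here is a toy model of it. It is only a sketch and not Anvil's actual code: `Record`, `Registry`, `owner_sweep`, `independent_sweep` and the timeout value are all hypothetical names made up for illustration. It shows why a record stays "up" when only its (deadlocked) owner is allowed to expire it, and how a sweep that does not depend on the owner would catch it.

```python
from dataclasses import dataclass, field

HEARTBEAT_TIMEOUT = 5.0  # seconds; arbitrary value chosen for illustration


@dataclass
class Record:
    owner: str             # platform node responsible for this server code
    last_heartbeat: float  # last time the server code was known to respond
    status: str = "up"


@dataclass
class Registry:
    records: dict = field(default_factory=dict)

    def owner_sweep(self, node: str, node_is_working: bool, now: float) -> None:
        """Only the owning node expires its own records.
        A deadlocked owner never does any work, so stale records stay 'up'."""
        if not node_is_working:
            return
        for rec in self.records.values():
            if rec.owner == node and now - rec.last_heartbeat > HEARTBEAT_TIMEOUT:
                rec.status = "dead"

    def independent_sweep(self, now: float) -> list:
        """Any healthy node (or an external watchdog) expires stale records,
        whoever owns them, and returns their names so an alert can be raised."""
        expired = []
        for name, rec in self.records.items():
            if rec.status == "up" and now - rec.last_heartbeat > HEARTBEAT_TIMEOUT:
                rec.status = "dead"
                expired.append(name)
        return expired


registry = Registry()
registry.records["app-123"] = Record(owner="node-a", last_heartbeat=0.0)

# node-a is deadlocked: its own sweep never runs, so the record looks healthy.
registry.owner_sweep("node-a", node_is_working=False, now=60.0)
stuck_status = registry.records["app-123"].status

# A sweep that does not rely on the owner notices the stale heartbeat.
expired = registry.independent_sweep(now=60.0)
```

In this model the stale record is only caught by the second sweep, which is also the natural place to hook in paging.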

OK, enough rambling: Sleep now, debug tomorrow.

Thank you so much @meredydd! It does work this morning and it’s great to see you respond ASAP :slight_smile:

I was experiencing delays in general this last week (maybe even a little longer). Is there any chance that this has been affecting my app for longer, and if yes, roughly how long? I kept feeling like Anvil was more sluggish than I’m used to, especially when loading up for the first time.

Cheers!

Yesterday I added one dummy package to the custom environment, a new environment was created and the app finally started working again.

This created a new commit in the repository, where the only difference is one more item in requirements.txt.

Today I tried to reset the production branch back to where it was yesterday, and it still doesn’t work.

I reset it back to the last commit created as described above, and it works.

Keeping the dummy package is not a problem for now, but I think this should be addressed: if someone creates a new environment identical to the one that stopped working, the broken cached one could be picked (as seems to happen when resetting to the second commit), and the app will not work.
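
To illustrate why this would happen, here is a minimal sketch. It assumes (as guessed earlier in the thread, not confirmed by Anvil) that built environments are cached under a key derived from the contents of requirements.txt. `env_key`, `get_environment`, the `cache` dict and the package pins are all made up for the example.

```python
import hashlib


def env_key(requirements: str) -> str:
    """Derive a cache key from the normalized contents of requirements.txt."""
    lines = sorted(line.strip() for line in requirements.splitlines() if line.strip())
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


# Hypothetical cache: key -> state of the built environment.
cache = {}


def get_environment(requirements: str) -> str:
    """Reuse the cached environment for this key if there is one, else build one."""
    key = env_key(requirements)
    if key not in cache:
        cache[key] = "fresh build"
    return cache[key]


original = "requests==2.31.0\nnumpy==1.26.4\n"  # illustrative pins only
with_dummy = original + "six==1.16.0\n"          # the dummy-package workaround

# Simulate the cached environment for the original requirements being broken.
cache[env_key(original)] = "broken build"

after_workaround = get_environment(with_dummy)  # new key, so a fresh build
after_reset = get_environment(original)         # old key, so the broken entry again
```

Under that assumption, adding any package changes the key and forces a rebuild, while resetting to the earlier commit maps back to the same broken entry until it is evicted or rebuilt on the server side.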

Just wanna throw my hat in and say that I think I may have been affected by this issue today.