We are experiencing anvil.server.TimeoutError on all of our apps that we use for our workplace. This has been off and on for the past few days causing severe workplace disruptions. We would share an app ID, but all of the apps require a login. Is this related to your outage announcement, and is work being done to address this?
[Moved to new topic]
Sorry to hear that! We’re not aware of any current issues. Please let us know the app IDs of the apps where you’re seeing the issue so we can investigate. Do you get the error reliably for certain Server Functions, or all Server functions, or does it only happen sometimes? If it’s only certain functions, please let us know which ones. Thanks!
I’m having repeated response issues at 20:00-20:15 GMT on Friday.
(I saw the Uplink issues that occurred yesterday, but this seems to be a different problem, so I’m starting a new thread.)
On several different tries, the server commands are timing out before the task even seems to start, e.g. the first line of a simple `print("Test is starting")` isn’t triggering.
An example session ID is JODHSDDJQC73GHI5FCWQGNNQ5ALKK5YN
App ID is Y26N6JTYNQLGMR7Z
Full console log below:
Application loaded
`anvil.server.TimeoutError: Server code took too long`
* `at /downlink-sources/downlink-2024-01-08-11-08-41/anvil/_threaded_server.py:436`
* `called from /downlink-sources/downlink-2024-01-08-11-08-41/anvil/server.py:55`
* `called from <input>:1`
* `called from /downlink-sources/downlink-2024-01-08-11-08-41/anvil_downlink_worker/__init__.py:213`
* `called from /downlink-sources/downlink-2024-01-08-11-08-41/anvil_downlink_worker/__init__.py:244`
`anvil.server.TimeoutError: Server code took too long`
I’m in the US East Coast if that’s relevant.
Servers are responding now at 22:35 GMT.
This appeared to be resolved by ~03:00 GMT, but timeouts are recurring again at 14:00 GMT (09:00 my time).
This only seems to be affecting direct server calls, as background tasks appear to be running without issue.
`@anvil.server.callable` functions that normally complete without issue keep timing out.
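Since background tasks seem unaffected and only direct calls are timing out, one stopgap is to retry calls that hit a transient timeout. This is a generic, hedged sketch, not an official fix; the `anvil.server` names in the docstring are assumptions about how you would wire it into an app:

```python
import time

def call_with_retry(fn, *args, retries=3, delay=2.0, retry_on=(Exception,)):
    """Call fn(*args), retrying on the given exception types.

    Intended as a stopgap wrapper around a flaky server call, e.g.:
        call_with_retry(anvil.server.call, 'update_news', country,
                        retry_on=(anvil.server.TimeoutError,))
    """
    last_exc = None
    for attempt in range(retries):
        try:
            return fn(*args)
        except retry_on as exc:
            last_exc = exc
            time.sleep(delay * (attempt + 1))  # simple linear backoff
    raise last_exc
```

Note this only papers over intermittent failures; if every call times out for a sustained period, retries just delay the error.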
I’m also running into this issue. Yesterday and today (1/19/2024 and 1/20/2024).
Anyone found a solution yet?
Hi @andrew
Are there any particular functions that time out reliably when called from client-side code? We’re trying to work out what’s going on here, but haven’t been able to reproduce the problem so far. Thanks!
We do not have any specific functions that always time out. For us, it’s intermittent. However, it’s ALL server functions. Perhaps this is a hint: this timeout error has only occurred on the apps that we host on Anvil. We have some self-hosted Anvil apps that have never run into this error.
Hi Ian,
Client-side, I couldn’t log in as the authentication calls were timing out so I don’t have any examples of client-side events that failed earlier. Sorry.
I just tried a couple of server calls, and things are running fine from the command line. I can log in to the app and the commands are working there too. (This was at 19:51 GMT.)
The logs for the last couple of hours also seem OK; the last set of issues I logged was at 17:03 GMT / 12:03 ET, where several runs in a background task timed out.
This was session ID EC7QG6DYCXQSNTGHXEV3VLBBYWOTF7G
```
Error updating news for Syria: Server code took too long
Error updating news for Singapore: Server code took too long
Run complete for Switzerland
Error updating news for Brazil: Server code took too long
Run complete for Democratic Republic of the Congo
Error updating news for Egypt: Server code took too long
Error updating news for Algeria: Server code took too long
```
Not sure if that helps.
Is there anything else I can do to help track this down?
Update to this:
By 20:30 GMT I was getting timeout errors again on some longer server calls (all of which normally run OK).
FWIW, instead of a flat-out server error, it seems as though something is running slower than normal, meaning that server calls that normally finish under the server timeout cut-off are now taking too long to execute and are producing the timeout error.
So maybe not an Anvil issue as much as a server issue somewhere?
(But to be clear, I really have no idea what I’m talking about here…)
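One way to test the “running slower than normal” theory is to log each server function’s wall-clock time and watch for gradual creep rather than outright failure. A minimal plain-Python sketch; stacking it under `@anvil.server.callable` is an assumption about your setup, and the printed line would simply land in the app logs:

```python
import functools
import time

def timed(fn):
    """Log the wall-clock duration of each call, to spot slowdowns.

    Assumed usage in an Anvil server module:
        @anvil.server.callable
        @timed
        def update_news(country):
            ...
    """
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{fn.__name__} took {elapsed:.2f}s")
    return wrapper
```

If the logged durations climb toward the timeout cut-off before the errors start, that would support the slowdown theory.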
Thanks, we’re continuing to investigate this, but so far we have no leads - everything looks fine in our server metrics, and we haven’t seen any timeouts ourselves.
@mastamatto I think I’m right in saying that sometimes everything works fine for you, but for periods of time every server call times out - is that correct? Do you have any idea how long those outage periods are? Please can you tell me an app ID for an app where this is happening, so I can correlate the TimeoutErrors in the logs with other events in our infrastructure?
Hi Ian,
That started on Thu 18th Jan 2024 at ~22:00 EET.
The worst day was Friday at ~20:00 EET.
Over the weekend there were periodically huge delays, but without timeout errors.
App ID: A2ARJWGGE5WE4BRR
I thought you had a problem with storage.
Hi @Vadim,
Thanks for providing a specific app and time! Unfortunately, some of the most detailed diagnostics have now aged out of our system (we’re currently working on retaining more information for longer).
In the meantime, it would be really, really helpful if people experiencing this problem could post:
- An app ID
- One or more session IDs where the error was occurring
- Dates and times when the error occurred
Without that, we’re looking for a needle in a haystack (the rate of “correct” timeout errors across Anvil is fairly high, and so far we don’t have much to distinguish the “bad” timeout errors from the “good” ones. We’re working on it!)
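For anyone wanting to gather this as it happens, here is a minimal, hypothetical sketch of a failure recorder you could keep alongside your own call sites. All names are made up for illustration, and it does not fetch the session ID itself (copy that from the app logs and pass it in):

```python
import time

class FailureLog:
    """Collect timestamped failures to report alongside an app ID."""

    def __init__(self):
        self.entries = []

    def record(self, exc, session_id=None):
        # session_id is whatever identifier you have to hand;
        # this sketch deliberately does not try to look it up.
        self.entries.append({
            'when': time.strftime('%Y-%m-%d %H:%M:%S GMT', time.gmtime()),
            'error': f"{type(exc).__name__}: {exc}",
            'session': session_id,
        })

    def report(self):
        return "\n".join(
            f"{e['when']}  {e['session'] or '-'}  {e['error']}"
            for e in self.entries
        )
```

Pasting the output of `report()` into the thread would give the staff exactly the app ID / session ID / timestamp triples they asked for.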
Hey there,
We are experiencing connection timeouts again in different apps. This started a couple of minutes ago.
AppOfflineError: Connection to server failed (1006)
We are getting reports that certain endpoints time out or do not work at all.
Server calls are also timing out.
Anyone else experiencing issues?
Update:
The server calls just spin forever and don’t even return a timeout exception.
Update II:
Looks like server calls from the dev environment work; it seems to happen only in the production environment.
Hi everyone,
It looks like one server node in our cluster has stopped responding. We’re in the process of bringing it back now, please stand by for further updates
That server node is now back, but it is suspicious that there was any disruption - the loss of a single node should not cause timeouts or infinite spinners such as those being reported, so there must be more going on here.
Is anyone able to provide instructions for us to be able to trigger one of these infinite spinners ourselves on your app?
Our infinite spinners have now stopped.
Will report if we see any more issues.
Well, it was taking place on all of our applications that are published and used daily.
APP ID: G6NB2XYFOWYGVE7P
SESSIONS WITH TIMEOUT ERRORS:
- HCLVZXRIHM35H2F5K7KJCQ6ADZT5SCA4
- Z3USCBU4BI5Q47F3ZD4U4FXY43CAVL6E
- HGVCXRMI7RRK3BOZTQMCRN4KNEY77ARW
- TGPGJFAQYHAK3JXQK4NGVHK7LFFUN7SU
- DWRUVKLWMBZ47267FMYAV4BTRGU6WU2U
- H7EDHNMQRW3K52W5XBYM7ODBEUBANYK6
SESSIONS WITH APPOFFLINE ERRORS:
- P5L5SJI2HBVN2ZKJLRAWBNTM6UDEKJGJ
- PRELSHHBM3PZMFF2N5TO3V6NIPL6NOLE
Oddly enough, we haven’t seen any errors at all today. So perhaps the node that was fixed covered us as well.
I spoke too soon. We are seeing this issue again. Latest sessions on the same app (timeout error):
- US7QNDYCSVWAG5UCXNHA33B4ZEB3HK3G
- CH5YZOO6LGW5RMOJWOGJZKXZMCFSJYWW