Uplink Disconnect Issues

Goal

Keep uplink connected robustly 24/7.

Setup

I have an app that connects to a local uplink program that runs 24/7 on a desktop computer connected by Ethernet. In this uplink program, I connect to a Google Cloud PostgreSQL database through psycopg2. I typically run wait_forever() and wait for Anvil calls from the client throughout the day to query the database. Every day at 4 am, a scheduled Anvil task syncs an external inventory system with my cloud database.

Problem

Basically, this uplink connection fails every day. By “fail” I mean something a little vague, because I haven’t been able to find a systematic trace of the error, but a few things commonly happen.

The uplink program will usually have a complaint like this:

Anvil websocket closed (code 1006, reason=Going away)
Exception in thread Thread-2:
Traceback (most recent call last):
 File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 311, in call
   return _do_call(args, kwargs, fn_name=fn_name)
 File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 303, in _do_call
   return _threaded_server.do_call(args, kwargs, fn_name=fn_name, live_object=live_object)
 File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\_threaded_server.py", line 404, in do_call
   raise error_from_server
anvil._server.AnvilWrappedError: 'Connection to Anvil Uplink server lost'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
 File "C:\Users\gwebe\anaconda3\lib\threading.py", line 932, in _bootstrap_inner
   self.run()
 File "C:\Users\gwebe\anaconda3\lib\threading.py", line 870, in run
   self._target(*self._args, **self._kwargs)
 File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 143, in heartbeat_until_reopened
   call("anvil.private.echo", "keep-alive")
 File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 314, in call
   raise _server._deserialise_exception(e.error_obj)
anvil._server.AnvilWrappedError: 'Connection to Anvil Uplink server lost'
Reconnecting Anvil Uplink...
Connecting to wss://anvil.works/uplink
Anvil websocket open
Connected to "Default environment (dev)" as SERVER

It seems to always “recover” the connection. However, I get weird bugs once the uplink reconnects. These include:

  1. The Anvil server call cannot find my local functions and claims the functions do not exist.
  2. This:
OperationalError: server closed the connection unexpectedly
   This probably means the server terminated abnormally
   before or while processing the request.
  3. The client code just hangs waiting for the return and then eventually gives up with anvil.server.TimeoutError: Server code took too long. The client code usually takes less than 3 seconds to gather the information from my local desktop.

Similar Issues

The first conversation above ended with @shaun asking about a potential bug, but then the conversation died off. I can provide more information if needed for my scenario.

Question

Does anyone have a robust strategy to handle the Anvil uplink connection and an external database cursor? Or does anyone know what I might be doing to cause the daily fragility?

Thanks!

Well the main issue is still persisting, but I figured out one of the sub errors (Problem #2). I realized the OperationalError was actually a psycopg2 exception. I now catch that before making database queries and restart the connection if it pops up (similar to Stack Overflow DB Reconnection).
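The catch-and-reconnect approach described above could look roughly like the sketch below. It is not the original poster's code: `ReconnectingQuery`, `connect`, and `retry_on` are hypothetical names, and the helper is written against a generic connection factory so it runs without a database. With psycopg2, `connect` would be something like `lambda: psycopg2.connect(**parametersConnection)` and `retry_on` would be `psycopg2.OperationalError`.

```python
class ReconnectingQuery:
    """Run queries, reopening the connection once if it has gone stale."""

    def __init__(self, connect, retry_on):
        self._connect = connect      # zero-argument factory returning a DB-API connection
        self._retry_on = retry_on    # exception class(es) that mean "connection lost"
        self._connection = connect()

    def fetchall(self, sql, params=None):
        try:
            return self._run(sql, params)
        except self._retry_on:
            # The server dropped the connection: discard it, reconnect, retry once.
            try:
                self._connection.close()
            except Exception:
                pass
            self._connection = self._connect()
            return self._run(sql, params)

    def _run(self, sql, params):
        # Use a fresh cursor per query so a dead cursor is never reused.
        cursor = self._connection.cursor()
        try:
            cursor.execute(sql, params)
            return cursor.fetchall()
        finally:
            cursor.close()
```

A second failure straight after reconnecting is deliberately not retried, so a database that is really down still surfaces as an error instead of looping.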

I tested again today and the main problem still occurs: Anvil claims that it is reconnected on the uplink code; however, I still get client responses like this:

anvil.server.UplinkDisconnectedError: The uplink server for "<FunctionName>" has been disconnected

I do similar things for multiple different uplinks for anvil apps running 24/7 for a retail environment.
This means I get called if something breaks.

This is not a solution to the problem, but the workaround that works for me is to not run them 24/7. I have a scheduler that ends the scripts after about 23 hours and 55 minutes, and the scheduler is set to start them again every minute if a task has ended or crashed.

Reconnecting to the uplink once a day stopped most if not all of the aberrant behavior I had that matched some of what you and others have posted.
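The post does not say which scheduler is used, so the following is only a sketch of the same idea as a small Python supervisor: recycle the uplink script before it reaches 24 hours, and start it again shortly after it ends or crashes. `supervise` and all of its parameters are hypothetical names.

```python
import subprocess
import sys
import time

def supervise(command, max_runtime, restart_delay=60.0, max_starts=None, poll_interval=1.0):
    """Keep `command` running; recycle it after `max_runtime` seconds or when it exits.

    Returns the number of times the command was started (only reached when
    `max_starts` is set; with max_starts=None the loop runs until interrupted).
    """
    starts = 0
    while max_starts is None or starts < max_starts:
        process = subprocess.Popen(command)
        starts += 1
        deadline = time.monotonic() + max_runtime
        # Wait until the child exits on its own or reaches its maximum age.
        while process.poll() is None and time.monotonic() < deadline:
            time.sleep(poll_interval)
        if process.poll() is None:
            # Still running at the deadline: stop it, escalating if it ignores terminate().
            process.terminate()
            try:
                process.wait(timeout=10)
            except subprocess.TimeoutExpired:
                process.kill()
                process.wait()
        if max_starts is None or starts < max_starts:
            time.sleep(restart_delay)
    return starts
```

With the numbers from the post, a call such as `supervise([sys.executable, "my_uplink_script.py"], max_runtime=23 * 3600 + 55 * 60, restart_delay=60)` would match the described schedule; the script name here is a placeholder.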

Thanks @ianbuywise. Very interesting and seems promising. I’m going to test it out and report back after a few days.

Weighing in here: The Uplink reconnecting is no big deal, but when the Uplink reconnects, all of your server functions should automatically be re-registered! So if it’s behaving as described here, it’s definitely a bug.

How reliable is the loss of server functions? Can you construct a minimal app that replicates this behaviour, and which you could leave in this state while we poke around at it?

A few updates:

  • I think I tried @ianbuywise’s workaround, but I am still getting weird behavior. I go into detail below for what I have been attempting.
  • @meredydd: I believe it is failing pretty reliably. I am going to try to construct a minimal example soon (hopefully later today, without the DB connection), but what I have now is fairly simple (unfortunately, it is just part of a larger codebase).

What I have tried recently:

I created a Scheduler.py script on my desktop that just periodically calls a local program called connectDatabaseProduction.py in the same directory. All connectDatabaseProduction.py does is:

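import anvil.server
import psycopg2

# Connection parameters (values left blank in this post)
parametersConnection = {
   'sslmode': '',
   'sslrootcert': '',
   'sslcert': '',
   'sslkey': '',
   'hostaddr': '',
   'port': '',
   'user': '',
   'password': '',
   'dbname': ''
}

# Open the database connection and cursor once, when the script starts
connection = psycopg2.connect(**parametersConnection)
cursor = connection.cursor()

def main():
   # Connect the Uplink, then block and serve calls from the Anvil app
   anvil.server.connect('')
   anvil.server.wait_forever()

main()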

There are some other functions in here that get called by the Anvil server through the client. When I first load everything up, it all goes normally. After a few hours, things start to break down.

After @ianbuywise’s suggestion, I made this Scheduler.py code, which:

  • runs connectDatabaseProduction.py
  • prints the time (in EST) every hour
  • between 2 am and 3 am, it terminates connectDatabaseProduction.py and restarts it
  • repeats this for about 5 days

Here is Scheduler.py:

from datetime import datetime, date, timedelta
import time
import subprocess

# sleep every 1 hour and restart between 2-3 am every day

countDay = 0
now = datetime.now()
dateStart = date.today()

print('')
print('=== Scheduler === Start Date: ' + str(dateStart) + ' ' + now.strftime("%H:%M:%S"))
print('')

if datetime.now().hour > 3:
   dateNext = dateStart + timedelta(days = 1)
else:
   dateNext = dateStart

subprocessDatabase = subprocess.Popen(['python', 'connectDatabaseProduction.py'])

print("=== Scheduler === Waiting for 2 am of " + str(dateNext))
print('')
while countDay < 6:

   now = datetime.now()

   print("=== Scheduler === Checked at " + str(date.today()) + ' ' + now.strftime("%H:%M:%S"))
   print('')

   if dateNext == date.today() and now.hour == 2:
      
      print('')
      print("=== Scheduler === Restarted at " + now.strftime("%H:%M:%S"))
      print('')
      
      subprocessDatabase.terminate()
      subprocessDatabase = subprocess.Popen(['python', 'connectDatabaseProduction.py'])
      dateNext += timedelta(days = 1)
      countDay += 1
      
      print("=== Scheduler === Waiting for 2am of " + str(dateNext))
      print('')
   
   time.sleep(3600)

subprocessDatabase.terminate()

I ran this for about 2-3 days (between 6/28 and 6/30) and this was the output:

=== Scheduler === Start Date: 2021-06-28 16:09:49

=== Scheduler === Waiting for 2 am of 2021-06-29

=== Scheduler === Checked at 2021-06-28 16:09:49

Connecting to wss://anvil.works/uplink
Anvil websocket open
Connected to "Default environment (dev)" as SERVER
Anvil websocket closed (code 1006, reason=Going away)
Exception in thread Thread-2:
Traceback (most recent call last):
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 311, in call
    return _do_call(args, kwargs, fn_name=fn_name)
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 303, in _do_call
    return _threaded_server.do_call(args, kwargs, fn_name=fn_name, live_object=live_object)
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\_threaded_server.py", line 404, in do_call
    raise error_from_server
anvil._server.AnvilWrappedError: 'Connection to Anvil Uplink server lost'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\gwebe\anaconda3\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\gwebe\anaconda3\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 143, in heartbeat_until_reopened
    call("anvil.private.echo", "keep-alive")
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 314, in call
    raise _server._deserialise_exception(e.error_obj)
anvil._server.AnvilWrappedError: 'Connection to Anvil Uplink server lost'
Reconnecting Anvil Uplink...
Connecting to wss://anvil.works/uplink
Anvil websocket open
Connected to "Default environment (dev)" as SERVER
OperationalError('server closed the connection unexpectedly\n\tThis probably means the server terminated abnormally\n\tbefore or while processing the request.\n')

Restarted Connection

=== Scheduler === Checked at 2021-06-28 17:47:00

=== Scheduler === Checked at 2021-06-28 18:47:00

=== Scheduler === Checked at 2021-06-28 19:47:00

Anvil websocket closed (code 1006, reason=Going away)
Exception in thread Thread-5:
Traceback (most recent call last):
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 311, in call
    return _do_call(args, kwargs, fn_name=fn_name)
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 303, in _do_call
    return _threaded_server.do_call(args, kwargs, fn_name=fn_name, live_object=live_object)
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\_threaded_server.py", line 404, in do_call
    raise error_from_server
anvil._server.AnvilWrappedError: 'Connection to Anvil Uplink server lost'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\gwebe\anaconda3\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\gwebe\anaconda3\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 143, in heartbeat_until_reopened
    call("anvil.private.echo", "keep-alive")
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 314, in call
    raise _server._deserialise_exception(e.error_obj)
anvil._server.AnvilWrappedError: 'Connection to Anvil Uplink server lost'
Reconnecting Anvil Uplink...
Connecting to wss://anvil.works/uplink
Anvil websocket open
Connected to "Default environment (dev)" as SERVER
Anvil websocket closed (code 1006, reason=Going away)
Exception in thread Thread-48:
Traceback (most recent call last):
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 311, in call
    return _do_call(args, kwargs, fn_name=fn_name)
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 303, in _do_call
    return _threaded_server.do_call(args, kwargs, fn_name=fn_name, live_object=live_object)
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\_threaded_server.py", line 404, in do_call
    raise error_from_server
anvil._server.AnvilWrappedError: 'Connection to Anvil Uplink server lost'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\gwebe\anaconda3\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\gwebe\anaconda3\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 143, in heartbeat_until_reopened
    call("anvil.private.echo", "keep-alive")
  File "C:\Users\gwebe\anaconda3\lib\site-packages\anvil\server.py", line 314, in call
    raise _server._deserialise_exception(e.error_obj)
anvil._server.AnvilWrappedError: 'Connection to Anvil Uplink server lost'
Reconnecting Anvil Uplink...
Connecting to wss://anvil.works/uplink
Anvil websocket open
Connected to "Default environment (dev)" as SERVER

=== Scheduler === Checked at 2021-06-30 09:39:29

=== Scheduler === Checked at 2021-06-30 10:39:29

I have not been able to quite figure out what is going on in this output.

In this case, Anvil reconnects in the first hour (sometimes it lasts longer), then the scheduler seems to pick itself back up about 1.5h later, then it prints every hour again for about 3 hours, then Anvil reconnects again, then the scheduler recovers a couple days later.

In the meantime, I was getting anvil.server.UplinkDisconnectedError: The uplink server for "<FunctionName>" has been disconnected and I believe a few other strange errors. Also during this time, I was using the same desktop (that was running the scheduler) and my laptop on the same internet connection, with no connectivity issues on either device.

I intend to simplify this with a minimal app and get something repeatable and less chaotic, but I hope this can at least give some insight or maybe point to something I am doing wrong somewhere.

Thanks!

A minimal example would be really helpful! It sounds like that OperationalError is indeed a red herring. To be clear, the desired behaviour is:

  • If the Uplink’s connection is interrupted, it should reconnect immediately
  • As soon as it reconnects, all the server functions available in that Uplink should be re-registered, and you should no longer see The uplink server for "<FunctionName>" has been disconnected errors

If you can get it to violate either of those rules, that’s an actionable report of an Uplink bug and we’ll jump right on it!

I have been working on it all week but still struggling to get a minimal repeatable example. I am going to take a break from hardcore investigating and will report back if I can figure out anything that is consistent.

I found that part of the problem was that my Windows desktop was sneaking into sleep mode every now and then :man_facepalming:. However, even after fixing that, I am still seeing the same random errors: function disconnect exceptions or, more commonly, the client just hangs and then times out. I have found that printing a word in my Uplink program seems to help the hanging issue somewhat, but I have no idea whether that is actually having an effect.
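The irregular "Checked at" timestamps in the Scheduler.py output earlier in the thread are consistent with the machine sleeping, since that loop is meant to print once an hour. A minimal sketch of spotting such gaps follows; `find_gaps`, `expected`, and `tolerance` are hypothetical names, and the timestamps are copied from that output.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, expected=timedelta(hours=1), tolerance=timedelta(minutes=5)):
    """Return (previous, current, delta) for every interval longer than expected + tolerance."""
    gaps = []
    for previous, current in zip(timestamps, timestamps[1:]):
        delta = current - previous
        if delta > expected + tolerance:
            gaps.append((previous, current, delta))
    return gaps

# "Checked at" timestamps copied from the Scheduler.py output.
checked_at = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in (
    "2021-06-28 16:09:49",
    "2021-06-28 17:47:00",
    "2021-06-28 18:47:00",
    "2021-06-28 19:47:00",
    "2021-06-30 09:39:29",
    "2021-06-30 10:39:29",
)]

for previous, current, delta in find_gaps(checked_at):
    print(f"{previous} -> {current}: {delta}")
```

On this data the function flags two intervals: the first one (1 h 37 min) and the jump from the evening of 06-28 to the morning of 06-30. A long-running script could log the same comparison around each `time.sleep` call to record when the process was suspended.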

Hi @ianbuywise - I was just looking for a solution to this problem and saw your reply about the scheduler.

If there is ANY chance you can share this with me or at least point me in the right direction I would be really grateful. I’m in the same situation as you are/were where “I get called if something breaks” and it sucks if something breaks while I am getting drunk on a Saturday night!

many thanks!!!

michael

I explained a little more detail here:

I can elaborate more if needed. :slight_smile:

In the same thread, @david.wylie talked about using cron if you were running Linux.

If so, you could just run a cron job that kills your currently running script and then starts a new one afterwards, using either the CLI directly, a bash script, or a Python program that parses the output of something like ps aux to find the PID of your script and kill it.
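As a rough sketch of the Python option, assuming a Linux-style `ps aux` whose rows have eleven columns with the full command last: `find_pids` and `kill_script` are hypothetical names, and matching on the script name is a simplification that would also match any other process mentioning it.

```python
import os
import signal
import subprocess

def find_pids(ps_output, script_name):
    """Return the PIDs of `ps aux` rows whose command column mentions script_name."""
    pids = []
    for line in ps_output.splitlines()[1:]:      # skip the header row
        fields = line.split(None, 10)            # the 11th field is the full command
        if len(fields) == 11 and script_name in fields[10]:
            pids.append(int(fields[1]))          # the second column is the PID
    return pids

def kill_script(script_name):
    """Send SIGTERM to every running process whose command mentions script_name."""
    ps_output = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    ).stdout
    for pid in find_pids(ps_output, script_name):
        if pid != os.getpid():                   # never signal this helper itself
            os.kill(pid, signal.SIGTERM)
```

A cron job could call `kill_script` with the uplink script's file name and then start a fresh copy of the script.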

I like this site for helping me with cron timing:

(edited for code formatting)

Like xyzcreativeworks (in a different thread) I am running a simple Raspberry Pi Pico W app to report temperature, humidity and pressure (BME280 sensor in my case), but I have a different problem.
The app produces plots of the 3 quantities and returns values for display on an LCD attached to the Pico.


I want to see the daily variation at various places in my house.

When I run my MicroPython script it connects correctly to the server and the app starts correctly.

Connecting to Anvil…
Connected
Authenticated to app UOLVLMN4PPS4T32X

It runs perfectly for a while, delivering measurements every 20 minutes and producing nice plots, but then disconnects. I can restart everything without problem, but only for a few measurement cycles. I then get this error message in my MicroPython shell:

Connecting to Anvil...
Exception in uplink reconnection loop:
Traceback (most recent call last):
  File "anvil/pico.py", line 154, in _connect_async
  File "anvil/pico.py", line 112, in _connect
  File "async_websocket_client.py", line 94, in handshake
OSError: [Errno 12] ENOMEM

I don’t know its cause or significance, but even when it is working correctly the Anvil app reports an error on each measurement cycle.

anvil.server.UplinkDisconnectedError: Uplink disconnected
at Form1, line 70

If it helps, the Python code (don’t laugh, I’m self-taught) is:

# anvil_pico_w_bme280_plots_lcd_values.py

# Sends sensor values to Anvil for plotting
# Reads back values sent from Anvil. Formats and displays them
 
import anvil.pico
import uasyncio as a

import utime
from breakout_bme280 import BreakoutBME280
from pimoroni_i2c import PimoroniI2C

# for the lcd display
from lcd_api import LcdApi
from pico_i2c_lcd import I2cLcd

UPLINK_KEY ="D7ZG64IS6JGZP775URKBXFRL-UOLVLMN4PPS4T32X"

PINS_BREAKOUT_GARDEN = {"sda": 4, "scl": 5}
PINS_PICO_EXPLORER = {"sda": 20, "scl": 21}

i2c = PimoroniI2C(**PINS_BREAKOUT_GARDEN)
bme = BreakoutBME280(i2c)

# Scan to get the device addresses
I2C_addr = i2c.scan()
lcd_addr = I2C_addr[0]
bme280_addr = I2C_addr[1]
print("lcd = ", lcd_addr, " (dec)")
print("BME280 = ", bme280_addr, " (dec)")
print()

# Set up the LCD
lcd_num_rows = 2
lcd_num_cols = 16
lcd = I2cLcd(i2c, lcd_addr, lcd_num_rows, lcd_num_cols)

# print(bme280read())

# Callable function for sensor readings
@anvil.pico.callable_async
async def bme280read():
    reading = bme.read()
    AT = reading[0]
    AP = reading[1] / 100
    RH = reading[2]
    
    if AP < 750 :
        AP = 1000
        
    return(AT, RH, AP)

@anvil.pico.callable(is_async=True)
async def show_message(message):
    # Remove parentheses from string
    newstring = message.replace('(', '').replace(')', '')
    
    # Convert string to list using ", " as delimiter
    newlist = list(newstring.split(", "))
    
    # Strings with full decimals
    tim = newlist[0]
    at = newlist[1]
    rh = newlist[2]
    ap = newlist[3]
    
    # Convert to floats
    at = float(at)
    rh = float(rh)
    ap = float(ap)
    
    # Format to 1 decimal and return as string
    at = str("%.1f"%at)
    rh = str("%.1f"%rh)
    ap = str("%.1f"%ap)
    
    # Check (Pico shell)
    print(tim, at, rh, ap)
    
    # Display on LCD    
    lcd.clear()
    lcd.backlight_on()
    lcd.putstr("AT = " + at + " deg C")
    lcd.move_to(0,1)
    lcd.putstr("RH = " + rh + " %")
    utime.sleep(1100) # meas cycle 1200 sec
    lcd.clear()

# Connect the Anvil Uplink.
# In MicroPython, this call will block forever
anvil.pico.connect(UPLINK_KEY)

I hope someone can give me some ideas.

(I’ve reformatted your code using backticks for readability - please try and wrap code like this when posting as it’s much easier to read)

It looks to me like you are missing the pico equivalent of

anvil.server.wait_forever()

after you connect to the uplink. Without that on a “normal” uplink, the code will not stay running. So I assume the pico equivalent needs to be there as well?

So my guess is you need:

anvil.pico.connect(UPLINK_KEY)
anvil.pico.wait_forever()

though I confess I don’t use the microcontroller stuff.

Thanks for your reply.
Should I have seen the code as you re-formatted it? What I see in your reply is what I posted. I notice that I forgot to escape a couple of the first comments and that I apparently didn’t need to escape a few others. Sorry for that. Should I replace all # comments with triple backtick strings in future posts?
I based my MicroPython code on an article by Tom’s Hardware. This ends with the same line as my code. The main.py from the Anvil .uf2 file also ends with the same line (and comment) as in my code, so I assume that this is correct for MicroPython at least.

For multiple lines (a block) of code, add a
``` python
line before the block, and
```
line after the block. This eliminates the need to modify individual lines of code, and provides a clear, clean presentation in the Forum’s display.

Ok, I’m out of my depth with regard to micropython. Hopefully someone else will pick this up.
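
One detail in the Pico script above may be worth checking, as an assumption rather than a confirmed diagnosis: `show_message` is an async function but calls `utime.sleep(1100)`, which is a blocking sleep. In asyncio-style code, a blocking call inside a coroutine stops every other task on the event loop until it returns, and that could include whatever keeps the Uplink connection alive. The sketch below uses standard CPython `asyncio` (not MicroPython or `anvil.pico`) to show the difference; `heartbeat`, `count_ticks`, `demo`, and the two handler names are hypothetical.

```python
import asyncio
import time

async def heartbeat(counter, interval):
    # Stand-in for the background work a connection needs (keep-alives, reads).
    while True:
        await asyncio.sleep(interval)
        counter["ticks"] += 1

async def handler_blocking(duration):
    time.sleep(duration)            # blocks the whole event loop

async def handler_cooperative(duration):
    await asyncio.sleep(duration)   # yields to other tasks while waiting

async def count_ticks(handler, duration=0.3, interval=0.02):
    """Run handler alongside a heartbeat task; return how often the heartbeat ran."""
    counter = {"ticks": 0}
    task = asyncio.create_task(heartbeat(counter, interval))
    await asyncio.sleep(0)          # give the heartbeat a chance to start
    await handler(duration)
    ticks = counter["ticks"]
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return ticks

def demo():
    blocked = asyncio.run(count_ticks(handler_blocking))
    cooperative = asyncio.run(count_ticks(handler_cooperative))
    return blocked, cooperative
```

`demo()` returns zero heartbeat ticks for the blocking handler and several for the cooperative one. In the Pico script, the cooperative equivalent would presumably be `await a.sleep(1100)`, since `uasyncio` is imported there as `a`.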