Pico W uplink disconnects and fails to recover

When I bought a Pico W a few weeks ago, the Anvil.Works solution was the first thing I tested. Anvil offers a great way to turn the Pico W into a simple cloud-connected IoT device.

Unfortunately, the Uplink runs for several hours but then breaks. When it does, it seems to reconnect, but the running task does not recover. This is clear from the LED, which stops blinking, and from the absence of any further recorded activity.

My current Pico W side code is pasted below:

import anvil.pico
import utime, time
import onewire, ds18x20
import uasyncio as asyncio
import machine
from machine import Pin
from machine import ADC


# the onewire device is on GPIO22
dat = machine.Pin(22)

# create the onewire object
ds = ds18x20.DS18X20(onewire.OneWire(dat))
#scan bus to find sensors
roms = ds.scan()


# This is an example Anvil Uplink script for the Pico W.
# See https://anvil.works/pico for more information

SERVER_UPLINK_KEY = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

# We use the LED to indicate server calls and responses.
led = Pin("LED", Pin.OUT, value=1)

async def pico_log():
    while True:
        created = utime.localtime()
        seconds = created[5]
        led.toggle()
        await asyncio.sleep(0.5)
        #drop through on minute boundary
        if ((seconds % 60) == 0):
            # read onboard coretemp sensor
            sensor_temp = machine.ADC(ADC.CORE_TEMP)
            conversion_factor = 3.3 / (65535)
            reading = sensor_temp.read_u16() * conversion_factor 
            coretemp = 27 - (reading - 0.706)/0.001721

            #onewire read roomtemp based on initial bus scan
            ds.convert_temp()
            await asyncio.sleep(0.75)
            roomtemp = ds.read_temp(roms[0])
            
            #server call to log data in database
            log = await anvil.pico.call('add_record', coretemp, roomtemp, created)
            print(f"Roomtemp: {roomtemp} @ {created} - {log}")
            
            #pause to ensure we do not retrigger during same-second
            await asyncio.sleep(0.5)
            
# Connect the Anvil Uplink. In MicroPython, this call will block forever.
anvil.pico.connect(SERVER_UPLINK_KEY, on_first_connect=pico_log())


# There's lots more you can do with Anvil on your Pico W.

I cannot find much advice on diagnosing this problem and wonder whether others have found a way around it. I have the 0.1.2 firmware running on the Pico W.

Steve

The Pico W terminal shell now shows the following after a failure:

Roomtemp: 17.9375 @ (2023, 3, 8, 10, 53, 0, 2, 67) - False
Roomtemp: 17.9375 @ (2023, 3, 8, 10, 54, 0, 2, 67) - False
Roomtemp: 17.9375 @ (2023, 3, 8, 10, 55, 0, 2, 67) - False
Roomtemp: 17.9375 @ (2023, 3, 8, 10, 56, 0, 2, 67) - False
Roomtemp: 17.9375 @ (2023, 3, 8, 10, 57, 0, 2, 67) - False
Roomtemp: 17.9375 @ (2023, 3, 8, 10, 58, 0, 2, 67) - False
Roomtemp: 17.875 @ (2023, 3, 8, 10, 59, 0, 2, 67) - False
Exception running uplink task: on_first_connect
Traceback (most recent call last):
  File "anvil/pico.py", line 123, in _launch_task
  File "<stdin>", line 52, in pico_log
  File "anvil/pico.py", line 218, in call
Exception: Internal database error: ERROR: out of shared memory
  Hint: You might need to increase max_pred_locks_per_transaction.

This is the first time I’ve seen this error, which must be coming from the server(?)

Normally, I would see something like the following when the problem happens:

Roomtemp: 17.9375 @ (2023, 3, 8, 10, 58, 0, 2, 67) - False
Roomtemp: 17.875 @ (2023, 3, 8, 10, 59, 0, 2, 67) - False
Connecting to Anvil...
Connected
Authenticated to app 3FB3H5WCUQDJ42GX

It seems I am just self-documenting my Anvil.Works journey via this forum post, but perhaps my experience with the Pico W will help someone in the future. I really didn’t want to give up on the environment, so I kept digging.

It seems that the server uplink task does reconnect if a break occurs, but in my case the problem was that the running task stopped and I had no way to resume it. Having read a little about asyncio, I changed the way the logging task is initialised in order to get some visibility and remote control. With the uplink running, I could check the task status and restart it if needed via the web app. Changing the way the task launches at board initialisation seemed to reduce the incidence of the task stopping, but did not eliminate it. The uplink task sometimes stopped while a server call from the Pico W was submitting a record to the database. This often happened between three and four in the morning, and may be related to how busy the database server was at the time.
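
The status check and restart described here are triggered by hand from the web app, but the same idea can also run on the device itself. Below is a minimal sketch, assuming the task can simply be recreated whenever it has finished or crashed. `supervise`, `worker` and the `max_restarts` limit are hypothetical names for illustration, not part of the Anvil firmware.

```python
try:
    import uasyncio as asyncio  # MicroPython on the Pico W
except ImportError:
    import asyncio  # CPython, for trying the sketch off-device


async def supervise(make_coro, check_s=1, max_restarts=None):
    """Keep a task alive by recreating it whenever it is found to be done.

    make_coro is a zero-argument function returning a new coroutine.
    Returns the number of restarts once max_restarts is used up
    (runs forever when max_restarts is None).
    """
    task = asyncio.create_task(make_coro())
    restarts = 0
    while True:
        await asyncio.sleep(check_s)
        if task.done():
            if max_restarts is not None and restarts >= max_restarts:
                return restarts
            restarts += 1
            print("Task stopped; restart number", restarts)
            task = asyncio.create_task(make_coro())


# Hypothetical worker that stops almost immediately, to exercise the supervisor.
runs = {"n": 0}


async def worker():
    runs["n"] += 1
    await asyncio.sleep(0)
```

On the Pico W this would be started with something like `asyncio.create_task(supervise(pico_log))`, passing the function itself rather than calling it, so that a fresh coroutine is made for each restart.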

Because the uplink could break and the task could stop, I looked at incorporating the hardware watchdog, which I currently feed in the logging task. If the Pico’s server call takes longer than 8000 ms to complete cleanly, the Pico W will reboot. It is not really a controlled response, but if you want a remote solution that recovers from unexpected problems, a watchdog is probably always going to be required. I just hope the Anvil server doesn’t mistake uplink reconnection attempts every 8 seconds for a DoS attack!
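
A gentler complement to the reboot is a software timeout shorter than the watchdog period, so that a slow server call is abandoned before the hardware steps in. Below is a minimal sketch, assuming `asyncio.wait_for` is available in the firmware's `uasyncio` build. `call_with_timeout` and the 5-second default are hypothetical choices for illustration.

```python
try:
    import uasyncio as asyncio  # MicroPython on the Pico W
except ImportError:
    import asyncio  # CPython, for trying the sketch off-device


async def call_with_timeout(make_call, timeout_s=5):
    """Await make_call(), giving up after timeout_s seconds.

    Returns the call's result, or None if it did not finish in time.
    make_call is a zero-argument function returning an awaitable.
    """
    try:
        return await asyncio.wait_for(make_call(), timeout_s)
    except asyncio.TimeoutError:
        print("Server call timed out after", timeout_s, "s")
        return None
```

In the logging task this might be used as `log = await call_with_timeout(lambda: anvil.pico.call('add_record', coretemp, roomtemp, created))`. The timeout plus the other delays in one loop iteration would need to stay below the 8-second watchdog period for this to take effect before a reboot.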

My current Pico W code is below:

import anvil.pico
import utime, time
import onewire, ds18x20
import uasyncio as asyncio
import machine
from machine import Pin
from machine import ADC
from machine import WDT

wdt = WDT(timeout=8000)  # enable it with a timeout of 8s


# the onewire device is on GPIO22
dat = machine.Pin(22)

# create the onewire object
ds = ds18x20.DS18X20(onewire.OneWire(dat))
#scan bus to find sensors
roms = ds.scan()


# This is an example Anvil Uplink script for the Pico W.
# See https://anvil.works/pico for more information

SERVER_UPLINK_KEY = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

# We use the LED to indicate server calls and responses.
led = Pin("LED", Pin.OUT, value=1)
lastcall = "None"
task = None

@anvil.pico.callable(is_async=True)
async def pico_fn():
    global task
    if(task != None): 
        if task.done() == True:
            taskstatus = "Stopped"
        else:
            taskstatus = "Running"
    else:
        taskstatus = "Stopped"
    x = f"{lastcall} - {taskstatus}"
    return x

@anvil.pico.callable(is_async=True)
async def pico_task_fn(command):
    global task
    if command == "start": 
        if(task != None): task.cancel()
        await asyncio.sleep(1)
        wdt.feed()
        task = asyncio.create_task(pico_log())
    elif command == "stop":
        if(task != None): 
            task.cancel()
            task = None


async def pico_log():
    global lastcall
    global task
    while True:
        wdt.feed()
        created = utime.localtime()
        seconds = created[5]
        minutes = created[4]
        led.toggle()
        await asyncio.sleep(0.5)
        #drop through every 5 minutes
        if ((seconds == 0) and (minutes % 5) == 0):
            # read onboard coretemp sensor
            sensor_temp = machine.ADC(ADC.CORE_TEMP)
            conversion_factor = 3.3 / (65535)
            reading = sensor_temp.read_u16() * conversion_factor 
            coretemp = 27 - (reading - 0.706)/0.001721

            #onewire read roomtemp based on initial bus scan
            ds.convert_temp()
            await asyncio.sleep(0.75)
            roomtemp = ds.read_temp(roms[0])
            
            #server call to log data in database
            start = time.ticks_ms()
            log = await anvil.pico.call('add_record', coretemp, roomtemp, created)
            end = time.ticks_ms()
            calltime = time.ticks_diff(end, start)
            lastcall = f"Roomtemp: {roomtemp} @ {created} - {log} {calltime} ms"
            print(lastcall)
            
            #pause to ensure we do not retrigger during same-second
            await asyncio.sleep(0.5)
            
async def main():
    global task
    task = asyncio.create_task(pico_log())
    await asyncio.sleep(2)

    


# Connect the Anvil Uplink. In MicroPython, this call will block forever.
anvil.pico.connect(SERVER_UPLINK_KEY, on_first_connect=main())



# There's lots more you can do with Anvil on your Pico W.

Uplink will still randomly fail :frowning:
May have to give up on this now.

I know there are folks here using uplink in a variety of situations and have developed mechanisms (primarily via cron jobs) to restart the uplink scripts as needed. Hopefully one of them will chime in here. I know they’ve posted details to the forum in the past, but it’s sometimes hard to find specific replies like that.

Is it just today? There were uplink outages across the board about 2 hours ago, 3.25 hours ago, and 4.5 hours ago, each lasting a minute or two. If that is when your Pico disconnected, then it wasn’t the Pico code that did it.

Unfortunately this issue happens randomly and I’ve not yet seen my Pico W stay connected and logging for a full 24 hours. From time to time the uplink will drop and when I have the shell connected I can see it goes through the process of reconnecting and authenticating to the app. My Pico W side code has evolved to improve the chances of the logging task restarting, but clearly there are times when the uplink stays down and server calls cannot get through. When this happens all I can do is power cycle the Pico W. Maybe someone knows a way I can test and restart the uplink via the watchdog or some software intervention supported by the Pico W? Perhaps you know a better way or place to call the anvil.pico.connect function?

Well, I have managed to keep the system logging for over 24 hours now. I start the uplink as a separate task that I can monitor. This means I can include it in the watchdog supervision.

I’ve stripped out all my logging-specific code and listed the main Pico W code that seems to keep the uplink and logging running. I wait 30 seconds before enabling the watchdog so that I have time to get back in and make changes!

import anvil.pico
import utime, time
import uasyncio as asyncio
from machine import Pin
from machine import ADC
from machine import WDT

# This is an example Anvil Uplink script for the Pico W.
# See https://anvil.works/pico for more information

SERVER_UPLINK_KEY = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

# We use the LED to indicate server calls and responses.
led = Pin("LED", Pin.OUT, value=1)

# Some global variables
uplink_task = None
log_task = None
wdt = None



async def pico_log():
    global uplink_task
    global log_task
    global wdt
    while True:
        # check uplink still running and retrigger watchdog
        if(uplink_task is not None): 
            if uplink_task.done() == False:
                wdt.feed()
        # blink LED
        led.toggle()
        await asyncio.sleep(0.5)
        # add code below for logging task 
        # ...
            
async def startup():
    global uplink_task
    global log_task
    global wdt
    
    # Connect the Anvil Uplink.   
    uplink = anvil.pico.connect_async(SERVER_UPLINK_KEY)
    uplink_task = asyncio.create_task(uplink)

    await asyncio.sleep(30) # wait 30 seconds before enabling watchdog
    wdt = WDT(timeout=8000)  # enable it with a timeout of 8s
 
    # Start logging
    log_task = asyncio.create_task(pico_log())
    
    # Seems like I have to keep this task looping for ever otherwise it all stops.
    while True:
        await asyncio.sleep(5)


asyncio.run(startup())

# There's lots more you can do with Anvil on your Pico W.


Thank you so much for documenting your journey and your final solution.

I have been struggling with exactly the same issue, and your use of a watchdog was a revelation to me!

The machine.WDT timeout can only go up to 8.388 s on the Pico W, and I have a data collection exercise that takes longer. I therefore built my own watchdog to do the same trick:

import time
import machine
import uasyncio as asyncio

async def feed_it():
    while True:
        # ticks_diff copes with wrap-around of the millisecond counter
        if time.ticks_diff(time.ticks_ms(), last_refresh) > your_time_limit:  # All these times in ms
            machine.reset()
        await asyncio.sleep(30) # Wait 30 seconds before checking again

I then call it as another task in startup()…

    watchdog_task = asyncio.create_task(feed_it())

I can then feed it with:

last_refresh = time.ticks_ms()

…at the start of my code (to set the first timestamp), and then wherever I need it in my data collection function (being sure to declare last_refresh as a global variable).

Hi Craig, Thanks for your comments and glad you found my notes helpful.

From what I can see, you’ve created a ‘software’ watchdog, which may well be a suitable solution. It does rely on the ‘feed_it’ task not stopping, though. What you could do is combine the two: use the hardware watchdog with its 8+ second limit and retrigger it in your ‘feed_it’ task well within the 8 seconds, so that it never expires unless your ‘feed_it’ task allows it to, or your ‘feed_it’ task stops for some unknown reason. Perhaps this is overkill for most needs, but I thought I’d mention it. Hardware watchdogs have saved me from driving hundreds of miles to reset a device in the past!
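
The combination suggested here could look something like the following. This is a minimal sketch, assuming the data-collection code refreshes a heartbeat timestamp and a separate task feeds the hardware watchdog only while that heartbeat is fresh. `Heartbeat` and `FakeWDT` are hypothetical names; `FakeWDT` only stands in for `machine.WDT` so the sketch runs off-device, and the `time` fallback exists because `ticks_ms` and `ticks_diff` are MicroPython-only.

```python
try:
    import uasyncio as asyncio  # MicroPython on the Pico W
    from time import ticks_ms, ticks_diff
except ImportError:
    import asyncio  # CPython fallback, for trying the sketch off-device
    import time

    def ticks_ms():
        return int(time.monotonic() * 1000)

    def ticks_diff(end, start):
        return end - start


class Heartbeat:
    """Timestamp that the data-collection code refreshes to show it is alive."""

    def __init__(self):
        self.last = ticks_ms()

    def refresh(self):
        self.last = ticks_ms()

    def age_ms(self):
        return ticks_diff(ticks_ms(), self.last)


async def feed_it(wdt, heartbeat, limit_ms, period_s=2):
    """Feed the hardware watchdog only while the heartbeat is fresh.

    Once the heartbeat is older than limit_ms the loop ends, feeding stops,
    and the hardware watchdog resets the board when its own timeout expires.
    If this task itself dies, feeding stops as well.
    """
    while heartbeat.age_ms() <= limit_ms:
        wdt.feed()
        await asyncio.sleep(period_s)


class FakeWDT:
    """Stand-in for machine.WDT that just counts feeds."""

    def __init__(self):
        self.feeds = 0

    def feed(self):
        self.feeds += 1
```

On the Pico W, `wdt` would be `WDT(timeout=8000)`, `period_s` would stay well under 8 seconds, and the data-collection function would call `heartbeat.refresh()` wherever `last_refresh` is updated now.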

The software watchdog seems to be sufficient to handle the uplink losses that cause my main function to stop working (running almost 24 hours now, with 3 reboots from the watchdog).

That’s a great suggestion to add the hardware watchdog too, and I’ll definitely be implementing that next.

Whenever I have any kind of system like this, I plug it into one of those smart wall-plug adapters made for lamps, which let you switch the power to whatever is attached from an app, anywhere in the world (the plug usually needs to be connected to Wi-Fi, though).

They are in the 10-30 dollar range now, and if you use one just once, it has already paid for itself in travel costs.

Thank you for the suggestions about solving the uplink disconnect problem. I had been trying various approaches, but none worked reliably. I could not get a ping test to work reliably, so I have used anvil.pico.connect_async to test the connection, coupled with the WDT (watchdog timer) class. I have attached the test code in case it is of interest to anyone else looking to solve the problem.

import anvil.pico
import utime
import uasyncio as asyncio
from machine import Pin, WDT


SERVER_UPLINK_KEY = "server_*********************************"

# Use the system LED to indicate server calls and responses
led = Pin("LED", Pin.OUT, value=1)

# Some global variables
uplink_task = None
log_task = None
wdt = None


async def pico_log():
    global uplink_task
    global log_task
    global wdt

    while True:
        # check uplink still running and retrigger watchdog
        if uplink_task is not None:
            if uplink_task.done() is False:
                wdt.feed()

        # Blink LED
        led.toggle()
        await asyncio.sleep(0.5)

        # Add code below for logging task if required
        
        
@anvil.pico.callable(is_async=True)
async def pico_fn(n):
    # Output will go to the Pico W serial port
    print(f"Called local function with argument: {n}")

    # Blink the LED and then double the argument and return it.
    for i in range(10):
        led.toggle()
        await asyncio.sleep(0.05)
    return n * 2


async def startup():
    global uplink_task
    global log_task
    global wdt

    # Connect the Anvil Uplink.
    uplink = anvil.pico.connect_async(SERVER_UPLINK_KEY)
    uplink_task = asyncio.create_task(uplink)

    await asyncio.sleep(30)  # Wait 30 seconds before enabling watchdog
    wdt = WDT(timeout=8000)  # Enable it with a timeout of 8s - this is the max for the Pico

    # Start logging
    log_task = asyncio.create_task(pico_log())

    # Loop forever to keep the tasks running
    while True:
        # Call pico_fn(n) function here:
        result = await pico_fn(42)
        print(f"Result: {result}")

        # Loop is set to 50 seconds to repeat connection check
        await asyncio.sleep(50) 


# Run the startup task
asyncio.run(startup())