It seems background tasks can't see globals?

I have a background task which seems to treat a global variable as a local one, even though it is declared global both at the start of the Server Module and at the start of the background task function.

# Globals
GB_RTS_DF = pd.DataFrame()
GB_RIS_DF = pd.DataFrame()
GB_ORG_DF = pd.DataFrame()

@anvil.server.callable
def loadRaw(company):
  global GB_ORG_DF
  task = anvil.server.launch_background_task('loadRawBG', company)
  print(len(GB_ORG_DF.index))
  
  return task


@anvil.server.background_task
def loadRawBG(company):
  global GB_RTS_DF
  global GB_RIS_DF
  global GB_ORG_DF
  
  log_entry = ''
  if len(GB_RTS_DF.index) == 0:
    f_gdrive = app_files.tracks_20190821_csv 
    f_bytes = f_gdrive.get_bytes()
    f_contents = io.BytesIO(f_bytes)
    GB_RTS_DF = pd.read_csv(f_contents)
    # Now add a ParentOrg column
    GB_RTS_DF['ParentOrg'] = ''
    for r, row in GB_RTS_DF.iterrows():
      DeviceAllocOrgId = row['DeviceAllocOrgId']
      if not math.isnan(DeviceAllocOrgId):
        print(GB_ORG_DF.columns)
        print(len(GB_ORG_DF.index))

When I print the length of the global dataframe GB_ORG_DF in the background task it’s 0.
From the normal call it’s 1150.

I wonder if the two functions are seeing entirely different copies of these global variables.

As if these were two distinct runs of the same module.

Anvil does provide a global cache, named anvil.server.session, that may serve your needs. See here.


I think the problem is caused by the fact that the scope of the global variables is limited to a request.

If you want to load some global variable once and use it in the following requests, you should try keeping the server running. You can do this by opening any server module and checking the “Keep the server running” option.

Then you need to be careful and make sure that code running on different request threads doesn’t modify the data structure at the same time.


I’ve designed the globals to take advantage of persistent calls as Stefano suggests. And so far they work fine but not with background tasks.

@stefano.menci is absolutely right about globals and the use of “Keep the server running”, which I assume you have selected.

Bear in mind that this is just an optimisation - Anvil does not guarantee to run everything in the same process, and you will certainly need to re-initialise your globals sometimes. Background tasks are no exception, and like every other server function they should be capable of running in a new process. It’s possible that there is an issue on our end with dispatching Background Tasks to persistent server processes, and I will look into that, but in the meantime you should allow all your functions to initialise from scratch if necessary.

Hope that helps!


That makes sense, now that I think of it.

So using globals in this way is poor practice. Can you look into that, as you suggested?

What’s my alternative if I want to keep something large in memory? Keep passing it around?

Actually, using globals like this is a very good idea: it will speed up almost all of your server calls. But occasionally your server process will change and need re-initialising, so you should have some sort of initialisation function which gets called at the top of every server function (and Background Task). Its job would be to check whether the global data is there, and initialise it if not.

That way your app is safe when server processes come and go, and fast in the meantime!


Thanks gentlemen. And that’s given me a lot of confidence now to implement that initialisation check.

Just to confirm, currently, background tasks always run in a separate Python interpreter, so they’ll never share global variables with your server calls.

The reason is to do with timeouts: If a server call takes too long, we need to kill it rather than letting it spin forever. Background tasks, however, can run for a long time. If we ran them in the same Python interpreter as your standard server functions, we would have to kill your background tasks whenever a server call timed out. (Likewise, if you called kill() on a background task, it could take out several other server calls as collateral damage! We try to avoid giving you unpleasant surprises like that :wink: )

To avoid these issues, we run each background task in a separate Python process. But the downside is that background tasks don’t share global variables with your server calls.
