Multiple Background Tasks on separate CPU-cores – Parallelising background tasks

Hi guys,

I have a background task running 24/7 (it is monitored and restarted so it doesn’t die unnoticed). This background task is basically my engine, handling the AI generation. As new rows come into my table, it works through them (the rows in the table are my AI queue).

As I am on a dedicated server, I have 4 cores available. I would love to have this task run 4 times, once on each of these cores. I can split the queue by assigning each row a number between 1 and 4. What I do not know is how to ensure that each background task runs on its own core rather than all of them clogging up one core.

Will this happen automatically? Is this possible? Any advice would be greatly appreciated!

Thanks,

David

Yes.

Having 4 threads (tasks) using 100% of 1 core each on a 4-core machine will leave 0 cores for the HTTP server, the database server, and the hundreds of other tasks the machine constantly runs. Those tasks may use very little CPU, but that little CPU should always be available. Otherwise the transactions will slow down and start having conflicts, which slows them down even more, and the whole world will stop.

You never know which core does which job, but, assuming that those threads use 100% CPU, I would start with 3 threads.
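As a rough sketch of that rule of thumb: the helper name `worker_count` is made up for illustration, and reserving exactly one core is an assumption rather than a measured figure.

```python
import os

def worker_count(total_cores=None, reserved=1):
    """Return how many CPU-heavy workers to start, leaving `reserved` cores free."""
    if total_cores is None:
        total_cores = os.cpu_count() or 1  # os.cpu_count() can return None
    # Always start at least one worker, even on a single-core machine
    return max(1, total_cores - reserved)

print(worker_count(4))  # 3 workers on a 4-core machine
```

If the workers turn out to use much less than a full core each, the reserved count could probably be lowered after watching the real load.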

As per post #8 in “Is it safe to use asyncio on the server?”, multithreading isn’t supported in Anvil server code (because the servers themselves use multithreading, as you note).

Have you found a way to use multithreading safely?

Thanks for the quick reply @stefano.menci! Actually they don’t use 100% of the CPU, but I do like the idea of being cautious and using just 3 cores. But would it work, or would the load not be divided over the cores?

I suspect you’ll have more control over Uplink programs.

This particular example seems to work. I haven’t tested it properly, but it may be something you can check out. Of course, as mentioned here, this may not be safe to use, so do so at your own risk and only implement it in production after careful testing.

import anvil.server
import asyncio
import threading

@anvil.server.background_task
def run_multiple_threads():

    def thread_function(thread_no):
        loop = asyncio.new_event_loop()  # Create a new event loop for this thread
        asyncio.set_event_loop(loop)

        async def print_in_thread():  # This coroutine runs in every thread
            while True:
                print(f"Message from thread {thread_no}")
                await asyncio.sleep(2)

        loop.run_until_complete(print_in_thread())

    threads = []

    for i in range(3):  # Create 3 threads
        t = threading.Thread(target=thread_function, args=(i,))
        threads.append(t)
        t.start()

    for t in threads:
        t.join()  # Block until every thread finishes (never here, since the loops are infinite)

It is also worth adding that if you are trying to do something that is not particularly heavy on the CPU, you can create multiple async functions within a single background task. This is a safer approach. Are you tracking the CPU usage for your tasks?
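A minimal sketch of that single-task approach, using only the standard library. The `worker` coroutine and its parameters are placeholders for illustration; in an Anvil app the call to `main()` would sit inside one background task.

```python
import asyncio

async def worker(worker_no, rounds=3, delay=0.01):
    """Hypothetical lightweight job: does `rounds` small steps, yielding between them."""
    done = 0
    for _ in range(rounds):
        await asyncio.sleep(delay)  # stands in for waiting on I/O
        done += 1
    return worker_no, done

async def main(n_workers=3):
    # gather() runs the coroutines concurrently on one thread and one event loop,
    # and returns their results in the order the coroutines were passed in
    return await asyncio.gather(*(worker(i) for i in range(n_workers)))

results = asyncio.run(main())
print(results)
```

Note that all the coroutines share one thread, so this only helps when the work mostly waits (on the network, a database, and so on). It does not spread CPU-bound work over several cores.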

Right, forget what I said earlier; that would apply to processes, not threads/tasks. As far as I know, a dedicated server has only one Anvil server, which is only one process, which means all the threads for all the tasks for all the apps run on the same core.

Tasks are threads, and they all run in the same process as the main server, so 4 background tasks will all run on the same core, and you will only use 25% of your 4 available cores.

If you want to use multiple cores, you can’t use tasks or threads or async, you need multiple python processes.

As @p.colbert mentioned, you could have multiple uplink processes running in parallel, either on the same computer or on different computers.

I’ve never tried running other processes on my dedicated server. I have other VMs for my uplinks, because they need local resources. In your case, since you don’t need local resources, you could run the processes on the same machine the Anvil server runs on (that’s the very point of this question).

Perhaps you can figure out a way to add your scripts to the repository. Those scripts can’t be in the server_code directory, otherwise they would be imported every time the app starts. Perhaps you could put them in a table, have a server module create a file in /tmp, and run a process with that file.

The app’s server code would spawn 3 new processes, each running your script that connects with an uplink key. A scheduled task would check that they are still up, maybe trying to call one of their callable functions.
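A rough standard-library sketch of that idea. The `WORKER_SOURCE` string stands in for the real uplink script, and the helper names `spawn_workers` and `is_alive` are hypothetical. Here the workers exit immediately so the example finishes; a real worker would keep running and the scheduled check would call something like `is_alive`.

```python
import os
import subprocess
import sys
import tempfile

# Placeholder for the real worker script (which would connect with an uplink key).
WORKER_SOURCE = "import sys\nprint('worker', sys.argv[1], 'started')\n"

def spawn_workers(source, count=3):
    """Write `source` to a temp file and start `count` Python processes running it."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source)
        path = f.name
    procs = [
        subprocess.Popen([sys.executable, path, str(i)],
                         stdout=subprocess.PIPE, text=True)
        for i in range(count)
    ]
    return path, procs

def is_alive(proc):
    """A process is still running while poll() returns None."""
    return proc.poll() is None

path, procs = spawn_workers(WORKER_SOURCE)
# Wait for each worker to exit and collect what it printed
outputs = [p.communicate(timeout=30)[0].strip() for p in procs]
exit_codes = [p.returncode for p in procs]
os.remove(path)  # clean up the temp script ourselves
print(outputs, exit_codes)
```

Each `Popen` call starts a separate operating-system process, so the scheduler is free to place them on different cores. Nothing here restarts a dead worker or kills a stuck one; that bookkeeping would be up to your own code.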

This is dangerous. Anvil has a refined way to ensure that no threads are stuck and you don’t run out of resources. And if something goes wrong, they kill the server process and restart from scratch. But if you screw up and leave hundreds of your processes hanging, there is no one doing the housekeeping for you. So… you are playing with fire.

To clarify @davidtopf2: Yes, this will work!

Each Background Task runs in its own process (this is so that it’s possible to kill them without taking out other tasks).
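Putting that together with the original plan of numbering the rows, a hedged sketch might look like the following. The task name `process_queue`, the integer row id, and both helper names are assumptions for illustration. The launcher is passed in as a parameter so the sketch runs anywhere; on Anvil you would pass `anvil.server.launch_background_task`.

```python
def shard_of(row_id, n_workers=4):
    """Map an integer row id to a worker number between 1 and n_workers."""
    return row_id % n_workers + 1

def launch_workers(launch, task_name="process_queue", n_workers=4):
    """Start one task per worker number using the supplied `launch` callable."""
    return [launch(task_name, worker_no) for worker_no in range(1, n_workers + 1)]

# Stand-in launcher so the example runs without Anvil installed.
def fake_launch(task_name, worker_no):
    return f"{task_name}:{worker_no}"

print(launch_workers(fake_launch))
print([shard_of(i) for i in range(8)])
```

Each launched task would then only pick up the rows whose number matches its own `worker_no`, so no two tasks work on the same row.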

OK, then forget what I said when I said forget what I said earlier!

Thanks so much for the help, guys!