Is it reasonable that with 2GB of memory I can only run 15 simultaneous background tasks?
This hinges heavily on your definition of “reasonable”, and whether it agrees with that of the Python ecosystem, but each background task is a separate Python process. (This is true even when using the Persistent Server - it’s so that you can kill tasks individually if they go rogue, which isn’t something you can do to a thread.) And if each one weighs in at ~133MB, then 15 of them would add up to 2GB. If you’re loading hefty data at startup and/or big Python libraries, 133MB doesn’t sound impossible to me. You might want to try to optimise that (e.g. if you have global initialisation code in your Server Modules that allocates memory, make it “lazy” so it only allocates the first time it’s used, that sort of thing).
If the thing you want to parallelise is your OpenAI API calls rather than any local processing, then there’s an alternative you might consider, and that’s using Python’s async support. A little Googling suggests that OpenAI has an async-capable API these days, and async is supported in Anvil server code.
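The basic shape of that approach is fanning out the calls with `asyncio.gather`. A sketch, using a stub coroutine in place of the real API call (with the `openai` package you’d use its async client and await its completion method instead):

```python
import asyncio

async def call_openai(prompt):
    # Stand-in for an awaited OpenAI API call; the sleep simulates
    # the network latency of one request
    await asyncio.sleep(0.01)
    return f"response to {prompt!r}"

async def run_batch(prompts):
    # All calls overlap while each one waits on the network, so total
    # wall-clock time is roughly one call's latency, not the sum of all
    return await asyncio.gather(*(call_openai(p) for p in prompts))

results = asyncio.run(run_batch(["a", "b", "c"]))
```

Because the calls spend most of their time waiting on the network, this gets you concurrency within a single process, instead of one ~133MB process per task.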