I’m working on converting some of my ETL scripts usually run on Airflow to run directly in the app as scheduled tasks (which is awesome BTW).
While refactoring the code to work as a scheduled task (using the background task decorator) I thought it would be smoother to be able to run an entire module as a scheduled task. The way I think of these things is not as snippets of code but as files (modules).
When developing these tasks I use an IDE on my local machine then copy the code to Anvil. This is because I need to iteratively test lots of data code and an interactive REPL (I use JupyterLab) is very useful for this.
So the request is this…the ability to run an entire module as a scheduled task (or even background task). This way if there is a bug I can fix it in my IDE, then export the entire script and copy/paste it into Anvil. This is similar to the way I do it with Airflow.
The hacky workaround is to highlight the text and tab it over, then add the decorator at the top. Still, it would be nice to have.
1. Create an anvil module that defines a background_task decorator that does nothing.
2. Add an if __name__ ... block to call the background function when you run in your local IDE.
At this point the same module will run both locally and on the server: your IDE will be happy with the decorator that does nothing, while the Anvil server will ignore the if __name__ ... block.
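To make the idea concrete, here is a minimal sketch of the do-nothing decorator approach. It is a variation on the steps above: instead of a separate stub anvil module, it falls back to a local stand-in when anvil.server cannot be imported. The function name run_etl and its body are made up for illustration, and I'm assuming the decorator is used either bare or with a name argument.

```python
# Sketch only: one file that runs locally and as an Anvil Server Module.
try:
    # Available on the Anvil server (or locally if the Uplink package is installed).
    import anvil.server
    background_task = anvil.server.background_task
except ImportError:
    def background_task(fn_or_name=None):
        """Local stand-in: returns the decorated function unchanged."""
        if callable(fn_or_name):
            # Used as a bare decorator: @background_task
            return fn_or_name

        def wrap(fn):
            # Used with an argument: @background_task("some_name")
            return fn
        return wrap


@background_task
def run_etl():
    # Hypothetical placeholder for the real ETL steps.
    rows = [1, 2, 3]
    return sum(rows)


if __name__ == "__main__":
    # Only runs when the file is executed directly, e.g. from a local IDE.
    print(run_etl())
```

Locally the decorator is a no-op, so run_etl is just a plain function you can call and debug. Whether this fits your workflow depends on how much of the script can live inside one function.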
EDIT
Or you could split it into two modules: one with the background function, which imports the other module and executes its main function.
This would work if I were using a “normal” IDE. I don’t believe this would work given the cell structure of JupyterLab though (each cell is executed independently).
I like both solutions @stefano.menci, so thank you very much. I guess I should get used to the idea of having things in functions as opposed to “scripts”.
I did a quick test with the background function importing a module, but unfortunately Anvil imports all the server modules, regardless of which module the background function is defined in, before executing the background function. So I ended up importing the module twice.
This makes me think that there is another way: you make an app with two modules, one with a background function that does nothing and one with the code you want to drop in. When it’s time to run the background task, Anvil imports both modules, then executes the background function that does nothing.
It’s ugly because you need a whole app for each background task, but it may be what you are looking for.
Thanks for testing it. For now it makes more sense to stick with copy-pasting line by line into a background task. It makes debugging harder, but it’s the simplest solution I suppose.
You might be able to do something using anvil.server.context.type (ref docs link), which tells you if you’re in the Uplink or a Server Module (of course, if your environment doesn’t have the Uplink installed, you don’t have anvil.server installed, which is also useful information!)
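A rough sketch of that check, in case it helps. The helper names detect_environment and should_run_inline are hypothetical, and the context type strings ("server_module", "uplink") are from my memory of the docs, so please verify them before relying on this.

```python
# Sketch only: decide at runtime where the code is executing.
def detect_environment():
    """Return "local" if anvil.server is missing, else the Anvil context type."""
    try:
        import anvil.server
    except ImportError:
        # No Uplink installed, so this must be a plain local run.
        return "local"
    # Assumption: this is a string such as "server_module" or "uplink".
    return str(anvil.server.context.type)


def should_run_inline():
    """Hypothetical helper: run the script body directly unless in a Server Module."""
    return detect_environment() != "server_module"


if __name__ == "__main__":
    print(detect_environment(), should_run_inline())
```

The idea would be to guard the script-style code with should_run_inline() so it only executes on import when you are outside a Server Module.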
Have you looked at the nbdev project by Jeremy Howard? Seems like a good use case: you develop iteratively in Jupyter, it creates the modules that you want and cleans out all the test / scratch code in the notebook to create a .py file, and you don’t have to do the cell-by-cell copy-paste thing. Works with git as well.
I’ve totally updated the way I run tasks now. I’m heavily data science focused, and I’ve switched to using Deepnote for my background tasks. They have great scheduled notebooks and the development is a breeze since it’s Jupyter compatible.
Anvil is still my go-to for apps. But it saves me a lot of time and frustration to move complex operations to something better suited to them than an app platform.
I’m currently building an app that does scraping, data cleaning, and machine learning all in scheduled notebooks on Deepnote, and then the final data is entered into Anvil. This way the app isn’t burdened by long-running or CPU-bound tasks.
I still haven’t used nbdev (but I’m a huge fan of Fast.ai). I’ve been considering it for the creation of a data visualization package on top of Matplotlib.