Provide explicit ways to prevent concurrency

hugetim · April 29, 2022, 10:28pm

On the client-side, Anvil currently treats all functions as coroutines and behaves as if await keywords were present in front of certain lines of code (lines of code which may correspond with the “blocking code” described here, but may be much broader than that).

Please add an app-wide option to instead have all functions behave like uninterrupted subroutines (or, in other words, without implicit await keywords in front of server calls and other “blocking code”) and make this the default for new apps, particularly for new users.

For apps in which this option is not enabled, please also provide client-side analogues to the Data Tables in_transaction context manager and decorator. (Within such “client-side transactions,” the code should run uninterrupted–and with no other code running concurrently–that is, without implicit await keywords in front of server calls and other “blocking code.”)

Additional discussion here:

p.colbert · April 30, 2022, 5:30pm

Purpose: to ensure that the “in-transaction” code does not get interrupted by other code (e.g., a response to a button click or timer) until it completes.

This would be used to update multiple Client-side variables, where all updates must be completed (or rolled back), to keep the App’s in-memory variables in a logically-consistent state.

The classical example, in an on-line game: transferring (simulated) funds, to debit one account and credit another. Things can get royally botched if any code sees the transaction variables in a half-completed state!

It should be possible for these “transactions” to nest cleanly.

Rationale
Currently, most code relies on the brevity of such (implicit) transactions to protect them from unexpected interlopers. However, when such transactions include server calls or database activity, they’re no longer brief. The resulting errors, being extremely timing-sensitive, are nearly impossible to reproduce. The aim here is to let the developers prevent such devilish situations in the first place.

junderhill · May 3, 2022, 2:15am

I think this is a bad idea. Moreover, I think this, along with the referenced discussion, misses a couple of important points:

Python is a synchronous language. Generally speaking, your program is executed one statement at a time, in order, which makes it easy to reason about correctness, particularly for beginning programmers. Anvil, via the underlying compiler, Skulpt, bends over backwards to preserve this characteristic, while compiling your code into JavaScript–a non-blocking asynchronous language in which things can easily happen out of order, causing endless problems.
Any flaws in this approach which bleed through (such as UI events being triggered while a server call is running) are a result of the nature of the underlying environment, not Anvil, and we should work to prevent those problems intruding into the orderly world of Python, not the other way around.

It’s sometimes useful to have multiple things running, or waiting, at the same time, but this is very hard to get right, as anyone who’s built a non-trivial JavaScript application (or a multi-threaded real-time application) knows. Yet most applications don’t need concurrency, most of the time.

I’ve built several production applications in Anvil, some with custom JavaScript libraries that talk to external services (like Google Firebase) from the front end. In all cases we flatten out the concurrency in JavaScript wrappers, so the app behaves in a predictable, easy-to-understand manner. If UI events could cause concurrency problems while a server call is running, we simply block that by disabling the component, or by setting a flag.

In cases where you don’t want to wait, more sophisticated approaches are possible. You can achieve pseudo-concurrency in Anvil any time using JavaScript, which you can call into, and out of, at will, in Anvil. Remember, under the hood, once it’s running, it’s all JavaScript. But why on earth would you want to open up this can of worms in Anvil, by default?

No offense intended to anyone, love you all, YMMV. I just really hate worrying about concurrency in a data processing app.

John

stefano.menci · May 3, 2022, 4:47am

I agree with @junderhill.
Let’s keep it simple.
If you really need multi-threading there are other ways, but you usually can do without, and introducing uncontrollable behaviors is just looking for trouble.

… and using timers or time.sleep(0). Not as flexible as JavaScript, but much easier, and most of the time it works just fine.

I have an app that crunches numbers for minutes on the client side. It uses a timer to split the task into discrete chunks (when I made it I didn’t know I could have used time.sleep(0)). At the end of each chunk it updates the progress on a component, the event loop takes control, updates the UI, takes care of pending events and, after 0.01 seconds, the next tick event starts the next chunk.

It’s easy to manage. It’s similar to async, where I give the control away when it’s OK to give the control away.
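
The chunking approach described above can be sketched in plain Python. This is only an illustration, not the poster's actual app: `crunch_in_chunks`, `ChunkedJob`, and `on_tick` are hypothetical names, and in a real Anvil form `on_tick` would be the Timer's tick handler.

```python
from typing import Iterator, List


def crunch_in_chunks(numbers: List[int], chunk_size: int) -> Iterator[float]:
    """Process `numbers` one chunk at a time, yielding progress (0.0 to 1.0).

    Each `yield` marks a point where it is safe to hand control back
    to the event loop.
    """
    total = 0
    for start in range(0, len(numbers), chunk_size):
        total += sum(x * x for x in numbers[start:start + chunk_size])
        done = min(start + chunk_size, len(numbers))
        yield done / len(numbers)
    # The final result is delivered through StopIteration.value.
    return total


class ChunkedJob:
    """Hypothetical stand-in for a form that owns a Timer."""

    def __init__(self, numbers, chunk_size):
        self._work = crunch_in_chunks(numbers, chunk_size)
        self.progress = 0.0
        self.result = None
        self.finished = False

    def on_tick(self):
        """Run exactly one chunk per tick, then return to the event loop."""
        if self.finished:
            return
        try:
            self.progress = next(self._work)
        except StopIteration as stop:
            self.result = stop.value
            self.finished = True


job = ChunkedJob(list(range(10)), chunk_size=4)
ticks = 0
while not job.finished:  # stands in for the Timer firing repeatedly
    job.on_tick()
    ticks += 1
```

Because the work only advances inside `on_tick`, other events can only run between chunks, never in the middle of one.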

The only problem I have is when there is a server call waiting and the tick event takes longer than 30 seconds (see here). But the solution to that problem is to take care of the value returned by the server call while the tick event is running, not to execute the lines of code that follow the server call.

For example if an event executes x = server.call('f'), and then, while the client is waiting for f to return, another event starts a long running task, Anvil should take care of the server call without waiting for the long running task to finish, but should assign the resulting value to x only after the long running task has finished or gives control by calling time.sleep(0).

In my previous post I was complaining about the error, but I wasn’t asking for other code to run unexpectedly without my consent.

p.colbert · May 3, 2022, 11:32am

This is precisely what we’re trying to accomplish. The title reflects the fact that we currently have some uncontrolled concurrency, and we would like to have clean, simple ways to limit or eliminate it, as shown here:

Emphasis mine. The goal is to prevent concurrency, by default – except where the developer (or the task) demands otherwise.

We hope that the latter case will be rare. However, when it does happen, we should at least be able to prevent concurrency for critical areas of code, i.e., via the in-memory equivalent of (nestable) transactions.

hugetim · May 3, 2022, 1:06pm

Right, I feel like we’re really talking past each other, @junderhill and @stefano.menci. It seems you are both acknowledging that Anvil’s current behavior (in which JavaScript’s non-blocking, asynchronous nature bleeds through any time there is a server call) is not ideal. But, as experienced Anvil/JavaScript programmers, you have learned how to cope with this (undocumented) behavior with practices like:

That may seem simple to you, but for me as a newbie Python programmer coming to Anvil a few years ago trying to build my first app, it was bewildering to have this implicit concurrency be causing errors.

(Expand for more on my specific experience.)

My main form had a status attribute and a seconds_left attribute. In the “normal” status, the seconds_left was meant to be irrelevant, but in the waiting status, the seconds_left was meant to count down to zero, at which point a server call would be triggered to report that the user was unresponsive. A Timer called every 5 seconds would check whether the status was waiting and seconds_left was <= 0. (I also had another Timer whose job it was to just count seconds_left down toward zero, once every second, when the status was “waiting.”)

A button click would trigger the status change from “normal” to “waiting” and reset seconds_left to something like 30 seconds, along with sending some information to the server. Usually this worked fine. But occasionally when the button was clicked, the user would unexpectedly be shown the notification immediately, saying that the timer had run out.

Just as @p.colbert describes, I had trouble replicating the error, which made it difficult to debug. I made a number of changes to the code based on false theories of what was going wrong, wasting time and unnecessarily complicating the code.

Eventually, though, I did hit on the idea that the issue was caused by changing self.status to “waiting” before setting self.seconds_left to a positive number, so if the 5-second Timer happened to tick between those two attribute changes (which was apparently made more likely by a server call I had placed in between), it would find self.status == "waiting" and self.seconds_left == 0, wrongly inferring that the user was unresponsive. Thus I was finally able to prevent the issue (most of the time, at least–but I still wasn’t clear on how the implicit concurrency was working, so I wasn’t sure I had fully fixed it).
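
The ordering problem described above can be reproduced deterministically in regular Python. The sketch below uses asyncio purely as an analogy for client-side suspension points; it is not how Anvil or Skulpt is implemented, and `FormState`, `fake_server_call`, `click_buggy`, and `click_fixed` are hypothetical names.

```python
import asyncio


class FormState:
    """Hypothetical stand-in for the form's two attributes."""

    def __init__(self):
        self.status = "normal"
        self.seconds_left = 0
        self.false_alarms = 0

    def timer_tick(self):
        # Mirrors the 5-second Timer check described above.
        if self.status == "waiting" and self.seconds_left <= 0:
            self.false_alarms += 1


async def fake_server_call():
    # Stands in for a server call: control returns to the event loop here.
    await asyncio.sleep(0)


async def click_buggy(state):
    state.status = "waiting"    # half-updated state becomes visible...
    await fake_server_call()    # ...while other handlers may run here
    state.seconds_left = 30


async def click_fixed(state):
    await fake_server_call()    # do the slow part first
    state.seconds_left = 30     # then update both attributes together,
    state.status = "waiting"    # with no suspension point in between


async def run(click):
    state = FormState()
    task = asyncio.create_task(click(state))
    await asyncio.sleep(0)      # let the click handler reach its await
    state.timer_tick()          # the Timer happens to fire right now
    await task
    return state


buggy = asyncio.run(run(click_buggy))
fixed = asyncio.run(run(click_fixed))
```

In the buggy ordering the tick observes status "waiting" with seconds_left still 0. In the fixed ordering no suspension point separates the two assignments, so the tick can never see the half-updated state.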

It has taken me years (of part-time coding/learning) to come to understand programming best practices–and even just terminology–well enough to formulate this feature request (as poorly worded as it still probably seems to those with more expertise). Anvil’s default behavior should (further) protect its users from this complexity, if possible.

hugetim · May 3, 2022, 1:18pm

I changed the title (from “control concurrency” to “prevent concurrency”) to clarify that both parts of the request (1. make client-side Python functions synchronous app-wide by default, 2. otherwise, provide a way to mark specific client-side code as must-be-synchronous) are aimed at removing concurrency complications (at the potential cost of speed), not adding asynchronous capabilities (beyond what is already implicit in how Anvil currently works).

p.colbert · May 3, 2022, 1:29pm

As a practical matter, the difficulty with these workarounds is that you have to disable every visible component that could possibly interrupt. That includes components not inside self. Those may be siblings, or somewhere else in the component instance tree. (This is “especially difficult” on those occasions where get_open_form() returns None…)

Setting a flag means that every such component/handler must be coded to recognize that flag. Including components that you didn’t write.

The framework can identify potential offenders pretty easily, though. Or alter the way the event loop is processed, more likely.

That might be a difficult undertaking. It might be impractical. And there are likely higher-priority targets. We’ll continue to create workarounds, however flawed and leaky, in the meantime. But at least we’ve identified and communicated a need.

stefano.menci · May 3, 2022, 2:37pm

Mmmmh… if I understand, @hugetim you are not talking about blocking code that unleashes the next in line in the event loop, you are talking about another type of concurrency. Rereading the thread, it looks to me like @p.colbert understood, while @junderhill and I decided you were talking about lower level concurrency.

Just to clarify the two points of view, here are the three types of concurrency (very roughly described):

1. Multi-threading: one function runs and control passes to another function at any point in time.
   This is the difficult one to manage, and it doesn’t exist in JavaScript or Skulpt. No one was talking about this.
2. Async: one function runs and when it’s time to wait for a value to come back from the server or any other slow operation, it puts itself on hold and gives control to the next in line.
   This is easier to manage. @junderhill and I were talking about this.
3. Other functions are executed between events: the tick event does its job and changes some variables; between tick events other functions can run, do their job and change the same variables.
   @hugetim and @p.colbert were talking about this.

My opinion is that the third type is what keeps a form responsive and cannot be avoided.

Using the in_transaction way means that the events are put on hold and executed later, when the transaction ends. This could unleash a burst of clicks because the user kept clicking on a button that seemed unresponsive.

Perhaps a better solution would be something similar to the Application.EnableEvents = False in VBA, which prevents all the events from being fired?

The problem with that is, if something goes wrong and it’s not set back to True, the form will be frozen forever.

Perhaps it would be better to focus on what’s enabled rather than on what’s disabled.
What about something like this?

# disable all the events with the exception of a button and a timer
self.enable_events_only_in = [self.cancel_button, self.timer_worker]

# at this point only the events of the listed components are enabled
# and any other event is discarded (not queued)

# re-enable events on all components
self.enable_events_only_in = []
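
As a rough illustration of the whitelist idea, here is a sketch of how such a gate might behave. Nothing like this exists in Anvil today: `EventGate` and `dispatch` are hypothetical, and plain strings stand in for components.

```python
class EventGate:
    """Hypothetical dispatcher that discards events from non-whitelisted sources."""

    def __init__(self):
        self.enable_events_only_in = []  # an empty list means "everything enabled"
        self.discarded = []

    def dispatch(self, source, handler, *args, **kwargs):
        allowed = self.enable_events_only_in
        if allowed and source not in allowed:
            self.discarded.append(source)  # dropped, not queued
            return None
        return handler(*args, **kwargs)


gate = EventGate()
log = []

gate.enable_events_only_in = ["cancel_button", "timer_worker"]
gate.dispatch("save_button", log.append, "save clicked")      # discarded
gate.dispatch("cancel_button", log.append, "cancel clicked")  # runs

gate.enable_events_only_in = []                               # re-enable everything
gate.dispatch("save_button", log.append, "save clicked")      # runs
```

Because discarded events are never queued, re-enabling everything does not release a burst of stale clicks.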

hugetim · May 3, 2022, 4:26pm

I’m realizing that the feature request is unclear in terms of how exactly it’s proposing events should work. What I’ve had in mind, though, is indeed something like preventing “all the events from being fired”–with the visible UI components disabled. (The alternative, having the events be “put on hold and executed later,” strikes me as worse.)

This doesn’t strike me as a problem worth worrying about. My understanding is that the implementation (as a context manager) could ensure that, practically speaking, it always gets set back to True.
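
To illustrate why a context manager makes the reset reliable, here is a minimal sketch in regular Python. `events_suspended` and `events_enabled` are hypothetical names and not part of Anvil's API; the point is only that `try`/`finally` restores the state even when the body raises, and that a depth counter lets the blocks nest.

```python
from contextlib import contextmanager

_suspend_depth = 0  # how many nested "transactions" are currently open


def events_enabled():
    return _suspend_depth == 0


@contextmanager
def events_suspended():
    """Hypothetical context manager; not part of Anvil's API."""
    global _suspend_depth
    _suspend_depth += 1
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        _suspend_depth -= 1


states = []
try:
    with events_suspended():
        with events_suspended():         # nesting is allowed
            states.append(events_enabled())
        states.append(events_enabled())  # still inside the outer block
        raise RuntimeError("something went wrong")
except RuntimeError:
    pass
states.append(events_enabled())          # restored despite the error
```

A real implementation would also have to decide what happens to events that arrive while the depth is non-zero (discard or queue), which is the open question in this thread.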

Do I understand correctly that you are proposing an API for the “transactions” envisioned in this feature request?

In terms of the three types of concurrency you lay out, I understand the distinction between 1 and 2, but I’m having trouble understanding what you mean in 3–and why you say it “cannot be avoided.” (And, relatedly, I suppose, I don’t understand why you say you were talking about 2 whereas I was talking about 3. I think of Anvil’s current functioning as being described by 2. Also, I have edited my OP to try to clarify the main idea of feature request.)

p.s. I don’t understand why the browser would need to be responsive at all (in terms of Timer ticks and button clicks) during a server.call (as opposed to server.call_s). To me, showing the spinner suggests the browser will not be responsive (and I was surprised to recently learn that the browser is responsive despite the spinner).

ianb · May 3, 2022, 4:47pm

I like the idea of a context manager, like how the current no_loading_indicator works. Would it make everything hold at once until the with block concludes?

I never considered the underlying problem a problem, I just kind of always lived with the fact that websites work that way, and users click all over the place, all the time.
That might just be me coming from a LAMP stack background (embedded php, etc) so front-end and back-end to me are just different universes with their own set of expected “how they work” rules.

stefano.menci · May 3, 2022, 5:00pm

2. Is when the tick event is fired and, in the middle of the event, there is a server call or you call time.sleep(0) or other blocking code, and control goes back to the event loop.
3. Is when the tick event is fired, the event does whatever it needs to do and, after it ends, control goes back to the event loop.

I thought you were referring to 3. because you were talking about 5 second intervals.

If you really are talking about 3., where you want to “freeze” something (the form, some data structures, …), while a timer keeps ticking until something happens, then you can’t use a context manager across two executions of the tick event.

If you are talking about 2., then… are you looking for a way to say “I’m going to use blocking code, please don’t execute anything else until I’m done”?

In this case a context manager would make sense, but… there is the risk that if you have a communication problem the app becomes unresponsive. I guess the context manager will have a timeout and eventually abort the transaction and return to normal, so no, there is no problem with that. I like it.

No, since I didn’t understand the request, I thought about how to prevent stuff from happening when you don’t want it to happen. Disabling each single component would be a nightmare as @p.colbert mentioned. I think that it would be easier to focus on what you want to keep enabled.

Basically instead of saying “do not execute any code until I say so” (your request) I was suggesting “do not fire events on any non-whitelisted component” (my self.enable_events_only_in).

Using a context manager is more elegant and watertight. Using a whitelist could leak some code execution that is already pending, but it could be used across two executions of the tick event.

Sorry, I didn’t want to hijack your request. I thought you were asking for some more advanced ways to create concurrent processes.

p.colbert · May 3, 2022, 5:14pm

I’m quite happy with the discussion here, as it makes things clearer for all readers to come.

junderhill · May 4, 2022, 10:16am

@hugetim I apologize if I misunderstood your original post–to control concurrency implied to me enabling it further (which I think is a bad idea for Anvil), as well as preventing it. Prevent concurrency is a more precise title, but I don’t think this is practical. Anvil already prevents concurrency as much as it can, and I don’t think it can prevent it any further. I’ll try to explain, and I’ll try to be helpful, rather than just taking pot shots at Ryan Dahl.

JavaScript was originally designed to run on computers with only one core (single-threaded CPUs). In that context, if you block the main code path, the computer will hang. So JavaScript was designed as a non-blocking language, using events and callbacks for anything that might take time to complete. Even today, in a modern browser, your web app runs as a single thread. You can put your script in a tight loop, and this will lock up the tab, but you cannot write any statement that actually waits for something to happen.

Although it’s a slight oversimplification, you can think of each JavaScript statement as atomic. Because there’s only one thread, nothing else in your program can interrupt it until it’s finished. In this sense there is no real concurrency in JavaScript, and you never have to worry about realtime issues, like data corruption. You just need to worry about order of operation, due to the multiple code paths created by the ubiquitous callbacks.

Now, Anvil compiles a single Python statement into many JavaScript statements. For example, this line:

print "hello world"

compiles into this:

var $scope0 = (function($modname) {
    var $blk = 0,
        $exc = [],
        $gbl = {},
        $loc = $gbl,
        $err = undefined;
    $gbl.__name__ = $modname;
    Sk.globals = $gbl;
    try {
        while (true) {
            try {
                switch ($blk) {
                case 0:
                    /* --- module entry --- */
                    //
                    // line 1:
                    // print "hello world"
                    // ^
                    //
                    Sk.currLineNo = 1;
                    Sk.currColNo = 0

                    Sk.currFilename = './simple.py';

                    var $str1 = new Sk.builtins['str']('hello world');
                    Sk.misceval.print_(new Sk.builtins['str']($str1).v);
                    Sk.misceval.print_("\n");
                    return $loc;
                    throw new Sk.builtin.SystemError('internal error: unterminated block');
                }
            } catch (err) {
                if ($exc.length > 0) {
                    $err = err;
                    $blk = $exc.pop();
                    continue;
                } else {
                    throw err;
                }
            }
        }
    } catch (err) {
        if (err instanceof Sk.builtin.SystemExit && !Sk.throwSystemExit) {
            Sk.misceval.print_(err.toString() + '\n');
            return $loc;
        } else {
            throw err;
        }
    }
});

The resulting function, however, is executed by the JavaScript interpreter on a single trip through the event loop (someone like @stucork might be able to verify this), so each Python statement is also, for practical purposes, atomic. In other words, there’s no real concurrency in Anvil Python, either.

Except that that’s not quite true. Python is a synchronous language that always waits for I/O to complete before moving on. In other words, it blocks. How do you compile a blocking language into a non-blocking language? Skulpt uses a trick called a Suspension. Essentially, it suspends the Python statement until the I/O completes, and then resumes it where it left off. In the meantime, however, other things can happen, and this is where the order of operations problem bleeds through into your Anvil app. UI events, which can happen at any time, also cause this problem.

What you want is for Anvil to stop any events from happening, or any alternate code paths from executing, until an entire ‘critical section’ defined by you completes. But there’s no way to stop the normal flow of execution in JavaScript. What you can do is to write a JavaScript function to do the uninterruptable work, and call it from Anvil.

And you can take action, at the start of an event, to prevent conflicting behavior. Let me take your example from your last post: You have a timer running, and a button the user can click. They are essentially in a race. When the button is pressed, the first thing you should do is to disable the timer, and then disable the button. Now you can proceed safely. If the timer fires first, you should immediately block the action of the button, or disable it; and don’t schedule the next tick of the timer until you’re done with whatever you’re going to do in this case. More complicated scenarios require more sophisticated logic, but the basic idea will work, if you think it through.

This is not the end of the story. Browsers are evolving. You can make any JavaScript function asynchronous with Promises. You can schedule things to complete before the next iteration of the event loop with Microtasks. You can create real background tasks with Service Workers. But all this requires even more coordination–you can’t get away from that.

Personally, I’d love to see browsers natively support many different languages, including and especially Python. I don’t see why CPython can’t be baked into a browser. If I were a younger man I would write a browser with Python in it, with real threads for concurrency. That would solve this and many other problems.

John

owen.campbell · May 4, 2022, 11:53am

If I were a younger man I would …

There’s another thread all by itself!

stefano.menci · May 4, 2022, 1:30pm

I agree it would be a bad idea, let’s keep Anvil simple.

In a feature request I will leave it to the powers that be to decide whether it’s practical or feasible. My speculation about what’s going on under the hood might be wrong and they might find a little trick that does the job.

Here is where I ask what I want, they will decide if it’s technically possible, worth spending time on it, part of their long term plans, and maybe, perhaps, implement it.

I have been bitten by events unexpectedly sneaking through the cracks, and I had to find workarounds.
If this feature request had been already addressed, I would have fewer scars.

I think that a context manager would be an elegant solution, but wouldn’t cover all the cases.

I think that a white list would be a less elegant solution, but would cover more cases (it would survive across two tick events for example).

I don’t know if either is practical for Anvil to implement, but I know they would definitely be practical for me to use.

p.colbert · May 4, 2022, 2:18pm

I love what you’ve done to describe the situation. It’s a breath of fresh air to have this out in the open, and easy to follow.

If I read your description correctly, Skulpt will transpile the single statement

account1 -= amount_to_transfer

into a single, uninterruptable block of JavaScript code. Skulpt could provide a context-manager, e.g.,

with skulpt_interrupts_suspended() as transaction:
    # Transfer of funds; must be atomic!
    account1 -= amount_to_transfer
    account2 += amount_to_transfer

so that both go into the same uninterruptable block. The transpiler would have to recognize this as a special case, but it doesn’t need to introduce any new syntax.

Is there such a thing already? If so, does it play well with Anvil?

junderhill · May 4, 2022, 7:31pm

@p.colbert This formulation of the request makes sense to me.

stucork · May 5, 2022, 4:18am

Worth clarifying what we mean by blocking code

Blocking code:

When the docs talk about blocking code, they are referring to what would be blocking code in regular Python. In Anvil that is most commonly time.sleep() or anvil.server.call().

In client-side Python (Skulpt), code that would block in regular Python cannot block because that’s not possible in a JavaScript runtime. So it “Suspends”. Suspensions are a bit like the yield statement in a Python generator. The compiler remembers the state at the point the code suspended and when the code that would block in regular Python is ready with a result (and when the JavaScript runtime says it’s ok) the Suspension resumes.
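
The generator analogy can be made concrete in regular Python. This is only an illustration of the suspend-and-resume idea, not Skulpt's internal mechanism; `handler` and the log messages are made up for the example.

```python
def handler(name, log):
    """A handler that 'suspends' once, the way blocking code does client-side."""
    log.append(f"{name}: before server call")
    result = yield "server call pending"  # suspension point
    log.append(f"{name}: got {result}")


log = []
click = handler("click", log)
tick = handler("tick", log)

next(click)  # click runs until it suspends
next(tick)   # meanwhile, another handler gets to run
for suspended, value in ((click, "A"), (tick, "B")):
    try:
        suspended.send(value)  # resume where it left off
    except StopIteration:
        pass
```

The second handler starts before the first one finishes, which is exactly the interleaving this thread is about. Code between two suspension points still runs without interruption.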

If statements aren’t blocking in regular Python they won’t suspend in Skulpt.
e.g.

with skulpt_interrupts_suspended() as transaction:
    # Transfer of funds; must be atomic!
    account1 -= amount_to_transfer
    account2 += amount_to_transfer

won’t have an effect. Those two lines are not blocking in regular Python, so they won’t suspend in client-side Python and will always run in order as expected.

The demo in the linked post provides examples of blocking code in regular Python that suspends in client-side Python, through its use of sleep() and anvil.server.call(). These points of blocking code must suspend in the JavaScript runtime. There is no way to prevent this. You cannot block the JavaScript runtime with a server call or a call to time.sleep(). If it were possible, JavaScript would have a native sleep function (it doesn’t).

Most other Python - JavaScript browser options do not have a time.sleep() for this reason.

RustPython Demo - python written in rust compiled to WASM running in the browser.
- RuntimeError: panicked at 'can't sleep', library/std/src/sys/wasm/thread.rs:26:9
https://pyodide.org/en/stable/console.html
- time.sleep() is a dummy function and calling it has no effect
Brython interactive mode
- NotImplementedError: Blocking functions like time.sleep() are not supported in the browser

This ability to implement Suspensions was one of the main reasons Anvil chose Skulpt in the early days.
How can you do a blocking-like call to the server if your client-side Python doesn’t support blocking-like code?

Here’s the perfect 5 minute talk by @meredydd about client-side Python titled: “Compiling blocking Python to non-blocking JS”

JavaScript in its infinite wisdom is 100% non-blocking
~ Meredydd 2017

Anvil can manage its own event system but it can’t stop the browser from executing JavaScript.
If you click a Button while a server call is going on, the Browser still raises that click event.
Likewise a user is free to hit the browser back button while a server call is going on (think HashRouting).

There’s also a pay-off here between developer experience (DX) and user experience (UX).
If we want to prevent click events, then from a UX perspective, disabling the button might be the best approach. But that’s a bit annoying from a DX perspective.

Just as an experiment, here's an implementation that prevents Anvil event handlers from firing while another Anvil event is being handled. It also implements an API for disabling components that should be disabled during such handling of events.

It adapts the demo from the linked post:

import anvil.server

from ._anvil_designer import Form1Template
from event_utils import disable_in_event, event_safe


class Form1(Form1Template):
    def __init__(self, **properties):
        self.init_components(**properties)
        # these buttons are disabled while an event_safe handler is running
        disable_in_event(self.button_0, self.button_1, self.button_2, self.button_3)

    @event_safe
    def timer_1_tick(self, **event_args):
        # don't tick if there's another event_safe event going on
        print("tick")

    @event_safe
    def button_1_click(self, **event_args):
        """This method is called when the button is clicked"""
        anvil.server.call("my_long_func")

An event_safe event handler, when invoked, will disable the components that were registered with the disable_in_event function.
Other event_safe events will exit early if another event_safe event is going on.

Caveat to some of the above

If a process in client-side Python is running for a long time and has no blocking code, e.g. a long running while loop that increments a counter, then it may briefly suspend after it has been running for a long period of uninterrupted execution. This prevents infinite loops from locking up the browser.

stefano.menci · May 5, 2022, 5:43am

I really don’t like this!
If I need a process to be running “atomically” and don’t care about a frozen UI, I should be able to do it.
If I want to keep the UI responsive or I want to prevent the message from the browser about something taking too long, then I can explicitly call time.sleep(0) or use a timer to split the process into discrete chunks.

Pulling the rug from under my feet can have unexpected dangerous side effects.

stucork:

with skulpt_interrupts_suspended() as transaction:
    # Transfer of funds; must be atomic!
    account1 -= amount_to_transfer
    account2 += amount_to_transfer
won’t have an effect. Those two lines are not blocking in regular Python, so they won’t suspend in client-side Python and will always run in order as expected.

Right, but it would have an effect if amount_to_transfer was a function that contains blocking code.

Disabling many components is very annoying, and often impossible. Think of a repeating panel or an alert or a custom component containing nested components, or a combination of all the above. How would you cycle through all the components and disable them?

That’s why I was suggesting a white list rather than a black list. It’s easier to say “please discard all the events with the exception of the cancel button, a timer and that other component” than saying “please discard this event” to every single event in every single component.

I’m afraid that, while the latter could be done by adding a simple decorator(*) to every event handler (which is only possible on simple forms), the former would need some low level intervention on the event loop.

(*) I assume that’s what your event_safe decorator does, I’m on my cell now, I’ll check it tomorrow.