Simple anvil.users.get_user() call is slow on Accelerated Tables

I already have a similar post, “[Accelerated Tables] Getting Linked Column of Users Row”, but this is a slightly different discovery that I overlooked previously, so I am creating a new post for it (feel free to merge it into the existing one if you prefer).

It appears that the anvil.users.get_user() call still sometimes fetches a lot more than required.

For context, I have a fair amount of linked columns and simple object columns (Most of which I don’t need to bother with all the time).

Without accelerated tables, Anvil seems to fetch every column other than Linked columns and Simple Objects on its own (not the most ideal way, but all the big data is in Linked columns and Simple Objects anyway).

With Accelerated tables, it seems to fetch all simple object columns and empty row references for linked columns too (similar to an empty q.fetch_only()). This more than doubles the time taken for getting the user row.

On a high-speed WiFi connection, anvil.users.get_user() takes 1.2-1.5 seconds with accelerated tables and 0.6-0.7 seconds without them.

The best way would of course be to add fetch_only to anvil.users.get_user(). But until that happens, I’ll even settle for anvil.users.get_user() to not fetch simple objects or linked columns.

I’m very interested in this (and echo your plea to enable q.fetch_only() on anvil.users.get_user() ).

Can you clarify this paragraph a bit please? There are a couple of things I’m confused about.

Firstly, you say it fetches “empty row references for linked columns”. What are empty row references? Are these just “None” values? But it doesn’t fetch actual linked rows does it?

Secondly, you say “similar to an empty q.fetch_only()”. I thought an empty q.fetch_only() returned zero columns - isn’t that the point of it?

Thanks!

There is the same explanation for both points.

If you do something like

row = app_tables.table.get(q.fetch_only(), col = value)

You still get a row object. But as you said, it will not load any columns (unless you try to access the column). This is useful when you just want an empty reference to that row without needing any values of it (Use cases include comparing rows or storing the row in a data table).

Now if I have a user row with a col “A” that is a link to one or many rows, I would get these empty references when calling anvil.users.get_user() and accessing the value of col “A”.

Does that clear up the issue?

Yes, that’s really helpful, thank you.

So this seems like quite good behaviour, no? Presumably getting an empty row reference is very quick? If it’s not then I can see that’s an issue.

Yes, but it is still not as quick as not loading anything. That said, I think it is mainly the loading of simple object columns that slows down the anvil.users.get_user() call.

Is it imperative that you invoke get_user()? Or would a custom-built anvil.server.callable make a suitable replacement?

Can you explain more about that?

Is there any other way to find the user’s row in Users (or indeed in any data table)? I’d love to build a custom replacement if I could see a way to avoid using get_user().

Aside from being maintained largely by Anvil’s Users Service, the Users table is just like any other table. You can search it, with q.fetch_only, and cache the result server-side in the persistent Session object.

Edit: This wouldn’t eliminate the first get_user() call, but could streamline some of the subsequent calls.

Edit 2: As with any other cache, making sure that it stays up-to-date is often the hard part. If you’ve got a Persistent Server running, be aware that a single Persistent Server instance may serve many different users while it is running, so take care to keep each user’s cache data distinct.
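
For anyone following along, here is a minimal sketch of the "fetch once, then reuse from the session" idea. It is written as a plain function so it can be tried outside Anvil: `session` stands in for `anvil.server.session` and `fetch_user` for whatever narrow lookup you choose (for example a Users-table search using q.fetch_only). Both parameter names, and the "user" key, are assumptions made for illustration.

```python
from typing import Any, Callable, MutableMapping


def get_cached_user(session: MutableMapping[str, Any],
                    fetch_user: Callable[[], Any]) -> Any:
    """Return the user row cached in `session`, fetching it only if absent."""
    user = session.get("user")
    if user is None:
        # Nothing cached yet for this session: do the slow lookup once.
        # (If fetch_user() returns None, e.g. nobody is logged in,
        # the lookup is simply retried on the next call.)
        user = fetch_user()
        session["user"] = user
    return user
```

Because the cache lives in the session passed in, each session keeps its own copy. This does not remove the first lookup; it only avoids repeating it.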

I didn’t know about the persistent session object, that’s very helpful!

Edit: Oh gosh, your second edit is scary! So I can’t store user_id in the persistent server object? So I think we’re back to square one right - the only way I can find out which user some data pertains to is by calling get_user() again I think?

Edit again: Actually it looks like the session is per-user, so should be safe to store user_id in there.

I am already caching the user row for later use. The first call is the issue here.

Yes. And that id can be used as an index into Persistent Server’s other global data structures, if you create any.

Edit: To avoid keeping data in memory for logged-out users, a “time-to-live” approach would probably serve. If a user’s data hasn’t been asked for in, say, 35 minutes, its cache can be cleared.
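
A rough sketch of that time-to-live idea, in plain Python so it is independent of Anvil. The class name `UserTTLCache`, its methods, and the injectable `clock` are all hypothetical; the 35-minute default just mirrors the figure above.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class UserTTLCache:
    """Per-user cache; an entry is dropped once it goes unused for ttl_seconds."""

    def __init__(self, ttl_seconds: float = 35 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # user_id -> (time of last access, cached data)
        self._entries: Dict[Any, Tuple[float, Any]] = {}

    def put(self, user_id: Any, data: Any) -> None:
        self._entries[user_id] = (self._clock(), data)

    def get(self, user_id: Any) -> Optional[Any]:
        self.purge()
        entry = self._entries.get(user_id)
        if entry is None:
            return None
        _, data = entry
        # Reading counts as "asked for", so restart this user's timer.
        self._entries[user_id] = (self._clock(), data)
        return data

    def purge(self) -> None:
        """Remove every entry that has not been accessed within the TTL."""
        now = self._clock()
        stale = [uid for uid, (last, _) in self._entries.items()
                 if now - last > self._ttl]
        for uid in stale:
            del self._entries[uid]
```

Keying every entry by user id is what keeps one user's data distinct from another's inside a shared Persistent Server process.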

Btw a bit unrelated to the current discussion but here’s the workflow I would recommend if you want to cache user row but ensure that it remains updated.

  • Store the user row in both client side module and server side sessions
  • Whenever you want to make any updates to it, send the client side reference to the server side.
  • Check the client side row with the server side one to ensure that user hasn’t done any tampering.

For example:

Client Side

anvil.server.call("save_user_settings" , client_module.user, settings)

Server Side

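@anvil.server.callable
def save_user_settings(user, settings):
    # Only write if the row sent by the client matches the one cached in this session
    if user == anvil.server.session.get("user"):
        user['Settings'] = settings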

The best part about this is that our client side reference also gets updated.

After some thought, I realized that some caution is in order. Each session connects not to a user, but to a browser tab. It is entirely possible for someone to log out of a browser tab, while leaving the tab open.

The same browser tab can be later used to log in someone else. In the same App.

So, yes, you could keep the user_id in there. But always check to see whether that user_id has changed. If it has changed, then the prior user has logged out (of this session) and any of their data should be flushed, lest it be applied to the new user.
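
Something like the following could do that check. It is only a sketch: `session` is any dict-like object standing in for `anvil.server.session`, the key names in `USER_SCOPED_KEYS` are made up for illustration, and how you obtain `current_user_id` (for instance right after a login or logout) is left to you.

```python
from typing import Any, MutableMapping

# Hypothetical names of the per-user values kept in the session.
USER_SCOPED_KEYS = ("user", "settings")


def sync_session_user(session: MutableMapping[str, Any],
                      current_user_id: Any) -> bool:
    """Flush per-user cached values if the session now belongs to someone else.

    Returns True if a flush happened, False if the user is unchanged.
    """
    if session.get("user_id") == current_user_id:
        return False
    # Different user (or a logout): blank out the old user's data
    # so it can never be applied to the new user.
    for key in USER_SCOPED_KEYS:
        session[key] = None
    session["user_id"] = current_user_id
    return True
```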

I want to reinforce that the client must have at least one call to anvil.users.get_user() during startup, since that’s what triggers the password change mechanism when users click a password change link.

Caching it after that is good sense, since it changes at specific times only, so it’s easy to get user caching right.

In theory, it could change from one server call to the next. Between calls, the old user may have logged out of the browser tab, and a new one logged in on the same tab (same Session), with the same App.

Yes, when a user logs out it changes, when a user logs in it changes. That makes keeping a cached value up to date easy, regardless of where you’re keeping the cached value (client-side or server-side). Versus a value that may change out from under you (like the contents of a definition data table), where keeping the cached value up to date is harder.

All helpful stuff, thanks everyone.

@divyeshlakhotia, it seems to me that with the current tools we have available, the only solution is probably simply not to have any substantial data in the Users table, i.e. move your simple objects to a different data table and then store a link in Users. I’m sure you’re way ahead of me on this, and not disagreeing that this is frustrating!

I have another little question arising from the discussion of empty row references. Imagine I have an empty row reference (which I retrieved from a link column in Users, say), and I want to get a number of columns from it.

Currently, I’d write:

usersRow = anvil.users.get_user()
myRowReference = usersRow['linkToUserDataTable']
dataItem1 = myRowReference['item1']
dataItem2 = myRowReference['item2']
dataItem3 = myRowReference['item3']
dataItem4  = myRowReference['item4']
dataItem5  = myRowReference['item5']

I think those last five lines are going back and forth to the data table each time, so are quite inefficient. Can this be done more efficiently?

I don’t like unexplained magic. I never felt comfortable working with links to other tables without fully understanding when the round trips are triggered, so I never use linked rows.

I had colleagues adding columns to the user’s table in the past, with data or with linked tables, but I promptly asked them to remove those columns and keep things separated.

After 8 years the user’s table used by hundreds of apps has the columns generated by Anvil and a few columns used for permission management only. My strategy is that any link between two tables is managed by storing a key in one table, for example the user email, and doing two searches, one for the user and one for the linked table using the user email. Same strategy for any linked tables: never use linked columns.
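
To make that concrete, here is a small sketch of the key-based pattern. The tables are passed in as arguments so the shape is clear without assuming a schema: in an Anvil app they would be something like `app_tables.users` and a separate details table, and the `user_email` column name is purely an assumption for illustration. Only the `get(**filters)` and `search(**filters)` calls that data tables provide are used.

```python
def load_user_with_details(users_table, details_table, email):
    """Two explicit lookups joined by a shared key instead of a link column."""
    # Search 1: the user row, matched by email.
    user = users_table.get(email=email)
    if user is None:
        return None, []
    # Search 2: every detail row that stores the same email as its key.
    details = list(details_table.search(user_email=email))
    return user, details
```

The point of the design is that both round trips are visible in the code, so there is no question about when the database is being hit.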

Admittedly, my Users table has only a few hundred users, which could help make it faster. But I keep seeing posts about slow performance on the Users table and on app tables in general, and I have never had any performance problems, including with tables with hundreds of thousands of rows.

The reason I never liked or used linked columns is simply that they do some poorly documented magic, and (1) I like to have control over what happens and (2) adding one single line to fetch all the linked rows I need in one shot makes the code easier to read and most likely faster.

Well, this is not an answer, it’s just a little feedback, a data point for you.

I’m with @stefano.menci on this approach.

If I was going to use a link column, it would link to Users from the detail rows, not vice versa, precisely to avoid impacting get_user() performance, and to not open any security surprises.
