Speeding up Client Code to Avoid Timeouts and "Server code took too long"?

Hey- this worked for my initial functions. So thank you for chiming in.

However, when I went to carry on with the rest of my app that was already built, doing unrelated operations, I received “anvil.server.TimeoutError: Server code took too long” on functions that have been running live, in production, on my site for the PAST 3 WEEKS. Now I get this error EVERY SINGLE RUN. I even upgraded my plan from $54 USD to $329 USD per month, and I’m still getting timeouts.

I’m positive that Anvil implemented some material change in the past week that is causing these issues.

@meredydd Can you please chime in on this? My app is effectively crashing for my users who are now requesting refunds. I’m looking at an entire code overhaul on this project based on these timeouts. I’ve spent 6 months building this app, trialing it hundreds of times, and I have never received a timeout error except in the past week. This is killing my business.

You may want to email Anvil support, too. That’s the official channel for support requests. I don’t know if it will get a quicker response or not.

Thanks, @jshaffstall! I’ve actually emailed them already and they’ve said ‘please check the forums’, or to upgrade to a Support Plan. I’d request confirmation that there hasn’t been a change to their back end before upgrading to anything. I’m at a loss here. My apologies if I seem frustrated- it’s just that I’ve spent 6 months on this, and suddenly one day- for a reason no one can explain- none of it works.

Your frustration is understandable. I can say that upgrading to higher tiers of paid plans rarely solves performance problems, with the notable exception of performance problems relating to the spin up of server instances (importing expensive libraries, for example, can be mitigated with the persistent server option available at the business level and above). In general, though, higher level plans help you scale to more users, not help a single user get better performance.

So if you want faster support, downgrade your plan back to where you had it and funnel some of that into a support plan until you get the issues sorted out.


I understand the frustration, but at this point all you can do is figure out where the problem is.

You can wait for support to debug it for you, or you can start investigating by sprinkling prints and finding out what line times out.

You have already done it, but the code was still jumping back and forth between client and server.
What are the prints telling you now?
Is it always crashing at the same line?

When I deal with background tasks, analyzing the session logs created by the prints can be a pain, so I log by adding rows to a table rather than by printing to the standard app log.

If your app doesn’t use transactions, then you can create a function that adds a row to the log table and decorate it with @anvil.tables.in_transaction. If you do use transactions, then it can get tricky and you may be better off keeping your prints.
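For example, here is a minimal sketch of that logging pattern (assuming a hypothetical `logs` table with `timestamp` and `message` columns):

```python
import datetime
import anvil.tables
from anvil.tables import app_tables

@anvil.tables.in_transaction
def log(message):
    # Each call commits in its own transaction, so the log row survives
    # even if the surrounding background task later fails or times out
    app_tables.logs.add_row(timestamp=datetime.datetime.now(),
                            message=message)
```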


Can you give us an example of one of the functions which is timing out?


I can fix the functions so they aren’t timing out- it will just take me days. The bigger question is why are they all suddenly timing out? Yes, my program is clunky (it’s my first Anvil app), but this only started a week ago after a month of operation. I never had timeouts during trials, testing, or beta launch.

Recently, I added about 200 different tables, as each one represents a user (I know it’s not the most efficient, but each user has numerous workspaces, etc.). Could these extra tables be slowing down my application, even though they are empty and not used by users just yet?

The tables themselves shouldn’t impact performance; how they are used definitely can.

I’ve never seen an app with 200 tables, but if I added 200 tables to an app and it became slow, I would blame, or at least investigate, that before blaming Anvil. That’s a huge piece of information that was missing until now.

Here are two comments:

  1. Making a usable app is very different from making a usable and scalable app.
  2. I have no clue what those 200 tables are for, but I have the strong feeling that there is a better way to do whatever you are doing.

Regardless, printing timestamps would help.
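For instance, a minimal timing sketch (where `do_expensive_search` is a stand-in for whatever call you suspect):

```python
import time

def timed(label, fn, *args, **kwargs):
    # Wrap a suspect call and print its duration, so the app log
    # shows which step is eating the 30-second budget
    t0 = time.time()
    result = fn(*args, **kwargs)
    print(f"{label} took {time.time() - t0:.2f}s")
    return result

# e.g. inside a server function:
# rows = timed("workspace lookup", do_expensive_search, user)
```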


Thank you Stefano. 4 things:

1 - Printed timestamps above. Again, the issue isn’t which functions are suddenly failing (I know how to fix those), it’s why the timeouts are happening at all.

2 - This is my first app. I have space for 110 users, each with 3 different workspaces (330 tables total). I have a users table with 3 columns for workspaces 1/2/3, i.e. user['workspace_1'] = 'data_table_100'. Often, the app searches the users table for the workspace column, then the user-email row, locates the table name, then goes and finds that data table and looks up a row/value. It’s not efficient. I added 230 more tables on top of the original 100 (although only tables 1-30 are in use).

3 - I’m going to delete tables 200-330 tomorrow and hope for the best.

4 - I was having a hard time putting all the variables in one table. Each user has 3 workspaces, and 40 variables per workspace. So I have my users table with rows of user emails and columns for workspaces 1/2/3, then 3 other tables with rows of variables and columns of user emails. My issue is I don’t have time to rewrite this. I’d rather just pay $300/month for 60-second timeouts and be done with it while I fix it over the next month.

Do you have any ideas for my user management? Any thoughts on deleting excess tables?

Could you rewrite the tables and consolidate a lot of the data into columns that take JSON as the value? Then one column could hold the data of several columns and/or whole table(s).

One workspace table that links to the Users table to tell you which user owns a workspace. Add in a name column so each user can have multiple workspaces. Add in a simple object column so each user can have as many variables in the workspace as they like. One table and three columns.
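A rough sketch of that schema in use (assuming a hypothetical `workspaces` table with an `owner` link column, a `name` text column, and a `variables` simple object column):

```python
from anvil.tables import app_tables

def get_workspace(user, name):
    # One query replaces the old per-user-table lookup
    return app_tables.workspaces.get(owner=user, name=name)

def set_variable(user, workspace_name, key, value):
    ws = get_workspace(user, workspace_name)
    variables = dict(ws['variables'] or {})
    variables[key] = value
    # Simple object columns persist on assignment, so write the
    # whole updated dict back to the row
    ws['variables'] = variables
```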


You can only answer this question after finding out where it crashes, by looking at the printed timestamps.

If I understand, one table with two columns for user and workspace number plus one json column for all the variables would do the job, with one single query. Something like @jshaffstall suggested.

This is the difference between a working proof of concept and a scalable app.

I had an app running in production for 4 years; I hadn’t even opened it in the IDE for years. A few months ago I noticed that it was slowing down. I tried to figure out what was wrong, found the bottleneck, and contacted Anvil support, who tried to help me for a few days. In the meantime I implemented a workaround, and after a few days they told me I had hit a structural limitation of the Anvil database and my workaround was the best solution.

In this case the problem was that a table grew to a size that made queries very slow, and the workaround was to split the table in two: a large, slow historical backup and a small, fast working set, with a weekly scheduled task that moves the last week’s rows from the working set to the backup table. The app searches the working set first, then automatically falls back to the backup if required. (I don’t really understand what causes that problem, as I have other tables with 10 times the rows or 10 times the columns of that one, but I feel like I’m good for a few more years.)
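A minimal sketch of that split (assuming hypothetical `working_set` and `backup` tables with identical columns, including a `created` date/time column; the weekly schedule itself is set up in Anvil’s task scheduler):

```python
import datetime
import anvil.server
import anvil.tables.query as q
from anvil.tables import app_tables

@anvil.server.background_task
def archive_old_rows():
    # Move rows older than a week from the fast working set
    # into the large historical backup table
    cutoff = datetime.datetime.now() - datetime.timedelta(days=7)
    for row in app_tables.working_set.search(created=q.less_than(cutoff)):
        app_tables.backup.add_row(**dict(row))
        row.delete()

def find_record(**criteria):
    # Search the small working set first, then fall back to the backup
    return (app_tables.working_set.get(**criteria)
            or app_tables.backup.get(**criteria))
```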

Summary: I had a working app that stopped working after 4 years, but it wasn’t Anvil’s fault; it was a scalability problem. Well, if this database had been faster I wouldn’t have had that problem, but I am free to switch to a faster external database or stick with Anvil’s. And I decided to stick with Anvil’s, because one struggling week here and there costs less time than managing a better-performing external database (for me as the developer, not for my users, but I’m selfish 🙂).

Over the years I’ve seen cases where Anvil has changed something that has broken my apps, but when that happens either it affects hundreds of users, so they are working on it faster than I can even start debugging, or they are very responsive and immediately respond to my requests.

So, yeah, an app is never done. It will always require some maintenance.


If I were you, I think I’d put that money toward a support contract or otherwise hiring an experienced Anvil developer to work with you for a few hours, screen sharing.


I’m 99% sure it’s this. You think these things are isolated, they are not.

Every time you load any server module, if any server module in the app uses:
from anvil.tables import app_tables
then it is caching every single one of the hundreds of tables in the app_tables object as table objects. I am unsure how long this takes or how many resources it actually uses.

I don’t want this to sound mean, I just don’t want you to waste any more of your time. In this post over a month ago, every person tried to convince you that you should not build out so many tables this way, and gave you other suggestions.

If this is causing the rest of your live app to have problems, you should find a way to save the work that you currently have, then revert to before you built this feature into your app so your users no longer have a bad experience.
(Maybe make a clone copy first so you don’t lose the table structures etc.)

You then need to rebuild this entire feature from scratch without using hundreds of tables, and re-launch it.

Of course this is all just my opinion, you don’t have to do anything you don’t want to.


I second this wholeheartedly. Hitting a scaling point is a great time to refine your data model if you didn’t start with a data model before writing code.

The 200-table setup just won’t be efficient, no matter whether you stick with the Anvil DB or go external.

That design certainly seems to be where the refactor needs to take place. It sounds like you should be linking rows in a table to user account rows. Columns filled with JSON are another possibility, but linked rows tend to perform faster if your data structure can be defined by a database schema (I tend to use JSON columns only when the schema of that data is dynamic).


Hi folks,

This is a long thread to respond to, so I’m probably going to miss some things, but:

First, to @taylorerwin, congratulations on your app getting enough usage to hit scaling limits – but also, commiserations on suddenly having to learn about architecting for performance while under fire. It’s one of those “good problems to have” that is nevertheless a very annoying problem!

Some observations:

  • I agree with the general consensus here that a separate table per workspace is a solution that will scale poorly. Instead, the usual way of doing things is to have one table for all the records in every workspace, with a column linking to the workspace each particular row is part of. (Then a table for the workspaces themselves, linking to the users that own them, etc.) As you’re refactoring to this design, you might even be able to re-use a lot of your code by using restricted views: a view restricted by the “workspace” column will behave very similarly to one of your old per-workspace tables! (See the sketch after this list.)

  • Reading between the lines, I’m guessing what’s happening is that you’re making a huge number of server calls and launching a huge number of background tasks at once, and that’s overwhelming what your server environment can cope with, which is causing even very small operations to time out. This will be intermittent and difficult to debug (depending on who else is using your site, etc). This would also explain why this behaviour started suddenly when you got a few more users – a system using 70% of its possible capacity is sluggish but usable, whereas if it’s trying to use 130% of its capacity then 1 in 4 requests will be failing (and probably by timeout, thereby slowing everything else down as well!).

    The way that Anvil works, every server call (if you’re not using Persistent Server) and every Background Task (always) runs in its own Python process. Launching a Python process is a relatively expensive operation, so you want to cut down on how often you do it. (This is one reason you saw better results when you started batching things into fewer server calls, and/or launching fewer Background Tasks.) Usually, most of the load is from server calls, so the advice “upgrade to the Business Plan” is good – that enables Persistent Server, which saves you from the overhead of launching a new process every time. However, you appear to be launching mostly background tasks, which might be why your Business Plan upgrade didn’t buy you much. (Although it’s worth checking – did you turn on Persistent Server after you upgraded?) I gather you’re launching all those Background Tasks for parallelism. There may be better ways to achieve this - perhaps you can share a bit about why you’re doing this?

  • One possible red herring above – as it happens, passing a table object (e.g. app_tables.foo) from server to client, or into a background task, is actually pretty lightweight. (Loading the names and columns of hundreds of tables is a bit heavier, but that will be taken care of by the first point above.)
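To illustrate the restricted-view idea from the first point, here’s a minimal sketch (assuming a hypothetical `records` table with a `workspace` link column):

```python
import anvil.server
from anvil.tables import app_tables

@anvil.server.callable
def get_workspace_records(workspace_row):
    # The returned view behaves much like one of the old per-workspace
    # tables: search(), add_row() and delete() apply only to rows whose
    # "workspace" column links to workspace_row
    return app_tables.records.client_writable(workspace=workspace_row)
```

Rows added through a view like this get their “workspace” column filled in automatically, so a lot of the old per-table code can stay unchanged.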
