Warning: client_writable limitations and gotcha's

splinter · March 19, 2020, 3:58pm

Documenting currently undocumented limitations of these: https://anvil.works/docs/data-tables/data-security#views

At first blush, client_writable views seem to give a way to skip writing lots of custom callable server functions to return filtered search iterators to the client.

However, any real world usage attempts quickly break down in the following scenarios, for reasons that aren’t apparently necessary.

Adding any search criteria (keyword parameters) results in crippled view rows. Crippled how? Understandably and appropriately, the columns/fields used in the search criteria are not visible in the view, which means a view row is not the complete row. This currently means:

1.1 You can’t use a view row in a query on another table with any kind of relationship to the table being returned in the client view, because it won’t match. So if you have a filtered client_writable group view and attempt to use a row from the view (which is effectively a group, a database record for the group, minus the field/fields that were used to filter or narrow down the full list of groups), and you attempt to search users using the view row to return a list of users that have that group membership, it will NEVER work because “you can’t match a full row with a view row” or some such error. This makes view rows effectively useless for search matching as soon as you try to take advantage of them to limit row visibility for security purposes.

There is no apparent reason for this, that I can see, since the row is clearly known to be the same row, the row or record id exists, I can see no reason why a partial row could not be made to match in a search since the only thing that actually needs to match is the group record id. But…

1.2 You also cannot use a limited view row (one missing even a single field from keyword parameters in the view definition) to establish relationships. So, for instance, using the filtered list of groups, none of those “view rows” can be used in adding a group relationship to a user, only a full row record can be used to add a relationship. So again, gotcha. Don’t try to build anything serious, with user collaboration or varying permissions using client_writable views.

This again is a limitation with zero apparent reason. The row id is known, the row type is known (group row, etc), but rather than just use the id to create the relationship, retrieve the full record server side if necessary using the id, but just create the relationship, you get an error about view rows.
The last great gotcha is that as soon as you need to add any complexity to the keyword parameters, you have to abandon views. The newer query operators (https://anvil.works/blog/querying-data-tables) are completely unavailable for the keyword parameters of views. So you get used to the power of them, discover client_writable views, start trying to use them, and run smack into a wall where you have to tear your views back out as soon as you need to do anything complex. The apparent workaround of appending .search(…, …) to a view definition just leads back to returning a search iterator that cannot be used to search dynamically with keyword parameters on the client side, defeating the whole point of using client_writable views in the first place.

The bottom line:

client_writable views are a trap.

Do not use them in their current limited form for anything but throwaway, toy projects (and the knowledge of how they work/where they’re helpful is just a waste as well since you can’t actually use them for anything serious).

Best advice? Ignore that they exist. The feature should either be removed or actually made robust. In their current incarnation, with these undocumented limitations, they are a developer time trap/time waster. Using them for any serious app, you will just end up frustrated, tearing out code and redoing it using custom server calls that return full rows for each different set of keyword parameter scenarios you need to handle.

TLDR: Don’t use client_writable views.

Exceptions

Useful for avoiding timeouts on large file uploads, see @owen.campbell post below.

owen.campbell · March 19, 2020, 4:11pm

I couldn’t agree less. I regularly use them to upload data where an ordinary server call might lead to a timeout due to the payload size: Timeout on large file upload

splinter · March 19, 2020, 4:14pm

If this thread flushes out the narrow set of circumstances where client_writable views are still worthwhile, and maybe leads to some documentation of best practices for using and not using client_writable views, then I will consider my work here to be done. Thank you for adding to the full picture of this feature.

My primary issue is this:

The only limitation listed on the page documenting this feature (https://anvil.works/docs/data-tables/data-security#views) is “Columns cannot be automatically added to a restricted view.” at the very end of the page.

If the real, crippling, limitations I’ve now documented were on the page, I wouldn’t be tearing out views all over the place. My hope is that this post leads to either some clear statements of limitations (Do not use client_writable views in these scenarios…) in the official docs, or serves as a cautionary sign for a developer in the future, and saves someone else the time of tearing them back out.

stefano.menci · March 19, 2020, 4:43pm

I have never used them on my own apps, but I have always used them when teaching other people how Anvil works, the dos and the don’ts.

I always start with showing what magic Anvil can do, usually by giving a form read and write access to a table. Then I talk about how the magic works and why some magic shouldn’t be used. Then I show what are the correct approaches from the security and from the performance point of view.

When I play around with some ideas I always pass rows to the client.
When I work on real apps I never pass rows to the client. I always use server functions that collect the data, including converting all the relations to values, package it into dictionaries and return it to the client.

splinter · March 19, 2020, 4:51pm

You’re converting relations to values by storing unique row IDs? Doesn’t converting the full dataset into a dictionary translate into severe performance problems with larger data sets? You’re not taking advantage of any lazy loading from a search iterator.

I would love to see a sample app just to understand the details of how you’re implementing this pattern, but I struggle to see it being generally appropriate. Maybe I’m missing something?

stefano.menci · March 19, 2020, 5:18pm

If you have a form that shows user names and their groups, you could pass a query (ie is an iterable with rows from the Users table) and the form could use user['name'] and user['group']['name']. This would lazily access the linked Groups table to get the group name, but it would do it from the client, triggering a burst of round trips, one per row.

Perhaps Anvil is smarter than that and, instead of triggering one round trip per row, will fetch many rows at once and cache them.

Either way, you have a performance hit because you are too lazy and you do too many round trips or because you are too greedy and you cache too much stuff.

My approach is to fetch as many rows as I need from the Users table, resolve all the relations and pass them all to the client. In simple cases I do it using the app_table interface, in more complex cases I use SQL which will get everything in one joined query.

At this point I have all the data as fast as possible. Now the performance bottleneck is the client rendering. If this is too slow, I use a timer to render the rows one by one, so the user will immediately see the first ones while the ones below take their time to render. And sometimes I add the management of the scroll events from javascript to get a form that dynamically loads more rows.

Summary, from slowest to fastest:

round trips: make sure you get all you need in one round trip
queries: use app_tables for simple cases, SQL if you have too many relations
number of rows: get all the rows if they are not too many (if it’s fast enough), otherwise only get as many rows as the form will show
rendering: use repeating_panel.items = items when it’s fast enough, use a timer with 0.01 seconds to add one form at a time to the repeating_panel

splinter · March 19, 2020, 5:34pm

I would love a sample app with stair step examples, for instance, a form that grows in complexity, starts off with app_tables then shifts to SQL for the same data set, then the data set exceeds what can reasonably be retrieved, detecting that and shifting to only as many rows as the form will show, and then shifting to using a timer with 0.01 second interval to add forms to a repeating panel when rendering hits a wall.

I know that’s a lot to ask, won’t assume an example, will have to experiment with all this more.

splinter · March 19, 2020, 5:34pm

I believe Anvil is smarter than that.

splinter · March 19, 2020, 5:36pm

This seems very GraphQL-ish, but an example would be helpful. Are you saying you’re retrieving/returning two lists? In this scenario, a list of users and a list of related groups so an additional round trip isn’t required for rendering?

meredydd · March 19, 2020, 5:49pm

Hi @splinter,

Sorry to hear you’ve stubbed your toe on the limitations of Data Table views - that sounds like a pretty frustrating experience! We will update the docs to make those limitations clearer, and hopefully keep future developers out of the rabbit hole you fell down. In general, we are aware that the view support hasn’t kept up with newer features like table queries; some of this is for structural reasons in the implementation (detailed below if you’re curious) which we are working to address (watch this space)!

If you’re curious about what’s going on under the hood, I can talk a little about the reasons for these restrictions:

Searching using view rows: This is the simplest to address (it’s a straightforward refactor), and we agree it ought to be possible. As @stefano.menci points out, it’s more or less equivalent to my_table.get_by_id(view_row.get_id()), which is allowed if you have access to the relevant table.
Establishing relationships: Because Data Tables use a capability/“possession-is-permission” model, having access to a row also gives you access to everything that row links to. And if you navigate via a link, you get the full, unrestricted row. If we let you make links to restricted-column rows, then you could make a link to a restricted row, then follow that link and get the unrestricted version of it. Not safe! So, sadly, this restriction is likely to stay (at least on the client side; we can make it possible on the server because the server code can already access anything on the underlying table).
Lack of support for complex queries: This is an artefact of our “Live Object” system (which you can see me describing towards the end of this talk from PyCon 2018). The way that client-writable views work is to embed the view restrictions into the ID of the LiveObject that represents the view (because that’s covered by the signature). However, those IDs can only be JSON, and rich queries support complex Python objects which don’t necessarily have a single JSON encoding.

Our existing LiveObject implementation is being retired and replaced by something considerably more capable, which will give us a lot more flexibility. We will then be able to make views considerably more powerful (not to mention improving the performance of ordinary data table operations).

Thanks for your patience, sorry about the stubbed toes, and - as I say - watch this space!

splinter · March 19, 2020, 5:56pm

This (and the support for complex queries) will be wonderful additions. Understand your points on the limitations and security implications of establishing relationships using view rows, but your point that it could be made possible on the server side leaves me wondering if that means that there would be an implementation pattern that allows links to be established with a server function call, passing the view row, and at that point, have views work for nearly any scenario with the caveat that using them in an add_row() call from the client side would be blocked.

Still tearing out code for a while, but I look forward to maybe trying views again at some point in the future.

stefano.menci · March 19, 2020, 7:08pm

I usually return one dictionary with everything needed by the form.

For example, if my form shows a filtered list of users in pages of 20 rows and the full list of departments, I would do something like this:

On the server:

@anvil.server.callable
def get_user_info_first_page(name_filter):
  # get the first 20 users and their group names
  users = [
    {'name': user['name'],
     'group': user['group']['name']}  # this resolves all the links at once on the server side
    for user in app_tables.users.search(name=q.ilike(name_filter))][:20]
  ]
  
  # get all the department names and locations
  departments = [
    {'name': department['name'],
     'location': department['location']}
    for department in app_table.departments.search()
  ]

  # return both the lists in the same round trip
  return {'users': users, 'departments': departments}

@anvil.server.callable
def get_user_info_from_to(name_filter, n1, n2):
  return [
    {'name': user['name'],
     'group': user['group']['name']}
    for user in app_tables.users.search(name=q.ilike(name_filter))][n1:n2]
  ]

On the client:

  def form_show(self, **e):
    stuff = anvil.server.call('get_user_info_first_page', name_filter)
    self.users.items = stuff('users')
    self.departments.items = stuff('departments')
    self.page = 0

  def next_page_click(self, **e):
    self.page += 1
    self.users.items = anvil.server.call('get_user_info_from_to', 
                                         name_filter,
                                         self.page * 20, 
                                         (self.page + 1) * 20)

junderhill · March 20, 2020, 12:42pm

Meredydd, can you tell us a little more about the replacement for the LiveObject implementation? When do you think this will become available? Sounds great.