Sorting a data table by a subcolumn of a linked column

A Data Table has a column which contains a link to another table. I know I can retrieve a particular value from the linked table with Row['LinkCol']['ForeignCol']. But is there any way to sort the rows by this foreign column? The only way I can think of is to make the subcolumn an explicit column, which seems to undermine the point of a linked column.

3 Likes

Hi @jonathan.falk,

There’s no simple way to do this currently, and I agree it would be a great addition! Moving to feature requests :slight_smile:

2 Likes

Reviving this feature request! +1

1 Like

Yes, it would be a great option.

Has this been implemented?

Not that I’ve heard. Is there something that made you think it might have been?

(Dedicated/Enterprise users can do it with SQL, I assume. For the rest of us, the workaround is something like sorted(rows, key=lambda row: row['LinkCol']['ForeignCol']).)

1 Like

I’ve done:

sorted(selected_patients, key=lambda x: x['Patient']['Last_Name'])

It’s never been a problem for my needs, just curious if I’d missed some documentation somewhere, and wonder how much more performant/flexible and stylish it could be if everything was handled by postgres :slight_smile:
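For anyone landing here later, a self-contained sketch of that pattern (plain dicts standing in for Anvil rows, since a linked column behaves like a nested lookup; the table and column names are made up). It also guards against rows whose link was never set, which would otherwise raise a TypeError:

```python
# Plain dicts standing in for Anvil rows; a linked column behaves
# like a nested lookup, so row['Patient']['Last_Name'] reads the same way.
selected_patients = [
    {'Patient': {'Last_Name': 'Moore'}, 'Ward': 'B'},
    {'Patient': {'Last_Name': 'Adams'}, 'Ward': 'A'},
    {'Patient': None, 'Ward': 'C'},  # a row whose link was never set
]

# Sort by the foreign column; rows with no linked Patient sort last.
ordered = sorted(
    selected_patients,
    key=lambda r: (r['Patient'] is None,
                   r['Patient']['Last_Name'] if r['Patient'] else ''),
)

print([r['Patient']['Last_Name'] if r['Patient'] else None for r in ordered])
# → ['Adams', 'Moore', None]
```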

1 Like

The sorting would be thousands of times faster and would use less memory if done by postgres (assuming that the tables have the correct indexes).

Your sorted call loads all the rows into memory, converts them to Python objects, and sorts them by accessing the data through the nested dictionary. All the users need to fit in memory, and if you have millions of them, the server would blow up.

By configuring the postgres database with the correct indexes, the index is updated as rows are inserted. When reading your rows, you read them one by one, not all at once, and they are already in the correct order, so there is no need to waste time sorting them. If you need only the users whose last name starts with “M”, postgres gives you only those, while with python you need to read everything, sort everything, then scan the list until you find the ones starting with “M”.
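To make the contrast concrete, here is the pure-Python version of that work, on hypothetical data. With an index on last_name, postgres would hand back only the matching rows, already in order, without ever touching the rest:

```python
users = [
    {'last_name': 'Zhang'}, {'last_name': 'Miller'},
    {'last_name': 'Adams'}, {'last_name': 'Moss'},
]

# In-memory approach: load everything, sort everything...
all_sorted = sorted(users, key=lambda u: u['last_name'])

# ...then scan for the names starting with "M".
m_users = [u for u in all_sorted if u['last_name'].startswith('M')]
print([u['last_name'] for u in m_users])  # → ['Miller', 'Moss']

# With an index on last_name, postgres does the equivalent of
#   SELECT * FROM users WHERE last_name LIKE 'M%' ORDER BY last_name
# reading only the matching rows, in order.
```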

If you are using Anvil, you are unlikely to have this kind of scaling problems, but I answered the question about postgres performance, not about the correct Anvil solution.

Obviously, if you don’t have scaling problems and you are dealing with a small number of users that easily fits in memory, there is no need to waste your developer time. Save yourself the money and time required to set up a well-designed database, and waste a few thousandths of a second every time you need that list.

What you did is just fine.

2 Likes

In my example, “selected_patients” represented the results of a DB query, and that’s the expected use case (typically sorting results for a view, never all the rows in the database, or millions) - so useful for my needs. I also understand, of course, that anvil.works needs to withhold some features for more expensive products, and I support that fully. If they go out of business, we all lose a productive tool.

1 Like

A good way to use the Anvil tables is to get a query to return a manageable number of rows, then to work with those rows in memory, using the standard python tools.

Taking advantage of the recent addition to accelerated tables that allows you to exclude some columns from the query result reduces the amount of memory required (when the table includes columns that are not used in the current query).
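A minimal illustration of why excluding unused columns helps, with plain dicts standing in for rows (in Anvil itself the exclusion happens at query time, inside the accelerated tables search, rather than after the fact as shown here):

```python
# A hypothetical row with one small column the view needs
# and two large columns it doesn't.
full_row = {
    'name': 'Ada',
    'notes': 'x' * 10_000,        # large text column
    'avatar': b'\0' * 50_000,     # large media column
}

# Keep only the columns the current view actually needs.
needed = ('name',)
trimmed = {k: full_row[k] for k in needed}

print(trimmed)  # → {'name': 'Ada'}
# The ~60 KB of notes/avatar data never needs to reach this code
# when the query itself excludes those columns.
```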

3 Likes