Creating Many to many relationship with association table

Philippe · October 6, 2020, 5:51pm

Hi!

I’m new to the framework and mostly used to fastAPI or Django and I am trying to create a many to many relationship including relationship data like “quantity” with anvil tables.

Is there a “good” way to do it or should I make 3 tables with only the association table holding foreign keys? Here is the current way I would do it:

Table 1 => 1 column: name
Table 2 => 1 column: name
Table 3 => 3 column: table1_item, table2_item, quantity

Ideally I would like to have the inverse relationship in table 1 and 2 or some way to automatically create table 3 under the hood.

Thanks in advance for you kind help !
Philippe

p.colbert · October 6, 2020, 6:12pm

Welcome, Philippe!

“Good” usually means “makes my intended (and anticipated) use-cases easier to write, to read, or faster to execute.” So, it really depends on which operations you expect to do, what circumstances (data and relationships) they’re starting from, and what circumstances (data and relationships) you want to have as a result.

I sometimes summarize each (high-level, abstract) operation as follows:

Given ________________
I want to _______________
Resulting in ______________

Two Anvil features that may influence your design:

Row IDs. Every row in every table automatically gets its own unique ID. You don’t need to give it another unique identifier (unless you need to tie it to some other database table outside of Anvil).
"Link" columns. A “link” column “points to” zero or more rows in a specific table. (It’s a list of row ids.) With a column value of this type, each row can directly implement its own one-to-many relationship.

Philippe · October 6, 2020, 6:23pm

Hello!

Thanks for this quick explanation and for sure I can improve my intent !

Given 2 tables, Table 1 and Table 2 which are symetrical.

I want to link them in a many to many fashion (Table 1 can contain n Table 2 items and vice versa)

Resulting in the link to be visible by both table and to contain additional informations related to each specific link between any item of Table 1 and Table 2 such as a quantity or a price !

From your answer, I have the feeling that Anvil only supports one to many relationship and a many to many relationship should be written manually through a third table which would be the only one containing links to Table 1 and 2. Following my understanding of anvil table mechanism, I would think that Table1 and Table2 could not “show” Table3 in the DataTables interface because links come from Table3.

p.colbert · October 6, 2020, 6:54pm

You are right: Anvil directly supports a one-to-many relationship with its “Link” column type. “Many-to-many” has many possible variations, so it is something you should construct, in a manner that suits your uses.

When I think “many-to-many”, I think of it at the row level, not the table level: one row in Table 1 can be associated with many rows in Table 2, and vice versa. I believe this is what you have in mind, too.

It may not be good to think of Tables as “containing” rows from other tables. Some databases do allow that, but in Anvil, Tables don’t nest like that. Even at the row level, rows don’t “contain” other rows. Rather, they can contain references (“links” or “pointers”) to other rows. But the syntax, row_x['row_y']['price'], can make it look like containment.

If you really do have a “containment” situation, a hierarchy of records, then there may be another option: the Simple Object column. A Simple Object value is essentially a JSON document: JSON objects/lists, that can literally contain other JSON objects/lists, nested as deeply as you want to go.

This specifies a third kind of entity: information (quantity, price, etc.) that is specific to the relationship between two specific rows. Standard relational database practice would create a Table (your Table 3) to contain such entities. That is perfectly sensible to me.

Table 3 would appear in the Anvil IDE just like any other Table in your App.

It is certainly possible for a row in Table 1 to contain links to corresponding rows in Table 3 (“backwards” links). Technically, however, that’s redundant. Keeping redundant data is likely more work than it is worth. It also makes new kinds of inconsistencies possible. On that basis, I would recommend against it.

Philippe · October 8, 2020, 8:31am

Thanks a lot for your answers, I think that as you propose, I’ll stick to the basic approach of 2 main tables and 1 association table with foreign keys!

Another question related to many to many relationship and query optimization (It might deserve another thread):

If I retrieve an element which has a link and then I access the object pointed by the link. Will it be 1 or 2 queries to the underlying db ? In a wider picture, If I have an element with many links associated to it (In a one-to-many fashion), and I want to load every items linked to this element, will it be n + 1 queries to the underlying db or 1 or 2 (main fetch + links fetch) ?

Sincerely,
Philippe

p.colbert · October 8, 2020, 12:40pm

It depends on how you define “query”. I think of “query” as a search of a table or index. One query will net you N matching rows from Table 3. Each row contains a link to Table 2 and a link to Table 1.

Is following one of those rows’ links a “query”?

Suppose you use traditional identifier, i.e., a unique number or string or datestamp. Then you must call query() to find the matching row.
Suppose you use a Link column instead. Then following the link still requires a database access, under the covers, to fetch the row.

Option 2 is probably much faster than Option 1. But conceptually, does it count as a “query”? Opinions vary.

On the Client, fetching a row requires a round-trip call to the Server. So there’s communication lag on every fetch. To reduce this overhead, you can write a Server routine that fetches (and returns) all the required rows at once.

Philippe · October 8, 2020, 1:19pm

Ok ! If I understood you, if I do something like result = app_tables.Table3.search(), I will retrieve every row of Table3 and everytime I do result.link, I am actually doing a full roundtrip ? Thus if I have n rows in Table3, I will execute 2n + 1 roundtrip ?

p.colbert · October 8, 2020, 1:28pm

Thanks for the correction: I should have written search instead of query.

It’s closer to 2 + n:

One to fetch the “source” row.
One to fetch just the matching rows from Table_3. (Something like matching_rows = app_tables.table_3.search(source = source_row) )
One to fetch each of the target rows.

for match in matching_rows:
    target_row = match['target']
    # do stuff with target_row

Philippe · October 8, 2020, 1:32pm

Ok!

And If I wanted to do everything in a single search? Should I collect all the links which I think are the ids and search them in a separate query ?

I don’t know if we can actually put a list as an argument of the search method !

p.colbert · October 8, 2020, 1:45pm

This is the point where it becomes worthwhile to skim/search the docs and tutorials for things like Storing Data in Data Tables.

It would not be worthwhile for me to repeat all that! And you’re sure to find examples, with answers to other, related questions you might have.

If you’re going to go over Table 3, and cover all of its linked rows, then you might want to consider order_by, to group the results by source (or target) row.

I would further suggest doing as much work (collection and ordering of results) as possible in a Server function. That function will have faster access to your tables than anything on the Client. It can get more work done in less time.

To your Client code, your Server function would have the effect of a more complex search, no matter how you decided to write the Server function.

Philippe · October 8, 2020, 1:55pm

Thanks for all these insights !

I’ll go search deeper into the docs and tutorials for the remainder but now it’s clearer in my mind how anvil actually process queries and how to access data smartly!