Data Table, Order / Filter by Column in Link Table

mcmasty · September 12, 2019, 3:07pm

Hello,

I am trying to order by a column in a joined table. I can obviously sort the list in python, but as I accumulate more data, I will want to filter the results from the database before the result set is returned.

The data model is as follows:
2 core tables: Athletes, Races
1 relationship table: Athlete-Races

A single athlete can sign-up for multiple races.
Multiple athletes can attend the same race.

I am currently trying to display the list of races that an athlete signed up for, or a profile page for that specific athlete.

My server code is (hopefully self-explanatory):

  ath_races = app_tables.athlete_races.search( athlete_link=ath_row )
  
  for ar in ath_races:
    race_list_result.append({'race_date': ar['race_link']['race_date'],  #.strftime('%d-%b-%Y')
                            'race_name': ar['race_link']['race_name'],
                            'race_priority': ar['race_prio_link']['priority'],
                            'race_priority_description': ar['race_prio_link']['priority_description']})
  return race_list_result

Is there a way to use the query object to

only select future races ?
order the result set in proximity order ?
change format of date ‘%d-%b-%Y’

I am using a DataGrid on the Athlete Profile page to display the results.

Obviously one solution is to denormalize the race-date value and have it both the Races Table as well as the Athlete-Races relationship table, however, I thought I’d seek a solution that works with a Normalized data model. Any Thoughts ?

From a theoretical perspective, I can see the usefulness of being able to filter query results on a column that accessible via inner join. (In my case, filter the athlete-races table by columns in the race data table … e.g. all races in a location, such a California or Canada, aggregating by date, e.g. athletes racing in September)

I know how to write all the SQL, if there is a “Pass-Through-SQL” mechanism.

Thanks in advance.

Cheers,
Tyler

campopianoa · September 12, 2019, 3:14pm

Hello and welcome,

Thanks for the details. It might be easier to assist if you are able to share a simple example app that demonstrates the table set up that you are working with (at least I find that easier to play with).

I did not read too carefully; however, have you seen tried this general sorting functionality?

people = list(app_tables.people.search(tables.order_by("Name")))

I’m not aware of any built-in way of returning dates in a specific format from the datatables. One way to do it would be to use the datetime module in Python to convert your date formats once you have pulled your records. Perhaps others know a better way.

Sorry if you mentioned this, but is your relationship table a separate table? I’m just curious if you have considered using a linked row instead of a separate table. This is explained here.

If future races are determined by a date then you could probably use Anvil’s query operator to select dates greater than the current date. Please see the documentation here.

mcmasty · September 12, 2019, 3:51pm

@alcampopiano Thanks for the quick response

I am using LInk Columns.
My relationship table is an independent table.

Relationship Table is the core table in the query, and thus does not have a “Name” attribute…(this is the crux of my question, as the “Name” attribute in your example would reside on Athlete Table or Race Table depending if you are looking Athlete Name or Race Name)

Here is an App to demonstrate:

https://anvil.works/build#clone:VGOVZSF7MCPREQDA=XMWHRIH5DHXE4T2DVC7GKBUD

If you select athlete “Leonard”

I would like Independence Marathon excluded from the list as it is in the past.

I would like data presented in date order (i.e. for Leonard, Turkey Trot BEFORE Irish 70.3)

let me know if you have additional questions.

Cheers,
Tyler

campopianoa · September 12, 2019, 4:22pm

Okay, I had a quick look and came up with an example that could get you started.

Feel free to show code to demonstrate what you have tried.

Note that as I mentioned above, you could consider using multi-row links to associate athletes with races. You wouldn’t need a separate athletes-races table.

So, I have made my example to demonstrate that point and other techniques that you can extrapolate from:

You will see that I do the following:

select a single athlete’s races (say John’s races).
then, sort John’s races by date (but you could sort by anything)

Again, the datetime module could be used along with the query operator if you want to avoid past dates.

The rest of the techniques are mentioned in my post above (i.e., the query operator, order_by, row links, datetime module).

Hopefully this can get you started.

clone:
https://anvil.works/build#clone:5V75FK6GY55ENYLU=F3LTGMYE2TSIZJON7JBJSIUG

relevant code:

ath_races=[r['location'] for r in app_tables.athlete.get(name='john')['races']]
races=app_tables.races.search(tables.order_by('date'), location=q.any_of(*ath_races))

mcmasty · September 12, 2019, 5:22pm

@alcampopiano

Interesting Code snippet. I’ll chew on it a little bit. Seems like there will be some mileage. Thanks.

In the same way I was ruling out de-normalizing my data model, I was also ruling out multiple database queries. That is I was trying to do all the data manipulations this with a single access to the database.

In the near future, I’ll also want to get Race related metadata counts (for example number of “A”, “B”, or “C” races per athlete this season, number athletes at each race, etc)… I’m guessing the solution will be a similar pattern to what you show above, that is a chaining together multiple database queries and then do the aggregation / grouping in python server code instead of the database.

FWIW, The reason I used 1:1 linkages is that I need to hang an attribute off the relationship.

A single race may be the the top priority (dubbed “A” priority) for athlete (Leonard), but that very same race, would be a slightly less important (dubbed “B” priority) for different athlete (e.g. Julie) … and for a third athlete, it is basically a “Training Day” (dubbed “C” priority)

This is important b/c the planning & schedules should be shifted to make sure the highest priority events (so called “A Races”) are driving the schedule. The priority rating is for a single athlete at a single race.

Its not clear to me, how I would associate a race priority with multi-link.

Anyway, I think I can accomplish my immediate blocker with multiple queries, so thanks.

Cheers,
Tyler

hugetim · September 12, 2019, 9:12pm

Here’s a simple change to only select future races (though you could alternatively use a list comprehension):

import datetime
for ar in ath_races:
  if ar['race_link']['race_date'] > datetime.datetime.today():
    race_list_result.append({'race_date': ar['race_link']['race_date'],
                             'race_name': ar['race_link']['race_name'],
                             'race_priority': ar['race_prio_link']['priority'],
                             'race_priority_description': ar['race_prio_link']['priority_description']})

To sort, add this line:

return_list.sort(key=lambda r : r['race_date'])

For formatting the date, I think your .strftime('%d-%b-%Y') may work (though I have no experience with this method myself, so I’m not sure), but you need to import the datetime module as above. Also, you’ll want to format after sorting, or else add the formatted date as a separate item in the dictionary.

p.s. Here’s an alternative using a list comprehension. I haven’t tested it, but this is what I have in mind:

import datetime
result_list = [{'race_date': ar['race_link']['race_date'],
                'race_date_str': ar['race_link']['race_date'].strftime('%d-%b-%Y'),
                'race_name': ar['race_link']['race_name'],
                'race_priority': ar['race_prio_link']['priority'],
                'race_priority_description': ar['race_prio_link']['priority_description']
               } 
               for ar in app_tables.athlete_races.search( athlete_link=ath_row )
               if ar['race_link']['race_date'] > datetime.datetime.today()]
return_list.sort(key=lambda r : r['race_date'])