Data table support for aggregate functions

Is there a way to efficiently retrieve summary information from a data table?
e.g. the number of orders at each possible order status
If I was doing this outside of Anvil, I’d simply use SQL count and group by

2 Likes

Hello @ahawes and welcome,

I think it will depend on which summary information you would like and how your tables are set up.

Using the DataTables API you can do something like the following to get the number of orders with a particular status (not a groupby).

num_rows=len(app_tables.orders.search(status='in_stock'))

To get a groupby-like operation, perhaps numpy could help:

import numpy as np

my_col=[r['order_status'] for r in app_tables.orders.search()]
unique, counts = numpy.unique(my_col, return_counts=True)
my_counts=dict(zip(unique, counts))

# returns a dictionary where each key represents the order status and the values are their counts

In terms of using SQL directly, the docs here say:

Users on Dedicated Hosting or Enterprise plans can access the data in their Data Tables using SQL directly. To get started, contact support@anvil.works to get your dedicated database credentials.

1 Like

Personally if I need to do something like this I will load the subset of data I need (I’ll subset by first querying the table for what I need) into Pandas. Pandas has copious amounts of aggregation options.

I don’t (yet) see a performance issue with Pandas, but I also am careful to write Pandas code that can be executed via arrays and therefore take advantage of the Cython integration.

I do use SQL aggregate queries.
They are available on my dedicated plan, I don’t know on what other plans they are available.

Thanks for the suggestions. I will try them and see what works best for me

+1 for Pandas.

@ahawes I am sure you have already seen the post below, but if not, it is very helpful.

2 Likes

itertools.groupby can be used on a data table query. Both return iterators, so it’s a reasonably efficient technique.

5 Likes

I was just looking through the documentation and forum to find aggregation functions. As far as I can tell and from what I see here, it looks like there is no “group by”-like functions intrinsic to data tables.

  1. Is that the case?
  2. Could we keep this conversation alive and move this post to a feature request? Or is there no overall motivation to add these functions? Is it just recommended to use these alternatives (pandas/itertools/sql) instead?

It would certainly simplify my code if data tables had built-in aggregation. It also looks like plenty of other people have had to create a plethora of alternatives.

Hi @gweber.lauraandleigh,

  1. This is correct. Anvil doesn’t support groupby lookups out of the box.
  2. Please do create a Feature Request, and I can then add it to our list :slight_smile:
1 Like