Database calls: single vs multiple (cost and performance)

adhamfaisal · February 16, 2019, 6:54pm

Is it better to make a form that submits:
A: a single call to transfer “multiple rows” of data (at once) to the database
or
B: multiple calls to transfer every single row of data (individually) to the database
I want to know from two perspectives; performance and cost

Thank you

david.wylie · February 16, 2019, 7:06pm

Here’s my experience on MySQL, for what it’s worth.

This is much faster :
insert into table (col1,col2) values
(val1,val2),
(val3,val4),
(val5,val6)
etc

than this :
insert into table (col1,col2) values (val1,val2)
insert into table (col1,col2) values (val3,val4)
insert into table (col1,col2) values (val5,val6)

but the first one will fail to insert all rows should one of them violate a constraint.

load data infile … is, I think, marginally the fastest of all. I sometimes write all my inserts to a temp file and use this method to insert them.

So it depends on what you are inserting (and how many records), along with how you want it to fail when things go wrong.

A note - I regularly insert upwards of 10k rows at a time. Worst case is between 1 to 10 million, but to be fair, that’s not all that common in a single insert.

stefano.menci · February 16, 2019, 7:21pm

Round trips between the form and the Anvil server are usually the bottleneck.

The best way to optimize performances is to collect all you need inside one dictionary, make one call to one server function, the server function will use the dictionary to do whatever it needs to do, then gather all the info it needs to gather, collect it inside a dictionary and return it to the client.

This way you have one round trip only. It might be slightly slower, but it will usually be much faster than splitting the work into several server side functions that require several round trips.

adhamfaisal · February 16, 2019, 9:15pm

does any of these methods affect the cost of using the database/server ?

david.wylie · February 16, 2019, 9:29pm

What do you mean by cost? Do you mean the actual cost, ie hardware/rental costs? Or are you talking about cpu/memory resource cost?

Actually, my answer is the same either way - I’ve not actually profiled any of it The multi-value technique (the second one I mention) is my most common method and it doesn’t worry my setup at all.

I tend to throw hardware at databases until the cpu usage is under 40%. Traditionally I run galera clusters (http://galeracluster.com/) but I am trialling a CockroachDB cluster (https://www.cockroachlabs.com/) which looks interesting, especially from a horizontal scaling perspective.

My Galera clusters are $80/node (6 core, 16gig ram). They are VMs so cpu is not guaranteed but I never have any issues with them. Typically each cluster does about 15 million inserts per month, though it does fluctuate.

ananth.krishnamoorth · December 16, 2020, 2:00pm

Makes a HUGE difference once you stop mindlessly iterating from client to server, and make a single call. Thanks!!