Using stored procedures and views in DB

attila · January 28, 2023, 4:33pm

Would be great to be able to keep the database logic inside the database and use stored procedures and views defined within the database.

p.colbert · January 29, 2023, 7:23pm

That’s probably available with SQL Access and Indexing.

As the next best thing: write your procedures in Python, server-side. This gives you the opportunity to enforce whatever constraints you can imagine. Then you can always call those functions, instead of more basic ones, to CRUD (Create/Read/Update/Delete) your data.

attila · January 29, 2023, 8:31pm

I’m not sure if it is possible via SQL access. It seems to me limited. I read it somewhere before (forum or by support mail - not sure anymore), it is not yet supported. I might be wrong.

Having views or procedures within your database is not the same as having it on server side. While complex, difficult functions should be in your code (py), but there are reasons when you want to keep some logic within the database As a generic rule, you should keep data relation within your database. I.e. If you have multiple tables like users and usergroups (for simplicity now), you can create a view to get a list of users and it’s group or a procedure to add, remove, check / convert data (i.e. password) within the database. There are multiple benefits of this, here is a few:

if you have other application or part of the code that access to the same database, you can keep the data related logic in one place, you only need to call a view or procedure and don’t care much about what is going on inside the db, how many tables, joins etc needs to be addressed.
many times, db are developed by another people then the code… helps to make sure everyone done his part right
your db can verify if all relations are correct, less place for errors, especially when you are changing a db layout. You can catch it and you need to fix it “one” place only, within the db. Client codes could stay the same as before
it is much faster, since way less rows or data leaving your database (obviously sql/db is not for complex functions, but updating 1M rows can be done easily with 1 db functions, which iterates over all rows, check, might update other parts of the db or validate it, set times etc. If you do it outside of DB you might need to get each row id and send sqls back to update them… it would be another story regarding performance, memory usage, stability, scalability etc)
easier to do atomic procedures (so you can be sure during that function the related records wont change by another call)
it is safer, since no one outside can see the relations

So, it is not really “Can I use something else?”, sure I can. But I am interested to keep developing using “common db standards” instead of going back years, when we did not had these features within databases.

Btw, this is one of the reason (not the only one), while I’m using my own DB running on my servers via uplink. So uplink gives a good workaround, but it cost you to manage a db server apart from anvil. In my case it is ok, since I already have it for other projects.

stefano.menci · January 30, 2023, 1:57am

I have a dedicated plan with SQL access. I have explored a little and I was able to create some function and use them in SQL queries.

I’m not an expert in DB administration and I don’t know the details of the Anvil setup, so I’m very cautious when I put my hands in there. I have not tried to create any custom indexes. I think it would be possible, but I never needed anything more than what’s available from the IDE.

I don’t think it is possible to manage triggers, constraints or other advanced features, but again, I never really needed them, so I never explored too deeply in that direction.

owen.campbell · January 30, 2023, 8:53am

If you have a plan with SQL access, you can do whatever you like in the db. (Which is exactly why it’s not available on shared plans)!

attila · January 30, 2023, 9:16am

thanks to clear this out!

stefano.menci · January 30, 2023, 1:36pm

I’m not 100% sure about the “whatever you like”.

I remember trying to do certain things (I don’t remember which things) and hitting the “you don’t have the permission to do this” wall.
Perhaps I could have asked the Anvil team to remove that wall, but the default permission settings did prevent me from… something.

And there were other things (again, I don’t remember which things) that I couldn’t do because of the structure of the database. I’m not an expert, so I was googling my way through it and I may very well be wrong, but I remember hitting another “this feature is not available with this kind of views” wall.

I would be very happy to be wrong on this front

p.colbert · January 30, 2023, 2:48pm

I absolute agree with your reasoning. In the desktop world, where I do have SQL access, that’s exactly what I do, and why. My Oracle database has many more CHECK and PRIMARY KEY and FOREIGN KEY constraints than tables. (It’s close to 300 tables…)

In the Anvil world, since I don’t have SQL access, at this time, the next best thing is to do it all at the Python level. I just wanted to assure you that this approach does work.

hugetim · January 30, 2023, 3:52pm

Can what you are describing (aside from “it is much faster, since way less rows or data leaving your database”) not be accomplished with a separate Anvil app with shared-table access to the database implementing these views or procedures, a separate app which could then be used as a dependency?

Is what you are describing different from what Uncle Bob calls “Gateways” in The Clean Architecture? I’m really just wanting to understand, as someone with only minimal dev experience outside of Anvil.

owen.campbell · January 30, 2023, 3:56pm

I maintain an SQL tutorial for Python programmers:

https://owencampbell.me.uk/sql_python_tutorial/

The intro page on there describes some of the circumstances where using SQL might be more efficient than other techniques.

attila · January 30, 2023, 5:29pm

@hugetim there are multiple aspects here, on one hand you have a design perspective, what would looks good and easy to maintain, handle, place for less bugs, etc and on the other hand you have technical differences.

I think @owen.campbell describes it well (did not checked it all), but just to give you a few simple example.

In general speaking, when you are making a small db or an app that don’t need further development for years or you are the only one to work with it could make it less important to use sql functions, views, indexes etc. IT does not mean it would not help, but it has less effect.
When you have a code which is written by multiple people, developed during years or maintained and new features added for further years, then these things could be vital.

You can paint a small frame of your car’s headlight with a cheap ready to use spray can of paint without any issue, but you will see kind of a big difference when you want to paint the whole car using spray cans.

Imagine you have 1 record with string… python strings are inmutable. This makes it easy to use but very bad for performance, because every string needs to be recreated. a = “b” + “c” is not 1 string but 1 created for b, one for c, one for the sum “bc” and one for the a as last assignment. When you’re doing something with a string in the db, it could spear a lot or resources!
Also, if you are working with such a resource hungry items (db related ofcourse) in a client code you are eating up memory in the client side (browser) which is harder to scale or monitor. You have no control over how much memory a user has and what other applications is running. Server side code (or db code) runs on server, so it uses memory on the server side and you can have a way to control it, monitor it properly (exactly how much memory is used by your app/db) and add more if needed or find a solution to save more if that is what you want.

if you design your DB using EER (diagrams) it helps to see the whole db and each table relations. Helps a lot when you are working on your sql. Same help you get from a stored procedure, since it simplify your request. You can “hide” all difficult join, id relations in there and use as simple function like “loging_user(name, passwd)” where the stored procedure will check user name, password, set last login time, make any other related db insert and finally return a true or the user[] as results.
You can achieve this using your method with another shared module. However, when you change something in the db (pretty common during development), you have to change the shared lib and might be the client code which use it, so min 2 but could be even 3 places, in 3 dependent codes (including your db). Using stored proc, you can manage it together the changes in the db and keep the db in working condition.
More code gives you more dependencies. And dependencies are one of the hell of programming, it force you to do much more work and somewhere always giving you headache.

If you looking at the performance, obviously keeping what fit inside db is better. Even in your example, you make an extra python call or even could be extra network transaction between your code and the shared module. Again performance, speed, memory.

Anvil gives you an easy way of working but easy does not works always. It is easy to get started, easy to make some small app but could be hard(er) to manage or scale when problems hits you.

You need to decide which part of the “easy” vs “more work” works out for you better in short or long term

hope this helps a bit to understand

p.colbert · January 30, 2023, 5:51pm

@hugetim, putting all that code, the sanity-checks, the consistency rules, inside the database, and having the database enforce them, means that you can rely on the database to hold your single source of truth, no matter how many different languages and programs you need to use outside of the database. It prevents a huge class of garbage in, garbage out problems from ever occurring in the first place. The database simply refuses to accept those kinds of garbage!

If the safeguards are not in the database, then they must be implemented outside the database. Everywhere. Consistently. The more, different programs that need access to the data, the harder it gets to maintain that consistency.

And as @attila points out, doing so outside the database is more costly in another way: database performance.

In short, doing these things inside the database scales up. Doing it outside does not.

stefano.menci · January 30, 2023, 7:14pm

Case 1: the app is in charge of the rules
In 2 years you need to do a little change, you don’t remember what rules the app needs to enforce, you don’t enforce them, and you end up with data inconsistencies or loss.

Case 2: the db server is in charge of the rules
In 2 years you need to do a little change, you try to do your thing and the database will refuse to end up with data inconsistencies or loss.

attila · January 30, 2023, 9:01pm

in two years, you will start to read docs! You have some, right?

btw, in the db all procedures are listed. Those are easy to check what they are doing. Much more easier then going through the client/server code to check how do you create a table and which other tables needs to be addressed. Stored procedures or views make it much simplier to get back on track… in 2 or 5 years later. At least they helps me.

But it is your own decisions which way to go, what do you want to use and what is in your favour. I’m just curious what I can and cannot do within Anvil and how… Since I can use my uplink code using my own db, I don’t have any problems. I can use my way to develop, and those who want to “simplify” the way of making db transaction, Anvil is doing a great job.

Just like forget css, js and html code you can also forget these “old fashion db designs and tools”. However if your project grows bigger, you might need to access css, js, html or sql as well.

stefano.menci · January 30, 2023, 9:04pm

Ahaha, that’s funny!

ianb · January 30, 2023, 10:12pm

As a person who thinks most things could use more SQL I couldn’t agree more, but I do know where they (Anvil) are coming from designing (with a priority anyway) for anvil to be as python oriented as possible. I fully expect some people love javascript as much as I don’t, and wish there were more of it in anvil.

I’m not saying this feature request is unwarranted, just that the escape hatch for not being able to do something in python/anvil has always been using uplink or building your own API or offsite BYOdb connection of some kind.

So just like javascript became more supported with the javascript bridge, I think this would be useful in the long term, but there are so many simplistic SQL access functions that we also need replicated interfaces for beyond just stored procedures.
Examples include the ability to UPDATE to the results of a SELECT erm .search() query, and some form of executemany that supports placeholders or some other way to take inserts, updates, or upserts that can digest an iterator at realistic Database write speeds.

They should do it, but it won’t be easy to do safely, so it won’t be fast to build well, so I wouldn’t expect it super soon is my point I guess.