Best practices: merging branches that use different databases

stefano.menci · May 24, 2023, 9:09pm

I have created a new Development environment linked to the dev branch. The Development environment has its own database that:

shares some tables with the Production environment
has some new tables
has some tables with the same name as tables from the Production environment, but with some columns of different type or with more columns

What is the best way to go to production?
What happens if I simply merge dev into prod, considering that the two databases are not compatible with each other?
Will the merge tell prod to use the new database?
Or should I stay away from the merge and instead:

change the url of the Production environment
assign the public url to the Development environment
run my last minute production tests
if something goes wrong immediately undo steps 2 and 1, fix the problems and restart
disable scheduled tasks on Production
setup scheduled tasks on Development
delete the Production environment
rename the Development environment into Production
make sure other settings like the email test mode have the appropriate setting in both Development and Production environments
uplink keys… what do I do here? Is it possible to link the uplink keys that were in the old Production to the new Production? (I think I’m using uplink only for testing, so using new keys shouldn’t be a big problem)

jshaffstall · May 24, 2023, 9:29pm

In my experience, no. That would be a disaster, since the dev database and prod database have different data.

My workflow when I have database changes is (simplified, I have a testing environment, too):

Migrate the db changes from dev to prod (using the schema mismatch tool)
Merge the code changes

Presumably I’ve tested everything already using the url of the dev environment. There are a variety of situations that could affect all of this:

The db changes are additive (adding columns, tables, etc). Step 1 from above is harmless in that case.
The db changes are at least partly destructive (removing columns, tables, etc). Step 1 and 2 need to be done in quick succession, OR I need to do the db changes in two stages…migrate the additive changes, merge the code, then migrate the destructive changes, testing the development environment after making the destructive changes.

stefano.menci · May 24, 2023, 10:13pm

I already asked what the schema is for and you tried to answer. Maybe you did answer, but I still don’t understand.

In this very app, I see the schema mismatch warning on the production database, I resolve the problem, and the warning appears on the development database. I resolve that problem, and it goes back to the production one. Rinse repeat. I sent the video of this back and forth to the support and they said something like “it’s working as designed, but it’s bad design so we will fix it soon.”

The bottom line is that I still don’t understand what the schema is for, I have that mismatch warning on most of my apps and they don’t care. As far as I understand that schema is still useless. If it was useful, my apps wouldn’t work.

~~My changes are destructive~~. My changes are not destructive, they were at a certain point, but at the last check I have some new tables, some new columns and some columns deleted. I can add the new columns, I can share the new tables with the development database and I can leave the columns that will be deleted for now and delete them later.

Mmmmh… I think I will do these changes to the production database before trying the merge. But I will do the changes by hand, I will not use that tool. I don’t understand it and don’t trust it. I only understand that it doesn’t work.

jshaffstall · May 24, 2023, 10:37pm

That isn’t quite correct. You don’t understand it, true, but it works fine if you use it based on what it’s actually doing.

There are two sources of truth for the database design.

What’s actually in the database
What’s stored in the source code (in the yaml file)

You get the schema mismatch message when those two don’t match. That is normal to get the schema mismatch in your production database. You do not want to fix that until you’re ready to migrate changes from dev to prod, and then you want to do it in a specific way.

My workflow as I’m working on in my development environment is to make changes to the development database. If I get schema mismatch messages on the development database, I know that’s just because Anvil’s failed to keep the source code up to date with the development database. I fix that message by treating the database as the source of truth by clicking The Schema of the database Is Correct:

That updates the source code with the schema so it properly reflects the actual development database.

After I’ve fully tested the development changes, and I know my code and database work, now it’s time to migrate both the database and the code over to prod. To migrate the database changes to prod, I click to resolve the schema mismatch in the prod database, but this time click the The Schema of the source code is correct button. That migrates the prod database schema to match the development database schema (you get a list of changes that’ll be made, so you can make sure everything’s there).

The TLDR version is the schema mismatch messages are normal for the production database. Don’t worry about them until you’re ready to migrate changes from dev to prod.

p.colbert · May 25, 2023, 2:35pm

Without seeing any details, system behavior like this sounds to me like a result from a big-pattern mismatch: Anvil’s design (and implementation) expected you to share (or switch between) databases in its way. You’re doing it your way, which they didn’t foresee, or take into account.

Both ways make sense, to different people, in different circumstances. But their current model can’t accommodate yours, hence their reply.

I can see how this might occur. Bear with me, please, as I try to think this through.

My best guess, based on observation, is that the IDE’s model of databases is geared for a slow, steady, one-step-at-a-time model of database evolution. Schema changes are (expected to be) rare, introduced one at a time, in one branch, and propagated out to all instances, before the next change is even considered.

This can be accomplished with a copy-schema-on-write model: each time the IDE brings an updated schema into contact with a different database or branch, just copy it.

With permission, of course. The IDE can never be sure which schema is “the” new one, or “the” old one.

But with an experimental, or “feature branch” approach, multiple, independent database changes may be introduced, each in its own branch, all coexisting simultaneously in the same “development” database. Some schema changes may be abandoned as failed experiments. Others may move forward, be kept as part of their feature, to be merged in when their feature goes into Testing (or Production).

The result: a “development” database which has accumulated a number of (unrelated) changes, in its single, unique schema. It’s the sum total of all the changes that have occurred up to that point in time, no matter where they came from.

It’s this kind of parallel development situation, where Anvil’s database evolution model, as I understand it so far, no longer fits. To merge in a “feature branch”, ideally you’d merge in just the branch-specific changes, to the source code, and to the database. E.g., add just that branch’s new column, ignoring all the other schema changes that may have happened to the same “development” database.

Unfortunately, for Anvil’s Schema-comparison tool, a Schema is all-or-nothing. It sees all the differences between that source schema, and the target schema. It can’t isolate just the one relevant database difference provided by that feature branch.

If Anvil’s tool is used to apply database changes, then all the differences it sees will be applied (or erased), not just the one that was intended by the branch.

In that case, you know better than the IDE does, so the best course of action would be to ignore Anvil’s schema tool, and apply that specific schema changeset manually. Which sounds like exactly what you’re doing:

Am I making sense, so far?

stefano.menci · May 25, 2023, 4:10pm

Mmmh… Maybe I’m starting to understand what the schema is for: it helps with getting database A to have the same schema as database B.

There is a button that makes the schema consistent with the app’s database, but I don’t see a button that makes it consistent with another database. So, I may be understanding what it is for, but I still don’t understand how to use it.

If my understanding is correct, that the schema is used to migrate a database from its old schema to the new schema of another database, then I am correct when I say that the schema is useless.

It would be much simpler to have one button that asks you to pick one database to modify and one database to use as template. Simple and clear. No need for a schema.

So, considering that there is no need for a schema for this operation, what else is the schema for?
Why do I need to keep pushing those buttons that do something that doesn’t work?

You make a good point here, but if they had taken it into account, maybe the perpetual back and forth between two broken states wouldn’t happen, but there would still be a redundant schema and all the machinery to maintain it.

As far as I understand (which may be not that far), the equation is simple:
The app is linked to a database + the database has a schema = the app has a schema.

Adding a second schema definition to the app is looking for trouble. The second definition risks to get inconsistent, and indeed it does, again and again.

I never thought that Anvil would give me a tool to change the schema of a database. Perhaps that’s why, together with that clumsy UI, I couldn’t figure out what the schema is for.

In my experience schema migrations comes with data transformation. It is never enough to simply add a column or a table, it always comes with changes on how some information is stored, that is I always need a script to read the old data from the old columns and write it to the new columns in the new format.

But I understand that there may be simple cases where an automatic schema update would help.

jshaffstall · May 25, 2023, 4:52pm

My previous post describes the steps.

Think of the schema as a sanity check, a way of answering the question, “Is Database A compatible with the code?” If you get a schema mismatch message, then the answer is No. That’s the normal answer for anything but the development database.

At the point when you want the answer to be Yes for the production database, then you migrate changes from the schema. You don’t do it from another database, because you’re trying to make production compatible with the code.

p.colbert · May 25, 2023, 4:57pm

I think that’s mostly what the IDE’s Schema support was aimed at: folks who never even had to think about databases, and now they’re stuck with one. They’ve made the odd, occasional change to their Development db, and would be well-served to be reminded to make the same changes to Production’s db – when they try to connect that new code to Production’s db.

For more complicated cases, you and I know our databases (and our feature-packs) better than the IDE can, and we should take care of database migrations ourselves.

stefano.menci · May 25, 2023, 5:34pm

I don’t agree.
All my apps have mismatching schemas, and all my apps work just fine.
The apps work well, the auto-completion works well, the only thing that doesn’t work is the UI that in some apps, the ones with multiple databases, can’t get rid of the mismatch warning on both.

There is a warning, I have no way to remove it… it doesn’t help my sanity at all.

I don’t see those steps. I see them in your previous answer to my previous question, but… I’m still in the dark.

I can think of the following workflow that would make sense:

check out dev
click on the button to fix the schema mismatch and make sure the app schema matches the database
check out prod
click on the button to fix the schema mismatch and make sure the app schema matches the database
merge dev into prod
click on the button to update the database to match the schema

Unfortunately this workflow doesn’t work, because step 2 breaks the schema of prod and step 4 breaks the schema of dev, so it’s impossible to have the app schema matching the database on both dev and prod.

At this point what do I do?
Do I trust that button that says it will modify my production database, making decisions based on two schemas, one of which is wrong?
I can’t trust it!

This one I would understand what it does, I wouldn’t need a workflow filled with checkouts and merges and I would trust it.

jshaffstall · May 25, 2023, 5:55pm

Copied from my post above, broken out into numbered steps, if that helps, and added some emphasis:

My workflow as I’m working on in my development environment is to make changes to the development database.
If I get schema mismatch messages on the development database, I know that’s just because Anvil’s failed to keep the source code up to date with the development database. I fix that message by treating the database as the source of truth by clicking The Schema of the database Is Correct
After I’ve fully tested the development changes, and I know my code and database work, now it’s time to migrate both the database and the code over to prod. To migrate the database changes to prod, I click to resolve the schema mismatch in the prod database, but this time click the The Schema of the source code is correct button.

The warning is telling you the production database is not compatible with your current source code. That is normal. The only time the production data base will be compatible with your current source code is if you have not made any changes to your development database. That’s just part of having different databases (with potentially different structures) for development and production. Production will be in sync with the current development code only at specific points in time.

All this works like a treat and is perfectly reliable, assuming that you’re using it as intended (or at least as I described above, which works in my projects).

p.colbert · May 25, 2023, 6:17pm

Which of the two buttons are you clicking, here?

stefano.menci · May 25, 2023, 6:41pm

I have not tried the workflow, because I don’t trust it, but I would click the button that updates the app schema so it matches the database.

I don’t understand how this work.
You are not mentioning any merges or any checkouts. Are you doing all of this while the dev branch is checked out?

jshaffstall · May 25, 2023, 6:53pm

Yes, I do this while the dev branch is active. That’s how Anvil knows what the development schema is, because it’s stored in the source code.

The steps above bring the production database in sync with the development database. Merging the code is a separate step that happens either before or after the database merge, depending on what’s changed in the database.

Be sure you’re clicking the right buttons, though! You’ll use both, depending on which step you’re in.

I highly recommend playing with it in a test app.

p.colbert · May 25, 2023, 7:02pm

The labelling in the IDE leaves a lot to be desired. In Anvil’s model, the “app” doesn’t have “a” schema. Each database has its own schema, and so does each source-code branch.

When a discrepancy is detected between a branch and a database, Anvil gives you three options:

Alter the database to match the branch’s schema.
Copy the database’s schema into the branch.
Close the IDE tab that’s complaining about the discrepancy, and ignore the whole thing. There’s no label for this option; it’s implicit.

#3 is rarely mentioned, but it’s the best option when you have a subset of schema changes to make, e.g., the changes belonging to a specific feature.

p.colbert · May 25, 2023, 8:55pm

I’ve added this general topic to Content Suggestions:

stefano.menci · May 25, 2023, 10:15pm

Let’s see if I understand.
I’m going to answer my own question, let’s see if I’m getting it right.

Question
I have done some changes to both the code and the database structure in the dev branch.
How do I merge the code from dev to prod and change the production database to match the structure of the development database?

Answer
These are two independent steps.

For the code, you merge dev to prod the way you normally do, regardless of any changes in the database.

For the database, you use a tool that does the following two operations:

database → schema - takes a database in input and creates the schema that defines it
schema → database - takes a schema in input and changes a database to match it

You can use this tool to get the schema from the development database and use it to change the production database.

Have you ever noticed a “Resolve…” button under a “Schema Mismatch” warning at the bottom of a database configuration? That button gives access to this tool.

The schema is the middleman between the two operations. The first operation takes one database in input and generates a schema, the second one takes a schema and modifies a database so it matches the schema.

The tool has always access to all the databases in the app. Databases can be linked to environments and environments are linked to branches, but the tool always sees all the databases, regardless of which branch is currently checked out.

On the other hand, the tool has only access to one schema, the one stored in the yaml file of the active branch. You can use the tool to get the schema from the development database, store it in the current yaml file, then use the tool again to get the schema from the yaml file and use it to change the production database. It doesn’t matter which branch is active, the yaml of the active branch is used as a clipboard to temporarily store the schema. The schema is stored in the yaml file permanently and its changes will be included in the next merge, but you can think of it as a temporary clipboard.

The IDE always checks whether the current schema is consistent with each database and shows a warning for every database that is not consistent with it. This can be misleading, because the schema stored in the prod branch may be consistent with the production database, but if you check out the dev branch, the IDE will tell you that the schema of the production database doesn’t match, and, worse, invite you to change it.

The schema is used only during the schema → database operation described above, so, with the exception of when you are changing your database, you can ignore those schema mismatch warnings.

Assuming that this answer is correct, thank you @jshaffstall and @p.colbert for patiently holding my hand and shedding some light on my darkness.

It would have been clearer if the command that changes a database was together with other database management commands, instead of being disguised as a “resolve” button that resolves a temporary issue. I think it would belong here:

If the app needs the schema for other reasons (like generating the tables the first time it runs on the open source server as mentioned by @jshaffstall), then the schema should be automatically stored in the branch a database is linked to (via the environment) and kept updated every time the database changes. If the database is changed by another app, then the IDE should update the schema when the app is loaded and the change is detected. It could show a warning to the user, but I don’t see the point.

jshaffstall · May 25, 2023, 10:32pm

That all sounds right to me.

p.colbert · May 26, 2023, 1:50pm

If there’s a difference, then you can force the branch’s schema to be updated, as described in my #2 above. From then on, as long as you have that branch active in the IDE every time that you use the IDE to change its linked database, the IDE will keep both copies of the schema in sync.

Edit: if you use a different branch to change that database’s schema, then you’ve bypassed that automatic synchronization. This may result in surprising warnings later on. I suspect that you’ve experienced this already.