Dev/Prod workflow & Data tables

This post is a bit long. I’m trying to develop and document a workflow that separates my Anvil DEV app, my local dev environment, and my Anvil PROD app. I thought I had figured it out (thanks to this forum), but I have hit a snag with Data tables (see below).

Workflow

My development workflow consists of 3 phases: Setup, Daily development, and Deployment. It handles versioning the code through these phases, and it also handles separation of Data tables.
My system uses 2 Anvil apps: DEV and PROD. The DEV version can be published for testing, so one could speak of 3 code versions.
It is based on information found here: Best Version Control Options. The code versioning is handled by Git.

Setup

  • Create app Test_App
  • Rename it to Test_App-DEV
  • Git clone the app to local dev env:
git clone ssh://theuser@anvil.works:2222/XXXXXXML4XKGY4I.git Test_App_DEV
  • Clone the app in the Anvil editor. This cloning also clones the tables and their contents, but not the git history.
  • Rename it to Test_App-PROD
  • If there are any tables that should be shared, this is the moment to fix those. (MIGHT be wrong; see below)
  • Check dependencies (should this app use the dev or the prod version of those?).
  • Re-enable the uplink
  • Add a git remote to the Anvil Prod app
git remote add PROD  ssh://theuser@anvil.works:2222/X2KKKKKKVGGYYAJ3D.git
  • Force push to the Prod git repo. This copies the version history.
git push -f PROD

Daily development

  • Make changes to Anvil DEV app or in local dev env. Git push and pull to sync between the two.
  • Publish the Anvil DEV to let other people test the app (I don’t bother putting this under version control).

Deployment

  • Manually sync data tables structure between Anvil DEV and PROD. (See below)
  • App secrets should be copied manually.
  • Sync dev env to PROD:
git push -f PROD

The problem
This procedure seems to do the trick for the code. However, I noticed that the data tables did not need upgrading. Then I noticed that the data tables had not been copied; instead, the PROD app referenced the tables of the DEV app. That was NOT intended.

Then I had a look at the anvil.yaml file (see below). I noticed that it contains the database schema. That’s probably fine, but it also contains a table_id.
Of course, when the code is synced through git, the table_id will reference the DEV app’s table(!). That’s not good.

Any suggestions on how to deal with DEV/PROD versions of data tables? TIA

And feedback on the workflow above is also welcome. Especially scrutiny of the git commands would be useful (because I don’t really know what I’m doing).

allow_embedding: false
package_name: Test_App_DEV
db_schema:
- name: First
  id: 101836
  python_name: first
  columns:
    0RnQeY0d0ME=:
      name: year
      type: number
      admin_ui: {order: 2, width: 200}
    PHt8GqjlmXo=:
      name: name
      type: string
      admin_ui: {order: 1, width: 200}
    kwGgKW0vlhI=:
      name: id
      type: string
      admin_ui: {order: 0, width: 200}
    r8Ow+KBAJcc=:
      name: firstname
      type: string
      admin_ui: {order: 3, width: 200}
  access: {python_name: first, app_id: X227SXPVGGYYAJ3D, server: full, client: search,
    table_mapping_name: null, table_mapping_id: null, table_id: 101836}
- name: Other
  id: 101879
  python_name: other
  columns:
    1pODDKJPXhs=:
      name: blabla
      type: string
      admin_ui: {order: 0, width: 200}
  access: {python_name: other, app_id: X227SXPVGGYYAJ3D, server: full, client: none,
    table_mapping_name: null, table_mapping_id: null, table_id: 101879}
name: Test App DEV
startup_form: Form1
renamed: true
runtime_options: {version: 2, client_version: '3', server_version: python3-full}
services:
- source: /runtime/services/tables.yml
  client_config: {}
  server_config: {auto_create_missing_columns: false}

You may be able to find PROD’s original table ids, in an earlier copy of PROD’s anvil.yaml file.
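For example, here is a quick sketch (assuming PyYAML is installed) of pulling the name → table_id mapping out of an anvil.yaml; the sample text is trimmed down from the schema posted above:

```python
import yaml

def table_ids(yaml_text):
    """Return a {table_name: table_id} mapping from an anvil.yaml's db_schema."""
    schema = yaml.safe_load(yaml_text).get("db_schema") or []
    return {t["name"]: t["access"]["table_id"] for t in schema}

# Trimmed-down sample based on the anvil.yaml posted in this thread:
sample = """
db_schema:
- name: First
  id: 101836
  access: {table_id: 101836}
- name: Other
  id: 101879
  access: {table_id: 101879}
"""
print(table_ids(sample))  # {'First': 101836, 'Other': 101879}
```

An earlier copy of the file can be read straight out of git history with something like `git show HEAD~3:anvil.yaml` (adjust the revision to taste).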

Hi @mjmare - I don’t think I got into the same level of detail you’re grappling with regarding data tables, but you might like to browse this wiki entry on the forum about Git commands and some variations on your approach?

That post was great! In fact it is what got me started experimenting with all this cloning and gitting. I should have included a link to your post, sorry.
I wanted to fine tune the procedure for my purposes. And also the post lacked detail (for me) on the git commands to use.
On rereading your post I noticed that you are probably aware of the data tables problem (see footnote in step 8). How do you handle this?


It was just a test program. No harm done.
However I’d like to solve this in a robust way. When (re)deploying an app I’d rather not have to remember table ids and manually fix them…

In theory, you could extract the table definitions mechanically (via code) from production’s anvil.yaml. They’re pretty easy to find. Then restore them into the revised file later, before committing anvil.yaml to your local repository.
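A rough sketch of that extract-and-restore step (assuming PyYAML, and that table names match between the two apps; the function name is my own invention):

```python
import yaml

def restore_prod_access(dev_yaml_text, prod_yaml_text):
    """Return dev's anvil.yaml text with each table's access block
    (table_id, app_id, ...) replaced by the matching table's block
    from prod, so the committed file keeps prod's table ids."""
    dev = yaml.safe_load(dev_yaml_text)
    prod = yaml.safe_load(prod_yaml_text)
    prod_access = {t["name"]: t["access"] for t in prod.get("db_schema") or []}
    for t in dev.get("db_schema") or []:
        if t["name"] in prod_access:
            t["access"] = prod_access[t["name"]]
            t["id"] = prod_access[t["name"]]["table_id"]
    return yaml.safe_dump(dev, default_flow_style=False)
```

You would read both files, run them through this, and write the result back over the dev copy of anvil.yaml just before committing.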

I’m afraid I never solved it @mjmare. My workaround was to write some admin functions, run locally over the Uplink (anvil.server.connect), to overwrite/reset data tables. Nothing polished enough or safe enough to share, I’m afraid!

Could you expand a little on this?

I was thinking in the direction of checking out with git both DEV and PROD versions to local dev and use some hacky script to copy over the table schema.

Hi there,

I can confirm that the table IDs in the data_tables portion of your anvil.yaml do not determine which tables your application is connected to. Your application’s tables are 100% configured in the Data Tables section of the Anvil editor, independent of which version of code your app is running. So you can safely pull and push your apps without having to edit table IDs.

(Why is that section in the YAML at all, you ask? It’s so that the standalone App Server knows what tables to create when you launch with --auto-migrate)


Hi @meredydd
How did the PROD app get access to the DEV app’s tables (as if they were shared), then? I certainly did not share the DEV app’s tables with PROD.
I assumed it was because the table_ids from PROD were overwritten by the DEV app’s anvil.yaml. That seemed perfectly logical to me (which is no guarantee that it is!)

Hi @mjmare,

How did you create the prod app? If you cloned the dev app, then the prod app got a copy of the dev app’s tables and data (changing one won’t reflect in the other). The only way to share tables between apps is to add a table from another app in the Data Tables editor.

Please see first post in this thread. It details the workflow.

Hi @meredydd,

I have completely recreated the test DEV and PROD apps. It seems to work according to your remarks: the db_schema in anvil.yaml is not used in the PROD app. That simplifies things somewhat.

Do you have suggestions on how to sync the schema (not the contents) from DEV to PROD? There is no way to programmatically modify the schema, right?

PS I have modified my workflow description accordingly for anyone interested. I’d be happy to share.

I’ve run into the exact same issues you have. I don’t have a solution for programmatically updating schemas between prod and dev, but I did write the following Python code that would identify differences between two versions of my app. I wrote this all into a Jupyter notebook, but I can’t attach it here, so I’ll include it as a python script.

Note: you’ll need to have PyYAML available, and will need to update the paths to your dev and prod versions of your app. I clone each app onto my machine as a separate repository (directory) and use an rsync command to copy all but specific files between the two. Then I use this script to identify any schema changes I need to make between the two apps. Not a great solution, but it’s the best I’ve come up with so far.

import yaml

# Load each app's schema from its anvil.yaml.
with open("path-to-prod/anvil.yaml", "r") as f:
    prod_sp = yaml.safe_load(f)

with open("path-to-dev/anvil.yaml", "r") as f:
    dev_sp = yaml.safe_load(f)

print("Prod tables:", len(prod_sp["db_schema"]))
print("Dev tables:", len(dev_sp["db_schema"]))

dev_table_names = [t["name"] for t in dev_sp["db_schema"]]

# Find tables that exist in prod but are missing from dev.

for t in prod_sp["db_schema"]:
    if t["name"] not in dev_table_names:
        print("New Table:", t["name"])
        for v in t["columns"].values():
            print(f"\tCol: {v['name']} - Type: {v['type']}")

def get_table_schema(schema_dict, table_name):
    """Return the table definition with the given name, or None."""
    for t in schema_dict["db_schema"]:
        if t["name"] == table_name:
            return t
    return None

# Find missing tables and columns.
#
# Assumption: schema modifications start from dev and are copied to
# prod as needed. The primary schema is intended to be the "source",
# and the secondary schema is the target that should end up the same
# as the primary.

missing_tables = []
missing_columns = []
primary_schema = prod_sp
secondary_schema = dev_sp


for t in primary_schema["db_schema"]:
    secondary_table = get_table_schema(secondary_schema, t["name"])
    if secondary_table is None:
        missing_tables.append(f"Missing Table: {t['name']}")
        continue

    secondary_columns = [c["name"] for c in secondary_table["columns"].values()]
    for v in t["columns"].values():
        if v["name"] not in secondary_columns:
            missing_columns.append(f"Table: {t['name']} - Missing {v['name']} - {v['type']}")

print(missing_tables, missing_columns)

I’d be interested in your revised workflow description.

@hugetim I can’t post a pdf so here’s a link to the Evernote note: https://www.evernote.com/shard/s3/sh/5fd7027d-2956-91d0-af9d-26cc278e310a/af0f4a6405ecd280ece5447bd9c5da4a


The code for auto-detecting schema differences is terrific, thanks for posting that! It needed minor adjustments for my workflow, but is going to save me a lot of time when I make a quick schema change and forget to note it for migration to production.
