dbt adapter: support for sink cutover in blue/green deployments

The initial implementation of our automated blue/green workflow was transparent to consumers that connect directly to Materialize (e.g. your dashboards), but not to consumers depending on sinks (e.g. your applications feeding off Kafka). What happens when you're sending data out to an external system and need to deploy schema changes?

Well, things get a little trickier. The good news is that we're now handling the tricky parts for you in the blue/green deployment workflow, too!

Sink cutover 😱

In a nutshell: if you change the definition of the object a sink depends on, you must guarantee that the sink doesn't reprocess data that was already sent downstream, and that it keeps processing new data without a gap. That sounds... hard, right?

To solve this, we implemented a new command in Materialize (ALTER SINK) that cuts a sink over to a new definition of its upstream dependency without a blip. We then embedded this step into the existing dbt workflow, so sink cutover is handled seamlessly when environments are swapped. Nothing changes for you as a user, other than the ability to also blue/green sinks using the standard set of dbt macros:

dbt run-operation deploy_init    # Create a clone of your production environment, excluding sinks (if relevant) 
dbt run --vars 'deploy: True'    # Deploy the changes to the new deployment environment 
dbt run-operation deploy_await   # Wait for all objects in the deployment environment to be hydrated (i.e. lag < 1s) 
                                # Validate the results (important!) 
dbt run-operation deploy_promote # Swap environments and cut sinks over to the new dependency definition (if relevant) 

For a full rundown of the automated blue/green deployment workflow using dbt, check out the updated documentation.
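
If you run these steps from CI, the sequence above could be wrapped in a small shell function so that the swap only happens when the earlier steps and your own validation succeed. This is a sketch, not something the adapter ships: blue_green_deploy and the VALIDATE_CMD variable are hypothetical names, and the validation command is whatever check you trust.

```shell
# Hypothetical wrapper around the four documented steps.
# VALIDATE_CMD is an assumption: point it at a check of your choice
# (for example, a script comparing results between environments).
# It defaults to the no-op `true`.
blue_green_deploy() {
  validate_cmd="${VALIDATE_CMD:-true}"

  # Create the deployment environment, deploy into it, and wait for
  # hydration. Any failure aborts before the swap.
  dbt run-operation deploy_init &&
    dbt run --vars 'deploy: True' &&
    dbt run-operation deploy_await || return 1

  # Validate the results; skip the promotion if the check fails.
  if ! $validate_cmd; then
    echo "validation failed; skipping deploy_promote" >&2
    return 1
  fi

  # Swap environments and cut sinks over (if relevant).
  dbt run-operation deploy_promote
}
```

Calling blue_green_deploy with VALIDATE_CMD unset runs all four steps in order; setting it to a failing command leaves production untouched.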

Bonus: blue/green deployment dry runs

As we worked this change in, we thought it'd also be handy to give you a way to double-check that the blue/green deployment workflow is going to do what you expect it to do. To perform a dry run of the environment swap step, and validate the sequence of commands that dbt will execute, you can now pass the dry_run argument to the deploy_promote macro:

dbt run-operation deploy_promote --args 'dry_run: True' 

 
10:52:30  DRY RUN: Swapping schemas public and public_dbt_deploy 
10:52:30  DRY RUN: ALTER SCHEMA "public" SWAP WITH "public_dbt_deploy" 
10:52:30  DRY RUN: Swapping clusters quickstart and quickstart_dbt_deploy 
10:52:30  DRY RUN: ALTER CLUSTER "quickstart" SWAP WITH "quickstart_dbt_deploy" 
10:52:30  DRY RUN: No actual schema or cluster swaps performed. 
10:52:30  DRY RUN: ALTER SINK "materialize"."sink_schema"."sink" SET FROM materialize."public"."my_dbt_model" 
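
The dry-run log mixes descriptive messages with the SQL that would be executed. To review just the statements, a small filter like the one below can help. It is a sketch that assumes the log format shown above; dry_run_statements is a hypothetical helper, not part of the adapter.

```shell
# Print only the SQL statements found in dry-run log lines read on stdin.
# Assumes each line looks like "<timestamp>  DRY RUN: <message>" and that
# the statements of interest start with ALTER, as in the sample output.
dry_run_statements() {
  sed -n 's/^.*DRY RUN: \(ALTER .*\)$/\1/p' | sed 's/[[:space:]]*$//'
}

# Possible usage, assuming dbt writes its log to stdout in your setup:
#   dbt run-operation deploy_promote --args 'dry_run: True' | dry_run_statements
```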

Try it out!

These improvements are available in the latest version of the dbt-materialize adapter (v1.8.2). To upgrade, run:

pip install --upgrade dbt-materialize


As a reminder, if you're running a pre-1.8 version of the adapter, you have to run a few extra commands to upgrade due to changes to the underlying package structure in dbt Core:

pip install --force-reinstall dbt-adapters
pip install dbt-postgres --upgrade

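
If you're not sure whether the extra commands apply to you, one option is to check the installed adapter version first. A minimal sketch: is_pre_1_8 is a hypothetical helper, and it assumes a plain dotted version string such as 1.7.8.

```shell
# Return success (0) if the given version is older than 1.8.
# Only the major and minor components are compared.
is_pre_1_8() {
  major=${1%%.*}   # text before the first dot
  rest=${1#*.}     # text after the first dot
  minor=${rest%%.*}
  [ "$major" -lt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -lt 8 ]; }
}

# Possible usage, reading the version reported by pip:
#   installed=$(pip show dbt-materialize | sed -n 's/^Version: //p')
#   if is_pre_1_8 "$installed"; then echo "run the extra upgrade commands"; fi
```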

Have you tried out automated blue/green deployments? We'd love to hear any and all feedback on this new workflow, as well as requests for new functionality.
