dbt + Materialize demo: Running dbt’s jaffle_shop with Materialize
Let's demonstrate how to manage streaming SQL in Materialize with dbt by porting the classic dbt jaffle-shop demo scenario to the world of streaming.
We've recently partnered with dbt and community members to release a beta Materialize dbt adapter that allows you to transform your streaming data in real time using Materialize as your data warehouse.
This tutorial serves as a practical hands-on demonstration of the adapter. In this case, we are using static not streaming data, but the configuration and setup process is the same.
dbt's jaffle_shop + Materialize
If you've used dbt, odds are that you've run across dbt's beloved
jaffle_shop demo project.
jaffle_shop allows users to quickly get up and running with dbt, using some spoofed, static data for a fictional jaffle shop.
At Materialize, we specialize in maintaining fast and efficient views over your streaming data. While we work on hosting a public source of demo streaming data for analytics, we wanted to provide those familiar with dbt with an easy way to get up and running with our
dbt-materialize adapter and
jaffle_shop's static data.
Note: This demo won’t highlight what’s powerful about Materialize. For that, check out our
wikirecent-dbt demo or our docs!
Setting up a jaffle_shop with Materialize
Setting up the
jaffle_shop project with Materialize is similar to setting it up with any other data warehouse. The following instructions are based off the traditional
jaffle_shop steps with a few Materialize-specific modifications:
Follow the first three steps of the
jaffle_shopinstructions, install dbt, clone the
jaffle_shoprepository, and navigate to the cloned repo on your machine.
In your cloned
dbt_project.yml, make the following changes to the model materializations:
models: jaffle_shop: marts: core: materialized: materializedview intermediate: materialized: view staging: materialized: view tags: ["staging", "hourly"]
Tip: Only materializing your
corebusiness models as materialized views, without materializing your intermediate or staging views, ensures that you're only using the memory you need in Materialize.
Install the dbt-materialize plugin. You may wish to do this within a Python virtual environment on your machine:
python3 -m venv dbt-venv source dbt-venv/bin/activate pip install dbt-materialize
Install and run Materialize. The linked instructions will guide you through running a Materialize instance on your local machine. (Our cloud offering is being developed, you can register for the private beta here!)
jaffle_shopdbt profile that will connect to Materialize. The following profile will connect to a Materialize instance running locally on your machine. The
hostparameter will need to be updated if it's self-hosted in the cloud or run with Docker:
jaffle_shop: outputs: dev: type: materialize threads: 1 host: localhost port: 6875 user: materialize pass: password dbname: materialize schema: jaffle_shop target: dev
profiles.ymlyou're using for this project is not located at
~/.dbt/, you will have to provide additional information to use the
dbtcommands later on.
Check that your newly created
jaffle_shopprofile can connect to your Materialize instance:
Load the static
jaffle_shopdata into Materialize:
Run the provided models:
In a new shell, connect to Materialize to check out the
jaffle_shopdata you just loaded:
# Connect to Materialize psql -U materialize -h localhost -p 6875
# See all the newly created views materialize=> SHOW VIEWS IN jaffle_shop; # Output: name ------------------- customer_orders customer_payments dim_customers fct_orders order_payments raw_customers raw_orders raw_payments stg_customers stg_orders stg_payments # See only the materialized views materialize=> SHOW MATERIALIZED VIEWS IN jaffle_shop; # Output: name --------------- dim_customers fct_orders raw_customers raw_orders raw_payments # Check out data in one of your core models materialize=> SELECT * FROM jaffle_shop.dim_customers WHERE customer_id = 1; # Output: customer_id | first_order | most_recent_order | number_of_orders | customer_lifetime_value ------------+-------------+-------------------+------------------+------------------------- 1 | 2018-01-01 | 2018-02-10 | 2 | 33
To see what else you can do with your data in Materialize, check out our docs.
Test the newly created models:
Generate and view the documentation for your
dbt docs generate dbt docs serve
This walkthrough should leave you with a better understanding of how to integrate Materialize into your existing dbt workflow and start materializing views in true real-time fashion.
To get a better understanding of the upstream work necessary for getting your data into Materialize, start with our docs and join us in Slack if you have any questions.