dbt & Materialize: Streamline Jaffle Shop Demo | Materialize

Jessica Laughlin
March 24, 2021

Introduction

We've recently partnered with dbt and community members to release a beta Materialize dbt adapter that allows you to transform your streaming data in real time using Materialize as your data warehouse.

This tutorial serves as a practical hands-on demonstration of the adapter. In this case, we are using static not streaming data, but the configuration and setup process is the same.

dbt's jaffle_shop + Materialize

If you've used dbt, odds are that you've run across dbt's beloved jaffle_shop demo project. jaffle_shop allows users to quickly get up and running with dbt, using some spoofed, static data for a fictional jaffle shop.

At Materialize, we specialize in maintaining fast and efficient views over your streaming data. While we work on hosting a public source of demo streaming data for analytics, we wanted to provide those familiar with dbt with an easy way to get up and running with our dbt-materialize adapter and jaffle_shop's static data.

Note: This demo won’t highlight what’s powerful about Materialize. For that, check out our wikirecent-dbt demo or our docs!

Setting up a jaffle_shop with Materialize

Setting up the jaffle_shop project with Materialize is similar to setting it up with any other data warehouse. The following instructions are based off the traditional jaffle_shop steps with a few Materialize-specific modifications:

  1. Follow the first three steps of the jaffle_shop instructions, install dbt, clone the jaffle_shop repository, and navigate to the cloned repo on your machine.
  2. In your cloned dbt_project.yml, make the following changes to the model materializations:
1
models:
2
    jaffle_shop:
3
        marts:
4
            core:
5
                materialized: materializedview
6
                intermediate:
7
                    materialized: view
8
        staging:
9
            materialized: view
10
            tags: ["staging", "hourly"]
11

text
  1. Tip: Only materializing your core business models as materialized views, without materializing your intermediate or staging views, ensures that you're only using the memory you need in Materialize.
  2. Install the dbt-materialize plugin. You may wish to do this within a Python virtual environment on your machine:
1
python3 -m venv dbt-venv
2
source dbt-venv/bin/activate
3
pip install dbt-materialize
4

bash
  1. Install and run Materialize. The linked instructions will guide you through running a Materialize instance on your local machine. (Our cloud offering is being developed, you can register for the private beta here!)
  2. Create a jaffle_shop dbt profile that will connect to Materialize. The following profile will connect to a Materialize instance running locally on your machine. The host parameter will need to be updated if it's self-hosted in the cloud or run with Docker:
1
jaffle_shop:
2
    outputs:
3
        dev:
4
            type: materialize
5
            threads: 1
6
            host: localhost
7
            port: 6875
8
            user: materialize
9
            pass: password
10
            dbname: materialize
11
            schema: jaffle_shop
12

13
    target: dev
14

text
  1. If the profiles.yml you're using for this project is not located at ~/.dbt/, you will have to provide additional information to use the dbt commands later on.
  2. Check that your newly created jaffle_shop profile can connect to your Materialize instance:
1
dbt debug
2

bash
  1. Load the static jaffle_shop data into Materialize:
1
dbt seed
2

bash
  1. Run the provided models:
1
dbt run
2

bash
  1. In a new shell, connect to Materialize to check out the jaffle_shop data you just loaded:
1
# Connect to Materialize
2
psql -U materialize -h localhost -p 6875
3

bash
1
# See all the newly created views
2
materialize=> SHOW VIEWS IN jaffle_shop;
3
# Output:
4
    name
5
-------------------
6
customer_orders
7
customer_payments
8
dim_customers
9
fct_orders
10
order_payments
11
raw_customers
12
raw_orders
13
raw_payments
14
stg_customers
15
stg_orders
16
stg_payments
17

18
# See only the materialized views
19
materialize=> SHOW MATERIALIZED VIEWS IN jaffle_shop;
20
# Output:
21
    name
22
---------------
23
dim_customers
24
fct_orders
25
raw_customers
26
raw_orders
27
raw_payments
28

29
# Check out data in one of your core models
30
materialize=> SELECT * FROM jaffle_shop.dim_customers WHERE customer_id = 1;
31
# Output:
32
customer_id | first_order | most_recent_order | number_of_orders | customer_lifetime_value
33
------------+-------------+-------------------+------------------+-------------------------
34
          1 | 2018-01-01  | 2018-02-10        |                2 |                      33
35
bash
  1. To see what else you can do with your data in Materialize, check out our docs.
  2. Test the newly created models:
1
dbt test
2

bash
  1. Generate and view the documentation for your jaffle_shop project:
1
dbt docs generate
2
dbt docs serve
3

bash

Conclusion

This walkthrough should leave you with a better understanding of how to integrate Materialize into your existing dbt workflow and start materializing views in true real-time fashion.

To get a better understanding of the upstream work necessary for getting your data into Materialize, start with our docs and join us in Slack if you have any questions.

Jessica

Jessica Laughlin

Head of Product, Materialize

Jessica joined Materialize as the fifth engineer, coming from Goldman Sachs and Bluecore. She leads the product organization, focused on shipping features that make Materialize users successful in production.