Ecosystem

dbt + Materialize demo: Running dbt’s jaffle_shop with Materialize

Introduction

We’ve recently partnered with dbt and community members to release a beta Materialize dbt adapter that allows you to transform your streaming data in real time using Materialize as your data warehouse.

This tutorial serves as a practical hands-on demonstration of the adapter. In this case, we are using static not streaming data, but the configuration and setup process is the same.

dbt’s jaffle_shop + Materialize

If you’ve used dbt, odds are that you’ve run across dbt’s beloved jaffle_shop demo project. jaffle_shop allows users to quickly get up and running with dbt, using some spoofed, static data for a fictional jaffle shop.

At Materialize, we specialize in maintaining fast and efficient views over your streaming data. While we work on hosting a public source of demo streaming data for analytics, we wanted to provide those familiar with dbt with an easy way to get up and running with our dbt-materialize adapter and jaffle_shop‘s static data.

Note: This demo won’t highlight what’s powerful about Materialize. For that, check out our
wikirecent-dbt demo or our docs!

Setting up a jaffle_shop with Materialize

Setting up the jaffle_shop project with Materialize is similar to setting it up with any other data warehouse. The following instructions are based off the traditional jaffle_shop steps with a few Materialize-specific modifications:

  1. Follow the first three steps of the jaffle_shop instructions, install dbt, clone the jaffle_shop repository, and navigate to the cloned repo on your machine.

  2. In your cloned dbt_project.yml, make the following changes to the model materializations:

    models:
        jaffle_shop:
            marts:
                core:
                    materialized: materializedview
                    intermediate:
                        materialized: view
            staging:
                materialized: view
                tags: ["staging", "hourly"]
    

    Tip: Only materializing your core business models as materialized views, without materializing your intermediate or staging views, ensures that you’re only using the memory you need in Materialize.

  3. Install the dbt-materialize plugin. You may wish to do this within a Python virtual environment on your machine:

    python3 -m venv dbt-venv
    source dbt-venv/bin/activate
    pip install dbt-materialize
    
  4. Install and run Materialize. The linked instructions will guide you through running a Materialize instance on your local machine. (Our cloud offering is being developed, you can register for the private beta here!)

  5. Create a jaffle_shop dbt profile that will connect to Materialize. The following profile will connect to a Materialize instance running locally on your machine. The host parameter will need to be updated if it’s self-hosted in the cloud or run with Docker:

    jaffle_shop:
        outputs:
            dev:
            type: materialize
            threads: 1
            host: localhost
            port: 6875
            user: materialize
            pass: password
            dbname: materialize
            schema: jaffle_shop
    
        target: dev
    

    If the profiles.yml you’re using for this project is not located at ~/.dbt/, you will have to provide additional information to use the dbt commands later on.

  6. Check that your newly created jaffle_shop profile can connect to your Materialize instance:
    dbt debug
    
  7. Load the static jaffle_shop data into Materialize:
    dbt seed
    
  8. Run the provided models:
    dbt run
    
  9. In a new shell, connect to Materialize to check out the jaffle_shop data you just loaded:
    # Connect to Materialize
    psql -U materialize -h localhost -p 6875
    
    # See all the newly created views
    materialize=> SHOW VIEWS IN jaffle_shop;
    # Output:
        name
    -------------------
    customer_orders
    customer_payments
    dim_customers
    fct_orders
    order_payments
    raw_customers
    raw_orders
    raw_payments
    stg_customers
    stg_orders
    stg_payments
    
    # See only the materialized views
    materialize=> SHOW MATERIALIZED VIEWS IN jaffle_shop;
    # Output:
        name
    ---------------
    dim_customers
    fct_orders
    raw_customers
    raw_orders
    raw_payments
    
    # Check out data in one of your core models
    materialize=> SELECT * FROM jaffle_shop.dim_customers WHERE customer_id = 1;
    # Output:
    customer_id | first_order | most_recent_order | number_of_orders | customer_lifetime_value
    ------------+-------------+-------------------+------------------+-------------------------
              1 | 2018-01-01  | 2018-02-10        |                2 |                      33
    

    To see what else you can do with your data in Materialize, check out our docs.

  10. Test the newly created models:

    dbt test
    
  11. Generate and view the documentation for your jaffle_shop project:
    dbt docs generate
    dbt docs serve
    

Conclusion

This walkthrough should leave you with a better understanding of how to integrate Materialize into your existing dbt workflow and start materializing views in true real-time fashion.

To get a better understanding of the upstream work necessary for getting your data into Materialize, start with our docs and join us in Slack if you have any questions.