We’ve recently partnered with dbt and community members to release a beta Materialize dbt adapter that allows you to transform your streaming data in real time using Materialize as your data warehouse.
This tutorial serves as a practical hands-on demonstration of the adapter. In this case, we are using static not streaming data, but the configuration and setup process is the same.
dbt’s jaffle_shop + Materialize
If you’ve used dbt, odds are that you’ve run across dbt’s beloved
jaffle_shop demo project.
jaffle_shop allows users to quickly get up and running with dbt, using some spoofed, static data for a fictional jaffle shop.
At Materialize, we specialize in maintaining fast and efficient views over your streaming data. While we work on hosting a public source of demo streaming data for analytics, we wanted to provide those familiar with dbt with an easy way to get up and running with our
dbt-materialize adapter and
jaffle_shop‘s static data.
Note: This demo won’t highlight what’s powerful about Materialize. For that, check out our
wikirecent-dbt demo or our docs!
Setting up a jaffle_shop with Materialize
Setting up the
jaffle_shop project with Materialize is similar to setting it up with any other data warehouse. The following instructions are based off the traditional
jaffle_shop steps with a few Materialize-specific modifications:
- Follow the first three steps of the
jaffle_shop instructions, install dbt, clone the
jaffle_shop repository, and navigate to the cloned repo on your machine.
- In your cloned
dbt_project.yml, make the following changes to the model materializations:
tags: ["staging", "hourly"]
Tip: Only materializing your
core business models as materialized views, without materializing your intermediate or staging views, ensures that you’re only using the memory you need in Materialize.
Install the dbt-materialize plugin. You may wish to do this within a Python virtual environment on your machine:
python3 -m venv dbt-venv
pip install dbt-materialize
- Install and run Materialize. The linked instructions will guide you through running a Materialize instance on your local machine. (Our cloud offering is being developed, you can register for the private beta here!)
jaffle_shop dbt profile that will connect to Materialize. The following profile will connect to a Materialize instance running locally on your machine. The
host parameter will need to be updated if it’s self-hosted in the cloud or run with Docker:
profiles.yml you’re using for this project is not located at
~/.dbt/, you will have to provide additional information to use the
dbt commands later on.
- Check that your newly created
jaffle_shop profile can connect to your Materialize instance:
- Load the static
jaffle_shop data into Materialize:
- Run the provided models:
- In a new shell, connect to Materialize to check out the
jaffle_shop data you just loaded:
# Connect to Materialize
psql -U materialize -h localhost -p 6875
# See all the newly created views
materialize=> SHOW VIEWS IN jaffle_shop;
# See only the materialized views
materialize=> SHOW MATERIALIZED VIEWS IN jaffle_shop;
# Check out data in one of your core models
materialize=> SELECT * FROM jaffle_shop.dim_customers WHERE customer_id = 1;
customer_id | first_order | most_recent_order | number_of_orders | customer_lifetime_value
1 | 2018-01-01 | 2018-02-10 | 2 | 33
To see what else you can do with your data in Materialize, check out our docs.
Test the newly created models:
- Generate and view the documentation for your
dbt docs generate
dbt docs serve
This walkthrough should leave you with a better understanding of how to integrate Materialize into your existing dbt workflow and start materializing views in true real-time fashion.
To get a better understanding of the upstream work necessary for getting your data into Materialize, start with our docs and join us in Slack if you have any questions.