Introduction
We've recently partnered with dbt and community members to release a beta Materialize dbt adapter that allows you to transform your streaming data in real time using Materialize as your data warehouse.
This tutorial serves as a practical hands-on demonstration of the adapter. In this case, we are using static not streaming data, but the configuration and setup process is the same.
dbt's jaffle_shop + Materialize
If you've used dbt, odds are that you've run across dbt's beloved jaffle_shop demo project. jaffle_shop allows users to quickly get up and running with dbt, using some spoofed, static data for a fictional jaffle shop.
At Materialize, we specialize in maintaining fast and efficient views over your streaming data. While we work on hosting a public source of demo streaming data for analytics, we wanted to provide those familiar with dbt with an easy way to get up and running with our dbt-materialize adapter and jaffle_shop's static data.
Note: This demo won’t highlight what’s powerful about Materialize. For that, check out our wikirecent-dbt demo or our docs!
Setting up a jaffle_shop with Materialize
Setting up the jaffle_shop project with Materialize is similar to setting it up with any other data warehouse. The following instructions are based off the traditional jaffle_shop steps with a few Materialize-specific modifications:
- Follow the first three steps of the
jaffle_shop instructions, install dbt, clone the jaffle_shop repository, and navigate to the cloned repo on your machine. - In your cloned
dbt_project.yml, make the following changes to the model materializations:
1 | models: |
2 | jaffle_shop: |
3 | marts: |
4 | core: |
5 | materialized: materializedview |
6 | intermediate: |
7 | materialized: view |
8 | staging: |
9 | materialized: view |
10 | tags: ["staging", "hourly"] |
11 |
|
- Tip: Only materializing your
core business models as materialized views, without materializing your intermediate or staging views, ensures that you're only using the memory you need in Materialize. - Install the dbt-materialize plugin. You may wish to do this within a Python virtual environment on your machine:
1 | python3 -m venv dbt-venv |
2 | source dbt-venv/bin/activate |
3 | pip install dbt-materialize |
4 |
|
- Install and run Materialize. The linked instructions will guide you through running a Materialize instance on your local machine. (Our cloud offering is being developed, you can register for the private beta here!)
- Create a
jaffle_shop dbt profile that will connect to Materialize. The following profile will connect to a Materialize instance running locally on your machine. The host parameter will need to be updated if it's self-hosted in the cloud or run with Docker:
1 | jaffle_shop: |
2 | outputs: |
3 | dev: |
4 | type: materialize |
5 | threads: 1 |
6 | host: localhost |
7 | port: 6875 |
8 | user: materialize |
9 | pass: password |
10 | dbname: materialize |
11 | schema: jaffle_shop |
12 |
|
13 | target: dev |
14 |
|
- If the
profiles.yml you're using for this project is not located at ~/.dbt/, you will have to provide additional information to use the dbt commands later on. - Check that your newly created
jaffle_shop profile can connect to your Materialize instance:
- Load the static
jaffle_shop data into Materialize:
- Run the provided models:
- In a new shell, connect to Materialize to check out the
jaffle_shop data you just loaded:
1 | |
2 | psql -U materialize -h localhost -p 6875 |
3 |
|
1 | |
2 | materialize=> SHOW VIEWS IN jaffle_shop; |
3 | |
4 | name |
5 | ------------------- |
6 | customer_orders |
7 | customer_payments |
8 | dim_customers |
9 | fct_orders |
10 | order_payments |
11 | raw_customers |
12 | raw_orders |
13 | raw_payments |
14 | stg_customers |
15 | stg_orders |
16 | stg_payments |
17 |
|
18 | |
19 | materialize=> SHOW MATERIALIZED VIEWS IN jaffle_shop; |
20 | |
21 | name |
22 | --------------- |
23 | dim_customers |
24 | fct_orders |
25 | raw_customers |
26 | raw_orders |
27 | raw_payments |
28 |
|
29 | |
30 | materialize=> SELECT * FROM jaffle_shop.dim_customers WHERE customer_id = 1; |
31 | |
32 | customer_id | first_order | most_recent_order | number_of_orders | customer_lifetime_value |
33 | ------------+-------------+-------------------+------------------+------------------------- |
34 | 1 | 2018-01-01 | 2018-02-10 | 2 | 33 |
35 | |
- To see what else you can do with your data in Materialize, check out our docs.
- Test the newly created models:
- Generate and view the documentation for your
jaffle_shop project:
1 | dbt docs generate |
2 | dbt docs serve |
3 |
|
Conclusion
This walkthrough should leave you with a better understanding of how to integrate Materialize into your existing dbt workflow and start materializing views in true real-time fashion.
To get a better understanding of the upstream work necessary for getting your data into Materialize, start with our docs and join us in Slack if you have any questions.