Materialize Documentation
s
Join the Community github/materialize

Get Started with Materialize

This guide walks you through getting started with Materialize, from installing it to creating your first materialized view on top of streaming data. We’ll cover:

Install, run, connect

Select an environment and follow the instructions to get the latest stable release of Materialize (v0.26.6).

  1. Open a terminal and spin up a container running materialized:

    docker run -p 6875:6875 materialize/materialized:v0.26.6 --workers 1
    

    This starts a process using one worker thread and listening on port 6875 by default.

  2. Using a new terminal window, you can then connect to the running instance using any Materialize-compatible CLI, like psql. If you already have psql installed on your machine, connect using:

    psql -U materialize -h localhost -p 6875 materialize
    

    Otherwise, you can find the steps to install and use your CLI of choice under Install.

  1. If you’re using Homebrew, open a terminal and run:

    brew install MaterializeInc/materialize/materialized
    

    Note: For a curl-based alternative, see Install.

  2. Once the installation is complete, you can start the materialized process:

    materialized --workers 1
    

    This starts a process using one worker thread and listening on port 6875 by default.

  3. Using a new terminal window, you can then connect to the running instance using any Materialize-compatible CLI, like psql. If you have psql installed on your machine, connect using:

    psql -U materialize -h localhost -p 6875 materialize
    

    Otherwise, you can find the steps to install and use your CLI of choice under Install.

  1. If you’re using apt, open a terminal and run (as root):

    # Add the signing key for the Materialize apt repository
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 79DEC5E1B7AE7694
    # Add and update the repository
    sh -c 'echo "deb http://apt.materialize.com/ generic main" > /etc/apt/sources.list.d/materialize.list'
    apt update
    # Install materialized
    apt install materialized
    

    Note: For a curl-based alternative, see Install.

  2. Once the installation is complete, you can start the materialized process:

    materialized --workers 1
    

    This starts a process using one worker thread and listening on port 6875 by default.

  3. Using a new terminal window, you can then connect to the running instance using any Materialize-compatible CLI, like psql. If you have psql installed on your machine, connect using:

    psql -U materialize -h localhost -p 6875 materialize
    

    Otherwise, you can find the steps to install and use your CLI of choice under Install.

Explore a streaming source

Materialize allows you to work with streaming data from multiple external sources using nothing but standard SQL. You write arbitrarily complex queries; Materialize takes care of maintaining the results automatically up to date with very low latency.

We’ll start with some sample real-time data from a PubNub stream receiving the latest market orders for a given marketplace.

  1. Let’s create a PubNub source that connects to the market orders channel with a subscribe key:

    CREATE SOURCE market_orders_raw
    FROM PUBNUB
    SUBSCRIBE KEY 'sub-c-4377ab04-f100-11e3-bffd-02ee2ddab7fe'
    CHANNEL 'pubnub-market-orders';
    

    The CREATE SOURCE statement is a definition of where to find and how to connect to our data source — Materialize won’t start ingesting data just yet.

    To list the columns created:

    SHOW COLUMNS FROM market_orders_raw;
    
  2. The PubNub source produces data as a single text column containing JSON. To extract the JSON fields for each market order, you can use the built-in jsonb operators:

    CREATE VIEW market_orders AS
    SELECT
        ((text::jsonb)->>'bid_price')::float AS bid_price,
        (text::jsonb)->>'order_quantity' AS order_quantity,
        (text::jsonb)->>'symbol' AS symbol,
        (text::jsonb)->>'trade_type' AS trade_type,
        to_timestamp(((text::jsonb)->'timestamp')::bigint) AS ts
    FROM market_orders_raw;
    

    One thing to note here is that we created a non-materialized view, which doesn’t store the results of the query but simply provides an alias for the embedded SELECT statement.

  3. We can now use this view as a base to create a materialized view that computes the average bid price:

    CREATE MATERIALIZED VIEW avg_bid AS
    SELECT symbol,
           AVG(bid_price) AS avg
    FROM market_orders
    GROUP BY symbol;
    

    The avg_bid view is incrementally updated as new data streams in, so you get fresh and correct results with millisecond latency. Behind the scenes, Materialize is indexing the results of the embedded query in memory (i.e. materializing the view).

  4. Let’s check the results:

    SELECT * FROM avg_bid;
    
      symbol    |        avg
    ------------+--------------------
    Apple       | 199.3392717416626
    Google      | 299.40371152970334
    Elerium     | 155.04668809209852
    Bespin Gas  | 202.0260593073953
    Linen Cloth | 254.34273792647863
    

    If you re-run the SELECT statement at different points in time, you can see the updated results based on the latest data.

  5. To see the sequence of updates affecting the results over time, you can use TAIL:

    COPY (TAIL avg_bid) TO stdout;
    

    To cancel out of the stream, press CTRL+C.

Joins

Materialize efficiently supports all types of SQL joins under all the conditions you would expect from a traditional relational database. Let’s enrich the PubNub stream with some reference data as an example!

  1. Create and populate a table with static reference data:

    CREATE TABLE symbols (
        symbol text,
        ticker text
    );
    
    INSERT INTO symbols
    SELECT *
    FROM (VALUES ('Apple','AAPL'),
                 ('Google','GOOG'),
                 ('Elerium','ELER'),
                 ('Bespin Gas','BGAS'),
                 ('Linen Cloth','LCLO')
    );
    
    

    Note: We are using a table for convenience to avoid adding complexity to the guide. It’s unlikely that you’ll need to use tables in real-world scenarios.

  2. Now, we can enrich our aggregated data with the ticker for each stock using a regular JOIN:

    CREATE MATERIALIZED VIEW cnt_ticker AS
    SELECT s.ticker AS ticker,
           COUNT(*) AS cnt
    FROM market_orders m
    JOIN symbols s ON m.symbol = s.symbol
    GROUP BY s.ticker;
    
  3. To see the results:

    SELECT * FROM cnt_ticker;
    
     ticker | cnt
    --------+-----
     AAPL   |  42
     BGAS   |  49
     ELER   |  68
     GOOG   |  51
     LCLO   |  70
    

    If you re-run the SELECT statement at different points in time, you can see the updated results based on the latest data.

Temporal filters

In Materialize, temporal filters allow you to define time-windows over otherwise unbounded streams of data. This is useful to model business processes or simply to limit resource usage, for example.

  1. If, instead of computing and maintaining the overall count, we want to get the moving count over the past minute:

    CREATE MATERIALIZED VIEW cnt_sliding AS
    SELECT symbol,
           COUNT(*) AS cnt
    FROM market_orders m
    WHERE EXTRACT(EPOCH FROM (ts + INTERVAL '1 minute'))::bigint * 1000 > mz_logical_timestamp()
    GROUP BY symbol;
    

    The mz_logical_timestamp() function is used to keep track of the logical time that your query executes (similar to now() in other systems, as explained more in-depth in “now and mz_logical_timestamp functions”).

  2. To see the results:

    SELECT * FROM cnt_sliding;
    
       symbol    | cnt
    -------------+-----
     Apple       |  31
     Google      |  40
     Elerium     |  46
     Bespin Gas  |  35
     Linen Cloth |  45
    

    As it advances, only the records that satisfy the time constraint are used in the materialized view and contribute to the in-memory footprint.

Learn more

That’s it! You just created your first materialized view and tried out some common patterns enabled by SQL on streams. We encourage you to continue exploring the PubNub source using the supported SQL commands, and read through “What is Materialize?" for a more comprehensive overview.

Next steps

When you’re done with this guide, you can move on to “Quickstarts” to learn how to use Materialize with other external systems for different use cases.