Want to connect with Materialize? Join our growing community on Slack!

Quick Start

Get up and running with Materialize in a matter of minutes

Get Started

To help you get started with Materialize we’ll
  • Install, run, and connect to Materialize
  • Explore its API
  • Set up a real-time stream to perform aggregations on Wikipedia user edits.
Prerequisites:

To complete this demo, you need:

  • Command line and network access.
  • A Materialize-compatible CLI. If you have Materialize installed, psql works.

We also highly recommend checking out What is Materialize?

Install, run, connect

1

Install the materialized binary

  • Docker
  • MacOs
  • Linux
  • Build from source
  • Run the Binary
  • We provide the materialize/materialized image in Docker Hub. If you already have Docker installed, you can run materialized with your tools in the usual way. For example:

    docker run -p 6875:6875 materialize/materialized:v0.5.2 --workers 1
  • Assuming you’ve installed Homebrew:

    brew install MaterializeInc/materialize/materialized
  • apt (Ubuntu, Debian, or variants)
    Run the following commands as root.

    # Add the signing key for the Materialize apt repository
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 379CE192D401AB61
    # Add and update the repository
    sh -c 'echo "deb http://packages.materialize.io/apt/ /" > /etc/apt/sources.list.d/materialize.list'
    apt update
    # Install materialized
    apt install materialized
  • Materialize is written primarily in Rust, but incorporates several components written in C. To build Materialize, you will need to acquire the following tools:

    A recent version of Git

    A C compiler that supports C11

    CMake v3.2+

    Rust v1.48.0+

    Your system’s package manager, like Homebrew on macOS or APT on Debian, likely contains sufficiently recent versions of Git, a C compiler, and CMake. However, we recommend installing Rust via rustup. rustup configures your system so that running cargo build in the Materialize repository will automatically download and use the correct version of Rust.

    WARNING! Materialize requires a very recent version of Rust. The version of Rust available in your package manager is likely too old.
    Once you’ve installed the prerequisites, to build your own materialized binary, you can clone the MaterializeInc/materialize repo from GitHub, and build it using cargo build. Be sure to check out the tag for the correct release.

    git clone https://github.com/MaterializeInc/materialize.git 
    cd materialize
    git checkout v0.5.2
    cargo build --release
  • You can start the materialized process by simply running the binary, e.g.

    -w 1 specifies that the process will use 1 worker. You can also find more detail about our command line flags.

    By default materialized uses:

    Detail Info
    Database  materialize
    Port          6875

    For more information, see CLI Connections.

    ./materialized -w 1

2

Run the materialized binary. For example, if you installed it in your $PATH

This starts the daemon listening on port 6875 using 1 worker.:

materialized -w 1

Explore Materialize’s API

Materialize offers ANSI Standard SQL, but is not simply a relational database. Instead of tables of data, you typically connect Materialize to external sources of data (called sources), and then create materialized views of the data that Materialize sees from those sources.

To get started, though, we’ll begin with a simple version that doesn’t require connecting to an external data source.

1

From your Materialize CLI, create a materialized view that contains actual data we can work with.

You’ll notice that we end up entering data into Materialize by creating a materialized view from some other data, rather than the typical INSERT operation. This is how one interacts with Materialize. In most cases, this data would have come from an external source and get fed into Materialize from a file or a stream.

CREATE MATERIALIZED VIEW pseudo_source (key, value) AS
    VALUES ('a', 1), ('a', 2), ('a', 3), ('a', 4),
    ('b', 5), ('c', 6), ('c', 7);

2

With data in a materialized view, we can perform arbitrary SELECT statements on the data.

Let’s start by viewing all of the data:

SELECT * FROM pseudo_source;
 key | value
-----+-------
 a   |     1
 a   |     2
 a   |     3
 a   |     4
 b   |     5
 c   |     6
 c   |     7

3

Determine the sum of the values for each key:

SELECT key, sum(value) FROM pseudo_source GROUP BY key;
 key | sum
-----+-----
 a   |  10
 b   |   5
 c   |  13

4

We can actually then save this query as its own materialized view:

CREATE MATERIALIZED VIEW lhs (key, value) AS
    VALUES ('x', 'a'), ('y', 'b'), ('z', 'c');
SELECT lhs.key, sum(rhs.value)
FROM lhs
JOIN pseudo_source AS rhs
ON lhs.value = rhs.key
GROUP BY lhs.key;

5

Determine the sum of all keys’ sums:

SELECT sum(sum) FROM key_sums;

6

We can also perform complex operations like JOINS. Given the simplicity of our data, the JOIN clauses themselves aren’t very exciting, but Materialize offers support for a full range of arbitrarily complex JOINS

CREATE MATERIALIZED VIEW lhs (key, value) AS
    VALUES ('x', 'a'), ('y', 'b'), ('z', 'c');

SELECT lhs.key, sum(rhs.value)
FROM lhs
JOIN pseudo_source AS rhs
ON lhs.value = rhs.key
GROUP BY lhs.key;

Create a real-time stream

Materialize is built to handle streams of data, and provide incredibly low-latency answers to queries over that data. To show off that capability, in this section we’ll set up a real-time stream, and then see how Materialize lets you query it.

1

We’ll set up a stream of Wikipedia’s recent changes, and simply write all data that we see to a file.

From your shell, run this command.

Note the absolute path of the location where you write wikirecent, which we’ll need in the next step

while true; do
  curl --max-time 9999999 -N https://stream.wikimedia.org/v2/stream/recentchange >> wikirecent
done

2

From within the CLI, create a source from the wikirecent file:

CREATE SOURCE wikirecent
FROM FILE '[path to wikirecent]' WITH (tail = true)
FORMAT REGEX '^data: (?P<data>.*)';

3

This source takes the lines from the stream, finds those that begins with data:, and then captures the rest of the line in a column called data

You can see the columns that get generated for this source:

SHOW COLUMNS FROM wikirecent;

4

Because this stream comes in as JSON, we’ll need to normalize the data to perform aggregations on it. Materialize offers the ability to do this easily using our built-in jsonb functions.

CREATE MATERIALIZED VIEW recentchanges AS
    SELECT
        val->>'$schema' AS r_schema,
        (val->'bot')::bool AS bot,
        val->>'comment' AS comment,
        (val->'id')::float::int AS id,
        (val->'length'->'new')::float::int AS length_new,
        (val->'length'->'old')::float::int AS length_old,
        val->'meta'->>'uri' AS meta_uri,
        val->'meta'->>'id' as meta_id,
        (val->'minor')::bool AS minor,
        (val->'namespace')::float AS namespace,
        val->>'parsedcomment' AS parsedcomment,
        (val->'revision'->'new')::float::int AS revision_new,
        (val->'revision'->'old')::float::int AS revision_old,
        val->>'server_name' AS server_name,
        (val->'server_script_path')::text AS server_script_path,
        val->>'server_url' AS server_url,
        (val->'timestamp')::float AS r_ts,
        val->>'title' AS title,
        val->>'type' AS type,
        val->>'user' AS user,
        val->>'wiki' AS wiki
    FROM (SELECT data::jsonb AS val FROM wikirecent);

5

From here we can start building our aggregations. The simplest place to start is simply counting the number of items we’ve seen:

CREATE MATERIALIZED VIEW counter AS
    SELECT COUNT(*) FROM recentchanges;

6

However, we can also see more interesting things from our stream. For instance, who are making the most changes to Wikipedia?

CREATE MATERIALIZED VIEW useredits AS
    SELECT user, count(*) FROM recentchanges GROUP BY user;

SELECT * FROM useredits ORDER BY count DESC;

7

If this is a factoid we often want to know, we could also create a view of just the top 10 editors we’ve seen.

CREATE MATERIALIZED VIEW top10 AS
    SELECT * FROM useredits ORDER BY count DESC LIMIT 10;

8

We can then quickly get the answer to who the top 10 editors are:

SELECT * FROM top10 ORDER BY count DESC;

Want to get in touch with someone at Materialize?

Connect with a Materialize engineer for an in-person overview. Lorem ipsum dolor set amit..

  •