Get Started with Cloud
This guide walks you through getting started with Materialize Cloud, from setting up an account to creating your first materialized view on top of streaming data. We’ll cover:
-
Signing up for Materialize Cloud
-
Creating and connecting to a Materialize Cloud deployment
-
Connecting to a streaming data source
-
Creating a materialized view
-
Exploring common patterns like joins and time-windowing
Sign up
-
Sign up for Materialize Cloud at https://cloud.materialize.com.
-
Once your account has been created, log in.
-
If you’ve been invited to an existing workspace, jump directly to Deploy and Connect. Otherwise, a dialog will ask you to create a workspace.
Enter a name and click Next.
Deploy and connect
-
In the deployments page, click Create deployment in the upper right corner. Enter a unique Name (or use the default) and choose Extra small (XS) as the deployment size.
Then, click Create.
Note: The size of new deployments is set to extra small (XS) by default, which is enough to run this walkthrough. For more information on deployment sizes, check Account Limits.
-
Once the status message reads
HEALTHY
, the deployment is ready for connections!Before you can connect, though, you need to install some TLS certificates on your local machine.
-
Click on a deployment to see the individual deployment view.
-
If you are connecting through Tailscale VPN, go to the Integrations card, then the Tailscale tab. Click Enable, then enter the Tailscale auth key in the Tailscale Authentication dialog and click Save. The deployment will automatically restart.
Tailscale integration must enabled separately for each deployment.
For help generating an auth key, please see the Tailscale documentation. Make sure you create a one-off key.
-
In the Connect card, go to the psql tab and click Download certificates.
-
Unzip the certificate package.
The certificate package is named
<deployment-name>.zip
and contains the TLS certificates and keys you need to connect. -
To connect to the deployment, you can use any Materialize-compatible CLI, like
psql
. If you havepsql
installed on your machine, proceed to the next step.Otherwise, follow the installation instructions.
-
Open a terminal window on your machine, copy the
psql
command from the connection dialog and run it from the directory that contains the certificates.If you’re connecting through Tailscale, replace the hostname in the copied command with the address of your Materialize connection in Tailscale.
On connection success, you should see an open prompt waiting for your next commands:
psql (13.2, server 9.5.0) SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off) Type "help" for help. materialize=>
Explore a streaming source
Materialize allows you to work with streaming data from multiple external sources using nothing but standard SQL. You write arbitrarily complex queries; Materialize takes care of maintaining the results automatically up to date with very low latency.
We’ll start with some sample real-time data from a PubNub stream receiving the latest market orders for a given marketplace.
-
Let’s create a PubNub source that connects to the market orders channel with a subscribe key:
CREATE SOURCE market_orders_raw FROM PUBNUB SUBSCRIBE KEY 'sub-c-4377ab04-f100-11e3-bffd-02ee2ddab7fe' CHANNEL 'pubnub-market-orders';
The
CREATE SOURCE
statement is a definition of where to find and how to connect to our data source — Materialize won’t start ingesting data just yet.To list the columns created:
SHOW COLUMNS FROM market_orders_raw;
-
The PubNub source produces data as a single text column containing JSON. To extract the JSON fields for each market order, you can use the built-in
jsonb
operators:CREATE VIEW market_orders AS SELECT ((text::jsonb)->>'bid_price')::float AS bid_price, (text::jsonb)->>'order_quantity' AS order_quantity, (text::jsonb)->>'symbol' AS symbol, (text::jsonb)->>'trade_type' AS trade_type, to_timestamp(((text::jsonb)->'timestamp')::bigint) AS ts FROM market_orders_raw;
One thing to note here is that we created a non-materialized view, which doesn’t store the results of the query but simply provides an alias for the embedded
SELECT
statement. -
We can now use this view as a base to create a materialized view that computes the average bid price:
CREATE MATERIALIZED VIEW avg_bid AS SELECT symbol, AVG(bid_price) AS avg FROM market_orders GROUP BY symbol;
The
avg_bid
view is incrementally updated as new data streams in, so you get fresh and correct results with millisecond latency. Behind the scenes, Materialize is indexing the results of the embedded query in memory (i.e. materializing the view). -
Let’s check the results:
SELECT * FROM avg_bid;
symbol | avg ------------+-------------------- Apple | 199.3392717416626 Google | 299.40371152970334 Elerium | 155.04668809209852 Bespin Gas | 202.0260593073953 Linen Cloth | 254.34273792647863
If you re-run the
SELECT
statement at different points in time, you can see the updated results based on the latest data. -
To see the sequence of updates affecting the results over time, you can use
TAIL
:COPY (TAIL avg_bid) TO stdout;
To cancel out of the stream, press CTRL+C.
Joins
Materialize efficiently supports all types of SQL joins under all the conditions you would expect from a traditional relational database. Let’s enrich the PubNub stream with some reference data as an example!
-
Create and populate a table with static reference data:
CREATE TABLE symbols ( symbol text, ticker text ); INSERT INTO symbols SELECT * FROM (VALUES ('Apple','AAPL'), ('Google','GOOG'), ('Elerium','ELER'), ('Bespin Gas','BGAS'), ('Linen Cloth','LCLO') );
Note: We are using a table for convenience to avoid adding complexity to the guide. It’s unlikely that you’ll need to use tables in real-world scenarios.
-
Now we can enrich our aggregated data with the ticker for each stock using a regular
JOIN
:CREATE MATERIALIZED VIEW cnt_ticker AS SELECT s.ticker AS ticker, COUNT(*) AS cnt FROM market_orders m JOIN symbols s ON m.symbol = s.symbol GROUP BY s.ticker;
-
To see the results:
SELECT * FROM cnt_ticker; ticker | cnt --------+----- AAPL | 42 BGAS | 49 ELER | 68 GOOG | 51 LCLO | 70
If you re-run the
SELECT
statement at different points in time, you can see the updated results based on the latest data.
Temporal filters
In Materialize, temporal filters allow you to define time-windows over otherwise unbounded streams of data. This can be useful for things like modeling business processes or limiting resource usage.
-
If, instead of computing and maintaining the overall count of market orders, we want to get the moving count from the past minute, we’d use a temporal filter defined by the
mz_logical_timestamp()
function:CREATE MATERIALIZED VIEW cnt_sliding AS SELECT symbol, COUNT(*) AS cnt FROM market_orders m WHERE EXTRACT(EPOCH FROM (ts + INTERVAL '1 minute'))::bigint * 1000 > mz_logical_timestamp() GROUP BY symbol;
The
mz_logical_timestamp()
function is used to keep track of the logical time that your query executes (similar tonow()
in other systems, as explained more in-depth in “now and mz_logical_timestamp functions”). -
To see the results:
SELECT * FROM cnt_sliding; symbol | cnt -------------+----- Apple | 31 Google | 40 Elerium | 46 Bespin Gas | 35 Linen Cloth | 45
As it advances, only the records that satisfy the time constraint are used in the materialized view and contribute to the in-memory footprint.
Learn more
That’s it! You just got up and running with Materialize Cloud, creating your first materialized view and trying out some common queries enabled by SQL on streams. We encourage you to continue exploring the PubNub source using the supported SQL commands. You can read through the following resources for more comprehensive overviews: