In our last blog about our Quality Assurance (QA) team, we gave an overview of the QA process, including our software and testing methods. One of our key tools during testing is the Materialize Emulator, a Docker image that allows you to maintain a locally hosted version of Materialize.
But there's an important caveat: the Materialize Emulator cannot support production workloads. The Materialize Emulator lacks critical features of our cloud platform, such as fault tolerance and horizontal scalability. However, it does include a local web UI (Materialize Console) available at http://localhost:6874. The Emulator is great for testing and prototyping.
In the following blog, we'll outline a step-by-step walkthrough of how to use the Materialize Emulator.
Materialize Emulator: What Is It?
The Materialize Emulator is an all-in-one Docker image available on Docker Hub for testing and evaluation purposes. The Emulator is not representative of Materialize’s performance and full feature set.
To view a comparison between the Materialize Emulator and the Materialize cloud platform, see the table below:
Materialize Emulator
Materialize SaaS
Production deployments
❌ Not suitable due to performance and license limitations.
✔️
Performance
❌ Limited. Services are bundled in a single container.
✔️ High. Services are scaled across many machines.
Sample data
✔️ Quickstart data source
✔️ Quickstart data source
Data sources
✔️ Connect using a streamlined GUI
✔️ Connect using a streamlined GUI
Version upgrades
✔️ Manual, with no data persistence
✔️ Automated, with data persistence
GUI
✔️ Materialize Console (http://localhost:6874)
✔️ Materialize Console
We've always used the Materialize Emulator for testing, except for the kind that requires cloud integration with Kubernetes.
If you want to use Materialize in production scenarios, sign up for a free trial account or schedule a demo.
Step-by-Step Walkthrough: How to Use the Materialize Emulator
Let's walk through a basic example of how to use the Materialize Emulator with a PostgreSQL source. The only requirements are Docker and the postgres-client (psql).
1 | docker network create mznet |
2 | docker pull materialize/materialized:latest |
3 | docker run --name materialized --network mznet -d \ |
4 | -p 127.0.0.1:6874:6874 \ |
5 | -p 127.0.0.1:6875:6875 \ |
6 | -p 127.0.0.1:6876:6876 \ |
7 | -p 127.0.0.1:6877:6877 \ |
8 | -e MZ_EAT_MY_DATA=1 \ |
9 | materialize/materialized:latest |
10 |
|
We'll publish the ports to localhost, since Materialize is running without authentication. Without a NAT or firewall, anyone on the internet can connect to your Materialize instance. You can specify ports if you want to allow access, such as: -p 6874:6874, -p 6875:6875, or -p 6876:6876.
Note that we set MZ_EAT_MY_DATA=1, which disables commands like fsync, so that we might lose data in case of a system crash. Since we are only using the Materialize emulator for local testing, we don't mind the risk of data loss. If you are interested in running production workloads on Materialize, check out Materialize Cloud, Self-managed Materialize or at least don't set MZ_EAT_MY_DATA=1 in your Materialize emulator.
Now Materialize is running locally.
- Open the Materialize Console (web UI) at
http://localhost:6874 to use the built-in SQL Shell. - Alternatively, connect with
psql:
1 | $ psql postgres://materialize@127.0.0.1:6875/materialize |
2 | NOTICE: connected to Materialize v0.156.0 |
3 | Org ID: 4b733a37-b64d-44a2-8e79-e0ebd8a177ba |
4 | Region: docker/container |
5 | User: materialize |
6 | Cluster: quickstart |
7 | Database: materialize |
8 | Schema: public |
9 | Session UUID: 2631437c-61d6-4984-a68b-433f5751cecf |
10 |
|
11 | Issue a SQL query to get started. Need help? |
12 | View documentation: https://materialize.com/s/docs |
13 | Join our Slack community: https://materialize.com/s/chat |
14 |
|
15 | psql (16.10 (Ubuntu 16.10-0ubuntu0.24.04.1), server 9.5.0) |
16 | Type "help" for help. |
17 |
|
18 | materialize=> |
19 |
|
Let's start up a Postgres server:
1 | docker run --name postgres --network mznet \ |
2 | -e POSTGRES_PASSWORD=postgres \ |
3 | -e POSTGRES_INITDB_ARGS="-c wal_level=logical" \ |
4 | -p 127.0.0.1:5432:5432 -d postgres |
5 |
|
Connect to the Postgres server. Then generate a simple table. We will replicate this table to Materialize.
1 | $ psql postgres://postgres:postgres@127.0.0.1:5432/postgres |
2 | psql (16.10 (Ubuntu 16.10-0ubuntu0.24.04.1), server 17.6 (Debian 17.6-1.pgdg13+1)) |
3 | WARNING: psql major version 16, server major version 17. |
4 | Some psql features might not work. |
5 | Type "help" for help. |
6 |
|
7 | postgres=# CREATE PUBLICATION mz_source FOR ALL TABLES; |
8 | CREATE PUBLICATION |
9 | postgres=# CREATE TABLE t (f1 INTEGER); |
10 | CREATE TABLE |
11 | postgres=# ALTER TABLE t REPLICA IDENTITY FULL; |
12 | ALTER TABLE |
13 | postgres=# INSERT INTO t VALUES (1), (2), (3); |
14 | INSERT 0 3 |
15 |
|
Now use Materialize to connect to the Postgres instance:
1 | materialize=> CREATE SECRET pgpass AS 'postgres'; |
2 | CREATE SECRET |
3 | materialize=> CREATE CONNECTION pg TO POSTGRES ( |
4 | HOST postgres, DATABASE postgres, USER postgres, PASSWORD SECRET pgpass |
5 | ); |
6 | CREATE CONNECTION |
7 | materialize=> CREATE SOURCE mz_source FROM POSTGRES CONNECTION pg ( |
8 | PUBLICATION 'mz_source' |
9 | ) FOR SCHEMAS (public); |
10 | CREATE SOURCE |
11 | materialize=> SELECT * FROM t; |
12 | f1 |
13 | |
14 | 1 |
15 | 2 |
16 | 3 |
17 | (3 rows) |
18 | materialize=> CREATE MATERIALIZED VIEW mv AS SELECT sum(f1) FROM t; |
19 | CREATE MATERIALIZED VIEW |
20 | materialize=> SELECT * FROM mv; |
21 | sum |
22 | |
23 | 6 |
24 | (1 row) |
25 |
|
That's how you replicate the Postgres table in Materialize. Now let's perform a query. Let's execute a one-off query on both Materialize and Postgres. We'll design a heavy workload.
1 | \postgres=# \timing |
2 | Timing is on. |
3 | postgres=# INSERT INTO t (f1) SELECT * FROM generate_series(4, 10000); |
4 | INSERT 0 9997 |
5 | Time: 10.137 ms |
6 | postgres=# SELECT sum(t.f1 + t2.f1) FROM t JOIN t AS t2 ON true; |
7 | sum |
8 | |
9 | 1000100000000 |
10 | (1 row) |
11 | Time: 2323.538 ms (00:02.324) |
12 |
|
Materialize performs the query in 37 seconds, while Postgres performs the query in 2 seconds. This is because Materialize is not designed for one-off queries.
Materialize is optimized for materialized views that update incrementally. Read more about how materialized views work in Materialize. Let's create a materialized view as follows:
1 | materialize=> \timing |
2 | Timing is on. |
3 | materialize=> SELECT sum(t.f1 + t2.f1) FROM t JOIN t AS t2 ON true; |
4 | sum |
5 | |
6 | 1000100000000 |
7 | (1 row) |
8 | Time: 37277.756 ms (00:37.278) |
9 | materialize=> DROP MATERIALIZED VIEW mv; |
10 | DROP MATERIALIZED VIEW |
11 | Time: 7.374 ms |
12 | materialize=> CREATE MATERIALIZED VIEW mv AS |
13 | SELECT sum(t.f1 + t2.f1) FROM t JOIN t AS t2 ON true; |
14 | CREATE MATERIALIZED VIEW |
15 | Time: 11.144 ms |
16 | materialize=> SELECT * FROM mv; |
17 | sum |
18 | |
19 | 1000100000000 |
20 | (1 row) |
21 | Time: 3.260 ms |
22 |
|
With Materialize, every change to the source table (t1) in Postgres will only require a small amount of incremental work to update the mv materialized view. This is done during INSERT, not during SELECT. And you can use declarative SQL to define the whole view.
1 | postgres=# INSERT INTO t (f1) VALUES (10001); |
2 | INSERT 0 1 |
3 | Time: 5.627 ms |
4 |
|
5 | materialize=> SELECT * FROM mv; |
6 | sum |
7 | --------------- |
8 | 1000400050002 |
9 | (1 row) |
10 | Time: 3.251 ms |
11 |
|
You can also subscribe to the the materialized view and receive instant updates about all of the changes:
1 | materialize=> COPY (SUBSCRIBE (SELECT * FROM mv)) TO STDOUT; |
2 | 1727715520600 1 1000400050002 |
3 | 1727715526000 1 1000700160012 |
4 | 1727715526000 -1 1000400050002 |
5 | 1727715528000 -1 1000700160012 |
6 | 1727715528000 1 1001000330036 |
7 |
|
This is the output (timestamp, added (1)/removed (-1), value) when these commands run in Postgres:
1 | postgres=# INSERT INTO t (f1) VALUES (10002); |
2 | INSERT 0 1 |
3 | postgres=# INSERT INTO t (f1) VALUES (10003); |
4 | INSERT 0 1 |
5 |
|
To clean up, we can stop the Docker containers again:
1 | docker stop materialized postgres |
2 | docker rm materialized postgres |
3 | docker network rm mznet |
4 |
|
And that's it! This is how you launch the Docker image, and define a materialized view, using the Materialize Emulator.
Shell Script: Materialize Emulator as a Docker Compose Project
To tie things together, here is a small shell script (run.sh) that runs the Materialize Emulator as a Docker Compose project.
The shell script contains many of Materialize's features, including a materialized view mv that combines the data of all these sources:
Also, the script uses the following to get the mv out of Materialize:
You can copy the full shell script below:
1 | |
2 | set -euo pipefail |
3 |
|
4 | PREF="${PWD##*/}" |
5 |
|
6 | wait_for_health() { |
7 | echo -n "waiting for container '$PREF-$1' to be healthy" |
8 | while [ "$(docker inspect -f '{{.State.Health.Status}}' "$PREF-$1")" != "healthy" ] |
9 | echo -n "." |
10 | sleep 1 |
11 | done |
12 | printf "\ncontainer '%s' is healthy\n" "$PREF-$1" |
13 | } |
14 |
|
15 | cat > docker-compose.yml <<EOF |
16 | version: '3.8' |
17 | services: |
18 | materialized: |
19 | image: materialize/materialized:latest |
20 | container_name: $PREF-materialized |
21 | environment: |
22 | MZ_SYSTEM_PARAMETER_DEFAULT: "enable_copy_to_expr=true" |
23 | networks: |
24 | - network |
25 | ports: |
26 | - "127.0.0.1:6874:6874" |
27 | - "127.0.0.1:6875:6875" |
28 | - "127.0.0.1:6876:6876" |
29 | - "127.0.0.1:6877:6877" |
30 | environment: |
31 | - MZ_EAT_MY_DATA=1 |
32 | healthcheck: |
33 | test: ["CMD", "curl", "-f", "localhost:6878/api/readyz"] |
34 | interval: 1s |
35 | start_period: 60s |
36 |
|
37 | postgres: |
38 | image: postgres:latest |
39 | container_name: $PREF-postgres |
40 | environment: |
41 | POSTGRES_PASSWORD: postgres |
42 | POSTGRES_INITDB_ARGS: "-c wal_level=logical" |
43 | networks: |
44 | - network |
45 | ports: |
46 | - "127.0.0.1:5432:5432" |
47 | healthcheck: |
48 | test: ["CMD", "pg_isready", "-d", "db_prod"] |
49 | interval: 1s |
50 | start_period: 60s |
51 |
|
52 | mysql: |
53 | image: mysql:latest |
54 | container_name: $PREF-mysql |
55 | environment: |
56 | MYSQL_ROOT_PASSWORD: mysql |
57 | networks: |
58 | - network |
59 | ports: |
60 | - "127.0.0.1:3306:3306" |
61 | command: |
62 | - "--log-bin=mysql-bin" |
63 | - "--gtid_mode=ON" |
64 | - "--enforce_gtid_consistency=ON" |
65 | - "--binlog-format=row" |
66 | - "--binlog-row-image=full" |
67 | healthcheck: |
68 | test: ["CMD", "mysqladmin", "ping", "--password=mysql", "--protocol=TCP"] |
69 | interval: 1s |
70 | start_period: 60s |
71 |
|
72 | redpanda: |
73 | image: vectorized/redpanda:latest |
74 | container_name: $PREF-redpanda |
75 | networks: |
76 | - network |
77 | ports: |
78 | - "127.0.0.1:9092:9092" |
79 | - "127.0.0.1:8081:8081" |
80 | command: |
81 | - "redpanda" |
82 | - "start" |
83 | - "--overprovisioned" |
84 | - "--smp=1" |
85 | - "--memory=1G" |
86 | - "--reserve-memory=0M" |
87 | - "--node-id=0" |
88 | - "--check=false" |
89 | - "--set" |
90 | - "redpanda.enable_transactions=true" |
91 | - "--set" |
92 | - "redpanda.enable_idempotence=true" |
93 | - "--set" |
94 | - "--advertise-kafka-addr=redpanda:9092" |
95 | healthcheck: |
96 | test: ["CMD", "curl", "-f", "localhost:9644/v1/status/ready"] |
97 | interval: 1s |
98 | start_period: 60s |
99 |
|
100 | minio: |
101 | image: minio/minio:latest |
102 | container_name: $PREF-minio |
103 | environment: |
104 | MINIO_STORAGE_CLASS_STANDARD: "EC:0" |
105 | networks: |
106 | - network |
107 | ports: |
108 | - "127.0.0.1:9000:9000" |
109 | - "127.0.0.1:9001:9001" |
110 | entrypoint: ["sh", "-c"] |
111 | command: ["mkdir -p /data/$PREF && minio server /data --console-address :9001"] |
112 | healthcheck: |
113 | test: ["CMD", "curl", "-f", "localhost:9000/minio/health/live"] |
114 | interval: 1s |
115 | start_period: 60s |
116 |
|
117 | networks: |
118 | network: |
119 | driver: bridge |
120 | EOF |
121 | docker compose down || true |
122 | docker compose up -d |
123 |
|
124 | wait_for_health postgres |
125 | psql postgres://postgres:postgres@127.0.0.1:5432/postgres <<EOF |
126 | CREATE PUBLICATION mz_source FOR ALL TABLES |
127 | CREATE TABLE pg_table (f1 INTEGER) |
128 | ALTER TABLE pg_table REPLICA IDENTITY FULL |
129 | INSERT INTO pg_table VALUES (1), (2), (3) |
130 | EOF |
131 |
|
132 | wait_for_health mysql |
133 | mysql --protocol=tcp --user=root --password=mysql <<EOF |
134 | CREATE DATABASE public |
135 | USE public |
136 | CREATE TABLE mysql_table (f1 INTEGER) |
137 | INSERT INTO mysql_table VALUES (1), (2), (3) |
138 | EOF |
139 |
|
140 | wait_for_health redpanda |
141 | docker compose exec -T redpanda rpk topic create redpanda_table |
142 | docker compose exec -T redpanda rpk topic produce redpanda_table <<EOF |
143 | {"f1": 1} |
144 | {"f1": 2} |
145 | {"f1": 3} |
146 | EOF |
147 |
|
148 | wait_for_health materialized |
149 | psql postgres://materialize@127.0.0.1:6875/materialize <<EOF |
150 | -- Create a Postgres source |
151 | CREATE SECRET pgpass AS 'postgres' |
152 | CREATE CONNECTION pg TO POSTGRES ( |
153 | HOST '$PREF-postgres', DATABASE postgres, USER postgres, PASSWORD SECRET pgpass |
154 | ) |
155 | CREATE SOURCE mz_source FROM POSTGRES CONNECTION pg ( |
156 | PUBLICATION 'mz_source' |
157 | ) FOR SCHEMAS (public) |
158 |
|
159 | -- Create a MySQL source |
160 | CREATE SECRET mysqlpass AS 'mysql' |
161 | CREATE CONNECTION mysql TO MYSQL ( |
162 | HOST '$PREF-mysql', USER root, PASSWORD SECRET mysqlpass |
163 | ) |
164 | CREATE SOURCE mysql_source FROM MYSQL CONNECTION mysql FOR ALL TABLES |
165 |
|
166 | -- Create a Webhook source |
167 | CREATE SOURCE webhook_table FROM WEBHOOK BODY FORMAT TEXT |
168 |
|
169 | -- Create a Redpanda (Kafka-compatible) source |
170 | CREATE CONNECTION kafka_conn TO KAFKA ( |
171 | BROKER '$PREF-redpanda:9092', SECURITY PROTOCOL PLAINTEXT |
172 | ) |
173 | CREATE CONNECTION csr_conn TO CONFLUENT SCHEMA REGISTRY ( |
174 | URL 'http://$PREF-redpanda:8081/' |
175 | ) |
176 | CREATE SOURCE redpanda_table FROM KAFKA CONNECTION kafka_conn ( |
177 | TOPIC 'redpanda_table' |
178 | ) FORMAT JSON |
179 |
|
180 | -- Simple materialized view, incrementally updated, with data from all sources |
181 | CREATE MATERIALIZED VIEW mv AS |
182 | SELECT sum(pg_table.f1 + mysql_table.f1 + webhook_table.body::int + |
183 | (redpanda_table.data->'f1')::int) |
184 | FROM pg_table |
185 | JOIN mysql_table ON TRUE |
186 | JOIN webhook_table ON TRUE |
187 | JOIN redpanda_table ON TRUE |
188 |
|
189 | -- Create a sink to Redpanda so that the topic will always be up to date |
190 | CREATE SINK sink FROM mv INTO KAFKA CONNECTION kafka_conn (TOPIC 'mv') |
191 | FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY CONNECTION csr_conn |
192 | ENVELOPE DEBEZIUM |
193 |
|
194 | -- One-off export of our materialized view to S3-compatible MinIO |
195 | CREATE SECRET miniopass AS 'minioadmin' |
196 | CREATE CONNECTION minio TO AWS ( |
197 | ENDPOINT 'http://minio:9000', |
198 | REGION 'minio', |
199 | ACCESS KEY ID 'minioadmin', |
200 | SECRET ACCESS KEY SECRET miniopass |
201 | ) |
202 | COPY (SELECT * FROM mv) TO 's3://$PREF/mv' WITH ( |
203 | AWS CONNECTION = minio, |
204 | FORMAT = 'csv' |
205 | ) |
206 |
|
207 | -- Allow HTTP API read requests without a token |
208 | CREATE ROLE anonymous_http_user |
209 | GRANT SELECT ON TABLE mv TO anonymous_http_user |
210 | EOF |
211 |
|
212 | |
213 | curl -d "1" -X POST http://127.0.0.1:6876/api/webhook/materialize/public/webhook_table |
214 | curl -d "2" -X POST http://127.0.0.1:6876/api/webhook/materialize/public/webhook_table |
215 | curl -d "3" -X POST http://127.0.0.1:6876/api/webhook/materialize/public/webhook_table |
216 |
|
217 | |
218 | docker compose exec -T redpanda rpk topic consume mv --num 1 |
219 |
|
220 | |
221 | docker compose exec -T minio mc ls data/mzemulator/mv |
222 |
|
223 | |
224 | psql postgres://materialize@127.0.0.1:6875/materialize <<EOF |
225 | SELECT * FROM pg_table |
226 | SELECT * FROM mysql_table |
227 | SELECT * FROM webhook_table |
228 | SELECT * FROM redpanda_table |
229 | SELECT * FROM mv |
230 | EOF |
231 |
|
232 | |
233 | curl -s -X POST -H "Content-Type: application/json" \ |
234 | --data '{"queries": [{"query": "SELECT * FROM mv"}]}' \ |
235 | http://localhost:6876/api/sql | jq -r ".results[0].rows[0][0]" |
236 |
|
Now you can start up a Materialize Emulator in under a minute:
1 | $ cd mzemulator |
2 | $ cat run.sh |
3 | #!/bin/bash |
4 | [...] |
5 | $ time ./run.sh |
6 | [...] |
7 | ./run.sh 0.34s user 0.36s system 1% cpu 45.462 total |
8 | $ psql postgres://materialize@127.0.0.1:6875/materialize -c "SELECT * FROM mv" |
9 | [...] |
10 | sum |
11 | ----- |
12 | 648 |
13 | (1 row) |
14 | $ docker compose down |
15 | [+] Running 6/6 |
16 | ✔ Container mzemulator-redpanda Removed 1.3s |
17 | ✔ Container mzemulator-mysql Removed 10.6s |
18 | ✔ Container mzemulator-postgres Removed 1.0s |
19 | ✔ Container mzemulator-minio Removed 0.7s |
20 | ✔ Container mzemulator-materialized Removed 1.2s |
21 | ✔ Network mzemulator_network Removed 0.4s |
22 |
|
It's that simple — just use the shell script to launch your Materialize Emulator. And if you prefer a GUI, open the local Materialize Console at http://localhost:6874.
Materialize Emulator: Test Quickly During Development
While Materialize is best experienced in our cloud, the Materialize Emulator allows you to quickly test your releases in a non-production environment.
Although the Materialize Emulator lacks many critical features included in the cloud version, the ability to test rapidly is helpful during development.
Try our Materialize Emulator right now to build your apps more efficiently! And sign up for a free trial of Materialize to see what our full cloud product is like.