In our last blog about our Quality Assurance (QA) team, we gave an overview of the QA process, including our software and testing methods. One of our key tools during testing is the Materialize Emulator, a Docker image that allows you to maintain a locally hosted version of Materialize.
But there's an important caveat: the Materialize Emulator cannot support production workloads. The Materialize Emulator lacks critical features of our cloud platform, including fault tolerance and GUI support. But the Emulator is great for testing and prototyping.
In the following blog, we'll outline a step-by-step walkthrough of how to use the Materialize Emulator.
Materialize Emulator: What Is It?
The Materialize Emulator is an all-in-one Docker image available on Docker Hub for testing and evaluation purposes. The Emulator is not representative of Materialize’s performance and full feature set.
To view a comparison between the Materialize Emulator and the Materialize cloud platform, see the table below:
We've always used the Materialize Emulator for testing, except for the kind that requires cloud integration with Kubernetes.
If you want to use Materialize in production scenarios, sign up for a free trial account or schedule a demo.
Step-by-Step Walkthrough: How to Use the Materialize Emulator
Let's walk through a basic example of how to use the Materialize Emulator with a PostgreSQL source. The only requirements are Docker and the postgres-client (psql).
1 | docker network create mznet |
2 | docker pull materialize/materialized:latest |
3 | docker run --name materialized --network mznet -d -p 127.0.0.1:6875:6875 \ |
4 | -p 127.0.0.1:6876:6876 materialize/materialized:latest |
5 |
|
We'll publish the port to localhost, since Materialize is running without authentication. Without a NAT or firewall, anyone on the internet can connect to your Materialize instance. You can specify ports if you want to allow access, such as: -p 6875:6875 or -p 6876:6876.
Now Materialize is running locally and we can connect to it:
1 | $ psql postgres://materialize@127.0.0.1:6875/materialize |
2 | NOTICE: connected to Materialize v0.118.0 |
3 | Org ID: 4b733a37-b64d-44a2-8e79-e0ebd8a177ba |
4 | Region: docker/container |
5 | User: materialize |
6 | Cluster: quickstart |
7 | Database: materialize |
8 | Schema: public |
9 | Session UUID: 2631437c-61d6-4984-a68b-433f5751cecf |
10 |
|
11 | Issue a SQL query to get started. Need help? |
12 | View documentation: https://materialize.com/s/docs |
13 | Join our Slack community: https://materialize.com/s/chat |
14 |
|
15 | psql (15.7 (Ubuntu 15.7-0ubuntu0.23.10.1), server 9.5.0) |
16 | Type "help" for help. |
17 |
|
18 | materialize=> |
19 |
|
Let's start up a Postgres server:
1 | docker run --name postgres --network mznet \ |
2 | -e POSTGRES_PASSWORD=postgres \ |
3 | -e POSTGRES_INITDB_ARGS="-c wal_level=logical" \ |
4 | -p 127.0.0.1:5432:5432 -d postgres |
5 |
|
Connect to the Postgres server. Then generate a simple table. We will replicate this table to Materialize.
1 | $ psql postgres://postgres:postgres@127.0.0.1:5432/postgres |
2 | psql (15.7 (Ubuntu 15.7-0ubuntu0.23.10.1), server 16.4 (Debian 16.4-1.pgdg120+1)) |
3 | WARNING: psql major version 15, server major version 16. |
4 | Some psql features might not work. |
5 | Type "help" for help. |
6 |
|
7 | postgres=# CREATE PUBLICATION mz_source FOR ALL TABLES; |
8 | CREATE PUBLICATION |
9 | postgres=# CREATE TABLE t (f1 INTEGER); |
10 | CREATE TABLE |
11 | postgres=# ALTER TABLE t REPLICA IDENTITY FULL; |
12 | ALTER TABLE |
13 | postgres=# INSERT INTO t VALUES (1), (2), (3); |
14 | INSERT 0 3 |
15 |
|
Now use Materialize to connect to the Postgres instance:
1 | materialize=> CREATE SECRET pgpass AS 'postgres'; |
2 | CREATE SECRET |
3 | materialize=> CREATE CONNECTION pg TO POSTGRES ( |
4 | HOST postgres, DATABASE postgres, USER postgres, PASSWORD SECRET pgpass |
5 | ); |
6 | CREATE CONNECTION |
7 | materialize=> CREATE SOURCE mz_source FROM POSTGRES CONNECTION pg ( |
8 | PUBLICATION 'mz_source' |
9 | ) FOR SCHEMAS (public); |
10 | CREATE SOURCE |
11 | materialize=> SELECT * FROM t; |
12 | f1 |
13 | |
14 | 1 |
15 | 2 |
16 | 3 |
17 | (3 rows) |
18 | materialize=> CREATE MATERIALIZED VIEW mv AS SELECT sum(f1) FROM t; |
19 | CREATE MATERIALIZED VIEW |
20 | materialize=> SELECT * FROM mv; |
21 | sum |
22 | |
23 | 6 |
24 | (1 row) |
25 |
|
That's how you replicate the Postgres table in Materialize. Now let's perform a query. Let's execute a one-off query on both Materialize and Postgres. We'll design a heavy workload.
1 | \postgres=# \timing |
2 | Timing is on. |
3 | postgres=# INSERT INTO t (f1) SELECT * FROM generate_series(4, 10000); |
4 | INSERT 0 9997 |
5 | Time: 10.137 ms |
6 | postgres=# SELECT sum(t.f1 + t2.f1) FROM t JOIN t AS t2 ON true; |
7 | sum |
8 | |
9 | 1000100000000 |
10 | (1 row) |
11 | Time: 2323.538 ms (00:02.324) |
12 |
|
Materialize performs the query in 37 seconds, while Postgres performs the query in 2 seconds. This is because Materialize is not designed for one-off queries.
Materialize is optimized for materialized views that update incrementally. Read more about how materialized views work in Materialize. Let's create a materialized view as follows:
1 | materialize=> SELECT sum(t.f1 + t2.f1) FROM t JOIN t AS t2 ON true; |
2 | sum |
3 | |
4 | 1000100000000 |
5 | (1 row) |
6 | Time: 37277.756 ms (00:37.278) |
7 | materialize=> CREATE MATERIALIZED VIEW mv AS |
8 | SELECT sum(t.f1 + t2.f1) FROM t JOIN t AS t2 ON true; |
9 | CREATE MATERIALIZED VIEW |
10 | Time: 327.252 ms |
11 | materialize=> SELECT * FROM mv; |
12 | sum |
13 | |
14 | 1000100000000 |
15 | (1 row) |
16 | Time: 27.919 ms |
17 |
|
With Materialize, every change to the source table (t1) in Postgres will only require a small amount of incremental work to update the mv materialized view. This is done during INSERT, not during SELECT. And you can use declarative SQL to define the whole view.
1 | postgres=# INSERT INTO t (f1) VALUES (10001); |
2 | INSERT 0 1 |
3 | Time: 5.627 ms |
4 |
|
5 | materialize=> SELECT * FROM mv; |
6 | sum |
7 | --------------- |
8 | 1000400050002 |
9 | (1 row) |
10 | Time: 40.362 ms |
11 |
|
You can also subscribe to the the materialized view and receive instant updates about all of the changes:
1 | materialize=> COPY (SUBSCRIBE (SELECT * FROM mv)) TO STDOUT; |
2 | 1727715520600 1 1000400050002 |
3 | 1727715526000 1 1000700160012 |
4 | 1727715526000 -1 1000400050002 |
5 | 1727715528000 -1 1000700160012 |
6 | 1727715528000 1 1001000330036 |
7 |
|
This is the output (timestamp, added (1)/removed (-1), value) when these commands run in Postgres:
1 | postgres=# INSERT INTO t (f1) VALUES (10002); |
2 | INSERT 0 1 |
3 | postgres=# INSERT INTO t (f1) VALUES (10003); |
4 | INSERT 0 1 |
5 |
|
To clean up, we can stop the Docker containers again:
1 | docker stop materialized postgres |
2 | docker rm materialized postgres |
3 | docker network rm mznet |
4 |
|
And that's it! This is how you launch the Docker image, and define a materialized view, using the Materialize Emulator.
Shell Script: Materialize Emulator as a Docker Compose Project
To tie things together, here is a small shell script (run.sh) that runs the Materialize Emulator as a Docker Compose project.
The shell script contains many of Materialize's features, including a materialized view mv that combines the data of all these sources:
Also, the script uses the following to get the mv out of Materialize:
You can copy the full shell script below:
1 | #!/bin/bash |
2 | set -euo pipefail |
3 |
|
4 | PREF="${PWD##*/}" |
5 |
|
6 | wait_for_health() { |
7 | echo -n "waiting for container '$PREF-$1' to be healthy" |
8 | while [ "$(docker inspect -f '{{.State.Health.Status}}' "$PREF-$1")" != "healthy" ]; do |
9 | echo -n "." |
10 | sleep 1 |
11 | done |
12 | printf "\ncontainer '%s' is healthy\n" "$PREF-$1" |
13 | } |
14 |
|
15 | cat > docker-compose.yml <<EOF |
16 | version: '3.8' |
17 | services: |
18 | materialized: |
19 | image: materialize/materialized:latest |
20 | container_name: $PREF-materialized |
21 | environment: |
22 | MZ_SYSTEM_PARAMETER_DEFAULT: "enable_copy_to_expr=true" |
23 | networks: |
24 | - network |
25 | ports: |
26 | - "127.0.0.1:6875:6875" |
27 | - "127.0.0.1:6876:6876" |
28 | healthcheck: |
29 | test: ["CMD", "curl", "-f", "localhost:6878/api/readyz"] |
30 | interval: 1s |
31 | start_period: 60s |
32 |
|
33 | postgres: |
34 | image: postgres:latest |
35 | container_name: $PREF-postgres |
36 | environment: |
37 | POSTGRES_PASSWORD: postgres |
38 | POSTGRES_INITDB_ARGS: "-c wal_level=logical" |
39 | networks: |
40 | - network |
41 | ports: |
42 | - "127.0.0.1:5432:5432" |
43 | healthcheck: |
44 | test: ["CMD", "pg_isready", "-d", "db_prod"] |
45 | interval: 1s |
46 | start_period: 60s |
47 |
|
48 | mysql: |
49 | image: mysql:latest |
50 | container_name: $PREF-mysql |
51 | environment: |
52 | MYSQL_ROOT_PASSWORD: mysql |
53 | networks: |
54 | - network |
55 | ports: |
56 | - "127.0.0.1:3306:3306" |
57 | command: |
58 | - "--log-bin=mysql-bin" |
59 | - "--gtid_mode=ON" |
60 | - "--enforce_gtid_consistency=ON" |
61 | - "--binlog-format=row" |
62 | - "--binlog-row-image=full" |
63 | healthcheck: |
64 | test: ["CMD", "mysqladmin", "ping", "--password=mysql", "--protocol=TCP"] |
65 | interval: 1s |
66 | start_period: 60s |
67 |
|
68 | redpanda: |
69 | image: vectorized/redpanda:latest |
70 | container_name: $PREF-redpanda |
71 | networks: |
72 | - network |
73 | ports: |
74 | - "127.0.0.1:9092:9092" |
75 | - "127.0.0.1:8081:8081" |
76 | command: |
77 | - "redpanda" |
78 | - "start" |
79 | - "--overprovisioned" |
80 | - "--smp=1" |
81 | - "--memory=1G" |
82 | - "--reserve-memory=0M" |
83 | - "--node-id=0" |
84 | - "--check=false" |
85 | - "--set" |
86 | - "redpanda.enable_transactions=true" |
87 | - "--set" |
88 | - "redpanda.enable_idempotence=true" |
89 | - "--set" |
90 | - "--advertise-kafka-addr=redpanda:9092" |
91 | healthcheck: |
92 | test: ["CMD", "curl", "-f", "localhost:9644/v1/status/ready"] |
93 | interval: 1s |
94 | start_period: 60s |
95 |
|
96 | minio: |
97 | image: minio/minio:latest |
98 | container_name: $PREF-minio |
99 | environment: |
100 | MINIO_STORAGE_CLASS_STANDARD: "EC:0" |
101 | networks: |
102 | - network |
103 | ports: |
104 | - "127.0.0.1:9000:9000" |
105 | - "127.0.0.1:9001:9001" |
106 | entrypoint: ["sh", "-c"] |
107 | command: ["mkdir -p /data/$PREF && minio server /data --console-address :9001"] |
108 | healthcheck: |
109 | test: ["CMD", "curl", "-f", "localhost:9000/minio/health/live"] |
110 | interval: 1s |
111 | start_period: 60s |
112 |
|
113 | networks: |
114 | network: |
115 | driver: bridge |
116 | EOF |
117 | docker compose down || true |
118 | docker compose up -d |
119 |
|
120 | wait_for_health postgres |
121 | psql postgres://postgres:postgres@127.0.0.1:5432/postgres <<EOF |
122 | CREATE PUBLICATION mz_source FOR ALL TABLES; |
123 | CREATE TABLE pg_table (f1 INTEGER); |
124 | ALTER TABLE pg_table REPLICA IDENTITY FULL; |
125 | INSERT INTO pg_table VALUES (1), (2), (3); |
126 | EOF |
127 |
|
128 | wait_for_health mysql |
129 | mysql --protocol=tcp --user=root --password=mysql <<EOF |
130 | CREATE DATABASE public; |
131 | USE public; |
132 | CREATE TABLE mysql_table (f1 INTEGER); |
133 | INSERT INTO mysql_table VALUES (1), (2), (3); |
134 | EOF |
135 |
|
136 | wait_for_health redpanda |
137 | docker compose exec -T redpanda rpk topic create redpanda_table |
138 | docker compose exec -T redpanda rpk topic produce redpanda_table <<EOF |
139 | {"f1": 1} |
140 | {"f1": 2} |
141 | {"f1": 3} |
142 | EOF |
143 |
|
144 | wait_for_health materialized |
145 | psql postgres://materialize@127.0.0.1:6875/materialize <<EOF |
146 | -- Create a Postgres source |
147 | CREATE SECRET pgpass AS 'postgres'; |
148 | CREATE CONNECTION pg TO POSTGRES ( |
149 | HOST '$PREF-postgres', DATABASE postgres, USER postgres, PASSWORD SECRET pgpass |
150 | ); |
151 | CREATE SOURCE mz_source FROM POSTGRES CONNECTION pg ( |
152 | PUBLICATION 'mz_source' |
153 | ) FOR SCHEMAS (public); |
154 |
|
155 | -- Create a MySQL source |
156 | CREATE SECRET mysqlpass AS 'mysql'; |
157 | CREATE CONNECTION mysql TO MYSQL ( |
158 | HOST '$PREF-mysql', USER root, PASSWORD SECRET mysqlpass |
159 | ); |
160 | CREATE SOURCE mysql_source FROM MYSQL CONNECTION mysql FOR ALL TABLES; |
161 |
|
162 | -- Create a Webhook source |
163 | CREATE SOURCE webhook_table FROM WEBHOOK BODY FORMAT TEXT; |
164 |
|
165 | -- Create a Redpanda (Kafka-compatible) source |
166 | CREATE CONNECTION kafka_conn TO KAFKA ( |
167 | BROKER '$PREF-redpanda:9092', SECURITY PROTOCOL PLAINTEXT |
168 | ); |
169 | CREATE CONNECTION csr_conn TO CONFLUENT SCHEMA REGISTRY ( |
170 | URL 'http://$PREF-redpanda:8081/' |
171 | ); |
172 | CREATE SOURCE redpanda_table FROM KAFKA CONNECTION kafka_conn ( |
173 | TOPIC 'redpanda_table' |
174 | ) FORMAT JSON; |
175 |
|
176 | -- Simple materialized view, incrementally updated, with data from all sources |
177 | CREATE MATERIALIZED VIEW mv AS |
178 | SELECT sum(pg_table.f1 + mysql_table.f1 + webhook_table.body::int + |
179 | (redpanda_table.data->'f1')::int) |
180 | FROM pg_table |
181 | JOIN mysql_table ON TRUE |
182 | JOIN webhook_table ON TRUE |
183 | JOIN redpanda_table ON TRUE; |
184 |
|
185 | -- Create a sink to Redpanda so that the topic will always be up to date |
186 | CREATE SINK sink FROM mv INTO KAFKA CONNECTION kafka_conn (TOPIC 'mv') |
187 | FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY CONNECTION csr_conn |
188 | ENVELOPE DEBEZIUM; |
189 |
|
190 | -- One-off export of our materialized view to S3-compatible MinIO |
191 | CREATE SECRET miniopass AS 'minioadmin'; |
192 | CREATE CONNECTION minio TO AWS ( |
193 | ENDPOINT 'http://minio:9000', |
194 | REGION 'minio', |
195 | ACCESS KEY ID 'minioadmin', |
196 | SECRET ACCESS KEY SECRET miniopass |
197 | ); |
198 | COPY (SELECT |
199 | AWS CONNECTION = minio, |
200 | FORMAT = 'csv' |
201 | ); |
202 |
|
203 | -- Allow HTTP API read requests without a token |
204 | CREATE ROLE anonymous_http_user; |
205 | GRANT SELECT ON TABLE mv TO anonymous_http_user; |
206 | EOF |
207 |
|
208 | # Write additional data into Webhook source |
209 | curl -d "1" -X POST http://127.0.0.1:6876/api/webhook/materialize/public/webhook_table |
210 | curl -d "2" -X POST http://127.0.0.1:6876/api/webhook/materialize/public/webhook_table |
211 | curl -d "3" -X POST http://127.0.0.1:6876/api/webhook/materialize/public/webhook_table |
212 |
|
213 | # Read latest data from Redpanda |
214 | docker compose exec -T redpanda rpk topic consume mv --num 1 |
215 |
|
216 | # CSV exists on S3-compatible MinIO |
217 | docker compose exec -T minio mc ls data/mzemulator/mv |
218 |
|
219 | # Use Postgres wire-compatible |
220 | psql postgres://materialize@127.0.0.1:6875/materialize <<EOF |
221 | SELECT |
222 | SELECT |
223 | SELECT |
224 | SELECT |
225 | SELECT |
226 | EOF |
227 |
|
228 | # Use HTTP API |
229 | curl -s -X POST -H "Content-Type: application/json" \ |
230 | --data '{"queries": [{"query": "SELECT * FROM mv"}]}' \ |
231 | http://localhost:6876/api/sql | jq -r ".results[0].rows[0][0]" |
232 |
|
Now you can start up a Materialize Emulator in under a minute:
1 | $ cd mzemulator |
2 | $ cat run.sh |
3 | #!/bin/bash |
4 | [...] |
5 | $ time ./run.sh |
6 | [...] |
7 | ./run.sh 0.34s user 0.36s system 1% cpu 45.462 total |
8 | $ psql postgres://materialize@127.0.0.1:6875/materialize -c "SELECT * FROM mv" |
9 | [...] |
10 | sum |
11 | ----- |
12 | 648 |
13 | (1 row) |
14 | $ docker compose down |
15 | [+] Running 6/6 |
16 | ✔ Container mzemulator-redpanda Removed 1.3s |
17 | ✔ Container mzemulator-mysql Removed 10.6s |
18 | ✔ Container mzemulator-postgres Removed 1.0s |
19 | ✔ Container mzemulator-minio Removed 0.7s |
20 | ✔ Container mzemulator-materialized Removed 1.2s |
21 | ✔ Network mzemulator_network Removed 0.4s |
22 |
|
It's that simple — just use the shell script to launch your Materialize Emulator.
Materialize Emulator: Test Quickly During Development
While Materialize is best experienced in our cloud, the Materialize Emulator allows you to quickly test your releases in a non-production environment.
Although the Materialize Emulator lacks many critical features included in the cloud version, the ability to test rapidly is helpful during development.
Try our Materialize Emulator right now to build your apps more efficiently! And sign up for a free trial of Materialize to see what our full cloud product is like.