Release: Materialize 0.5

November 24, 2020

We recently released Materialize 0.5! Here’s what’s new and improved.

What’s changed in Materialize 0.5

Version 0.5 includes a number of improvements to help run Materialize in production and connect it to other systems. These include improved Postgres compatibility and beta releases of source caching and tables.

As more customers bring Materialize to production, we have focused on polishing the features needed to run Materialize reliably and on supporting connections to enterprise infrastructure.

Expanding our support for Postgres: tables and system catalog

We’ve added more ways to get started with Materialize.

Tables

From day one, Materialize supported the Postgres wire protocol. To make Materialize easy to use, wherever possible we support Postgres’ SQL dialect rather than a pseudo-SQL or a SQL-esque format. This allows you to reuse your existing SQL and minimize migration efforts.

To make it easier to send data to Materialize, we now support tables. Tables are great for quickly loading static data into Materialize. You can create, populate, inspect, and remove tables with the CREATE TABLE, INSERT, SHOW CREATE TABLE, and DROP TABLE statements. A table is conceptually similar to a source, but the data in a table is managed by Materialize, rather than by Kafka or a filesystem.

Note that table data is currently ephemeral: data inserted into a table does not persist across restarts. To handle long-lived data in Materialize, we recommend you pair your table data with file sources and sinks.
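As a rough illustration of the table statements listed above, here is a minimal Python sketch that runs them through any DB-API cursor. The table name `regions`, its columns, and the helper `run_statements` are hypothetical and chosen only for illustration. Nothing here is taken from the Materialize documentation beyond the statement names mentioned in this post.

```python
# Sketch only: the table name, columns, and helper below are hypothetical.
TABLE_DEMO_STATEMENTS = [
    "CREATE TABLE regions (id int, name text)",
    "INSERT INTO regions VALUES (1, 'us-east')",
    "SHOW CREATE TABLE regions",
    "DROP TABLE regions",
]


def run_statements(cursor, statements):
    """Execute each statement in order on a DB-API cursor.

    Returns a list of (sql, rows) pairs; rows is empty for statements
    that do not produce a result set.
    """
    results = []
    for sql in statements:
        cursor.execute(sql)
        # DB-API cursors set description to None when no rows are returned.
        rows = cursor.fetchall() if cursor.description is not None else []
        results.append((sql, rows))
    return results
```

Because Materialize speaks the Postgres wire protocol, a cursor from a Postgres driver such as psycopg2 should work here, for example one opened against a local instance with autocommit enabled. The connection details are assumptions about your deployment. Since table data is ephemeral, the inserted row would not survive a restart.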

System Catalog

Materialize now exposes metadata about the running Materialize instance in the new system catalog, which describes the sources, tables, and views that can be queried via SQL. This is a stepping stone towards better support for software across the Postgres ecosystem.
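To give a feel for how a client tool might use catalog metadata, here is a hedged Python sketch that lists object names by kind. It assumes the catalog is exposed as ordinary relations named `mz_catalog.mz_sources`, `mz_catalog.mz_tables`, and `mz_catalog.mz_views`, each with a `name` column. Those relation and column names are assumptions, so check the release notes for the exact schema in your version. The helpers `catalog_query` and `list_objects` are hypothetical.

```python
# Assumed relation names; verify against the system catalog documentation.
CATALOG_RELATIONS = {
    "sources": "mz_catalog.mz_sources",
    "tables": "mz_catalog.mz_tables",
    "views": "mz_catalog.mz_views",
}


def catalog_query(kind):
    """Build a query that lists the names of all objects of the given kind."""
    relation = CATALOG_RELATIONS[kind]  # raises KeyError for unknown kinds
    return f"SELECT name FROM {relation} ORDER BY name"


def list_objects(cursor, kind):
    """Run the catalog query on a DB-API cursor and return the names."""
    cursor.execute(catalog_query(kind))
    return [row[0] for row in cursor.fetchall()]
```

The relation name comes from a fixed mapping, not from user input, which keeps the string interpolation safe.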

We’re prioritizing support for Postgres-compatible software based on user feedback, so please don’t hesitate to let us know what you’d be interested in!

Supporting production deployments

We added a web-based, interactive memory usage visualization to aid in understanding and diagnosing unexpected memory consumption. This was instrumental in helping reduce Materialize’s memory utilization for a variety of different queries in the 0.5 release.

Source caching

Source caching is a recently introduced feature that, in certain scenarios, reduces the need to re-ingest data when Materialize restarts.

A common architectural pattern to use with Materialize is to connect it to a database via a data stream such as Apache Kafka. Users who are concerned about disk storage constraints often rely on stream compaction. However, compaction may not always be available; for example, compacting the stream for Change Data Capture (CDC) users would result in incorrect data. Source caching allows these users to speed up Materialize on restart.

Source caching is now available for all users as an alpha release. We intend to support cloud-based object storage (such as S3) in subsequent versions of source caching, enabling even easier scaling and operations.

What’s coming in 0.6

We’re making it easier to consume data that has been processed by Materialize. To listen to a continually updated view, we’re extending TAIL to support machine-parsable formats. We’ve tested this in .NET (Npgsql) and will continue to extend this support to other native SQL drivers. We recently added the ability to write keys in Kafka sink output, and next we will add support for multiple Kafka partitions and UPSERT semantics.

We’re continuing to improve Postgres compatibility by supporting list and map types, as well as non-recursive common table expressions (WITH ... AS).

Get started today

The full release notes for 0.5 are located here. Sign up for Materialize today to get faster answers to your data questions, and check out our source code on GitHub!

You can also join our growing Slack community to ask questions or to provide feedback on Materialize.