Materialize 0.6 makes it easier to consume streams and build streaming applications. We’ve also made a number of changes that improve our SQL compatibility. Here are more details on some of the noteworthy features we’ve added in this release:

What’s changed in Materialize 0.6

Easily listen to streaming changes

TAIL is a Materialize-specific command we recently introduced to stream updates from a source, table, or view as they occur. Whereas a SQL SELECT statement returns a result that captures a moment in time, a tail operation computes how that relation changes over time.
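To make the difference between a snapshot and a stream of changes concrete, here is a small conceptual sketch in plain Python. It does not talk to Materialize and is not how TAIL is implemented; the snapshot data, the `diff_snapshots` and `tail_like` helpers, and the `(timestamp, diff, row)` output shape are all assumptions made for illustration.

```python
from collections import Counter

def diff_snapshots(before, after):
    """Return (row, diff) pairs describing how `before` turns into `after`."""
    old, new = Counter(before), Counter(after)
    changes = []
    for row in sorted(set(old) | set(new)):
        delta = new[row] - old[row]
        if delta != 0:
            changes.append((row, delta))
    return changes

# Hypothetical contents of a view at three successive timestamps.
# Each entry is what a SELECT would return at that moment.
snapshots = {
    1: [("alice", 10)],
    2: [("alice", 10), ("bob", 5)],
    3: [("alice", 12), ("bob", 5)],
}

def tail_like(snapshots):
    """Turn a series of snapshots into a stream of (timestamp, diff, row) updates."""
    updates = []
    previous = []
    for ts in sorted(snapshots):
        for row, delta in diff_snapshots(previous, snapshots[ts]):
            updates.append((ts, delta, row))
        previous = snapshots[ts]
    return updates

updates = tail_like(snapshots)
for update in updates:
    print(update)
```

Each snapshot answers "what does the relation look like right now?", while the list of updates answers "what changed, and when?". An update to alice’s row shows up as one retraction (diff of -1) and one insertion (diff of +1) at the same timestamp.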

In 0.6, TAIL is significantly more mature and functional. We’ve made TAIL more reliable, improved ordering semantics, and added more language driver compatibility. We’ve verified support for TAIL in two client libraries, Npgsql (C#) and psycopg2 (Python), and we will continue to add support for more libraries based on user feedback. See our documentation for examples of how to use TAIL. Also see our previous blog post for an end-to-end example of how to stream updates to a browser.

Non-recursive common table expressions

Common table expressions (CTEs) return a temporary result set that can be used within another SQL statement. CTEs are often used to simplify complex joins and subqueries, and are written with the form WITH ... AS. By supporting non-recursive CTEs as of 0.6, Materialize makes it easier to write more expressive SQL and connect with existing libraries and applications.
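For readers who have not used CTEs before, the sketch below shows the WITH ... AS form in action. To keep it runnable anywhere, it uses Python’s built-in sqlite3 module as a stand-in database rather than Materialize; the `orders` table and its contents are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES ('alice', 30), ('alice', 20), ('bob', 15);
""")

# The CTE `totals` names an intermediate result so the outer query
# can filter on it without repeating the aggregation as a subquery.
query = """
    WITH totals AS (
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    )
    SELECT customer, total
    FROM totals
    WHERE total > 20
    ORDER BY customer
"""
big_spenders = conn.execute(query).fetchall()
print(big_spenders)
conn.close()
```

The same query could be written with a nested subquery, but naming the intermediate result tends to read more clearly as queries grow.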

Supporting the map data type

Materialize now supports a map type. This can be useful to model your data more accurately, and is especially helpful when ingesting Avro streams, where we’ve found numerous examples of datasets that utilize maps.

Enterprise-grade encryption

Materialize now has partial support for PostgreSQL’s pgcrypto package. This is useful for enterprise applications, where messages may need to be encrypted/decrypted before they can be properly consumed.

Column defaults

Specifying default values for table columns via the new DEFAULT column option in CREATE TABLE is now supported. Special thanks to community member @petrosagg for his contribution!
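As a quick illustration of what the DEFAULT column option does, here is a sketch that again uses Python’s built-in sqlite3 module as a stand-in rather than Materialize. The `accounts` table and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        name    TEXT,
        status  TEXT    DEFAULT 'active',
        balance INTEGER DEFAULT 0
    )
""")

# Columns omitted from an INSERT are filled in with their defaults.
conn.execute("INSERT INTO accounts (name) VALUES ('alice')")
conn.execute("INSERT INTO accounts (name, balance) VALUES ('bob', 25)")

rows = conn.execute(
    "SELECT name, status, balance FROM accounts ORDER BY name"
).fetchall()
print(rows)
conn.close()
```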

The full release notes for 0.6 are available here.

What’s coming in 0.7

Query language user-defined functions

While the declarative nature of SQL means it is easy to get started, sometimes you wish to do something that isn’t easily expressed with existing SQL statements. In 0.7, we’ll be starting with query language user-defined functions (UDFs), which are reusable SQL functions that execute an arbitrary list of SQL statements.

Over time, we intend to evolve this to support more generic UDFs, such as procedural language functions. As an example, we are experimenting with WebAssembly, which would enable users to write functions in JavaScript. Please join the conversation if there are use cases you would be interested in using UDFs for!

Deepening connector functionality

Cloud object storage (S3)

Cloud-native object storage like Amazon Web Services’ Simple Storage Service (AWS S3) is widely used today, often for data lake and ETL use cases. With our recent support for file-based data sources and INSERT table semantics, a common request has been to support ingestion of AWS S3 objects. Users have asked for the ability to ingest ETL’d data, such as periodic data extracts, so that they can join live databases with their data lakes.

The first versions of Materialize S3 compatibility will support reading a single static object, or multiple static objects that match a pattern. Because there’s a large surface area to cover (various use cases and data formats), we’ll continue to evolve our compatibility over time based on user feedback.
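To give a feel for what selecting objects by pattern might look like, here is a minimal sketch in plain Python. It does not call S3 or Materialize: the object keys are made up, `select_objects` is a hypothetical helper, and the glob syntax is that of Python’s fnmatch module, which may differ from the pattern syntax Materialize ends up supporting.

```python
from fnmatch import fnmatchcase

# Hypothetical object keys in a bucket.
object_keys = [
    "exports/2020-12-01/users.csv",
    "exports/2020-12-02/users.csv",
    "exports/2020-12-02/orders.csv",
    "logs/2020-12-02/app.log",
]

def select_objects(keys, pattern):
    """Return the keys that match a glob-style pattern (case-sensitive)."""
    return [key for key in keys if fnmatchcase(key, pattern)]

# A pattern with no wildcards selects a single object.
single = select_objects(object_keys, "exports/2020-12-01/users.csv")

# A wildcard selects every daily users extract.
multiple = select_objects(object_keys, "exports/*/users.csv")

print(single)
print(multiple)
```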

Upsert semantics and Kafka offsets

We recently added the ability to specify keys with sinks, which enables greater flexibility when consuming Materialize outputs. Next, we’ll be supporting UPSERT sink envelopes, in which the deletion of a key follows the convention of emitting an empty value for that key.
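The sketch below shows how a downstream consumer might interpret a keyed stream under this convention. It is plain Python with made-up records and a hypothetical `apply_upserts` helper, not Materialize or Kafka client code; an empty value is modeled as `None`.

```python
def apply_upserts(records):
    """Fold (key, value) records into the latest state per key.

    A value of None stands in for an empty value and deletes the key.
    """
    state = {}
    for key, value in records:
        if value is None:
            state.pop(key, None)  # deletion; ignore keys we never saw
        else:
            state[key] = value    # insert or overwrite
    return state

# Hypothetical records as a consumer would read them, in order.
records = [
    ("user-1", {"plan": "free"}),
    ("user-2", {"plan": "pro"}),
    ("user-1", {"plan": "pro"}),  # update: overwrites the earlier value
    ("user-2", None),             # empty value: user-2 is deleted
]

final_state = apply_upserts(records)
print(final_state)
```

Because every record carries a key, the consumer only needs the latest value per key and never has to match a deletion against the exact row being removed.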

We’re also adding the ability to consume Kafka streams starting from an offset. Today, Materialize consumes a stream of database updates, also known as a change data capture (CDC) stream, only from the beginning, because skipping arbitrary records can cause results to become illogical. However, in practice, we’ve found that customers also want to skip records that have corrupted values or that use an obsolete schema.
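To see why skipping records in a CDC stream can produce illogical results, consider this small sketch in plain Python. The stream contents and the `replay` helper are hypothetical and only model the idea of applying inserts (+1) and deletes (-1) from a chosen starting offset.

```python
from collections import Counter

# Hypothetical CDC stream: each record is (row, diff), where +1 is an
# insert and -1 is a delete. The list index plays the role of the offset.
cdc_stream = [
    ("apple", +1),  # offset 0
    ("pear", +1),   # offset 1
    ("apple", -1),  # offset 2
    ("plum", +1),   # offset 3
]

def replay(stream, start_offset=0):
    """Apply diffs from `start_offset` onward and return the nonzero row counts."""
    counts = Counter()
    for row, diff in stream[start_offset:]:
        counts[row] += diff
    return {row: n for row, n in counts.items() if n != 0}

# Reading from the beginning: apple was inserted and later deleted.
full = replay(cdc_stream)

# Skipping offset 0: the delete for apple has no matching insert,
# leaving a row with a count of -1.
skipped = replay(cdc_stream, start_offset=1)

print(full)
print(skipped)
```

A row that exists "negative one times" is exactly the kind of result that makes arbitrary skipping unsafe for CDC data, which is why starting from an offset needs to be an explicit choice by the user.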

Get started today

The full release notes for 0.6 are located here. Register for a Materialize account here to get started, or check out our source code on GitHub.

You can also join our growing Slack community to ask questions or to provide feedback on Materialize.
