Deep-dive: Generalizing linear operators in differential dataflow

Differential dataflows contain many operators, some of which are very complicated, but many of which are relatively simple. The map operator applies a transformation to each record. The filter operator applies a predicate to each record, and drops records that do not pass it. The flat_map operator applies a function to each record that can […]

Deep-dive: Join Kafka with a Database using Debezium and Materialize

The Problem: We need to provide (internal or end-user) access to a view of data that combines a fast-changing stream of events from Kafka with a table from a database (which is also changing). Here are a few real-world examples where this problem comes up: Calculate API usage by joining API logs in Kafka with […]

Deep-dive: How Materialize and other databases optimize SQL subqueries

Subqueries are a SQL feature that allows writing queries nested inside a scalar expression in an outer query. Using subqueries is often the most natural way to express a given problem, but their use is discouraged because most databases struggle to execute them efficiently. This post gives a rough map of existing approaches to optimizing […]

Deep-dive: Temporal Filters: Enabling Windowed Queries in Materialize

Materialize provides a SQL interface to work with continually changing data. You write SQL queries as if against static data, and then as your data change we keep the results of your queries automatically up to date, in milliseconds. Materialize leans hard into the ideal that SQL is what you know best, and what you […]

Deep-dive: A Simple and Efficient Real Time Application Powered by Materialize’s TAIL Command

Within the web development community, there has been a clear shift towards frameworks that implement incremental view maintenance, and for good reason. When state is updated incrementally, applications perform better and require fewer resources. Using Materialize, developers and data analysts can adopt the same event-driven techniques in their data processing pipelines, leveraging existing SQL […]

Deep-dive: Slicing up Temporal Aggregates in Materialize

Materialize computes and maintains SQL queries as your underlying data change. This makes it especially well-suited to tracking the current state of various SQL queries and aggregates! But what if you want to root around in the past? Maybe you want to compare today’s numbers to yesterday’s numbers. Maybe you want to scrub through the […]

Deep-dive: Life in Differential Dataflow

I’ve been working at Materialize for almost a year now, and I have really enjoyed learning about and using Differential Dataflow (hereafter just Differential) in my day-to-day work. In this post, I’ll introduce Differential and talk through implementing a few common programming problems like list intersection and everyone’s favorite, FizzBuzz, as dataflow programs. Finally, I’ll […]

Deep-dive: Joins in Materialize

This post is also available at my personal blog. Materialize allows you to maintain declarative, relational SQL queries over continually changing data. One of the most powerful features of SQL queries is the join: the ability to correlate records from multiple collections of data. Joins also happen to be one of the harder things to do […]

Deep-dive: Lateral Joins and Demand-Driven Queries

In today’s post we are going to show off Materialize’s LATERAL join (courtesy @benesch), and how you can use it to implement some pretty neat query patterns in an incremental view maintenance engine! In particular, in the streaming SQL setting, lateral joins automatically turn your SQL prepared statement queries into what is essentially a streaming, […]

Deep-dive: Why not RocksDB for streaming storage?

A roadmap for a storage engine for Materialize

About This Blog

Welcome! On our blog, you’ll hear more about the inner workings of Materialize – what we’ve built, what we plan to build, and how it all works together.
