Blog

Oct 5, 2023

Sometimes you only need a little bit of Materialize

Tech Demo

Get a 30-minute tour of Materialize: go through how it works and how customers are using it to build operational data products faster.

Announcing Webhook Sources

Today Materialize customers can create webhook sources, making it much easier to pipe in events from a long tail of SaaS platforms, services, and tools.

Consistency and Operational Confidence

Learn about Materialize's consistency guarantees, and how these guarantees are the foundation of confidence in an operational data warehouse. See these guarantees in action, using tests that you can also use to explore the properties of other solutions.

A guided tour through Materialize's product principles

Take a guided tour through Materialize's three pillars of product value, and see how we think about providing value for your operational workloads.

Changelog

Sep 22, 2023

Connection validation

Materialize now automatically runs validation checks on connection creation, and allows you to manually validate connection details for AWS PrivateLink and SSH tunnel connections with the new `VALIDATE CONNECTION` syntax.


Sep 20, 2023

Introduction to the Operational Data Warehouse

Webinar

Join us for a technical overview and Q&A on Operational Data Warehouses and Materialize, presented by cofounders Arjun Narayan and Frank McSherry.

Materialize: An Operational Data Warehouse

We've built Materialize as a new kind of data warehouse, optimized to handle operational data work with the same familiar process from analytical warehouses.

Changelog

Sep 1, 2023

Role-based access control (RBAC) 🔒

Use role-based access control to configure and manage a hierarchy of roles and permissions for your Materialize organization.


RBAC now available for all customers

Role-based access control (RBAC) can now be enabled for any customer environment, giving Materialize users important security controls for production-grade workloads.

How Materialize can lower the cost of freshness for data teams

Materialize has a subtly different cost model that is a huge advantage for operational workloads that need fresh data.

How to pull data from Materialize into Excel

Materialize is Postgres wire-compatible, so a standard Postgres ODBC driver can be used to pull data from Materialize into spreadsheets in Excel.

Capturing Change Data Capture (CDC) Data

An illustration of the unexpectedly high downstream cost of clever optimizations to change data capture.

The uses and abuses of Cloud Data Warehouses

Data Warehouses are great for many things but often misused for operational workloads.

Changelog

Jul 26, 2023

PostgreSQL source: support adding and dropping tables

Handle schema evolution and replication errors in PostgreSQL sources using the new `ALTER SOURCE...{ADD|DROP} TABLE` syntax.


Jul 23, 2023

DBeaver native driver

Use the new Materialize database driver in DBeaver to connect to your Materialize region.


Materialize and Confluent partner to expand the streaming ecosystem

Confluent Cloud customers can now quickly and seamlessly integrate with Materialize via an officially-supported integration, bringing performant and fully-featured SQL on Kafka capabilities within reach of all data teams.

Incrementally Maintained Recursive SQL Queries in Materialize

Support for recursive SQL queries in Materialize is now available in public preview.

MySQL CDC with Debezium in Production

Use this guide to go beyond proof-of-concept and ship a complete log-based change data capture (CDC) pipeline from MySQL to Snowflake and Materialize.

Changelog

Jul 10, 2023

New SQL shell 🐚

Interact with Materialize right in the web console with the new SQL shell.


Jul 7, 2023

Kafka source: improved JSON support

Save some conversion typing when handling JSON-encoded Kafka topics using the new `FORMAT JSON` option.


Jul 6, 2023

Cluster management (revisited)

Manage clusters without thinking about individual replicas.


Data Freshness, Defined.

What is Data Freshness? It depends on the context. Read on for a working definition of this concept in Data Engineering.

Changelog

Jun 22, 2023

New source creation UI ✨

Create connections to external data sources and start ingesting data with a few clicks using the new source creation UI.


Jun 21, 2023

The Materialize Changelog is here!

We're starting a changelog to keep you up to speed with all new features and improvements landing in and around Materialize.


When and Why to move work from a Data Warehouse to Materialize

A framework for understanding why and when to shift a workload from traditional cloud data warehouses to Materialize.

Postgres Source Updates: Unlocking real-time materialized views in PostgreSQL

With major updates to the streaming replication connection to PostgreSQL, users can now set up Materialize as a drop-in enabler of real-time, incrementally updated, materialized views for their PostgreSQL database.

When to use Materialize vs a Stream Processor

If you're already familiar with stream processors you may wonder: When is it better to use Materialize vs a Stream Processor? And why?

May 11, 2023

Data Products and Decentralized Data Teams with Materialize

Webinar

Challenges scaling data teams are real, so why does it seem like real examples of “data mesh” and “data products” are so rare in the wild? In this talk, we’re going to try to tease out the good ideas in these patterns and figure out how we can move incrementally towards solving our data team bottlenecks, and how a platform such as Materialize can help us to move towards solutions without a massive upheaval.

A Terraform Provider for Materialize

Materialize maintains an official Terraform Provider you can use to manage your clusters, replicas, connections and secrets as code.

Everything you need to know to be a Materialize power-user

You could just write SQL and get continually updated results on Materialize. But if you want more scale, performance, and power, here is a gentle introduction to the key internals that will help.

The Four ACID Questions

Four questions, and their answers, to explain ACID transactions and how they are handled within Materialize.

CDC in Production: An Operating Guide

We spent hours interviewing engineers running change data capture in production, and compiled a list of their planning tips, design patterns to use and footguns to avoid here.

Mar 23, 2023

CI/CD workflows for dbt+Materialize

Tech Demo

Join Marta Paes (Head of Developer Experience) for a demo on setting up CI/CD workflows to deploy dbt projects to Materialize.

Towards Real-Time dbt

Here's a framework for thinking about reducing the time between when raw data is available to transform with dbt, and when it is delivering value to your customers.

The Software Architecture of Materialize

Materialize aims to be usable by anyone who knows SQL, but for those interested in going deeper and understanding the architecture powering Materialize, this post is for you!

When to Use Indexes and Materialized Views

If you are familiar with materialized views and indexes from other databases, this article will help you apply that understanding to Materialize.

Building Differential Dataflow from scratch

Let's build (in Python) the Differential Dataflow framework at the heart of Materialize, and explain what it's doing along the way.

Clusters, explained with Data Warehouses

If you're familiar with data warehouses, this article will help you understand Materialize Clusters in relation to well-known components in Snowflake.

Delta Joins and Late Materialization

Understand how to optimize joins with indexes and late materialization.

Recursion in Materialize

Differential Dataflow is capable of incrementally updated iterative computation (recursion) but we haven't yet wired it up to SQL. Let's talk about what recursion could look like in Materialize, and why it's important.

What is a Streaming Database?

Get an overview of how streaming databases differ from traditional DBs. What are the tradeoffs? How are they used?

Our experience with Rust!

Materialize is written in Rust. Why did we make that decision and how has it turned out for the project?

Real-Time Customer Data Platform Views on Materialize

Let's demonstrate the unique features of Materialize by building the core functionality of a customer data platform.

How and why is Materialize compatible with PostgreSQL?

As an operational data warehouse, Materialize is fundamentally different on the inside, but it's compatible with PostgreSQL in a few important ways.

Oct 13, 2022

The next generation of Materialize, an Overview

Community Meetup

Join Andy Hattemer and Joaquin Colacci from our Developer Experience team for a technical overview on the major updates to Materialize.

Announcing the next generation of Materialize

Today, we’re excited to announce a product that we feel is transformational: a persistent, scalable, cloud-native Materialize.

Indexes: A Silent Frenemy

Even in traditional databases, indexes can at different times be the problem and the solution when it comes to scaling. In this article we discuss how indexes change in streaming-first data warehouses.

Jul 21, 2022

Materialize + dbt: From Streaming Analytics to Continuous Testing

Community Meetup

Join Anna Glander from the Developer Experience team for a live session and Q&A on using Materialize + dbt to do continuous testing.

Real-time data quality tests using dbt and Materialize

In traditional databases, a SQL query used as a test runs as a point-in-time check. In streaming, the same query can run continually as data changes, creating a SQL-based data monitoring primitive.

Managing streaming analytics pipelines with dbt

Let's explore a hands-on example where we use dbt (data build tool) to manage and document a streaming analytics workflow from a message broker to Metabase.

Virtual Time: The Secret to Strong Consistency and Scalable Performance in Materialize

The key to Materialize's ability to separate compute from storage and scale horizontally without sacrificing consistency is a concept called virtual time.

Let’s talk about Data Apps

What is a Data Application? How do they help our customers? What new challenges do we face when building Data Apps? Here's our perspective.

Announcing the Materialize Integration with Cube

Connect the headless BI tool Cube.js to the read side of Materialize to get REST/GraphQL APIs, authentication, metrics modeling, and more out of the box.

Materialize's unbundled cloud architecture

The `materialized` binary is stable and performant, and the time has come to break it apart into separate services to enable the next phase: unbounded scale in a cloud architecture.

Creating a Real-Time Feature Store with Materialize

Let's use Materialize to deliver a feature store that continuously updates dimensions as new data becomes available without compromising on correctness or speed.

Apr 22, 2022

Technical Overview of Materialize's New Unbundled Architecture

Join Materialize cofounder and chief scientist Frank McSherry on June 22nd at 2pm ET for a live session and Q&A on the upcoming unbundled architecture of Materialize.

Subscribe to changes in a view with Materialize

Developers have long wished for the ability to subscribe to changes in a SQL query or a view in a database. Materialize has a SUBSCRIBE primitive that makes it possible.

What's new in Materialize? Vol. 2

Changelog: AWS roles for S3/Kinesis, PostgreSQL source improvements, Schema Registry SSL, SELECT statements in Tail queries, jsonb subscripting, DBeaver support, & Tailscale in cloud.

Community Meetup

Materialize + dbt + Redpanda Virtual Hack Day 2022

Come hang out with us and build a streaming project alongside folks from all over the world! We’ll help you get the inspiration flowing with a seed project and some ideas for real-time data sources.

Connecting Materialize directly to PostgreSQL via the Replication stream

The PostgreSQL write-ahead log (WAL) was originally created to enable hot standbys and multi-node setups, but it also works great as a source of data for Materialize. This article discusses how and why.

Taming the beast that is a SQL database

In this article, we will talk about one of the ways we approach the testing of the SQL engine of the product at Materialize. We hope to cover other modules and interesting angles in the future.

Introducing: Tailscale + Materialize

Materialize Cloud works with Tailscale, a VPN solution based on the state-of-the-art WireGuard protocol, to help customers connect their Materialize clusters with services on their private networks.

What's new in Materialize? Volume 1

Changelog: Kafka source metadata, protobuf+schema registry for Redpanda, Time bucketing with date_bin, Metabase integration, cloud metrics and monitoring, and new availability region.

Taking streaming analytics further faster with Redpanda + Materialize

Redpanda is a source-available, Kafka-compatible streaming data framework that works both as an upstream data source for Materialize and downstream data sink. Read on to learn how to start building with Redpanda and Materialize.

Oct 19, 2021

Redpanda + Materialize

Join us to talk about how Redpanda and Materialize fit together, and how they can be used to build powerful streaming architectures previously only accessible to large enterprise teams.

Materialize raises a $60M Series C, bringing total funding to over $100M

Materialize raises another round of funding to help build a cloud-native streaming data warehouse.

Change Data Capture is having a moment. Why?

Change Data Capture (CDC) is finally gaining widespread adoption as an architectural primitive. Why now?

Materialize Cloud Enters Open Beta

Release: 0.9

Materialize & Datalot: Real-time Application Development

Release: 0.8

Maintaining Joins using Few Resources

Streaming joins must maintain the pre-join datasets in memory, making them potentially costly operations. Materialize uses shared arrangements to allow multiple join statements to share the same pre-join index.

Generalizing linear operators in differential dataflow

Differential dataflow has simple linear operators (`map`, `filter`, `flat_map`) and more complex ones (`explode` and temporal filters). But, with some thinking, we can generalize them all to a restricted form of join.

Join Kafka with a Database using Debezium and Materialize

Debezium and Materialize can be used as powerful tools for joining high-volume streams of data from Kafka and tables from databases.

Real-time A/B test results with Segment, Kinesis, and Materialize

Build a real-time A/B testing stack with Segment, Kinesis and Materialize.

dbt + Materialize demo: Running dbt’s jaffle_shop with Materialize

Let's demonstrate how to manage streaming SQL in Materialize with dbt by porting the classic dbt jaffle-shop demo scenario to the world of streaming.

Release: 0.7

How Materialize and other databases optimize SQL subqueries

Subquery optimization is a high-complexity, high-impact task in databases. This post gives a rough map of existing approaches to optimizing subqueries and also describes how Materialize differs from them.

Introducing: dbt + Materialize

dbt is a tool for managing SQL data transformations. Materialize is an operational data warehouse. When used together, analytics works the way it always should have: define transforms in SQL, get results in real time.

Temporal Filters: Enabling Windowed Queries in Materialize

Temporal filters give you a powerful SQL primitive for defining time-windowed computations over temporal data.

A Simple and Efficient Real Time Application Powered by Materialize's TAIL Command

Let's build a Python application to demonstrate how developers can create real-time, event-driven experiences for their users, powered by Materialize.

Slicing up Temporal Aggregates in Materialize

Life in Differential Dataflow

Let's use Conway's Game of Life to illustrate how to write algorithms in differential dataflow.

Release: 0.6

Joins in Materialize

Joins in streaming systems are one of the harder things to do both correctly and efficiently. Let's talk about ways that Materialize maintains them, starting with basic binary joins and working our way up to delta joins.

Kafka is not a Database

In principle, it is possible to use Kafka as a database. But in doing so you will confront every hard problem that database management systems have faced for decades.

Live Maintained Views on Boston Transit to Run at Home

Materialize can be used to quickly build scalable backends for real-time apps! In this blog post, we describe two apps that you can try out at home that run on actual, live data.

Materialize Raises a Series B

Release: Materialize 0.5

Materialize under the Hood

Lateral Joins and Demand-Driven Queries

In today's post we are going to show off Materialize's `LATERAL` join, and how you can use it to implement some pretty neat query patterns in an incremental view maintenance engine!

Change Data Capture (part 1)

Here we set the context for and propose a change data capture protocol: a means of writing down and reading back changes to data.

Why Use a Materialized View?

Querying materialized views, unlike querying tables or logical views, can reduce query costs by maintaining results in memory that are only updated when necessary. Read on to learn more!

Why not RocksDB for streaming storage?

An explanation of our rationale for why Materialize chose not to use RocksDB as its underlying storage engine.

Robust Reductions in Materialize

Release: Materialize 0.4

Streaming TAIL to the Browser - A One Day Project

Eventual Consistency isn't for Streaming

Eventual consistency is common for key-value stores, where the trade-off is well understood and manageable. But in a streaming system, eventual consistency creates unboundedly large and systematic errors.

Rust for Data-Intensive Computation

Materialize: Roadmap to Building a Streaming Database on Timely Dataflow

How do you build a streaming database from scratch? Here is a roadmap: Start with a streaming framework, build a performant single binary, then break it up into a scalable distributed database platform.

CMU DB Talk: Building Materialize

Arjun Narayan introduces the CMU DB group to streaming databases, the problems they solve, and specific architectural decisions in Materialize.

Release: Materialize 0.3

Managing memory with differential dataflow

Frameworks that process unbounded streams of data need to be diligent about not also using unbounded amounts of memory. This post discusses some of the tricks used by Differential Dataflow to manage and limit memory use.

What consistency guarantees should you expect from your streaming data platform?

In-order reliable message delivery is not enough. Showing views over streams of data requires thinking through additional consistency semantics to deliver correct results.

Upserts in Differential Dataflow

What’s inside Materialize? An architecture overview

Let's review the internal architecture of Materialize, starting with some context on how it differs from other databases.

Incremental Computation in the Database

Incremental computation systems are used to make frequent compute-intensive tasks faster and more efficient in compilers, front-ends, and IDEs, but what does it look like at the data layer?

What is a Materialized View?

What is a materialized view, where can you use them, and how are they useful?

Taking Materialize for a spin on NYC taxi data

What is Streaming SQL?

Get a high-level overview of what Streaming SQL means, why it's useful, how it's being used in the real world, and how you can start using it yourself.

View Maintenance: A New Approach to Data Processing

Materialize Beta: The Details

Introducing Materialize: the Streaming Data Warehouse

Despite substantial progress, data still moves too slowly. The solution is a different paradigm: Streaming. Materialize is a streaming data warehouse built on principles of interoperability and consistency at millisecond latency.
