Mar 9, 2021

Release: 0.7

Materialize 0.7 was released on 08 February 2021 with significant improvements around getting data into Materialize.

Key change: Source data from Amazon Web Services S3

S3 sources for Materialize are fully tested but remain behind the experimental flag until 0.8.

With S3 sources, you can:

  • Point Materialize at S3 buckets using the same CREATE SOURCE syntax used for other data.
  • Specify object name filters so that Materialize downloads and processes only the objects you need.
  • Hook into AWS' built-in SQS API for notifying downstream services of bucket/object changes, so Materialize can ingest new objects as soon as they appear. Views defined downstream of S3 sources with SQS notifications enabled update incrementally as new objects are added to the bucket!
  • Ingest data from S3 as raw text/bytes, CSV, or JSON.

Once Materialize downloads an S3 object, it processes each line as an event, much like any other source. Use S3 sources with buckets whose objects are append-only: Materialize silently ignores objects that are deleted or updated in S3.
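
To make the behavior described above concrete, here is a minimal Python sketch of the idea, not Materialize's implementation. It assumes a bucket can be modeled as a dictionary of object keys to text bodies, uses Python's fnmatch as a stand-in for the object name filter (Materialize's own pattern syntax may differ), and models "ignore updated objects" by remembering which keys were already ingested. The names ingest, bucket, and seen are hypothetical.

```python
from fnmatch import fnmatch


def ingest(bucket, pattern, seen):
    """Return one event per line for matching objects not ingested before.

    bucket:  dict mapping object key -> object body (hypothetical model)
    pattern: glob-style object name filter (fnmatch is a stand-in)
    seen:    set of keys already ingested; mutated in place
    """
    events = []
    for key in sorted(bucket):
        # Skip objects filtered out by name, and objects seen earlier
        # (later updates to an ingested object are ignored).
        if key in seen or not fnmatch(key, pattern):
            continue
        seen.add(key)
        # Each line of the object becomes one event.
        events.extend(bucket[key].splitlines())
    return events


# Hypothetical bucket contents.
bucket = {
    "logs/2021-02-01.csv": "1,ok\n2,error\n",
    "logs/2021-02-02.csv": "3,ok\n",
    "logs/readme.txt": "not data\n",
}

seen = set()
first = ingest(bucket, "logs/*.csv", seen)

# An existing object is overwritten and a new one is added.
bucket["logs/2021-02-02.csv"] = "3,changed\n"
bucket["logs/2021-02-03.csv"] = "4,ok\n"
second = ingest(bucket, "logs/*.csv", seen)

print(first)   # ['1,ok', '2,error', '3,ok']
print(second)  # ['4,ok']
```

In this sketch the overwritten object contributes nothing on the second pass, which is why append-only buckets are the safe fit: a change to an object that has already been read never reaches downstream views.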

Examples of where an S3 Source can be useful:

  1. Ingest a full history of events. If you only keep recent data in Kafka but have everything in S3, you can ingest the S3 data once before starting the Kafka stream to get the full history.
  2. Application logs or database extracts that are stored in S3. If you're okay with the implicit latency in this approach, you can create views that materialize S3 data joined with Kafka as well as upstream databases.

Quality-of-life improvements

A noteworthy breaking change: as part of the groundwork for adding user authentication, Materialize now requires a valid username on every connection.

For the full feed of updates, including upcoming changes, see the Materialize changelog in docs.
