Materialize 0.7 was released on 08 February 2021 with significant improvements around getting data into Materialize.
Key change: Source data from Amazon Web Services S3
S3 sources for Materialize are fully tested but will remain behind the experimental flag until 0.8.
With S3 sources, you can:
- Point Materialize at S3 buckets using the same CREATE SOURCE syntax used for other data.
- Specify object name filters so that Materialize downloads and processes only the objects you need.
- Hook into AWS's built-in SQS notifications for bucket and object changes so Materialize can ingest new objects as soon as they appear. Views defined downstream of S3 sources with SQS notifications enabled will update incrementally as new objects are added to the bucket!
- Ingest data from S3 as raw text/bytes, CSV, or JSON.
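To make the object name filter idea concrete, here is a minimal sketch in Python of how a glob-style pattern narrows a bucket listing. It is only an illustration: `filter_keys` and the sample keys are hypothetical, and Python's `fnmatch` lets `*` match across `/`, which may not be how Materialize interprets patterns.

```python
from fnmatch import fnmatchcase


def filter_keys(keys, pattern):
    """Return only the object keys that match a glob-style pattern."""
    return [key for key in keys if fnmatchcase(key, pattern)]


# Hypothetical bucket listing, for illustration only.
keys = ["logs/2021/a.json", "logs/2021/b.csv", "tmp/x.json"]

# Keep JSON objects under the logs/ prefix; everything else is skipped
# and would never be downloaded.
selected = filter_keys(keys, "logs/*.json")
print(selected)
```

A narrower pattern means fewer objects to download, so it is worth filtering as tightly as your key layout allows.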
Once Materialize downloads an S3 object, it processes each line as an event, much like any other source. Only use S3 sources with buckets whose objects are append-only: Materialize will silently ignore objects that are deleted or updated in S3.
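The behavior described above can be modeled with a toy sketch in Python. This is not Materialize's implementation, just a simplified model under the assumption that each object key is ingested once; the class `AppendOnlyIngest` and its method names are hypothetical.

```python
class AppendOnlyIngest:
    """Toy model: every object key is ingested once, one event per line."""

    def __init__(self):
        self.seen = set()   # keys that have already been ingested
        self.events = []    # one entry per line of every ingested object

    def on_object(self, key, body):
        """Ingest a new object; return the number of events produced."""
        if key in self.seen:
            # An update to a known object is ignored, not re-read.
            return 0
        self.seen.add(key)
        lines = body.decode("utf-8").splitlines()
        self.events.extend(lines)
        return len(lines)

    def on_delete(self, key):
        """Deleting an object does not retract events already produced."""
        return 0


ingest = AppendOnlyIngest()
first = ingest.on_object("app.log", b"event-1\nevent-2\n")
# Re-uploading the same key with an extra line changes nothing.
second = ingest.on_object("app.log", b"event-1\nevent-2\nevent-3\n")
ingest.on_delete("app.log")
print(first, second, ingest.events)
```

In this model the third line never shows up downstream, which is why new data should arrive as new objects rather than as edits to existing ones.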
Examples of where an S3 Source can be useful:
- Ingest a full history of events. If you only keep recent data in Kafka but have everything in S3, you can ingest the S3 data once before starting the Kafka stream to get the full history.
- Work with application logs or database extracts that are stored in S3. If you're okay with the implicit latency in this approach, you can create views that materialize S3 data joined with Kafka as well as upstream databases.
A noteworthy breaking change: As part of the groundwork for adding user authentication, Materialize now requires a valid username on every connection.
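For clients that used to connect without naming a user, one way to avoid surprises is to always put the username in the connection string. The Python sketch below builds a PostgreSQL-style URI; the helper `materialize_dsn` is hypothetical, and the defaults (user `materialize`, port 6875, database `materialize`) are assumptions about a typical local setup, so adjust them to your deployment. What counts as a "valid" username is up to Materialize and is not checked here.

```python
from urllib.parse import quote


def materialize_dsn(user, host="localhost", port=6875, dbname="materialize"):
    """Build a PostgreSQL-style connection URI that always names a user."""
    if not user:
        # Fail early on the client side instead of at connection time.
        raise ValueError("a username is required")
    # Percent-encode the username so special characters survive in the URI.
    return f"postgresql://{quote(user, safe='')}@{host}:{port}/{dbname}"


dsn = materialize_dsn("materialize")
print(dsn)
```

The resulting string can be handed to any PostgreSQL-compatible client library that accepts connection URIs.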
For the full feed of updates, including upcoming changes, see the Materialize changelog in the docs.