A few months ago, we announced a new option to serve results from Materialize that allows you to bulk-export data to Amazon S3. After incorporating feedback from power users running bulk exports in production during Private Preview, we’re now making it available to all Materialize users, along with compatibility for Google Cloud Storage (GCS).
How does it work?
As a recap: we’ve extended the COPY TO command to work as a “one-shot sink” that can (efficiently) write files to an S3 (or S3-compatible) path. Here’s what the workflow looks like in Materialize:
1. Create an AWS connection to allow Materialize to securely authenticate with your S3 service:
CREATE CONNECTION aws_conn TO AWS (
    ASSUME ROLE ARN = 'arn:aws:iam::001234567890:role/MZS3Exporter'
);
2. Double-check that you didn’t miss any IAM configuration steps (🙋‍♀️):
VALIDATE CONNECTION aws_conn;
3. And then trigger the COPY TO command with either the object or the query whose results you want to export:
COPY my_view TO 's3://mz-to-snow/20241125/'
WITH (
    AWS CONNECTION = aws_conn,
    FORMAT = 'parquet'
);
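If you’d rather export the results of an ad-hoc query instead of an existing object, the same command accepts a query in place of the object name. Here’s a minimal sketch reusing the connection and bucket path from above; the WHERE clause and column are just illustrative placeholders:
-- Export only a filtered subset of my_view (hypothetical filter)
COPY (SELECT * FROM my_view WHERE region = 'EMEA')
TO 's3://mz-to-snow/20241125/'
WITH (
    AWS CONNECTION = aws_conn,
    FORMAT = 'parquet'
);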
What’s new?
Although Materialize doesn’t integrate natively with GCS, GCS is interoperable with Amazon S3 via the Cloud Storage XML API. 🧠 This allows you to export data to GCS using an AWS connection with an HMAC key and the same workflow documented above. Unfortunately, Azure Blob Storage does not provide an S3-compatibility layer (crazy, we know!), so supporting this service requires rolling out a native integration.
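As a rough sketch of what that looks like in practice: assuming you’ve generated an HMAC key for a service account with access to your bucket, and that the AWS connection accepts static credentials plus a custom endpoint pointing at storage.googleapis.com (check the CREATE CONNECTION reference for the exact options), the workflow stays the same. The key values, bucket name, and connection name below are placeholders:
-- Store the HMAC secret as a Materialize secret (placeholder value)
CREATE SECRET gcs_secret_access_key AS '<HMAC_SECRET>';

-- Point an "AWS" connection at GCS via the Cloud Storage XML API
CREATE CONNECTION gcs_conn TO AWS (
    ACCESS KEY ID = '<HMAC_ACCESS_ID>',
    SECRET ACCESS KEY = SECRET gcs_secret_access_key,
    ENDPOINT = 'https://storage.googleapis.com',
    REGION = 'us-east-1'  -- nominal; not meaningful for GCS
);

-- Same COPY TO workflow as before, targeting the GCS bucket
COPY my_view TO 's3://<gcs-bucket>/<path>/'
WITH (
    AWS CONNECTION = gcs_conn,
    FORMAT = 'parquet'
);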
In the future, we plan to add native support for Google Cloud Platform (GCP) and Azure connections, so the developer experience is smoother and you can integrate with each cloud provider’s own Identity and Access Management (IAM) system, instead of relying on credentials-based authentication.
How do you get started?
For an overview of how bulk exports work, as well as integration guides for downstream systems like Snowflake, check out the reference documentation.
Ready to give it a go? Sign up for a 14-day free trial of Materialize.