Install on AWS
Self-managed Materialize requires: a Kubernetes (v1.29+) cluster; PostgreSQL as a metadata database; and blob storage.
The tutorial deploys Materialize to AWS Elastic Kubernetes Service (EKS) with a PostgreSQL RDS database as the metadata database and AWS S3 for blob storage. The tutorial uses Materialize on AWS Terraform modules to:
- Set up the AWS Kubernetes environment.
- Call the terraform-helm-materialize module to deploy the Materialize Operator and Materialize instances to that EKS cluster.
When operating in AWS, we recommend:

- Using the `r8g`, `r7g`, and `r6g` families when running without local disk.
- Using the `r7gd` and `r6gd` families of instances (and `r8gd` once available) when running with local disk. (Recommended for production. See Operational guidelines for more information.)
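As a toy illustration of the guidance above, the following shell sketch picks a Graviton memory-optimized family based on whether local disk is desired. The `use_local_disk` flag and `family` variable are hypothetical and for illustration only; they are not inputs to the Terraform modules.

```shell
# Hypothetical helper: choose an instance family per the guidance above.
# The "d" suffix (r7gd/r6gd) denotes instances with local NVMe disk.
use_local_disk=true

if [ "$use_local_disk" = true ]; then
  family="r7gd"   # r7gd/r6gd today, r8gd once available (recommended for production)
else
  family="r8g"    # r8g/r7g/r6g when running without local disk
fi

echo "selected instance family: $family"
```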
Prerequisites

Terraform

If you don't have Terraform installed, install Terraform.

AWS CLI

If you do not have the AWS CLI installed, install it. For details, see the AWS documentation.

kubectl

If you do not have `kubectl`, install it. See the Amazon EKS: install `kubectl` documentation for details.

Helm 3.2.0+

If you do not have Helm 3.2.0+, install it. For details, see the Helm documentation.
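Before proceeding, you can confirm that all four tools are on your `PATH` with a quick check. This is a minimal sketch; it only checks presence, so version requirements (e.g., Helm 3.2.0+) still need to be verified manually with commands such as `helm version`.

```shell
# Report whether each required CLI tool is installed.
missing=0
for cmd in terraform aws kubectl helm; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found"
  else
    echo "$cmd: NOT FOUND"
    missing=$((missing + 1))
  fi
done
echo "tools missing: $missing"
```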
Set up AWS Kubernetes environment and install Materialize
To help you get started with Materialize for evaluation purposes, Materialize provides sample Terraform modules. The sample Terraform modules are for evaluation purposes only and are not intended for production use. Materialize neither supports nor recommends these modules for production use.
For simplicity, this tutorial stores various secrets in a file and prints them to the terminal. In practice, refer to your organization's official security and Terraform/infrastructure practices.
Materialize provides Materialize on AWS Terraform modules for evaluation purposes only. The modules deploy a sample infrastructure on AWS (region `us-east-1`) with the following components:
- A Kubernetes (EKS) cluster
- A dedicated VPC
- An S3 bucket for blob storage
- An RDS PostgreSQL cluster and database for metadata storage
- Materialize Operator
- Materialize instances (during subsequent runs after the Operator is running)
The tutorial uses the module found in the `examples/simple/` directory, which requires minimal user input. For more configuration options, you can run the modules at the root of the repository instead.

For details on the `examples/simple/` infrastructure configuration (such as the node instance type, etc.), see the `examples/simple/main.tf`.
- Open a Terminal window.

- Configure the AWS CLI with your AWS credentials. For details, see the AWS documentation.
- Clone Materialize's sample Terraform repo and check out the `v0.2.7` tag. For example:

  - If cloning via SSH:

    ```shell
    git clone --depth 1 -b v0.2.7 git@github.com:MaterializeInc/terraform-aws-materialize.git
    ```

  - If cloning via HTTPS:

    ```shell
    git clone --depth 1 -b v0.2.7 https://github.com/MaterializeInc/terraform-aws-materialize.git
    ```
- Go to the `examples/simple` folder in the Materialize Terraform repo directory.

  ```shell
  cd terraform-aws-materialize/examples/simple
  ```

- Create a `terraform.tfvars` file (you can copy from the `terraform.tfvars.example` file) and specify the following variables:

  | Variable | Description |
  | --- | --- |
  | `namespace` | A namespace (e.g., `my-demo`) that will be used to form part of the prefix for your AWS resources. Requirements: maximum of 12 characters; must start with a lowercase letter; must be lowercase alphanumeric and hyphens only. |
  | `environment` | An environment name (e.g., `dev`, `test`) that will be used to form part of the prefix for your AWS resources. Requirements: maximum of 8 characters; must be lowercase alphanumeric only. |

  ```hcl
  # The namespace and environment variables are used to construct the names of the resources
  # e.g. ${namespace}-${environment}-storage, ${namespace}-${environment}-db etc.
  namespace   = "enter-namespace"   // maximum 12 characters, start with a lowercase letter, lowercase alphanumeric and hyphens only (e.g., my-demo)
  environment = "enter-environment" // maximum 8 characters, lowercase alphanumeric only (e.g., dev, test)
  ```
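Because Terraform will only reject invalid names once you run it, it can help to sanity-check your values up front. A minimal sketch encoding the naming rules above (the example values are hypothetical):

```shell
# Validate terraform.tfvars naming rules before running Terraform.
namespace="my-demo"   # max 12 chars, starts with a lowercase letter, lowercase alphanumeric/hyphens
environment="dev"     # max 8 chars, lowercase alphanumeric only

echo "$namespace"   | grep -Eq '^[a-z][a-z0-9-]{0,11}$' \
  && echo "namespace ok"   || echo "namespace invalid"
echo "$environment" | grep -Eq '^[a-z0-9]{1,8}$' \
  && echo "environment ok" || echo "environment invalid"
```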
- Initialize the terraform directory.

  ```shell
  terraform init
  ```
- Use `terraform plan` to review the changes to be made.

  ```shell
  terraform plan
  ```
- If you are satisfied with the changes, apply.

  ```shell
  terraform apply
  ```

  To approve the changes and apply, enter `yes`.

  Upon successful completion, various fields and their values are output:

  ```
  Apply complete! Resources: 77 added, 0 changed, 0 destroyed.

  Outputs:

  database_endpoint = "my-demo-dev-db.abcdefg8dsto.us-east-1.rds.amazonaws.com:5432"
  eks_cluster_endpoint = "https://0123456789A00BCD000E11BE12345A01.gr7.us-east-1.eks.amazonaws.com"
  eks_cluster_name = "my-demo-dev-eks"
  materialize_s3_role_arn = "arn:aws:iam::000111222333:role/my-demo-dev-mz-role"
  metadata_backend_url = <sensitive>
  oidc_provider_arn = "arn:aws:iam::000111222333:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/7D14BCA3A7AA896A836782D96A24F958"
  persist_backend_url = "s3://my-demo-dev-storage-f2def2a9/dev:serviceaccount:materialize-environment:12345678-1234-1234-1234-12345678912"
  s3_bucket_name = "my-demo-dev-storage-f2def2a9"
  vpc_id = "vpc-0abc000bed1d111bd"
  ```
- Note your specific values for the following fields:

  - `eks_cluster_name` (used to configure `kubectl`)
- Configure `kubectl` to connect to your EKS cluster, replacing:

  - `<your-eks-cluster-name>` with the name of your EKS cluster. Your cluster name has the form `{namespace}-{environment}-eks`; e.g., `my-demo-dev-eks`.
  - `<your-region>` with the region of your EKS cluster. The simple example uses `us-east-1`.

  ```shell
  aws eks update-kubeconfig --name <your-eks-cluster-name> --region <your-region>
  ```

  To verify that you have configured correctly, run the following command:

  ```shell
  kubectl get nodes
  ```

  For help with `kubectl` commands, see the kubectl Quick reference.
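Since the module names the cluster as `{namespace}-{environment}-eks`, you can also derive the cluster name from your `terraform.tfvars` values instead of copying it from the Terraform output. A minimal sketch (the values below are the hypothetical ones used throughout this tutorial):

```shell
# Derive the EKS cluster name from the module's naming convention.
namespace="my-demo"
environment="dev"
EKS_CLUSTER_NAME="${namespace}-${environment}-eks"
echo "$EKS_CLUSTER_NAME"   # prints: my-demo-dev-eks

# Then configure kubectl (requires AWS credentials; shown for reference):
# aws eks update-kubeconfig --name "$EKS_CLUSTER_NAME" --region us-east-1
```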
- By default, the example Terraform installs the Materialize Operator. Verify the installation and check the status:

  ```shell
  kubectl get all -n materialize
  ```

  Wait for the components to be in the `Running` state:

  ```
  NAME                                                    READY   STATUS    RESTARTS   AGE
  pod/my-demo-dev-materialize-operator-84ff4b4648-brjhl   1/1     Running   0          12s

  NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
  deployment.apps/my-demo-dev-materialize-operator   1/1     1            1           12s

  NAME                                                          DESIRED   CURRENT   READY   AGE
  replicaset.apps/my-demo-dev-materialize-operator-84ff4b4648   1         1         1       12s
  ```

  If you run into an error during deployment, refer to the Troubleshooting guide.
- Once the Materialize Operator is deployed and running, you can deploy the Materialize instances. To deploy Materialize instances, create a `mz_instances.tfvars` file with the Materialize instance configuration.

  For example, the following specifies the configuration for a `demo` instance:

  ```shell
  cat <<EOF > mz_instances.tfvars
  materialize_instances = [
    {
      name           = "demo"
      namespace      = "materialize-environment"
      database_name  = "demo_db"
      cpu_request    = "1"
      memory_request = "2Gi"
      memory_limit   = "2Gi"
    }
  ]
  EOF
  ```
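The `materialize_instances` variable is a list, so the same file can define more than one instance. A sketch assuming the same fields as above (the second instance's name, database, and sizes are illustrative, not prescribed):

```hcl
materialize_instances = [
  {
    name           = "demo"
    namespace      = "materialize-environment"
    database_name  = "demo_db"
    cpu_request    = "1"
    memory_request = "2Gi"
    memory_limit   = "2Gi"
  },
  {
    name           = "analytics"    # hypothetical second instance
    namespace      = "materialize-environment"
    database_name  = "analytics_db"
    cpu_request    = "2"
    memory_request = "4Gi"
    memory_limit   = "4Gi"
  }
]
```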
- Run `terraform plan` with both `.tfvars` files and review the changes to be made.

  ```shell
  terraform plan -var-file=terraform.tfvars -var-file=mz_instances.tfvars
  ```

  The plan should show the changes to be made, with a summary similar to the following:

  ```
  Plan: 4 to add, 0 to change, 0 to destroy.
  ```
- If you are satisfied with the changes, apply.

  ```shell
  terraform apply -var-file=terraform.tfvars -var-file=mz_instances.tfvars
  ```

  To approve the changes and apply, enter `yes`.

  Upon successful completion, you should see output with a summary similar to the following:

  ```
  Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

  Outputs:

  database_endpoint = "my-demo-dev-db.abcdefg8dsto.us-east-1.rds.amazonaws.com:5432"
  eks_cluster_endpoint = "https://0123456789A00BCD000E11BE12345A01.gr7.us-east-1.eks.amazonaws.com"
  eks_cluster_name = "my-demo-dev-eks"
  materialize_s3_role_arn = "arn:aws:iam::000111222333:role/my-demo-dev-mz-role"
  metadata_backend_url = <sensitive>
  oidc_provider_arn = "arn:aws:iam::000111222333:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/7D14BCA3A7AA896A836782D96A24F958"
  persist_backend_url = "s3://my-demo-dev-storage-f2def2a9/dev:serviceaccount:materialize-environment:12345678-1234-1234-1234-12345678912"
  s3_bucket_name = "my-demo-dev-storage-f2def2a9"
  vpc_id = "vpc-0abc000bed1d111bd"
  ```
- Verify the installation and check the status:

  ```shell
  kubectl get all -n materialize-environment
  ```

  Wait for the components to be in the `Running` state:

  ```
  NAME                                             READY   STATUS      RESTARTS   AGE
  pod/create-db-demo-db-6swk7                      0/1     Completed   0          33s
  pod/mzutd2fbabf5-balancerd-6c9755c498-28kcw      1/1     Running     0          11s
  pod/mzutd2fbabf5-cluster-s2-replica-s1-gen-1-0   1/1     Running     0          11s
  pod/mzutd2fbabf5-cluster-u1-replica-u1-gen-1-0   1/1     Running     0          11s
  pod/mzutd2fbabf5-console-57f94b4588-6lg2x        1/1     Running     0          4s
  pod/mzutd2fbabf5-console-57f94b4588-v65lk        1/1     Running     0          4s
  pod/mzutd2fbabf5-environmentd-1-0                1/1     Running     0          16s

  NAME                                               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                                        AGE
  service/mzutd2fbabf5-balancerd                     ClusterIP   None         <none>        6876/TCP,6875/TCP                              11s
  service/mzutd2fbabf5-cluster-s2-replica-s1-gen-1   ClusterIP   None         <none>        2100/TCP,2103/TCP,2101/TCP,2102/TCP,6878/TCP   12s
  service/mzutd2fbabf5-cluster-u1-replica-u1-gen-1   ClusterIP   None         <none>        2100/TCP,2103/TCP,2101/TCP,2102/TCP,6878/TCP   12s
  service/mzutd2fbabf5-console                       ClusterIP   None         <none>        8080/TCP                                       4s
  service/mzutd2fbabf5-environmentd                  ClusterIP   None         <none>        6875/TCP,6876/TCP,6877/TCP,6878/TCP            11s
  service/mzutd2fbabf5-environmentd-1                ClusterIP   None         <none>        6875/TCP,6876/TCP,6877/TCP,6878/TCP            16s
  service/mzutd2fbabf5-persist-pubsub-1              ClusterIP   None         <none>        6879/TCP                                       16s

  NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
  deployment.apps/mzutd2fbabf5-balancerd   1/1     1            1           11s
  deployment.apps/mzutd2fbabf5-console     2/2     2            2           4s

  NAME                                                DESIRED   CURRENT   READY   AGE
  replicaset.apps/mzutd2fbabf5-balancerd-6c9755c498   1         1         1       11s
  replicaset.apps/mzutd2fbabf5-console-57f94b4588     2         2         2       4s

  NAME                                                        READY   AGE
  statefulset.apps/mzutd2fbabf5-cluster-s2-replica-s1-gen-1   1/1     12s
  statefulset.apps/mzutd2fbabf5-cluster-u1-replica-u1-gen-1   1/1     11s
  statefulset.apps/mzutd2fbabf5-environmentd-1                1/1     16s

  NAME                          STATUS     COMPLETIONS   DURATION   AGE
  job.batch/create-db-demo-db   Complete   1/1           11s        33s
  ```

  If you run into an error during deployment, refer to the Troubleshooting guide.
- Open the Materialize Console in your browser:

  - Find your Console service name.

    ```shell
    MZ_SVC_CONSOLE=$(kubectl -n materialize-environment get svc \
      -o custom-columns="NAME:.metadata.name" --no-headers | grep console)
    echo $MZ_SVC_CONSOLE
    ```

  - Port forward the Materialize Console service to your local machine:[^1]

    ```shell
    (
      while true; do
        kubectl port-forward svc/$MZ_SVC_CONSOLE 8080:8080 -n materialize-environment 2>&1 |
          tee /dev/stderr | grep -q "portforward.go" &&
          echo "Restarting port forwarding due to an error." || break
      done
    ) &
    ```

    The command runs in the background.

    - To list the background jobs, use `jobs`.
    - To bring the job back to the foreground, use `fg %<job-number>`.
    - To kill the background job, use `kill %<job-number>`.

  - Open a browser and navigate to http://localhost:8080.

    💡 Tip: If you experience long loading screens or unresponsiveness in the Materialize Console, we recommend increasing the size of the `mz_catalog_server` cluster. Refer to the Troubleshooting Console Unresponsiveness guide.
Next steps

- From the Console, you can get started with the Quickstart.

- To start ingesting your own data from an external system like Kafka, MySQL, or PostgreSQL, check the documentation for sources.
Cleanup

To delete the whole sample infrastructure and deployment (including the Materialize Operator, Materialize instances, and data), run from the Terraform directory:

```shell
terraform destroy
```

When prompted to proceed, type `yes` to confirm the deletion.

If the `terraform destroy` command is unable to delete the S3 bucket and does not progress beyond "Still destroying…", empty the S3 bucket first and rerun the `terraform destroy` command.
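If you hit the stuck-destroy case, the bucket can be emptied with the AWS CLI. A dry-run sketch (the bucket name below is the hypothetical one from the sample output; substitute your own `s3_bucket_name`), with the destructive command left commented out:

```shell
# Empty the S3 bucket so that `terraform destroy` can delete it.
BUCKET="my-demo-dev-storage-f2def2a9"   # hypothetical; use your s3_bucket_name output
cmd="aws s3 rm s3://$BUCKET --recursive"
echo "would run: $cmd"
# Uncomment to actually empty the bucket (irreversible):
# $cmd
```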
See also

[^1]: The port forwarding command uses a while loop to handle a known Kubernetes issue 78446, where interrupted long-running requests through a standard port-forward cause the port forward to hang. The command automatically restarts the port forwarding if an error occurs, ensuring a more stable connection. It detects failures by monitoring for "portforward.go" error messages.