Install on GCP
Self-managed Materialize requires: a Kubernetes (v1.29+) cluster; PostgreSQL as a metadata database; and blob storage.
This tutorial deploys Materialize to a GCP Google Kubernetes Engine (GKE) cluster, with a Cloud SQL PostgreSQL database as the metadata database and a Cloud Storage bucket for blob storage. Specifically, the tutorial uses the Materialize on Google Cloud Terraform modules to:
- Set up the GCP environment.
- Call the terraform-helm-materialize module to deploy the Materialize Operator and Materialize instances to the GKE cluster.
Prerequisites
Google Cloud project
You need a GCP project for which you have a role (such as `roles/resourcemanager.projectIamAdmin` or `roles/owner`) that includes permissions to manage access to the project.
gcloud CLI
If you do not have the gcloud CLI installed, install it. For details, see the Install the gcloud CLI documentation.
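For example, on macOS with Homebrew (an assumed setup; use the installer for your platform otherwise):

```bash
# Install the Google Cloud SDK, which includes the gcloud CLI:
brew install --cask google-cloud-sdk

# Verify the installation:
gcloud version
```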
Google service account
The tutorial assumes the use of a service account. If you do not have a service account to use for this tutorial, create a service account. For details, see Create service accounts.
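As a minimal sketch (the account name `materialize-tutorial` is a placeholder; choose your own):

```bash
# Create a service account in the current project:
gcloud iam service-accounts create materialize-tutorial \
  --display-name="Materialize tutorial service account"
```

The resulting service account email has the form `<name>@<project-id>.iam.gserviceaccount.com`.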
Terraform
If you do not have Terraform installed, install Terraform.
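For example, on macOS with Homebrew, HashiCorp's documented tap works as follows (adjust for your platform):

```bash
brew tap hashicorp/tap
brew install hashicorp/tap/terraform

# Verify the installation:
terraform -version
```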
kubectl and plugins
Using `gcloud` to install `kubectl` will also install the needed plugins. Otherwise, you will need to manually install the `gke-gcloud-auth-plugin` for `kubectl`.
- If you do not have `kubectl`, install `kubectl`. To install, see Install kubectl and configure cluster access for details. You will configure `kubectl` to interact with your GKE cluster later in the tutorial.
- If you do not have the `gke-gcloud-auth-plugin` for `kubectl`, install the `gke-gcloud-auth-plugin`. For details, see Install the gke-gcloud-auth-plugin.
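For example, if your gcloud installation manages its own components (not all installation methods do), you can install both with:

```bash
gcloud components install kubectl
gcloud components install gke-gcloud-auth-plugin
```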
Helm 3.2.0+
If you do not have Helm version 3.2.0+ installed, install it. For details, see the Helm documentation.
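For example, to check your installed version and, if needed, install via Helm's documented install script:

```bash
# Check the installed Helm version, if any:
helm version --short

# Helm's official install script:
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
```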
jq (Optional)
Optional. `jq` is used to parse the GKE cluster name and region from the Terraform outputs. Alternatively, you can manually specify the name and region. If you want to use `jq` and do not have it installed, install it.
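For example, with a common package manager:

```bash
brew install jq          # macOS (Homebrew)
sudo apt-get install jq  # Debian/Ubuntu
```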
A. Configure GCP project and service account
- Open a Terminal window.
- Initialize the gcloud CLI (`gcloud init`) to specify the GCP project you want to use. For details, see the Initializing the gcloud CLI documentation.
  💡 Tip: You do not need to configure a default Compute Region and Zone, as you will specify the region later.
- Enable the following services for your GCP project, if not already enabled:
```bash
gcloud services enable container.googleapis.com            # For creating Kubernetes clusters
gcloud services enable sqladmin.googleapis.com             # For creating databases
gcloud services enable cloudresourcemanager.googleapis.com # For managing GCP resources
gcloud services enable servicenetworking.googleapis.com    # For private network connections
gcloud services enable iamcredentials.googleapis.com       # For security and authentication
```
- To the service account that will run the Terraform script, grant the following IAM roles:
  - `roles/editor`
  - `roles/iam.serviceAccountAdmin`
  - `roles/servicenetworking.networksAdmin`
  - `roles/storage.admin`
  - `roles/container.admin`
- Enter your GCP project ID.

```bash
read -s PROJECT_ID
```
- Find your service account email for your GCP project.

```bash
gcloud iam service-accounts list --project $PROJECT_ID
```
- Enter your service account email.

```bash
read -s SERVICE_ACCOUNT
```
- Grant the service account the necessary IAM roles.

```bash
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:$SERVICE_ACCOUNT" \
  --role="roles/editor"

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:$SERVICE_ACCOUNT" \
  --role="roles/iam.serviceAccountAdmin"

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:$SERVICE_ACCOUNT" \
  --role="roles/servicenetworking.networksAdmin"

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:$SERVICE_ACCOUNT" \
  --role="roles/storage.admin"

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:$SERVICE_ACCOUNT" \
  --role="roles/container.admin"
```
- For the service account, authenticate to allow Terraform to interact with your GCP project. For details, see Terraform: Google Cloud Provider Configuration reference.
  For example, if using User Application Default Credentials, you can run the following command:

```bash
gcloud auth application-default login
```

  💡 Tip: If using `GOOGLE_APPLICATION_CREDENTIALS`, use the absolute path to your key file.
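For example, a minimal sketch using a service account key file (the path below is a placeholder):

```bash
# Point Terraform's Google provider at your key file (absolute path):
export GOOGLE_APPLICATION_CREDENTIALS="/absolute/path/to/your-key.json"

# Optionally authenticate gcloud with the same key:
gcloud auth activate-service-account "$SERVICE_ACCOUNT" \
  --key-file="$GOOGLE_APPLICATION_CREDENTIALS"
```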
B. Set up GCP Kubernetes environment and install Materialize
To help you get started with Materialize for evaluation purposes, Materialize provides sample Terraform modules. The sample Terraform modules are for evaluation purposes only and are not intended for production use; Materialize neither supports nor recommends these modules for production use.
For simplicity, this tutorial stores various secrets in a file and prints them to the terminal. In practice, refer to your organization's official security and Terraform/infrastructure practices.
Materialize provides sample Terraform modules for evaluation purposes only. The modules deploy a sample infrastructure on GCP (region `us-central1`) with the following components:
- Google Kubernetes Engine (GKE) cluster
- Cloud SQL PostgreSQL database for metadata storage
- Cloud Storage bucket for blob storage
- A dedicated VPC
- Service accounts with proper IAM permissions
- Materialize Operator
- Materialize instances (during subsequent runs after the Operator is running)
The tutorial uses the module found in the `examples/simple/` directory, which requires minimal user input. For more configuration options, you can run the modules at the root of the repository instead.
For details on the `examples/simple/` infrastructure configuration (such as the node instance type, etc.), see the examples/simple/main.tf.
- Clone Materialize's sample Terraform repo and check out the `v0.1.7` tag. For example:
  - If cloning via SSH:

```bash
git clone --depth 1 -b v0.1.7 git@github.com:MaterializeInc/terraform-google-materialize.git
```

  - If cloning via HTTPS:

```bash
git clone --depth 1 -b v0.1.7 https://github.com/MaterializeInc/terraform-google-materialize.git
```

- Go to the `examples/simple` folder in the Materialize Terraform repo directory.

```bash
cd terraform-google-materialize/examples/simple
```
- Create a `terraform.tfvars` file (you can copy from the `terraform.tfvars.example` file) and specify:
  - Your GCP project ID.
  - A prefix (e.g., `mz-simple`) for your resources. The prefix has a maximum of 15 characters and may contain only lowercase alphanumeric characters and hyphens.
  - The region for the GKE cluster.

```hcl
project_id = "enter-your-gcp-project-id"
prefix     = "enter-your-prefix" // Maximum of 15 characters, lowercase alphanumeric and hyphens only (e.g., mz-simple)
region     = "us-central1"
```
- Initialize the Terraform directory.

```bash
terraform init
```
- Run `terraform plan` and review the changes to be made.

```bash
terraform plan
```
- If you are satisfied with the changes, apply.

```bash
terraform apply
```

  To approve the changes and apply, enter `yes`.
  Upon successful completion, various fields and their values are output:

```
Apply complete! Resources: 20 added, 0 changed, 0 destroyed.

Outputs:

connection_strings = <sensitive>
gke_cluster = <sensitive>
service_accounts = {
  "gke_sa" = "mz-simple-gke-sa@my-project.iam.gserviceaccount.com"
  "materialize_sa" = "mz-simple-materialize-sa@my-project.iam.gserviceaccount.com"
}
```
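Terraform redacts outputs marked `<sensitive>`. If you need to inspect one, for example the metadata database connection strings, you can print it explicitly:

```bash
terraform output -json connection_strings
```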
- Configure `kubectl` to connect to your GKE cluster, specifying:
  - `<cluster-name>`: your cluster name, which has the form `<your-prefix>-gke`; e.g., `mz-simple-gke`.
  - `<region>`: by default, the example Terraform module uses the `us-central1` region.
  - `<project>`: your GCP project ID.

```bash
gcloud container clusters get-credentials <cluster-name> \
  --region <region> \
  --project <project>
```
  Alternatively, you can use the following command to get the cluster name and region from the Terraform output and the project ID from the environment variable set earlier:

```bash
gcloud container clusters get-credentials $(terraform output -json gke_cluster | jq -r .name) \
  --region $(terraform output -json gke_cluster | jq -r .location) \
  --project $PROJECT_ID
```
  To verify that you have configured `kubectl` correctly, run the following command:

```bash
kubectl cluster-info
```

  For help with `kubectl` commands, see the kubectl Quick reference.
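As an additional sanity check (not part of the module's instructions), you can confirm the GKE nodes are healthy:

```bash
kubectl get nodes   # nodes should report STATUS "Ready"
```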
- By default, the example Terraform installs the Materialize Operator. Verify the installation and check the status:
```bash
kubectl get all -n materialize
```
  Wait for the components to be in the `Running` state:

```
NAME                                                              READY   STATUS    RESTARTS   AGE
pod/materialize-mz-simple-materialize-operator-74d8f549d6-lkjjf   1/1     Running   0          36m

NAME                                                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/materialize-mz-simple-materialize-operator   1/1     1            1           36m

NAME                                                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/materialize-mz-simple-materialize-operator-74d8f549d6   1         1         1       36m
```
  If you run into an error during deployment, refer to the Troubleshooting guide.
- Once the Materialize Operator is deployed and running, you can deploy the Materialize instances. To deploy Materialize instances, create a `mz_instances.tfvars` file with the Materialize instance configuration.
  For example, the following specifies the configuration for a `demo` instance:

```bash
cat <<EOF > mz_instances.tfvars
materialize_instances = [
  {
    name           = "demo"
    namespace      = "materialize-environment"
    database_name  = "demo_db"
    cpu_request    = "1"
    memory_request = "2Gi"
    memory_limit   = "2Gi"
  }
]
EOF
```
- Run `terraform plan` with both `.tfvars` files and review the changes to be made.

```bash
terraform plan -var-file=terraform.tfvars -var-file=mz_instances.tfvars
```

  The plan should show the changes to be made, with a summary similar to the following:

```
Plan: 4 to add, 0 to change, 0 to destroy.
```
- If you are satisfied with the changes, apply.

```bash
terraform apply -var-file=terraform.tfvars -var-file=mz_instances.tfvars
```

  To approve the changes and apply, enter `yes`.
  Upon successful completion, you should see output with a summary similar to the following:

```
Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

Outputs:

connection_strings = <sensitive>
gke_cluster = <sensitive>
service_accounts = {
  "gke_sa" = "mz-simple-gke-sa@my-project.iam.gserviceaccount.com"
  "materialize_sa" = "mz-simple-materialize-sa@my-project.iam.gserviceaccount.com"
}
```
- Verify the installation and check the status:

```bash
kubectl get all -n materialize-environment
```
  Wait for the components to be in the `Running` state:

```
NAME                                             READY   STATUS      RESTARTS   AGE
pod/create-db-demo-db-jcpnn                      0/1     Completed   0          2m11s
pod/mzpzk74xji8b-balancerd-669988bb94-5vbps      1/1     Running     0          98s
pod/mzpzk74xji8b-cluster-s2-replica-s1-gen-1-0   1/1     Running     0          96s
pod/mzpzk74xji8b-cluster-u1-replica-u1-gen-1-0   1/1     Running     0          96s
pod/mzpzk74xji8b-console-5dc9f87498-hqxdw        1/1     Running     0          91s
pod/mzpzk74xji8b-console-5dc9f87498-x95qj        1/1     Running     0          91s
pod/mzpzk74xji8b-environmentd-1-0                1/1     Running     0          113s

NAME                                               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                                        AGE
service/mzpzk74xji8b-balancerd                     ClusterIP   None         <none>        6876/TCP,6875/TCP                              98s
service/mzpzk74xji8b-cluster-s2-replica-s1-gen-1   ClusterIP   None         <none>        2100/TCP,2103/TCP,2101/TCP,2102/TCP,6878/TCP   97s
service/mzpzk74xji8b-cluster-u1-replica-u1-gen-1   ClusterIP   None         <none>        2100/TCP,2103/TCP,2101/TCP,2102/TCP,6878/TCP   96s
service/mzpzk74xji8b-console                       ClusterIP   None         <none>        8080/TCP                                       91s
service/mzpzk74xji8b-environmentd                  ClusterIP   None         <none>        6875/TCP,6876/TCP,6877/TCP,6878/TCP            99s
service/mzpzk74xji8b-environmentd-1                ClusterIP   None         <none>        6875/TCP,6876/TCP,6877/TCP,6878/TCP            113s
service/mzpzk74xji8b-persist-pubsub-1              ClusterIP   None         <none>        6879/TCP                                       113s

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/mzpzk74xji8b-balancerd   1/1     1            1           98s
deployment.apps/mzpzk74xji8b-console     2/2     2            2           91s

NAME                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/mzpzk74xji8b-balancerd-669988bb94   1         1         1       98s
replicaset.apps/mzpzk74xji8b-console-5dc9f87498     2         2         2       91s

NAME                                                        READY   AGE
statefulset.apps/mzpzk74xji8b-cluster-s2-replica-s1-gen-1   1/1     97s
statefulset.apps/mzpzk74xji8b-cluster-u1-replica-u1-gen-1   1/1     96s
statefulset.apps/mzpzk74xji8b-environmentd-1                1/1     113s

NAME                          STATUS     COMPLETIONS   DURATION   AGE
job.batch/create-db-demo-db   Complete   1/1           13s        2m11s
```
  If you run into an error during deployment, refer to the Troubleshooting guide.
- Open the Materialize Console in your browser:
  - Find your Console service name.

```bash
MZ_SVC_CONSOLE=$(kubectl -n materialize-environment get svc \
  -o custom-columns="NAME:.metadata.name" --no-headers | grep console)
echo $MZ_SVC_CONSOLE
```
  - Port-forward the Materialize Console service to your local machine:[^1]

```bash
(
  while true; do
    kubectl port-forward svc/$MZ_SVC_CONSOLE 8080:8080 -n materialize-environment 2>&1 |
      tee /dev/stderr | grep -q "portforward.go" &&
      echo "Restarting port forwarding due to an error." || break
  done
) &
```

    The command runs in the background.
    - To list the background jobs, use `jobs`.
    - To bring the job back to the foreground, use `fg %<job-number>`.
    - To kill the background job, use `kill %<job-number>`.
  - Open a browser and navigate to http://localhost:8080.
    💡 Tip: If you experience long loading screens or unresponsiveness in the Materialize Console, we recommend increasing the size of the `mz_catalog_server` cluster. Refer to the Troubleshooting Console Unresponsiveness guide.
Next steps
- From the Console, you can get started with the Quickstart.
- To start ingesting your own data from an external system like Kafka, MySQL, or PostgreSQL, check the documentation for sources.
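If you prefer a SQL shell over the Console, you can also connect with `psql` by port-forwarding the `balancerd` service, which serves SQL on port 6875. A sketch, assuming the service name from the earlier `kubectl get all` output (yours will have a different generated prefix; the `materialize` user and database names are also assumptions):

```bash
# Hypothetical service name; find yours with:
#   kubectl get svc -n materialize-environment | grep balancerd
kubectl port-forward svc/mzpzk74xji8b-balancerd 6875:6875 -n materialize-environment &

# Materialize speaks the PostgreSQL wire protocol:
psql postgres://materialize@localhost:6875/materialize
```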
Cleanup
To delete the whole sample infrastructure and deployment (including the Materialize Operator, the Materialize instances, and their data), run the following from the Terraform directory:
```bash
terraform destroy
```
When prompted to proceed, type `yes` to confirm the deletion.
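If the destroy prompts for variable values, pass the same `-var-file` flags used during apply (note that `terraform.tfvars` is auto-loaded by Terraform, so only the extra file strictly needs a flag):

```bash
terraform destroy -var-file=terraform.tfvars -var-file=mz_instances.tfvars
```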
See also
[^1]: The port forwarding command uses a while loop to handle a known Kubernetes issue 78446, where interrupted long-running requests through a standard port-forward cause the port forward to hang. The command automatically restarts the port forwarding if an error occurs, ensuring a more stable connection. It detects failures by monitoring for "portforward.go" error messages.