Install on Azure

Self-managed Materialize requires: a Kubernetes (v1.29+) cluster; PostgreSQL as a metadata database; and blob storage.

The tutorial deploys Materialize to Azure Kubernetes Service (AKS) with a PostgreSQL database as the metadata database and Azure Blob Storage for blob storage. The tutorial uses Materialize on Azure Terraform modules to:

  • Set up the Azure Kubernetes environment
  • Call the terraform-helm-materialize module to deploy the Materialize Operator and Materialize instances to that AKS cluster

WARNING!

The Terraform modules used in this tutorial are intended for evaluation/demonstration purposes and for serving as a template when building your own production deployment. The modules should not be directly relied upon for production deployments: future releases of the modules will contain breaking changes. Instead, to use as a starting point for your own production deployment, either:

  • Fork the repo and pin to a specific version; or

  • Use the code as a reference when developing your own deployment.

For simplicity, this tutorial stores various secrets in a file and prints them to the terminal. In practice, follow your organization’s official security and Terraform/infrastructure practices.

Prerequisites

Azure subscription

If you do not have an Azure subscription to use for this tutorial, create one.

Azure CLI

If you don’t have Azure CLI installed, install Azure CLI.

Terraform

If you don’t have Terraform installed, install Terraform.

kubectl

If you do not have kubectl, install kubectl.

Python (v3.12+) and pip

If you don’t have Python (v3.12 or greater) installed, install it. See Python.org. If pip is not included with your version of Python, install it.

Helm 3.2.0+

If you don’t have Helm version 3.2.0+ installed, install it. For details, see the Helm documentation.

jq (Optional)

Optional. jq is used to parse the AKS cluster name from the Terraform outputs. Alternatively, you can specify the name manually. If you want to use jq and do not have it installed, install it.
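
Before starting, it can help to confirm that the tools listed above are on your PATH and meet the stated minimum versions. The following is a minimal POSIX shell sketch and is not part of the tutorial's repository. The helper names missing_tools and version_ge are hypothetical, and version_ge compares only the numeric part of each dot-separated field, so strip any leading "v" from a version string first.

```shell
#!/bin/sh
# Hypothetical helper: report each named tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      status=1
    fi
  done
  return "$status"
}

# Hypothetical helper: succeed when dotted version $1 >= $2.
# Missing fields count as 0, so "3.12" is treated as "3.12.0".
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    max = (n > m) ? n : m
    for (i = 1; i <= max; i++) {
      xi = (i <= n) ? x[i] + 0 : 0
      yi = (i <= m) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# Tools named in the prerequisites (jq is optional, so it is not checked here).
missing_tools az terraform kubectl python3 helm ||
  echo "Install the missing tools before continuing."
```

To check a minimum version, pass the installed version first and the required version second; for example, version_ge 3.12.1 3.12 succeeds, while version_ge 3.1.9 3.2.0 fails.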

A. Authenticate with Azure

  1. Open a Terminal window.

  2. Authenticate with Azure.

    az login
    

    The command opens a browser window to sign in to Azure. Sign in.

  3. Select the subscription and tenant to use. After you sign in, the terminal displays your tenant and subscription information.

    Retrieving tenants and subscriptions for the selection...
    
    [Tenant and subscription selection]
    
    No     Subscription name    Subscription ID                       Tenant
    -----  -------------------  ------------------------------------  ----------------
    [1]*   ...                  ...                                   ...
    
    The default is marked with an *; the default tenant is '<Tenant>' and
    subscription is '<Subscription Name>' (<Subscription ID>).
    

    Select the subscription and tenant.

  4. Set ARM_SUBSCRIPTION_ID to the subscription ID.

    export ARM_SUBSCRIPTION_ID=<subscription-id>
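
As an alternative to copying the ID by hand, the Azure CLI can print the ID of the currently selected subscription. The sketch below assumes you have already run az login; looks_like_subscription_id is a hypothetical helper that only checks the GUID shape of the value and does not confirm that the subscription exists.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 has the GUID shape of a subscription ID.
looks_like_subscription_id() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Query Azure only when the CLI is installed; the query fails if not signed in.
if command -v az >/dev/null 2>&1; then
  sub_id=$(az account show --query id --output tsv 2>/dev/null || true)
  if looks_like_subscription_id "$sub_id"; then
    export ARM_SUBSCRIPTION_ID="$sub_id"
    echo "ARM_SUBSCRIPTION_ID is set."
  else
    echo "Could not read a subscription ID; run 'az login' first." >&2
  fi
fi
```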
    

B. Set up Azure Kubernetes environment and install Materialize

Materialize provides the Materialize on Azure Terraform modules for evaluation purposes only. The modules deploy a sample infrastructure on Azure with the following components:

  • AKS cluster for Materialize workloads
  • Azure Database for PostgreSQL Flexible Server for metadata storage
  • Azure Blob Storage for persistence
  • Required networking and security configurations
  • Managed identities with proper RBAC permissions
  • Materialize Operator
  • Materialize instances (during subsequent runs after the Operator is running)

💡 Tip:

The tutorial uses the main.tf found in the examples/simple/ directory, which requires minimal user input. For details on the examples/simple/ infrastructure configuration (such as the node instance type, etc.), see the examples/simple/main.tf.

For more configuration options, you can run the main.tf file at the root of the repository instead. When running with the root main.tf:

  • Starting in v0.2.0, you must define the required providers. See Providers Configuration for details.

  • Starting in v0.2.0, you must specify the network_config. In previous versions, a default value was provided.

  1. Open a Terminal window.

  2. Fork Materialize’s sample Terraform repo.

  3. Set MY_ORGANIZATION to your GitHub organization name, substituting your organization’s name for <enter-your-organization>:

    MY_ORGANIZATION=<enter-your-organization>
    
  4. Clone your forked repo and check out the v0.2.0 tag. For example,

    • If cloning via SSH (the command uses the MY_ORGANIZATION variable set in the previous step):

      git clone --depth 1 -b v0.2.0 git@github.com:${MY_ORGANIZATION}/terraform-azurerm-materialize.git
      
    • If cloning via HTTPS (the command uses the MY_ORGANIZATION variable set in the previous step):

      git clone --depth 1 -b v0.2.0 https://github.com/${MY_ORGANIZATION}/terraform-azurerm-materialize.git
      
  5. Go to the examples/simple folder in the Materialize Terraform repo directory.

    cd terraform-azurerm-materialize/examples/simple
    
  6. Optional. Create a virtual environment, specifying a path for the new virtual environment:

    python3 -m venv <path to the new virtual environment>
    

    Activate the virtual environment:

    source <path to the new virtual environment>/bin/activate
    
  7. Install the required packages.

    pip install -r requirements.txt
    
  8. Create a terraform.tfvars file (you can copy from the terraform.tfvars.example file) and specify:

    • The prefix for the resources. Prefix has a maximum of 12 characters and contains only alphanumeric characters and hyphens; e.g., mydemo.

    • The location for the AKS cluster.

    prefix="enter-prefix"  //  maximum 12 characters, containing only alphanumeric characters and hyphens; e.g. mydemo
    location="eastus2"
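
The prefix constraint described in this step (at most 12 characters, alphanumeric characters and hyphens only) can be checked before running Terraform. This is a minimal sketch; valid_prefix is a hypothetical helper, and it additionally assumes that an empty prefix is not acceptable, which the tutorial does not state.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 is 1 to 12 characters long and contains
# only letters, digits, and hyphens.
valid_prefix() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9-]{1,12}$'
}

# Example: check a candidate prefix before writing it to terraform.tfvars.
candidate="mydemo"
if valid_prefix "$candidate"; then
  echo "prefix '$candidate' is acceptable"
else
  echo "prefix '$candidate' is too long or has invalid characters" >&2
fi
```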
    
  9. Initialize the terraform directory.

    terraform init
    
  10. Use terraform plan to review the changes to be made.

    terraform plan
    
  11. If you are satisfied with the changes, apply.

    terraform apply
    

    To approve the changes and apply, enter yes.

    Upon successful completion, various fields and their values are output:

    Apply complete! Resources: 21 added, 0 changed, 0 destroyed.
    
    Outputs:
    
    aks_cluster = <sensitive>
    connection_strings = <sensitive>
    kube_config = <sensitive>
    resource_group_name = "mydemo-rg"
    
  12. Configure kubectl to connect to your cluster. In the following command, replace:

    • <cluster_name> with your cluster name, which has the form <your prefix>-aks; e.g., mydemo-aks.

    • <resource_group_name> with the resource group name, as specified in the output.

    az aks get-credentials --resource-group <resource_group_name> --name <cluster_name>
    

    Alternatively, you can use the following command to get the cluster name and resource group name from the Terraform output:

    az aks get-credentials --resource-group $(terraform output -raw resource_group_name) --name $(terraform output -json aks_cluster | jq -r '.name')
    

    To verify that kubectl is configured correctly, run the following command:

    kubectl cluster-info
    

    For help with kubectl commands, see kubectl Quick reference.
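
If jq is not installed, the cluster name can be built from the prefix, since it has the form <your prefix>-aks. The sketch below only prints the resulting command so that you can review it before running it. The helper names aks_cluster_name and print_get_credentials are hypothetical, and the resource group name is passed in as shown in the Terraform output.

```shell
#!/bin/sh
# Hypothetical helper: build the AKS cluster name from the prefix
# set in terraform.tfvars (form: <prefix>-aks).
aks_cluster_name() {
  printf '%s-aks\n' "$1"
}

# Hypothetical helper: print (do not run) the get-credentials command.
#   $1 = prefix, $2 = resource group name from the Terraform output
print_get_credentials() {
  printf 'az aks get-credentials --resource-group %s --name %s\n' \
    "$2" "$(aks_cluster_name "$1")"
}

print_get_credentials mydemo mydemo-rg
```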

  13. By default, the example Terraform installs the Materialize Operator. Verify the installation and check the status:

    kubectl get all -n materialize
    

    Wait for the components to be in the Running state:

    NAME                                                              READY       STATUS    RESTARTS   AGE
    pod/materialize-mydemo-materialize-operator-74d8f549d6-lkjjf   1/1         Running   0          36m
    
    NAME                                                         READY       UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/materialize-mydemo-materialize-operator   1/1         1            1           36m
    
    NAME                                                                        DESIRED   CURRENT   READY   AGE
    replicaset.apps/materialize-mydemo-materialize-operator-74d8f549d6       1         1         1       36m
    

    If you run into an error during deployment, see Troubleshooting.
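
Rather than rerunning kubectl get all until the Operator is up, kubectl wait can block until the deployments in the namespace report the Available condition. This is a sketch under assumptions: wait_for_operator is a hypothetical wrapper, and the 300s default timeout is an arbitrary choice, not a value from the tutorial.

```shell
#!/bin/sh
# Hypothetical wrapper: block until every deployment in the namespace is
# Available, or until the timeout expires (kubectl then exits non-zero).
#   $1 = namespace (default: materialize)
#   $2 = timeout   (default: 300s, an arbitrary value)
wait_for_operator() {
  ns=${1:-materialize}
  timeout=${2:-300s}
  kubectl wait --for=condition=Available deployment --all \
    --namespace "$ns" --timeout="$timeout"
}
```

Call wait_for_operator with no arguments after terraform apply completes; a non-zero exit status means the deployments did not become Available within the timeout.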

  14. Once the Materialize Operator is deployed and running, you can deploy Materialize instances. To do so, create a mz_instances.tfvars file with the Materialize instance configuration.

    For example, the following specifies the configuration for a demo instance.

    cat <<EOF > mz_instances.tfvars
    
    materialize_instances = [
        {
          name           = "demo"
          namespace      = "materialize-environment"
          database_name  = "demo_db"
          cpu_request    = "1"
          memory_request = "2Gi"
          memory_limit   = "2Gi"
        }
    ]
    EOF
    
  15. Run terraform plan with both .tfvars files and review the changes to be made.

    terraform plan -var-file=terraform.tfvars -var-file=mz_instances.tfvars
    

    The plan should show the changes to be made, with a summary similar to the following:

    Plan: 4 to add, 0 to change, 0 to destroy.
    
  16. If you are satisfied with the changes, apply.

    terraform apply -var-file=terraform.tfvars -var-file=mz_instances.tfvars
    

    To approve the changes and apply, enter yes.

    Upon successful completion, you should see output with a summary similar to the following:

    Apply complete! Resources: 4 added, 0 changed, 0 destroyed.
    
    Outputs:
    
    aks_cluster = <sensitive>
    connection_strings = <sensitive>
    kube_config = <sensitive>
    resource_group_name = "mydemo-rg"
    
  17. Verify the installation and check the status:

    kubectl get all -n materialize-environment
    

    Wait for the components to be ready and in the Running state.

    NAME                                             READY   STATUS      RESTARTS      AGE
    pod/create-db-demo-db-pw7mj                      0/1     Completed   0             39s
    pod/mzl88mc8f6if-balancerd-b66f4c485-rnvxj       1/1     Running     0             15s
    pod/mzl88mc8f6if-cluster-s2-replica-s1-gen-1-0   1/1     Running     0             18s
    pod/mzl88mc8f6if-cluster-u1-replica-u1-gen-1-0   1/1     Running     0             18s
    pod/mzl88mc8f6if-console-689565cfcc-4dkzf        1/1     Running     0             7s
    pod/mzl88mc8f6if-console-689565cfcc-g2bqv        1/1     Running     0             7s
    pod/mzl88mc8f6if-environmentd-1-0                1/1     Running     0             23s
    
    NAME                                               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                        AGE
    service/mzl88mc8f6if-balancerd                     ClusterIP   None            <none>        6876/TCP,6875/TCP                              15s
    service/mzl88mc8f6if-cluster-s2-replica-s1-gen-1   ClusterIP   None            <none>        2100/TCP,2103/TCP,2101/TCP,2102/TCP,6878/TCP   18s
    service/mzl88mc8f6if-cluster-u1-replica-u1-gen-1   ClusterIP   None            <none>        2100/TCP,2103/TCP,2101/TCP,2102/TCP,6878/TCP   18s
    service/mzl88mc8f6if-console                       ClusterIP   None            <none>        8080/TCP                                       7s
    service/mzl88mc8f6if-environmentd                  ClusterIP   None            <none>        6875/TCP,6876/TCP,6877/TCP,6878/TCP            15s
    service/mzl88mc8f6if-environmentd-1                ClusterIP   None            <none>        6875/TCP,6876/TCP,6877/TCP,6878/TCP            23s
    service/mzl88mc8f6if-persist-pubsub-1              ClusterIP   None            <none>        6879/TCP                                       23s
    
    NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/mzl88mc8f6if-balancerd   1/1     1            1           15s
    deployment.apps/mzl88mc8f6if-console     2/2     2            2           7s
    
    NAME                                               DESIRED   CURRENT   READY      AGE
    replicaset.apps/mzl88mc8f6if-balancerd-b66f4c485   1         1         1          16s
    replicaset.apps/mzl88mc8f6if-console-689565cfcc    2         2         2          8s
    
    NAME                                                        READY   AGE
    statefulset.apps/mzl88mc8f6if-cluster-s2-replica-s1-gen-1   1/1     19s
    statefulset.apps/mzl88mc8f6if-cluster-u1-replica-u1-gen-1   1/1     19s
    statefulset.apps/mzl88mc8f6if-environmentd-1                1/1     24s
    
    NAME                          STATUS     COMPLETIONS   DURATION   AGE
    job.batch/create-db-demo-db   Complete   1/1           10s        40s
    

    If you run into an error during deployment, see Troubleshooting.
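
The readiness check in this step can be scripted by reading the pod table. In the sample output, the create-db job pod ends as Completed while the other pods are Running with all containers ready. The sketch below treats exactly those two cases as settled. pods_settled is a hypothetical helper that reads the output of kubectl get pods --no-headers on standard input and assumes the default column order (NAME, READY, STATUS).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every pod line on stdin is either
# Completed, or Running with all of its containers ready (e.g. 1/1).
# Fails on empty input, since that means nothing has been deployed yet.
pods_settled() {
  awk '
    $3 == "Completed" { next }
    $3 == "Running" {
      split($2, r, "/")
      if (r[1] == r[2]) next
    }
    { bad = 1; print "not ready: " $1 " (" $3 ", " $2 ")" }
    END { exit (bad || NR == 0) }
  '
}

# Demonstration with two lines taken from the sample output above.
pods_settled <<'EOF' && echo "all pods settled"
create-db-demo-db-pw7mj                      0/1     Completed   0             39s
mzl88mc8f6if-environmentd-1-0                1/1     Running     0             23s
EOF
```

Against a live cluster, pipe the real pod list into the helper: kubectl get pods -n materialize-environment --no-headers | pods_settled.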

  18. Open the Materialize Console in your browser:

    1. Find your console service name.

      MZ_SVC_CONSOLE=$(kubectl -n materialize-environment get svc \
        -o custom-columns="NAME:.metadata.name" --no-headers | grep console)
      echo $MZ_SVC_CONSOLE
      
    2. Port forward the Materialize Console service to your local machine (see footnote 1):

      (
        while true; do
           kubectl port-forward svc/$MZ_SVC_CONSOLE 8080:8080 -n materialize-environment 2>&1 | tee /dev/stderr |
           grep -q "portforward.go" && echo "Restarting port forwarding due to an error." || break;
        done;
      ) &
      

      The command runs in the background.
      - To list the background jobs, use jobs.
      - To bring the job back to the foreground, use fg %<job-number>.
      - To kill the background job, use kill %<job-number>.

    3. Open a browser and navigate to http://localhost:8080.

    💡 Tip: If you experience long loading screens or unresponsiveness in the Materialize Console, we recommend increasing the size of the mz_catalog_server cluster. Refer to the Troubleshooting Console Unresponsiveness guide.
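
The restart logic of the port-forwarding command in step 18 can be sketched in isolation: rerun a command for as long as its output contains an error marker. This is a simplified illustration, not the tutorial's exact command. restart_on_match is a hypothetical helper, and unlike the original loop it does not echo the command's output to the terminal.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command while its combined stdout/stderr
# contains the given pattern; stop once a run ends without a match.
#   $1 = pattern to look for, remaining arguments = command to run
restart_on_match() {
  pattern=$1
  shift
  while "$@" 2>&1 | grep -q "$pattern"; do
    echo "Restarting: output matched '$pattern'." >&2
  done
}

# Usage with the Console service from step 18 (not executed here):
#   restart_on_match "portforward.go" \
#     kubectl port-forward "svc/$MZ_SVC_CONSOLE" 8080:8080 -n materialize-environment
```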

Next steps

  • From the Console, you can get started with the Quickstart.

  • To start ingesting your own data from an external system like Kafka, MySQL or PostgreSQL, check the documentation for sources.

Cleanup

To delete the entire sample infrastructure and deployment (including the Materialize Operator, Materialize instances, and data), run the following from the Terraform directory:

terraform destroy

When prompted to proceed, type yes to confirm the deletion.

💡 Tip: If the terraform destroy command is unable to delete the subnet because it is in use, you can rerun the terraform destroy command.
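
The tip above amounts to retrying the same command after a failure. A generic retry helper is one way to sketch that. retry is a hypothetical helper, the attempt count is your choice, and each terraform destroy attempt still prompts for confirmation.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, stopping at the first
# success. Returns non-zero if every attempt fails.
#   $1 = maximum number of attempts, remaining arguments = command to run
retry() {
  attempts=$1
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "Attempt $i of $attempts failed." >&2
    i=$((i + 1))
  done
  return 1
}

# Usage from the Terraform directory (not executed here):
#   retry 2 terraform destroy
```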

  1. The port forwarding command uses a while loop to work around known Kubernetes issue 78446, in which interrupted long-running requests through a standard port-forward cause the port forward to hang. The command automatically restarts the port forwarding if an error occurs, ensuring a more stable connection. It detects failures by monitoring for “portforward.go” error messages.
