How to connect AWS MSK to Materialize
This guide goes through the required steps to connect Materialize to an AWS MSK cluster, including some of the more complicated bits around configuring security settings in MSK.
If you already have an MSK cluster, you can skip Step 1 and move directly to Step 2, Make the cluster public and enable SASL. You can also skip Steps 3 and 4 if you already have Kafka installed and running, and have created a topic that you want to read from in Materialize.
The process to connect Materialize to Amazon MSK consists of the following steps:
1. Create an Amazon MSK cluster
If you already have an Amazon MSK cluster set up, then you can skip this step.
a. Sign in to the AWS Management console and open the Amazon MSK console
b. Choose Create cluster
c. Enter a cluster name, and leave all other settings unchanged
d. From the table under All cluster settings, copy the values of the following settings and save them because you need them later in this tutorial: VPC, Subnets, Security groups associated with VPC
e. Choose Create cluster
Note: This creation can take about 15 minutes.
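Because creation takes a while, it can help to poll the cluster state from a script instead of refreshing the console. The following is a minimal sketch, not part of the original guide. It assumes `kafka_client` is a boto3 `kafka` client (for example `boto3.client("kafka")`) and that you know the cluster ARN; the function name and the timeout values are illustrative.

```python
import time


def wait_until_active(kafka_client, cluster_arn, poll_seconds=30, timeout_seconds=1800):
    """Poll describe_cluster until the cluster reports the ACTIVE state.

    kafka_client is assumed to be a boto3 "kafka" client; any object with a
    compatible describe_cluster method works as well.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        info = kafka_client.describe_cluster(ClusterArn=cluster_arn)["ClusterInfo"]
        state = info["State"]
        if state == "ACTIVE":
            return state
        if state in ("FAILED", "DELETING"):
            # These states will not turn into ACTIVE, so stop waiting.
            raise RuntimeError(f"cluster entered state {state}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cluster still {state} after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

Passing the client in as an argument keeps the helper free of AWS credentials handling and makes it easy to exercise with a stub.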
2. Make the cluster public and enable SASL
Turn on SASL
a. Navigate to the Amazon MSK console
b. Choose the MSK cluster you just created in Step 1
c. Click on the Properties tab
d. In the Security settings section, choose Edit
e. Check the checkbox next to SASL/SCRAM authentication
f. Click Save changes
You can find more details about updating an MSK cluster’s security configurations here.
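If you prefer to script the console steps above, the same change can be expressed as a request to the MSK `UpdateSecurity` API. This is a sketch under assumptions: the helper name is hypothetical, and it only builds the request dictionary so that nothing is sent to AWS until you pass it on yourself.

```python
def sasl_scram_update_request(cluster_arn, current_version):
    """Build keyword arguments for enabling SASL/SCRAM on an MSK cluster.

    current_version is the cluster's current version string, which the
    describe_cluster response exposes as ClusterInfo.CurrentVersion.
    """
    return {
        "ClusterArn": cluster_arn,
        "CurrentVersion": current_version,
        "ClientAuthentication": {"Sasl": {"Scram": {"Enabled": True}}},
    }
```

With boto3 installed, the result could be sent with `boto3.client("kafka").update_security(**request)`. The console flow described above remains the path this guide follows.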
Create a symmetric key
a. Now go to the Amazon KMS (Key Management Service) console
b. Click Create Key
c. Choose Symmetric and click Next
d. Give the key an alias and click Next
e. Under Administrative permissions, check the checkbox next to the AWSServiceRoleForKafka and click Next
f. Under Key usage permissions, again check the checkbox next to the AWSServiceRoleForKafka and click Next
g. Click on Create secret
h. Review the details and click Finish
You can find more details about creating a symmetric key here.
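For reference, the key and alias can also be created programmatically. The sketch below assumes `kms_client` is a boto3 `kms` client; the function name and default description are illustrative. It does not grant the administrative or usage permissions for AWSServiceRoleForKafka that the console steps set up, so those would still need to be configured separately.

```python
def create_symmetric_key(kms_client, alias_name, description="Key for Amazon MSK secrets"):
    """Create a symmetric KMS key and attach an alias to it.

    kms_client is assumed to be a boto3 "kms" client. KMS alias names must
    start with "alias/", so the prefix is added when it is missing.
    """
    if not alias_name.startswith("alias/"):
        alias_name = "alias/" + alias_name
    response = kms_client.create_key(
        Description=description,
        KeySpec="SYMMETRIC_DEFAULT",
        KeyUsage="ENCRYPT_DECRYPT",
    )
    key_id = response["KeyMetadata"]["KeyId"]
    kms_client.create_alias(AliasName=alias_name, TargetKeyId=key_id)
    return key_id
```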
Store a new Secret
a. Go to the AWS Secrets Manager console
b. Click Store a new secret
c. Choose Other type of secret (e.g. API key) for the secret type
d. Under Key/value pairs click on Plaintext
e. Paste the following in the space below it, replacing <your-username> and <your-password> with the username and password you want to set for the cluster:
    { "username": "<your-username>", "password": "<your-password>" }
f. On the next page, give a Secret name that starts with AmazonMSK_
g. Under Encryption Key, select the symmetric key you just created in the previous sub-section from the dropdown
h. Go forward to the next steps and finish creating the secret. Once created, record the ARN (Amazon Resource Name) value for your secret
You can find more details about creating a secret using AWS Secrets Manager here.
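Two details in this subsection are easy to get wrong by hand: the secret must be valid JSON with `username` and `password` keys, and its name must start with `AmazonMSK_`. The helpers below are a small sketch that checks both; the function names are hypothetical and not part of any AWS library.

```python
import json

REQUIRED_PREFIX = "AmazonMSK_"


def build_secret_string(username, password):
    """Return the plaintext secret as JSON, escaping special characters."""
    return json.dumps({"username": username, "password": password})


def check_secret_name(name):
    """Return the name unchanged if it carries the required prefix."""
    if not name.startswith(REQUIRED_PREFIX):
        raise ValueError(f"secret name must start with {REQUIRED_PREFIX!r}: {name!r}")
    return name
```

If you script this step with boto3, these values could be passed to the Secrets Manager `create_secret` call as `Name`, `SecretString`, and, for the symmetric key, `KmsKeyId`.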
Associate secret with MSK cluster
a. Navigate back to the Amazon MSK console and click on the cluster you created in Step 1
b. Click on the Properties tab
c. In the Security settings section, under SASL/SCRAM authentication, click on Associate secrets
d. Paste the ARN you recorded in the previous subsection and click Associate secrets
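The association can also be scripted. The following sketch assumes `kafka_client` is a boto3 `kafka` client and that `secret_arn` is the ARN you recorded earlier; the wrapper function itself is hypothetical. It treats any entry in `UnprocessedScramSecrets` as a failure.

```python
def associate_scram_secret(kafka_client, cluster_arn, secret_arn):
    """Associate one Secrets Manager secret with an MSK cluster.

    kafka_client is assumed to be a boto3 "kafka" client. The call reports
    secrets it could not associate in UnprocessedScramSecrets instead of
    raising, so that list is checked explicitly here.
    """
    response = kafka_client.batch_associate_scram_secret(
        ClusterArn=cluster_arn,
        SecretArnList=[secret_arn],
    )
    failures = response.get("UnprocessedScramSecrets", [])
    if failures:
        details = "; ".join(
            f"{item.get('SecretArn')}: {item.get('ErrorMessage')}" for item in failures
        )
        raise RuntimeError(f"secret association failed: {details}")
    return response
```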
Create the cluster’s configuration
a. Go to the Amazon CloudShell console
b. Create a file (e.g. msk-config.txt) with the following line:
    allow.everyone.if.no.acl.found = false
c. Run the following AWS CLI command, replacing <config-file-path> with the path to the file where you saved your configuration in the previous step:
    aws kafka create-configuration --name "MakePublic" \
        --description "Set allow.everyone.if.no.acl.found = false" \
        --kafka-versions "2.6.2" \
        --server-properties fileb://<config-file-path>/msk-config.txt
You can find more information about making your MSK cluster public here.
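As an alternative to writing msk-config.txt by hand, the file contents can be generated from a dictionary. This is a sketch with a hypothetical helper name; it produces the same `key = value` line shown above, encoded as bytes.

```python
def render_server_properties(properties):
    """Render broker properties as 'key = value' lines, encoded as UTF-8 bytes."""
    lines = [f"{key} = {value}" for key, value in properties.items()]
    return ("\n".join(lines) + "\n").encode("utf-8")


# The single property this guide sets in msk-config.txt.
config_bytes = render_server_properties({"allow.everyone.if.no.acl.found": "false"})
```

The bytes can be written to the config file used by the CLI command, or, if you use boto3, passed as `ServerProperties` to the `kafka` client's `create_configuration` call together with `Name`, `Description`, and `KafkaVersions`.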
3. Create a client machine
If you already have a client machine set up that can interact with your MSK cluster, then you can skip this step.
If not, you can create an EC2 client machine and then add the security group of the client to the inbound rules of the cluster’s security group from the VPC console. You can find more details about how to do that here.
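The inbound rule described above can be written down as the `IpPermissions` structure that the EC2 API expects. This is a sketch under assumptions: the helper name is hypothetical, and allowing all traffic from the client's security group is one possible choice; pass a `port` to restrict the rule to a single TCP port instead.

```python
def client_ingress_rule(client_security_group_id, port=None):
    """Build IpPermissions that admit traffic from the client's security group.

    With port=None the rule allows all protocols and ports (IpProtocol "-1").
    With a port, the rule is limited to that single TCP port.
    """
    rule = {"UserIdGroupPairs": [{"GroupId": client_security_group_id}]}
    if port is None:
        rule["IpProtocol"] = "-1"
    else:
        rule.update({"IpProtocol": "tcp", "FromPort": port, "ToPort": port})
    return [rule]
```

With boto3, the result could be applied through the `ec2` client's `authorize_security_group_ingress` call, using the cluster's security group ID as `GroupId` and this list as `IpPermissions`.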
4. Install Kafka and create a topic
To start using Materialize with Kafka, you need to create a Materialize source over a Kafka topic. If you already have Kafka installed and a topic created, you can skip this step.
Otherwise, you can install Kafka on your client machine from the previous step and create a topic. You can find more information about how to do that here.
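Because the cluster now requires SASL/SCRAM, Kafka command-line tools on the client machine need matching client settings before they can create a topic. The sketch below renders a client properties file using the same protocol and mechanism that the Materialize source uses later in this guide. The helper name and the file name `client.properties` are assumptions.

```python
def render_client_properties(username, password):
    """Render Kafka client settings for SASL_SSL with SCRAM-SHA-512."""
    jaas = (
        "org.apache.kafka.common.security.scram.ScramLoginModule required "
        f'username="{username}" password="{password}";'
    )
    lines = [
        "security.protocol=SASL_SSL",
        "sasl.mechanism=SCRAM-SHA-512",
        f"sasl.jaas.config={jaas}",
    ]
    return "\n".join(lines) + "\n"
```

Saved as `client.properties`, the output can be handed to `kafka-topics.sh` through its `--command-config` option when creating the topic.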
5. Create a source in Materialize
a. Open the Amazon MSK console and select your cluster
b. Click on View client information
c. Copy the URL listed under Private endpoint for SASL/SCRAM. This will be your <broker-url> going forward
d. Install and start a Materialize instance locally
e. From the psql terminal, run the following command. Replace <kafka-name> with whatever you want to name your source. The <broker-url> is what you copied in step c of this subsection. The <topic-name> is the name of the topic you created in Step 4. The <your-username> and <your-password> are the values you set in Store a new secret under Step 2.
    CREATE SOURCE <kafka-name>
    FROM KAFKA BROKER '<broker-url>' TOPIC '<topic-name>'
    WITH (
        security_protocol = 'SASL_SSL',
        sasl_mechanisms = 'SCRAM-SHA-512',
        sasl_username = '<your-username>',
        sasl_password = '<your-password>'
    )
    FORMAT text;
f. If the command executes without an error and outputs CREATE SOURCE, it means that you have successfully connected Materialize to your MSK cluster.
Note: The example above walked through creating a source, which is a way of connecting Materialize to an external data source. We used the WITH options to provide the username and password and connect to MSK using SASL. For the input format, we used text; however, Materialize supports other options as well. For example, you can ingest Kafka messages formatted in JSON, Avro, and Protobuf. You can find more details about the supported formats and possible configurations here.
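If you create sources for several topics, filling in the placeholders by hand becomes error-prone. The following sketch renders the CREATE SOURCE statement from this guide with the placeholders substituted. The function names are hypothetical, and the statement text simply mirrors the example above rather than covering other Materialize options.

```python
import re


def quote_literal(value):
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def render_create_source(source_name, broker_url, topic, username, password):
    """Render the CREATE SOURCE statement used in this guide."""
    # The source name is emitted unquoted, so restrict it to a plain identifier.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", source_name):
        raise ValueError(f"not a plain SQL identifier: {source_name!r}")
    return (
        f"CREATE SOURCE {source_name}\n"
        f"FROM KAFKA BROKER {quote_literal(broker_url)} TOPIC {quote_literal(topic)}\n"
        "WITH (\n"
        "    security_protocol = 'SASL_SSL',\n"
        "    sasl_mechanisms = 'SCRAM-SHA-512',\n"
        f"    sasl_username = {quote_literal(username)},\n"
        f"    sasl_password = {quote_literal(password)}\n"
        ")\n"
        "FORMAT text;"
    )
```

The rendered statement can be pasted into psql as in step e. Doubling single quotes keeps a password that contains a quote character from breaking the statement.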