CREATE SOURCE: Text or bytes over Kinesis
CREATE SOURCE connects Materialize to an external data source and lets you interact
with its data as if the data were in a SQL table.
This document details how to connect Materialize to a text- or byte-formatted Kinesis stream.
Sources represent connections to resources outside Materialize that it can read data from. For more information, see API Components: Sources.
|MATERIALIZED||Materializes the source’s data, which retains all data in memory and makes sources directly selectable. For more information, see API Components: Materialized sources.|
|src_name||The name for the source, which is used as its table name within SQL.|
|col_name||Override default column name with the provided identifier. If used, a col_name must be provided for each column in the created source.|
|KINESIS ARN arn||The AWS ARN of the Kinesis Data Stream.|
|WITH ( option_list )||Options affecting source creation. For more detail, see
|FORMAT TEXT||Format the source’s data as UTF-8-encoded text.|
|FORMAT BYTES||Leave data received from the source as unformatted bytes stored in a column named
|ENVELOPE NONE||(Default) Use an append-only envelope. This means that records will only be appended and cannot be updated or deleted.|
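As a minimal sketch of how these fields combine, the statement below creates a materialized, text-formatted source with an overridden column name and an explicit append-only envelope. The source name kinesis_lines, the column name line, and the ARN are placeholders chosen for illustration, not values from this page. The sketch also assumes that the ARN is passed as a quoted string and that Materialize can find AWS credentials without WITH options.

```sql
-- Hypothetical names: kinesis_lines, line, and the ARN are placeholders.
CREATE MATERIALIZED SOURCE kinesis_lines (line)
FROM KINESIS ARN 'arn:aws:kinesis:us-east-1:123456789012:stream/example-stream'
FORMAT TEXT
ENVELOPE NONE;

-- Because the source is materialized, it can be selected from directly.
SELECT line FROM kinesis_lines;
```

Since the source has a single column, the column list contains exactly one name.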
The following options are valid within the
||A valid access key ID for the AWS resource.|
||A valid secret access key for the AWS resource.|
||The session token associated with the credentials, if the credentials are temporary.|
If you do not provide credentials via WITH options, materialized will examine the standard AWS authorization chain:
- Environment variables:
- credential_process command in the AWS config file, usually located at
- AWS credentials file. Usually located at
- IAM instance profile. Will only work if running on an EC2 instance with an instance profile/role.
Credentials fetched from a container or instance profile expire on a fixed schedule. Materialize will attempt to refresh the credentials automatically before they expire, but the source will become inoperable if the refresh operation fails. For details about the IAM account whose credentials you provide, see Kinesis source details.
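For illustration, a source that relies on the authorization chain simply omits the WITH clause. This is a sketch under assumptions: the source name kinesis_chain_creds and the ARN are hypothetical placeholders, and the credentials are assumed to be discoverable through one of the mechanisms listed above before materialized starts.

```sql
-- No WITH options: credentials are expected to come from the AWS
-- authorization chain rather than from the statement itself.
-- kinesis_chain_creds and the ARN are placeholders.
CREATE SOURCE kinesis_chain_creds
FROM KINESIS ARN 'arn:aws:kinesis:us-east-1:123456789012:stream/example-stream'
FORMAT BYTES;
```

Omitting the WITH clause keeps secrets out of the SQL text, at the cost of depending on the environment in which materialized runs.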
Kinesis source details
- A Kinesis source represents a single Kinesis stream.
- By default, Materialize will try to read credentials automatically via Rusoto’s ChainProvider. If credentials are explicitly provided, those will be used instead.
- The IAM account whose credentials you provide requires kinesis-read permissions and access to
- Kinesis sources will only have one column, which will be named
Text format details
Text-formatted sources read lines of text from the stream.
- Data from text-formatted sources is treated as newline-delimited.
- Data is assumed to be UTF-8 encoded, and discarded if it cannot be converted to UTF-8.
- Text-formatted sources have one column, which, by default, is named
Raw byte format details
Raw byte-formatted sources provide Materialize with the raw bytes received from the source, without applying any formatting or decoding.
Raw byte-formatted sources have one column, which, by default, is named
Append-only envelope means that all records received by the source are treated as inserts. This is Materialize’s default envelope (i.e., the one used if no envelope is specified), and it can be specified explicitly with ENVELOPE NONE.
Append-only source on a Kinesis stream
CREATE SOURCE kinesis_source
FROM KINESIS ARN ...
WITH ( access_key_id = ..., secret_access_key = ... )
FORMAT BYTES;
This creates a source that…
- Is append-only.
- Has one column, data, which represents the stream’s incoming bytes.
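A common follow-up, sketched here under assumptions, is to decode the raw bytes in a materialized view so they can be queried as text. The view name kinesis_decoded and the output column name line are hypothetical, and the sketch assumes the stream's payloads are UTF-8 text; payloads in another encoding would need a different conversion.

```sql
-- kinesis_decoded and line are placeholder names.
-- convert_from turns the bytes in the data column into text,
-- assuming the payload is UTF-8 encoded.
CREATE MATERIALIZED VIEW kinesis_decoded AS
SELECT convert_from(data, 'utf8') AS line
FROM kinesis_source;

SELECT line FROM kinesis_decoded;
```

Because kinesis_source was created without MATERIALIZED, the view is what retains the data and makes it selectable.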