Materialize Documentation
Join the Community github/materialize

CREATE SOURCE: JSON from local file

CREATE SOURCE connects Materialize to an external data source and lets you interact with its data as if the data were in a SQL table.

This document details how to connect Materialize to a JSON-formatted local text file.

Conceptual framework

Sources represent connections to resources outside Materialize that it can read data from. For more information, see API Components: Sources.

Syntax

CREATE MATERIALIZED SOURCE IF NOT EXISTS src_name ( col_name , ) FROM FILE path COMPRESSION NONE GZIP with_options FORMAT BYTES ENVELOPE NONE

with_options

WITH ( field = val , )
Field Use
MATERIALIZED Materializes the source’s data, which retains all data in memory and makes sources directly selectable. For more information, see API Components: Materialized sources.
src_name The name for the source, which is used as its table name within SQL.
col_name Override default column name with the provided identifier. If used, a col_name must be provided for each column in the created source.
FILE path The absolute path to the file you want to use as the source.
COMPRESSION algorithm The compression algorithm used with this file.
WITH ( option_list ) Options affecting source creation. For more detail, see WITH options.
FORMAT BYTES Leave data received from the source as unformatted bytes, and store them in a column named data. However, you can easily transform the data to JSON. For more details, see Byte format details (JSON).
ENVELOPE NONE (Default) Use an append-only envelope. This means that records will only be appended and cannot be updated or deleted.

WITH options

The following options are valid within the WITH clause.

Field Value type Description
tail boolean Continually check the file for new content.

Details

File source details

path values must be the file’s absolute path, e.g.

CREATE SOURCE server_source FROM FILE '/Users/sean/server.log'...

Compression

Raw byte format details

Raw byte-formatted sources provide Materialize the raw bytes received from the source without applying any formatting or decoding.

Raw byte-formatted sources have one column, which, by default, is named data.

Extracting JSON data from bytes

Materialize cannot receive JSON data directly from a source. Instead, you must create a source that stores the data it receives as raw bytes (FORMAT BYTES), and then construct views that provides access to your JSON data by casting the source’s bytea column (named data) to text, and then to jsonb.

CREATE MATERIALIZED VIEW jsonified_bytes AS
SELECT CAST(data AS JSONB) AS data
FROM (
    SELECT CONVERT_FROM(data, 'utf8') AS data
    FROM bytea_source
)

Append-only envelope

Append-only envelope means that all records received by the source is treated as an insert. This is Materialize’s default envelope (i.e. if no envelope is specified), and can be specified with ENVELOPE NONE.

Examples

Creating a JSON source from a local file

  1. Generate a local JSON file. For example:

    echo -ne '{"a":1,"b":2}\n{"a":3,"b":4}' > source.json
    
  2. Create a materialized source from the file:

    CREATE SOURCE local_json_file
    FROM FILE '/Users/sean/materialize/materialize/source.json'
    FORMAT BYTES;
    
  3. Create a materialized view to convert the source’s bytes into jsonb:

    CREATE MATERIALIZED VIEW jsonified_bytes AS
    SELECT CAST(data AS JSONB) AS data
    FROM (
        SELECT CONVERT_FROM(data, 'utf8') AS data
        FROM local_json_file
    );
    

    You can see that this column contains jsonb data using the field accessor:

    SELECT data->'a' AS field_access FROM jsonified_bytes ORDER BY field_access;
    
     field_access
    --------------
    1.0
    3.0
    
Did this info help?
Yes No