CREATE SOURCE: Load generator

CREATE SOURCE connects Materialize to an external system you want to read data from, and provides details about how to decode and interpret that data.

Load generator sources produce synthetic data for use in demos and performance tests.

Syntax

CREATE SOURCE IF NOT EXISTS src_name FROM LOAD GENERATOR AUCTION COUNTER TPCH ( load_generator_option , ) FOR ALL TABLES FOR TABLES ( table_name AS subsrc_name , ) WITH ( field = val , )

load_generator_option

TICK INTERVAL interval SCALE FACTOR scale_factor
Field Use
src_name The name for the source.
COUNTER Use the counter load generator.
AUCTION Use the auction load generator.
TPCH Use the tpch load generator.
IF NOT EXISTS Do nothing (except issuing a notice) if a source with the same name already exists.
TICK INTERVAL The interval at which the next datum should be emitted. Defaults to one second.
SCALE FACTOR The scale factor for the TPCH generator. Defaults to 0.01 (~ 10MB).
FOR ALL TABLES Creates subsources for all tables in the load generator.
FOR TABLES ( table_list ) Creates subsources for specific tables in the load generator.

WITH options

Field Value Description
SIZE text Required. The size for the source. Accepts values: 3xsmall, 2xsmall, xsmall, small, medium, large, xlarge.

Description

Materialize has several built-in load generators, which provide a quick way to get up and running with no external dependencies before plugging in your own data sources. If you would like to see an additional load generator, please submit a feature request.

Counter

The counter load generator produces the sequence 1, 2, 3, …. Each tick interval, the next number in the sequence is emitted.

Auction

The auction load generator simulates an auction house, where users are bidding on an ongoing series of auctions. The auction source will be automatically demuxed into multiple subsources when the CREATE SOURCE command is executed. This will create the following subsources:

  • organizations describes the organizations known to the auction house.

    Field Type Describes
    id bigint A unique identifier for the organization.
    name text The organization’s name.
  • users describes the users that belong to each organization.

    Field Type Describes
    id bigint A unique identifier for the user.
    org_id bigint The identifier of the organization to which the user belongs. References organizations.id.
    name text The user’s name.
  • accounts describes the account associated with each organization.

    Field Type Describes
    id bigint A unique identifier for the account.
    org_id bigint The identifier of the organization to which the account belongs. References organizations.id.
    balance bigint The balance of the account in dollars.
  • auctions describes all past and ongoing auctions.

    Field Type Describes
    id bigint A unique identifier for the auction.
    seller bigint The identifier of the user selling the item. References users.id.
    item text The name of the item being sold.
    end_time timestamp with time zone The time at which the auction closes.
  • bids describes the bids placed in each auction.

    Field Type Describes
    id bigint A unique identifier for the bid.
    buyer bigint The identifier vof the user placing the bid. References users.id.
    auction_id text The identifier of the auction in which the bid is placed. References auctions.id.
    amount bigint The bid amount in dollars.
    bid_time timestamp with time zone The time at which the bid was placed.

The organizations, users, and accounts are fixed at the time the source is created. Each tick interval, either a new auction is started, or a new bid is placed in the currently ongoing auction.

TPCH

The TPCH load generator implements the TPC-H benchmark specification. The TPCH source must be used with FOR ALL TABLES, which will create the standard TPCH relations. If TICK INTERVAL is specified, after the initial data load, an order and its lineitems will be changed at this interval. If not specified, the dataset will not change over time.

Examples

Creating a counter load generator

To create a load generator source that emits the next number in the sequence every second:

CREATE SOURCE counter
  FROM LOAD GENERATOR COUNTER
  WITH (SIZE = '3xsmall');

To examine the counter:

SELECT * FROM counter;
 counter
---------
       1
       2
       3

Creating an auction load generator

To create the load generator source and its associated subsources:

CREATE SOURCE auction_house
  FROM LOAD GENERATOR AUCTION
  FOR ALL TABLES
  WITH (SIZE = '3xsmall');

To display the created subsources:

SHOW SOURCES;
     name      |      type      |  size
---------------+----------------+---------
 accounts      | subsource      |
 auction_house | load-generator | 3xsmall
 auctions      | subsource      |
 bids          | subsource      |
 organizations | subsource      |
 users         | subsource      |

To examine the simulated bids:

SELECT * from bids;
 id | buyer | auction_id | amount |          bid_time
----+-------+------------+--------+----------------------------
 10 |  3844 |          1 |     59 | 2022-09-16 23:24:07.332+00
 11 |  1861 |          1 |     40 | 2022-09-16 23:24:08.332+00
 12 |  3338 |          1 |     97 | 2022-09-16 23:24:09.332+00

Creating a TPCH load generator

To create the load generator source and its associated subsources:

CREATE SOURCE tpch
  FROM LOAD GENERATOR TPCH (SCALE FACTOR 1)
  FOR ALL TABLES
  WITH (SIZE = '3xsmall');

To display the created subsources:

SHOW SOURCES;
   name   |      type      |  size
----------+----------------+---------
 tpch     | load-generator | 3xsmall
 supplier | subsource      |
 region   | subsource      |
 partsupp | subsource      |
 part     | subsource      |
 orders   | subsource      |
 nation   | subsource      |
 lineitem | subsource      |
 customer | subsource      |

To run the Pricing Summary Report Query (Q1), which reports the amount of billed, shipped, and returned items:

SELECT
    l_returnflag,
    l_linestatus,
    sum(l_quantity) AS sum_qty,
    sum(l_extendedprice) AS sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
    avg(l_quantity) AS avg_qty,
    avg(l_extendedprice) AS avg_price,
    avg(l_discount) AS avg_disc,
    count(*) AS count_order
FROM
    lineitem
WHERE
    l_shipdate <= date '1998-12-01' - interval '90' day
GROUP BY
    l_returnflag,
    l_linestatus
ORDER BY
    l_returnflag,
    l_linestatus;
 l_returnflag | l_linestatus | sum_qty  | sum_base_price | sum_disc_price  |    sum_charge     |      avg_qty       |     avg_price      |      avg_disc       | count_order
--------------+--------------+----------+----------------+-----------------+-------------------+--------------------+--------------------+---------------------+-------------
 A            | F            | 37772997 |    56604341792 |  54338346989.17 |  57053313118.2657 | 25.490380624798817 | 38198.351517998075 | 0.04003729114831228 |     1481853
 N            | F            |   986796 |     1477585066 |   1418531782.89 |   1489171757.0798 | 25.463731840115603 |  38128.27564317601 | 0.04007431682708436 |       38753
 N            | O            | 74281600 |   111337230039 | 106883023012.04 | 112227399730.9018 |  25.49430183051871 | 38212.221432873834 | 0.03999775539657235 |     2913655
 R            | F            | 37770949 |    56610551077 |   54347734573.7 |  57066196254.4557 | 25.496431466814634 |  38213.68205054471 | 0.03997848687172654 |     1481421

Sizing a source

To provision a specific amount of CPU and memory to a source on creation, use the SIZE option:

CREATE SOURCE auction_load
  FROM LOAD GENERATOR AUCTION
  WITH (SIZE = '3xsmall');

To resize the source after creation:

ALTER SOURCE auction_load SET (SIZE = 'large');

The smallest source size (3xsmall) is a resonable default to get started. For more details on sizing sources, check the CREATE SOURCE documentation page.

Back to top ↑