Indexes

Overview

In Materialize, indexes represent query results stored in memory within a cluster. You can create indexes on sources, views, or materialized views.

Indexes on sources

NOTE: In practice, you may find that you rarely need to index a source without performing some transformation using a view, etc.

In Materialize, you can create indexes on a source to maintain in-memory up-to-date source data within the cluster you create the index. This can help improve query performance when serving results directly from the source or when using joins. However, in practice, you may find that you rarely need to index a source directly.

CREATE INDEX idx_on_my_source ON my_source (...);

Indexes on views

In Materialize, you can create indexes on a view to maintain up-to-date view results in memory within the cluster you create the index.

CREATE INDEX idx_on_my_view ON my_view_name(...) ;

During the index creation on a view, the view is executed and the view results are stored in memory within the cluster. As new data arrives, the index incrementally updates the view results in memory.

Within the cluster, querying an indexed view is:

  • fast because the results are served from memory, and

  • computationally free because no computation is performed on read.

For best practices on using indexes, and understanding when to use indexed views vs. materialized views, see Usage patterns.

Indexes on materialized views

In Materialize, materialized view results are stored in durable storage and incrementally updated as new data arrives. Indexing a materialized view makes the already up-to-date view results available in memory within the cluster you create the index. That is, indexes on materialized views require no additional computation to keep results up-to-date.

NOTE: A materialized view can be queried from any cluster whereas its indexed results are available only within the cluster you create the index. Querying a materialized view, whether indexed or not, from any cluster is computationally free. However, querying an indexed materialized view within the cluster where the index is created is faster since the results are served from memory rather than from storage.

For best practices on using indexes, and understanding when to use indexed views vs. materialized views, see Usage patterns.

CREATE INDEX idx_on_my_mat_view ON my_mat_view_name(...) ;

Indexes and clusters

Indexes are local to a cluster. Queries in a different cluster cannot use the indexes in another cluster.

For example, to create an index in the current cluster:

CREATE INDEX idx_on_my_view ON my_view_name(...) ;

You can also explicitly specify the cluster:

CREATE INDEX idx_on_my_view IN CLUSTER active_cluster ON my_view (...);

Usage patterns

Indexes on views vs. materialized views

In Materialize, both indexes on views and materialized views incrementally update the view results when Materialize ingests new data. Whereas materialized views persist the view results in durable storage and can be accessed across clusters, indexes on views compute and store view results in memory within a single cluster.

Maintaining a materialized view in durable storage has resource and latency costs that should be carefully considered depending on the main usage of the view, while maintaining an index has memory costs.

Some general guidelines for usage patterns include:

Usage Pattern General Guideline
View results are accessed from a single cluster only View with an index
View results are accessed across clusters Materialized view
Final consumer of the view is a sink or a SUBSCRIBE operation Materialized view
View maintenance and query serving are scaled independently Materialized view

Indexes and query optimizations

By making up-to-date results available in memory, indexes can help optimize query performance, such as:

  • Provide faster sequential access than unindexed data.

  • Provide fast random access for lookup queries (i.e., selecting individual keys).

Specific instances where indexes can be useful to improve performance include:

  • When used in ad-hoc queries.

  • When used by multiple queries within the same cluster.

  • When used to enable delta joins.

For more information, see Optimization.

Best practices

Before creating an index, consider the following:

  • If you create stacked views (i.e., views that depend on other views) to reduce SQL complexity, we recommend that you create an index only on the view that will serve results, taking into account the expected data access patterns.

  • Materialize can reuse indexes across queries that concurrently access the same data in memory, which reduces redundancy and resource utilization per query. In particular, this means that joins do not need to store data in memory multiple times.

  • For queries that have no supporting indexes, Materialize uses the same mechanics used by indexes to optimize computations. However, since this underlying work is discarded after each query run, take into account the expected data access patterns to determine if you need to index or not.

Back to top ↑