[Diagram: Behavioral Analytics, Application Databases, and 3rd-Party Data feed an Incremental Engine, which keeps SQL Materialized Views (raw_users, feat_users) continually updated for ML Monitoring and Decision Engines.]


Operate ML in Production
with a Streaming Database

“We decided to look into Materialize to handle personalization and feature-serving in real-time, and by that evening we were up and running.”

Tom Cooper, Head of Data, Superscript

Put Data to Work in Machine Learning Use Cases

Real-time online feature serving

Operate on multiple data sources

Strict serializability

Why Materialize?

Modern Data Applications need Modern Solutions

Traditional Warehouses: Too Slow

Stream Processors: Too Complicated

Materialize packages the speed of stream processors in a familiar database abstraction.

Streaming Engine

Read: What is a Streaming Database?  →

PostgreSQL Serving Layer

Read: Materialize Postgres Compatibility Explained  →

Managed in standard SQL

Incrementally Maintained Views

Write complex SQL transformations as materialized views that efficiently update themselves as inputs change.

incremental.sql
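CREATE MATERIALIZED VIEW my_view AS
	-- Qualify the grouping key: both pageviews and api carry a userid column.
	-- DISTINCT keeps the two joins from multiplying each other's row counts.
	SELECT users.id AS userid,
		COUNT(DISTINCT api.id) AS api_calls,
		COUNT(DISTINCT pageviews.id) AS pageviews
	FROM users
	JOIN pageviews ON users.id = pageviews.userid
	JOIN api ON users.id = api.userid
	GROUP BY users.id;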
userID  api_calls  pageviews
VPLaKV  400        20
MN37Mt  60         9
1fT4KY  72         42
sT4QY   10         342

Sliding Windows

Write queries that filter to a window of time anchored to the present; Materialize updates the results as time advances.

sliding.sql
CREATE MATERIALIZED VIEW my_window AS
	SELECT date_trunc('minute', received_at) AS minute,
	COUNT(*) AS order_ct, SUM(amount) AS revenue
	FROM orders
	WHERE mz_now() < received_at + interval '5 minutes'
	GROUP BY 1;
minute order_ct revenue

SQL Alerting

Write alerts as SQL queries with filters and subscribe to new rows as they appear.

alerting.sql
SELECT userID, email, MAX(orders.id) as last_order
  FROM users
  JOIN orders ON orders.userID = users.id
  GROUP BY userId, email
  -- Use a filter to surface users with a high % of fraud
  HAVING SUM(is_fraud) / COUNT(orders.id)::FLOAT > 0.5;
userID email last_order
REOtIb 13/12/2022
Y5KBE8 9/12/2022
Wj7JQ0 13/12/2022
tPCQ0 13/11/2022
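
The query above only defines the alert condition. As a rough sketch of the "subscribe" half, assuming the query is first saved as a view (the name `fraud_alerts` is hypothetical and does not appear on this page, and `is_fraud` is assumed to be a 0/1 integer column), Materialize's `SUBSCRIBE` command can stream rows as they enter or leave the result:

```sql
-- Hypothetical view name; wraps the alert query shown above.
-- Assumes orders.is_fraud is a 0/1 integer column.
CREATE MATERIALIZED VIEW fraud_alerts AS
  SELECT orders.userID, email, MAX(orders.id) AS last_order
  FROM users
  JOIN orders ON orders.userID = users.id
  GROUP BY orders.userID, email
  HAVING SUM(is_fraud) / COUNT(orders.id)::FLOAT > 0.5;

-- Stream changes to the view. Each emitted row carries a timestamp and a
-- diff: +1 when a user starts matching the alert, -1 when they stop.
-- SNAPSHOT = false skips the rows that already match at subscription time.
SUBSCRIBE TO fraud_alerts WITH (SNAPSHOT = false);
```

Depending on the client, the `SUBSCRIBE` statement may need to be wrapped, for example in `COPY (...) TO STDOUT` from `psql` or in a cursor inside a transaction.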

Streaming Inputs

Built for JOINs

Active Replication

Event-Driven Primitives

Secure and Compliant


The Warehouse-Native Approach
to Machine Learning

“Building an ML pipeline requires stitching multiple systems together”

“We need to train our machine learning models faster”

“Machine learning operations requires hiring for a specialized set of skills”

“We can manage our feature store with our data warehouse”

“We need to train our models with multiple data inputs and can’t move to streaming”

“Moving to real-time training will result in errors from eventual consistency”

More Use Cases

Real-Time & User-Facing Analytics

Automation and Alerting

Segmentation and Personalization

Try Materialize Free