This blog examines the performance of Materialize vs. Aurora PostgreSQL read replicas for computationally intensive workloads. We demonstrate that Materialize outperforms Aurora for complex queries over relatively small data volumes. 

Specifically, for the same on-demand cost, Materialize delivers 100x greater throughput with 1000x lower latency. And unlike other solutions that offload computation from OLTP databases, Materialize does so without sacrificing correctness or requiring external change data capture (CDC) tools to move data between systems.

Read on to learn what our benchmark tests uncovered about Materialize and Aurora PostgreSQL.   

Aurora PostgreSQL: Not Designed for Complex, Read-Intensive Queries 

OLTP databases like Aurora PostgreSQL are the backbone of modern applications. They excel at handling vast numbers of transactional operations, ensuring data integrity, and delivering fast, atomic transactions.

However, these strengths can become limitations. Intricate joins, aggregations, and data transformations that modern applications thrive on can bog down an OLTP system, leading to performance bottlenecks and a degraded user experience.

To mitigate this load, many organizations implement read replicas to distribute read operations across multiple database copies. While this offloading can help with performance, it does not fully address the challenges posed by complex queries. Read replicas are designed for straightforward read operations and often struggle with tasks involving multiple joins, large aggregations, and otherwise non-trivial data transformations.

Materialize offers a radically different approach by providing efficient incremental computation. Unlike traditional read replicas, Materialize shifts the computational burden from the read phase to the write phase by precomputing views and incrementally materializing results. 

This ensures data remains fresh and up-to-date, delivering low-latency results. Materialize transforms how we handle complex queries—making them faster while maintaining consistency—and ultimately solves the challenges that traditional read replicas and caching mechanisms cannot.
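To make the shift from read-time to write-time computation concrete, here is a minimal Python sketch of incremental view maintenance for a per-product running average. This is an illustration of the idea, not Materialize's actual implementation: each change is applied as a (+1/-1) diff in O(1), so reads return a precomputed result instead of rescanning the data.

```python
# Minimal sketch of incremental view maintenance (IVM). Illustrative only;
# NOT Materialize's implementation.
from collections import defaultdict

class AvgPriceView:
    """Maintains AVG(price) per product incrementally, O(1) per change."""
    def __init__(self):
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def apply(self, product_id, price, diff):
        # diff = +1 for an insert, -1 for a delete; an update is delete+insert.
        self.sums[product_id] += diff * price
        self.counts[product_id] += diff

    def read(self, product_id):
        # Reads are cheap: the result is already maintained.
        n = self.counts[product_id]
        return self.sums[product_id] / n if n else None

view = AvgPriceView()
view.apply(1, 10.0, +1)
view.apply(1, 20.0, +1)
print(view.read(1))       # → 15.0
view.apply(1, 10.0, -1)   # delete the first sale
print(view.read(1))       # → 20.0
```

A traditional read replica does the opposite: it recomputes the aggregate from scratch on every read, which is why complex views get slower as data and query complexity grow.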

Benchmarking Use Case: Dynamic Pricing for an Online Retailer

Now that we’ve explained the conceptual differences, let’s perform a quantitative comparison of Materialize and Aurora PostgreSQL. We’ll use dynamic pricing for an online retailer as a sample use case.  

In this use case, the price of any given item fluctuates based on available inventory, snap promotions, popularity, and other factors. The goal is a scalable solution that offers low latency and always displays the current price for an item. 

The logic for dynamic pricing is encapsulated in a standard SQL view definition, which references six tables [1]. The entire database contains only 1GB of data across all tables. You can see the code for the view below:

sql
CREATE VIEW dynamic_pricing AS
WITH recent_prices AS (
    SELECT grp.product_id, AVG(price) AS avg_price
    FROM (SELECT DISTINCT product_id FROM sales) grp, 
    LATERAL (
        SELECT product_id, price 
        FROM sales
        WHERE sales.product_id = grp.product_id 
        ORDER BY sale_date DESC LIMIT 10
    ) sub
    GROUP BY grp.product_id
),

promotion_effect AS (
    SELECT 
        p.product_id,
        MIN(pr.promotion_discount) AS promotion_discount
    FROM promotions pr
    JOIN products p ON pr.product_id = p.product_id
    WHERE pr.active = TRUE
    GROUP BY p.product_id
),

popularity_score AS (
    SELECT 
        s.product_id,
        RANK() OVER (PARTITION BY p.category_id ORDER BY COUNT(s.sale_id) DESC) AS popularity_rank,
        COUNT(s.sale_id) AS sale_count
    FROM sales s
    JOIN products p ON s.product_id = p.product_id
    GROUP BY s.product_id, p.category_id
),

inventory_status AS (
    SELECT 
        i.product_id,
        SUM(i.stock) AS total_stock,
        RANK() OVER (ORDER BY SUM(i.stock) DESC) AS stock_rank
    FROM inventory i
    GROUP BY i.product_id
),

high_demand_products AS (
    SELECT 
        p.product_id,
        AVG(s.sale_price) AS avg_sale_price,
        COUNT(s.sale_id) AS total_sales
    FROM products p
    JOIN sales s ON p.product_id = s.product_id
    GROUP BY p.product_id
    HAVING COUNT(s.sale_id) > (SELECT AVG(total_sales) FROM (SELECT COUNT(*) AS total_sales FROM sales GROUP BY product_id) subquery)
),

dynamic_pricing AS (
    SELECT 
        p.product_id,
        p.base_price,
        CASE 
            WHEN pop.popularity_rank <= 3 THEN 1.2
            WHEN pop.popularity_rank BETWEEN 4 AND 10 THEN 1.1
            ELSE 0.9
        END AS popularity_adjustment,
        rp.avg_price,
        COALESCE(1.0 - (pe.promotion_discount / 100), 1) AS promotion_discount,
        CASE 
            WHEN inv.stock_rank <= 3 THEN 1.1
            WHEN inv.stock_rank BETWEEN 4 AND 10 THEN 1.05
            ELSE 1
        END AS stock_adjustment,
        CASE 
            WHEN p.base_price > rp.avg_price THEN 1 + (p.base_price - rp.avg_price) / rp.avg_price
            ELSE 1 - (rp.avg_price - p.base_price) / rp.avg_price
        END AS demand_multiplier,
        hd.avg_sale_price,
        CASE 
            WHEN p.product_name ilike '%cheap%' THEN 0.8
            ELSE 1.0
        END AS additional_discount
    FROM products p 
    LEFT JOIN recent_prices rp ON p.product_id = rp.product_id
    LEFT JOIN promotion_effect pe ON p.product_id = pe.product_id
    JOIN popularity_score pop ON p.product_id = pop.product_id
    LEFT JOIN inventory_status inv ON p.product_id = inv.product_id
    LEFT JOIN high_demand_products hd ON p.product_id = hd.product_id
)

SELECT 
    dp.product_id,
    dp.base_price * dp.popularity_adjustment * dp.promotion_discount * dp.stock_adjustment * dp.demand_multiplier * dp.additional_discount AS adjusted_price
FROM dynamic_pricing dp;
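The final SELECT simply multiplies the adjustment factors together. As a sanity check, the same arithmetic in Python, with hypothetical inputs (note that both branches of the demand_multiplier CASE reduce to base_price / avg_price):

```python
def adjusted_price(base_price, popularity_rank, promotion_discount_pct,
                   stock_rank, avg_recent_price, name_has_cheap=False):
    """Mirrors the arithmetic of the dynamic_pricing view's final SELECT.
    Inputs are hypothetical; promotion_discount_pct=None stands in for a
    LEFT JOIN that found no active promotion."""
    popularity = 1.2 if popularity_rank <= 3 else 1.1 if popularity_rank <= 10 else 0.9
    promo = 1.0 - promotion_discount_pct / 100 if promotion_discount_pct is not None else 1.0
    stock = 1.1 if stock_rank <= 3 else 1.05 if stock_rank <= 10 else 1.0
    # Both CASE branches of demand_multiplier simplify to base/avg.
    demand = base_price / avg_recent_price
    extra = 0.8 if name_has_cheap else 1.0
    return base_price * popularity * promo * stock * demand * extra

# A top-3 product, 10% promotion, mid stock rank, recent average equal to base:
print(round(adjusted_price(100.0, 2, 10.0, 5, 100.0), 2))  # → 113.4
```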

The benchmark test is to query dynamic_pricing for the current price of a single product at different scales. All measurements are taken via dbbench on an otherwise idle c5a.4xlarge EC2 instance:

sql
SELECT * FROM dynamic_pricing WHERE product_id = $1
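The measurement loop itself is simple: one connection issues the point lookup as fast as possible and records per-query latency. The actual benchmark used dbbench; the sketch below mimics the shape of that loop with a stub in place of the real query, to show how mean and P99 latency are derived.

```python
# Hedged sketch of a single-connection measurement loop (the real benchmark
# used dbbench). run_query() is a stub standing in for the point lookup.
import random
import time

def run_query():
    # Stand-in for: SELECT * FROM dynamic_pricing WHERE product_id = $1
    time.sleep(random.uniform(0.001, 0.003))

def benchmark(duration_s=0.1):
    latencies = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        start = time.monotonic()
        run_query()
        latencies.append(time.monotonic() - start)
    latencies.sort()
    p99 = latencies[int(0.99 * (len(latencies) - 1))]
    return len(latencies), sum(latencies) / len(latencies), p99

count, mean, p99 = benchmark()
print(f"{count} queries, mean {mean * 1000:.1f}ms, p99 {p99 * 1000:.1f}ms")
```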

Configurations for Benchmark Testing

Aurora PostgreSQL Configuration

To set up Aurora PostgreSQL for benchmarking, we connected to a db.r6gd.xlarge read replica instance with read optimizations enabled. All queries in PostgreSQL were executed against this instance. 

The primary keys for each table were set, along with secondary indexes based on the view’s needs. Foreign keys were also established to provide the query optimizer with as much information as possible to enhance performance. You can view the code below: 

sql
ALTER TABLE products 
ADD CONSTRAINT products_pkey PRIMARY KEY (product_id);

ALTER TABLE categories 
ADD CONSTRAINT categories_pkey PRIMARY KEY (category_id);

ALTER TABLE suppliers 
ADD CONSTRAINT suppliers_pkey PRIMARY KEY (supplier_id);

ALTER TABLE sales 
ADD CONSTRAINT sales_pkey PRIMARY KEY (sale_id);

ALTER TABLE inventory 
ADD CONSTRAINT inventory_pkey PRIMARY KEY (inventory_id);

ALTER TABLE promotions 
ADD CONSTRAINT promotions_pkey PRIMARY KEY (promotion_id);

ALTER TABLE public.inventory
ADD CONSTRAINT inventory_product_id_fkey FOREIGN KEY (product_id) REFERENCES public.products (product_id); 

ALTER TABLE public.promotions
ADD CONSTRAINT promotions_product_id_fkey FOREIGN KEY (product_id) REFERENCES public.products (product_id);

ALTER TABLE public.sales
ADD CONSTRAINT sales_product_id_fkey FOREIGN KEY (product_id) REFERENCES public.products (product_id);

CREATE INDEX idx_products_product_name ON products(product_name);
CREATE INDEX idx_sales_product_id ON sales(product_id); 
CREATE INDEX idx_sales_sale_date ON sales(sale_date);
CREATE INDEX idx_sales_product_id_sale_date ON sales(product_id, sale_date);
CREATE INDEX idx_promotions_product_id ON promotions(product_id);
CREATE INDEX idx_promotions_active ON promotions(active);
CREATE INDEX idx_promotions_product_id_active ON promotions(product_id, active);
CREATE INDEX idx_inventory_product_id ON inventory(product_id);

Materialize Configuration

We configured a 50cc cluster on the Materialize side, which has approximately the same on-demand cost as the Aurora read replica. Materialize connects directly to the primary writer instance to consume changes using the built-in Postgres source [2].

The system maintains consistency by applying all changes within an upstream transaction atomically. Unlike other CDC tools, Materialize will never show partial writes. Within Materialize, the same view is defined and indexed directly. 

Indexing a view in Materialize initiates the pre-computation of results and ensures they are kept up-to-date as the upstream database is modified. Queries executed against this view leverage the index to short-circuit any work that is already completed, significantly improving query performance. The code is as follows: 

sql
CREATE INDEX ON dynamic_pricing (product_id);

Overview of Test Scenarios 

We designed three test scenarios to comprehensively evaluate the performance of Materialize versus read-optimized Aurora PostgreSQL read replicas. These scenarios simulate common real-world use cases and stress-test the systems under varying conditions.

Scenario 1: Single Database Connection with Continuous Writes

In this scenario, a single database connection runs one SELECT query at a time as fast as possible. Simultaneously, a continuous stream of writes (inserts, updates, and deletes) is performed on the database. This tests the ability of each system to maintain read performance under write-heavy conditions.

Scenario 2: Ten Database Connections with Continuous Writes

Here, ten database connections issue SELECT queries concurrently while a continuous stream of writes is performed on the database. This scenario tests each system’s overall robustness and concurrency handling under mixed read/write workloads.

Scenario 3: 100 Database Connections with Continuous Writes

In this final scenario, 100 database connections issue SELECT queries concurrently while the database undergoes a continuous stream of writes. This scenario assesses the robustness and scalability of each system under a read-heavy workload with simultaneous writes.
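The three scenarios differ only in the number of concurrent read connections. A hedged sketch of the read side, using a thread pool where each worker plays the role of one connection (the actual benchmark used dbbench, with the write stream running separately):

```python
# Illustrative sketch of the scenario structure: N concurrent "connections"
# each issuing point lookups back to back. Simulated queries, not a real DB.
import random
import time
from concurrent.futures import ThreadPoolExecutor

def query_worker(n_queries):
    # One connection issuing n_queries lookups, recording per-query latency.
    latencies = []
    for _ in range(n_queries):
        start = time.monotonic()
        time.sleep(random.uniform(0.001, 0.002))  # simulated query
        latencies.append(time.monotonic() - start)
    return latencies

def run_scenario(connections, n_queries=5):
    with ThreadPoolExecutor(max_workers=connections) as pool:
        results = list(pool.map(query_worker, [n_queries] * connections))
    all_latencies = [lat for worker in results for lat in worker]
    return len(all_latencies)

for connections in (1, 10, 100):
    print(connections, "connections →", run_scenario(connections), "queries")
```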

Test Scenarios: Results for Materialize and Aurora PostgreSQL

Scenario 1 

Aurora PostgreSQL

  • Performance: 80 transactions (0.131 TPS)

  • Mean Latency: 7.639s ± 13.38ms

  • P99 Latency: 7.757s

  • Max Latency: 7.892s

Transactions:
4.294967296s - 8.589934592s [    80]: ██████████████████████████████████████████████████

Materialize

  • Performance: 46,254 transactions (77.088 TPS)

  • Mean Latency: 12.953ms ± 176.35µs

  • P99 Latency: 56.124ms

  • Max Latency: 268.435ms

Transactions:
4.194304ms -   8.388608ms [ 28845]: ██████████████████████████████████████████████████
8.388608ms -  16.777216ms [  9252]: ████████████████
16.777216ms -  33.554432ms [  6623]: ███████████▍
33.554432ms -  67.108864ms [  1195]: ██
67.108864ms - 134.217728ms [   244]: ▍

Scenario 2

Aurora PostgreSQL

  • Performance: 277 transactions (0.451 TPS)

  • Mean Latency: 21.983s ± 225.77ms

  • P99 Latency: 24.516s

  • Max Latency: 25.076s

Transactions:
8.589934592s - 17.179869184s [     3]: ▌
17.179869184s - 34.359738368s [   274]: ██████████████████████████████████████████████████

Materialize

  • Performance: 96,221 transactions (160.365 TPS)

  • Mean Latency: 29.35ms ± 114.129µs

  • P99 Latency: 644.438ms

  • Max Latency: 825.438ms

Transactions:
4.194304ms -   8.388608ms [     5]: ▏
8.388608ms -  16.777216ms [  7445]: ██████▎
16.777216ms -  33.554432ms [ 78498]: ██████████████████████████████████████████████████
33.554432ms -  67.108864ms [  3777]: ███████████████████▍
67.108864ms - 134.217728ms [   662]: █▍
134.217728ms - 268.435456ms [    44]: ▏
268.435456ms - 536.870912ms [   687]: ▌
536.870912ms - 825.438664ms [  5103]: ████▎

Scenario 3

Aurora PostgreSQL

The Aurora read replica crashed under the load, and we were unable to complete the benchmark.

Materialize

  • Performance: 276,486 transactions (460.719 TPS)

  • Mean Latency: 216.996ms ± 583.073µs

  • P99 Latency: 713.052ms

  • Max Latency: 3.975s

Transactions:
16.777216ms -  33.554432ms [    52]: ▏
33.554432ms -  67.108864ms [   949]: ▏
67.108864ms - 134.217728ms [ 20931]: █████
134.217728ms - 268.435456ms [207898]: ██████████████████████████████████████████████████
268.435456ms - 536.870912ms [ 36042]: ████████▋
536.870912ms - 1.073741824s [ 10397]: ██▌
1.073741824s - 2.147483648s [   216]: ▏
2.147483648s - 3.975392663s [     1]: ▏

Conclusion: Materialize Outperforms Aurora PostgreSQL

In modern applications, computationally intensive workloads are the norm. And our benchmarking demonstrated that for complex queries, Materialize significantly outperforms Aurora PostgreSQL read replicas. As an operational data platform, Materialize is ideal for offloading queries from OLTP systems.

Some of the most critical findings from our benchmark tests include: 

  • Higher throughput and lower latency: Materialize achieves 100x greater throughput and 1000x lower latency than Aurora PostgreSQL for complex queries over small data volumes.

  • Efficient resource utilization: Materialize’s CPU usage never exceeded 13%, compared to Aurora’s frequent CPU alerts and eventual crash under high load.

  • Consistent fresh data: Materialize maintains a lag of less than 1 second, ensuring near-real-time data accuracy without the risk of stale results. Additionally, Materialize has a feature in private preview called real-time recency that guarantees users can always read their writes across systems, making it ideal for use cases that can never tolerate stale results.
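The Scenario 1 multiples can be derived directly from the numbers reported above; a quick calculation (values copied from the results section):

```python
# Ratios computed from the Scenario 1 results reported above.
aurora = {"tps": 0.131, "mean_latency_s": 7.639}
materialize = {"tps": 77.088, "mean_latency_s": 0.012953}

throughput_gain = materialize["tps"] / aurora["tps"]
latency_gain = aurora["mean_latency_s"] / materialize["mean_latency_s"]

print(f"throughput: {throughput_gain:.0f}x, latency: {latency_gain:.0f}x")
# → throughput: 588x, latency: 590x
```

Under concurrency the gap only widens: Aurora's throughput barely moves between one and ten connections, while Materialize's roughly doubles, and at 100 connections Aurora could not complete the run at all.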

Stay tuned for more benchmarking tests from us! And feel free to post on our Twitter or LinkedIn to request benchmarking tests.

Appendix

[1] Table Definitions
sql
CREATE TABLE products (
    product_id SERIAL PRIMARY KEY,
    product_name VARCHAR(255) NOT NULL,
    base_price NUMERIC(10, 2) NOT NULL,
    category_id INTEGER NOT NULL,
    supplier_id INTEGER NOT NULL,
    available BOOLEAN NOT NULL
);

CREATE TABLE categories (
    category_id SERIAL PRIMARY KEY,
    category_name VARCHAR(255) NOT NULL
);

CREATE TABLE suppliers (
    supplier_id SERIAL PRIMARY KEY,
    supplier_name VARCHAR(255) NOT NULL
);

CREATE TABLE sales (
    sale_id SERIAL PRIMARY KEY,
    product_id INTEGER NOT NULL,
    sale_price NUMERIC(10, 2) NOT NULL,
    sale_date TIMESTAMP NOT NULL,
    price NUMERIC(10, 2) NOT NULL
);

CREATE TABLE inventory (
    inventory_id SERIAL PRIMARY KEY,
    product_id INTEGER NOT NULL,
    stock INTEGER NOT NULL,
    warehouse_id INTEGER NOT NULL,
    restock_date TIMESTAMP NOT NULL
);

CREATE TABLE promotions (
    promotion_id SERIAL PRIMARY KEY,
    product_id INTEGER NOT NULL,
    promotion_discount NUMERIC(10, 2) NOT NULL,
    start_date TIMESTAMP NOT NULL,
    end_date TIMESTAMP NOT NULL,
    active BOOLEAN NOT NULL
);
[2] Materialize also provides a built-in MySQL source with comparable functionality for MySQL databases.
