The core execution engine of Materialize is built with Timely Dataflow and Differential Dataflow, both of which are written in Rust (more about that choice here). So it was only natural to build the rest of our services in Rust. However, we found this was an excellent choice for more reasons than just convenience. In this post, I want to discuss what we’ve enjoyed about using Rust.

Why Rust and Materialize are a Good Match

Rust: A language empowering everyone to build reliable and efficient software. (the Rust website)

Rust is designed to be a good choice for many niches. It’s particularly well-suited to the kinds of programs we write at Materialize: high-performance concurrent code and network services. These niches are often filled by different languages, but Rust has worked well for us across both.

Guaranteeing Correctness

Rust uses its strong type system and heavy static analysis to help programmers rule out entire classes of bugs at compile time. Some examples:

Rust’s Type System

Rust’s type system is inspired by languages in the ML family. This has a few advantages, like inference:

rust
let mut v = Vec::new(); // We don’t specify WHAT is in the vector here!

v.push("World"); // Now the compiler knows what the vector contains

println!("Hello {}", v[0]); // And can statically guarantee the type is something we can print!

and using types to prevent common bugs:

rust
let s = "rust is great!";

match s.find("great") {
  Some(idx) => println!("substring: {}", &s[idx..idx + 5]),
  None => {
    // hmm, I didn't find the substring, so I'll have to handle it somehow
  }
}

The above example shows that find doesn’t return an index that could be null, or nil, or raise a NullPointerException. Instead, it returns a different type, Option<usize>, that forces the user to handle the case where the substring isn’t found.
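As a further illustration, here is a minimal sketch of how an `Option` can flow through a small helper and still force the caller to decide what happens when nothing is found. The function `text_after` is a hypothetical name introduced for this example; only `str::find`, `Option::map`, and `Option::unwrap_or` come from the standard library.

```rust
// Hypothetical helper: returns the part of `s` that follows the first
// occurrence of `needle`, or `None` when `needle` is absent.
fn text_after<'a>(s: &'a str, needle: &str) -> Option<&'a str> {
    // `find` returns Option<usize>; `map` only runs in the `Some` case.
    s.find(needle).map(|idx| &s[idx + needle.len()..])
}

fn main() {
    let s = "rust is great!";

    // The caller has to choose a fallback explicitly; there is no null to forget about.
    println!("{}", text_after(s, "great").unwrap_or("<not found>"));
    println!("{}", text_after(s, "slow").unwrap_or("<not found>"));
}
```

The missing case cannot be ignored by accident: using the result as a `&str` without unwrapping or matching on it is a type error.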

The Borrow Checker

Languages like C++ typically perform well, but that performance comes at a cost: memory unsafety. Rust offers comparable performance. The difference is that its compiler statically guarantees memory safety in normal (safe) Rust code. For example, code equivalent to this in C++ would exhibit “undefined behavior” (in practice, often a seg-fault or silent memory corruption):

rust
let mut v = [1, 2, 4].to_vec();
let end = &mut v[2];

// Add something to the vector
v.push(4);

// change something in the vector
*end = 3;

However, in Rust, you get this helpful error message:

rust
error[E0499]: cannot borrow `v` as mutable more than once at a time
  --> src/main.rs:8:5
   |
5  |     let end = &mut v[2];
   |                    - first mutable borrow occurs here
...
8  |     v.push(4);
   |     ^^^^^^^^^ second mutable borrow occurs here
...
11 |     *end = 3;
   |     -------- first borrow later used here

This prevents you from introducing a crashing bug! This example is small and contrived. However, bugs like this are extremely prevalent (research confirms it).
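One way to satisfy the borrow checker here, sketched under the assumption that the intent of the snippet above was simply to grow the vector and then update an element, is to take the mutable reference only after the `push`. The function name `push_then_update` is hypothetical.

```rust
// Hypothetical rewrite of the rejected snippet that the compiler accepts.
fn push_then_update() -> Vec<i32> {
    let mut v = vec![1, 2, 4];

    // Grow the vector first; this may reallocate its backing buffer.
    v.push(4);

    // Borrow the element only after the push, so the reference cannot dangle.
    let end = &mut v[2];
    *end = 3;

    v
}

fn main() {
    println!("{:?}", push_then_update()); // prints [1, 2, 3, 4]
}
```

The reordering is small, but the compiler error is what points out that the original ordering was unsafe in the first place.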

Every day, at Materialize and elsewhere, Rust’s type system and borrow checker work together to categorically prevent whole classes of bugs like this one. This does more than save time writing code. Reviewers can spend effectively 100% of their time reviewing the business logic of changes, instead of worrying about subtle problems that may show up. This is in stark contrast to languages that fill the same niches as Rust, like C++, which require careful review for basic correctness properties.

How Using Rust for Materialize Gave Us Actually Fearless Concurrency

Rust’s authors designed its type system and borrow checker to guarantee data-race-freedom. As far as I know, Rust is the only language that has succeeded in doing this, which is especially notable given that it doesn’t have a garbage collector. I can’t overstate the second-order effects of this guarantee. At Materialize, we introduced concurrency as an optimization without fear of data races, which reduced our mental overhead.
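To make the guarantee concrete, here is a minimal sketch (not Materialize code) of several threads updating one shared total using only the standard library. The function `parallel_sum` and its parameters are hypothetical names chosen for this example, and it assumes `workers` is greater than zero.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical example: sum 1..=n using `workers` threads (workers > 0).
fn parallel_sum(n: u64, workers: u64) -> u64 {
    // Shared state must be wrapped in types that are safe to share across threads.
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                // Each worker sums the values congruent to `w` modulo `workers`.
                let partial: u64 = (1..=n).filter(|x| x % workers == w).sum();
                // The lock is the only way to reach the shared total.
                *total.lock().unwrap() += partial;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}

fn main() {
    println!("{}", parallel_sum(100, 4)); // prints 5050
}
```

If the `Mutex` were dropped and the spawned threads tried to mutate a plain local counter instead, the program would be rejected at compile time rather than racing at runtime.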

Batteries Included

Rust comes, by default, with a lot more than a compiler:

  • cargo does package management, runs builds, runs tests, and is generally a Swiss Army knife for useful functions
  • rustfmt does standard formatting across pretty much all projects
  • rustup makes it easy to keep your Rust version up to date, and test with other versions

The standardization of tools across the Rust ecosystem makes it easier to get started. It also means that documentation and tutorials pretty much always apply to what you’re doing. This reduces the ramp-up time for the language.

The Community and the Ecosystem

Materialize is a large distributed system. It needs to manage complex networks of components and interact with outside systems to boot. We’ve found that the Rust community is welcoming, helpful, and encourages collaboration. Additionally, the community maintains a large number of high-quality libraries and frameworks that make our job easier.

For example, the Tokio organization gives us performant asynchronous networking, protobuf bindings, Kubernetes bindings, tracing (one of the best tracing libraries ever), an HTTP framework (axum), and more! Also, the Tokio community Discord (and the broader Rust Discord) are invaluable resources for getting our questions answered.

Problems

As a core technology, Rust has offered huge benefits to Materialize. However, we’ve also hit some problems. Here’s how we’ve worked around them.

  • Rust is a relatively young language. (Well, it’s actually 12 years old, but languages operate on geological timelines.) This means libraries and ecosystems are still developing, which produces churn as APIs change and the community fixes teething bugs.
    • Materialize ends up maintaining forks of some core libraries to stay ahead of improvements and bug fixes. As a result, Materialize is a power user of multiple libraries.
  • Rust has a relatively steep learning curve. The type system and borrow checker are harder to learn than those of many other common production languages.
    • We’ve found that hiring Rust experts who can help teach and unblock people when they hit problems saves us a lot of time.
    • Documentation for people learning Rust continues to get better - but there are still gaps.
  • Async Rust has complex semantics that can be hard to use. While some of this is just async programming being difficult, some core concepts and libraries are missing.
    • The async working group continues to work on closing these gaps.

Why We’re Happy Using Rust for Materialize

Any language has trade-offs that you should evaluate against a project’s requirements. For Materialize, Rust was an important early decision that continues to have positive effects that far outweigh the negatives. That’s especially been true as Materialize evolved from a single binary to a distributed platform. If you have experience developing systems in Rust, or even if you don’t but want to start, Materialize is hiring!
