The core execution engine of Materialize is built with Timely Dataflow and Differential Dataflow, both of which are written in Rust (more about that choice here), so it was only natural to build the rest of our services in Rust. It has turned out to be an excellent choice, for more reasons than just convenience! In this post, I want to spend some time going over what we have enjoyed about using Rust!

Overview

A language empowering everyone to build reliable and efficient software.

(the Rust website)

Rust is designed to be a good choice for many niches, but it is particularly well suited to the kinds of programs we write at Materialize: high-performance concurrent code and network services. These two niches are commonly filled by different languages, but Rust has worked well for us across both, as detailed below!

Guaranteeing Correctness

Rust uses its strong type system and heavy static analysis to help the programmer write code that is guaranteed to be free of whole classes of bugs. Some examples:

Rust’s Type System

Rust’s type system is inspired by languages in the ML family. This has a few advantages, like inference:

rust
let mut v = Vec::new(); // We don’t specify WHAT is in the vector here!

v.push("World"); // Now the compiler knows what the vector contains

println!("Hello {}", v[0]); // And can statically guarantee the type is something we can print!

and using types to prevent common bugs:

rust
let s = "rust is great!";

match s.find("great") {
  Some(idx) => println!("substring: {}", &s[idx..idx + 5]),
  None => {
    // hmm, i didn't find the substring, so I'll have to handle it somehow
  }
}

The above example shows that find doesn’t return an index that could be null, or nil, or raise a NullPointerException, but instead returns a different type: Option<usize>, that forces the user to handle the case where the substring isn’t found.
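As a further illustration of the same idea, here is a minimal sketch of a helper that passes the `Option` along to its caller instead of handling it in place. The function name `text_after` is hypothetical and chosen only for this example; `find`, `map`, and `unwrap_or` are standard library methods.

```rust
/// Returns the text that follows the first occurrence of `needle`,
/// or `None` when `needle` does not occur in `haystack`.
/// (`text_after` is a hypothetical helper, not a standard library function.)
fn text_after<'a>(haystack: &'a str, needle: &str) -> Option<&'a str> {
    // `find` yields Option<usize>; the closure only runs when a match exists.
    haystack
        .find(needle)
        .map(|idx| &haystack[idx + needle.len()..])
}

fn main() {
    let s = "rust is great!";

    // The caller has to choose a fallback before it can use the value.
    let tail = text_after(s, "great").unwrap_or("<not found>");
    println!("after match: {}", tail);

    let missing = text_after(s, "slow").unwrap_or("<not found>");
    println!("after match: {}", missing);
}
```

Because the return type is `Option<&str>`, there is no way to reach the string slice without first saying what should happen when the substring is absent.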

The Borrow Checker

The performance of languages like C++ typically comes at a cost: memory unsafety. Rust offers comparable performance, but its compiler statically guarantees the absence of memory unsafety in safe Rust code (that is, code that does not use the unsafe keyword).

For example, code equivalent to this in C++ would exhibit “undefined behavior” (in practice, usually a seg-fault):

rust
let mut v = [1, 2, 4].to_vec();
let end = &mut v[2];

// Add something to the vector
v.push(4);

// change something in the vector
*end = 3;

However, in Rust, you get this helpful error message:

rust
error[E0499]: cannot borrow `v` as mutable more than once at a time
  --> src/main.rs:8:5
   |
5  |     let end = &mut v[2];
   |                    - first mutable borrow occurs here
...
8  |     v.push(4);
   |     ^^^^^^^^^ second mutable borrow occurs here
...
11 |     *end = 3;
   |     -------- first borrow later used here

This stops the bug before the program ever runs! The example is small and contrived, but bugs like this are extremely prevalent (people have done research).
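For completeness, here is one possible way to restructure the rejected snippet so that it compiles. This is only a sketch: the function name `fix_then_push` is made up for the example, and reordering the two operations is just one of several ways to satisfy the borrow checker.

```rust
/// Applies the same two changes as the rejected snippet, but finishes
/// using the mutable reference before the vector is allowed to grow.
/// (`fix_then_push` is a hypothetical name used only for this sketch.)
fn fix_then_push() -> Vec<i32> {
    let mut v = vec![1, 2, 4];

    // Change something in the vector first. The mutable borrow of `v`
    // ends after the last use of `end`, which is the assignment below.
    let end = &mut v[2];
    *end = 3;

    // No reference into the vector is live any more, so it is safe for
    // `push` to reallocate the underlying buffer.
    v.push(4);

    v
}

fn main() {
    println!("{:?}", fix_then_push());
}
```

The compiler accepts this version because the reference into the vector is no longer in use by the time `push` may move the vector's contents.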

Every day, at Materialize and elsewhere, Rust’s type system and borrow checker work together to categorically prevent whole classes of bugs like this one. This doesn’t just save the time of the engineer writing the code; reviewers can spend nearly all of their time reviewing the business logic of changes, instead of worrying about subtle memory-safety problems that may show up later. This is in stark contrast to the experience of using languages that fill the same niches as Rust, like C++, which require careful review for basic correctness properties.

Actually Fearless Concurrency

Rust is designed to compose its type system and its borrow checker in a way that guarantees data-race freedom. Rust is the only language, as far as I know, that has succeeded in doing this, which is especially notable given that it does not have a garbage collector.

The second-order effects of this guarantee cannot be overstated. At Materialize, we are able to introduce concurrency as an optimization with no fear of data races, reducing mental overhead.
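To make the guarantee concrete, here is a small sketch (not taken from the Materialize codebase) that sums chunks of numbers on separate threads. The function name `parallel_sum` is hypothetical; `Arc`, `Mutex`, and `thread::spawn` come from the standard library.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Sums every chunk on its own thread and adds the partial results
/// into one shared total. (`parallel_sum` is a hypothetical example.)
fn parallel_sum(chunks: Vec<Vec<u64>>) -> u64 {
    // Shared mutable state has to be wrapped in something that is safe
    // to share across threads; a bare `&mut u64` would not compile here.
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| {
            let total = Arc::clone(&total);
            // `move` transfers ownership of `chunk` and this `Arc` clone
            // into the new thread.
            thread::spawn(move || {
                let partial: u64 = chunk.iter().sum();
                *total.lock().unwrap() += partial;
            })
        })
        .collect();

    // Wait for every worker to finish before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}

fn main() {
    let chunks = vec![vec![1, 2], vec![3, 4], vec![5]];
    println!("total = {}", parallel_sum(chunks));
}
```

If the `Mutex` is removed and the threads try to update a plain counter directly, the program is rejected at compile time rather than racing at run time.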

Batteries included

Rust comes, by default, with a lot more than a compiler:

  • cargo does package management, runs builds, runs tests, and is generally a Swiss Army knife for useful functions
  • rustfmt does standard formatting across pretty much all projects
  • rustup makes it easy to keep your Rust version up to date, and test with other versions
  • and more!

The standardization of tools across the Rust ecosystem not only makes it easier to get started but also means that documentation and tutorials pretty much always apply to what you are doing! This can reduce the ramp-up time.

The Community and the Ecosystem

Materialize is a large distributed system, and it needs to be able to manage complex networks of components and interact with outside systems. We have found that the Rust community is welcoming and helpful, and encourages collaboration. Additionally, the community maintains a large number of high-quality libraries and frameworks that make our job easier.

For example, the tokio organization gives us performant asynchronous networking, protobuf bindings, Kubernetes bindings, tracing (one of the best tracing libraries ever), an HTTP framework (axum) and more! Also, the tokio community Discord (and the broader Rust Discord) are invaluable resources for getting questions answered!

Hiring

Stack Overflow’s yearly developer survey has shown Rust to be the most loved language for 6 years! Engineers are excited about the technology we use to build our product, which helps us in the hiring process!

(We are hiring!)

Problems

As a core technology, Rust has offered huge benefits to Materialize, but here are some problems that we have hit, and how we work around them!

  • Rust is a relatively young language (well, it’s actually 12 years old but languages operate on geological timelines), which means libraries and ecosystems are still developing. This can mean some churn as APIs change and teething bugs are fixed.
    • Materialize maintains forks of some core libraries so that we can stay ahead of improvements and bugfixes (we end up being a power user of many libraries).
  • Rust has a relatively steep learning curve. The type system and borrow checker are more complex to interact with than those of many other common production languages.
    • We have found that hiring Rust experts who can help teach and unblock people when they hit problems saves us a lot of time.
    • Documentation for people learning Rust continues to get better, but there are still gaps!
  • Async Rust has complex semantics that can be hard to interact with. While some of this is just async programming being difficult, some core concepts and libraries are missing.

Conclusion

Any language is going to have a set of trade-offs that should be evaluated against the requirements of a project. For Materialize, Rust was an important early decision that continues to have positive effects that far outweigh the negatives, especially as the project evolved from a single binary to a distributed platform. If you have experience developing systems in Rust, or even if you don’t but want to start, Materialize is hiring!
