Another week, another big improvement to sinks! This time, we’ve focused on dramatically improving restart times.
When a Kafka sink is restarting, it learns where to resume by reading from the metadata stored in a progress topic on the Kafka cluster. We strongly recommend enabling compaction and disabling tiered storage on this topic for optimal performance. But we know that’s not always feasible. And without those settings, restarting long-running sinks could take… well, a while.
With this week's changes, Materialize now reads the progress topic backwards. Instead of scanning from the start, we jump straight to the end, where the most recent progress information lives, and skip the older records we no longer need. This makes restarting faster, sometimes a lot faster.
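To make the idea concrete, here is a toy sketch of why reading backwards helps. It is not Materialize's implementation: an in-memory list stands in for the progress topic, the list index stands in for the Kafka offset, and the names `latest_progress_forward`, `latest_progress_backward`, and the `sink-a`/`sink-b` keys are made up for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical record shape: (sink_key, progress_value).
Record = Tuple[str, int]


def latest_progress_forward(log: List[Record], key: str) -> Tuple[Optional[int], int]:
    """Scan from the start; every record must be read to be sure we saw the last one."""
    latest = None
    reads = 0
    for record_key, value in log:
        reads += 1
        if record_key == key:
            latest = value
    return latest, reads


def latest_progress_backward(log: List[Record], key: str) -> Tuple[Optional[int], int]:
    """Scan from the end; stop at the first (i.e. most recent) record for the key."""
    reads = 0
    for record_key, value in reversed(log):
        reads += 1
        if record_key == key:
            return value, reads
    return None, reads


# An uncompacted log where two sinks alternate writing progress records.
log = [("sink-a" if i % 2 == 0 else "sink-b", i) for i in range(10)]

forward_value, forward_reads = latest_progress_forward(log, "sink-a")
backward_value, backward_reads = latest_progress_backward(log, "sink-a")
print(forward_value, forward_reads)    # same value, every record read
print(backward_value, backward_reads)  # same value, only the tail read
```

Both functions return the same progress value, but the backward scan stops as soon as it finds the newest record for the sink. On an uncompacted topic that has been accumulating records for a long time, that difference is what dominates restart time.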
For one user, we saw resumption times drop from 48 minutes to 35 seconds!

For others, we saw over a 4x improvement.

We still strongly recommend our best-practice settings (compaction enabled, tiered storage disabled) for the best possible sink experience and to get sink restart times down to seconds. But for those operating Kafka clusters where this is not possible, sink restart times should be significantly improved across the board. Enjoy! 🚀