Recent articles
Building resilient event pipelines on Kafka 3.7
A pragmatic look at exactly-once semantics, idempotent producers, and how to recover from broker failures without data loss.
12 min read
Why io_uring is finally ready for production
After three years of incremental improvements, Linux 6.6 makes io_uring the obvious default for high-throughput network services.
9 min read
Replacing CMake: an evening with Bazel modules
Lessons learned migrating a 90k-line C++ project to bzlmod. Spoiler: it took two weeks and was worth it.
15 min read
Postgres 17 logical replication: still tricky
Failover slot synchronization finally lands, but row filters and column lists still have sharp edges we walked into in production.
11 min read
Designing for tail latency at scale
How hedged requests, request reissue, and adaptive timeouts cut p99 latency by 70% in a production service handling 200k req/s.
14 min read
A field guide to OpenTelemetry sampling
Tail-based vs. head-based sampling: where each fits, and how to keep your bill bounded without losing critical traces.
10 min read