Mechanical sympathy: cache, branches, false sharing

Three hardware ideas that decide whether your low-latency code is fast or pretending to be: how the cache hierarchy works, why branch prediction can change runtime by 5×, and how false sharing can make lock-free code slower than a mutex-based version.

May 3, 2026 · 7 min · HFT Engineer's Roadmap