<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Posts on HFT Engineer's Roadmap</title><link>https://hftengineer.com/posts/</link><description>Recent content in Posts on HFT Engineer's Roadmap</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sat, 20 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://hftengineer.com/posts/index.xml" rel="self" type="application/rss+xml"/><item><title>Lock-free SPSC ring buffer: the queue under every trading system</title><link>https://hftengineer.com/posts/spsc-ring-buffer/</link><pubDate>Sat, 20 Jun 2026 00:00:00 +0000</pubDate><guid>https://hftengineer.com/posts/spsc-ring-buffer/</guid><description>A single-producer/single-consumer ring buffer is the fastest way to move data between two threads, and the canonical low-latency interview question. We build one in C++, prove it correct with acquire/release ordering, and then watch a textbook false-sharing &amp;lsquo;fix&amp;rsquo; make it slower before the real optimisation speeds it up 14×. All numbers are measured, and the code is ThreadSanitizer-clean.</description></item><item><title>Mechanical sympathy: cache, branches, false sharing</title><link>https://hftengineer.com/posts/mechanical-sympathy/</link><pubDate>Sun, 03 May 2026 00:00:00 +0000</pubDate><guid>https://hftengineer.com/posts/mechanical-sympathy/</guid><description>Three hardware ideas that decide whether your low-latency code is fast or pretending to be: how the cache hierarchy works, why branch prediction can change runtime by 5×, and how false sharing makes lock-free code slower than mutexes.</description></item></channel></rss>