System Thinking Case Study

Destroying Latency:
From 6s to 0.8s in Booking Systems

High Performance · Event-Driven Architecture · 5 min read

Architecture is a series of trade-offs. In this case, the trade-off was between the simplicity of REST polling and the performance of an asynchronous, event-driven engine.

Legacy State
6.0s
Average Booking Latency

Polling-heavy REST APIs created a "Thundering Herd" problem: servers spent more time on connection (TCP/TLS) handshakes than on business logic.

Optimized State
0.8s
87% Reduction in Latency

Real-time WebSocket streaming with an event-driven backend. Instant state propagation without the overhead of repeated request-response cycles.

The Mental Model: Polling vs. Push

The biggest bottleneck in the legacy system was Competitive Polling. When thousands of users are "waiting" for a booking confirmation, their clients spam the server with REST requests every 500ms.

// The Anti-pattern: High-Frequency Polling
GET /api/booking/status?id=123 -> "pending"  (600ms RTT)
GET /api/booking/status?id=123 -> "pending"  (600ms RTT)
GET /api/booking/status?id=123 -> [Found!]

This creates a massive "Thundering Herd" where the server is bombarded with redundant queries. The fix wasn't "faster code"—it was a **Fundamental Shift in Communication**.
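To see why the herd is so wasteful, the polling loop can be sketched in a few lines of Python. Everything here is a stand-in (the endpoint, the readiness threshold, the function names are illustrative, not the production client): the point is that every request before the last one carries zero new information.

```python
import time

# Hypothetical stand-in for the booking service: the booking only
# "completes" after several polls, as in a multi-second transaction.
STATUS_READY_AFTER = 3  # polls before the server finally reports success

def fetch_status(booking_id: str, attempt: int) -> str:
    # Stand-in for GET /api/booking/status?id=<booking_id>
    return "confirmed" if attempt >= STATUS_READY_AFTER else "pending"

def poll_until_confirmed(booking_id: str, interval_s: float = 0.5) -> int:
    """Poll every `interval_s` seconds; return how many requests were redundant."""
    attempt = 0
    while fetch_status(booking_id, attempt) != "confirmed":
        attempt += 1
        time.sleep(0)  # would be time.sleep(interval_s) in a real client
    return attempt

wasted = poll_until_confirmed("123")
print(wasted)  # 3 redundant round-trips before the one that mattered
```

Multiply those redundant round-trips by thousands of concurrent clients and the server's request volume is dominated by queries whose answer is "not yet".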

Event-Driven WebSocket Architecture

Instead of the client asking "Are we there yet?", we moved to a model where the server says "It's ready, here's the data."

1

Asynchronous API Chaining

Clients fire a single booking request and immediately get a "Processing" ACK (20ms), freeing up the main thread.

2

Stateful WebSocket Connections

The client opens a single persistent socket. No more TCP/TLS handshake overhead for every status check.

3

Internal Event Bus (Pub/Sub)

Once the worker completes the booking logic, it publishes an event. The WebSocket gateway catches this and "pushes" the update to the specific client instantly.
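The three steps above can be sketched as a single in-process demo. This is a toy, not the production system: the bus is a dict of callbacks, the "socket" is a per-client outbox list, and all names (`EventBus`, `WebSocketGateway`, the `booking.confirmed` topic) are illustrative.

```python
from collections import defaultdict
from typing import Callable, Dict, List

class EventBus:
    """Minimal in-process pub/sub: topic -> list of subscriber callbacks."""
    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(event)

class WebSocketGateway:
    """Holds one persistent 'socket' (here: a list) per client and pushes into it."""
    def __init__(self, bus: EventBus) -> None:
        self.outbox: Dict[str, list] = {}
        bus.subscribe("booking.confirmed", self._on_confirmed)

    def connect(self, client_id: str) -> None:
        self.outbox[client_id] = []  # opened once, reused for every update

    def _on_confirmed(self, event: dict) -> None:
        # Route the event to the specific waiting client -- the "push".
        client_id = event["client_id"]
        if client_id in self.outbox:
            self.outbox[client_id].append(event)

bus = EventBus()
gateway = WebSocketGateway(bus)
gateway.connect("client-42")  # step 2: client holds one persistent socket

# Step 1 happened earlier: the client fired the request and got an instant ACK.
# Step 3: the worker finishes and publishes; the gateway pushes immediately.
bus.publish("booking.confirmed", {"client_id": "client-42", "booking_id": "123"})
print(gateway.outbox["client-42"][0]["booking_id"])  # -> 123
```

Note the inversion: the client never asks again after the first request. Latency is now bounded by how fast the worker finishes, not by the polling interval.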

The Engineering Trade-off

This transition reduced perceived transaction latency from 6s to 0.8s. However, it introduced complexity: we now had to manage **Stateful Connections** and **Sticky Sessions**. This required a load balancer that understood the WebSocket protocol (Layer 7) and a distributed Redis layer for session tracking. But for the end-user, the experience went from "clunky and slow" to "instantaneous."
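At the load balancer, "understanding WebSockets" concretely means forwarding the HTTP `Upgrade` handshake and pinning each client to one backend node. A minimal sketch in nginx syntax (upstream name, hosts, and path are illustrative; `ip_hash` is just one way to get stickiness):

```nginx
# Illustrative L7 config: proxy WebSocket upgrades and keep sessions sticky.
upstream booking_ws {
    ip_hash;                  # sticky sessions by client IP (one option)
    server ws-node-1:8080;
    server ws-node-2:8080;
}

server {
    listen 443 ssl;
    location /ws/ {
        proxy_pass http://booking_ws;
        proxy_http_version 1.1;                   # required for Upgrade
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 3600s;                 # keep idle sockets open
    }
}
```

Stickiness matters because the connection state lives on the node that accepted it; the Redis layer covers the failover case, when a client reconnects and lands on a different node.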
