Introduction #
As part of this post, we will design a real-time matching and dispatch system for a ride-hailing platform, focusing on geospatial indexing, low-latency matching, streaming data, pricing, and consistency trade-offs.
1. Requirements #
Functional Requirements #
Rider:
- Request a ride (pickup, dropoff, ride type)
- Get quote with ETA and estimated price
- Confirm ride, see assigned driver and live location
Driver:
- Receive ride offers, accept or reject.
- Stream location while en route and during trip.
- Go online/offline.
System:
- Select the best driver (multi-factor, not just the nearest)
- Compute ETAs for pickup and drop
- Run dynamic pricing / surge per area.
- Persist trips, payments, ratings and audit logs.
Non-Functional Requirements #
Key tradeoffs:
- Availability over strong consistency on read paths: Riders should almost always see some ETA and price, even if it’s slightly stale.
- Low latency: Matching decision in < 200-300 ms under typical load. Location update end-to-end (driver -> rider map) in < 500 ms.
- Scalability: Handle >= 1M GPS updates per second globally. Scale horizontally across regions and data centers.
- Correctness: A driver should never be fully committed to two trips at once - strong consistency around assignment.
- Observability and cost efficiency: Streaming-first, cheap writes, heavy analytics off hot path.
Note: We accept eventual consistency for ETAs and surge levels, but require strong consistency for ride assignment and payment settlement.
2. High-Level Architecture #
Core Components #
We can think in terms of 3 planes:
- Online request plane - matching and real-time user interactions.
- Streaming plane - GPS ingestion, event processing, feature computation.
- Batch/analytics plane - offline modeling and business insights.
```
Mobile Apps -> API Gateway -> Connection Service (WebSockets)
                                     |
                                     v
           Location Service ---- Ride Service -> Payment
                  |                  |
                  |            Matching Service -> Geospatial Index
                  |                  |
GPS -> Ingestion -> Stream ------> ETA / Routing -> Pricing and Surge
                      |
                      |----> Analytics Plane
```
APIs #
- `POST /rides/quote`
  - Input: `pickup_lat`, `pickup_lng`, `dropoff_lat`, `dropoff_lng`, `ride_type`
  - Output: `estimated_price`, `estimated_eta`
- `POST /rides`
  - Input: `pickup`, `drop`, `price`, `ride_type`, `client_token`
  - Output: `ride_id`, `status` (eg: requested)
- `POST /rides/{ride_id}/cancel` (for both rider and driver)
- `POST /rides/{ride_id}/complete` (driver)
- `POST /driver/{driver_id}/status` (whether driver is online or offline)
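For illustration, a quote call might look like the sketch below; the host, field names in the response, and timeout are assumptions layered on the API shape above.

```python
import requests

resp = requests.post(
    "https://api.example.com/rides/quote",   # hypothetical host
    json={
        "pickup_lat": 37.7749, "pickup_lng": -122.4194,
        "dropoff_lat": 37.6213, "dropoff_lng": -122.3790,
        "ride_type": "standard",
    },
    timeout=2,  # quotes sit on the hot path; fail fast and let the client retry
)
quote = resp.json()  # e.g. {"estimated_price": 31.50, "estimated_eta": 5}
```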
Real Time Over Sockets #
- Rider Channel:
  - `ride_update` (matched, driver en route, started, completed)
  - `driver_location` (stream)
- Driver Channel:
  - `ride_offer` (new ride request with countdown)
  - `ride_update` (GPS updates, cancelled ride, ride started, ride completed)
Geospatial Indexing: Quadtrees vs S2 vs H3 #
We need to handle millions of GPS updates per second, and indexing raw lat/longitude in a traditional RDBMS does not scale. Instead, we will use a geospatial index that will:
- Partition the world into cells
- Map each `(lat, lng)` to a cell ID
- Maintain in-memory state per cell and small neighbourhoods.
Quadtrees / Geohash #
- Idea: Recursively subdivide the world into quadrants; each level adds precision.
- Geohash encodes lat/lng into a string; prefix similarity means proximity.
- Pros:
- Simple to implement
- Prefix-based range queries on the geohash string.
- Cons:
- Cell shapes distort with latitude
- Neighbourhood queries across cell boundaries are messier.
Google S2 #
S2 partitions the sphere into hierarchical cells using a space-filling curve (Hilbert) over faces of a cube projected onto Earth.
- Pros:
- Native spherical geometry (good for global systems).
- Good locality properties via Hilbert curve.
- Widely used (Google Maps, BigQuery GIS, MongoDB etc).
- Cons:
- High complexity
Uber H3 #
H3 is Uber’s open-source hexagonal hierarchical spatial index.
- Hexagonal grid over the globe, resolutions 0 - 15.
- Each cell has a 64-bit index; supports k-ring neighbours, polygons etc.
- Pros:
- Hexagons have more uniform neighbour distances than squares.
- Built-in functions make nearest-neighbour search, k-ring queries (neighbours within N rings), and aggregation across resolutions straightforward.
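A minimal sketch of H3 indexing using the h3-py library, assuming its v3 API (`geo_to_h3`, `k_ring`, `h3_to_parent`); the resolution choices are illustrative and would be tuned to the desired search radius.

```python
import h3

# Map a pickup point to an H3 cell at resolution 9 (~0.1 km^2 hexagons).
pickup_cell = h3.geo_to_h3(37.7749, -122.4194, 9)

# Cells within 2 "rings" of the pickup cell: a small neighbourhood around
# the pickup (tune k and resolution to the search radius you want).
nearby_cells = h3.k_ring(pickup_cell, 2)

# Coarser parent cell, useful when aggregating supply/demand per area.
area_cell = h3.h3_to_parent(pickup_cell, 7)
```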
Location Tracking: Handling Millions of GPS Pings #
When a driver sends a GPS update every 1-5 seconds:
- Ingress
- Driver -> WebSocket -> Connection Service -> Location Service
- Matching / Tracking
  - Convert `(lat, lng)` -> cell ID (eg: H3 index)
  - Update in-memory driver state:
    - `drivers_by_cell[cell_id]` -> set of available drivers in that cell
    - `driver_state[driver_id]` -> last location, status
- Redis cache
  - Set TTL for driver locations to automatically remove stale drivers.
This ensures matching queries never hit the database and are served from memory.
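A rough sketch of that update path, assuming h3-py and redis-py; the key names (`driver_state:*`, `drivers_by_cell:*`) and TTL are illustrative.

```python
import time
import h3
import redis

r = redis.Redis()
LOCATION_TTL_S = 30  # drivers that stop pinging expire out automatically

def handle_gps_ping(driver_id: str, lat: float, lng: float) -> None:
    cell_id = h3.geo_to_h3(lat, lng, 9)

    # Last-known state per driver, expiring if pings stop.
    r.hset(f"driver_state:{driver_id}",
           mapping={"lat": lat, "lng": lng, "cell": cell_id, "ts": time.time()})
    r.expire(f"driver_state:{driver_id}", LOCATION_TTL_S)

    # Set of drivers per cell; membership in the previous cell simply ages out
    # via the TTL (a production version would remove it explicitly on cell change).
    r.sadd(f"drivers_by_cell:{cell_id}", driver_id)
    r.expire(f"drivers_by_cell:{cell_id}", LOCATION_TTL_S)
```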
Partitioning and Scaling #
To handle high write volume:
- Partition by region and cell ID:
- Each shard manages a subset of cells.
- Topic partition key: region, cell_id prefix
- Tunable update frequency:
- Adaptive throttling (slower updates when driver is stopped)
- Backpressure and drop policies:
- If downstream is overloaded, drop older updates for the same driver - only the latest matters per driver.
Note: We aim for a write-optimized, streaming-first architecture: We treat the GPS stream as the ground truth, update an in-memory index for sub-second queries, and persist raw events separately to a data lake for analytics and ML.
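One way to implement the "only the latest update matters per driver" drop policy above is a small keyed buffer that overwrites older pings instead of queueing them; this is a sketch under that assumption.

```python
import threading

class LatestLocationBuffer:
    """Keeps only the most recent GPS update per driver.

    If the consumer falls behind, older updates for the same driver are
    silently overwritten rather than queued (drop-oldest backpressure).
    """

    def __init__(self):
        self._latest = {}          # driver_id -> (lat, lng, ts)
        self._lock = threading.Lock()

    def offer(self, driver_id, lat, lng, ts):
        with self._lock:
            self._latest[driver_id] = (lat, lng, ts)

    def drain(self):
        # Called by the downstream consumer at its own pace.
        with self._lock:
            batch, self._latest = self._latest, {}
        return batch
```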
Matching Logic: Finding the “Best” Driver #
Best is rarely “closest” - it’s a weighted combination of:
- ETA to pickup (road, traffic)
- Driver rating and cancellation history
- Driver’s current trip (direction)
- Vehicle constraints (type)
- Platform incentives (driver utilization etc)
Step 1: Candidate Retrieval
Given a pickup point:
- Map `(lat, lng)` -> `cell_id`
- Query the current cell + k-ring neighbours (eg: 2-3 km).
- Collect available drivers from these cells, with a cap (eg: top 100 drivers by proximity).
Using H3:
- `geoToH3(lat, lng, res)` -> `cell_id`
- `kRing(cell_id, k)` -> set of neighbour cell ids
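A sketch of candidate retrieval against the in-memory structures described earlier, assuming the h3-py v3 API; the function name, cap, and the `drivers_by_cell` / `driver_state` shapes are illustrative.

```python
import h3

MAX_CANDIDATES = 100

def candidate_drivers(lat, lng, drivers_by_cell, driver_state, k=2):
    """Return up to MAX_CANDIDATES available drivers near a pickup point."""
    pickup_cell = h3.geo_to_h3(lat, lng, 9)
    candidates = []
    for cell_id in h3.k_ring(pickup_cell, k):
        for driver_id in drivers_by_cell.get(cell_id, ()):
            if driver_state[driver_id]["status"] == "available":
                # Grid distance in cells is a cheap proxy for proximity.
                dist = h3.h3_distance(pickup_cell, cell_id)
                candidates.append((dist, driver_id))
    candidates.sort(key=lambda c: c[0])
    return [driver_id for _, driver_id in candidates[:MAX_CANDIDATES]]
```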
Step 2: Scoring Function
Define a match score:
score = w1*eta + w2*rating + w3*vehicle_type + w4*driver_utilization + ...
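A minimal weighted-scoring sketch; the weights, feature names, and sign conventions are illustrative placeholders that would be tuned (or learned) in practice.

```python
WEIGHTS = {"eta": -1.0, "rating": 0.3, "utilization": 0.2}  # illustrative weights

def match_score(candidate):
    """Higher is better; ETA dominates, rating and idle time break ties."""
    return (WEIGHTS["eta"] * candidate["eta_to_pickup_min"]
            + WEIGHTS["rating"] * candidate["rating"]           # e.g. 1-5 stars
            + WEIGHTS["utilization"] * candidate["idle_minutes"])

def best_driver(candidates):
    return max(candidates, key=match_score) if candidates else None
```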
Step 3: Offer and Reservation
To avoid double-assignment, we use short-lived reservations:
- Matching Service chooses the top candidate driver.
- It calls a Driver Allocation Store (Redis) with compare-and-set (CAS):
  - Set `driver_status = "reserved"` if the current status is `"available"`.
- If CAS succeeds:
  - Create a RideOffer
  - Send the offer via WebSocket to the driver.
- Driver accepts or rejects via WebSocket -> Connection Service -> Matching / Ride Service
- If accepted, Ride Service updates the trip status to `"assigned"` and notifies both rider and driver.
- On timeout / rejection, Matching Service removes the reservation, makes the driver available again and tries the next candidate.
Note: Two riders trying to reserve the same driver concurrently cannot both succeed - the CAS/lock enforces exclusive assignment.
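A sketch of that CAS using a Redis Lua script (scripts execute atomically on the server); the key and status names are illustrative, and the status key is assumed to be maintained by the driver status/location flow.

```python
import redis

r = redis.Redis()

# Atomically flip "available" -> "reserved" and attach a reservation TTL.
RESERVE_SCRIPT = r.register_script("""
if redis.call('GET', KEYS[1]) == 'available' then
    redis.call('SET', KEYS[1], 'reserved', 'EX', tonumber(ARGV[1]))
    return 1
end
return 0
""")

def try_reserve_driver(driver_id: str, ttl_s: int = 15) -> bool:
    key = f"driver_status:{driver_id}"
    return RESERVE_SCRIPT(keys=[key], args=[ttl_s]) == 1
```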
WebSockets for Real-Time Signalling #
- Both rider and driver maintain persistent WebSocket connections to Connection Service.
- The Connection Service knows which instance holds the WebSocket for each user.
Benefits:
- Full-duplex, low latency updates vs polling.
- Separates stateful connection management from stateless matching logic.
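As noted above, other services need to know which Connection Service instance owns a given user's socket. One common approach is a shared registry; this is a sketch using Redis, with the key name and TTL as assumptions.

```python
from typing import Optional
import redis

r = redis.Redis()
CONNECTION_TTL_S = 60  # refreshed by heartbeats while the socket stays open

def register_connection(user_id: str, instance_id: str) -> None:
    # Called by the Connection Service instance that accepted the WebSocket.
    r.set(f"ws_instance:{user_id}", instance_id, ex=CONNECTION_TTL_S)

def instance_for(user_id: str) -> Optional[str]:
    # Matching / Ride services look up the owning instance and forward events to it.
    instance = r.get(f"ws_instance:{user_id}")
    return instance.decode() if instance else None
```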
Data Consistency: Preventing Double Booking #
- Per-driver linearizability: A single “home” shard / key for each driver. All state transitions go through that shard.
- Idempotent APIs: The client includes a `request_id` on mutation calls. The server stores `(request_id -> result)` with some TTL.
- Ride state machine with explicit transitions: `Requested -> OfferPending -> Matched -> Ongoing -> Completed/Cancelled`. Every transition is a compare-and-set on the current state.
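A sketch of the guarded state transition as a conditional UPDATE; sqlite3 stands in for whatever store holds ride state, and the table/column names are assumptions.

```python
import sqlite3

ALLOWED = {
    "requested":     {"offer_pending", "cancelled"},
    "offer_pending": {"matched", "requested", "cancelled"},
    "matched":       {"ongoing", "cancelled"},
    "ongoing":       {"completed", "cancelled"},
}

def transition(conn: sqlite3.Connection, ride_id: str, expected: str, new: str) -> bool:
    """Compare-and-set on the ride's current state; returns False if we lost the race."""
    if new not in ALLOWED.get(expected, set()):
        raise ValueError(f"illegal transition {expected} -> {new}")
    cur = conn.execute(
        "UPDATE rides SET status = ? WHERE ride_id = ? AND status = ?",
        (new, ride_id, expected),
    )
    conn.commit()
    return cur.rowcount == 1  # exactly one row changed only if the expected state held
```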
ETA Engine #
A basic ETA engine has two layers:
- Routing layer: represents the road network as a graph
  - Nodes: intersections / waypoints.
  - Edges: road segments with base travel times and constraints
- Traffic layer: real-time traffic data used to adjust edge weights
Given a source and destination:
- Use A* or Dijkstra's algorithm to find the shortest (fastest) path
- Integrate real-time traffic data to adjust edge weights
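A compact sketch of the routing step with traffic-adjusted weights, using plain Dijkstra; the `graph` and `traffic` structures are illustrative stand-ins for the routing and traffic layers.

```python
import heapq

def eta_seconds(graph, traffic, source, dest):
    """Dijkstra over travel times; `graph[u]` -> [(v, base_seconds), ...].

    `traffic` maps an edge (u, v) to a congestion multiplier (1.0 = free flow).
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dest:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, base in graph.get(node, ()):
            new_cost = cost + base * traffic.get((node, nbr), 1.0)
            if new_cost < best.get(nbr, float("inf")):
                best[nbr] = new_cost
                heapq.heappush(heap, (new_cost, nbr))
    return None  # destination unreachable
```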
Pricing and Surge #
We can start with a simple fare model:
price = (per_km * distance) + (per_min * time) + (surge_multiplier * base_fare)
Measuring Supply and Demand
For each cell:
- Supply: number of available drivers in / around that cell.
- Demand: recent ride requests / search queries originating from that cell.
Computing Price:
- Map pick-up and drop to cells
- Estimate ETA and trip duration using ETA engine
- Compute base fare components
- Fetch surge multiplier for pickup cell
- Compute final price and show to user with time-limited validity (eg: 2 minutes).
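A sketch of the quote computation under the fare model above; the rate constants, surge cap, and heuristic are illustrative examples, not a real pricing policy.

```python
def quote_price(distance_km, duration_min, surge_multiplier,
                base_fare=2.0, per_km=1.2, per_min=0.3):
    """Combine distance, time and surge into a quoted price (rates are examples)."""
    price = (per_km * distance_km) + (per_min * duration_min) \
            + (surge_multiplier * base_fare)
    return round(price, 2)

def surge_for_cell(supply, demand, max_surge=3.0):
    """Very rough surge heuristic: scale with the demand/supply ratio, capped."""
    if supply == 0:
        return max_surge
    return min(max_surge, max(1.0, demand / supply))
```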
3. Potential Bottlenecks and Mitigations #
- Hotspots (eg: New Year's Eve): A sudden spike in demand in a few cells can cause:
  - a high queue backlog in the Matching Service
  - Connection Service saturation, i.e. too many WebSockets in a single region.
Mitigations can include:
- Limit search radius or candidate count for driver matching during overload
- Degrade gracefully i.e. approximate ETAs, cached routes, cheaper heuristics
- Admission Control: if Matching Service queue length exceeds limit, surface an error - “High demand in your area, try again later”.
- Connection Service scaling: Use consistent hashing of `ride_ids` across nodes. During high demand, auto-scale and rebalance (see the hash-ring sketch below).
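A minimal consistent-hash ring sketch showing how keys (here ride IDs) map to Connection Service nodes; the virtual-node count and the choice of MD5 are illustrative.

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node gets `vnodes` positions on the ring to smooth load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["conn-1", "conn-2", "conn-3"])
owner = ring.node_for("ride_12345")  # the same ride always routes to the same node
```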
4. Global Scaling #
- Multi-AZ (Availability Zone) deployment with automatic failover to prevent downtime. In this setup, you have a Primary instance in one AZ and a Standby instance in a second AZ. If the Primary goes dark, the system automatically swaps them so your application stays online.
- Per-region H3/S2 index: Global coordination only at higher levels (e.g., cross-border trips).
- Caching layers: Surge multipliers, popular routes, static map data.
- Backpressure signals from downstream services: ETA / routing service indicates overload → Matching Service switches to heuristic ETAs.
- Replayability: All requests and location streams are written to Kafka, enabling reprocessing with improved algorithms and post-hoc incident debugging.