How We Track Riders in Real-Time for $0.003 per Ride
2026-04-23 · 14 min read


The full architecture behind Dago's rider tracking, from GPS ping to smooth map animation, running on a single $40/month VPS.

architecture · real-time · cost-optimization · ride-hailing · dago

Introduction: Building for Madagascar

Dago, a product by the company Tarondro, is a rapidly growing ride-hailing and delivery platform designed specifically for the realities of Madagascar. The stakes are uniquely high. Network connections can be unstable, device capabilities vary drastically, and unit economics dictate the entire business model. In a market where profit margins are thin, paying standard cloud-computing rates per ride is simply not sustainable.

As the CTO of Tarondro, my role was to design a backend system that could handle thousands of concurrent users in real-time without inflating infrastructure costs. The challenge was clear. We needed enterprise-grade reliability on a startup budget.

The $0.003 Problem

Every ride-hailing business lives or dies by one experience: the customer watching a dot move on a map.

That dot is not a trick. Behind it sits a real-time data pipeline. GPS signals are collected from a rider's phone every few seconds, relayed across the internet, processed by a server, and pushed to the customer's screen fast enough to feel live.

When we ran the numbers, the results were clear. At 500 rides per day, our total infrastructure cost breaks down to an average of $0.003 per ride (calculated over 15,000 monthly rides). Under one cent. On a single $40/month server.

This is the architecture that gets you there, and, more importantly, the decisions behind it.


Why a $40 Single-Server VPS Rather Than Firebase + Google Maps?

Many startups are naturally drawn to managed services: in a few hours, you plug in Firebase, Google Maps, and the demo works.

The problem is that this initial speed rarely tells the whole story. As soon as real usage begins (more users, more real-time events, more local constraints), the simplicity of the demo can turn into unpredictable latency and a climbing bill.

Our decision wasn't ideological. It was economic and operational.

Our main constraint wasn't saving 2 weeks of development. It was guaranteeing a stable unit cost during the adoption phase.

At a glance:

| Criterion | Single-Server VPS (Our Choice) | Firebase + Google Maps |
| --- | --- | --- |
| Cost | Fixed initially (~$40–60/month) | Variable based on traffic and API calls |
| Predictability | High (capped budget) | Medium to low during growth |
| Local performance | End-to-end optimizable | Depends on multiple external services |
| Data | Full control (infra + logs) | Distributed across providers |
| Startup | More ops initially | Faster for prototyping |
| Lock-in | Low | Higher |

The real trade-off was simply:

  1. Go faster in demo, with a bill that can quickly become unpredictable.
  2. Invest a bit more in the backend, to keep a stable cost per ride.

For Dago, in an emerging market with tight margins, option 2 was more rational.

To be clear: we didn't reject Firebase or Google Maps. We simply deferred them until their product convenience brings more value than their marginal cost.

This choice also gives us a clean scaling path: as long as the cost per ride remains under control, we keep the architecture lean; when a business milestone is reached, we can selectively reintroduce managed services (traffic ETA, analytics, push) without rewriting the real-time core.

The Architecture Behind the Moving Dot

The Naive Approach Fails at Scale

The real problem isn't just "receiving GPS coordinates". It's preserving two things simultaneously: customer trust (a smooth dot, no lag) and company margins (a sustainable cost per ride).

Here is what most teams try first: every second, the rider's app sends GPS coordinates to the server via HTTP. The server saves them to the database. The customer's app polls the database every second for the latest position.
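In code, the customer side of that loop looks something like this (a deliberately naive sketch; the endpoint name and renderMarker helper are illustrative):

// The naive customer loop - a fresh HTTP request and a DB read every second
setInterval(async () => {
  const res = await fetch(`/api/rides/${rideId}/position`); // new handshake each time
  const pos = await res.json();                             // served by a DB query
  renderMarker(pos);                                        // hypothetical map helper
}, 1000);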

This works in a demo. In production, it kills you.

At 100 concurrent riders, that's 100 writes per second and 100 reads per second. This is just for location data, before any ride logic. At 500 riders, your database is under constant hammering, your server drowns in HTTP handshakes, and your cloud bill grows faster than your revenue.

In other words: the naive approach degrades the user experience and your margins at the same time. The thoughtful approach trades a little design work for an 85% cost reduction and better performance.

The Solution: The 5ms Pipeline

The architecture follows a single rule: use each tool for what it's best at.

💡 In plain terms: Think of HTTP polling like calling someone every 3 seconds to ask "where are you now?". WebSocket is like keeping the phone line open: the position arrives as soon as it changes.

Four blocks, four distinct roles:

  • WebSocket maintains a constantly open connection between the rider and the server. No HTTP handshake on every update: the door is already open, data flows freely.
  • Redis keeps every position in RAM with a native geospatial index. Practically, every GEOADD updates a rider's position in under a millisecond, and every GEOSEARCH finds the closest riders to a point in a single command. No SQL queries, no disk reads. Redis also instantly broadcasts every movement to the customer watching the ride via Socket.io rooms (see the sketch after this list).
  • PostgreSQL receives positions in batches every 30 seconds. It never sees the real-time firehose. Its role: archive trips for disputes, analytics, and billing.
  • OSRM, self-hosted on the same VPS, calculates routes and ETAs using the complete Madagascar road network. It's the equivalent of Google Maps Directions, but free and local. The data stays in Madagascar.
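A minimal sketch of the Redis side, the checkRateLimit and updatePosition calls used by the handler below. Key names are illustrative; the commands are standard Redis:

import Redis from 'ioredis';

const redis = new Redis(); // single instance, same VPS

// The "Redis TTL trick": SET NX EX succeeds at most once per 2-second window,
// so anything faster gets dropped before touching the geo index
async function checkRateLimit(riderId: string): Promise<boolean> {
  const ok = await redis.set(`ratelimit:${riderId}`, '1', 'EX', 2, 'NX');
  return ok === 'OK'; // null means the key already exists - reject this update
}

// GEOADD updates the in-RAM geospatial index - sub-millisecond
async function updatePosition(riderId: string, lat: number, lng: number): Promise<void> {
  await redis.geoadd('active_riders', lng, lat, riderId); // Redis takes lng first
}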

The result: a position update flows through the entire pipeline in under 5ms. At 500 concurrent riders, the system absorbs over 10,000 operations per minute without flinching, where a classic HTTP + PostGIS architecture would already be struggling at 100.

// The entire WebSocket handler - one location update, four outcomes
@SubscribeMessage('location:update')
async handleLocationUpdate(client: Socket, dto: LocationUpdateDto) {
  const riderId = client.data.userId;
 
  // 1. Drop if this rider sent an update under 2s ago (Redis TTL trick)
  if (!(await this.redisGeo.checkRateLimit(riderId))) return;
 
  // 2. Update position in Redis geo index (under 1ms)
  await this.redisGeo.updatePosition(riderId, dto.lat, dto.lng);
 
  // 3. Push to every customer watching this ride (instant)
  this.server.to(`ride:${dto.rideId}`).emit('location:update', dto);
 
  // 4. Buffer for batch DB write (NOT a direct insert)
  this.batchService.buffer(riderId, dto);
}
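And the batch side of step 4, roughly. This is a sketch, not our exact code: the rider_positions table and the TypeORM DataSource are assumptions.

import { Injectable } from '@nestjs/common';
import { Cron } from '@nestjs/schedule';
import { InjectDataSource } from '@nestjs/typeorm';
import { DataSource } from 'typeorm';

@Injectable()
export class BatchService {
  private queue: Array<{ riderId: string; lat: number; lng: number; ts: Date }> = [];

  constructor(@InjectDataSource() private readonly db: DataSource) {}

  // Called from the WebSocket handler: an in-memory append, zero I/O
  buffer(riderId: string, dto: { lat: number; lng: number }) {
    this.queue.push({ riderId, lat: dto.lat, lng: dto.lng, ts: new Date() });
  }

  // Every 30 seconds, drain everything into PostgreSQL as one multi-row INSERT
  @Cron('*/30 * * * * *')
  async flush() {
    if (this.queue.length === 0) return;
    const batch = this.queue.splice(0); // atomic drain - Node is single-threaded
    const rows = batch
      .map((_, i) => `($${i * 4 + 1}, $${i * 4 + 2}, $${i * 4 + 3}, $${i * 4 + 4})`)
      .join(', ');
    const params = batch.flatMap((p) => [p.riderId, p.lat, p.lng, p.ts]);
    await this.db.query(
      `INSERT INTO rider_positions (rider_id, lat, lng, recorded_at) VALUES ${rows}`,
      params,
    );
  }
}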

Technical Deep Dive

The 85% Bandwidth Reduction: Two-Speed Tracking

This is the single insight that changed our cost model the most.

Not every rider needs to be tracked at the same frequency. A rider sitting idle at a motorbike stand, waiting for a booking, doesn't need to send GPS updates every 3 seconds. Nobody is watching them. Their position for matching only needs to be approximate.

So we designed two tracking speeds:

| Rider state | GPS frequency | Bandwidth cost | Purpose |
| --- | --- | --- | --- |
| Idle (online, waiting) | Every 60 seconds | ~0.25 KB/min | Approximate position for matching |
| On mission (actively riding) | Every 3–10 seconds | ~7 KB/min | Real-time tracking for the customer |
| Offline | Nothing | $0 | Not tracked |

With 500 online riders and 50 on mission at any given time, we go from 10,000 messages/min (everyone at 3-second intervals) to 1,450 messages/min (450 idle riders × 1/min + 50 on mission × ~20/min). Same user experience. 85% fewer messages.

The moment a rider accepts a booking, their app switches to fast mode. The moment the ride ends, it drops back to the slow heartbeat. The customer never sees the mechanics.

// Flutter - the rider app adapts its GPS frequency to the mission state
Duration get _interval => switch (_state) {
  RiderState.offline   => Duration.zero,        // not tracked
  RiderState.idle      => Duration(seconds: 60), // cheap heartbeat
  RiderState.onMission => _getSpeedBasedInterval(), // 3-10s adaptive
};

But we went further. Even during a mission, the update rate adapts to the rider's speed:

| Speed | Interval | Why |
| --- | --- | --- |
| Stopped (traffic, red light) | 10 seconds | Position isn't changing |
| Slow traffic (under 5 m/s) | 4 seconds | Moderate precision needed |
| Moving fast (above 5 m/s) | 3 seconds | Maximum precision for smooth animation |

And the app won't send an update unless the rider has moved at least 10 meters. This prevents GPS jitter from flooding the server with meaningless data.
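Put together, the mission-mode rules fit in a few lines. Sketched in TypeScript for readability (the production client is Flutter); the "stopped" cutoff of 0.5 m/s is an assumption, the other thresholds come from the table above:

// Speed-adaptive interval, in milliseconds
function intervalForSpeed(speedMps: number): number {
  if (speedMps < 0.5) return 10_000; // stopped at a light: position isn't changing
  if (speedMps < 5) return 4_000;    // slow traffic
  return 3_000;                      // moving fast: smoothest animation
}

// Haversine distance in meters, used for the 10-meter movement gate
function movedMeters(a: { lat: number; lng: number }, b: { lat: number; lng: number }): number {
  const toRad = (d: number) => (d * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLng = toRad(b.lng - a.lng);
  const h = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLng / 2) ** 2;
  return 2 * 6_371_000 * Math.asin(Math.sqrt(h)); // Earth radius in meters
}

// GPS jitter filter: skip the send unless the rider actually moved
const shouldSend = (prev: { lat: number; lng: number } | null, next: { lat: number; lng: number }) =>
  prev === null || movedMeters(prev, next) >= 10;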


Finding the Nearest Rider in Under 1 Millisecond

When a customer requests a ride, we need to find the closest available rider. Not in 2 seconds. Not in 500 milliseconds. In under 1 millisecond, because dispatch speed is a competitive advantage.

The 60-second heartbeat positions are already stored in Redis using a geospatial index. This lets us ask "give me all riders within 5 km of this coordinate" as a single, sub-millisecond operation.

// One Redis command finds the 10 closest riders - instant
const nearby = await redis.call(
  'GEOSEARCH', 'active_riders',
  'FROMLONLAT', customerLng, customerLat,
  'BYRADIUS', 5, 'km', 'ASC', 'COUNT', 10
);

We then check which nearby riders are idle and send the ride offer to the closest one. If they decline, the next closest gets the offer. The whole matching decision happens in memory, with no database query.
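The loop itself is short. A sketch with hypothetical helpers (isIdle and offerAndAwaitAck stand in for our real ride-offer plumbing, and the 15-second offer timeout is an assumption):

declare function isIdle(riderId: string): Promise<boolean>;
declare function offerAndAwaitAck(riderId: string, rideId: string, timeoutMs: number): Promise<boolean>;

// nearby is already sorted closest-first by GEOSEARCH ... ASC
async function dispatchRide(rideId: string, nearby: string[]): Promise<string | null> {
  for (const riderId of nearby) {
    if (!(await isIdle(riderId))) continue;             // skip riders already on a mission
    if (await offerAndAwaitAck(riderId, rideId, 15_000)) {
      return riderId;                                   // first acceptance wins
    }
  }
  return null; // nobody accepted - widen the radius or tell the customer
}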

Why Not a More Sophisticated Matching Engine?

We studied how Uber's DISCO matching system works. It relies on Google S2 cells, consistent hashing, and a Ringpop gossip protocol for distributing millions of connections. Impressive engineering for millions of drivers.

For Dago's scale (hundreds of riders, not millions), Redis GEOSEARCH gives us the same result with one command instead of a distributed system:

| Our Approach | Uber's Approach | When We'd Upgrade |
| --- | --- | --- |
| Redis GEOSEARCH | S2/H3 hexagonal grid | 50K+ concurrent drivers (need sharding) |
| Single Redis instance | Consistent hashing across nodes | Multiple Redis servers needed |
| Sequential offer (closest first) | Batched optimization (ETA-weighted) | When lost rides from rejection become measurable |
| OSRM for road distance | Google Maps Directions API | When traffic-aware ETA becomes a differentiator |

The 60-second heartbeat gives positions at most 60 seconds old. Since a rider needs 2–5 minutes to reach the customer, this adds negligible error to matching. No need for a more complex system yet.


Smooth Animations Without a Google Maps Bill

The customer sees a rider icon gliding smoothly across the map. In reality, GPS updates arrive every 3–10 seconds. They do not arrive 60 times per second.

The trick is client-side interpolation. When a new GPS position arrives, the app doesn't jump the marker. It animates smoothly at 60 frames per second across the gap:

// Flutter - animate between GPS pings at 60fps
// `duration` is the expected gap to the next ping, in milliseconds
var elapsed = 0;
Timer.periodic(Duration(milliseconds: 16), (timer) {
  elapsed += 16;
  final t = (elapsed / duration).clamp(0.0, 1.0);
  final lat = start.latitude + (end.latitude - start.latitude) * t;
  final lng = start.longitude + (end.longitude - start.longitude) * t;
  _updateMarker(LatLng(lat, lng));
  if (t >= 1.0) timer.cancel(); // stop once the marker reaches the new ping
});

For mapping infrastructure, we avoided vendor lock-in entirely:

| Component | Provider | Cost |
| --- | --- | --- |
| Map tiles | Mapbox free tier | $0 (up to 50K web loads or 25K mobile MAUs) |
| Routing & ETA | OSRM self-hosted | $0 (Madagascar road data, runs on our VPS) |
| Geocoding | Nominatim | $0 (open-source) |

💡 In plain terms: OSRM is a free routing engine. It is like Google Maps directions, but we run it ourselves on our own server with Madagascar's full road network. It costs nothing, and the data stays in Madagascar.
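Querying it is a single HTTP call. The URL shape below is OSRM's standard route service; the host and port are deployment details:

// ETA between two points via the self-hosted OSRM route service
async function etaSeconds(
  from: { lat: number; lng: number },
  to: { lat: number; lng: number },
): Promise<number> {
  const url =
    `http://localhost:5000/route/v1/driving/` +
    `${from.lng},${from.lat};${to.lng},${to.lat}?overview=false`; // OSRM wants lng,lat
  const res = await fetch(url);
  const body = await res.json();
  if (body.code !== 'Ok') throw new Error(`OSRM error: ${body.code}`);
  return body.routes[0].duration; // seconds over Madagascar's road network
}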

The only cost that could scale is Mapbox once we exceed their generous free tiers (e.g., 50,000 map loads on web or 25,000 Monthly Active Users on mobile). Even then, the marginal cost remains fractions of a cent per ride at the revenue levels that would justify it.


A Server That Protects Itself

Running on a single VPS with 8GB of RAM means we can't auto-scale when traffic spikes. Instead, we built the server to degrade gracefully rather than crash.

Memory Budget: Every Byte Counts

| Component | RAM |
| --- | --- |
| OS + Docker overhead | ~800 MB |
| PostgreSQL | ~1.5 GB |
| Redis (geo + cache) | ~512 MB |
| NestJS app | ~512 MB |
| OSRM (Madagascar roads) | ~1.5 GB |
| Nginx | ~64 MB |

Out of 8 GB RAM: ~4.9 GB used, ~2 GB headroom for WebSocket connections, ~1.1 GB safety buffer.

With 2GB of headroom and ~20-50KB per WebSocket connection, there's comfortably room for 5,000+ concurrent connections before reaching any limits.

When memory usage exceeds a threshold, the server automatically switches to a more conservative update frequency - buying time without dropping connections. When connections approach the configured maximum, new connections are rejected with a friendly message rather than causing a silent crash:

// The server monitors itself every 10 seconds
@Cron('*/10 * * * * *')
async checkHealth() {
  const heapUsedMB = process.memoryUsage().heapUsed / 1024 / 1024;
  // Under pressure? Ask riders to send less frequently
  this.degradedMode = heapUsedMB > 400;
}
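The rejection side is equally small. A sketch of the gateway hook; the cap and messages here are illustrative:

const MAX_CONNECTIONS = 5_000; // illustrative cap, sized against the RAM headroom above

// NestJS gateway hook - runs for every new WebSocket connection
handleConnection(client: Socket) {
  if (this.server.engine.clientsCount > MAX_CONNECTIONS) {
    client.emit('server:busy', { message: 'Service saturated, please retry in a moment.' });
    client.disconnect(true); // refuse politely now instead of crashing everyone later
    return;
  }
  if (this.degradedMode) {
    // Under memory pressure: ask this client to slow its update rate
    client.emit('config:update', { minIntervalMs: 10_000 });
  }
}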

This is the VPS trade-off: $40/month flat instead of $200–2,000/month cloud auto-scaling. The constraints force better engineering decisions.


Reliable Delivery for Messages That Can't Be Lost

Location updates can be dropped. The next one replaces the last. But some messages cannot be lost: a ride offer, a cancellation, a payment confirmation.

For these, we implemented a lightweight at-least-once delivery pattern inspired by Uber's RAMEN messaging system. The server sends the message, waits for an acknowledgment, and retries if none arrives within 5 seconds:

// Server stores pending message, retries if rider doesn't ACK
await this.redis.setex(`pending:${riderId}:${seq}`, 30, payload);
server.to(`rider:${riderId}`).emit('ride:offer', message);
// After 5s with no ACK → retry. After 2 retries → next rider.

| Message Type | Reliable delivery? | Why |
| --- | --- | --- |
| ride:offer | ✅ Yes | Lost offer = lost revenue |
| ride:cancelled | ✅ Yes | Rider must know immediately |
| location:update | ❌ No | Next update replaces it anyway |
| eta:update | ❌ No | Stale ETAs get replaced quickly |

This matters most in Madagascar where Telma and Orange networks can be inconsistent. A rider missing a ride offer because of a 3-second network hiccup is lost revenue. The retry layer costs almost nothing and prevents it.
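A sketch of the retry loop itself (waitForAck and the declared shapes of redis and server are assumptions; the timings match the comment above: a 5-second ACK window per attempt, two retries):

declare const redis: { setex(key: string, ttl: number, val: string): Promise<unknown>; del(key: string): Promise<unknown> };
declare const server: { to(room: string): { emit(event: string, data: object): void } };
declare function waitForAck(riderId: string, seq: number, timeoutMs: number): Promise<boolean>;

// At-least-once delivery: 1 send + up to 2 retries, 5s ACK window each
async function sendReliable(riderId: string, seq: number, payload: object): Promise<boolean> {
  await redis.setex(`pending:${riderId}:${seq}`, 30, JSON.stringify(payload));
  for (let attempt = 0; attempt < 3; attempt++) {
    server.to(`rider:${riderId}`).emit('ride:offer', payload);
    if (await waitForAck(riderId, seq, 5_000)) {
      await redis.del(`pending:${riderId}:${seq}`); // delivered - clear the pending slot
      return true;
    }
  }
  return false; // give up here - the dispatcher moves on to the next rider
}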


The Business Case in One Table

| Stage | Daily rides | Concurrent riders | Monthly cost | Cost per ride |
| --- | --- | --- | --- | --- |
| MVP | 100 | 20 | $40–60 | $0.003 |
| Growth | 1,000 | 200 | $80–120 | $0.005 |
| Scale | 5,000 | 1,000 | $150–200 | $0.004 |

The cost per ride stays under one cent through the first several thousand daily users. Compare this to a naive cloud-first approach: $500–2,000/month at the Growth stage for the same workloads.

The difference isn't engineering complexity. It's design intentionality.


Scaling Path: No Rewrites, Just Configuration

The architecture has a clear upgrade path at every inflection point:

Phase 1:  Single VPS. Everything on one server ($40-60/mo)
              │  Signal: sustained 5K+ WebSocket connections
              ▼
Phase 2:  Split services. PostgreSQL + OSRM on second VPS ($80-120/mo)
              │  Signal: sustained 85%+ RAM usage
              ▼
Phase 3:  Multi-server. Add Redis adapter (1 line of code) ($150-200/mo)
              │  Signal: 50K+ concurrent connections
              ▼
Phase 4:  Cloud migration. Kubernetes, managed infra

The Phase 1→2 transition is one Docker configuration change. Phase 2→3 is literally adding one import:

// One line to go from single-server to multi-server
import { createAdapter } from '@socket.io/redis-adapter';
io.adapter(createAdapter(pubClient, subClient));

No rewrite. No migration. No architectural change.


What We Didn't Build (and What Went Wrong)

An honest architecture document includes the gaps.

"On-Demand" Tracking: A False Good Idea

Our first architecture was an elegant trap. The initial idea seemed brilliant: don't track any rider permanently. When a customer requests a ride, the server sends a silent push notification to all nearby riders. Their phones wake up, get a GPS fix, and send it back to the server. The server compares the responses and dispatches the ride to the closest one. Zero WebSocket connections. Zero cost when nobody is ordering. On paper, it's genius.

In practice, it falls apart for three reasons that only production reveals:

  1. Push notification latency is unpredictable. A silent notification via FCM (Firebase Cloud Messaging) takes anywhere from 500ms to 15 seconds to arrive, depending on the network, OS, and whether the phone is in Doze mode. The customer waiting for a rider sees a spinner for 5 to 15 seconds before matching even begins. On Telma and Orange networks in Antananarivo, it was often closer to 10 seconds. Unacceptable.

  2. The OS kills background apps. Android and iOS aggressively kill background apps to save battery. A silent notification assumes the app is still alive to receive it and execute code. On entry-level phones (the majority of our riders), the app was killed in minutes. The rider appeared "online" to the server, but their phone received nothing.

  3. GPS cold starts add 5 to 30 seconds. When the phone hasn't used the GPS recently, the first fix takes time. Notification (5s) + GPS cold start (10s) + server response (1s) = the customer waits roughly 16 seconds just for matching, not counting the rider's travel time.

Continuous tracking via WebSocket with the 60-second heartbeat costs a bit of bandwidth, but it eliminates all three problems at once. The rider is already connected, their position is already in Redis, and matching happens in sub-milliseconds. The lesson: a "zero cost at rest" architecture is worthless if it destroys the experience at the critical moment.

Accepted Trade-offs (For Now)

We skipped surge pricing zones. Uber uses H3 hexagonal grids to divide cities into demand cells for dynamic pricing. We considered it, but the volume doesn't justify it yet. When it does, the h3-js npm package allows us to map coordinates to hexagonal cells in Node.js, which we can then store in standard Redis keys. It's a clean migration from our current GEOSEARCH.
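For reference, that migration really is small. A sketch using h3-js (resolution 8, roughly 0.7 km² per cell, is an illustrative choice here, not a decision we've made):

import { latLngToCell } from 'h3-js';
import Redis from 'ioredis';

const redis = new Redis();

// Count ride requests per hexagonal cell over a rolling window -
// the raw input a surge-pricing rule would need
async function recordDemand(lat: number, lng: number): Promise<void> {
  const cell = latLngToCell(lat, lng, 8); // coordinate -> H3 cell id (a plain string)
  const key = `demand:${cell}`;
  await redis.incr(key);
  await redis.expire(key, 600); // 10-minute demand window
}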

We skipped traffic-aware ETA. Our OSRM routing gives accurate road-distance ETAs but doesn't account for live traffic conditions. For Antananarivo's traffic patterns, this means our ETAs can be off by 2–5 minutes during rush hour. Google Maps Traffic API would fix this. It costs $7 per 1,000 requests. We'll add it when the cost is justified by ride volume.

We chose JSON over Protobuf. It's fine for now. At our scale, JSON payloads cost us maybe 5x more bandwidth than binary serialization would. At 500 riders, that's the difference between 110 GB and 22 GB per month. Both remain comfortably within VPS bandwidth limits. At 10,000 riders, Protobuf becomes a serious optimization.

We underestimated GPS jitter. Early testing in Antananarivo showed GPS accuracy bouncing by 20–50 meters in dense urban areas. The 10-meter movement threshold catches most jitter, but riders waiting at a station sometimes appear to "drift" on the customer's map. A Kalman filter would smooth this. We haven't implemented it yet.

No monitoring stack. We deliberately skipped Grafana/Prometheus to save ~500MB of RAM on the VPS. Instead, we rely on docker stats and application logs. This works, but we've had two incidents where we only discovered memory pressure after users reported slow updates. A lightweight monitoring solution is next on the list.


What We Learned

Good architecture isn't about copying Uber's infrastructure or stacking trendy managed cloud services. It's about matching technology to your business constraints.

For Dago, this meant accepting that Madagascar's mobile networks aren't perfectly stable, and that a unit cost of half a cent was the project's condition for survival. Two-speed tracking, self-hosted routing, and the WebSocket + Redis pipeline are simply the technical translation of these constraints.

The transportable lesson for any product: every system has a "two-speed" optimization hiding in plain sight. Whether it's IoT sensors, analytics pipelines, or notification systems, most companies pay a premium to process data at high frequency when not all of it holds the same value.

Identify what doesn't require down-to-the-second resolution. Throttle it. Take your database out of the real-time loop. And watch your costs drop by 85%, without ever degrading your user experience.


If you're building a platform with similar constraints (real-time features, cost-conscious infrastructure, emerging market conditions), let's talk about your architecture. The first 30-minute session is free, no commitment.