Making Valhalla Understand Monday Morning
It is Monday, 08:12. A delivery driver leaves the depot on a route the dispatcher built on Sunday afternoon. The routing engine that built it does not know that the arterial out of the warehouse jams between 08:00 and 08:30. The driver does. So does anyone who lives nearby. Every minute the engine does not know costs someone — the driver, the customer, the company.
Predicted traffic is the routing feature most people assume already works. In Valhalla, it can — once someone does the work to feed it well. Over the last several months at Globus, we have done that work.
This post is what we shipped, what we threw away, and why a deliberately small layer ended up mattering more than a flashy one.
What we shipped
Globus predicted traffic for Valhalla ships in two layers.
Layer 1 — Scalar coverage. A per-road day speed and night speed for the entire drivable graph — 41,231,017 directed edges — exported as Valhalla's free_flow / constrained_flow CSV.
Layer 2 — Time-of-week canary. A weekly speed profile encoded in Valhalla's 200-coefficient DCT format. Selective: 86,466 corridors across Germany, the Netherlands, Switzerland, Ukraine, and the US — only the ones where the GPS evidence is strong enough to carry a weekly curve.
The split is the point. Coverage everywhere; temporal detail where the data earns its place.
Why not weekly profiles on every road?
Because the data will not carry them.
Our source is matched GPS — over a billion observations in total, but the median road sees only a handful. Valhalla's DCT grid is 7 days × 24 hours × 12 five-minute buckets = 2,016 slots per week. Try to learn 2,016 numbers from five samples and you get a curve that looks rich and behaves like noise. The dispatcher gets reroutes for ghosts.
That is the most common failure mode in predicted traffic: confusing detail with accuracy. So the foundation we shipped first was deliberately boring — day speed and night speed per road, with shrinkage toward priors where local data is thin, and two reliability features (count-decayed dispersion, count-decayed coverage cap) that let strong roads trust themselves and weak roads lean on cohort behavior.
Not flashy. But it covers the whole graph and it does not lie.
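The shrinkage in that scalar layer can be sketched as a standard pseudo-count blend. This is an illustration, not the production estimator; the function name and the pseudo-count `k` are hypothetical (the post does not publish the actual value):

```python
def shrunk_speed(local_mean_kph, local_count, prior_kph, k=10.0):
    """Blend a road's observed mean speed with its cohort prior.

    k is a hypothetical pseudo-count: a road with exactly k observations
    is weighted 50/50 between its own data and the prior; a road with
    zero observations returns the prior unchanged.
    """
    return (local_count * local_mean_kph + k * prior_kph) / (local_count + k)
```

A road with no GPS samples leans entirely on its cohort; a road with hundreds effectively trusts itself. That is the whole mechanism behind "strong roads trust themselves and weak roads lean on cohort behavior."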
What we tried, and threw away
With Layer 1 stable, the temptation was to add temporal shape on top — discover rush-hour windows from the data, fit per-road interval models, ship a richer product.
We tried three approaches.
- A fixed ladder of 1-hour and 3-hour intervals: robust and interpretable, but real congestion does not always start on the hour.
- An adaptive tree that searched for data-driven splits: looked great on in-sample MAE, then collapsed under date-holdout validation — the trees were memorizing, not predicting.
- A contrast interval model designed for fast-slow-fast patterns: won on its own selected roads under production scoring, then lost on held-out reconstruction.
All three were plausible. All three failed the test that matters: predicting next Monday, not last month. We shelved them.
The work was not wasted. It gave us an evaluation harness that distrusts itself — date-holdout splits, selected-subset reconstruction, shuffled-timestamp negative controls, abstain-or-predict scoring. Every later candidate had to survive that gauntlet before it went near a tile.
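The shuffled-timestamp control is the simplest of those checks to show. A minimal sketch (all names here are illustrative, not the production harness): score a time-aware model against a flat baseline, then rescore it with timestamps randomly permuted. A model with a real temporal signal loses most of its edge under the shuffle; a model fitting noise does not drop, which is exactly how it gets caught.

```python
import numpy as np

def win_rate(pred_fn, speeds, hours):
    """Share of observations where the time-aware prediction beats a
    flat baseline (the overall mean speed)."""
    baseline = speeds.mean()
    pred = pred_fn(hours)
    return float(np.mean(np.abs(pred - speeds) < np.abs(baseline - speeds)))

def shuffled_control(pred_fn, speeds, hours, rng):
    """Same score with timestamps randomly permuted. If this stays high,
    the model's 'signal' does not actually depend on time of week."""
    return win_rate(pred_fn, speeds, rng.permutation(hours))
```

The gap between the real score and the shuffled score, not the real score alone, is the number that says whether a temporal model learned anything.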
What worked: cohort templates, anchored per road
The model that finally shipped was simpler than the ones we threw away.
Instead of teaching every road its own weekly shape, we grouped similar roads into cohorts — country + road class + density zone — and learned a shared normalized weekly template per cohort. For each road we anchor that template to the road's own scalar baseline, scaled by a strength term that grows with local data:
profile = scalar baseline × cohort weekly shape × strength
strength = support / (support + 20)
The cohort provides the rhythm. The road provides the level. Sparse roads abstain and fall back to scalar — silently, automatically.
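In code, one plausible reading of the formula above (our interpretation for illustration, not Valhalla source): strength scales the template's deviation from flat rather than the absolute speed, so a zero-support edge collapses exactly to its scalar baseline.

```python
import numpy as np

K = 20  # from the post: strength = support / (support + 20)

def edge_profile(baseline_kph, cohort_shape, support):
    """Anchor a cohort's normalized weekly template (mean ~ 1.0) to one
    edge's scalar baseline.

    strength scales the template's deviation from flat, so sparse edges
    fall back to the plain scalar speed. This is an interpretation of the
    post's formula, not production code.
    """
    strength = support / (support + K)
    shape = 1.0 + strength * (np.asarray(cohort_shape) - 1.0)
    return baseline_kph * shape
```

At support = 0 the profile is flat at the baseline, which is the silent abstain-and-fall-back behavior; at support = 180, strength is 0.9 and the edge carries nearly the full cohort rhythm.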
We evaluated after Valhalla's lossy 200-coefficient DCT compression, not before. A profile that looks great in memory but does not survive the round trip is not a product.
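The round-trip check can be sketched with a plain NumPy DCT. This is illustrative: Valhalla's actual coefficient encoding and normalization may differ, but the evaluation principle is the same — compress the 2,016 five-minute buckets to 200 coefficients, reconstruct, and score the reconstruction, never the in-memory profile.

```python
import numpy as np

BUCKETS = 7 * 24 * 12   # 2,016 five-minute slots per week
N_COEF = 200            # the coefficient budget named in the post

def dct2(x):
    """Unnormalized DCT-II of a 1-D signal."""
    n = np.arange(len(x))
    return (x * np.cos(np.pi / len(x) * (n + 0.5) * n[:, None])).sum(axis=1)

def idct2(coef, n_out):
    """Inverse of dct2 (a scaled DCT-III), accepting truncated coefficients."""
    n = np.arange(n_out)
    k = np.arange(len(coef))[:, None]
    basis = np.cos(np.pi / n_out * (n + 0.5) * k)
    return coef[0] / n_out + (2.0 / n_out) * (coef[1:] @ basis[1:])

def dct_roundtrip(profile, n_coef=N_COEF):
    """Keep the first n_coef DCT terms and reconstruct the weekly curve."""
    return idct2(dct2(profile)[:n_coef], len(profile))
```

Any candidate profile gets scored as `dct_roundtrip(profile)` against held-out speeds. A curve whose shape lives in the high frequencies loses exactly the detail that made it look good in memory.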
The numbers that mattered:
- 63.4% selected win rate vs the scalar layer on chosen roads
- +15.0 percentage points above a shuffled-timestamp control — the signal is not noise
- 88.9% of high-support roads beat scalar even after DCT compression — the format is enough
Mean per-road improvement is small in absolute terms. That is expected. Scalar already captures the average day and night. DCT adds the shape on top — and only where shape is worth adding.
The real test: routes, not metrics
A model that wins offline can still ship a worse routing engine. So before publishing the tiles, we ran 132 routes across DCT-heavy and risk-focused groups — motorway corridors, service roads, full-week-ready edges, known high-error edges — at five times of week: Monday 08:00, Monday 12:00, Monday 17:00, Friday 17:00, Sunday 03:00.
- 132 / 132 routes succeeded
- 0 catastrophic ETA jumps
- 0 trafficview profile artifacts
- 94 / 100 Monday-08:00 routes through DCT corridors arrived slower with DCT than without
That last number is the entire point. Scalar already knew the average daytime speed. DCT added the morning peak. Off-peak deltas stayed near zero — also exactly what we wanted. A time-of-week layer should be quiet when the time of week does not matter.
What this means if you ship routes
You get better ETAs on the roads where ETAs matter most, without the operational failure modes that make traffic layers risky:
- Every drivable edge still has a predicted speed — coverage does not regress.
- The DCT tier is a canary, not an irreversible global rollout. Rollback is one tile-set switch.
- Behavior is monotone where you would expect it. Rush-hour routes through covered corridors get slower; quiet Sunday routes barely move.
If you are routing a fleet, the difference at 08:00 on Monday shows up where you would predict: motorway approaches into dense cities, urban arterials during local peak, Friday afternoons in Western Europe. If you are routing on Sunday at 03:00, almost nothing changes — and that is the right answer.
Integration effort is zero. This layer ships inside our regular map and navigation data updates — Globus SDK customers get it on the next refresh, no API change, no new tiles to wire in. Pricing is unchanged: online routing is paid, offline navigation is free once the data is downloaded.
What's next
We have been climbing the same staircase for a while:
- default_speeds.json — per-country, per-road-class priors baked into the tiles, refined over the years from our own GPS data.
- Scalar predicted traffic — free_flow and constrained_flow per road. The current foundation.
- Selective DCT profiles — what landed this release, on 86,466 corridors.
- Wider DCT coverage — as the GPS evidence grows on more cohorts and countries.
- Live traffic — the real-time overlay on top of all of the above. That is the next chapter.
Each step has to earn its place before the next one ships. The DCT canary is narrow on purpose. We will expand the footprint when the data earns it — not before.
If you run Valhalla and want to evaluate the tiles, or you are choosing a routing engine and want to know what predicted traffic looks like under the hood — write us at [email protected].
