The problem with the benchmark
Most recovery tools answer the wrong question. They tell a client how they compare to a population: a recovery score against an age band, an HRV against a normative table, a sleep figure against a recommended eight hours. The comparison feels rigorous because it produces a number, but for a coach managing a real person it is mostly noise.
Physiological markers vary enormously between individuals. Resting HRV can differ several-fold between two healthy, equally fit people for reasons that have nothing to do with readiness: genetics, measurement site, age, even the time of day the wearable took its sample. A client sitting in the 30th percentile of a population table may be perfectly recovered. A client in the 80th may be quietly accumulating fatigue. The benchmark cannot tell the difference, because it never knew the person to begin with.
What the baseline actually measures
The useful question is not "how does this client compare to others" but "how does this client compare to themselves." A within-person baseline, a rolling estimate of each marker's normal range built from that individual's own history, turns a meaningless absolute into a meaningful deviation. On its own, an HRV of 70 is neither good nor bad. An HRV of 70 for someone who has been steady at 92 for two months is a signal worth acting on.
This is the standard approach in the physiology literature. Day-to-day interpretation of heart rate variability is done against an individual's rolling mean and normal variation, not against group norms, precisely because the between-person variance swamps the within-person signal. The same logic holds for resting heart rate, sleep duration, and training load. The reference point that carries information is the person's own trailing distribution.
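As a rough sketch of what reading against a trailing distribution can look like, the Python below scores a reading against a client's own rolling mean and standard deviation. The function name, the 28-reading window, and the sample values are illustrative assumptions, not a description of how any particular tool computes its baseline.

```python
from statistics import mean, stdev


def baseline_deviation(history, today, window=28):
    """Score today's reading against the client's own trailing window.

    history: that client's prior readings, oldest first
    today:   the new reading to interpret
    window:  how many trailing readings define "normal" (assumed value)
    Returns (baseline_mean, baseline_sd, z), where z is the deviation
    expressed in units of the client's own normal variation.
    """
    recent = history[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two prior readings for a baseline")
    mu = mean(recent)
    sd = stdev(recent)
    # A perfectly flat history has no variation to scale by.
    z = (today - mu) / sd if sd > 0 else 0.0
    return mu, sd, z


# Two hypothetical clients who both read 70 today.
steady_at_92 = [90, 94, 91, 93, 92, 95, 89, 92]
steady_at_70 = [68, 72, 69, 71, 70, 73, 67, 70]

_, _, z_a = baseline_deviation(steady_at_92, 70)  # z_a == -11.0
_, _, z_b = baseline_deviation(steady_at_70, 70)  # z_b == 0.0
```

The same reading of 70 lands eleven standard deviations below one client's normal and exactly on the other's, which is the difference a population table cannot see.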
Why deviation, not value, drives the decision
A coach does not need to know that a client's restoration is 77 percent. They need to know that it dropped from a steady mid-80s to 77 over four days, that the drop coincided with two short nights, and that training load did not change. That pattern, a metric falling below its own baseline while the usual explanations stay flat, is the thing that should change the session. The raw value is just the surface.
This is why Harlen reads every metric against the client's own normal and surfaces the deviation, not the number. A reading below baseline gets flagged; a reading that looks low on a population table but is normal for that person stays quiet. The goal is to spend your attention only where something actually changed.
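To make the idea concrete, here is a minimal sketch of deviation-based flagging. It is not Harlen's actual implementation: the metric names, the sample histories, and the two-standard-deviation threshold are all assumptions chosen for illustration, and the sketch flags movement in either direction for simplicity.

```python
from statistics import mean, stdev


def split_by_deviation(histories, today, threshold=2.0):
    """Separate today's readings into flagged and quiet metrics.

    histories: metric name -> list of this client's prior readings
    today:     metric name -> today's reading
    threshold: assumed cutoff, in standard deviations of the client's
               own history, beyond which a metric counts as having moved
    """
    flagged, quiet = {}, {}
    for name, past in histories.items():
        mu, sd = mean(past), stdev(past)
        z = (today[name] - mu) / sd if sd > 0 else 0.0
        if abs(z) >= threshold:
            flagged[name] = round(z, 1)
        else:
            quiet[name] = round(z, 1)
    return flagged, quiet


# Hypothetical client: restoration steady in the mid-80s, now 77,
# after a short night, with training load unchanged.
histories = {
    "restoration_pct": [85, 86, 84, 85, 87, 85, 83, 85],
    "sleep_hours": [7.5, 7.4, 7.6, 7.5, 7.3, 7.7, 7.5, 7.5],
    "training_load": [400, 420, 410, 390, 405, 415, 395, 405],
}
today = {"restoration_pct": 77, "sleep_hours": 5.5, "training_load": 405}

flagged, quiet = split_by_deviation(histories, today)
print("flagged:", sorted(flagged))  # restoration and sleep have moved
print("quiet:", sorted(quiet))      # training load sits on its baseline
```

In this example restoration and sleep are surfaced together while training load stays quiet, which mirrors the pattern described above: the metric fell, sleep explains part of it, and load does not.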
The practical version
If you take one thing from this: stop asking whether a number is high or low, and start asking whether it has moved relative to that client's recent history. The benchmark tells you about a population you are not coaching. The baseline tells you about the person in front of you.
