On recovery scores, borrowed doubt, and the category error hiding in plain sight.
When Dustin Poirier was still competing, he stopped using recovery wearables during fight camp. Again and again, the same little verdict arrived: not ready, need rest, red, red, red. But there was a fight coming. A fight camp is one of the few contexts in modern life where the calendar tells the truth. The date does not care what the device thinks of your nervous system.
Poirier had to walk toward violence on command, and he decided the thing on his wrist was making him worse. So he stopped looking. Since retiring, he has said he uses the devices again.1
The lesson is not that rest is weakness. In the same conversation, Poirier talked about getting older and learning to take complete days out of the gym. That is the important complication. He was not a fool with a harder-is-better slogan. He knew rest mattered. What he rejected was not recovery. It was borrowed doubt. It was the transfer of authority from a fighter in a fight camp to a color on a screen.
Anyone who has checked a recovery score before training knows the feeling. You wake up with a plan. Nothing heroic. A run, a lift, a ride, whatever you said you were going to do when you were still thinking clearly. Your body feels ordinary: not perfect, not ruined, just yours. Then the app gives you a number, a color, a verdict. The same body becomes suspect. The same plan starts to seem irresponsible. A stiffness you would have ignored becomes evidence. A reluctance you would have walked through becomes wisdom. The score has not merely described the morning. It has edited it.
I have had that morning more than once. Recently I had a week of it. After a hard run, the score dropped into red the next morning. I trained anyway, though not with the noble clarity that sentence suggests. I looked at the score. I negotiated with it. I felt the small bureaucrat in me reaching for paperwork. A red score would make the decision cleaner. I would not be skipping the work. I would be following guidance.
In the end, I changed the work instead of canceling it. I moved the day from intensity to steady aerobic work on the bike. The score was still red the following day. I trained again. By midweek, it had started climbing back on its own.
Then I walked the same route I had walked before the hard run, at a comparable easy effort. My average heart rate was ten beats per minute lower. That does not prove the device was wrong in any grand scientific sense. It proves something more modest and more useful: the score would have changed my behavior during a window that, by my own later measure, did not require retreat. The device would not have made me wrong. It would have made me worse.
Not because the data were fake. Because of what the data were allowed to mean.
Start with the category error. Heart rate variability is a measurement, or at least a sensor-derived estimate trying to approximate one. So are resting heart rate, respiratory rate, skin temperature, sleep duration, and heart rate. "You are 34 percent recovered today" is not a measurement. It is an interpretation. It is a model's attempt to collapse several signals, each with its own noise and context, into a single behavioral message.2
The apps do not present that message as interpretation. They present it as discovery. Red, yellow, green. Ready, not ready. Push, pull back. The interface does not say: here is a proprietary synthesis of imperfect inputs, weighted by a method we do not fully disclose, producing a simplified output whose validity may vary by person, context, training status, illness, sensor fit, sleep timing, alcohol, and stress. It says red.
That simplicity is the product. The score relieves you of interpretation by performing interpretation on your behalf. One number, one color, one clean handoff from ambiguity to authority. Often it feels like two things at once: science on the surface, permission underneath.
Steven Pressfield called the force that rises against meaningful work Resistance.3 Its most dangerous forms do not usually arrive as cowardice. Cowardice is too easy to identify. Resistance works better when it sounds prudent, when it borrows the language of maturity, safety, self-care, optimization. In wearable culture, Resistance has found an unusually elegant costume. It appears as readiness. It appears as recovery. It appears as adaptive guidance. It arrives with charts.
Resistance wearing a lab coat.
This is why the recovery score is psychologically more interesting than a bad measurement. Bad measurements can be corrected. The deeper problem is not the data point but the alibi built around it. "I do not feel like training" is a weak excuse, and most disciplined people know how to argue with it. "My recovery is low" sounds responsible. It sounds adult. It sounds like obedience to evidence.
I know this talent because I have it: the old human gift for dignifying avoidance.
The score earns its keep when it catches something meaningful you would have missed and redirects you well. A fever starting to rise. An unusual respiratory rate. A pattern of accumulating strain you have refused to acknowledge. A body sending signals you are too stubborn, inexperienced, or compulsive to hear. Those cases exist, and they matter.
But most ordinary training decisions are not made at the extremes. They are made in the middle. A little tired. A little stiff. A little flat. Not sick, not injured, not clearly depleted, just not eager. On obvious days, you do not need the app. On ambiguous days, the app has power. And in the ambiguous middle, the easiest direction to move a plan is downward.
That is where the score gets into your head. It changes the felt meaning of sensation. A normal reluctance becomes information. A heavy morning becomes a warning. A stiff back becomes pathology. The device is not just reading the day. It is framing the day. It tells you what to fear before you have begun to move.
Sleep researchers gave the sleep version of this pattern a name: orthosomnia, the anxious pursuit of perfect sleep data.4 The tracker measures something real, or something close to real, and the measurement becomes an object of vigilance. A person no longer simply sleeps and wakes. Sleep becomes a performance under the eye of a device. Waking becomes audit.
A similar pattern shows up around readiness and recovery scores. Qualitative work on regular exercisers using WHOOP and Oura found that users often treated these scores as useful but limited, adjusting training and lifestyle around them while still emphasizing self-awareness, flexibility, and personal judgment.5 That is the reasonable position. The interesting fact is that people can know the score is limited and still feel moved by it. The score does not have to be believed absolutely to alter behavior. It only has to place a thumb on the scale at the moment of doubt.
Because the story arrives in the language of science, it carries unusual force. "I do not want to" sounds weak. "My recovery is low" has the grammar of prudence. It converts hesitation into compliance. That is what makes it so seductive. It does not tell you to quit. It tells you to be smart.
Broad consumer software has every reason to tilt toward caution. A company would rather be accused of telling you to rest too often than of pushing you into illness or injury. That may be appropriate. But the safety bias is not neutral. It trains the user to read every kind of bad morning through one dominant question: should I back off?
Yet the reasons for a bad score are not one thing. Travel can do it. Alcohol can do it. A hard session can do it. An argument can do it. A bad night with a child can do it. A sedentary week can do it. A block of useful training can do it. So can the beginning of genuine sickness. The app takes all that heterogeneity and compresses it into a generalized caution signal. Then the user, already trained by the interface, reads the signal through a single behavioral lens.
Back off.
Overtraining gives recovery culture its most dramatic cautionary tale. True overtraining is real. So are illness and injury. So is the athlete who needs to be protected from his own appetite for punishment. But that is not the most common story in the life of the ordinary ambitious adult. For that person, the threat is usually not too much discipline. The threat is drift. Missed days. Broken rhythm. The slow erosion of habit by plausible exceptions.
This is the part a red-yellow-green interface cannot express well enough: the right kind of movement often creates the conditions for recovery.
Not always. Not when you are feverish, injured, acutely sick, or plainly depleted. Not when the planned session is maximal and the available body is not. Not when fatigue has become a wall rather than a weather pattern. But more often than a recovery score can say, the answer to a bad state is not absence. It is a change of state.
A walk changes your nervous system. An easy ride changes your mood. A light lift changes your posture toward the day. A run done below the ego can restore appetite, sleep, confidence, and the ordinary sense that the body is an ally rather than a problem to be managed. Training is not merely something you are granted access to after a favorable reading. It is one of the mechanisms by which the reading improves.
This is where the composite score makes its deepest mistake. It can treat the symptom as a contraindication for the cure.
Low HRV after a stressful week may not mean you need less movement. It may mean you need the right kind of movement. A sluggish body made worse by poor sleep, stress, and sitting does not always want another day of managed stillness. It may want air, circulation, rhythm, warmth, contact with effort at a humane dose. That does not mean heroics. It means precision.
Redirect, don't retreat.
Even in trained athletes, autonomic data can punish simple interpretation. Le Meur and colleagues found increased parasympathetic modulation in functionally overreached endurance athletes, and noted that isolated HRV readings can miss training-induced autonomic change because of day-to-day variability.6 The point is not that low HRV is secretly good or that high HRV is secretly bad. The point is that these signals require context. A single score has to flatten context in order to exist.
The alternative is not ignorance. It is higher resolution.
Keep the measurements. They have value. But treat them as components, not commands. Put HRV next to resting heart rate, sleep, soreness, mood, motivation, life stress, recent training, the feel of the warmup, and the actual session planned. Low HRV with good mood and an easy aerobic day ahead means one thing. Low HRV with elevated temperature, a sore throat, and maximal intervals planned means another. The same number can belong to different lives.
The athlete-monitoring literature has long supported this kind of humility. A systematic review by Saw, Main, and Gastin found that subjective self-reported measures often reflected training load with greater sensitivity and consistency than commonly used objective measures.7 That should not surprise anyone who has lived in a body. Readiness is not located in one signal. It is distributed across physiology, perception, history, and intent.
The warmup is data. So are mood, soreness, recent training, life stress, and the work in front of you. Your own report is not a contaminant in the system. It is part of the system.
The honest use of wearable data is not as judge but as prompt. Not: what does the score say I am allowed to do? But: what kind of day is this, and what form of work fits it?
That difference matters because one question asks permission; the other interprets evidence. One makes the body answer to the dashboard; the other lets the dashboard enter the conversation without chairing the meeting.
Composite scores promise relief from the burden of judging for yourself. Interpretation is tiring. Self-trust is fragile. Decision-making has a cost. It is pleasant to be told, especially when being told permits the thing some part of you already wanted. A rest day can be necessary. It can also be convenient. The difficulty is learning the difference.
The more you outsource interpretation, the worse you become at noticing the boundary between warning and reluctance, illness and inertia, necessary recovery and ordinary resistance to effort. You lose contact with the grain of your own experience. And once that happens, the device is no longer measuring your training life. It is governing it.
Poirier took the device off because he could not afford the doubt. Most people are not preparing to be locked in a cage under bright lights. But the principle barely softens when the stakes look smaller. The work still has to be done. The habit still has to survive bad mornings. The body still has to learn that it can enter one state and leave in another.
Tomorrow, a lot of people will wake up and look at a score before they have looked honestly at themselves. Some of those scores will be useful. That is what makes the whole thing difficult. Resistance is never powerful because it is obviously absurd. It is powerful because it arrives sounding sensible, careful, evidence-based. And sometimes it is.
The question is not whether the device knows something. It does.
The question is whether it knows enough to be obeyed.
The screen lights up. The verdict arrives. The work is still there.