Dopamine and reward prediction error: what the neuroscience actually says

Schultz, Dayan & Montague's landmark Science paper and what it implies for how streaming homepages keep you engaged.

4 min read·February 17, 2026

The popular framing of "dopamine hits" oversimplifies what the neuroscience actually established.

Schultz, Dayan & Montague's 1997 Science paper is the foundational reference. Drawing on single-unit recordings from primate midbrain dopamine neurons during conditioning experiments, the authors showed that these neurons do not encode reward itself but rather reward prediction error: the difference between received and expected reward.

Schultz, Dayan & Montague: "Dopamine neurons display a short-latency, phasic reward signal indicating the difference between actual and predicted reward. The signal is positive (activation) when reward exceeds prediction, no different from baseline when reward matches prediction, and negative (depression) when reward falls short of prediction." — Schultz, W., Dayan, P., & Montague, P. R. (1997). "A Neural Substrate of Prediction and Reward." Science, 275(5306), 1593–1599.
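The three cases in the quoted passage can be sketched numerically. The snippet below is a minimal illustration, assuming a simple Rescorla-Wagner-style update in which a single scalar expectation moves a fraction of the way toward each received reward. The function names, the learning rate, and the reward values are illustrative assumptions, not quantities from the paper.

```python
def prediction_error(expected: float, received: float) -> float:
    """Reward prediction error: received reward minus expected reward."""
    return received - expected


def update_expectation(expected: float, received: float,
                       learning_rate: float = 0.5) -> float:
    """Move the expectation a fraction of the way toward the received reward."""
    return expected + learning_rate * prediction_error(expected, received)


# The three cases from the quoted passage (values are arbitrary illustrations).
better = prediction_error(expected=0.5, received=1.0)   # positive: reward exceeds prediction
matched = prediction_error(expected=1.0, received=1.0)  # zero: reward matches prediction
worse = prediction_error(expected=1.0, received=0.0)    # negative: reward falls short

# With a fully predictable reward, repeated updates shrink the error toward zero.
expected = 0.0
errors = []
for _ in range(10):
    errors.append(prediction_error(expected, 1.0))
    expected = update_expectation(expected, 1.0)
```

In this toy model the error halves on every trial once the reward becomes predictable, which mirrors the "no different from baseline" case: a fully expected reward eventually produces no signal.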

This finding ties dopamine activity to computational models of reinforcement learning, and the paper remains one of the most cited in cognitive neuroscience. Its implication for engagement-optimized interfaces follows from the mechanism: any environment that produces frequent small reward prediction errors (moderate uncertainty about what comes next, with frequent better-than-expected outcomes) will keep engaging the dopaminergic system, regardless of the absolute quality of any single outcome.

A modern streaming homepage exhibits this structure. Surfaced titles vary in quality unpredictably; each session contains a mix of expected, better, and worse outcomes. The unpredictability — not the average quality — is what produces sustained engagement.
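A toy simulation can make the contrast between unpredictability and average quality concrete. The sketch below is an illustration under stated assumptions, not a model of any real recommender: it reuses a simple scalar-expectation learner, and the two "feeds", the learning rate, the trial count, and the name `late_mean_abs_error` are all hypothetical.

```python
import random


def late_mean_abs_error(rewards, learning_rate=0.1, window=500):
    """Average absolute prediction error over the last `window` trials."""
    expected = 0.0
    abs_errors = []
    for received in rewards:
        error = received - expected        # reward prediction error
        expected += learning_rate * error  # expectation tracks recent rewards
        abs_errors.append(abs(error))
    tail = abs_errors[-window:]
    return sum(tail) / len(tail)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_trials = 1000

# Two hypothetical feeds with the same expected quality (0.5 per title).
fixed_feed = [0.5] * n_trials                                      # every title is "fine"
variable_feed = [rng.choice([0.0, 1.0]) for _ in range(n_trials)]  # hit or miss

fixed_error = late_mean_abs_error(fixed_feed)
variable_error = late_mean_abs_error(variable_feed)
```

Under these assumptions, `fixed_error` decays to nearly zero once the learner has adapted, while `variable_error` stays well above zero because the hit-or-miss feed can never be fully predicted, even though both feeds have the same expected quality.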

The corresponding intervention has experimental support in the broader self-regulation literature: introduce a deliberate decision point that resets expectations. Two practical examples are disabling autoplay and pre-selecting a small candidate list of titles. Both interrupt the prediction-error loop and restore conscious choice.

References

Schultz, W., Dayan, P., & Montague, P. R. (1997). "A Neural Substrate of Prediction and Reward." Science, 275(5306), 1593–1599.
