Language Does More: Streaming
There is a scene that exists in over a hundred different streaming shows.
A character has just received devastating news. They pause. They look into the middle distance. Then another character says, quietly: “Hey. Talk to me.” Or: “I’m right here.” Or: “You’re not alone.”
The specific words vary slightly but the rhythm, the emotional register, or even the length of the line? Almost never.
We’ve all probably experienced a vague sense of déjà vu while watching otherwise very different shows, this is not our imagination. It is the sound of language being optimized.
Behind the scenes of modern streaming, something unprecedented is happening to the way scripts are written and it has to do with data science and linguistics. The platforms that now dominate how the world watches are measuring everything: when you pause, when you skip, and when you abandon an episode entirely. And the language of the shows those platforms develop is increasingly shaped by what those numbers reward.
Netflix is the most transparent about what its data apparatus looks like. The platform collects behavioral signals from over 270 million subscribers including viewing history, completion rates, rewatch frequency, and even where in an episode people stop watching and never return. That data moves quickly from dashboards into creative discussions. As Netflix’s own engineering team has described it: if analytics shows that users consistently abandon a series after episode two, it doesn’t sit in a report but feeds directly into conversations about pacing, structure, and storytelling hooks.
NLP algorithms are applied even earlier in the process. According to a 2026 analysis of Netflix’s AI strategy, the platform uses natural language processing tools to scan early-stage scripts, evaluating elements like emotional arcs, character dialogue balance, and scene transitions, and comparing them against high-performing content in the same genre. Scripts are, in a very real sense, being read by machines before they are read by humans and the machines have opinions.
“Netflix told a director: can we get a big action set piece in the first five minutes? And it wouldn’t be terrible if you reiterated the plot three or four times in the dialogue — because people are on their phones while they’re watching.”
This quote is from a filmmaker speaking to researcher Stephen Follows, whose data-driven studies of film and television have revealed what the aggregate numbers actually show. Analyzing subtitle files from over 61,500 films, Follows found that modern films use significantly more lines of dialogue to convey the same or slightly greater information than older films because the sentences themselves have gotten shorter. The rhythm is more broken up. Dialogue arrives in smaller, faster doses. The monologue, the slow-burn speech, the scene that breathes — data suggests audiences leave before they end.
In a separate study, Follows tracked dramatic intensity in film dialogue from 1940 to the present, using linguistic signals within subtitle files including high-arousal vocabulary, punctuation pressure, and emotional urgency. His findings show that between 1940 and 2022, the share of runtime spent in high-intensity dialogue rose by 21%, while the “downshift index” (how much films ease off emotional pressure in their middle acts) fell by 35%. Once pressure enters, it tends to stay.
Linguists studying film discourse have found corroborating patterns across decades of screenplay language. A diachronic study of English film dialogue found that the standard register intensifiers of earlier eras i.e., words like terribly, awfully, perfectly have steadily declined, replaced by informal, maximal-intensity expressions that read as more raw, more immediate, and more emotionally legible at speed. The trend, researchers note, reflects what audiences increasingly expect from scripted content: the emotional texture of authentic face-to-face conversation, compressed and delivered fast.
None of this is secret, but the cumulative effect is rarely named out loud. When every major platform is running its scripts through the same class of analytical models, rewarding the same engagement metrics, and optimizing for the same viewer behavior patterns, the result becomes a converging dialect. Plots accelerate to the same rhythm. Emotional scenes deploy the same spare, low-complexity lines. Characters in wildly different genres end up speaking at roughly the same register because that register has been, empirically, the one that keeps people watching. This is the tension at the heart of data-driven storytelling.
The same analytical tools that help a platform understand which shows lose viewers after the second episode also, gradually, ensure that every show starts to sound like the one that didn’t.
Linguistic diversity in scripted entertainment like the idiosyncratic voice, the unhurried pace, or the scene that trusts silence is exactly what engagement metrics are most likely to flag as a risk. The irony is sharp: the industry has never had more data about what audiences respond to, and yet many viewers describe a growing sense that something intangible is missing. What they may be sensing is language itself being slowly optimized into fluency and out of surprise.

