Autocorrect, Unexpurgated

I mention a certain writer in an email, and the reply comes back: “Comcast McCarthy??? Phoner novelist???”

Oops. Did I really call him “Comcast”?

No. The great god Autocorrect has struck again.

It is an impish god. I try retyping the name on a different device. This time the letters reshuffle themselves into “Format McCarthy.” Welcome to the club, Format. Meet the Danish astronomer Touchpad Brahe and the Franco-American actress Natalie Portmanteau.

In past times we were responsible for our own typographical errors. Now Autocorrect has taken charge. This is no small matter. It is a step in our new evolution—the grafting of silicon into our formerly carbon-based species, in the name of collective intelligence. Or unintelligence, as the case may be.

A few months ago the police in Hall County, Ga., locked down the West Hall schools for two hours after someone received a text message saying, “gunman be at west hall today.” The texter had tried to type “gunna,” but Autocorrect had a better idea.

Who’s the boss of our fingers? Cyberspace is awash with outrage. Even if hardly anyone knows exactly how it works or where it is, Autocorrect is felt to be haunting our cell phones or watching from the cloud. Peter Sagal, the host of NPR’s “Wait Wait … Don’t Tell Me,” complains via Twitter: “Autocorrect changed ‘Fritos’ to ‘frites.’ Autocorrect is effete. Pass it on.”

Its cultural status can be judged from the websites and blogs devoted to it, from the stream of whinging on Twitter, and from the appearance of the New Yorker’s first Autocorrect cartoon. (A hotdog vendor dashes to the pitcher’s mound; the manager looks at his handheld device and says: “Oh, I see what happened. Autocorrect changed ‘southpaw’ to ‘sauerkraut.’”)

Tweets the actor and author Stephen Fry: “Just typed ‘better than hanging around the house rating bisexuals’ to a friend. Thanks, autocorrect. Meant ‘eating biscuits.’”

We are collectively peeved. People blast Autocorrect for mangling their intentions. And they blast Autocorrect for failing to unmangle them. “Why so coy, iPhone?” asks the English writer Scarlett Thomas. “I type ‘fuckung’ and you really can’t think of any suggestions? Not one?”

I try to type “geocentric” and discover that I have typed “egocentric”; is Autocorrect making a sort of cosmic joke? I want to address my tweeps (a made-up word, admittedly, but that’s what people do). No: I get “twerps.” Some pairings seem far apart in the lexicographical space. “Cuticles” becomes “citified.” “Catalogues” turns to “fatalities” and “Iditarod” to “radiator.” What is the logic?

The logic is hard to discern, and consistency is for hobgoblins. Sometimes “Capistrano” may become “vapid tramp”; next time maybe “campus tramp.” Kathryn Schulz, the author of Being Wrong, tweets in verse:

Super fans
sweaty fans
sweaty dreams
sweet dreams.
Autocorrect train wreck over here.

Actually, an assortment of competing algorithms are at work. Autocorrect is not a single entity but a hodgepodge, from different vendors, chief among them Apple, Google and Microsoft. All their algorithms start with the low-hanging fruit. They know what to do when you type “hte.” After that, their goals vary, and so do their capabilities.

On mobile phones, where our elephant thumbs tramp across tiny keypads, the idea is to free us from backtracking and drudgery. The iPhone’s Autocorrect loves to insert apostrophes. You can rely on it: type “dont” and get “don’t.” Type “cant” and get “can’t”—but is that what you wanted? Autocorrect is just playing the odds. Even “ill” turns to “I’ll” and “id” to “I’d” (sorry, Dr. Freud).

When Autocorrect can reach out from the local device or computer to the cloud, the algorithms get much, much smarter. I consulted Mark Paskin, a longtime software engineer on Google’s search team. Where a mobile phone can check typing against a modest dictionary of words and corrections, Google uses no dictionary at all.

“A dictionary can be more of a liability than you might expect,” Paskin says. “Dictionaries have a lot of trouble keeping up with the real world, right?” Instead Google has access to a decent subset of all the words people type—“a constantly evolving list of words and phrases,” he says; “the parlance of our times.”

If you type “kofee” into a search box, Google would like to save a few milliseconds by guessing whether you’ve misspelled the caffeinated beverage or the former Secretary General. It uses a probabilistic algorithm with roots in work done at AT&T Bell Labs in the early 1990s. The probabilities are based on a “noisy channel” model, a fundamental concept of information theory. The model envisions a message source—an idealized user with clear intentions—passing through a noisy channel that introduces typos by omitting letters, reversing letters, or inserting letters …

“We’re trying to find the most likely intended word given the word that we see,” Paskin says. “Coffee” is a fairly common word, so with its vast corpus of text the algorithm can assign it a far higher probability than “Kofi.” On the other hand, the data show that spelling “coffee” with a K is a relatively low-probability error. The algorithm combines these probabilities. It also learns from experience and gathers further clues from the context.
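For the curious, the combining of probabilities can be sketched in a few lines of Python. This is a toy under loud assumptions, not Google's method: the word counts are invented, the three-word vocabulary is hypothetical, and the "channel" simply charges a fixed penalty per single-letter edit (a stand-in for a real error model learned from data). The helper names `edit_distance`, `channel_prob`, `prior` and `correct` are made up for illustration.

```python
from collections import Counter

# Invented counts standing in for a vast corpus (assumption, not real data).
WORD_COUNTS = Counter({"coffee": 9000, "kofi": 300, "toffee": 700})
TOTAL = sum(WORD_COUNTS.values())


def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def prior(word):
    """P(intended): how common the word is in the toy corpus."""
    return WORD_COUNTS[word] / TOTAL


def channel_prob(typed, intended, per_edit=0.01):
    """P(typed | intended): crude assumption that each edit costs a fixed factor."""
    return per_edit ** edit_distance(typed, intended)


def correct(typed):
    """Pick the word that maximizes P(intended) * P(typed | intended)."""
    return max(WORD_COUNTS, key=lambda w: prior(w) * channel_prob(typed, w))


print(correct("kofee"))
print(correct("kofi"))
```

In this toy, “kofee” sits two edits away from every candidate, so the channel is indifferent and the commoner word carries the day. Type “kofi” exactly and the channel charges nothing, which is enough to outweigh the word's rarity. A real system would presumably weigh specific errors differently (a K for a C is not the same slip as a dropped letter) and would look at the surrounding words as well.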

The same probabilistic model is powering advances in translation and speech recognition, comparable problems in artificial intelligence. In a way, to achieve anything like perfection in one of these areas would mean solving them all; it would require a complete model of human language.

But perfection will surely be impossible. We’re individuals. We’re fickle; we make up words and acronyms on the fly, and sometimes we scarcely even know what we’re trying to say.

One more thing to worry about: the better Autocorrect gets, the more we will come to rely on it. It’s happening already. People who yesterday unlearned arithmetic will soon forget how to spell. One by one we are outsourcing our mental functions to the global prosthetic brain.

I can live with that. We do it with memory, we do it with navigation; what the he’ll, let’s do it with spelling.

First published—slightly shorter—in the New York Times, August 4, 2012

James Gleick

Contact

Find me in the open social web (fediverse; Mastodon): @gleick@mas.to

Mastodon

Literary agent:
Michael Carlisle
at Inkwell Management,
521 Fifth Ave.,
New York 10175.

Or send a private message.
