
Ditto for the personal sphere. Don’t make the mistake of thinking that when “So, what else is new?” runs out of steam you’re fully “caught up” with someone. Most of what you don’t know about them has little if anything to do with the period between this conversation and your previous one.

Whether in speed dating, political debate, a phone call home, or dinner table conversation, I think information entropy applies. Questions as wide open and flat as the uniform distribution. We learn someone through little surprises. We can learn to talk in a way that elicits them.

Pleasantries are low entropy, biased so far that they stop being an earnest inquiry and become ritual. Ritual has its virtues, of course, and I don’t quibble with them in the slightest. But if we really want to start fathoming someone, we need to get them speaking in sentences we can’t finish.17
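
To put a number on that, here is a quick Python sketch (my own illustration, with invented answer distributions): the Shannon entropy of a ritual pleasantry next to that of a genuinely open question.

    import math

    def entropy(probs):
        """Shannon entropy, in bits, of a discrete distribution."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # A ritual pleasantry: "How are you?" with "fine" nearly guaranteed.
    pleasantry = [0.95, 0.03, 0.02]
    # A wide-open question: ten answers, all equally likely -- the
    # uniform distribution, which maximizes entropy.
    open_question = [0.1] * 10

    print(entropy(pleasantry))     # ~0.34 bits: you already knew the answer
    print(entropy(open_question))  # ~3.32 bits: log2(10), the maximum possible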

Lempel-Ziv; the Infant Brain; Redefining “Word”

In many compression procedures—most famously, one called the Lempel-Ziv algorithm—bits that occur together frequently get chunked into single units, which are called words. There might be more to that label than it would seem.
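
As a rough illustration of that chunking, here is a sketch of the LZ78 variant in Python (the toy input is mine): sequences that recur graduate into ever-longer dictionary entries, the compressor’s “words.”

    def lz78_parse(text):
        """LZ78-style parse: emit (index of longest known chunk, next char),
        growing a dictionary of chunks as we go."""
        dictionary = {"": 0}   # index 0 is the empty prefix
        output = []
        w = ""
        for c in text:
            if w + c in dictionary:
                w += c         # keep extending a chunk we've seen before
            else:
                output.append((dictionary[w], c))
                dictionary[w + c] = len(dictionary)   # a new, longer "word"
                w = ""
        if w:
            output.append((dictionary[w], ""))        # flush the tail
        return output, dictionary

    pairs, chunks = lz78_parse("the cat and the hat and the rat and the bat")
    print(sorted(chunks, key=len)[-3:])   # the longest chunks the parse "learned"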

It’s widely held by contemporary cognitive scientists that infants learn the words of their native language by intuiting which sounds tend, statistically, to occur together most often. I mentioned earlier that Shannon Game values tend to be highest at the starts of words, and lower at the ends: meaning that intra-word letter or syllable pairs have significantly lower entropy than inter-word pairs. This pattern may be infants’ first toehold on English, what enables them to start chunking their parents’ sound streams into discrete segments—words—that can be manipulated independently. Infants are hip to information entropy before they’re hip to their own names. In fact, it’s the very thing that gets them there. Remember that oral speech has no pauses or gaps in it—looking at a sound-pressure diagram of speech for the first time, I was shocked to see no inter-word silences—and for much of human history neither did writing. (The space was apparently introduced in the seventh century for the benefit of medieval Irish monks not quite up to snuff on their Latin.) This Shannon entropy spike-and-decay pattern (incidentally, this is also what a musical note looks like on a spectrogram), this downward sloping ramp, may be closer to the root of what a word is than anything having to do with the spacebar.18
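
A sketch of the segmentation cue, assuming a toy corpus and a simple letter-bigram model (both invented for illustration): strip out the spaces, as in real speech, and measure how surprising each letter transition is. The word boundary shows up as a spike.

    import math
    from collections import Counter

    corpus = ("the baby heard the pretty baby the baby saw "
              "pretty things the pretty baby heard things")
    stream = corpus.replace(" ", "")   # speech has no inter-word silences

    pair_counts = Counter(zip(stream, stream[1:]))   # adjacent-letter counts
    first_counts = Counter(stream[:-1])

    def surprisal(a, b):
        """Bits of surprise on seeing letter b right after letter a."""
        return -math.log2(pair_counts[(a, b)] / first_counts[a])

    test = "thebaby"
    for a, b in zip(test, test[1:]):
        print(f"{a} -> {b}: {surprisal(a, b):.2f} bits")
    # Within-word transitions (t->h, b->a) come out cheap; the jump across
    # the word boundary (e->b) is the costliest -- the spike an infant could
    # use to chop the stream into words.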

And we see the Lempel-Ziv chunking process not just in language acquisition but in language evolution as well. From “bullpen” to “breadbox” to “spacebar” to “motherfucker,” pairings that occur frequently enough fuse into single words.19 (“In general, permanent compounds begin as temporary compounds that become used so frequently they become established as permanent compounds. Likewise many solid compounds begin as separate words, evolve into hyphenated compounds, and later become solid compounds.”20) And even when the fusion isn’t powerful enough to close the spacebar gap between the two words, or even to solder it with a hyphen, it can often be powerful enough to render the phrase impervious to the twists of grammar. Certain phrases imported into English from the Norman French, for example, have stuck so closely together that their inverted syntax never ironed out: “attorney general,” “body politic,” “court martial.” It would seem that these phrases, owing to the frequency of their use, simply came to be taken, subliminally, as atomic, as—internal space be damned!—single words.
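
In corpus terms, the fusion candidates are just the most frequent adjacent pairs. A toy sketch (the sample sentence is mine):

    from collections import Counter

    words = ("put the bread box on the space bar near the bread box "
             "under the space bar beside the bread box").split()

    pair_counts = Counter(zip(words, words[1:]))
    for (a, b), n in pair_counts.most_common(4):
        print(f"{a} {b}: seen {n} times -> candidate compound {a}{b}")
    # A real collocation detector would also normalize by each word's own
    # frequency (pointwise mutual information), so that "the bread" doesn't
    # outrank "bread box".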

So, language learning works like Lempel-Ziv; language evolution works like Lempel-Ziv—what to make of this strange analogue? I put the question to Brown University cognitive scientist Eugene Charniak: “Oh, it’s much stronger than just an analogue. It’s probably what’s actually going on.”

The Shannon Game vs. Your Thumbs: The Hegemony of T9

I’m guessing that if you’ve ever used a phone to write words—and that is ever closer to being all of us now21—you’ve run up against information entropy. Note how the phone keeps trying to predict what you’re saying, what you’ll say next. Sound familiar? It’s the Shannon Game.

So we have an empirical measure, if we wanted one, of entropy (and maybe, by extension, “literary” value): how often you disappoint your phone. How long it takes you to write. The longer, arguably, and the more frustrating, the more interesting the message might be.
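
One could implement the measure directly. A minimal sketch, assuming a toy most-frequent-continuation predictor standing in for the phone’s real model:

    from collections import Counter, defaultdict

    training = "i am on my way home i am on the bus i am running late".split()
    nexts = defaultdict(Counter)
    for a, b in zip(training, training[1:]):
        nexts[a][b] += 1

    def predict(word):
        """The phone's guess: the most frequent continuation seen so far."""
        return nexts[word].most_common(1)[0][0] if nexts[word] else None

    def disappointment(message):
        """Fraction of words that defeat the predictor: a crude
        entropy score for the message."""
        words = message.split()
        misses = sum(predict(a) != b for a, b in zip(words, words[1:]))
        return misses / max(len(words) - 1, 1)

    print(disappointment("i am on my way home"))           # 0.0: the phone saw it coming
    print(disappointment("i am absconding to patagonia"))  # 0.75: mostly surprise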

As much as I rely on predictive text capabilities—sending an average of fifty iPhone texts a month, and now even taking down writing ideas on it22—I also see them as dangerous: information entropy turned hegemonic. Why hegemonic? Because every time you type a word that isn’t the predicted word, you have to (at least on the iPhone) explicitly reject the suggestion or else it’s (automatically) substituted. Most of the time this happens, I’m grateful: it smooths out typos made by mis-hitting the keyboard, which allows for incredibly rapid, reckless texting. But there’s the sinister underbelly—and this was just as true on my previous phone, a standard numerical-keypad phone running the T9 prediction algorithm. You’re gently and sometimes less-than-gently pushed, nudged, bumped into using the language the way the original test group did. (This is particularly true when the algorithm doesn’t adapt to your behavior, and many of them, especially the older ones, don’t.) As a result, you start unconsciously changing your lexicon to match the words closest to hand. Like the surreal word market in Norton Juster’s Phantom Tollbooth, certain words become too dear, too pricey, too scarce. That’s crazy. That’s no way to treat a language. When I type on my laptop keyboard into my word processor, no such text prediction takes place, so my typos don’t fix themselves, and I have to type the whole word to say what I intend, not just the start. But I can write what I want. Perhaps I have to type more keystrokes on average than if I were using text prediction, but there’s no disincentive standing between me and the language’s more uncommon possibilities. It’s worth it.

Carnegie Mellon computer scientist Guy Blelloch suggests the following:

One might think that lossy text compression would be unacceptable because they are imagining missing or switched characters. Consider instead a system that reworded sentences into a more standard form, or replaced words with synonyms so that the file can be better compressed. Technically the compression would be lossy since the text has changed, but the “meaning” and clarity of the message might be fully maintained, or even improved.

But—Frost—“poetry is what gets lost in translation.” And—doesn’t it seem—what gets lost in compression?

Establishing “standard” and “nonstandard” ways of using a language necessarily involves some degree of browbeating. (David Foster Wallace’s excellent essay “Authority and American Usage” shows how this plays out in dictionary publishing.) I think that “standard” English—along with its subregions of conformity: “academic English,” specific fields’ and journals’ style rules, and so on—has always been a matter of half clarity, half shibboleth. (That “standard” English is not the modally spoken version should be enough to argue for its nonstandardness, enough to argue that there is some hegemonic force at work, even if unwittingly or benevolently.)

But often within communities of speakers and writers, these deviations have gone unnoticed, let alone unpunished: if everyone around you says “ain’t,” then the idea that “ain’t” ain’t a word seems ridiculous, and correctly so. The modern, globalized world is changing that, however. If American English dominates the Internet, and British-originating searches return mostly American-originating results, then all of a sudden British youths are faced with a daily assault of u-less colors and flavors and neighbors like no other generation of Brits before them. Also, consider Microsoft Word: some person or group at Microsoft decided at some point which words were in its dictionary and which were not, subtly imposing their own vocabulary on users worldwide.23 Never before did, say, Baltimorean stevedores or Houston chemists have to care whether their vocabulary got the stamp of approval from Seattle-area software engineers: who cared? Now the vocabulary of one group intervenes in communications between members of other groups, flagging perfectly intelligible and standardized terms as mistakes. On the other hand, as long as you can spell it, you can write it (and subsequently force the dictionary to stop red-underlining it). The software doesn’t actually stop people from typing what they want.

That is, as long as those people are using computers, not phones. Once we’re talking about mobile phones, where text prediction schemes rule, things get scarier. In some cases it may be literally impossible to write words the phone doesn’t have in its library.

Compression, as noted above, relies on bias—because making expected patterns easier to represent necessarily makes unexpected patterns harder to represent. The yay-for-the-consumer ease of “normal” language use also means there’s a penalty for going outside those lines. (For a typewriter-written poem not to capitalize the beginnings of lines, the beginnings of sentences, or the word “I” may be either a sign of laziness or an active aesthetic stand taken by the author—but for users subject to auto-”correction,” it can only be the latter.)
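
The trade-off is visible in any optimal code. A small sketch using Huffman coding (the sample text is invented): common symbols buy their short codes at the expense of the rare ones.

    import heapq
    from collections import Counter

    def huffman_lengths(freqs):
        """Build an optimal prefix code; return each symbol's code length in bits."""
        heap = [(n, i, {sym: 0}) for i, (sym, n) in enumerate(freqs.items())]
        heapq.heapify(heap)
        tiebreak = len(heap)
        while len(heap) > 1:
            n1, _, d1 = heapq.heappop(heap)
            n2, _, d2 = heapq.heappop(heap)
            merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
            heapq.heappush(heap, (n1 + n2, tiebreak, merged))
            tiebreak += 1
        return heap[0][2]

    freqs = Counter("expected patterns are cheap " * 40 + "qoph")
    lengths = huffman_lengths(freqs)
    print(lengths["e"], lengths["q"])   # frequent 'e': short code; rare 'q': long one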

The more helpful our phones get, the harder it is to be ourselves. For everyone out there fighting to write idiosyncratic, high-entropy, unpredictable, unruly text, swimming upstream of spell-check and predictive auto-completion: Don’t let them banalize you. Keep fighting.24

Compression and the Concept of Time

Systems which involve large amounts of data that go through relatively small changes—a version control system, handling successive versions of a document, or a video compressor, handling successive frames of a film—lend themselves to something called “delta compression.” In delta compression, instead of storing a new copy of the data each time, the compressor stores only the original, along with files of the successive changes. These files are referred to as “deltas” or “diffs.” Video compression has its own sub-jargon: delta compression goes by “motion compensation,” fully stored frames are “key frames” or “I-frames” (intra-coded frames), and the diffs are called “P-frames” (predictive frames).
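
A minimal sketch of the idea, using Python’s standard difflib as the diff engine (the class and method names are mine): keep the first version whole, store only deltas afterward, and reconstruct any version by replaying them.

    import difflib

    class DeltaStore:
        """Keep the first version whole; store only diffs for the rest."""
        def __init__(self, base_lines):
            self.base = list(base_lines)
            self.deltas = []   # one ndiff per successive version

        def commit(self, new_lines):
            latest = self.version(len(self.deltas))
            self.deltas.append(list(difflib.ndiff(latest, list(new_lines))))

        def version(self, n):
            """Rebuild version n by replaying diffs from the base onward."""
            lines = self.base
            for delta in self.deltas[:n]:
                lines = list(difflib.restore(delta, 2))
            return lines

    store = DeltaStore(["the cat sat", "on the mat"])
    store.commit(["the cat sat", "on the hat"])
    store.commit(["the bat sat", "on the hat"])
    print(store.version(2))   # ['the bat sat', 'on the hat']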

The idea, in video compression, is that most frames bear some marked resemblance to the previous frame—say, the lead actor’s mouth and eyebrow have moved very slightly, but the static background is exactly the same—thus instead of encoding the entire picture (as with the I-frames), you just (with the P-frames) encode the diffs between the last frame and the new one. When the entire scene cuts, you might as well use a new I-frame, because it bears no resemblance to the last frame, so encoding all the diffs would take as much space as or more than just encoding the new image itself. Camera edits tend to contain the same spike and decay of entropy that words do in the Shannon Game.

As with most compression, lowered redundancy means increased fragility: if the original, initial document or key frame is damaged, the diffs become almost worthless and all is lost. In general, errors or noise tends to stick around longer. Also, it’s much harder to jump into the middle of a video that’s using motion compensation, because in order to render the frame you’re jumping to, the decoder must wheel around and look backward for the most recent key frame, prepare that, and then make all of the changes between that frame and the one you want. Indeed, if you’ve ever wondered what makes streamed online video behave so cantankerously when you try to jump ahead in it, this is a big part of the answer.25
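
Here is the whole scheme in miniature, with frames reduced to flat lists of pixel values (the threshold and names are invented for illustration): store only a P-frame’s changed pixels, fall back to an I-frame when the diff would cost more than the frame itself, and seek by rewinding to the last I-frame and replaying forward.

    def encode(frames):
        """Toy motion compensation: P-frames hold only changed pixels;
        cut to a fresh I-frame when the diff costs more than the frame."""
        encoded, prev = [], None
        for frame in frames:
            diff = (None if prev is None else
                    {i: p for i, (p, q) in enumerate(zip(frame, prev)) if p != q})
            if diff is None or len(diff) * 2 >= len(frame):   # scene cut
                encoded.append(("I", list(frame)))
            else:
                encoded.append(("P", diff))
            prev = frame
        return encoded

    def seek(encoded, n):
        """Decode frame n: rewind to the latest I-frame, then replay forward."""
        start = max(i for i in range(n + 1) if encoded[i][0] == "I")
        frame = list(encoded[start][1])
        for _, diff in encoded[start + 1 : n + 1]:
            for i, p in diff.items():
                frame[i] = p
        return frame

    frames = [[0] * 8, [0] * 7 + [9], [0] * 7 + [9], [5] * 8]   # last frame: a cut
    enc = encode(frames)
    print([kind for kind, _ in enc])   # ['I', 'P', 'P', 'I']
    print(seek(enc, 2))                # [0, 0, 0, 0, 0, 0, 0, 9]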

But would it be going too far to suggest that delta compression is changing our very understanding of time? The frames of a film, each bumped downward by the next; the frames of a View-Master reel, each bumped leftward by the next … but these metaphors for motion—each instant in time knocked out of the present by its successor, like bullet casings kicked out of the chamber of an automatic weapon—don’t apply to compressed video. Time no longer passes. The future, rather than displacing it, revises the present, spackles over it, touches it up. The past is not the shot-sideways belt of spent moments but the blurry underlayers of the palimpsest, the hues buried in overpainting, the ancient Rome underfoot of the contemporary one. Thought of in this way, a video seems to heap upward, one infinitesimally thin layer at a time, toward the eye.

Diffs and Marketing, Personhood

A movie poster gives you one still out of the 172,800-ish frames that make up a feature film, a billboard distills the experience of a week in the Bahamas to a single word, a blurb tries to spear the dozen hours it will take to read a new novel using a trident of just three adjectives. Marketing may be lossy compression pushed to the breaking point. It can teach us things about grammar by cutting the sentence down to its key word. But if we look specifically at the way art is marketed, we see a pattern very similar to I-frames and P-frames; but in this case, it’s a cliché and a diff. Or a genre and a diff.

When artists participate in a stylistic and/or narrative tradition (which is always), we can—and often do—describe their achievement as a diff. Your typical love story, with a twist: ____________________. Or, just like the sound of ________ but with a note of _____________. Or, ________ meets _______.

Children become diffs of their parents. Loves, diffs of old loves. Aesthetics become diffs of aesthetics. This instant: a diff of the one just gone.

Kundera:

What is unique about the “I” hides itself exactly in what is unimaginable about a person. All we are able to imagine is what makes everyone like everyone else, what people have in common. The individual “I” is what differs from the common stock, that is, what cannot be guessed at or calculated, what must be unveiled.
