The Most Human Human (24 page)

Authors: Brian Christian

There’s a trade-off, of course, between the number of opportunities for interaction and response, on the one hand, and the sophistication of the responses themselves, on the other. The former thrives with brevity, the latter with length. It seemed to me, though, that so much of the difficulty and nuance in conversation comes from understanding the question and offering an appropriate response—thus it makes sense to maximize the number of interchanges.

Some judges, I would discover, would be startled or confused at this jumping of the gun, and I saw them pause, hesitate, yield, even start backspacing what they had half written. Other judges cottoned on immediately, and leaped right in after.[4]

In the first round of the 2009 contest, judge Shalom Lappin—computational linguist at King’s College London—spoke with Cleverbot, and then myself. My strategy of verbosity was clearly in evidence: I made 1,089 keystrokes in five minutes (3.6 keystrokes a second) to Cleverbot’s 356 (1.2/sec), and Lappin made 548 keystrokes (1.8/sec) in my conversation, compared to 397 (1.3/sec) with Cleverbot. Not only did I say three times as much as my silicon adversary, but I engaged the judge more, to the tune of 38 percent more typing from Lappin.
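The per-second rates and percentages quoted above can be rechecked with a few lines of arithmetic. This is only a sketch: the keystroke counts and the five-minute round length come from the passage, while the dictionary keys and the helper name `rate` are made up for illustration.

```python
# Keystroke counts quoted in the passage; the labels are illustrative only.
ROUND_SECONDS = 5 * 60  # a five-minute round, as stated in the text

keystrokes = {
    "confederate": 1089,
    "cleverbot": 356,
    "judge_with_confederate": 548,
    "judge_with_cleverbot": 397,
}


def rate(count, seconds=ROUND_SECONDS):
    """Average keystrokes per second over one round."""
    return count / seconds


# Rates rounded to one decimal place, as in the passage.
rates = {name: round(rate(n), 1) for name, n in keystrokes.items()}

# How much more the confederate typed than Cleverbot.
output_ratio = keystrokes["confederate"] / keystrokes["cleverbot"]

# How much more the judge typed when talking to the confederate.
judge_increase_pct = (
    keystrokes["judge_with_confederate"] / keystrokes["judge_with_cleverbot"] - 1
) * 100

print(rates)
print(round(output_ratio, 2), round(judge_increase_pct))
```

Under these inputs the rounded rates match the figures in the text, the output ratio comes to roughly three, and the judge's increase rounds to 38 percent.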

Looking back at the logs, though, I wanted to see if there was a way to quantify the fluidity of the human interactions against the rigidity of the machine ones. It occurred to me that you could create a benchmark—let’s call it “swaps”—for the number of times that the party who typed the most recent keystroke changes.
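One way to make the “swaps” benchmark concrete is a short counting routine. What follows is a minimal sketch, not the method actually used on the contest logs: it assumes a hypothetical log format of (party, key) pairs in the order typed, and it assumes the very first keystroke does not count as a swap, a convention the text does not spell out. The two sample logs are invented, not taken from the transcripts.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical log format: one (party, key) pair per keystroke, in typing order.
Keystroke = Tuple[str, str]


def count_swaps(log: Iterable[Keystroke]) -> int:
    """Count how often the party who typed the latest keystroke changes.

    Assumption: the first keystroke in the log is not itself a swap.
    """
    swaps = 0
    previous_party: Optional[str] = None
    for party, _key in log:
        if previous_party is not None and party != previous_party:
            swaps += 1
        previous_party = party
    return swaps


def typed(party: str, text: str) -> list:
    """Expand a string into one keystroke per character for a single party."""
    return [(party, ch) for ch in text]


# Invented example: strict turn-taking, each party finishes before the other starts.
turn_taking = typed("JUDGE", "hi") + typed("REMOTE", "ok") + typed("JUDGE", "so")

# Invented example: the same amount of typing, but interleaved key by key.
overlapping = [
    ("JUDGE", "h"), ("REMOTE", "o"), ("JUDGE", "i"),
    ("REMOTE", "k"), ("JUDGE", "s"), ("REMOTE", "!"),
]

print(count_swaps(turn_taking), count_swaps(overlapping))
```

Both sample logs hold six keystrokes, yet the interleaved one produces more swaps, which is the property the benchmark is meant to capture: it measures how the typing is interwoven rather than how much of it there is.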

I did manage to type three times as much as Cleverbot, but the real story, as it turns out, is in the swaps. Lappin’s conversation with Cleverbot had 33 such events; his conversation with me had 492, almost fifteen times that.

There’s no single obvious way to show, on the page, what that kind of interaction looks like (and this is partially the point, the gap between performance and transcript). His Cleverbot conversation, written in the traditional script style, looks a bit like this:

JUDGE: What language are you written in?

REMOTE: English.

JUDGE: Give me a break!

REMOTE: Ok.

JUDGE: How do you enjoy the conference?

Even if we notate it more like musical staves, keystroke by keystroke,[5] there’s very little more to the story. The keystrokes line up in tidy lines: statement/response/statement/response:

Four carriage returns, four swaps. And our conversation, as a script, looks like this:

REMOTE: sweet, picking up an instrument

JUDGE: I meant Stones, Dylan, Beatles …

But the musical-staff-looking keystroke logs look utterly unlike the Cleverbot logs, and they tell a much different story:

Two carriage returns, fifty-one swaps.

Alternately, we might try a third notation, which makes the difference even clearer: to string all the letters together, bolding the judge’s keystrokes and leaving the computer’s and my own unbolded. You get this from the human-computer dialogues:

And this from the human-human dialogues:

Now if that difference isn’t night and day, I don’t know what is. Over.

1. Some equations (the Newtonian parabolas that projectiles follow, for instance) are such that you can just plug in any old future value for time and get a description of the future state of events. Other calculations (e.g., some cellular automata) contain no such shortcuts. Such processes are called “computationally irreducible.” Future time values cannot simply be “plugged in”; rather, you have to run the simulation all the way from point A to point Z, including all intermediate steps. Stephen Wolfram, in A New Kind of Science, attempts to reconcile free will and determinism by conjecturing that the workings of the human brain are “irreducible” in this way: that is, there are no Newtonian-style “laws” that allow us shortcuts to knowing in advance what people will do. We simply have to observe them.

2. Linguists have dubbed this “back-channel feedback.”

3. Apparently the world of depositions is changing as a result of the move from written transcripts to video. After being asked an uncomfortable question, one expert witness, I was told, rolled his eyes and glowered at the deposing attorney, then shifted uncomfortably in his chair for a full fifty-five seconds, before saying, smugly and with audible venom, “I don’t recall.” He had the transcript in mind. But when a video of that conversation was shown in court, he went down in flames.

4. As Georgetown University linguist Deborah Tannen notes: “This all-together-now interaction-focused approach to conversation is more common throughout the world than our one-at-a-time information-focused approach.”

5. We’ll use “_” to mean a space, “
” to mean carriage return/enter, and “»” to mean backspace.

8. The World’s Worst Deponent

Body (&) Language

Language is an odd thing. We hear communication experts telling us time and again about things like the “7-38-55 rule,” first posited in 1971 by UCLA psychology professor Albert Mehrabian: 55 percent of what you convey when you speak comes from your body language, 38 percent from your tone of voice, and a paltry 7 percent from the words you choose.

Yet it’s that 7 percent that can and will be held against you in a court of law: we are held, legally, to our diction much more than we are held to our tone or posture. These things may speak louder than words, but they are far harder to transcribe or record. Likewise, it’s harder to defend against an accusation of using a certain word than it is to defend against an accusation of using a certain tone; also, it’s much more permissible for an attorney quoting a piece of dialogue to superimpose her own body language and intonation—because they cannot be reproduced completely accurately in the first place—than to supply her own diction.

It’s that same, mere 7 percent that is all you have to prove your humanity in a Turing test.

Lie Detection

One way to think about the Turing test is as a lie-detection test. Most of what the computer says—notably, what it says about itself—is false. In fact, depending on your philosophical bent, you might say that the software is incapable of expressing truth at all (in the sense that we usually insist that a liar must understand the meaning of his words for it to count as lying). I became interested, as a confederate, in examples where humans have to confront other humans in situations where one is attempting to obtain information that the other one doesn’t want to give out, or one is attempting to prove that the other one is lying.

One of the major arenas in which these types of encounters and interactions play out is the legal world. In a deposition, for instance, most any question is fair game—the lawyer is, often, trying to be moderately sneaky or tricky, the deponent knows to expect this, and the lawyer knows to expect them expecting this, and so on. There are some great findings that an attorney can use to her advantage—for example, telling a story backward is almost impossible if the story is false. (Falsehood would not appear to be as modular and flexible as truth.) However, certain types of questions are considered “out of bounds,” and the deponent’s attorney can make what’s called a “form objection.”

There are several types of questions that can be objected to at a formal level. Leading questions, which suggest an answer (“You were at the park, weren’t you?”), are out of bounds, as are argumentative questions (“How do you expect the jury to believe that?”), which challenge the witness without actually attempting to discover any particular facts or information. Other formally objectionable structures include compound questions, ambiguous questions, questions assuming facts not yet established, speculative questions, questions that improperly characterize the person’s earlier testimony, and cumulative or repetitive questions.

In the courtroom, verbal guile of this nature is off-limits, but it may be that we find this very borderline—between appropriate and inappropriate levels of verbal gamesmanship—is precisely the place where we want to position ourselves in a Turing test. The Turing test has no rules of protocol—anything is permissible, from obscenity to nonsense—and so interrogative approaches deemed too cognitively taxing or indirect or theatrical for the legal process may, in fact, be perfect for teasing apart human and machine responses.

Questions Deserving Mu

To take one example, asking a “simple” yes-or-no question might prompt an incorrect answer, which might provide evidence that the respondent is a computer. In 1995, a judge responded to “They have most everything on Star Trek” by asking, “Including [rock band] Nine Inch Nails?” The answer: an unqualified “Yes.” “What episode was that?” says the judge. “I can’t remember.” This line of questioning goes some way toward establishing that the interlocutor is just answering at random (and is thereby probably a machine that simply doesn’t understand the questions), but even so, it takes some digging to make sure that your conversant didn’t simply misunderstand what you asked, isn’t simply being sarcastic, etc.—all of which takes time.
