Authors: Benedict Carey
And that’s Winston Churchill.
The next thing to say is less obvious, though it’s rooted in a far more common type of blown test. We open the booklet and see familiar questions on material we’ve studied, stuff we’ve highlighted with yellow marker: names, ideas, formulas we could recite with ease only yesterday. No trick questions, no pink elephants, and still we lay an egg. Why? How? I did so myself on one of the worst possible days: a trigonometry final I needed to ace to get into an Advanced Placement course, junior year. I spent weeks preparing. Walking into the exam that day, I remember feeling pretty good. When the booklets were handed out, I scanned the questions and took an easy breath. The test had a few of the concepts I’d studied, as well as familiar kinds of questions, which I’d practiced dozens of times.
I can do this, I thought.
Yet I scored somewhere in the low 50s, in the very navel of average. (These days, a score like that would prompt many parents to call a psychiatrist.) Who did I blame? Myself. I knew the material but didn’t hear the music. I was a “bad test taker.” I was kicking myself—but for all the wrong reasons.
The problem wasn’t that I hadn’t worked hard enough, or that I lacked the test taking “gene.” No, my mistake was misjudging the depth of what I knew. I was duped by what psychologists call fluency, the belief that because facts or formulas or arguments are easy to remember right now, they’ll remain that way tomorrow or the next day. The fluency illusion is so strong that, once we feel we’ve nailed some topic or assignment, we assume that further study won’t help. We forget that we forget. Any number of study “aids” can create fluency illusions, including (yes) highlighting, making a study guide, and even chapter outlines provided by a teacher or a textbook. Fluency misperceptions are automatic. They form subconsciously and make us poor judges of what we need to restudy, or practice again. “We know that if you study something twice, in spaced sessions, it’s harder to process the material the second time, and so people think it’s counterproductive,” as Nate Kornell, a psychologist at Williams College, told me. “But the opposite is true: You learn more, even though it feels harder. Fluency is playing a trick on judgment.”
So it is that we end up attributing our poor test results to “test anxiety” or—too often—stupidity.
Let’s recall the Bjorks’ “desirable difficulty” principle: The harder your brain has to work to dig out a memory, the greater the increase in learning (retrieval and storage strength). Fluency, then, is the flipside of that equation. The easier it is to call a fact to mind, the smaller the increase in learning. Repeating facts right after you’ve studied them gives you nothing, no added memory benefit.
The fluency illusion is the primary culprit in below-average test performances. Not anxiety. Not stupidity. Not unfairness or bad luck.
Fluency.
The best way to overcome this illusion and improve our testing skills is, conveniently, an effective study technique in its own right. The technique is not exactly a recent invention; people have been employing it since the dawn of formal education, probably longer. Here’s the philosopher Francis Bacon, spelling it out in 1620: “If you read a piece of text through twenty times, you will not learn it by heart so easily as if you read it ten times while attempting to recite it from time to time and consulting the text when your memory fails.” And here’s the irrepressible William James, in 1890, musing about the same concept: “A curious peculiarity of our memory is that things are impressed better by active than by passive repetition. I mean that in learning—by heart, for example—when we almost know the piece, it pays better to wait and recollect by an effort from within, than to look at the book again. If we recover the words in the former way, we shall probably know them the next time; if in the latter way, we shall very likely need the book once more.”
The technique is testing itself. Yes, I am aware of how circular this logic appears: better testing through testing. Don’t be fooled. There’s more to self-examination than you know. A test is not only a measurement tool, it alters what we remember and changes how we subsequently organize that knowledge in our minds. And it does so in ways that greatly improve later performance.
• • •
One of the first authoritative social registries in the New World was Who’s Who in America, and the premiere volume, published in 1899, consisted of more than 8,500 entries—short bios of politicians, business leaders, clergymen, railroad lawyers, and sundry “distinguished Americans.” The bios were detailed, compact, and historically rich. It takes all of thirty seconds, for example, to learn that Alexander Graham Bell received his patent for the telephone in 1876, just days after his twenty-ninth birthday, when he was a professor of vocal physiology at Boston University. And that his father, Alexander Melville Bell (the next entry), was an inventor, too, an expert in elocution who developed Visible Speech, a set of symbols used to help deaf people learn to speak. And that his father—Alexander Bell, no middle name, of Edinburgh—pioneered the treatment of speech impediments. Who knew? The two younger Bells, though both were born in Edinburgh, eventually settled in Washington, D.C. The father lived at 1525 35th Street, and the son at 1331 Connecticut Avenue. That’s right, the addresses are here, too. (Henry James: Rye, Isle of Wight.)
In 1917, a young psychologist at Columbia University had an idea: He would use these condensed life entries to help answer a question. Arthur Gates was interested in, among other things, how the act of recitation interacts with memory. For centuries, students who received a classical education spent untold hours learning to recite from memory epic poems, historic monologues, and passages from scripture—a skill that’s virtually lost today. Gates wanted to know whether there was an ideal ratio between reading (memorizing) and reciting (rehearsal). If you want to learn Psalm 23 (The Lord is my shepherd, I shall not want …) by heart—in, say, a half hour—how many of those minutes should you spend studying the verse on the page, and how many should you spend trying to recite from memory? What ratio anchors that material in memory most firmly? That would have been a crucial percentage to have, especially back when recitation was so central to education. The truth is, it’s just as handy today, not only for actors working to memorize Henry V’s St. Crispin’s Day speech but for anyone preparing a presentation, learning a song, or studying poetry.
To find out if such a ratio existed, Gates enlisted five classes from a local school, ranging from third to eighth grade, for an experiment. He assigned each student a number of Who’s Who entries to memorize and recite (the older students got five entries, the youngest ones three). He gave them each nine minutes to study along with specific instructions on how to use that time: One group would spend a minute and forty-eight seconds memorizing, and seven minutes, twelve seconds rehearsing (reciting); another would split its time in half, equal parts memorizing and rehearsing; a third, eight minutes of its time memorizing, and only a minute rehearsing. And so on.
Three hours later, it was showtime. Gates asked each student to recite what he or she could remember of their assigned entries:
“Edgar Mayhew Bacon, author … born, uh, June 5, 1855, Nassau, the Bahamas, and uh, went to private schools in Tarrytown, N.Y.; worked in a bookstore in Albany, and then I think became an artist … and then wrote, ‘The New Jamaica’?… and ‘Sleepy Hollow’ maybe?”
One, after another, after another. Edith Wharton. Samuel Clemens. Jane Addams. The brothers James. More than a hundred students, reciting.
And in the end, Gates had his ratio.
“In general,” he concluded, “the best results are obtained by introducing recitation after devoting about 40 percent of the time to reading. Introducing recitation too early or too late leads to poorer results.” In the older grades, the percentage was even smaller, closer to a third. “The superiority of optimal reading and retention over reading alone is about 30 percent.”
The quickest way to download that St. Crispin’s Day speech, in other words, is to spend the first third of your time memorizing it, and the remaining two thirds reciting from memory.
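For readers who like to see the arithmetic, here is a minimal sketch of that split. The function name `split_study_time` and its parameters are made up for illustration; the default of one third is the rough figure quoted above for older students, and 0.4 corresponds to Gates's general "about 40 percent" figure. It is a back-of-the-envelope helper, not a reconstruction of Gates's method.

```python
def split_study_time(total_seconds, reading_fraction=1 / 3):
    """Split a study session into reading time and reciting time.

    total_seconds: length of the whole session, in seconds.
    reading_fraction: share of the session spent reading before
        switching to recitation (assumed default: one third).
    Returns (reading_seconds, reciting_seconds) as whole seconds.
    """
    if total_seconds < 0:
        raise ValueError("total_seconds must be non-negative")
    if not 0 <= reading_fraction <= 1:
        raise ValueError("reading_fraction must be between 0 and 1")

    # Round the reading portion to whole seconds; recitation gets the rest,
    # so the two parts always add up to the full session.
    reading_seconds = round(total_seconds * reading_fraction)
    reciting_seconds = total_seconds - reading_seconds
    return reading_seconds, reciting_seconds


if __name__ == "__main__":
    # A half-hour session, as in the Psalm 23 example.
    reading, reciting = split_study_time(30 * 60)
    print(f"Read for {reading // 60} min, recite for {reciting // 60} min")

    # A nine-minute session at the general 40 percent figure.
    reading, reciting = split_study_time(9 * 60, reading_fraction=0.4)
    print(f"Read for {reading} s, recite for {reciting} s")
```

For a half-hour session, that works out to ten minutes on the page and twenty minutes reciting from memory.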
Was this a landmark finding? Well, yes, actually. In hindsight, it was the first rigorous demonstration of a learning technique that scientists now consider one of the most powerful of all. Yet at the time no one saw it. This was one study, in one group of schoolchildren. Gates didn’t speculate on the broader implications of his results, either, at least not in the paper he published in the Archives of Psychology, “Recitation as a Factor in Memorizing,” and the study generated little scientific discussion or follow-up.
The reasons for this, I think, are plain enough. Through the first half of the twentieth century, psychology was relatively young and growing by fits and starts, whipsawed by its famous theorists. Freud’s ideas still cast a long shadow and attracted hundreds of research projects. Ivan Pavlov’s experiments helped launch decades of research on conditioned learning—stimulus-response experiments, many of them in animals. Research into education was in an exploratory phase, with psychologists looking into reading, into learning disabilities, phonics, even the effect of students’ emotional life on grades. And it’s important to say that psychology—like any science—proceeds in part by retrospective clue gathering. A scientist has an idea, a theory, or a goal, and looks backward to see if there’s work to build on, if there’s anyone who’s had the same idea or reported results that are supportive of it. Science may be built on the shoulders of giants, but for a working researcher it’s often necessary to ransack the literature to find out who those giants are. Creating a rationale for a research project can be an exercise in historical data mining—in finding shoulders to build on.
Gates’s contribution is visible only in retrospect, but it was inevitable that its significance would be noticed. Improving education was, then as now, a subject of intense interest. And so, in the late 1930s, more than twenty years later, another researcher found in Gates’s study a rationale for his own. Herbert F. Spitzer was a doctoral student at the State University of Iowa, who in 1938 was trawling for a dissertation project. He wasn’t interested in recitation per se, and he didn’t belong to the small club of academic psychologists who were focused on studying the intricacies of memory. He was intent on improving teaching methods, and one of the biggest questions hanging over teachers, from the very beginning of the profession, was when testing is most effective. Is it best to give one big exam at the end of a course? Or do periodic tests given earlier in the term make more sense?
We can only guess at Spitzer’s thinking, because he did not spell it out in his writings. We know he’d read Gates’s study, because he cites it in his own. We know, too, that he saw Gates’s study for what it was. In particular, he recognized Gates’s recitation as a form of self-examination. Studying a prose passage for five or ten minutes, then turning the page over to recite what you can without looking, isn’t only practice. It’s a test, and Gates had shown that that self-exam had a profound effect on final performance.
That is to say: Testing is studying, of a different and powerful kind.
Spitzer understood that, and then asked the next big question. If taking a test—whether recitation, rehearsal, self-exam, pop quiz, or sit-down exam—improves learning, then when is the best time to take it?
To try to find out, he mounted an enormous experiment, enlisting sixth graders at ninety-one different elementary schools in nine Iowa cities—3,605 students in all. He had them study an age-appropriate six-hundred-word article, similar to what they might get for homework. Some were assigned an article on peanuts, and others one on bamboo. They studied the passage once. Spitzer then divided the students into eight groups and had each group take several tests on the passages over the next two months. The tests for each group were all the same, multiple-choice, twenty-five questions, each with five possible answers. For example, for those who studied bamboo:
What usually happens to a bamboo plant after the flowering period?
a. It dies
b. It begins a new growth
c. It sends up new plants from the roots
d. It begins to branch out
e. It begins to grow a rough bark
In essence, Spitzer conducted what was, and probably still is, the largest pop quiz experiment in history. The students had no idea that the quizzes were coming, or when. And each group got hit with quizzes at different times. Group 1 got one right after studying, then another a day later, and a third three weeks later. Group 6 didn’t take their first quiz until three weeks after reading the passage. Again, the time the students had to study was identical. So were the questions on the quizzes.
Yet the groups’ scores varied widely, and a pattern emerged.
The groups that took pop quizzes soon after reading the passage—once or twice within the first week—did the best on a final exam given at the end of two months, getting about 50 percent of the questions correct. (Remember, they’d studied their peanut or bamboo article only once.) By contrast, the groups who took their first pop quiz two weeks or more after studying scored much lower, below 30 percent on the final. Spitzer showed not only that testing is a powerful study technique, he showed it’s one that should be deployed sooner rather than later.
“Immediate recall in the form of a test is an effective method of aiding the retention of learning and should, therefore, be employed more frequently,” he concluded. “Achievement tests or examinations are learning devices and should not be considered only as tools for measuring achievement of pupils.”