Authors: Ian Ayres
Nonetheless, there are still many areas where metrics of success and plentiful historical data are just waiting to be mined. While data-driven thinking has been on the rise throughout society, there are still plenty of pockets of resistance that are ripe for change.
There's almost an iron-clad law that it's easier for people to warm up to applications of Super Crunching outside of their own area of expertise. It's devilishly hard for traditional, non-empirical evaluators to even consider the possibility that quantified predictions might do a better job than they can do on their own home turf. I don't think this is primarily because of blatant self-interest in trying to keep our jobs. We humans just overestimate our ability to make good decisions and we're skeptical that a formula that necessarily ignores innumerable pieces of information could do a better job than we could.
So let's turn the light onto the process of publishing books itself. Couldn't Super Crunching help Bantam or its parent, Random House, Inc., decide what to publish? Of course not. Book publishing is too much of an art to be susceptible to Super Crunching. But let's start small. Remember, I already showed how randomized trials helped test titles for this book. Why can't a regression help choose at least the title of books? Turns out Lulu.com has already run this regression. They estimated a regression equation to help predict whether a book's title is going to be a best-seller.
Atai Winkler, a British statistician, created a dataset on the sales of every novel to top the New York Times Bestseller List from 1955 to 2004, together with a control group of less successful novels by the same authors. With more than 700 titles, he then estimated a regression to predict the likelihood of becoming a best-seller. The regression tested for the impact of eleven different characteristics (Is the title in the form "The ___ of ___"? Does the title include the name of a person or place? Does it begin with a verb?).
It turns out that figurative titles are more likely to produce best-sellers than are literal ones. It also matters whether the first word of a title is a verb, pronoun, or exclamation. And, contrary to publishing wisdom, shorter isn't necessarily better: a title's length does not significantly affect book sales. All told, the regression produced predictions that were much better than random guesses. “It guessed right in nearly 70 percent of cases,” Winkler said. “Given the nature of the data and the way tastes change, this is very good.” But Winkler didn't want to over-claim. “Whether a book gets to the best-seller list,” he said, “depends a lot on the other books that happen to be there that week on the list. Only one of them could be the best-seller.”
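Winkler hasn't published his exact equation, but the kind of model he describes, binary title features predicting a yes-or-no best-seller outcome, is a textbook logistic regression. The sketch below is purely illustrative: the feature names and the tiny dataset are invented, and only the technique mirrors his.

```python
import math

# Hypothetical title features, coded 0/1 in the spirit of Winkler's design:
# [figurative, starts_with_verb, contains_name, '"The X of Y" pattern']
# paired with 1 (hit the best-seller list) or 0 (did not).
titles = [
    ([1, 0, 0, 0], 1),
    ([1, 0, 1, 0], 1),
    ([1, 1, 0, 0], 1),
    ([1, 0, 0, 1], 1),
    ([0, 0, 1, 0], 0),
    ([0, 0, 0, 1], 0),
    ([0, 1, 0, 0], 0),
    ([0, 0, 1, 1], 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(data, lr=0.5, epochs=2000):
    """Plain stochastic-gradient-descent logistic regression, no libraries."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log-loss for this observation
            b -= lr * err
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w, b

w, b = fit_logistic(titles)

def bestseller_chance(features):
    """Predicted probability that a title with these features tops the list."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features)) + b)

print(bestseller_chance([1, 0, 0, 0]))  # figurative title: high
print(bestseller_chance([0, 0, 0, 1]))  # literal "The X of Y": low
```

On this toy data the figurative-title coefficient dominates, echoing Winkler's finding; a real model would of course be fit on all 700-plus observed titles.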
The results aren't perfect. While Agatha Christie's Sleeping Murder claimed the top spot among all of the titles Winkler analyzed, the model predicted that The Da Vinci Code had only a 36 percent chance of becoming a best-seller.
Even with its flaws, this is a web application that's both fun and a bit addictive. Just type in your proposed title at Lulu.com/titlescorer and bam, the applet gives you a prediction of success for any title you might imagine. You can even use the "Titlefight" feature to pit two competing title ideas against each other. Of course, this isn't really a test of whether your book will become a best-seller. It is a test of whether the title of someone like Jane Smiley will take off or not. Yet even if you've never had a book at the top of the best-sellers list, wouldn't you want to know how your title scored? (I did. Even though the book is nonfiction, Super Crunchers predicted a 56.8 percent chance of success. From Lulu's lips to God's ears.)
But why stop at the title of the book? Why not crunch the content?
My first reaction is again, nah, that would never work. It's impossible to code whether a book is well written. But this might just be the iron law of resistance talking. Beware of the person who says, “You could never quantify what I do.”
If Epagogix's analysis of plots can predict movie sales, why couldn't an analysis of plots help predict novel sales? Indeed, novels should be even easier to code because you don't have the confounding influences of temperamental actors and the possibility of botched or beautiful cinematography. The text is all there is. You might even be able to start by coding the very same criteria that Epagogix uses for movie scripts. The economic criteria for success also exist in abundance. Nielsen BookScan provides its subscribers with weekly point-of-sale data on how well books are selling at most major book retailers. So there are tons of data on sales success just waiting to be crunched. Instead of crudely predicting the probability of whether you hit the top of the best-seller list or not, you could try to predict the total sales based on a lot more than just the title.
Yet no one in publishing is rushing to be the first on the block to publicly use number crunching to choose what books to buy or how to make them better. A large part of me viscerally resists the idea that a nonfiction book could be coded or that Super Crunching could improve the content of this book. But another part of me has in fact already data mined a bit on what makes for success in nonfiction publishing.
As a law professor, my primary publishing job is to write law review articles. I don't get paid for them, but a central measure of an article's success is the number of times it has been cited by other professors. So with the help of a full-time number-crunching assistant named Fred Vars, I went out and analyzed what caused a law review article to be cited more or less. Fred and I collected citation information on all the articles published for fifteen years in the top three law reviews. Our central statistical formula had more than fifty variables. Like Epagogix, Fred and I found that seemingly incongruous things mattered a lot. Articles with shorter titles and fewer footnotes were cited significantly more, whereas articles that included an equation or an appendix were cited a lot less. Longer articles were cited more, but the regression formula predicted that citations per page peak for articles that were a whopping fifty-three pages long. (We law professors love to gas on about the law.)
Law review editors who want to maximize their citation rates should also avoid publishing criminal and labor law articles, and focus instead on constitutional law. And they should think about publishing more women. White women were cited 57 percent more often than white men, and minority women were cited more than twice as often. The ultimate merit of an article isn't tied to the race or gender of the author. Yet the regression results suggest that law review editors should think about whether they have been unconsciously setting the acceptance bar unfairly high for women and minority authors whose articles, when published, are cited systematically more often.
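A fifty-three-page sweet spot is what you get when a regression includes both article length and its square: predicted citations rise, then fall, and the peak sits where the derivative is zero. The coefficients below are invented stand-ins, chosen only so the arithmetic lands near the number in the text; the actual fitted values aren't reproduced here.

```python
# With a quadratic length term, predicted citations take the form
#   c = b0 + b1 * pages + b2 * pages**2,   with b2 < 0,
# so the curve peaks where the derivative b1 + 2 * b2 * pages equals zero.

def peak_length(b1, b2):
    """Page count that maximizes predicted citations for a quadratic fit."""
    return -b1 / (2 * b2)

# Hypothetical coefficients, picked so the peak lands at fifty-three pages.
print(peak_length(53.0, -0.5))  # 53.0
```

The same one-line calculation recovers the turning point for any regression with a negative squared term, which is a standard way to test "more is better, up to a point" claims.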
Law review editors of course are likely to resist many of these suggestions. Not because they're prima donnas (although believe me, some are), but just because they're human.
A Store of Information
Some long ago when we were taught
That for whatever kind of puzzle you got
You just stick the right formula in
A solution for every fool.
“LEAST COMPLICATED,” INDIGO GIRLS
We don't want to be told what to do by a hard-edged and obviously incomplete equation. Something there is that doesn't love a formula. Equations, like Robert Frost's walls, limit our freedom to go where we want.
With a little prodding, however, some of our most coveted assessments may yield to the reason of evidence. If this book has succeeded in convincing you that we humans do a poor job in figuring out how much weight to put on various factors when making predictions, then you should be on the lookout for areas in your own job and in your own life where Super Crunching could help.
Stepping back, we can see that technological constraints to data-driven decision making have fallen across the board. The ability to digitalize and store information means that any laptop with access to the Internet can now access libraries several times the size of the Library of Alexandria. Computational techniques and fast computers to make the computations were of course necessary, but both regressions and CPUs were in place well before the phenomenon seriously took off. I've suggested here that it is instead our increasing facility with capturing, merging, and storing digital data that has more to do with the current onslaught. It is these breakthroughs in database technology that have also facilitated the commodification of information. Digital data now has market value and it is coalescing into huge data warehouses.
There's no reason to think that the advances in database technology will not continue. Kryder's Law shows no sign of ending. Mashups and merger techniques are becoming automated. Data-scraping programs of the future will not only search the web for new pieces of information but will also automatically seek out the merging equivalents of a Rosetta stone to pair observations from disparate datasets. Increasingly predictive Super Crunching techniques will help mash up the observations from disconnected data.
And maybe most importantly, we should see continued advances in the digital domain's ability to capture information, especially via miniaturized sensors. The miniaturization of electronic sensors has already spurred the capture of all sorts of data. Cell phones are ready to pinpoint owners' whereabouts, purchase soda, or digitally memorialize an image. Never before have so many people had in their pocket an ever-present means to record pictures.
But in the not-too-distant future, nanotechnology may spur an age of “ubiquitous surveillance” in which sensing devices become ever more pervasive in our society. Where retailers now keep track of inventory and sales through collection of data at the checkout scanner, nanotechnology may soon allow them to insert small sensors directly into the product. Nanosensors could keep track of how long you hold on to a particular product before using it, how far you transport it, or whether you will likely use the product in conjunction with other products. Of course, consumers would need to consent to product sensors. But there is no reason to limit the application of nanosensors to embedding them in other objects or clothing. Instead, we may find ourselves surrounded by “smart dust”: nanosensors that are free-floating and truly ubiquitous within a particular environment. These sensors might quite literally behave like dust; they would flow through the breeze and, at a size of one cubic millimeter, be nearly undetectable.
The prospect of pervasive digitalization of information is both exciting and scary. It is a cautionary tale for a world without privacy. Indeed, we have seen here several worrisome stories. Poor matching in Florida might have mistakenly purged thousands of African-American voters. Even the story of Epagogix rankles. Isn't art supposed to be determined by the artist? Isn't it better to accept a few cognitive foibles, but to retain more humane environments for creative flourishing? Is Super Crunching good?
CHAPTER 7
Are We Having Fun Yet?
Sandra Kay Daniel, a second-grade teacher at the Emma E. Booker Elementary School in Sarasota, Florida, sits in front of about a dozen second graders. She is a middle-aged matronly African-American woman with a commanding but encouraging voice.
Open your book up to lesson sixty on page 153. And at the count of three. One…Two…Three. Everyone should be on page 153. If the yellow paper is going to bother you, drop it. Thank you. Everyone touch the title of your story. Fingers under the title. Get ready to read the title…The…Fast…Way. We're waiting for one member. Thank you. Fingers under the title of the story. Get ready!
Class (in unison): “The Pet Goat.”
Yes. “The Pet Goat.” Fingers under the first word of the story. Get ready to read the story the fast way. GET READY!
The class begins reading the story in unison. As they read, the teacher taps her ruler against the board, beating out a steady rhythm. The students read one word per beat.
Class (to the beat): A girl got a pet goat.
Go on.
Class (to the beat): She liked to go running with her pet goat.
Go on.
Class (to the beat): She played with her…
Try it again. Get ready, from the beginning of that sentence. GET READY!
Class (to the beat): She played with her goat in her house.
Go on.
Class (to the beat): The goat ate cans and he ate canes.
Go on.
Class (to the beat): One day her dad said that goat must go.
What's behind the word “said”?
Class (in unison): Comma.
And what does that comma mean?
Class: Slow down.
Let's read that sentence again. Get ready!
Class (to the beat): One day her dad said (pause) that goat must go.
Go on.
Class (to the beat): He eats too many things.
Go on.
Class (to the beat): The girl said that if you let the goat stay with us I will see that he stops eating all those things.
Nice and loud, crisp voices. Let's go.
Class (to the beat): Her dad said that he will try it.
Go on.
Class (to the beat): But one day a car robber came to the girl's house.
Go on.
Class (to the beat): He saw a big red car in the house and said I will steal that car.
Go on.
Class (to the beat): He ran to the car and started to open the door.
Go on.
Class (to the beat): The girl and the goat were playing in the backyard.
Go on.
Class (to the beat): They did not see the car robber. More to come.
More to come? This is a real cliff-hanger. Will the goat stop the car robber? Will the dad get fed up and kick the goat out?
Millions of us have actually seen Ms. Daniel's class. However, in the videotape, our attention was centered not on the teacher or the students but on a special guest who was visiting that day. The special guest, who was sitting quietly by Ms. Daniel's side, was President George W. Bush.
The videotape of the class was a central scene in Michael Moore's Fahrenheit 9/11. Just as Ms. Daniel was asking her students to "open your book to lesson sixty," Andrew Card, the president's chief of staff, came over and whispered into Bush's ear, "A second plane hit the second tower. America is under attack."
Moore's purpose was to criticize Bush for not paying more attention to what was happening outside the classroom. Yet what was happening inside Ms. Daniel's classroom concerns one of the fiercest battles raging about how best to teach schoolchildren. Bush brought the press to visit this class because Ms. Daniel was using a controversial, but highly effective, teaching method called “Direct Instruction” (DI).
The fight over whether to use DI, like the fight over evidence-based medicine, is at heart a struggle about whether to defer to the results of Super Crunching. Are we willing to follow a treatment that we don't like, but which has been shown in statistical testing to be effective?
Direct Instruction forces teachers to follow a script. The entire lesson, including the instructions ("Put your finger under the first word."), the questions ("What does that comma mean?"), and the prompts ("Go on."), is written out in the teacher's instruction manual. The idea is to force the teacher to present information in easily digestible, bite-size concepts, and to make sure that the information is actually digested.
Each student is called upon to give up to ten responses each minute. How can a single teacher pull this off? The trick is to keep a quick pace and to have the students answer in unison. The script asks the students to “get ready” to give their answers and then after a signal from the teacher, the class responds simultaneously. Every student is literally on call for just about every question.
Direct Instruction also requires fairly small groups of five to ten students of similar skill levels. Small group sizes make it harder for students to fake that they're answering and it lets the teacher from time to time ask individual students to respond, if the teacher is concerned that someone is falling behind.
The high-speed call and response of a DI class is both a challenging and draining experience. As a law professor, it sounds to me like the Socratic method run amok. Most grade schoolers can only handle a couple of hours a day of being constantly on call.
The DI approach is the brainchild of Siegfried “Zig” Engelmann, who started studying how best to teach reading at the University of Illinois in the 1960s. He has written over 1,000 short books in the “Pet Goat” genre. Engelmann, now in his seventies, is a disarming and refreshingly blunt academic who for decades has been waging war against the great minds of education.
Followers of the Swiss developmental psychologist Jean Piaget have championed child-centered approaches to education that tailor the curriculum to the desires and interests of individual students. Followers of the MIT linguist and polymath Noam Chomsky have promoted a whole-language approach to language acquisition. Instead of breaking reading into finite bits of information in order to teach kids specific phonic skills, the whole-language approach embraces a holistic immersion in listening to and eventually reading entire sentences.
Engelmann flatly rejects both the child-centered and whole-language approaches. He isn't nearly as famous as Chomsky or Piaget, but he has a secret weapon: data. Super Crunching doesn't say on a line-by-line basis what should be included in Zig's scripts, but Super Crunching on the back end tells him what approaches actually help students learn. Engelmann rails against educational policies that are the product of top-down philosophizing instead of a bottom-up attention to what works. "Decision makers don't choose a plan because they know it works," he says. "They choose a plan because it's consistent with their vision of what they think kids should do." Most educators, he says, seem to have "a greater investment in romantic notions about children" than they do "in the gritty detail of actual practice or the fact that some things work well."
Engelmann is a thorough pragmatist. He started out in his twenties as an advertising exec who tried to figure out how often you had to repeat an ad to get the message to stick. He kept asking the “Does it work?” question when he turned his attention to education.
The evidence that DI works goes all the way back to 1967. Lyndon Johnson, as part of his War on Poverty, wanted to “follow through” on the vanishing gains seen from Head Start. Concerned that “poor children tend to do poorly in school,” the Office of Education and the Office of Economic Opportunity sought to determine what types of education models could best break this cycle of failure. The result was Project Follow Through, an ambitious effort that studied 79,000 children in 180 low-income communities for twenty years at a price tag of more than $600 million. It is a lot easier to Super Crunch when you have this kind of sustained support behind you. At the time, it was the largest education study ever done. Project Follow Through looked at the impact of seventeen different teaching methods, ranging from models like DI, where lesson plans are carefully scripted, to unstructured models where students themselves direct their learning by selecting what and how they will study. Some models, like DI, emphasized acquisition of basic skills like vocabulary and arithmetic, others emphasized higher-order thinking and problem-solving skills, and still others emphasized positive attitudes toward learning and self-esteem. Project Follow Through's designers wanted to know which model performed the best, not only in developing skills in its area of emphasis, but also across the board.
Direct Instruction won hands down. Education writer Richard Nadler summed it up this way: “When the testing was over, students in DI classrooms had placed first in reading, first in math, first in spelling, and first in language. No other model came close.” And DI's dominance wasn't just in basic skill acquisition. DI students could also more easily answer questions that required higher-order thinking. For example, DI students performed better on tests evaluating their ability to determine the meaning of an unfamiliar word from the surrounding context. DI students were also able to identify the most appropriate pieces to fill in gaps left in mathematical and visual patterns. DI even did better in promoting students' self-esteem than several child-centered approaches. This is particularly striking because a central purpose of child-centered teaching is to promote self-esteem by engaging children and making them the authors of their own education.
More recent studies by both the American Federation of Teachers and the American Institutes for Research reviewed data on two dozen “whole school” reforms and found once again that the Direct Instruction model had the strongest empirical support. In 1998, the American Federation of Teachers included DI among six “promising programs to increase student achievement.” The study concluded that when DI is properly implemented, the “results are stunning,” with DI students outperforming control students along every academic measure. In 2006, the American Institutes for Research rated DI as one of the top two out of more than twenty comprehensive school reform programs. DI again outperformed traditional education programs in both reading and math.
“Traditionalists die over this,” Engelmann said. “But in terms of data we whump the daylights out of them.”
But wait: it gets even better. Direct Instruction is particularly effective at helping kids who are reading below grade level. Economically disadvantaged students and minorities thrive under DI instruction. And maybe most importantly, DI is scalable. Its success isn't contingent on the personality of some über-teacher. DI classes are entirely scripted. You don't need to be a genius to be an effective DI teacher. DI can be implemented in dozens upon dozens of classrooms with just ordinary teachers. You just need to be able to follow the script.
If you have a school where third graders year after year are reading at a first-grade level, they are seriously at risk of being left behind. DI gives them a realistic shot at getting back to grade level. If the school adopts DI from day one of kindergarten, the kids are much less likely to fall behind in the first place.
Imagine that. Engelmann has a validated and eminently replicable program that can help at-risk students. You'd think schools would be beating a path to his door.
What Am I, a Potted Plant?
Direct Instruction has faced severe opposition from educators on the ground. They criticize the script as turning teachers into robots, and for striving to make education “teacher proof.”
Can you blame them for resisting? Would you want to have to follow a script most of your working day, repeating ad nauseam stale words of encouragement and correction? Most teachers are taught that they should be creative. It is a stock movie genre to show teachers getting through to kids with unusual and idiosyncratic techniques (think To Sir with Love, Stand and Deliver, Music of the Heart, Mr. Holland's Opus). No one's going to make a motivational drama about Direct Instruction.
Engelmann admits that teacher resistance is a problem. “Teachers initially think this is horrible,” he said. “They think it is confining. It's counter to everything they've ever been taught. But within a couple of months, they realize that they are able to teach kids things that they've tried to teach before and never been able to teach.”
Direct Instruction caused a minor schism when it was introduced into Arundel Elementary in 1996. Arundel Elementary is perched upon a hill in Baltimore's struggling Cherry Hill neighborhood. It is surrounded by housing projects and apartment complexes. Ninety-five percent of its students are poor enough to qualify for federally subsidized lunches. When Arundel adopted DI, several teachers were so frustrated with the script that they transferred to other schools. The teachers who stayed, though, have come to embrace the system. Matthew Carpenter teaches DI seven hours a day. “I like the structure,” he said. “I think it's good for this group of kids.”