Authors: Ian Ayres
This is in many ways a depressing story for the role of flesh-and-blood people in making decisions. It looks like a world where human discretion is sharply constrained, where humans and their decisions are controlled by the output of machines. What, if anything, in the process of prediction can we humans do better than the machines?
What's Left for Us to Do?
In a word, hypothesize. The most important thing that is left to humans is to use our minds and our intuition to guess at what variables should and should not be included in statistical analysis. A statistical regression can tell us the weights to place upon various factors (and simultaneously tell us how precisely it was able to estimate these weights). Humans, however, are crucially needed to generate the hypotheses about what causes what. The regressions can test whether there is a causal effect and estimate the size of the causal impact, but somebody (some body, some human) needs to specify the test itself.
Consider, for example, the case of Aaron Fink. Fink was a California urologist and an outspoken advocate of circumcision. (He was the kind of guy who self-published a book to promote his ideas.) In 1986, the New England Journal of Medicine published a letter of his proposing that uncircumcised men would be more susceptible to HIV infection than circumcised men. At the time, Fink didn't have any data; he just had the idea that the cells of the prepuce (the additional skin on an uncircumcised male) might be susceptible to infection. Fink also noticed that countries like Nigeria and Indonesia, where only about 20 percent of men are uncircumcised, had seen a slower spread of AIDS than countries like Zambia and Thailand, where 80 percent of men are uncircumcised. Seeing through the sea of data to recognize a correlation that had eluded everyone else was a stroke of brilliance.
Before Fink died in 1990, he was able to see the first empirical verification of his idea. Bill Cameron, an AIDS researcher in Kenya, hit upon a powerful test of Fink's hypothesis. Cameron and his colleagues found 422 men who visited prostitutes in Nairobi, Kenya, in 1985 (85 percent of these prostitutes were known to be HIV positive) and subsequently went to a clinic for treatment of a non-HIV STD. Like Ted Ruger's Supreme Court study, Cameron's test was prospective. Cameron and his colleagues counseled the men on HIV, STDs, and condom use and asked them to refrain from further prostitute contact. The researchers then followed up with these men on a monthly basis for up to two years, to see if and under what circumstances the men became HIV positive. Put simply, they found that uncircumcised men were 8.2 times more likely to become HIV positive than circumcised men.
This small but powerful study triggered a cascade of dozens of studies confirming the result. In December 2006, the National Institutes of Health stopped two randomized trials that it was running in Kenya and Uganda because it became apparent that circumcision reduced a man's risk of contracting AIDS from heterosexual sex by about 65 percent. Suddenly, the Gates Foundation is considering paying for circumcision in high-risk countries.
What started as a urologist's hunch may end up saving hundreds of thousands of lives. Yes, there is still a great role for deductive thought. The Aristotelian approach to knowledge remains important. We still need to theorize about the true nature of things, to speculate. Yet unlike the old days, when theorizing was an end in itself, the Aristotelian approach will increasingly be used at the beginning, as an input to statistical testing. Theory or intuition may lead the Finks of the world to speculate that X and Y cause Z. But Super Crunching (by the Camerons) will then come decisively into play to test the claim and parameterize the size of the impact.
The role of theory in excluding potential factors is especially important. Without theory or intuition, there is a literal infinity of possible causes for any effect. How are we to know that what a vintner had for lunch when he was seven doesn't affect who he might fall in love with or how quaffable his next vintage might be? With finite amounts of data, we can only estimate a finite number of causal effects. The hunches of human beings are still crucial in deciding what to test and what not to test.
The same is even more true for randomized testing. People have to figure out in advance what to test. A randomized trial only gives you information about the causal impact of some treatment versus a control group. Technologies like Offermatica are making it a lot cheaper to test dozens of separate treatments' effects. Yet there's still a limit to how many things can be tested. It would probably be just a waste of money to test whether an ice-cream diet would be a healthy way to lose weight. But theory tells me that it might not be a bad idea to test whether financial incentives for weight loss would work.
So the machines still need us. Humans are crucial not only in deciding what to test, but also in collecting and, at times, creating the data. Radiologists provide important assessments of tissue anomalies that are then plugged into the statistical formulas. Same goes for parole officials who subjectively judge the rehabilitative success of particular inmates. In the new world of database decision making, these assessments are merely inputs for a formula, and it is statistics, and not experts, that determine how much weight is placed on the assessments.
Albert Einstein said the “really valuable thing is intuition.” In many ways, he's still right. Increasingly, though, intuition is a precursor to Super Crunching. In case after case, traditional expertise innocent of statistical analysis is losing out to statistical prediction. As Paul Meehl concluded shortly before he died:
There is no controversy in social science which shows such a large body of qualitatively diverse studies coming out so uniformly in the same direction as this one. When you are pushing over 100 investigations, predicting everything from the outcome of football games to the diagnosis of liver disease, and when you can hardly come up with a half dozen studies showing even a weak tendency in favor of the clinician, it is time to draw a practical conclusion.
It is much easier to accept these results when they apply to someone else. Few people are willing to accept that a crude statistical algorithm based on just a handful of factors could outperform them. Universities are loath to accept that a computer could select better students. Book publishers would be loath to delegate the final say in acquiring manuscripts to an algorithm.
At some point, however, we should start admitting that the superiority of Super Crunching is not just about the other guy. It's not just about baseball scouts and wine critics and radiologists and…the list goes on and on. Indeed, by now I hope to have convinced you that something real is going on out there. Super Crunching is impacting real-world decisions in many different contexts that touch us as consumers, as patients, as workers, and as citizens.
Kenneth Hammond, the former director of Colorado's Center for Research on Judgment and Policy, reflects with some amusement on the resistance of clinical psychologists to Meehl's overwhelming evidence:
One might ask why clinical psychologists are offended by the discovery that their intuitive judgments and predictions are (almost) as good as, but (almost) never better than, a rule. We do not feel offended at learning that our excellent visual perception can often be improved in certain circumstances by the use of a tool (e.g., rangefinders, telescopes, microscopes). The answer seems to be that tools are used by clerks (i.e., someone without professional training); if psychologists are no different, then that demeans the status of the psychologist.
This transformation of clinicians to clerks is indicative of a larger trend. Something has happened out there that has triggered a shift of discretion from traditional experts to a new breed of Super Crunchers, the people who control the statistical equations.
CHAPTER 6
Why Now?
My nephew Marty has a T-shirt that says on the front: “There are 10 types of people in the world…” If you read these words and are trying to think of what the ten types are, you've already typed yourself.
The back of the shirt reads: “Those that understand binary, and those that don't.” You see, in a digitalized world, all numbers are represented by 0s and 1s, so what we know as the number 2 is represented to a computer as 10. It's the shift to binary bytes that is at the heart of the Super Crunching revolution.
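The shirt's arithmetic is easy to check. Here is a short Python sketch (an illustration of mine, not from the book) confirming that decimal 2 really is written as 10 in base two, and that every digitalized value bottoms out in such 0s and 1s:

```python
# Decimal 2 in binary is "10": one two and zero ones.
assert format(2, "b") == "10"

# Converting back: the binary string "10" is the decimal number 2.
assert int("10", 2) == 2

# Any digitalized value reduces to bits; the letter "A", for example,
# is stored as the eight bits 01000001 (decimal 65).
assert format(ord("A"), "08b") == "01000001"
```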
More and more information is digitalized in binary bytes. Mail is now email. From health care files to real estate and legal filings, electronic records are everywhere. Instead of starting with paper records and then manually inputting information, data are increasingly captured electronically in the very first instance, such as when we swipe our credit card or the grocery clerk scans our deodorant purchase at the checkout line. An electronic record of most consumer purchases now exists.
And even when the information begins on paper, inexpensive scanning technologies are unlocking the wisdom of the distant and not-so-distant past. My brother-in-law used to tell me, “You can't Google dead trees.” He meant that it was impossible to search the text of books. Yet now you can. For a small monthly fee, Questia.com will give you full text access to over 67,000 books. Amazon.com's “Search Inside the Book” feature allows Internet users to read snippets from full text searches of over 100,000 volumes. And Google is attempting something that parallels the Human Genome Project in both its scope and scale. The Human Genome Project had the audacity to think that within the space of thirteen years it could sequence three billion genes. Google's “Book Search” is ambitiously attempting to scan the full text of more than thirty million books in the next ten years. Google intends to scan every book ever published.
From 90 to 3,000,000
The increase in accessibility to digitalized data has been a part of my own life. Way back in 1989 when I had just started teaching, I sent six testers out to Chicagoland new car dealerships to see if dealers discriminated against women or minorities. I trained the testers to follow a uniform script that told them how to bargain for a car. The testers even had uniform answers to any questions the salesman might ask (including “I'm sorry but I'm not comfortable answering that”). The testers walked alike, talked alike. They were similar on every dimension I could think of except for their race and sex. Half the testers were white men and half were either women or African-Americans. Just like a classic fair housing test, I wanted to see if women or minorities were treated differently than white men.
They were. White women had to pay 40 percent higher markups than white men; black men had to pay more than twice the markup, and black women had to pay more than three times the markup of white male testers. My testers were systematically steered to salespeople of their own race and gender (who then gave them worse deals).
The study got a lot of press when it was published in the Harvard Law Review. Primetime Live filmed three different episodes testing whether women and minorities were treated equally, not just at car dealerships but at a variety of retail establishments. A lot of people were disturbed by film clips of shoe clerks who forced black customers to wait and wait for service even though no one else was in the store. More importantly, the study played a small role in pushing the retail industry toward no-haggle purchasing.
A few years after my study, Saturn decided to air a television commercial that was centrally about Saturn's unwillingness to discriminate. The commercial was composed entirely of a series of black-and-white photographs. In a voiceover narrative, an African-American man recalls his father returning home after purchasing a car and feeling that he had been mistreated by the salesman. The narrator then says maybe that's why he feels good about having become a salesperson for Saturn. The commercial is a remarkable piece of rhetoric. The stark photographic images are devoid of the smiles that normally populate car advertisements. Instead there is a heartrending shot of a child realizing that his father has been mistreated because of their shared race, and the somber but firmly proud shot of the grown, grim-faced man now having taken on the role of a salesman who does not discriminate. The commercial does not explicitly mention race or Saturn's no-haggle policy, but few viewers would fail to understand that race was a central cause of the father's mistreatment.
The really important point is that all this began with six testers bargaining at just ninety dealerships. While I ultimately did a series of follow-up studies analyzing the results of hundreds of additional bargains, the initial uproar came from a very small study. Why so small? It's hard to remember, but this was back in the day before the Internet. Laptop computers barely existed and were expensive and bulky. As a result, all of my data were first collected on paper and then had to be hand-entered (and re-entered) into computer files for analysis. Technologically, back then, it was harder to create digital data.
Fast-forward to the new millennium, and you'll still find me crunching numbers on race and cars. Now, however, the datasets are much, much bigger. In the last five years, I've helped to crunch numbers in massive class-action litigation against virtually all of the major automotive lenders. With the yeoman help of Vanderbilt economist Mark Cohen (who really bore the laboring oar), I have crunched data on more than three million car sales.
While most consumers now know that the sales price of a car can be negotiated, many do not know that auto lenders, such as Ford Motor Credit or GMAC, often give dealers the option of marking up a borrower's interest rate. When a car buyer works with the dealer to arrange financing, the dealer normally sends the customer's credit information to a potential lender. The lender then responds with a private message to the dealer that offers a “buy rate”: the interest rate at which the lender is willing to lend. Lenders often will pay a dealer, sometimes thousands of dollars, if the dealer can get the consumer to sign a loan with an inflated interest rate. For example, Ford Motor Credit might tell a dealer that it is willing to lend Susan money at a 6 percent interest rate, but that it will pay the dealership $2,800 if the dealership can get Susan to sign an 11 percent loan. Susan would never be told that the dealership was marking up the loan. The dealer and the lender would then split the expected profits from the markup, with the dealership taking the lion's share.
In a series of cases that I worked on, African-American borrowers challenged the lenders' markup policies because they disproportionately harmed minorities. Cohen and I found that on average white borrowers paid what amounted to about a $300 markup on their loans, while black borrowers paid almost $700 in markup profits. Moreover, the distribution of markups was highly skewed. Over half of white borrowers paid no markup at all, because they qualified for loans where markups were not allowed. Yet 10 percent of GMAC borrowers paid more than $1,000 in markups and 10 percent of the Nissan customers paid more than a $1,600 markup. These high markup borrowers were disproportionately black. African-Americans were only 8.5 percent of GMAC borrowers, but paid 19.9 percent of the markup profits. The markup difference wasn't about credit scores or default risk; minority borrowers with good credit routinely had to pay higher markups than white borrowers with similar credit scores.
These studies were only possible because lenders now keep detailed electronic records of every transaction. The one variable they don't keep track of is the borrower's race. Once again, though, technology came to the rescue. Fourteen states (including California) will, for a fee, make public the information from their driver's license database, information that includes the name, race, and Social Security number of the driver. Since the lenders' datasets also included the Social Security numbers of their borrowers (so that they could run credit checks), it was child's play to combine the two datasets. In fact, because so many people move from state to state, Cohen and I were able to identify the race of borrowers for thousands upon thousands of loans that took place in all fifty states. We'd know the race of a lot of people who bought cars in Kansas because sometime earlier or later in their lives they took out a driver's license in California. A study that would have been virtually impossible ten years earlier had now become, if not easy, at least relatively straightforward. And in fact, Cohen and I did the study over and over as the cases against all the major automotive lenders moved forward. The cases have been a resounding success: lender after lender has agreed to cap the amount that dealerships can mark up loans. All borrowers, regardless of their race, are now protected by caps that they don't even know about. Unlike my initial test of a few hundred negotiations, these statistical studies of millions of transactions were only possible because the information is now stored in readily accessible digital records.
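The record-linkage step at the heart of these studies, joining a lender's loan file to a state driver's-license file on Social Security number, can be sketched in a few lines of Python. The field names and sample records below are hypothetical illustrations, not the actual litigation data:

```python
# Hypothetical illustration: link loan records to license records by SSN.
loans = [
    {"ssn": "123-45-6789", "markup": 1600},
    {"ssn": "987-65-4321", "markup": 0},
    {"ssn": "555-00-1111", "markup": 700},
]
licenses = [
    {"ssn": "123-45-6789", "race": "black"},
    {"ssn": "987-65-4321", "race": "white"},
    # No license record for the third borrower: race stays unknown.
]

# Index the license file by SSN for constant-time lookup, then merge.
race_by_ssn = {rec["ssn"]: rec["race"] for rec in licenses}
linked = [
    {**loan, "race": race_by_ssn.get(loan["ssn"], "unknown")}
    for loan in loans
]
```

The same join works whichever state issued the license, which is why borrowers in Kansas could be matched to licenses taken out in California.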
Trading in Data
The willingness of states to sell information on the race of their own citizens is just a small part of the commercialization of data. Digitalized data have become a commodity, and both public and private vendors have discovered the value of aggregating information. For-profit database aggregators like Acxiom and ChoicePoint have flourished. Since its founding in 1997, ChoicePoint has acquired more than seventy smaller database companies. It will sell clients one file that contains not only your credit report but also your motor-vehicle, police, and property records, together with birth and death certificates and marriage and divorce decrees. While much of this information was already publicly available, ChoicePoint's billion dollars in annual revenue suggests that there's real value in providing one-stop data-shopping.
And Acxiom is even larger. It maintains consumer information on nearly every household in the United States. Acxiom, which has been called “one of the biggest companies you've never heard of,” manages twenty billion customer records (more than 850 terabytes of raw data, enough to fill a 2,000-mile tower of one billion diskettes).
Like ChoicePoint, Acxiom culls a lot of its information from public records. Yet Acxiom also combines public census data and tax records with information supplied by corporations and credit card companies that are Acxiom clients. It is the world's leader in CDI, consumer data integration. In the end, Acxiom probably knows the catalogs you get, what shoes you wear, maybe even whether you like dogs or cats. Acxiom assigns every person a thirteen-digit code and places them in one of seventy “lifestyle” segments ranging from “Rolling Stones” to “Timeless Elders.” To Acxiom, a “Shooting Star” is someone who is thirty-six to forty-five, married, no kids yet, wakes up early and goes for runs, watches Seinfeld reruns, and travels abroad. These segments are so specific to time of life and triggering events (such as getting married) that nearly one-third of Americans change their segment each year. By mining its humongous database, Acxiom not only knows what segment you are in today but can predict what segment you are likely to be in next year.
The rise of Acxiom shows how commercialization has increased the fluidity of information across organizations. Some large retailers like Amazon.com and Wal-Mart simply sell aggregate customer transaction information. Want to know how well Crest toothpaste sells if it's placed higher on the shelf? Target will sell you the answer. But Acxiom also allows vendors to trade information. By providing Acxiom's transaction information about its individual customers, a retailer can gain access to a data warehouse of staggering proportions.
Do the Mash
The popular Internet mantra “information wants to be free” is centrally about the ease of liberating digital data so that it can be exploited by multiple users. The rise of database decision making is driven by increasing access to what was OPI: other people's information. Until recently, many datasets, even inside the same corporation, couldn't easily be linked together. Even a firm that maintained two different datasets often had trouble linking them if the datasets had incompatible formats or were developed by different software companies. A lot of data were kept in isolated “data silos.”
These technological compatibility constraints are now in retreat. Data files in one format are easily imported and exported to other formats. Tagging systems allow single variables to have multiple names. So a retailer's extra-large clothes can simultaneously be referred to as “XL” and “TG” (for the French term très grande). Almost gone are the days when it was impossible to link data stored in non-compatible proprietary formats.
What's more, there is a wealth of non-proprietary information on the web just waiting to be harvested and merged into pre-existing datasets. “Data scraping” is the now-common practice of programming a computer to surf to a set of sites and then systematically copy information into a database. Some data scraping is pernicious, such as when spammers scrape email addresses off websites to create their spam lists. But many sites are happy to have their data taken and used by others. Investors looking for fraud or accounting hijinks can scrape data from the quarterly SEC filings of all traded corporations. I've used computer programs to scrape data from eBay auctions to create a dataset for a study I'm doing about how people bid on baseball cards.
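The scraping idea can be sketched with Python's standard-library HTML parser. The page snippet, class names, and dollar amounts below are invented for illustration (a real scraper would first fetch each page over HTTP, for example with urllib.request.urlopen(url).read()):

```python
from html.parser import HTMLParser

# Hypothetical auction-page fragment; a real scraper would download it.
PAGE = """
<table>
  <tr><td class="card">1952 Topps Mantle</td><td class="bid">$125.50</td></tr>
  <tr><td class="card">1989 Upper Deck Griffey</td><td class="bid">$41.00</td></tr>
</table>
"""

class BidScraper(HTMLParser):
    """Harvest dollar amounts from table cells marked class="bid"."""

    def __init__(self):
        super().__init__()
        self.in_bid = False
        self.bids = []  # collected bid amounts, as floats

    def handle_starttag(self, tag, attrs):
        # Flag only the cells whose class marks them as bid amounts.
        self.in_bid = tag == "td" and ("class", "bid") in attrs

    def handle_data(self, data):
        if self.in_bid:
            # Strip the dollar sign and store the numeric value.
            self.bids.append(float(data.strip().lstrip("$")))
            self.in_bid = False

scraper = BidScraper()
scraper.feed(PAGE)
```

Looping such a scraper over thousands of auction pages is what turns scattered web listings into a single analyzable dataset.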