Everything Is Obvious (25 page)

Authors: Duncan J. Watts


As creative as they are, these examples of crowdsourcing work best for media sites that already attract millions of visitors, and so automatically generate real-time information about what people like or don’t like. So if you’re not Bravo or Cheezburger or BuzzFeed—if you’re just some boring company that makes widgets or greeting cards or whatnot—how can you tap into the power of the crowd? Fortunately, crowdsourcing services like Amazon’s Mechanical Turk (which Winter Mason and I used to run our experiments on pay and performance that I discussed in Chapter 2) can also be used to perform fast and inexpensive market research. Unsure what to call your next book? Rather than tossing around ideas with your editor, you can run a quick poll on Mechanical Turk and get a thousand opinions in a matter of hours, for about $10—or better yet, have the “turkers” come up with the suggestions as well as voting on them. Looking to get feedback on some design choices for a new product or advertising campaign? Throw the images up on Mechanical Turk and have users vote. Want an independent evaluation of your search engine results? Strip off the labels and throw your results up on Mechanical Turk next to your competitors’, and let real Web users decide. Wondering if the media is biased against your candidate? Scrape several hundred news stories off the Web and have the turkers read them and rate them for positive or negative sentiment—all over a weekend.
9

Clearly Mechanical Turk, along with other potential crowdsourcing solutions, comes with some limitations—most obviously the representativeness and reliability of the turkers. To many people it seems strange that anyone would work for pennies on mundane tasks, and therefore one might suspect either that the turkers are not representative of the general population or else that they do not take the work seriously. These are certainly valid concerns, but as the Mechanical Turk community matures, and as researchers learn more about it, the problems seem increasingly manageable. Turkers, for example, are far more diverse and representative than researchers initially suspected, and several recent studies have shown that they exhibit comparable reliability to “expert” workers. Finally, even where their reliability is poor—which sometimes it is—it can often be boosted through simple techniques, like soliciting independent ratings for every piece of content from several different turkers and taking the majority or the average score.
10
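
The aggregation step described above can be sketched in a few lines of Python. This is only an illustration: the function names, the sentiment labels, and the ratings are invented for the example and are not drawn from the studies cited. It assumes each piece of content has been rated independently by an odd number of turkers, so that a majority always exists.

```python
from collections import Counter
from statistics import mean


def majority_label(labels):
    """Return the label chosen by the most raters.

    Assumes an odd number of raters and two possible labels, so ties
    cannot occur; with ties, the label seen first would be returned.
    """
    return Counter(labels).most_common(1)[0][0]


def average_score(scores):
    """Return the mean of several independent numeric ratings."""
    return mean(scores)


# Hypothetical ratings: three turkers rate each news story.
sentiment_ratings = {
    "story_1": ["positive", "positive", "negative"],
    "story_2": ["negative", "negative", "negative"],
}
consensus = {story: majority_label(votes)
             for story, votes in sentiment_ratings.items()}
```

A single careless rater is outvoted in "story_1", which is the sense in which redundancy can compensate for unreliable individual judgments.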

PREDICTING THE PRESENT

At a higher level, the Web as a whole can also be viewed as a form of crowdsourcing. Hundreds of millions of people are increasingly turning to search engines for information and research, spending ever more time browsing news, entertainment, shopping, and travel sites, and increasingly sharing content and information with their friends via social networking sites like Facebook and Twitter. In principle, therefore, one might be able to aggregate all this activity to form a real-time picture of the world as viewed through the interests, concerns, and intentions of the global population of Internet users. By counting the number of searches for influenza-related terms like “flu,” and “flu shots,” for example, researchers at Google and Yahoo! have been able to estimate influenza caseloads remarkably close to those reported by the CDC.11 Facebook, meanwhile, publishes a “gross national happiness” index based on users’ status updates,12 while Yahoo! compiles an annual list of most-searched-for items that serves as a rough guide to the cultural zeitgeist.13 In the near future, no doubt, it will be possible to combine search and update data, along with tweets on Twitter, check-ins on Foursquare, and many other sources to develop more specific indices associated with real estate or auto sales or hotel vacancy rates—not just nationally, but down to the local level.
14

Once properly developed and calibrated, Web-based indices such as these could enable businesses and governments alike to measure and react to the preferences and moods of their respective audiences—what Google’s chief economist Hal Varian calls “predicting the present.” In some cases, in fact, it may even be possible to use the crowd to make predictions about the near future. Consumers contemplating buying a new camera, for example, may search to compare models. Moviegoers may search to determine the opening date of an upcoming film or to locate cinemas showing it. And individuals planning a vacation may search for places of interest and look up airline costs or price hotel rooms. If so, it follows that by aggregating counts of search queries related to retail activity, moviegoing, or travel, one might be able to make near-term predictions about behavior of economic, cultural, or political interest.

Determining what kind of behavior can be predicted using searches, as well as the accuracy of such predictions and the timescale over which predictions can be usefully made, are therefore all questions that researchers are beginning to address. For example, my colleagues at Yahoo! and I recently studied the usefulness of search-query volume to predict the opening weekend box office revenues of feature films, the first-month sales of newly released video games, and the Billboard “Hot 100” ranking of popular songs. All these predictions were made at most a few weeks in advance of the event itself, so we are not talking about long-term predictions here—as discussed in the previous chapter, those are much harder to make. Nevertheless, even having a slightly better idea a week in advance of audience interest might help a movie studio or a distributor decide how many screens to devote to which movies in different local regions.
15

What we found is that the improvement one can get from search queries over other types of public data—like production budgets or distribution plans—is small but significant. As I discussed in the last chapter, simple models based on historical data are surprisingly hard to outperform, and the same rule applies to search-related data as well. But there are still plenty of ways in which search and other Web-based data could help with predictions. Sometimes, for example, you won’t have access to reliable sources of historical data—say you’re launching a new game that isn’t like games you’ve launched in the past, or you don’t have access to a competitor’s sales figures. And sometimes, as I’ve also discussed, the future is not like the past—such as when normally placid economic indicators suddenly increase in volatility or historically rising housing prices abruptly crash—and in these circumstances prediction methods based on historical data can be expected to perform poorly. Whenever historical data is unavailable or is simply uninformative, therefore, having access to the real-time state of collective consciousness—as revealed by what people are searching for—might give you a valuable edge.

In general, the power of the Web to facilitate measure-and-react strategies ought to be exciting news for business, scientists, and government alike. But it’s important to keep in mind that the principle of measure and react is not restricted to Web-based technology, as indeed the very non-Web company Zara exemplifies. The real point is that our increasing ability to measure the state of the world ought to change the conventional mind-set toward planning. Rather than predicting how people will behave and attempting to design ways to make consumers respond in a particular way—whether to an advertisement, a product, or a policy—we can instead measure directly how they respond to a whole range of possibilities, and react accordingly. In other words, the shift from “predict and control” to “measure and react” is not just technological—although technology is needed—but psychological. Only once we concede that we cannot depend on our ability to predict the future are we open to a process that discovers it.
16

DON’T JUST MEASURE: EXPERIMENT

In many circumstances, however, merely improving our ability to measure things does not, on its own, tell us what we need to know. For example, a colleague of mine recently related a conversation he’d had with the CFO of a major American corporation who confided that in the previous year his company had spent about $400 million on “brand advertising,” meaning that it was not advertising particular products or services—just the brand. How effective was that money? According to my colleague, the CFO had lamented that he didn’t know whether the correct number should have been $400 million or zero. Now let’s think about that for a second. The CFO wasn’t saying that the $400 million hadn’t been effective—he was saying that he had no idea how effective it had been. As far as he could tell, it was entirely possible that if they had spent no money on brand advertising at all, their performance would have been no different. Alternatively, not spending the money might have been a disaster. He just didn’t know.

Now, $400 million might seem like a lot of money not to know about, but in reality it’s a drop in the ocean. Every year, US corporations collectively spend about $500 billion on marketing, and there’s no reason to think that this CFO was any different from CFOs at other companies—more honest perhaps, but not any more or less certain. So really we should be asking the same question about the whole $500 billion. How much effect on consumer behavior does it really have? Does anybody have any idea? When pressed on this point, advertisers often quote the department-store magnate John Wanamaker, who is reputed to have said that “half the money I spend on advertising is wasted—I just don’t know which half.” It’s entirely apropos and always seems to get a laugh. But what many people don’t appreciate is that Wanamaker uttered it almost a century ago, around the time when Einstein published his theory of general relativity. How is it that in spite of the incredible scientific and technological boom since Wanamaker’s time—penicillin, the atomic bomb, DNA, lasers, space flight, supercomputers, the Internet—his puzzlement remains as relevant today as it was then?

It’s certainly not because advertisers haven’t gotten better at measuring things. With their own electronic sales databases, third-party ratings agencies like Nielsen and comScore, and the recent tidal wave of clickstream data online, advertisers can measure many more variables, and at far greater resolution, than Wanamaker could. Arguably, in fact, the advertising world has more data than it knows what to do with. No, the real problem is that what advertisers want to know is whether their advertising is causing increased sales; yet almost always what they measure is the correlation between the two.

In theory, of course, everyone “knows” that correlation and causation are different, but it’s so easy to get the two mixed up in practice that we do it all the time. If we go on a diet and then subsequently lose weight, it’s all too tempting to conclude that the diet caused the weight loss. Yet often when people go on diets, they change other aspects of their lives as well—like exercising more or sleeping more or simply paying more attention to what they’re eating. Any of these other changes, or more likely some combination of them, could be just as responsible for the weight loss as the particular choice of diet. But because it is the diet they are focused on, not these other changes, it is the diet to which they attribute the effect. Likewise, every ad campaign takes place in a world where lots of other factors are changing as well. Advertisers, for example, often set their budgets for the upcoming year as a function of their anticipated sales volume, or increase their spending during peak shopping periods like the holidays. Both these strategies will have the effect that sales and advertising will tend to be correlated whether or not the advertising is causing anything at all. But as with the diet, it is the advertising effort on which the business focuses its attention; thus if sales or some other metric of interest subsequently increases, it’s tempting to conclude that it was the advertising, and not something else, that caused the increase.
17
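
A toy calculation shows how budgeting alone can produce the correlation just described. The sketch below is purely illustrative: the quarterly sales figures, the forecasts, and the 10 percent budgeting rule are all invented, and by construction the sales numbers do not depend on advertising at all.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical quarterly sales, with a holiday peak in the last quarter.
# These are fixed numbers: advertising has no effect on them here.
sales = [100, 120, 90, 200]

# Hypothetical forecasts made before each quarter.
forecast = [95, 125, 92, 190]

# Assumed budgeting rule: spend 10 percent of anticipated sales on ads.
ad_spend = [0.10 * f for f in forecast]

r = pearson(ad_spend, sales)
print(f"correlation between ad spend and sales: {r:.2f}")
```

Because spending simply tracks anticipated sales, the two series move together almost perfectly even though, in this made-up world, the ads cause nothing.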

Differentiating correlation from causation can be extremely tricky in general. But one simple solution, at least in principle, is to run an experiment in which the “treatment”—whether the diet or the ad campaign—is applied in some cases and not in others. If the effect of interest (weight loss, increased sales, etc.) happens significantly more in the presence of the treatment than it does in the “control” group, we can conclude that it is in fact causing the effect. If it doesn’t, we can’t. In medical science, remember, a drug can be approved by the FDA only after it has been subjected to field studies in which some people are randomly assigned to receive the drug while others are randomly assigned to receive either nothing or a placebo. Only if people taking the drug get better more frequently than people who don’t take the drug is the drug company allowed to claim that it works.
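
The logic of such an experiment can be sketched in Python under some stated assumptions. The function names and all the numbers are hypothetical, and the two-proportion z-test used here is one common way to judge whether an outcome happens "significantly more" in the treatment group; the text does not prescribe any particular test.

```python
import math
import random


def assign_groups(units, seed=0):
    """Randomly split units into a treatment half and a control half."""
    shuffled = list(units)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def two_proportion_z_test(successes_t, n_t, successes_c, n_c):
    """Compare outcome rates in treatment and control groups.

    Returns (difference in rates, z statistic, two-sided p-value),
    using the pooled-variance normal approximation.
    """
    rate_t = successes_t / n_t
    rate_c = successes_c / n_c
    pooled = (successes_t + successes_c) / (n_t + n_c)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    if std_err == 0:
        # Everyone (or no one) had the outcome: no detectable difference.
        return rate_t - rate_c, 0.0, 1.0
    z = (rate_t - rate_c) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_t - rate_c, z, p_value


# Hypothetical experiment: 2,000 customers, half of whom see the ad.
treatment, control = assign_groups(range(2000), seed=42)

# Invented outcomes: 120 purchases among those shown the ad, 100 among the rest.
lift, z, p_value = two_proportion_z_test(120, len(treatment), 100, len(control))
print(f"lift={lift:.3f}, z={z:.2f}, p={p_value:.3f}")
```

With these invented numbers the treatment group buys somewhat more often, yet the difference falls short of the conventional 5 percent significance threshold, so the experiment would not license the claim that the ad worked.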

Precisely the same reasoning ought to apply in advertising. Without experiments, it’s actually close to impossible to ascertain cause and effect, and therefore to measure the real return on investment of an advertising campaign. Let’s say, for example, that a new product launch is accompanied by an advertising campaign, and the product sells like hotcakes. Clearly one could compute a return on investment based on how much was spent on the campaign and how many sales were generated, and that’s generally what advertisers do. But what if the item was simply a great product that would have sold just as well anyway, even with no advertising campaign at all? Then clearly that money was wasted. Alternatively, what if a different campaign would have generated twice as many sales for the same cost? Once again, in a relative sense the campaign generated a poor return on investment, even though it “worked.”
18
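
The gap between the return advertisers usually compute and the real return can be made concrete with a small calculation. The figures below are invented for illustration, and the "baseline" revenue (what would have been sold with no campaign) is exactly the quantity that, as argued above, only a control group could supply.

```python
def naive_roi(revenue, ad_spend):
    """Return on investment that credits all revenue to the campaign."""
    return (revenue - ad_spend) / ad_spend


def incremental_roi(revenue, baseline_revenue, ad_spend):
    """Return on investment that credits only revenue above the baseline.

    baseline_revenue is the counterfactual: what would have been sold
    without the campaign, as estimated from an unexposed control group.
    """
    return (revenue - baseline_revenue - ad_spend) / ad_spend


# Hypothetical launch: $100 spent on ads, $500 of sales, of which $450
# would have happened anyway (an assumed control-group estimate).
naive = naive_roi(500, 100)
real = incremental_roi(500, 450, 100)
print(f"naive ROI: {naive:.1f}, incremental ROI: {real:.1f}")
```

In this made-up case the campaign appears to return four times its cost, while the incremental view shows it lost half of what was spent.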

Without experiments, moreover, it’s extremely difficult to measure how much of the apparent effect of an ad was due simply to the predisposition of the person viewing it. It is often noted, for example, that search ads—the sponsored links you see on the right-hand side of a search results page—perform much better than display ads that appear on most other Web pages. But why is that? A big part of the reason is that which sponsored links you see depends very heavily on what you just searched for. People searching for “Visa card” are very likely to see ads for credit card vendors, while people searching for “Botox treatments” are likely to see ads for dermatologists. But these people are also more likely to be interested precisely in what those particular advertisers have to offer. As a result, the fact that someone who clicked on an ad for a Visa card subsequently signs up for one can only be partly attributed to the ad itself, for the simple reason that the same consumer might have signed up for the card anyway.
