The Universal Sense
Seth Horowitz

Sound can be very effective at generating a negative response, but the world is full of sounds that evoke other emotions. If you’ve ever been to New York’s Penn Station, near where the Long Island Rail Road waiting area is, you know
that it is not a particularly pretty environment. But one day in the dead of winter when I was waiting for the eternally late train, I suddenly heard birds chirping and singing. My first thought was that some local robins had flown into the station and were singing happily because they were not freezing to death in Central Park. But upon looking around and seeing no birds, I spotted a well-hidden speaker. It was playing a long, looped recording that sounded like a natural, if sparse, birdsong chorus. It was a genius piece of emotional engineering—play sounds associated with the countryside and spring mornings while people are surrounded by urine-drenched pillars and waiting for a train that will no doubt be several years late, and you will take the edge off the tension of hundreds of stressed-out commuters.

The basis for positive emotions, whether triggered by sound or anything else, is more complicated to understand. There is no simple anatomical region that underlies positive or complex emotions. This is probably because fear and other negative emotions drive survival behaviors such as running away or fighting, while more complex ones come with developmental, behavioral, and cultural baggage. What is erotic or intriguing or relaxing to me might be dull or horrifying for you. There have been a lot of attempts at modeling complex emotions such as love using animal models—after all, it’s very difficult to get an experimental protocol past a review committee by saying, “We’re going to drop electrodes into people’s brains to identify love areas and inject neural tracers, then slice up their brains to see where love lies.” Scientists have examined maternal care for offspring in everything from rodents to sea lions, pair bonding
in prairie voles (who seem to release the same neurotransmitter, oxytocin, as humans involved in positive social interactions), even mourning behavior in elephants and primates.

But in any of these comparative studies, it’s hard to jump the species barrier for anything more complex than fear, anger, or other reactive emotions—we simply don’t know whether other species experience more subtle feelings. Is a mouse happy when it hears the bell sound that means a treat is forthcoming? Or is it a simple association with a forthcoming reward? The closest neuroscience has come to identifying the sources of complex and positive emotions is identification of reward pathways and systems, such as the septal nuclei and the nucleus accumbens. Both structures are highly involved in pleasure-seeking behaviors. Back in 1954 James Olds and Peter Milner demonstrated that if you implanted stimulating electrodes in the septal nuclei, a rat would continuously stimulate itself, even ignoring food and water. The nucleus accumbens is highly responsive to drugs such as cocaine and heroin and is seen as an important site for generating the reward in satisfying addictive behaviors. Both of these sites are intricately connected with all the deep cortical structures that underlie subconscious motivation. The nucleus accumbens also receives a great deal of dopaminergic input from the ventral tegmental area, a region that connects bidirectionally to all areas of the brain, including those in the deep brain stem that connect to the cochlear nucleus. The presence of a relatively fast deep connective route from the basic auditory nuclei combined with massive inputs from all other regions of the cortex is similar to the arrangement found in the amygdala, and a study in 2005 showed that the nucleus accumbens plays a role in changes in emotional state induced by music.

But therein lies the problem. Neural imaging allows you to look into the living human brain without doing things that will get you prosecuted under the Geneva Conventions. Much of the current crop of neuroscience research looking at positive emotions arises from neural imaging studies using fMRI, which shows where there is an increase in blood flow in specific brain regions based on presented stimuli. A great many studies have been run that claim to examine the underlying basis of things such as love and attachment, but all too often these claims end up blowing up in the claimant's face because, to put it simply, complex stimuli elicit complex reactions, and fMRI is a crude tool for measuring them. One of the more glaring examples was reported not in a scientific paper but in a New York Times op-ed piece by Martin Lindstrom, a well-known consultant who has done some interesting work in the field of neuroeconomics, the study of how we decide what to buy. The study he reported on involved examining the response of young men and women to the sound or image of a ringing and vibrating iPhone, and the claim was made that subjects showed activation in both visual and auditory cortexes regardless of whether they had heard the iPhone or seen it. This was intended as proof that they were undergoing multisensory integration. The piece went on to claim that because the most activity was seen in the insular cortex, a region that some studies have shown is associated with positive emotion, the subjects loved their iPhone.

This is the type of claim that should make us question neural imaging studies, especially those applied to commercial interests, and in fact it inspired more than forty scientists to write a reply to the paper highlighting the significant problems with this study. First of all, identifying the insular cortex as being "associated with feelings of love and compassion" is not useful, as this region of the brain gets activated in about one-third of all neuroimaging studies. Second, the insular cortex—like most other parts of the brain—is involved in a lot of brain-directed activities, ranging from controlling heart rate and blood pressure to telling if your stomach or bladder is full. In fact, the insular cortex is involved in almost every inward-oriented process we go through. So identifying it as the place that makes you love your iPhone made for a great marketing moment but a very poor scientific claim.

As much as we neuroscientists love our tract tracing, EEGs, fMRIs, and PET scans, the complex mind is still much of a black box. When we want to determine what our subjects feel, we sometimes have to fall back on a good old-fashioned technique: asking them. Gathering subject response by questionnaire is even more top-down than running someone through an fMRI machine, but it gives you a handle on the actual cognitive or emotional response rather than worrying if you are identifying the right chunk of brain. Still, questionnaires are fraught with their own limitations. Questions have to be carefully constructed to avoid biasing the subjects. Subjects may answer based on how they think they are supposed to answer rather than how they actually feel. The environment they are being tested in, a lab or a classroom, may have its own emotional associations for them, so their answers are altered by their environmental context. Sometimes what they actually feel can't be described by a simple linear scale from "pleasant" to "unpleasant," or "arousing" to "calming." On the other hand, the good thing about this technique is that you can pay a lot of volunteers to give responses for what it would cost to run even one fMRI scan, and large numbers provide a lot of statistical power when trying to answer complex questions about the mind. (Plus these statistics then form the basis for running the more expensive tests.)

Not surprisingly, sounds are among the most common and powerful stimuli for emotions. There are a number of standardized psychological databases that assign emotional valence to different non-verbal sounds.
These databases have been used for decades and have accrued so much data that they are used as the foundation for studies employing other, more technologically oriented techniques, from EEG to fMRI. But even with a technique this basic, you run into the problem of trying to apply operational definitions to subconscious responses. Imagine you are a fifteenth-century barber/alchemist/wizard who is really interested in categorizing emotional responses to sounds. Lacking a digital recording device, you have your subjects sit in a dark room with a curtained window while you make sounds out of their view. They consider the sounds of crashing armor from a joust to be highly arousing and pleasurable, whereas the low grunting of a wild boar is frightening. Then play these sounds for someone in the twenty-first century (who does not spend a lot of time watching Game of Thrones). The crashing armor would probably be considered arousing but annoying, whereas the wild boar call becomes the sound of just some animal rather than the thing that killed half your village. For example, in one database, the sound that was rated as most pleasurable was a ten-second sample of the intro to Joan Jett's "I Love Rock 'n' Roll." This seems like an odd choice for most pleasurable sound—except the database came out in the late 1990s and the song was a big hit in 1981, when the people who were the sources of the data were largely impressionable preteens with access to MTV. It's pretty unlikely that this particular sound will stand the test of time, but it's pretty likely that one of the ones rated most unpleasant—the sound of a child being slapped and then crying—will continue to be rated the same way.

Sounds that are familiar will be processed faster, and those that involve things important to humans in general, such as hearing one of our young being made unhappy, will generate a stronger emotional response even if you have been tempted to slap a kid yourself after listening to it crying on a plane for six hours. However, another study by Melanie Aeschlimann and colleagues in 2008 pointed out that using a wide variety of sounds of different lengths and loudness could introduce too many processing artifacts. Since you are able to respond to sounds in hundredths of a second or less, a ten-second-long sample could cause a listener to respond only to the last few seconds or to respond based on some sort of internal summing mechanism. Perhaps the rating "most unpleasant" was based not on hearing the sound as child abuse but rather on a previous experience of a child's sustained crying. This study proposed a completely different database based on samples two seconds long, with a somewhat different rating metric. The samples were not complex sounds or speech, but rather human non-linguistic sounds (screams, laughs, erotic sounds) or non-human sounds such as alarm clocks. Using these very short samples, the researchers found that there was a lot of overlap with ratings of longer samples, indicating that the emotional valence kicks in very rapidly. However, a few interesting things popped up. First, sounds that had negative emotional responses tended to be perceived as louder even though they were at the same amplitude. Second, the strongest ratings were associated with those sounds labeled as emotionally positive. And finally, the sounds with the strongest emotional response in any category were human vocalizations.

Sounds that evoke the strongest emotional response tend to be those from living things, especially other humans. Mechanical or environmental sounds tend to grab your attention but are usually limited in how far they will take your emotional response, unless they tell you of specific dangers you want to confirm visually (such as the sound of a rockslide) or unless they have strong associations (such as the sounds of waves at a beach). Sounds deliberately made by living things are almost always communicative—the dog’s growl, the frog’s croak, the baby’s cry—and are almost always harmonic (with variations from harmonicity bringing their own emotional response, such as cringing when someone screams).

The frequencies that make up human sounds (including but not limited to speech) and sounds from animals about our own size are in our most sensitive region, so it is easier for us to hear these sounds—they will jump out of the background. In addition, sounds that we have heard before are more easily identified and reacted to more quickly. Add on top of that the fact that we process low-level sensory information such as changes in tone and loudness faster than we do complex input such as speech, and we begin to understand why we can react quickly to the emotional content of sound. You don't need to have been bitten by a dog to know that a low growl is menacing or have been mugged by a squirrel to know that a harsh barking sound means you're too close to its territory. Overall, any communication is about first evoking an emotional response on the part of the listener; humans just glue semantic content on top of it, giving us a tenuous hold on the title of most intelligent beings on Earth.

I will not delve into the intricacies of human speech (and thus triple the size of this book), but I will note that the emotional basis of communication relies not on what is said but on the acoustics of how we say something, independent of the formants of speech and, to some extent, of what language is spoken. This flow of tones, called prosody, was first discussed by Charles Darwin as a predecessor of human speech, and to some degree the idea still holds up today. Both neural imaging and EEG studies have demonstrated that prosody is processed not in Wernicke's area of the brain (which underlies speech comprehension) but rather in the right hemisphere. This is the side opposite to where the linguistic processing centers are in most people, but it is the region more important for contextual, spatial, and emotional processing.

For a simple demonstration, just say the word "yes." Now say it as if you had just found out you won the lottery. Now say it as if someone had just asked you a question about something in your past that you thought no one knew about. Now say it as if this is the fortieth time you've answered "yes" in a really boring human resources interview about how much you love your job. Lastly, say it as if you were just forced to agree to a really horrible contract in order to keep from losing your job. Linguistically, you indicated affirmation every time, but each time the emotional meaning differed. What you changed was how you said the word: overall pitch, loudness, and timing. And someone listening to you who was also a native speaker of your language would get the subtext, which is sometimes even more important than the ostensible meaning of the utterance. For example, think about how hard it can be to understand some synthesized speech patterns, particularly older ones. Speech synthesizers from the 1980s until the late 1990s were pretty much playbacks of recorded phonemes with no reliance on prosodic flow—in other words, the speech sounded like it was coming from a robot. Even today, after millions of dollars have been poured into making synthetic speech sound more "human," you can still hear the difference between a human voice and, say, the iPhone "operator" Siri.
