May 17, 2018
As a speech scientist, I never thought I’d see so much excitement on social media about one tiny little word.
The clip, which went viral after being posted on Reddit, is polarizing listeners who hear a computer voice say either “Laurel” or “Yanny.” @AlexWelke tweeted, “This is the kinda stuff that starts wars.” While I can’t prevent a war, I can explain some reasons why this sound file has created such a controversy. Basically, the “word” relies on some tricks of acoustics. Your brain, like the brains of millions of other Twitter users, is responsible for the rest.
Kudos to University of Minnesota speech-language researcher and professor Ben Munson for his original analysis explaining how the acoustic file can lead listeners to one of two conclusions. He used spectrographic analysis to demonstrate how the sound file might create confusion.
The discrepancy in what people hear comes down to a few different possibilities, none of which settles the question for certain. Clearly, though, one reason the clip is so tricky is that it is synthesized, which makes it different from real speech. It’s akin to the synthetic flavors encountered in the candy world – think Jelly Belly Buttered Popcorn, the preference for which is as polarizing as this Yanny/Laurel thing.
Without a doubt, all this confusion is only possible because of the consonants in “Yanny” and “Laurel.” The “y,” “n,” “l” and “r” sounds are really the chameleons of speech. The way one pronounces them morphs based on the sounds that come before and after them in a word. Because of this, it is the brain of the listener that decides their identity, based on context. In this case, the sound is missing a few elements, and your brain automatically fills in the gaps. This process, called interpolation, is similar to the way you can easily read partially erased text.
“What do you hear?! Yanny or Laurel” pic.twitter.com/jvHhCbMc8I (Cloe Feldman, @CloeCouture, May 15, 2018)
For the life of me, I can only hear “Laurel,” and the reason is a phenomenon called categorical perception. Originally described in 1957 and supported by countless additional studies, the idea is that your brain naturally sorts things into categories.
For example, my husband and I can never agree on the color of our couch (definitely green, not black, by the way). There is a smooth continuum between very dark green and black, but the boundary between the two varies from person to person. While we could agree that our couch looks blackish green, there is no such compromise in the perception of speech. Without conscious effort, our brains decide what our ears are hearing. Black or green, not blackish green. Yanny or Laurel, not some blend.
Whatever your brain tells you about Yanny/Laurel, the whole controversy should help everyone understand why it’s so hard to have a conversation in a noisy restaurant or why people with hearing loss sometimes “mishear” what you have to say. Listening to speech feels like a basic skill, but understanding speech is really an amazing feat. People perceive messages using the information available, which is sometimes incomplete. Our brains also make predictions based on past experiences. Listening in a foreign language, even if you are a fluent speaker, is challenging because your brain makes predictions based on both languages but is unduly influenced by your experiences with your native language.
This internet hullabaloo underscores the marvelous, effortless, constant work of the human brain.