On New York University Week: What is music and what is speech?
Andrew Chang, Leon Levy postdoctoral fellow, asks our brains to listen in.
Andrew Chang is a postdoctoral fellow at New York University, supported by the National Institutes of Health and the Leon Levy Scholarship in Neuroscience. He studies the neural mechanisms of auditory perception and how people use music and speech to interact in the real world.
Is It a Sound of Music…or of Speech?
How do our brains tell the difference between sounds that we experience as music and sounds that we interpret as speech? It might seem obvious that we rarely mix the two up, but the brain’s ability to make this distinction effortlessly and automatically is actually quite remarkable.
When we hear sounds, the acoustic waves activate the inner ear, and the signals are then transmitted to increasingly higher-order brain areas. The auditory cortex handles basic aspects of sound that are shared across all sound types. From there, the signals move to specialized regions for music or language, allowing us to recognize a melody as distinct from a sentence. But the exact mechanisms of this differentiation remain unclear.
Music and speech differ along several dimensions. The question is, how does the brain “decide” whether to route sounds to more language-relevant or more music-relevant areas?
The experiments in our study explored a simple property of sound called amplitude modulation, which describes how quickly a sound’s volume changes over time. Previous studies showed that speech has a consistent modulation rate of 4-5 hertz, while music has a slower rate of 1-2 hertz. We hypothesized that this basic pattern might be a universal biological signature.
To test this, we synthesized white noise audio clips with varying amplitude modulation rates and degrees of regularity. We asked a few hundred participants to judge whether each clip sounded “more like speech or music.” Analyzing which clips were perceived as music and which as speech revealed a very simple principle: slower, more regular amplitude modulation rates were typically perceived as music. By contrast, faster, irregular rates were mostly judged to be speech.
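
For readers who want a concrete picture of such stimuli, here is a minimal sketch in Python (using numpy) of how amplitude-modulated white noise at different rates and regularities could be synthesized. It is only an illustration of the general idea, not the study’s actual stimulus-generation code, and the function and parameter names are hypothetical.

```python
# Illustrative sketch (not the study's stimulus code): white noise whose
# loudness rises and falls at a chosen rate, with optional irregularity.
# Only numpy is assumed; all names below are hypothetical.
import numpy as np

def am_white_noise(duration_s=4.0, fs=22050, mod_rate_hz=2.0,
                   irregularity=0.0, seed=0):
    """White noise amplitude-modulated at roughly mod_rate_hz cycles per
    second; irregularity (0 = perfectly regular) jitters the rate."""
    rng = np.random.default_rng(seed)
    n = int(duration_s * fs)
    t = np.arange(n) / fs
    carrier = rng.standard_normal(n)                      # broadband noise
    # Let the instantaneous modulation rate drift slowly when irregular.
    drift = np.interp(t, np.linspace(0, duration_s, 16),
                      rng.standard_normal(16))
    inst_rate = np.clip(mod_rate_hz * (1 + irregularity * drift), 0.25, None)
    phase = 2 * np.pi * np.cumsum(inst_rate) / fs
    envelope = 0.5 * (1 + np.sin(phase))                  # slow volume change
    return envelope * carrier

# Slow and regular (tends to be heard as music-like, ~1-2 Hz) versus
# faster and irregular (tends to be heard as speech-like, ~4-5 Hz).
music_like  = am_white_noise(mod_rate_hz=1.5, irregularity=0.0)
speech_like = am_white_noise(mod_rate_hz=4.5, irregularity=0.6)
```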
These findings suggest that our brains use very basic acoustic features to help distinguish between speech and music. This discovery raises new questions about the brain’s efficiency and the evolutionary roles of speech and music. One unanticipated dimension is how our data might support the treatment of language disorders.