
Department of Linguistics

Theories of Hearing

Robert Mannell

Historically, there have been two competing theories of hearing: the resonance or place theory, and the frequency theory. Crude forms of the resonance theory can be found as far back as 1605, but the beginning of the modern resonance theory can be attributed to Helmholtz in 1857. The frequency theory can be dated back to Rinne in 1865 and Rutherford in 1880. These theories underwent a continuous process of modification through to the middle of the 20th century. An overview of the development of these theories can be found in Wever (1965) and Gulick (1971).

i. The Resonance or Place Theory

The Place theory, in its most modern form, states that the inner ear acts as a tuned resonator which extracts a spectral representation of the incoming sounds which it passes via the auditory nerve to the brainstem and the auditory cortex. This process involves a tuned resonating membrane, the basilar membrane, with frequency place-mapping. In other words, each position on the basilar membrane is associated with a particular characteristic frequency (CF). A tone of 500 Hz, for example, would stimulate most strongly that part of the basilar membrane which has a characteristic frequency of 500 Hz. Further, this mapping of frequency to place is systematic (approximately logarithmic over most of the membrane's length), with CF gradually decreasing as one moves from the oval window to the apex of the cochlea. This kind of frequency to place mapping is called tonotopic mapping and systems with this characteristic are said to display tonotopicity. The place encoded frequency pattern is passed to the brainstem and thence to the auditory cortex, both of which also display tonotopicity. A major premise of the place theories is that the frequency analysis is carried out in the inner ear, creating a neural spectrogram that is transmitted to the brain.

ii. The Frequency Theory

Early forms of the frequency theory were sometimes referred to as telephone theories and assumed that the auditory nerve passed complete time domain representations of the incoming acoustic signal to the brain in a manner analogous to the way the waveform of a speech sound is transformed into fluctuating voltages in a telephone line. This theory assumes that a complete time domain representation of the incoming waveform is directly encoded in the pattern of firings of the auditory nerve. This theory, in this simple form, assumes that the auditory nerve can fire at rates of 20 to 20,000 times per second. This is a necessary assumption if the theory is to be able to account for the frequency range of human hearing, 20-20,000 Hz. A major premise of the frequency theories is that the frequency analysis is not carried out in the inner ear, but that a time domain representation is transmitted to the brain and the frequency analysis is carried out in the brain.

iii. Challenges to the Frequency and Place Theories

During the second quarter of the 20th century it became increasingly clear that individual nerve fibres could only fire at rates of between 300 and 500 times per second. Clearly, the firing rate limitations of single neurons could not account for human perception of frequencies up to 20,000 Hz if one were to rely on the frequency theory. Groups of neurons, when examined together, were found to accurately encode in their firing rates frequencies up to 1000 Hz and, with less accuracy, frequencies up to 3000 to 5000 Hz. The volley principle was proposed to deal with this apparent anomaly between the behaviour of single neurons and groups of neurons.

The volley principle is based on a military procedure developed to deal with the problem of the slow reload times in old firearms such as the flint-lock rifle. Assume some hypothetical firearm that takes 30 seconds to reload. If 90 soldiers were lined up and told to fire simultaneously then there would be a 30 second lull between each hail of bullets during which the opposing force could advance. If the soldiers were divided into 3 ranks of 30 each, with each rank lined up one in front of the next (squatting, kneeling and standing) and each rank fired 10 seconds after the preceding rank, then the soldiers could, as a group, send a hail of bullets off towards the enemy every 10 seconds. In this way, during a thirty second cycle each group would be phase-locked ten seconds or a third of a cycle apart (i.e. locked at phases of 0°, 120°, and 240°). In the same way, groups of auditory fibres could be phase locked into some part of each cycle and could therefore fire in volleys. This could extend the range of frequencies which could potentially be encoded by the principles expounded in the frequency theory. Accurate frequency representations could therefore be produced from volleys of nerve fibres as long as the phase-locking of the groups could be maintained accurately.

To encode frequencies up to 20,000 Hz by phase locking groups of nerve fibres with individual firing rates of 300-500 Hz would require 40-67 groups of fibres phase locked about 5-9° apart. (1) The phase locking mechanism, however, is only accurate up to frequencies of about 1000 Hz (i.e. 2-3 volleys per cycle). Phase synchrony, or the accuracy of phase locking, gradually breaks down above this frequency, with resulting inaccuracy in the information that can be passed to the brain utilising nerve firing rates. The volley mechanism completely breaks down by about 3000 to 5000 Hz (i.e. by about 10 volleys per cycle).
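
The arithmetic behind the volley principle can be sketched in a few lines of Python. This is an illustration only, not a model of real auditory nerve behaviour: the function names are hypothetical, the number of groups is rounded to the nearest whole number as in footnote 1, and the groups are assumed to be spaced evenly across one cycle.

```python
def groups_needed(target_hz, fibre_rate_hz):
    """Number of staggered fibre groups needed to reach target_hz.

    Assumption: rounded to the nearest whole group, following footnote 1
    (e.g. 66.67 is taken as 67).
    """
    return round(target_hz / fibre_rate_hz)


def phase_spacing_deg(n_groups):
    """Phase offset in degrees between neighbouring groups,
    assuming the groups are spread evenly over one 360 degree cycle."""
    return 360.0 / n_groups


def volley_times(n_groups, cycle_s, n_cycles):
    """Firing times in seconds when n_groups fire in rotation,
    each group firing once per cycle_s seconds."""
    offset = cycle_s / n_groups
    return sorted(
        group * offset + cycle * cycle_s
        for cycle in range(n_cycles)
        for group in range(n_groups)
    )


if __name__ == "__main__":
    # The soldier example: 3 ranks, 30 second reload time.
    print(phase_spacing_deg(3))      # 120.0 degrees between ranks
    print(volley_times(3, 30, 2))    # one volley every 10 seconds

    # The auditory case: 20,000 Hz target, fibres firing at 500 or 300 Hz.
    for rate in (500, 300):
        n = groups_needed(20000, rate)
        print(rate, n, round(phase_spacing_deg(n), 1))
```

With three ranks the sketch reproduces the 0°, 120° and 240° phases of the soldier example, and for the auditory case it gives the 40 to 67 groups spaced roughly 5° to 9° apart quoted in the text.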

Examine figure 34 from Wever (1965, p. 173) for a diagrammatic representation of the volley principle and a demonstration of how differences in intensity can be represented by this principle.

Whilst the volley principle could be utilised to extend the frequency theory up to frequencies as high as 3000 to 5000 Hz, it clearly could not explain how we are able to hear frequencies up to 20,000 Hz. The frequency theory could not provide an explanation for our perception of high frequency sounds.

To what extent can the place theory account for the totality of human frequency perception? Large numbers of experiments have examined the resonant characteristics of the basilar membrane and the electroacoustic responses (2) of the attached auditory nerve fibres. These experiments have examined the response of the basilar membrane and the auditory nerve to the complete range of frequencies audible to the human ear. Some more recent experiments (see Moore, 1986, chapter 3, esp. p. 140) have examined the electroacoustic responses of individual auditory nerve fibres to a wide range of frequencies and intensities in order to determine the frequency response or filter shape at different characteristic frequencies. Such studies have determined the filter characteristics of the auditory system, and thus characteristic frequencies, down to about 120 Hz. Characteristic frequencies have been difficult to determine below that frequency, and it is not clear whether place plays a significant role in the perception of such low frequencies, whilst it is clear that the frequency principle is capable of dealing with them. On the other hand, the place principle can be clearly shown to operate up to the maximum audible frequency of 20,000 Hz.

The place principle cannot, however, fully account for our ability to discriminate between two tones of similar frequency (frequency or pitch discrimination). Frequency discrimination refers to the smallest difference in frequency between two tones presented sequentially that can just be heard as a change in frequency. These differences are sometimes referred to as just noticeable differences or frequency jnds. Our ability to discriminate frequencies below 1000 Hz is much finer than the bandwidth of the auditory filter at those frequencies. There has been much argument over the last 30 years as to whether the frequency discrimination characteristics of the human auditory system can be explained by the response characteristics of the basilar membrane (i.e. by the place theory). Some researchers (Zwicker and Fastl, 1990) strongly support the notion that frequency discrimination can be completely accounted for by the place principle and go so far as to estimate the number of hair cells (5.6) responsible for each frequency jnd step. These conclusions depend, however, on the way the experiments are designed. When the experiment is based on continuous changes in frequency (frequency modulation) then Zwicker's observations are supported. When the experiments are based on the presentation of a single tone followed by a second tone then much smaller jnds are observed. Moore and Glasberg (in Moore, 1986) have shown that Zwicker's conclusions are not accurate and that frequency discrimination at low frequencies is not related to place. Their conclusions have been confirmed by studies of people with hearing loss whose place mechanisms have been seriously distorted but whose frequency discrimination is virtually normal. It now seems clear that the place mechanism is not the mechanism responsible for pitch discrimination at frequencies below 1000 Hz.

It is now generally assumed that hearing (as defined by measures of frequency discrimination) is dominated by the frequency principle at low frequencies and the place principle at high frequencies, with both principles contributing at mid frequencies.

Bibliography

Gulick, W.L. (1971) Hearing: physiology and psychophysics, New York : Oxford University Press, QP461.G84

Moore, B.C.J. (ed.) (1982) An introduction to the psychology of hearing, London : Academic Press, BF251.M66/1982

Moore, B.C.J. (ed.) (1986) Frequency selectivity in hearing, London : Academic Press, BF251.F73

Wever, E.G. (1965) Theory of hearing, New York : Wiley (reprint of 1949 edition), QP461.W47

Zwicker, E. and Fastl, H. (1990) Psychoacoustics: facts and models, Berlin : Springer-Verlag, QP461.Z92/1990

Footnotes

1. These numbers are derived from the following: 20000Hz/500Hz = 40 and 20000Hz/300Hz = 66.67 (but say 67), further 360°/40 = 9° and 360°/67 = 5.4° (but say 5°).

2. An electroacoustic response is the electrical response of a neuron to a sound presented at the ear. It is usually measured by a probe inserted into the cell.