Department of Linguistics

SPEECH ACOUSTICS

Frequency Transformations of Speech
Frequency Selectivity

Robert Mannell

Overview of Frequency Selectivity

Frequency selectivity (or resolution) describes the ability of a listener to discriminate two peaks in a complex spectrum. If two peaks are separated by more than the auditory frequency resolution (for that part of the spectrum) then those two peaks independently contribute to the perceived sound quality of the complex sound. If two peaks are separated by less than the auditory frequency resolution then they are effectively perceived as a single broader peak.

The auditory system of a normally hearing person is characterised by finer frequency resolution at low frequencies and progressively coarser frequency resolution at higher frequencies. The frequency selectivity of the auditory system is related to the "bandwidth of the auditory filter" at each point on the basilar membrane. Each inner hair cell on the basilar membrane responds best to its characteristic frequency and, to a lesser extent, to surrounding frequencies. An inner hair cell's response pattern to sound is a direct consequence of the mechanical response of the position on the basilar membrane to which it is attached. An inner hair cell is a filter in the sense that it responds selectively to a certain range of frequencies and not to frequencies outside that range. The range of frequencies that a hair cell responds to is referred to as the bandwidth of the auditory filter at that point on the basilar membrane.

Attempts to measure the frequency selectivity of the auditory filter are attempts to measure the bandwidth of the auditory filter at different points on the basilar membrane.

Older research on the frequency selectivity of the auditory filter (dating from the late 1950s) measured critical bandwidths (CB) using a number of techniques. Once the critical bandwidths at various parts of the basilar membrane have been determined, it is a simple matter to derive a frequency selectivity scale from them. This is done by placing contiguous, non-overlapping auditory filters side by side, starting from 0 Hz. The scale derived from the CB measurements is called the Bark scale. 1 Bark is one CB from 0 Hz, 2 Bark is two CBs from 0 Hz, and so forth up to about 24 Bark. Remember that each successive CB is broader in Hertz than the band immediately below it, so whilst each auditory filter has a bandwidth of 1 Bark, the bandwidth in Hertz increases as filter centre frequency increases.
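The stacking procedure described above can be sketched in a few lines of Python. This is only an illustration: it assumes the analytic critical bandwidth approximation of Zwicker and Terhardt (1980), which this page does not itself cite, and the function names and the 15500 Hz upper limit are choices made for the example rather than part of the original measurements.

```python
def critical_bandwidth_hz(f_hz):
    """Approximate critical bandwidth in Hz at centre frequency f_hz.

    Assumed formula (Zwicker & Terhardt, 1980); not taken from this page.
    """
    f_khz = f_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69


def stack_critical_bands(f_max_hz=15500.0):
    """Return the edges (Hz) of contiguous, non-overlapping critical bands from 0 Hz."""
    edges = [0.0]
    while edges[-1] < f_max_hz:
        lower = edges[-1]
        width = critical_bandwidth_hz(lower)  # initial guess for the band width
        # Refine until the width matches the CB at the band's own centre frequency.
        for _ in range(50):
            width = critical_bandwidth_hz(lower + width / 2.0)
        edges.append(lower + width)
    return edges


edges = stack_critical_bands()
# Band number n ends n critical bandwidths above 0 Hz, i.e. at n Bark.
for bark, (lo, hi) in enumerate(zip(edges, edges[1:]), start=1):
    print(f"{bark:2d} Bark: {lo:7.0f} to {hi:7.0f} Hz (width {hi - lo:6.0f} Hz)")
```

Each printed band is one Bark wide by construction, while its width in Hertz grows with centre frequency.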

More recent studies (dating from the mid-1980s) have determined auditory bandwidths more accurately using noise masking techniques. The filter bandwidths derived from these studies are referred to as Equivalent Rectangular Bandwidths (ERBs) and the derived frequency selectivity scale is referred to as the ERB-rate scale (although, in practice, this is often also referred to as the ERB scale).
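As a rough illustration of the difference between an ERB (a bandwidth in Hz) and an ERB-rate value (a position on the derived scale), the sketch below uses the approximation published by Glasberg and Moore (1990). That formula is an assumption here: this page does not state which ERB equation underlies its figures, and the function names are illustrative only.

```python
import math


def erb_hz(f_hz):
    """Equivalent rectangular bandwidth in Hz at centre frequency f_hz.

    Assumed formula (Glasberg & Moore, 1990); not taken from this page.
    """
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def erb_rate(f_hz):
    """ERB-rate: the number of ERBs lying below f_hz (same assumed source)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


for f in (100, 500, 1000, 4000, 8000):
    print(f"{f:5d} Hz: ERB = {erb_hz(f):6.1f} Hz, ERB-rate = {erb_rate(f):5.2f}")
```

As with the Bark scale, one unit on the ERB-rate scale corresponds to one filter bandwidth, so equal steps in ERB-rate cover progressively more Hertz as frequency rises.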

Figure 1: Relationship between Critical Bandwidths (CB) on the Bark scale and Equivalent Rectangular Bandwidths (ERB) on the ERB-rate scale.

Figure 1 shows that the ERB is always narrower than the CB, and that the two diverge increasingly below about 600 Hz. Taking the ERB as the more accurate measure, the Bark scale becomes progressively less accurate as frequency decreases below that point.

Figure 2: Ratio of Equivalent Rectangular Bandwidths (ERB) to Critical Bandwidths (CB).
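A ratio of this kind can be approximated numerically. The sketch below is not the data behind Figure 2: it assumes the Zwicker and Terhardt (1980) formula for the CB and the Glasberg and Moore (1990) formula for the ERB, neither of which is given on this page, and the function names are hypothetical.

```python
def critical_bandwidth_hz(f_hz):
    """Assumed CB approximation (Zwicker & Terhardt, 1980)."""
    f_khz = f_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69


def erb_hz(f_hz):
    """Assumed ERB approximation (Glasberg & Moore, 1990)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def erb_to_cb_ratio(f_hz):
    """Ratio of ERB to CB at the same centre frequency."""
    return erb_hz(f_hz) / critical_bandwidth_hz(f_hz)


for f in (100, 200, 400, 600, 1000, 2000, 4000, 8000):
    print(f"{f:5d} Hz: ERB/CB = {erb_to_cb_ratio(f):.2f}")
```

Under these assumed formulas the ratio stays below 1 across the audible range and falls off towards low frequencies, which is the pattern described for Figures 1 and 2.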

Bark-scaled and ERB-scaled Representations of Speech

The links in the following table will take you to Hertz, Bark and ERB spectrograms and spectra of some simple speech tokens.