Department of Linguistics


Impressionistic Studies of Australian English Phonetics

Robert Mannell


Comments About This Topic

Most of this topic is provided as background reading only. The discussion on early attempts at descriptions of Australian English, up to and including Sidney Baker, should be considered to be non-assessable historical background. The studies by Mitchell and Delbridge are more relevant to the content of this subject. The material on nasality, assimilation, deletion and intonation is also only provided as non-assessable background material. The focus of the module to which this material belongs is the vowel system of Australian English and students should focus particularly on the vowels when reading this material.

What you should understand from this material is:

  • The nature of Impressionistic Phonetics and its strengths and shortcomings
  • The general findings regarding the nature of Australian English vowels (particularly the six vowels that differ greatly between varieties of Australian English)
  • The historical background behind the symbols traditionally used for the phonemic (broad) transcription of Australian vowels
  • The division of Australian English into Broad, General and Cultivated varieties. You should understand that these "varieties" are not distinct and separate but represent an arbitrary division of a dialect continuum.

Descriptions of A.E. Pronunciation by non-linguists.

Attempts at the deciphering of early comments and descriptions of Australian English are fraught with many problems. Many non-linguists made comments about the pronunciation of the inhabitants of the distant colonies in Australia and these were often written up in various publications devoted to the impressions and experiences of travelers. For the most part, the comments of travelers about A.E. speech tended to be uncomplimentary and often included various unfounded notions about the nature or causes of the A.E. pronunciation that persist in the popular mythology to this day. It was often noted that A.E. speech tended to be "slovenly", poorly articulated, "nasal", etc. and such afflictions were variously ascribed to the climate, excessive pollen in the air, alleged laziness of the people, inferiority complexes, social inhibition, etc.

Another possible source of information might be expected in the works of fiction writers (eg. Henry Lawson) who attempted to capture the Australian vernacular in the orthography. Such attempts persist to this day (eg. "Let Stalk Strine" by Afferbeck Lauder), and it is not difficult to see that they vary from writer to writer, are extremely difficult to interpret phonetically, and usually do little more than highlight the alleged non-standard nature of the pronunciations they attempt to mimic.

Another source of information about early A.E. pronunciation is denied us because English spelling had been standardised well before the advent of the British invasion of Australia. Because of this, letters, diary accounts etc. cannot be utilised since those who were literate were generally scrupulous in their attempts to spell correctly and so their spelling can't throw much light on their pronunciations as similar sources often do for far more ancient languages (eg. Greek, Latin, Old English).

Impressionistic phonetics

Before a description of the pronunciation of a language or dialect can be carried out there is a need for a consistent system of symbols to represent that speech. Ideally, such a system would have one (and only one) symbol for each sound and one sound for each symbol. It takes only a little reflection to realise that such systems are many in number and have been in existence for a very long time. The earliest forms of alphabet were probably designed with this goal in mind. Unfortunately, such systems have a tendency to become conservative and often do not keep up with actual pronunciation and phonological changes in a language. Quite often this means that new sounds incorporated into a language are represented by a symbol that already belongs to another sound. This is particularly true when two allophones of a single phoneme begin to be contrasted in the language. For example, some dialects of Arabic now have a contrast between a "normal l" [l] and an "emphatic l" [lˁ] (ie. secondary pharyngeal articulation). Both are represented in orthography by a single character as this represents the phonology of the parent classical Arabic. Conversely, sometimes two speech sounds can either lose their contrastiveness and become incorporated into a single phoneme (eg. [k] and [c] in English) or a sound can be lost altogether (eg. the emphatic or pharyngealised "s" [sˁ] of Classical Hebrew, analogous to a similar sound still spoken in Arabic, has been lost in Modern Hebrew). The symbol that originally referred to the lost or incorporated sound is often maintained and used to refer to the remaining sound or phoneme (eg. "k" and "c" in English, and more than one symbol referring to [s] in Modern Hebrew). A further problem arises when spelling itself becomes fossilised and no longer represents the current pronunciation of a word but rather an archaic one.
In English, much of this standardisation of spelling occurred between the times of Chaucer and Shakespeare. In Arabic, spelling was standardised at the time of the writing of the Qur'an in the 7th century A.D. Some languages maintain a fairly "phonetic" orthography, but very often the orthography has only an approximate match to the sound structure of a language or dialect. Further, no orthography is equipped to represent slight interpersonal variation, regional variation, and the complete picture of the language's allophonic variation.
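The ideal of one symbol per sound and one sound per symbol can be stated as a simple check on a letter-to-sound table. The following is only a minimal sketch: the table is hypothetical and heavily simplified, and the [s] value for "c" (as in "city") is an assumption added for illustration rather than something taken from the discussion above.

```python
from collections import defaultdict

# Hypothetical, heavily simplified letter-to-sound table (illustration only).
spelling = {
    "k": {"k"},       # as in "kit"
    "c": {"k", "s"},  # as in "cat" and "city" (the [s] value is an added assumption)
}

def one_to_one_violations(letter_to_sounds):
    """Return (letters with several sounds, sounds shared by several letters)."""
    sound_to_letters = defaultdict(set)
    for letter, sounds in letter_to_sounds.items():
        for sound in sounds:
            sound_to_letters[sound].add(letter)
    ambiguous_letters = {l: s for l, s in letter_to_sounds.items() if len(s) > 1}
    shared_sounds = {s: l for s, l in sound_to_letters.items() if len(l) > 1}
    return ambiguous_letters, shared_sounds

ambiguous, shared = one_to_one_violations(spelling)
print("letters with more than one sound:", ambiguous)
print("sounds with more than one letter:", shared)
```

An orthography meets the ideal only if both results are empty. Even this two-letter fragment of English fails in both directions.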

For all of the above reasons, it is necessary to formulate a system that attempts to represent all of the sounds that are utilised in all human languages in either a contrastive or a non-contrastive manner. In the European tradition there have been several attempts to devise such a system with varying success (eg. Robert Robinson (1617), Melville Bell's "visible speech" (1867) and A.J. Ellis' "glossic" (~1870's)), and this culminated in the development of the International Phonetic Alphabet (IPA) in the late 19th century. The refinement of IPA continues to this day. A particularly useful idea was the development of a system of standard reference vowels which could be used to define the quality of any vowel. Vowels occupy a continuum of possible articulations and using such a standard system it is possible to locate any vowel relative to the closest standard vowel. The best known such system today is Daniel Jones' system of "cardinal vowels".

Impressionistic phonetics, to be practiced successfully, requires that the intending practitioner be thoroughly trained by a teacher already skilled in transcription. Ideally the practitioner must not only be able to hear the sounds but also to pronounce them as well. It is extremely undesirable to try to learn such a system from a book without the aid of at least some sort of recorded version of the sound to which each symbol refers. Instructions such as "similar to the French vowel in the second syllable of ... " are really not adequate guides to the accurate acquisition of transcription skills. In spite of this the first two attempts at description of Australian English were by authors who had learnt phonetic transcription from books. As will be seen below, only one of these two attempts could be considered to be reasonably successful.

Impressionistic phonetics is predominantly an auditory skill requiring the practitioner to listen carefully to the utterance to be transcribed and then to record the utterance using an appropriate phonetic script. Additionally, the impressionistic phonetician may use sight to confirm auditory impressions (eg. lip posture). The transcription is only as good as the phonetician's skill and the only way a reader has of determining the extent of that skill is through awareness of reputation, comparisons with the transcriptions of other phoneticians etc. For this reason most phoneticians tend to be wary of transcriptions made by people not trained by an established institution or scholar.

The chief disadvantage of impressionistic phonetics (ie. compared to instrumental phonetics) is this element of potential unreliability. It tends to lack objectivity. An instrumental (acoustic) study by Ladefoged (1967) of the production of cardinal vowels by 11 highly trained phoneticians showed considerable variation between their attempts and even overlap between adjacent cardinals. This is in spite of the fact that all the measured vowels were judged by Daniel Jones (the originator of the most commonly used cardinal vowel system) to be good productions of the target vowel. This may be due, at least in part, to the complex nature of speech perception, and especially to the apparent ability of a listener to "normalise" a speaker's vowels (which will differ with different vocal tract dimensions) to fit into a standard template (mental representation) in which the effects of differing vocal tract dimensions have been removed. An informal study (ca. early to mid 1990s) by the present author of the production of the cardinal vowels by an Australian phonetician may suggest a further process at work. The speaker produced several sets of cardinal vowels and he and two other trained and experienced phoneticians then selected from these the best attempts or, if unsatisfied, requested further recordings. It was particularly interesting to note the relative position of the third cardinal. This was found to be significantly higher than that of the British phoneticians. This allowed the A.E. /æ/ vowel to fit between cardinal 3 and 4. When this AE vowel is mapped onto the cardinal vowel productions of Daniel Jones it is found to be very close to his version of cardinal 3. When it is considered that, for many speakers, the A.E. /æ/ is higher than the British /æ/ the possibility arises that even trained phoneticians are not immune from the influences of their own dialects when it comes to the learning of the cardinal vowels.
We are told that /æ/ is to be found between cardinal 3 and 4 (as it is in British English) and so we oblige by shifting our cardinal 3 upwards so that /æ/ can fit below it. In more recent years the /æ/ vowel has moved progressively downwards, towards cardinal 4, for many speakers (especially younger speakers) of Australian English, so it might be predicted that this raising of cardinal 3 by Australian English phoneticians will have disappeared.

The main advantage of impressionistic phonetics is that of data compression. The human ear is able to reduce an extremely complex acoustic signal into a series of discrete symbols. An acoustic analysis, on the other hand, produces a vast amount of data which ultimately needs to be much reduced before its significance can be grasped by a human mind. Further, it is sometimes stated by phoneticians that the "auditory world" is really quite different to the "acoustic world". If we want to study the information presented to the linguistic processing systems in the brain then the only way to do that accurately is to listen. This position becomes weaker as it becomes possible to model more adequately the effects of the auditory system on the acoustic waveforms. Before impressionistic phonetics can finally be laid to rest, however, it will also be necessary to model the processes of speech perception accurately. We do not, at this stage, even remotely approach this position.

Samuel McBurney (1887)

This study (quoted in Ellis (1889)) was the first systematic characterisation of AE. McBurney was a Scottish school teacher who lived and taught in Australia for several years and who traveled widely in Australia and New Zealand recording the pronunciations of the local people. He was phonetically self-trained (he had read Melville Bell's "Visible Speech" and A.J. Ellis' "Pronunciation for singers") and made all of his transcriptions in Ellis' "glossic" script (nb. this was later converted to another script "palaeotype" by Ellis before publication). He recorded the pronunciations of school children in Victoria, Tasmania, Sydney, Brisbane and New Zealand and further indicated the frequency of usage of each type and, if relevant, preferences by boys and girls. He recorded common variants for seven of the vowels, five of which are included in Mitchell's list of the six vowel phonemes which especially distinguish between the varieties of AE, ie. /ʉː/, /æɪ/, /əʉ/, /ɑe/, and /æɔ/ (but curiously, not /iː/) as well as recording the "poor" [poː] ~ [poə] ~ [puə] and the "dance" [daːns] ~ [dæns] variants. The lack of description of /iː/ variants is surprising when it is considered that it is one of the most striking causes of variation between AE varieties. This suggests either that these variations in /iː/ did not exist then, or that McBurney may simply have missed them because the AE variants may not have seemed strange to his ears. In other words, it may be that he was only prompted to record those AE sounds that varied from his own experience in the British Isles. McBurney's account, for the most part, can be considered most perceptive.

/iː/ ii
/ʉː/ u, uu, ə'uu
/æɪ/ ee, éei, éi, ææ'i
/əʉ/ oo, á'u
/ɑe/ i, A'i, 'i
/æɔ/ éeu, ææ'u, áa'u, u, 'u

Table 1: Transcriptions of 6 key vowels as described by Samuel McBurney (1887). In this table, the doubling of a symbol is used to denote longer duration. Most of the symbols have similar values to the modern IPA values. [A] is most likely similar to the low-central vowel [ɐ].

Sidney J. Baker (1945)

Baker devoted a chapter of his book "The Australian Language" to a phonetic description of Australian English. His description, although rather amateurish, wasn't entirely inaccurate. He was clearly well read in the available literature of the time and highly critical of the many myths that had been perpetuated about AE, and yet he still persisted in using adjectives such as "slovenly" in his descriptions of certain varieties of the dialect. He had obviously studied Daniel Jones' cardinal vowels, and for the most part used them reasonably well, but he managed to record three and four target allophones of some vowels that can only be described as ludicrous.
eg. /iː/ = [ij, əɪ, iə, ɪi, eɪi, eɪij]

The allophones containing [j] are certainly out of line with modern ideas of the nature (consonants) and distribution (non-post-vocalic) of semi-vowels. Further, the [e-ɪ-i] transitions are overstated as a transition from [e] to [i] in AE passes directly through the [ɪ] position, and further, these sequences are probably an attempt at transcribing the very common schwa-[i] glide or diphthong (ie. [əi, əi]) so common in broad AE.

He quite correctly contrasted AE with Cockney, whilst many earlier writers, including McBurney, tended to at least partially equate the two dialects. He also described three versions of AE which are similar to Mitchell's two, and later three categories of AE. In a later edition he strongly criticised Mitchell for having taken twenty years to give up the notion of two varieties in favour of three. "In the long interval ... his phonetics students had been misled as to the nature of Australian speech" (Baker's revised edition, 1966, p463). He failed to see the division of AE into two, three or however many varieties as no more than a convenient dividing up of a continuum and not the identification of three separate and separable varieties.

A.G. Mitchell (1946)

Mitchell (1946) is the first comprehensive phonetic description of AE by a professionally trained phonetician. Like Baker, he used Jones' cardinal vowel diagram on which to plot his AE vowels, but unlike Baker, he was trained in its use and familiar with Jones' actual pronunciation of the cardinal vowels. He noted, for example, that the AE vowels(1) /ɪ/,/e/,/æ/,/ɜː/,/ə/, and /oː/ are noticeably closer (higher tongue position) than their RP (British Received Pronunciation) equivalents. The centring diphthongs are accordingly also closer in AE than RP. RP /e/ and /æ/, for example, are both half way between cardinals [e] - [ɛ], and [ɛ] - [a] respectively, whilst the AE vowels are much closer to the higher cardinal in each case. Also, AE /oː/ is above cardinal [ɔ] and quite close to cardinal [o] whilst RP /oː/ is below cardinal [ɔ] and is traditionally depicted by "/ɔ/" (a system which Mitchell and Delbridge followed). Mitchell described AE /ɐ/ and /ɐː/ as being low front vowels (which may have been true in the 1940's) and so he used the symbol "/a/" to represent /ɐː/. He continued to use the RP symbol "/ʌ/" for /ɐ/, a symbol much better suited for the description of RP pronunciation. In other words, he classified /ɐ/ and /ɐː/ as front vowels in AE and back vowels in RP. This pattern has been largely confirmed by later acoustic studies although it is now widely considered that /ɐ/ and /ɐː/ in AE are actually central vowels (close to the sounds depicted by the IPA symbols [ɐ] and [ɐː]).

In the original edition of Mitchell's study, he identified two varieties of AE, "Broad" and "Educated" (later called "Cultivated"). Educated speakers have 5 closing diphthongs /æɪ/,/ɑe/,/ɔɪ/,/æɔ/,/əʉ/, whilst Broad speakers have seven (the above five plus /iː/ = [əi] and /ʉː/ = [əu]). He identified the six vowels found in the words "beat", "boot", "say", "so", "high", "how" as those sounds which, more than any other sounds, differentiated between Broad and Educated speakers. Further, he noted, that the broader a person's speech the wider the glide and the longer it takes.

/æɪ/ [eɪ ~ ʌɪ]
/ɑe/ [aɪ ~ ɒɪ]
/æɔ/ [aʊ ~ æʊ]
/əʉ/ [oʊ ~ ʌʊ]

Often, the broad realisation of one phoneme overlaps with the educated realisation of another phoneme.
ie. /æɪ/ with /ɑe/,/æɔ/ with /əʉ/, and sometimes /ɑe/ with /ɔɪ/.

A.G. Mitchell and Arthur Delbridge (1965)

This survey of the speech habits of young Australians has sometimes been described as the outstanding achievement of impressionistic phonetics in Australia and undoubtedly, in some respects at least, it deserves this description. It had by far the largest number of subjects (>7000) of any impressionistic study (or any other type for that matter) of AE and drew these samples from six states. The subjects included both males (47%) and females (53%) and were each identified using a wide range of social parameters. All of the speech parameters were assessed impressionistically by the authors and included variety of AE spoken (ie. Broad, General or Cultivated), use of weak consonants, elision or assimilation, type of intonation (normal or restricted), speed of speech and nasality (normal, pervasive or contextual). The only other impressionistic study of AE to go to the trouble of collecting samples from several states and to indicate the frequencies of occurrence was McBurney's. The survey was a collection of tape recordings made by teachers from the 309 participating schools, and because of this, both the quality of the recording and the quality of the interview varied greatly.

One of the analysis methods was the analysis of the six vowels identified by Mitchell (1946) as being the main indicators of AE variety being used. For the most part subjects used all the sounds in a particular set (i, ii, or iii), although a significant minority used vowels from more than one set. (see table 2)

Set Name /iː/ /ʉː/ /æɪ/ /əʉ/ /ɑe/ /æɔ/ % of sample
i) Cultivated [ɪi] [ʊu] [ɛɪ/eɪ] [oʊ] [aɪ] [aʊ] 11%
ii) General [əɪ] [əʊ] [ʌɪ] [ʌʊ] [ɒɪ] [æʊ] 55%
iii) Broad [ə̞ˑɪ] [ə̞ˑʊ] [ʌ̘ˑɪ] [ʌ̘ˑʊ] [ɒˑɪ] [æ̝ˑʊ] 34%

Table 2: Pronunciation of six key vowels by variety of Australian English (Mitchell and Delbridge, 1965) (Note that the diacritic "ˑ" indicates half-long, or a length between short and long. Also, the diacritics " ̞", " ̘" and " ̝" indicate that the vowel is lower, more fronted, or higher, respectively, than the normal position for that vowel)

Sets i and ii conformed to Mitchell's early values for educated and broad and set iii largely conformed to the earlier notion that broader speakers had longer slower glides in their closing diphthongs. It is at this point that Mitchell and Delbridge settled on the new groupings "Cultivated", "General" and "Broad".

Further tests indicated that the Broad speakers demonstrated a greater tendency for syllable assimilation and consonant elision, were more likely to use weak consonants or restricted intonation (narrow pitch range), were more likely to speak slowly (drawl), and further, showed a greater tendency to exhibit pervasive nasality. For none of these parameters, however, was the tendency >30%, meaning that at least 70% of Broad speakers did not show these features (ie. each feature taken separately), and suggesting that the usual popular characterisations of AE are in no way representative of the way most Broad speakers speak.

The enormous size and geographical scope of this project probably ensures that it will never be surpassed in breadth, and anyhow, future studies are more likely to be instrumental rather than impressionistic. The advantage of the impressionistic method for this kind of study is that each utterance can be reduced to a minimum number of variables (ie. to a series of narrow phonetic transcriptions determined according to the judgment of the researcher). An acoustic study, on the other hand, produces an increase in parametric detail by several orders of magnitude and is therefore likely to require a much smaller sample of subjects than in the present study. The major disadvantage of an impressionistic study is the degree of subjective disagreement between workers (nb. 25% disagreement on assignment of speaker category in this study) and this disagreement increases if the workers have trained under different traditions etc.

  Broad General Cultivated Total
Males 24 22 1 47%
Females 10 33.5 9.5 53%
Capital Cities 11 27 9 47%
Other Centres 23 28 2 53%

Table 3: Percentage of Broad, General and Cultivated subjects, by sex and region (Mitchell and Delbridge, 1965)
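The figures in Table 3 are percentages of the whole sample, so each row sums to that group's share of the sample (47% or 53%). The sketch below is not part of the original study. It simply re-expresses each row as a percentage of its own group, which makes the sex and region differences easier to compare. The variable and function names are hypothetical.

```python
# Percentages of the whole sample, copied from Table 3.
table3 = {
    "Males": {"Broad": 24, "General": 22, "Cultivated": 1},
    "Females": {"Broad": 10, "General": 33.5, "Cultivated": 9.5},
    "Capital Cities": {"Broad": 11, "General": 27, "Cultivated": 9},
    "Other Centres": {"Broad": 23, "General": 28, "Cultivated": 2},
}

def within_group_share(row):
    """Re-express a row as percentages of that group, not of the whole sample."""
    total = sum(row.values())
    return {variety: round(100 * value / total, 1) for variety, value in row.items()}

shares = {group: within_group_share(row) for group, row in table3.items()}
for group, row in shares.items():
    print(group, row)

# Adding the two region rows gives the share of each variety in the whole sample.
region_totals = {
    variety: table3["Capital Cities"][variety] + table3["Other Centres"][variety]
    for variety in ("Broad", "General", "Cultivated")
}
print(region_totals)
```

On this arithmetic about 51% of the male subjects were rated Broad (24 of 47 percentage points) against about 19% of the female subjects (10 of 53). The region rows add up to 34%, 55% and 11%, which matches the "% of sample" column of Table 2.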

  Assimilation and Elision
  Normal Excessive
Broad 71 29
General 91 9
Cultivated 99 1
Male 75 25
Female 94 6

Table 4: Degree of assimilation and elision by dialect variety and sex (Mitchell and Delbridge, 1965)

  Normal Restricted
Broad 95 5
General 98 2
Cultivated 100 0
Male 97 3
Female 98 2

Table 5: Intonation type by dialect variety and sex. Restricted intonation is a flat or monotonous intonation. (Mitchell and Delbridge, 1965)

  Normal Pervasive Contextual
Broad 73 13 14
General 92 3 5
Cultivated 98 1 1
Male 78 10 12
Female 93 3 4

Table 6: Degree or type of audible "nasality" by dialect variety and sex (Mitchell and Delbridge, 1965)

  Fast Normal Slow ("drawl")
Broad 3 89 8
General 5 93 2
Cultivated 3 96 2
Male 4 92 4
Female 4 92 4

Table 7: Rate of utterance by dialect variety and sex (Mitchell and Delbridge, 1965)
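The claim made earlier, that no marked feature was shown by more than 30% of Broad speakers, can be checked against the Broad rows of Tables 4 to 7. The following is a minimal sketch with hypothetical names. It covers only the four tabulated features, since no table is given here for weak consonants.

```python
# Percentage of Broad speakers showing each marked feature, copied from Tables 4-7.
broad_marked = {
    "excessive assimilation and elision": 29,  # Table 4
    "restricted intonation": 5,                # Table 5
    "pervasive nasality": 13,                  # Table 6
    "slow rate (drawl)": 8,                    # Table 7
}

THRESHOLD = 30  # the ">30%" figure quoted in the text

def features_over(threshold, features):
    """List the features shown by more than `threshold` percent of speakers."""
    return [name for name, pct in features.items() if pct > threshold]

over = features_over(THRESHOLD, broad_marked)
# Percentage of Broad speakers who do NOT show each feature (taken separately).
unmarked = {name: 100 - pct for name, pct in broad_marked.items()}

print("features over threshold:", over)
for name, pct in unmarked.items():
    print(f"{name}: {pct}% of Broad speakers unaffected")
```

The list of features over the threshold is empty. The highest figure, excessive assimilation and elision at 29%, still leaves 71% of Broad speakers unaffected.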


Please note: The references listed below do not represent required reading for this module. This is simply a listing of the references cited in this module. On-campus students may, if they wish, use this list as a guide to further reading. External students should note that they will not be disadvantaged by being unable to gain access to the following books and articles, as they are optional additional reading only.

1. Note that the phonemic symbols "/.../" used here are the ones utilised throughout this course and vary, in some cases significantly, from the symbols used by Mitchell and Delbridge. The biggest difference is their use of "/ɔ/" for /oː/ and "/ɒ/" for /ɔ/. Great care must be taken when reading phonetic references on Australian English, particularly with respect to this pair of vowels. Note that the Mitchell and Delbridge transcription is used throughout the Macquarie Dictionary. Also note that the Ladefoged textbook uses a different vowel phonemic transcription system appropriate to the pronunciation of these vowels by American English speakers.