
Department of Linguistics


Interlanguage Phonology: Sources of L2 Pronunciation "Errors"

Michael Carey


Pronunciation errors have long been thought to be caused by the transfer of phonological aspects from the L1 to the L2. While a significant portion of errors may be attributed to interlanguage transfer (section 1) according to Selinker (1972), inter- and intra-speaker variables exist which may be attributed to the other recognised sources listed below (sections 2 to 9). Section 10 reviews a hierarchy of significant pronunciation errors proposed by Jenkins (1996).

1. Interlanguage Transfer

When a language learner attempts to produce an L2 sound their relative success at approaching the target is reliant on their ability to disassociate their L2 utterance from their repertoire of L1 phonemes and allophones. Disassociation is often necessary because two languages may contain sounds which seem to be the same but are produced by differing articulatory motions. They are therefore acoustically different and may be perceived to be divergent from the target by the listener.

While it is possible for adult speakers to learn to produce acoustically acceptable approximations of targets such as the troublesome /l/ and /r/ distinction over time (Flege, 1995) the level of success varies between individual speakers. It is a common remark that the more successful producers of near-native sounding pronunciation are rare, gifted or talented. Their success could be more reasonably attributed to their ability to disassociate phonological aspects of the L1 and L2 and thus minimise the transfer of phonological features from one language to the other.

Assumptions regarding the variable success of learners’ L2 also concern the relative difficulty different nationalities encounter in producing acceptable English pronunciation. A very useful observation for the contrastive analysis of various language groups and their L2 English production is that speakers whose L1 phonetic inventory is vastly different from that of English often find it easier to learn to produce an acceptable phonetic target in the L2 than speakers whose L1 contains similar but not identical sounds (Flege’s Speech Learning Model, 1987, 1995). For example, Japanese students have an advantage over Koreans when it comes to the production of English vowels. As Japanese contains only five simple monophthongal vowels to Korean’s ten, Japanese speakers of English have only five vowels to interfere with the twelve monophthongs present in (Australian) English.

Therefore, it may be assumed that it is a simpler task to learn totally foreign sounds than sounds which bear a resemblance to sounds found in the L1. Furthermore, fossilised errors, which are attributable to the negative transfer of L1 to L2, may more easily be unlearnt when they are of the foreign rather than the familiar variety. For example, English /v/, which is totally foreign to Korean and often approximated by /b/, is less of a problem to unlearn than the negatively transferred Korean high front vowel /i/ which typically replaces the similar but durationally longer English high front vowel /iː/.

Each language has its own pattern of vowel length. Languages vary widely as to whether, and how much, vowel duration is affected by the following consonant. English lengthens vowels to an extreme degree before voiced consonants (Takahashi, 1987).

There are, however, some universal (cross-linguistic) consistencies in vowel length. Low vowels are always longer than high vowels, and back vowels tend to be longer than front vowels. For this reason vowel duration is often regarded as a universal. Universals and their significance to interlanguage transfer are discussed further in the following section.

2. Phonological Universals

Phonological universals are phonological patterns which are common to all known languages. Within this framework, phenomena are described as unmarked (common and regularly occurring) or, conversely, as marked (distinctive and unique). In English, for example, /s/ is unmarked and /θ/ is marked. A related notion, the implicational universal, holds that the presence of a marked segment such as /θ/ in a language implies the presence of an unmarked segment such as /s/, but not the reverse. See Greenberg (1966) for a complete description.
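The one-way nature of an implicational universal can be sketched in a few lines of Python. This is only an illustration: the function name `violations` and the three toy inventories are hypothetical, and the only implication taken from the text is that /θ/ implies /s/.

```python
# Marked segment -> unmarked segment whose presence it implies.
# Only the /θ/ -> /s/ pair comes from the text above.
IMPLICATIONS = {"θ": "s"}


def violations(inventory, implications=IMPLICATIONS):
    """Return marked segments that occur without their implied unmarked partner."""
    return sorted(
        marked
        for marked, unmarked in implications.items()
        if marked in inventory and unmarked not in inventory
    )


# Hypothetical toy inventories, not real languages.
both = {"s", "θ", "t"}      # marked and unmarked together: allowed
s_only = {"s", "t"}         # unmarked alone: allowed (the reverse does not hold)
theta_only = {"θ", "t"}     # marked alone: the pattern the universal rules out

print(violations(both))        # []
print(violations(s_only))      # []
print(violations(theta_only))  # ['θ']
```

An inventory with /s/ but no /θ/ passes the check, which is the sense in which the implication does not run in reverse.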

Many studies have been undertaken to determine the degree of difficulty in acquisition of the various elements of L2 phonology (Altenberg and Vago, 1983; Broselow, 1984; Ferguson, 1984; Payne, 1976) based on the markedness theory of universals. Takahashi (1987) concluded from an analysis of these studies:

"Those less marked phonetic or phonological characteristics of L1 are harder to unlearn. That is, those characteristics which are acquired early in L1 acquisition and are important (yet commonly occurring) characteristics of L1 are easily carried over in the production of the L2 phonological system and remain persistently as the L2 learner’s foreign accent."

The development of an absolute hierarchy of markedness in and between languages is an almost impossible task given the enormous number and diversity of languages. Markedness theory has however contributed to a general understanding of the tendencies of simplification adopted by L2 learners. Some of these are outlined below.

In all languages there appears to be a universal preference for the open CV syllable. A study by Kozhevnikov and Chistovich (1965) showed that in a stressful situation, speakers tended to revert to very simple CV patterns of pronunciation in their own native language. Tarone (1972) has argued that:

"the simple open syllable may be a universal articulatory and perceptual unit; that is, that the articulators tend to operate in basic CV programs in all languages, and the various languages simply elaborate upon this program by adding various combinations of initial and final consonants."

From this it could be hypothesised that in interlanguage transfer, first languages with a greater propensity towards open syllables will have a greater degree of difficulty in assimilating the syllable structure of English. English has a greater tendency to stray from open syllable structure than Korean and a far greater tendency for this than the Japanese language, which is essentially built upon open syllables. Therefore, the syllable structure of English is more problematical for Japanese than for Koreans.

Other phonological universal tendencies include devoicing of word-final obstruents and affrication of the word-final alveolar fricative /s/ (Takahashi, 1987).

[Newspaper clipping: Sydney Morning Herald, 21.10.97.]

3. Avoidance

Avoidance is a general tendency for learners to avoid those aspects of production that they know to be problematic for them (see Schachter, 1974). Avoidance strategies may be employed at the grammatical as well as at the phonemic level. At the grammatical level an article (a/an, the) may be left out of the utterance when the student suspects an article is necessary and yet chooses to avoid using one rather than run the risk of selecting the wrong one. The above newspaper clipping is a clear example of a grammatical avoidance strategy. The source of misperception of the Korean speaker’s message could easily be misdiagnosed as a deviance from the L2 target phonemes /æ/ and /p/. However, any English language teacher worth their salt would tell you that L2 learners of English often avoid articles because internalising the rules for correct article use is almost an impossibility. In the above story (if the quotation is accurate) the Korean speaker has avoided the indefinite article "an" and has produced the utterance "Yes, I need adaptor", not "Yes, I need a dapter". The speaker may have mispronounced /æ/ as something resembling /ɒ/, but this is unlikely; the Korean transfer strategy would produce something closer to /ɛ/ or /ʌ/. However, /p/ could easily be misperceived as /k/ over the telephone’s narrow bandwidth. Yet it is unlikely that redundancy would fail to intervene. The listener could have searched their mental lexicon for an appropriate word candidate (or asked for clarification) if the speaker had used the article and said "I need a dopter". So clearly the salient source of this misunderstanding is grammatical avoidance, not phonemic error.

At the phonemic level a typical example is the avoidance of words which contain difficult-to-pronounce phonemes, such as /z/ for Koreans. By avoiding words such as "zoo", the speaker may therefore give a false impression of the extent of their phonemic pronunciation errors.

4. Over-generalisation and over-elaboration

Over-generalisation is described by Richards (1973) as the application of a newly learnt target language rule to an inappropriate form or context. For example, pitch, duration and intensity as features of stress may be used too often, so that every content word in an utterance is highlighted.

Over-elaboration is usually caused by exposure to language acquisition strategies that are heavily reliant on reading and writing, to the detriment of speaking. In an attempt to produce accurate target language utterances, the learner produces stilted and formal speech which may be syntactically accurate but is unnatural and not native-like (Tarone et al., 1983).

5. Hypercorrection or Overcompensation

This phenomenon can occur after students have become aware of a negative transfer effect, and it arises from the strategy they employ to deal with it (see Wardhaugh, 1986). For example, Japanese does not possess the CV /siː/ (see) but does contain the CV /ʃiː/ (she), so the expected negative transfer effect is the production of /ʃiː/ (she) for the word see. A Japanese learner of English may realise that the sounds /s/ and /ʃ/ must be distinguished before the vowel /iː/ in the L2 but has not learnt exactly when to do this. The learner therefore acquires the notion that /ʃ/ + /iː/ is not allowed in English and applies it even when the sequence is necessary, as in the production of the word she. Thus the learner overcompensates and produces /siː/ instead of /ʃiː/.
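The difference between negative transfer and hypercorrection in this example can be sketched as two string-rewriting rules in Python. This is a deliberately simplified illustration, not a model of learner behaviour: the function names are hypothetical and the rules cover only the /s/ and /ʃ/ before /iː/ case described above.

```python
def negative_transfer(ipa):
    """L1 constraint carried over: /s/ before /iː/ surfaces as /ʃ/."""
    return ipa.replace("siː", "ʃiː")


def hypercorrect(ipa):
    """Over-applied learner rule '/ʃ/ + /iː/ is not allowed': every /ʃiː/ becomes /siː/."""
    return ipa.replace("ʃiː", "siː")


targets = {"see": "siː", "she": "ʃiː"}

for word, target in targets.items():
    print(word, target, negative_transfer(target), hypercorrect(target))
# see siː ʃiː siː
# she ʃiː ʃiː siː
```

Under negative transfer the word see is the casualty; under hypercorrection see is produced correctly but she now comes out as /siː/, which is the overcompensation described in the text.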

6. Elision and epenthesis

Elision is the non-articulation of a sound and epenthesis is the addition of a sound to a word in the L2. Both are negative transfer effects of phonotactic constraints in the L1.

7. Stylistic variation

Variations in style of speech occur according to psycholinguistic factors such as the situation, the context, the addressee(s) and the location (see Bolinger, 1975). In the gathering of speech data, factors which may affect the authenticity of the data are: the self-consciousness felt by the subject and the pressure to perform in the situation of a studio recording; unfamiliarity with the context or lexis of the test sentences; the pressure to achieve a ‘good result’ for the addressee (the tester); and the artificial environment and discomforts associated with remaining still in a recording studio.

8. Letter to sound rule confusion

Learners of English whose L1 has a phonemic orthography often learn to speak English through reading and writing and consequently attempt to derive English pronunciation from the orthography. The inconsistent letter to sound rules of English may result in mispronunciation, not because of an inability to produce the phonemes, but because of interference from the spelling.

9. The developmental model

This model of language acquisition suggests that there are significant parallels between the replacement strategies employed by infant L1 learners of English and by L2 learners of English from infancy to adulthood. Various studies (Flege and Davidian, 1984; Hecht and Mulford, 1982; Wode, 1981) across a variety of nationalities of L2 English learners have revealed replacement strategies for the production of new phones in L2 which are similar to the substitution strategies found in the L1 speech of infants. For example, Wode found that German-speaking children first replace retroflex [ɻ] with [w], which is the same substitution that occurs in the speech of L1 infant speakers of English. In a study of adult L2 acquisition, Flege and Davidian found that Chinese, Spanish and Polish speakers showed two developmental processes, word-final stop deletion and devoicing, which are typically found in the speech of L1 English-speaking infants. This model therefore suggests there is a hierarchy for substitution of new phonemes (consonants) which is sequential and consistent (perhaps universal) across all languages.

10. Significant errors: current views

In an issue of Speak Out! (1996) Jennifer Jenkins outlines the need for a change in pronunciation modification priorities from systems based on a native model of pronunciation towards a system that modifies the errors which impede communication across an international context. The motivation for this new system of priorities stems from the assumption that non-native speakers will spend more time speaking to other non-native speakers of English, than to native speakers owing to the fact that there are now far more non-native speakers of English in the world than native speakers. This is particularly the case for EFL learners who usually return to their own countries after studying abroad.

Jenkins’ list of priorities was built upon the work of Jenner (1989) who sought to establish a common core of English phonology:

"Jenner recently advocated the need to 'establish what all native speakers of all varieties (of English) have in common which enables them to communicate effectively with native speakers of other varieties other than their own'."

Jenkins has adapted Jenner’s list of priorities from a list designed to "offer the learner a guarantee of intelligibility and acceptability to native speakers anywhere in the world" (Jenner 1989), to a list of priorities which also takes the listener intelligibility of non-native speakers into account. The following is a summary of her list of nine significant aspects of pronunciation to be focused on for pronunciation error elimination:

a) Vowel quantity

Jenkins believes that because the extra long vowels of English vary in length from one NS to another, "there is little common ground between them. Provided the learner is consistent in use, quality is unlikely to affect intelligibility greatly". I disagree with this generalisation and maintain that the length contrast of /iː/ versus /ɪ/ being phonemic in English and phonetic in Korean causes a negative transfer effect which is compounded rather than assisted by a NNS to NNS interaction.

b) Diphthongs

Diphthong length, but not quality, is essential to intelligibility: diphthong quality varies between groups of NSs without great loss of intelligibility, whereas length is more critical. Jenkins herself admits that future research into diphthongs may reverse this decision.

c) Consonant conflations except for those involving /θ/ and /ð/

Jenkins claims that Gimson has demonstrated that consonants tend to be more essential to intelligibility than vowels (no citation was given and no source has been found). Jenkins agrees with this view with the exception of /θ/ and /ð/. She believes the conflations (substitutions) of these sounds with /t/ and /d/ or /s/ and /z/ respectively are common to all NNS varieties and for this reason do not significantly impede intelligibility between NNSs, and that contextual clues may aid intelligibility in NNS/NS interaction. Gimson & Cruttenden (1994) assert the opposite, stating that for minimal intelligibility it is not acceptable to replace /θ/ and /ð/ with dental aspirated /tʰ/ and /dʰ/. However, some relaxation of the dental /ð/ is allowable in weak positions, such as when it follows /s/ or /z/, e.g., in "What’s the time?"

d) Phonetic realisations

The majority of phonetic realisations in the L1 which are transferred to the L2 do not interfere with intelligibility and are merely perceived as characteristics of a foreign accent. Other phonetic transfers, however, do interfere and may lead to unintelligibility. Jenkins also mentions that there are sometimes alternative pronunciations which may be taught and which are acceptable even to native speakers. "Adam Brown (1991) recommends ignoring and even teaching for productive use vocalic /ʊ/ for dark [ɫ] post-vocalically and syllabically in words such as milk, middle and full, since these are apparently easier for NNSs to produce and are fast becoming incorporated into NS varieties such as Estuary English".

e) Consonant cluster simplification (and consonant sequencing restrictions)

Many L1s have more phonotactic restrictions on consonants than English does. The predominance of open (CV) syllables in languages other than English, as well as restrictions on the positions consonants may take within the word, are negatively transferable to English. The resulting production involves either epenthesis (vowel addition) or omission (consonant deletion). Jenkins believes that where learners use the strategy of epenthesis, intelligibility is rarely affected, since listeners are able to recover the original form, e.g. Japanese [pərədʌktoː] for product. However, consonant deletion causes considerable problems for NNS listeners in particular, since the target form is not recoverable from the output, e.g. Taiwanese /pɒdʌk/ for product.
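The recoverability argument can be made concrete with a small Python sketch. It rests on stated assumptions: the two repair rules are toy simplifications (a schwa is inserted inside every consonant cluster, or only the first consonant of each cluster is kept), the vowel set covers just this one word, and the function names are hypothetical. The epenthesised output therefore differs in detail from the Japanese form cited above.

```python
VOWELS = set("ɒʌə")  # toy vowel set, sufficient for the single example word


def epenthesise(segments, filler="ə"):
    """Break up consonant clusters by inserting a filler vowel between consonants."""
    out = []
    for seg in segments:
        if out and out[-1] not in VOWELS and seg not in VOWELS:
            out.append(filler)
        out.append(seg)
    return "".join(out)


def delete_in_clusters(segments):
    """Simplify clusters by keeping only the first consonant of each run."""
    out = []
    prev_was_consonant = False
    for seg in segments:
        is_consonant = seg not in VOWELS
        if not (is_consonant and prev_was_consonant):
            out.append(seg)
        prev_was_consonant = is_consonant
    return "".join(out)


def consonant_skeleton(segments):
    """The consonants of a form, in order, with all vowels removed."""
    return "".join(seg for seg in segments if seg not in VOWELS)


target = "prɒdʌkt"  # product
with_epenthesis = epenthesise(target)
with_deletion = delete_in_clusters(target)

print(with_epenthesis, consonant_skeleton(with_epenthesis) == consonant_skeleton(target))
# pərɒdʌkət True
print(with_deletion, consonant_skeleton(with_deletion) == consonant_skeleton(target))
# pɒdʌk False
```

Epenthesis leaves every target consonant in place, so the listener still has the full consonant skeleton to work from; deletion removes /r/ and /t/, and nothing in the output signals that they were ever there.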

f) Word stress

Dalton and Seidlhofer (1994) have shown that NS listeners identify words on the basis of their stress patterns, first picking out the stressed syllable and then searching their mental lexicons on the basis of this syllable, judging possible candidates according to how well they fit the unstressed syllables on either side of it. Jenkins asserts that misplaced word stress is more of a problem for NS listeners than for NNS listeners, assuming that NNSs listen more for contextual cues than for stress patterns when differentiating similar nouns and verbs such as /ˈrɛkɔd/ and /rəˈkɔd/.
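The two-step lookup described above (stressed syllable first, then fit of the surrounding unstressed syllables) can be sketched in Python as follows. This is a rough illustration under stated assumptions, not a model of lexical access: the three-entry lexicon, its syllabifications and the function name `lookup` are all hypothetical.

```python
# Toy lexicon: word -> (syllables, index of the stressed syllable). Hypothetical entries.
LEXICON = {
    "record (noun)": (["rɛ", "kɔd"], 0),
    "record (verb)": (["rə", "kɔd"], 1),
    "accord": (["ə", "kɔd"], 1),
}


def lookup(syllables, stressed):
    """Return candidates sharing the heard stressed syllable, best fit first."""
    heard = syllables[stressed]
    scored = []
    for word, (syls, idx) in LEXICON.items():
        if syls[idx] != heard:
            continue  # step 1: the stressed syllable must match
        # step 2: count unstressed syllables matching at the same offset from the stress
        fit = sum(
            1
            for offset in range(-stressed, len(syllables) - stressed)
            if offset != 0
            and 0 <= idx + offset < len(syls)
            and syls[idx + offset] == syllables[stressed + offset]
        )
        scored.append((fit, word))
    return [word for fit, word in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


print(lookup(["rə", "kɔd"], 1))  # ['record (verb)', 'accord']
print(lookup(["rə", "kɔd"], 0))  # []
```

In the second call the same segments are heard with stress on the first syllable, and the search finds no candidate at all, which illustrates why a listener who relies on this strategy is so sensitive to misplaced word stress.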

g) Prominence and weak forms

Jenkins suggests that failure to produce weak forms represents more of a threat to intelligibility for NS listeners than for NNS listeners, mainly because the ears of NSs are not attuned to this. Learners are likely to find unreduced syllables easier to interpret when speaking amongst each other. Therefore, they are reluctant to attempt reduction in their own utterances. Perhaps it is therefore advisable to abandon attempts to teach weak forms productively and focus on recognition alone in the speech of NSs, and on prominence for NNS production.

h) Tone groups

The tone unit is the basic unit of dividing up the stream of speech in English. In some L1s the utterances are longer between pauses. When this strategy is transferred to English the listener is deprived of much needed time to process information. It also may result in the breaking up of speech in unexpected places.

i) Nuclear/contrastive stress but not tone

In many L1s, if there is anything comparable to the English system of nuclear stress, it tends to fall on the last word of a group of words (or on the first word in Korean). When this is transferred to the L2 the main stress may fall on unimportant grammatical words (prepositions, articles, etc.) instead of content words (nouns, adjectives, verbs and adverbs). When the learner’s L1 uses syntactic instead of intonational means to indicate contrastive stress, prominence may be neglected by the speaker. Jenkins’ studies have revealed that learners "very quickly acquire receptive competence in contrastive stress, but they take considerably longer to acquire nuclear stress productively". This is not an issue for Korean L2 utterances as prominence is also a feature of the L1. Jenkins’ understanding of intonation is limited to anecdotal evidence, so a more thorough contrastive acoustic analysis should reveal more significant negative transfer effects.


Altenberg, E.P. & Vago, R.M. (1983) Theoretical implications of an error analysis of second language phonology production. Language learning. 33. 4: 427-447.

Bolinger, D. (1975) Aspects of Language. New York: Harcourt, Brace and World.

Broselow, E. (1984) An investigation of transfer in second language phonology. IRAL. 22. 4: 253-269.

Ferguson, C.A. (1984) Repertoire universals, markedness, and second language acquisition. In W.E. Rutherford. (ed.) Language Universals and Second Language Acquisition. Typological Studies in Language. 5. Philadelphia: John Benjamins Publishing Company. 247-258.

Flege, J.E. & Davidian, R.D. (1984) Transfer and developmental processes in adult foreign language speech production. Applied Psycholinguistics. 5: 323-347.

Flege, J.E. (1987) The production of "new" versus "similar" phones in a foreign language: evidence for the effect of equivalence classification. Journal of Phonetics. 15: 47-65.

Flege, J.E. (1995) Second-language speech learning: Theory, findings, and problems. In W. Strange (ed.) Speech Perception and Linguistic Experience: Theoretical and Methodological Issues in Cross-language Speech Research. Timonium, MD: York Press.

Gimson, AC & Cruttenden, A. (1994) Gimson’s Pronunciation of English. London: Edward Arnold.

Greenberg, J.H. (1966) Universals of Language. Cambridge, Mass.: MIT Press.

Hecht, B.F. & Mulford, R. (1982) The acquisition of a second language phonology: Interaction of transfer and developmental factors. Applied Psycholinguistics. 3: 313-328.

Jenkins, J. (1996). Changing pronunciation priorities for successful communication in international contexts. Speak Out! no.17.

Kozhevnikov & Chistovich, eds. (1965) : Speech: Articulation and perception. Moscow: Nauka. (Translation: JPRS: 30, 543, pub. Joint Publications Research Service, Washington D.C.).

Payne, A.C. (1976) The Acquisition of the Phonological System of a Second Dialect. Unpublished Ph.D. Dissertation: the University of Pennsylvania.

Richards, J.C. (1973) A non-contrastive approach to error analysis. In Oller, J.W. & Richards, J.C. (eds.) (1973) Focus on the learner: Pragmatic perspectives for the Language Teacher. Massachusetts: Newbury House.

Schachter, J. (1974) "An error in error analysis" Language learning 24, 2:73 – 107.

Selinker, L. (1972) "Interlanguage." IRAL 10: 201-231

Takahashi, Yoshiko (1987) Language Universals and Transfer Effects: Acquisition of English vowel durations by Japanese learners. Ph.D. Dissertation, Stanford University.

Tarone, E. (1972) "A suggested unit for interlingual identification in Pronunciation," TESOL Quarterly 6,4: 325-331.

Tarone, E., Cohen, A.D. & Dumas, G. (1983) A closer look at some interlanguage terminology: A framework for communication strategies. In Færch, C. & Kasper, G. (eds.)

Wardhaugh, R. (1986) An introduction to sociolinguistics. Oxford, Blackwell.

Wode, H. (1981) Learning a second language: An integrated view of language acquisition. Tübingen: Gunter Narr Verlag.