NATURAL PHONOLOGY OF JAPANESE

by

R. Edward Smith, Ph.D.

(This paper is a somewhat revised and amended version of my December 1980 University of Hawaii doctoral dissertation in Linguistics of the same title. My dissertation committee was Gregory Lee (Chairman), Derek Bickerton, Robert Cheng, Gordon H. Fairbanks, Stanley Starosta, and Donald M. Topping.)

 

ACKNOWLEDGEMENTS

The successful completion of this work would hardly have been possible without the inspiration and support of certain teachers, colleagues, and friends. My sincere appreciation goes to the members of my committee, in particular my chairman, Greg Lee, in whose classes I first became aware of Natural Phonology and the new and interesting questions it brings to the study of pronunciation. His patience, accessibility, and erudition are responsible for much of the enlightenment I have achieved in the process of destroying many of my favorite prejudices. I also owe much to David Stampe whose presence in Hawaii on the faculty of the Linguistic Institute in 1977 and as a visiting scholar in the summer of 1980 was a source of inspiration and enlightenment and who was kind enough to comment on much of the work in these pages. Thanks go also to my good friend and colleague Shozo Kurokawa whose kindness and generosity have contributed much to the realization of this work. Interviews recorded by him in both the Toohoku and Hiroshima areas of Japan were an invaluable source of data and his readiness to discuss matters having to do with Japanese pronunciation made many of my conclusions possible. For the recorded interviews of Hawaiian Japanese speakers upon which some of the data in Chapter 5 is based I wish to thank two of my former students in Japanese linguistics in the East Asian Language Department at the University of Hawaii – Carolyn Kagawa and Utako Walsh. Finally I send a fond aloha to the entire Linguistics Department for making my stay there a warm, human, and fulfilling experience.

 

ABSTRACT

This paper applies the theory of Natural Phonology to the study of Japanese pronunciation. In Japanese as in any language constraints are placed on the universal set of phonetically motivated processes leaving a particular subset to govern underlying and derived structure. In this work 30 important processes of Japanese are identified by form and function. Evidence comes from careful and hypoarticulate speech data and the pronunciation of loan words.

Chapter 1 presents a general introduction to the theory of Natural Phonology.

In Chapter 2 processes affecting vowels and glides are identified, including an important process, syllabicity reversal, which changes down-gliding diphthongs into up-gliding ones. The important role played by this process in both diachronic developments and synchronic speech processing is shown.

Chapter 3 discusses consonants and consonant processes including four processes which are important to the derivation of superficially palatalized consonants from underlying plain ones.

Chapter 4 describes the role of processes in the processing of hypoarticulate speech. Some of the processes which govern the lexicon are shown to govern derived structure as well, while others are subject to counterfeeding.

In Chapter 5 the role of processes in the nativization of loan words is explored. Two borrowing strategies are proposed – an innovating one oriented toward imitating some of the phonetic characteristics of the source language and a conservative one oriented toward preserving the canons of the target language.

Chapter 6 analyses some features of the speech of Hawaiian Japanese speakers who have resided in Hawaii since prior to 1924. The speech of immigrants from Hiroshima and that of immigrants from Fukushima are contrasted in terms of differing constraints on certain processes. The question of whether there is phonological levelling in the direction of Hiroshima speech is asked, and on the basis of the data presented there the question is answered negatively. The difference between phonetically and conventionally motivated constraints on pronunciation is exemplified, and the question of whether phonetically motivated constraints are more 'persistent' in adult speech is explored. On the basis of the data presented no significant difference is found.

 

TABLE OF CONTENTS

I.   NATURAL PHONOLOGY

1.0   Introduction
1.1   Natural phonology and phonemics
1.2   Fortition and Lenition Processes
1.3   Context-free and Context-sensitive processes
1.4   Constraints on underlying representation
1.5   Processes vs Rules
1.6   Constraints on the Application of Processes
1.6.1   Ordering
1.6.2   Suppression
1.6.3   Limitation

II.   VOWELS AND GLIDES

2.1   Syllable and mora in Japanese
2.1.1   The syllable
2.1.2   The mora
2.1.2.1   Mora duration
2.1.2.2   Mora and the Kana Orthography
2.2   Vowels
2.2.1   Context-free Vowel Processes
2.2.1.1   Raising
2.2.1.2   Bleaching
2.2.1.3   Coloring
2.3   Japanese Underlying Vowels
2.3.1   Vowel Length
2.3.2   Diphthongs
2.3.2.1   On-gliding diphthongs
2.3.2.2   Off-gliding diphthongs
2.3.2.2.1   Types of off-gliding diphthongs
2.3.2.2.2   Derivation vs. Borrowing
2.3.3   Glide Epenthesis

III.   CONSONANTS

3.0   Consonants
3.1   Japanese has twelve underlying consonants
3.1.1   Underlying /dz/
3.1.2   Syllable initial (onset) distribution
3.1.3   Syllable final (offset) distribution
3.1.3.1   Distribution of offset nasal
3.1.3.2   Distribution of offset obstruents
3.1.3.3   NP and GP analyses of geminate consonants compared
3.1.4   Obligatory Consonant Lenitions
3.1.4.1   Offset /n/ lenition
3.1.4.2   Onset consonant lenitions
3.1.4.2.1   /h/ is pronounced [ɸ] before /u/
3.1.4.2.2   /h/ is pronounced [ç] before /i, y/
3.1.4.2.3   /dz/ is pronounced [z] intervocalically
3.1.4.2.4   /g/ is pronounced [ŋ] intervocalically
3.1.4.2.5   Palatalization
3.1.4.3   Palatalization and Language Change

IV.   HYPOARTICULATE SPEECH

4.0   The term hypoarticulate speech
4.1   Vowel Lenitions
4.1.1   Vowel unvoicing
4.1.1.1   Complete assimilation of voiceless vowels
4.1.2   Voiced vowel assimilation
4.1.2.1   Offset /r/ nasalization
4.1.3   Syllabicity reversal
4.1.4   Vowel Coalescence
4.1.5   Vowel Shortening
4.1.6   Height Assimilation
4.2   Glide Lenitions
4.2.1   Glide fronting
4.2.2   Labial glide deletion
4.2.3   Palatal glide deletion
4.2.4   Post consonantal palatal glide deletion
4.3   Consonant lenitions
4.3.1   Obstruent voicing assimilation
4.3.2   Regressive obstruent assimilation

V.   LOAN PHONOLOGY

5.0   Modern Loan Words
5.1   Conservative vs Innovating
5.2   Processes in NP
5.3   Stampe's analysis applied to 'conservative' and 'innovating'
5.4   Two Borrowing Strategies
5.5   [ wo, we, wi ]
5.6   [ye] [C'e]
5.7   Three categories of processes
5.8   [ ti, tu, tyu ]
5.9   [ di, du, dyu ]
5.10   Conservative borrowing strategy
5.11   Innovating borrowing strategy
5.12   Correlation of strategies with ease of assimilation

VI.   HAWAIIAN JAPANESE

6.0   Introduction
6.1   Process-governed features
6.1.1   Comparative analysis of dialects in Table 6.2
6.2   Persistence of VOWEL RETRACTION and Y DEL in Fukushima issei speech
6.3   Rule-governed features of Fukushima dialect

VII.   CONCLUSION

BIBLIOGRAPHY

TABLES

1.1   Suppression of Processes
2.1   Foreign Verb Stems
2.2   Japanese Lexical Vowels
2.3   Foreign Vowels and Their Japanese Cognates
2.4   'Upside-Down Words'
3.1   Underlying Consonant Inventory
3.2   Consonant Feature Matrix
3.3   Short Syllable Inventory
6.1   Number of Issei in Hawaii in 1924 with their Prefecture of Origin
6.2   Merger of H/Y Syllables in F
6.3   Variation in the Speech of a Fukushima Resident
6.4   Occurrence of Vowel Retraction and Y Deletion in the Speech of Two Fukushima Issei
6.5   W-stem Verb Paradigm of the Verb 'to buy'
6.6   Alternative Analyses of the w-stem verb paradigm
6.7   Occurrences of w-stems in the speech of M and F

PROCESSES

P15   AFFRIC WEAKENING
P19   ALVPALADJ
P18   DEPAL
P8   GLIDE EPENTHESIS
P1   GLIDE FRONTING
P14   H LAB
P25   OFFSET /r/ NAS
P29   OBS VOICING ASSIM
P17   PAL
P9   POST CONS LAB GLIDE DEL
P20   POST CONS PAL GLIDE DEL
P28   POST VOC PAL GLIDE DEL
P2   PRE VOC LAB GLIDE DEL
P3   PRE VOC PAL GLIDE DEL
P10   PROG OFFSET
P4   PROGR HT ASSIM
P7   RAISING
P30   REGR OBS ASSIM
P11   REGR OFFSET /n/ ASSIM
P21   STOPAFF
P16   STOP NAS
P6   SYL REV
P12   TAUTO-SYL REGR VOWEL NAS
P24   VCD VOWEL ASSIM
P23   VCL VOWEL ASSIM
P26   VOWEL COALESCENCE
P5   VOWEL COLORING
P13   VOWEL DENAS
P27   VOWEL SHORTNG
P22   VOWEL UNVOICING


 

CHAPTER ONE

NATURAL PHONOLOGY

1.0 Introduction

The treatment in this paper is based on the theory of natural phonology (NP) first proposed by David Stampe. See for instance Stampe (1969, 1973), and Donegan and Stampe (1979). In Donegan and Stampe (1979) what is to be studied in phonology is characterized as 'the discrepancy between the sound perceived and intended and the sound pronounced.' This suggests three main areas of concern:

i) Phonological (underlying) representation – the sound perceived and intended;

ii) Phonetic (surface) representation – the sound pronounced;

iii) Processes – the phonetically motivated substitutions which "form the system of limitations standing between the intention and actualization of speech" (ibid p. 78).

The elements in phonological representation are taken from the lexicon – the list of items stored in the speaker's long-term memory. A lexical representation is an empirical hypothesis on the form in which an item is stored in the lexicon. This is the form which Stampe (with Sapir) believes is most readily brought to consciousness by the speaker. Such representation is constrained by processes which elsewhere actually govern substitutions in the language, but which in the lexicon act in the manner of morpheme structure conditions in generative phonology to disallow segments and sequences to which they might apply. According to Stampe (1973) 'These processes are mental operations performed on behalf of the physical system in speech perception and production. The purpose of these processes is to substitute for a class of sounds or sound sequences presenting a specific common difficulty to the speech capacity of the individual, an alternative class identical but lacking the difficult property.' The existence and form of processes may be inferred from substitutions which occur in child language, in diachronic development, in loan words, in speech errors, in secret languages (e.g. pig latin) and in hyper- and hypo-articulated speech.

The following example derivations are from Stampe (1973):

(i). /kænt/ → [khæ̃t] 'can't'

(ii). /kæt/ → [khæt] 'cat'

In (i) the underlying representation on the left represents what the speaker intends, and the phonetic representation on the right shows the resulting pronunciation after the application of processes of aspiration, vowel nasalization, and nasal deletion. Compare this with the derivation of 'cat' in (ii) where the surface form differs from (i) only by the oral versus the nasal vowel.(1) The contrast of oral and nasal vowels in this pair of words, though phonetically distinctive, is below the level of consciousness of the naive speaker. It is thus the underlying representations which best represent the speaker's perception of the pronunciation of these words.(2)
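As an illustrative aside, the two derivations above can be sketched as an ordered application of the three processes named in the text. The rule formulations, the toy vowel set, and the function names are assumptions made for this sketch, not Stampe's own statements of the processes; in particular, nasal deletion is restricted here to the position before a voiceless stop. Aspiration is written with a following ʰ.

```python
import re

TILDE = "\u0303"   # combining tilde: marks a nasalized vowel
VOWELS = "æaeiou"  # toy vowel set, sufficient for these two words


def aspirate(form):
    # A word-initial voiceless stop is aspirated.
    return re.sub(r"^([ptk])", "\\1ʰ", form)


def nasalize_vowel(form):
    # A vowel immediately followed by a nasal consonant is nasalized.
    return re.sub("([" + VOWELS + "])(?=[mn])", "\\1" + TILDE, form)


def delete_nasal(form):
    # A nasal between a nasalized vowel and a voiceless stop is deleted
    # (the voiceless-stop restriction is an assumption of this sketch).
    return re.sub(TILDE + "[mn](?=[ptk])", TILDE, form)


def derive(underlying):
    # Apply the processes in the order given in the text.
    form = underlying
    for process in (aspirate, nasalize_vowel, delete_nasal):
        form = process(form)
    return form


print(derive("kænt"))  # surface form of 'can't'
print(derive("kæt"))   # surface form of 'cat'
```

In this sketch the two surface forms come out differing only in the nasalization of the vowel, which is the point made about (i) and (ii).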

1.1 Natural phonology and phonemics

In Stampe (1973) the relationship between NP and traditional phonemics is characterized in the following way:

"The relationship between contrary processes is a systematic account of the notion 'allophone' in traditional phonemics. An allophone is a sound which does not occur in underlying (phonemic) representation, but only in superficial (phonetic) representation, due to a context sensitive 'allophonic' process. An allophonic process is any process, like vowel nasalization in English, which creates sounds which do not occur in underlying representation in the language... In natural phonology, the nonoccurrence of certain sounds (nasal vowels) in underlying representation in a language is attributed to a process (vowel denasalization) in the phonological system of the language. Thus the notion 'allophonic process' translates as any process which gives rise to sounds eliminated by a prior, more general process in the system. Vowel nasalization in English is allophonic because it gives rise to sounds which the prior context-free process of vowel denasalization eliminates – nasal vowels. And nasal vowels are therefore 'allophones of' nonnasal vowels in English." (p. 25).

1.2 Fortition and Lenition Processes

In Donegan and Stampe (1979) segmental processes are either 'fortitions' or 'lenitions.' Fortitions – which include dissimilations, diphthongizations, syllabications, and epentheses – are designed to make pronunciations more perceptible. Lenitions – which include assimilations, monophthongizations, desyllabications, reductions, and deletions – make segments and sequences of segments easier to pronounce. The nasalization of the vowel in 'can't,' a lenition, is a response to the difficulty involved in pronouncing a vowel with raised velum when a sound with lowered velum follows.

1.3 Context-free and Context-sensitive processes

Lenitions are generally context-sensitive or syntagmatic processes. Fortitions are generally context-free or paradigmatic processes and have the effect of maximizing the phonetic properties of individual segments, often heightening their differences from neighboring segments. Thus the context-free vowel denasalization process in English maximizes the vocalic quality of the underlying vowel in 'can't.' It also accounts for the fact that English speakers perceive vowels as nonnasal even when they superficially aren't. In addition to governing the lexicon, vowel denasalization in English also applies to foreign words such as French [mamã], rendering it [mama] without nasalization. On the other hand, French [monami], without a nasalized first vowel, is rendered [mõnami] by English speakers (examples from Stampe 1973). The nasal vowel in French [mamã] cannot survive in English because there is no postvocalic nasal context in which it can be derived by processes of English. [khæ̃t] also lacks a postvocalic nasal, but its absence is due to the process which deletes nasals between a preceding vowel and a following stop in English. No such process exists for deleting postvocalic nasals not followed by a stop. In the case of French [monami] the first vowel is followed by a postvocalic nasal, thus making vowel nasalization derivable in English.

1.4 Constraints on underlying representation

In NP there are circumstances in which underlying representations may be deeper than 'phonemic'. The case of syllable-final obstruent devoicing in German furnishes an example. In German the words 'organization' and 'many colored' are homophonously [bunt] in their uninflected form, but their inflected alternants are [bunde] and [bunte] respectively. The neutralization of [d, t] in the uninflected form is due to obligatory application of the syllable-final obstruent devoicing process. Under such circumstances, where the lexical representation is relatable to the surface by processes, a 'morphophonemic' underlying representation is justified, and 'organization' is /bund/, /bunde/ and 'many-colored' is /bunt/, /bunte/.

There is one other case where non-allophonic alternants on the surface may be represented by a single underlying representation, viz. surface alternants that are derivable by optional application of processes, e.g. careful-casual alternants like 'hands' /hæ̃ndz/ → [hæ̃ndz] ~ [hæ̃nz], due to the optional application of [d] → Ø / n_z.

Thus in NP underlying segments are identical with their surface representations except that 1) 'allophonic' features are barred from underlying representation, and 2) 'morphophonemic' representations are legal where required by alternation provided they are mapped to the surface by obligatory or optional processes of the language.

1.5 Processes vs Rules

Stampe makes a sharp distinction between processes, which have synchronic phonetic motivation, and rules, which define conventional substitutions without synchronic phonetic motivation. These are the neo-phonetic and paleophonetic alternations, respectively, of Baudouin de Courtenay (1895). The so-called 'velar softening' rule, which alternates e.g. [k/s] in 'electri[k]' ~ 'electri[s]ity', is a rule, not a process. Rules can be ignored if a speaker so chooses without creating difficulties in pronunciation, e.g. 'electri[k]ity' is quite pronounceable by speakers of English, as are 'persnickity' and 'lickity split.' Conventional substitutions are typically obligatory, however, wherever they apply. (English speakers don't say 'electri[k]ity' even though capable of it.) Rules play no role in the processing of forms resulting from speech errors or 'tongue slips.' In a spoonerism, 'Cynical guys' becomes [ǵinikl saiz] not [dz̆inikl kai̯z], i.e. velar softening does not apply to /g/ before /i/, but fronting, a process, does. As Stampe (1973) puts it: 'Phonological constraints which are learned (i.e. rules RES) do not govern our phonetic behavior' (p. 44). Nor, in the example, does [s] in 'cynical' revert to /k/ before non /i/. Or to pick a better example (Donegan and Stampe 1979 p. 166), in the secret language Ob, in which the syllable /ab/ is inserted before every vowel, 'electricity' is pronounced [abilabɛktrabɪsabɪtabɪ], not *[--k--], even though the vowel of /ab/ would block velar softening of a hypothetical underlying /k/. This shows that the relationship of [s] in 'electricity' to /k/ in 'electric' is not represented in the mind as an underlying /k/ in 'electricity.' And in general, Donegan and Stampe conclude, such 'systematic phonemic' representations – underlying representations related to surface representation through the application of rules rather than processes – are not confirmed by empirical evidence. Hence, they conclude, rules play no role in either the productive or perceptual aspects of speech processing.

There is the possibility then that rules play only an interpretive role in the grammar. This in fact seems implicit in some of the non-NP writings which posit a less abstract lexical representation than is characteristic of generative phonology. Leben and Robinson (1977) suggest, for example, that if the level of lexical representation is phonemic then 'allophonic' rules might relate this to the surface while 'morphological' rules would apply interpretively to determine if lexical items shared a morpheme.

1.6 Constraints on the Application of Processes

1.6.1 Ordering

Processes which govern lexical representation are context-free where they determine the underlying segmental inventory and context-sensitive where they constrain underlying sequences. Where the effects of context-free and context-sensitive processes contradict each other, as in the case of vowel denasalization and nasalization, the context-free process is ordered first and governs the lexicon, rendering all underlying vowels nonnasal. The context-sensitive process governs surface structure and provides for the nasalization of vowels in a nasal context.

1.6.2 Suppression

Though the set of innate processes is universal, constraints on the application of processes are language particular and take the form of suppression, limitation, or ordering of the processes. In achieving mature speech the child is thus faced not with the task of learning processes but of learning to constrain their application in appropriate ways. Thus e.g. 'wabbit' will not become 'rabbit' until the English-speaking child has learned to suppress the context-free process r → w which substitutes simpler /w/ for more complex /r/. Since suppressing the application of a process takes effort, a frequent cause of language change is the failure of children acquiring language to suppress a process to the extent that mature speakers do.

The suppression of processes affects the phonological and phonetic structure of the language in various ways. In the case of vowel denasalization and nasalization four logical possibilities exist. These are shown below along with a representative language of each type and the phonological and phonetic consequences:

Table 1.1

Suppression of Processes

V-denas        V-nas          Language    Effect
Unsuppressed   Unsuppressed   English     /v/ → [v, ṽ]
Suppressed     Unsuppressed   Hindi       /v/ → [v]
                                          /v/ → [ṽ]
                                          /ṽ/ → [ṽ]
Suppressed     Suppressed     French      /v/ → [v]
                                          /ṽ/ → [ṽ]
Unsuppressed   Suppressed     Hawaiian    /v/ → [v]
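
The four language types in Table 1.1 can be restated as a small sketch. This is an illustration only: the function name and the Boolean encoding of 'suppressed' are assumptions introduced here, and the sketch simply reproduces the logic of the table (denasalization decides whether nasal vowels may be underlying; nasalization decides whether oral vowels may surface as nasal).

```python
ORAL = "v"
NASAL = "v\u0303"  # v with combining tilde, standing for any nasal vowel


def effects(denas_suppressed, nas_suppressed):
    """Map each possible underlying vowel to its possible surface forms."""
    # If context-free denasalization is unsuppressed, it governs the lexicon
    # and only oral vowels can be underlying.
    lexicon = [ORAL, NASAL] if denas_suppressed else [ORAL]
    result = {}
    for vowel in lexicon:
        if vowel == ORAL:
            # If context-sensitive nasalization is unsuppressed, an oral
            # vowel also has a nasal realization (in nasal contexts).
            result[vowel] = [ORAL] if nas_suppressed else [ORAL, NASAL]
        else:
            # An underlying nasal vowel surfaces as nasal.
            result[vowel] = [NASAL]
    return result


LANGUAGES = {
    "English": (False, False),
    "Hindi": (True, False),
    "French": (True, True),
    "Hawaiian": (False, True),
}

for name, (denas, nas) in LANGUAGES.items():
    print(name, effects(denas, nas))
```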

1.6.3 Limitation

In English there is a process that deletes [h] before sonorants – affecting the sequences [hn, hl, hr, hw, hy, hV]. Historically, the gradually more general application of [h]-deletion proceeded along a parameter based on the degree of sonority of the following segment. Starting before the least sonorous segment, [h]-deletion affected e.g. OE [hnutu] 'nut', [hlaxan] 'laugh', [hring] 'ring', and is presently affecting e.g. Mod E [hweil] 'whale', [hjuː] 'hue', [hau̯s] 'house' in various dialects. There is an implicational order involved such that speakers who delete [h] before [j] (pronouncing 'hue' and 'you' alike) also do so before [w] (pronouncing 'whale' and 'wail' alike). And those who delete [h] before vowels (e.g. Cockney speakers) do so before all other sonorants as well. The fact of the implicational relationship means we are dealing with a single process. Where [h] is invariably present that part of the process is suppressed. Where [h] is invariably absent the relevant sub-process of [h]-deletion governs the lexicon, and where [h] is variably absent the relevant sub-processes govern only surface structure.
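
The implicational relationship just described can be made concrete with a short sketch. It assumes, for illustration, that the contexts are ranked in the order in which the text lists them (n, l, r, w, j, vowel); the relative ranking of [n, l, r] among themselves is taken from that listing and is not argued for here. A dialect is consistent with the hierarchy if the contexts in which it deletes [h] form an unbroken stretch starting from the least sonorous one.

```python
# Following contexts for [h]-deletion, least to most sonorous.
# "V" stands for any vowel. The ordering is an assumption based on the
# order of the sequences listed in the text.
HIERARCHY = ["n", "l", "r", "w", "j", "V"]


def is_implicationally_consistent(deleting_contexts):
    """True if deletion before a context implies deletion before
    every less sonorous context."""
    ranks = sorted(HIERARCHY.index(c) for c in deleting_contexts)
    return ranks == list(range(len(ranks)))


# Deletes before [j] and everything less sonorous: consistent.
print(is_implicationally_consistent({"n", "l", "r", "w", "j"}))
# Deletes before [j] but keeps [hw]: excluded by the hierarchy.
print(is_implicationally_consistent({"n", "l", "r", "j"}))
```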


NOTES

(1) There are dialects where the pronunciations of (i) and (ii) differ in additional ways. Stampe has pointed out to me that some speakers have breaking, i.e. [æ] → [æe̯] (southern U.S.) or [æ] → [aə̯] ~ [æe̯] (northern U.S.) before nasals and therefore have a distinct vowel in 'can't.'
(2) A source of native speaker awareness of /n/ in [khæ̃t] is the presence of /n/ in 'can'. It is evidently there because speakers who experience no difficulty in pronouncing [khæt] backwards [thæk] will when asked to pronounce [khæ̃t] backwards attempt [tnæk] even though it is unpronounceable in English. Evidence for the lack of phonetic [n] in [khæ̃t] is the fact that the [t] is often flapped in 'can't I' [khæ̃ɾae] just as it is in 'cat eye' [khæɾae̯]. The presence of a preceding nasal would block flapping.


 

CHAPTER TWO

VOWELS AND GLIDES

2.1 Syllable and mora in Japanese

2.1.1 The syllable

Because the syllable is the basic unit of segmental organization in all languages, many patterns of segmental distribution and evolution may best be understood within the context of syllable structure. The syllable may be characterized as an organization of segments in a sonority pattern. An utterance is a series of syllables – sonority peaks surrounded by less sonorous satellite segments. Vowels, the most sonorous segments, typically serve as peaks, with obstruents, the least sonorous segments, as satellites. Liquids, nasals, and glides occupy intermediate positions between vowels and obstruents. Satellite segments which precede the syllable peak comprise the syllable onset, while those which follow the peak comprise the syllable offset. Perceptually the ideal syllable is characterized by maximum sonority at the peak and minimum sonority in the satellites. Coupled with the tendency toward open syllables (syllables with no offset) this may explain the universal presence in infant speech of combinations of the type [papa], [tata], [dada], [mama].

Japanese syllable structure approaches the ideal of vocalic peaks of sonority separated by relatively non-sonorous consonant onsets. There are more complicated canonical shapes as well (cf. Chapter 3) but the classic CVCV pattern prevails overall. Such a pattern offers maximum perceptual clarity, but lenition processes diminish this clarity in favor of ease of articulation. The tension between these two teleologies explains many of the contradictory developments in synchronic speech processing as well as in diachronic phonology. For example, all syllables have vocalic peaks in lexical representation, but the application of vowel weakening processes (cf. Chapter 4) sometimes results in syllables with obstruent or nasal peaks on the surface, e.g. /su.te.ki/ → [s.te.ki] 'nice', /uma/ → [m.ma] 'horse.'

Syllable structure influences the application of processes in various ways. For instance, other things being equal, onset segments are less likely to undergo lenition processing than offset segments, e.g. offset /n/ is subject to a number of assimilatory processes to which onset /n/ is immune:

(1) /tan.i/ → [tãã.i] 'unit'
(2) /ta.ni/ → [ta.ni] 'valley'

In (1) /n/ weakens under the influence of an adjoining vowel while in (2) it does not. (1) also illustrates the fact that the application of processes is often constrained by the presence of syllable boundaries. In Japanese the offset nasal segment will extend nasalization to the preceding peak but not to a segment of the following syllable.
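
The contrast between (1) and (2) can be sketched as follows. This is a simplified illustration, not the formulation of the processes developed later in the paper: it assumes that an offset /n/ before a vowel-initial syllable is realized as a nasalized copy of the preceding peak, which is itself nasalized, and that nothing crosses the syllable boundary. The function name and the list-of-syllables input format are introduced here for the example.

```python
TILDE = "\u0303"       # combining tilde: nasalization
VOWELS = set("aiueo")  # the five lexical vowels


def nasalize_offset_n(syllables):
    """Weaken offset /n/ before a vowel-initial syllable (toy version)."""
    out = []
    for i, syl in enumerate(syllables):
        nxt = syllables[i + 1] if i + 1 < len(syllables) else ""
        has_offset_n = len(syl) >= 2 and syl[-1] == "n" and syl[-2] in VOWELS
        if has_offset_n and nxt[:1] in VOWELS:
            peak = syl[-2]
            # Nasalize the peak and replace /n/ by a nasalized copy of it;
            # the following syllable is left untouched.
            syl = syl[:-2] + peak + TILDE + peak + TILDE
        out.append(syl)
    return out


print(".".join(nasalize_offset_n(["tan", "i"])))  # offset /n/ weakens
print(".".join(nasalize_offset_n(["ta", "ni"])))  # onset /n/ is immune
```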

2.1.2 The mora

Most treatments of Japanese phonology fail to distinguish between syllable and mora. Hattori (1960 p. 247ff) discusses syllable structure, and McCawley (1968 p. 58ff) and Martin (1967 p. 246 and 1975 p. I-1 ff) clearly contrast the two. But all too frequently the terms 'syllable' and 'mora' are used interchangeably in reference to what is in fact the mora.

The mora concept arises in languages which make a distinction between long and short syllables, e.g. Japanese, Classical Latin, Classical Greek. A mora is a rhythmical unit of which short syllables have one and long syllables have two. The extra mora in long syllables is due to the presence of an offset segment. Thus in Japanese /a/, /ka/, /kya/ are short syllables and /aa/, /kaa/, /kyaa/, /kyan/ are long syllables. McCawley (1968), using terminology from Troubetskoy, has dubbed Japanese a 'mora counting syllable language,' by which he means that the syllable is the prosodic unit or bearer of pitch accent (i.e. there is no distinction between syllables accented on the first mora and syllables accented on the second mora), and the mora is the unit of phonological distance (i.e. accent rules are of the type 'place the accent on the antepenultimate mora'). The example McCawley uses to show that both the syllable and the mora are necessary is the loan word /erebeetaa/ 'elevator' which in accordance with the above rule of accentuation of Japanese would be accented /erebeétaa/, but which in fact is accented /erebéetaa/ because long syllables are never accented on the second mora. The rule then must be worded 'place the accent on the syllable containing the antepenultimate mora.'
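
The reworded accent rule can be illustrated with a short sketch. It assumes, for the purposes of the example, that a word is given as a list of syllables, that every segment after the first vowel of a syllable is an offset segment contributing one mora, and that the accent is written on the first vowel of the accented syllable; the function names are introduced here and are not McCawley's.

```python
VOWELS = "aiueo"
ACUTE = "\u0301"  # combining acute accent, used to mark the accent


def first_vowel_index(syl):
    return next(i for i, ch in enumerate(syl) if ch in VOWELS)


def mora_count(syl):
    # One mora for onset + peak, plus one for each offset segment.
    return 1 + len(syl) - (first_vowel_index(syl) + 1)


def accented_syllable(syllables):
    """Index of the syllable containing the antepenultimate mora."""
    counts = [mora_count(s) for s in syllables]
    total = sum(counts)
    if total < 3:
        raise ValueError("word has fewer than three moras")
    target = total - 3  # 0-based position of the antepenultimate mora
    seen = 0
    for idx, n in enumerate(counts):
        if target < seen + n:
            return idx
        seen += n


def mark_accent(syllables):
    # The syllable is the accent bearer, so the mark always goes on its
    # first vowel, even when the targeted mora is its second one.
    idx = accented_syllable(syllables)
    syl = syllables[idx]
    v = first_vowel_index(syl)
    marked = syl[: v + 1] + ACUTE + syl[v + 1:]
    return "".join(syllables[:idx] + [marked] + syllables[idx + 1:])


print(mark_accent(["e", "re", "bee", "taa"]))
```

For /e.re.bee.taa/ the antepenultimate mora is the second mora of /bee/, so the sketch accents that syllable and writes the mark on its first vowel, matching the attested accentuation.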

There is other evidence that Japanese is a 'syllable language.' Ashworth and Lincoln (1973) cite verb stems derived from borrowed words which are shortened forms of the originals to which a native inflectional suffix is attached, e.g.

Table 2.1

Foreign Verb Stems

     Original                   Verb stem   Non-past
1.   /a.dzi.tee.syon/           /adzir-/    /adziru/    'to agitate'
2.   /de.mon.su.to.ree.syon/    /demor-/    /demoru/    'to demonstrate'
3.   /haa.mo.nai.dzu/           /hamor-/    /hamoru/    'to harmonize'
4.   /da.ben/(1)                /daber-/    /daberu/    'to chat idly'

To state the rule for deriving the verb stem from the original in 2, 3, and 4 it will be necessary to recognize not only the syllable but the difference between long and short syllables since the stem apparently must begin with two short syllables followed by /r/ even if the syllables in question were originally long. Thus the verb stem is formed on the first two syllables (not moras) of the original minus any offset segments.
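
The stem-formation rule stated above can be sketched in a few lines. The sketch assumes that the original is supplied already syllabified with periods, as in Table 2.1, and that an offset is everything following the first vowel of a syllable; the function names are hypothetical.

```python
VOWELS = "aiueo"


def short_form(syllable):
    """Strip any offset segments: keep the onset and the first vowel."""
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            return syllable[: i + 1]
    raise ValueError("syllable has no vowel: " + syllable)


def verb_stem(original):
    """First two syllables (not moras), minus offsets, plus /r/."""
    syllables = original.split(".")
    if len(syllables) < 2:
        raise ValueError("need at least two syllables")
    return short_form(syllables[0]) + short_form(syllables[1]) + "r"


for word in ["a.dzi.tee.syon", "de.mon.su.to.ree.syon",
             "haa.mo.nai.dzu", "da.ben"]:
    stem = verb_stem(word)
    print(word, "->", stem + "-", stem + "u")
```

Counting moras instead of syllables would give the wrong stems for 2, 3, and 4 (e.g. the first two moras of /haa.mo.nai.dzu/ are both in /haa/), which is why the rule has to refer to syllables.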

2.1.2.1 Mora duration

It is usually claimed (e.g. Han 1960) that in Japanese the moras in an utterance are of more or less equal duration, but spectrographic evidence (Wang 1968) shows that on the contrary the duration of moras differs widely depending on segmental constituency and other factors. Native speakers are aware of the number of moras in an utterance (or a line of poetry) and it is apparently length based on this quantity that they respond to rather than length measured in centiseconds. To the extent that the latter plays a role it is in the intention and perception of speech rather than its actuation.

2.1.2.2 Mora and the Kana Orthography

The confusion between the syllable and the mora in Japanese may derive to some extent from the 'syllabic' kana orthography with which the language is written. At its inception the kana orthography was, indeed, syllabic since there was apparently no distinction between long and short syllables in Old Japanese (OJ). Subsequent developments introduced such a distinction however, and today the kana orthography would be better termed 'moraic.'

2.2 Vowels

In her study of the natural phonology of vowels Donegan (1978) distinguishes three basic identifying vowel features – palatality, labiality, and sonority.(2) A cover term for the features palatality and labiality is color (timbre), and vowels with the palatal and/or labial feature are chromatic vowels. The relationship between color and sonority is such that vowels with a high degree of one will have a low degree of the other:

PALATALITY          LABIALITY
    i       ɨ       u
    e       ʌ       o
    æ       a       ɔ
         SONORITY

2.2.1 Context-free Vowel Processes

Donegan also proposes a number of universal context-free processes which account for the commonly occurring vowel substitutions in natural languages. Included are the following apparently contradictory pairs of processes: Raising – Lowering; Bleaching (depalatalization or delabialization) – Coloring (palatalization or labialization); and Tensing – Laxing. Raising, Coloring, and Tensing increase color, and Lowering, Bleaching, and Laxing increase sonority. Although the pairs of processes have contradictory teleologies, the fact is that only one member of the pair tends to apply under given phonetic conditions.

2.2.1.1 Raising

Raising tends to apply to chromatic vowels, especially lower ones, while Lowering tends to apply to achromatic vowels, especially higher ones:

[+ PAL]   [- PAL, - LAB]   [+ LAB]
   i            ɨ             u
   e            ʌ             o
   æ            a             ɔ

2.2.1.2 Bleaching

Bleaching is more likely to apply to lower chromatic vowels:

[+ PAL]            [- PAL, - LAB]            [+ LAB]
i →→                     ɨ                     ←← u
e →→→→                   ʌ                   ←←←← o
æ →→→→→→                 a                 ←←←←←← ɔ

2.2.1.3 Coloring

Coloring tends to apply to higher achromatic vowels:

[+ PAL]            [- PAL, - LAB]            [+ LAB]
i ←←←←←←                 ɨ                 →→→→→→ u
e ←←←←                   ʌ                   →→→→ o
æ ←←                     a                     →→ ɔ

A characteristic of any process is that the application of its subprocesses reflects a strict implicational hierarchy such that, for instance, Coloring never applies to a lower achromatic vowel unless it applies to higher ones (e.g. ʌ → e implies ɨ → i), and Lowering never applies to a mid achromatic vowel unless it applies to a high one as well (e.g. ʌ → a implies ɨ → ʌ). Since context-free processes in NP act in the manner of morpheme structure rules in generative phonology, the hierarchical conditions on the application of processes governing the lexicon account for the fact that certain vowels (e.g. /ɨ, æ, ɔ/) are frequently absent from underlying inventories.

2.3 Japanese Underlying Vowels

Japanese has the following five lexical vowels:

Table 2.2

Japanese Lexical Vowels

                    [+ PAL]   [- PAL, - LAB]   [+ LAB]
[+ HIGH]               i                          u
[- HIGH, - LOW]        e                          o
[+ LOW]                             a

From the absence of low chromatic vowels we can infer that Raising or Bleaching govern the lexicon. Likewise, the absence of the less sonorous achromatics implies that Lowering or Coloring also govern the lexicon. There is diachronic evidence for the raising of (long) low chromatic vowels. OJ [au̯] underwent monophthongization to [ɔɔ̯] (Nishihara 1970). We can assume [ai̯] > [ææ̯] as well since this latter remains in certain dialects (e.g. Hiroshima prefecture according to Itoo (1979)), but in the standard language these monophthongs were raised to [oo̯] and [ee̯] respectively.

Loan phonology furnishes further evidence of vowel processes that govern the lexicon. Lovins (1973) cites a number of correspondences between foreign vowels and their Japanese cognates, e.g.

Table 2.3

Foreign Vowels and Their Japanese Cognates

Foreign   Japanese   Process     Example
/ɔ/       o          Raising     /ooto/ 'auto'
/ʌ/       a          Lowering    /rabu/ 'love'
/æ/       a          Bleaching   /batto/ 'bat'
/ə/       a          Lowering    /saakasu/ 'circus'

(See Lovins ibid. p. 72ff for discussion of exceptions due to orthographic influence and other reasons.)

2.3.1 Vowel length

As is typical of mora-counting languages, vowel length in Japanese is distinctive, and all five vowels occur in both long and short varieties. Long vowels will be written as a vowel plus homorganic glide sequence (VV̯), the vowel representing the syllable peak and the homorganic glide the offset e.g.

/ki/ 'tree'   –   /kii̯/ 'strange'
/me/ 'eye'   –   /mee̯/ 'niece'
/obasan/ 'aunt'   –   /obaa̯san/ 'grandmother'
/ku/ 'nine'  –   /kuu̯/ 'emptiness'
/ko/ 'child'  –   /koo̯/ 'this way'

Although for convenience long vowels are here written as two segments, there is evidence, both diachronic and synchronic, that they are in fact unitary. Historically, long vowels have not undergone diphthongization as might be expected if they were structurally bipartite (cf. Donegan 1978, p. 56). What changes they have undergone, e.g. raising [ææ̯] > [ee̯], have affected the long vowel as a whole. There is also evidence from synchronic speech processing. Martin (1974) cites a game called sakasa kotoba 'upside-down words' (also the source of a special language referred to as yakuza kotoba 'gangster argot') which requires that the first and last portions of words be reversed, e.g. /kore/ → /reko/. Table 2.4 contains examples involving long vowels and other syllables:

Table 2.4

'Upside-down Words'

     Original                  Reversed
1.   /tai̯ra/ 'Proper Name'     /rai̯ta/
2.   /kappu/ 'cup'             /pukka/
3.   /ringo/ 'apple'           /gonri/ ~ /gorin/
4.   /gindza/ 'Ginza'          /dzangi/ ~ /dzagin/
5.   /byoo̯bu/ 'screen'         /bubyoo̯/
6.   /tii̯dzu/ 'cheese'         /dzutii̯/

Where long syllables are involved the syllable is divided if it contains a diphthong (example 1) or an offset nasal (examples 3 and 4). However, it is never divided if it contains a long vowel (examples 5 and 6).
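The behavior of the game with respect to the three kinds of offset can be sketched as follows. This is an illustrative simplification of mine, not Martin's description: the function name, the pre-segmented input, and the labels "diphthong", "nasal", and "long" are all assumptions, and the transcriptions omit the non-syllabic diacritic.

```python
def reverse_word(first, offset, last, offset_kind):
    """Return the possible 'upside-down' forms of a word analysed as
    first mora + offset of the first syllable + last syllable."""
    if offset_kind == "long":
        # A long vowel is unitary: the syllable moves as a whole.
        return [last + first + offset]
    swapped = last + offset + first          # syllable divided, offset stays put
    if offset_kind == "nasal":
        # Offset nasals allow both the divided and the undivided variant.
        return [swapped, last + first + offset]
    return [swapped]                         # diphthong: syllable divided

print(reverse_word("ta", "i", "ra", "diphthong"))
print(reverse_word("ri", "n", "go", "nasal"))
print(reverse_word("byo", "o", "bu", "long"))
```

The point of the sketch is only that the long-vowel case needs no rule of division at all, which is what one expects if long vowels are single units.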

Long vowels may also arise on the surface in lenited speech due to the application of the process Vowel Coalescence (cf. §4.1.4) to a sequence of identical vowels in adjacent syllables, e.g. [suu.ri] 'vinegar vendor' (< /su/ 'vinegar' + /uri/ 'sell').

2.3.2 Diphthongs

A diphthong is a syllable nucleus with two vowel segments only one of which is syllabic. The non-syllabic may come from an adjoining consonant which is weakened, e.g. z > y, b > w; or from an adjoining vowel which loses its syllabicity, e.g. i > y, u > w. Diphthongs may also arise from simple vowels (Donegan 1978 p. 111) but such a development seems not to have occurred in Japanese. In on-gliding diphthongs the non-syllabic precedes the syllabic, e.g. [ya, wa]. In off-gliding diphthongs the non-syllabic follows the syllabic, e.g. [ai̯, au̯]. In mora-counting languages such as Japanese an off-gliding diphthong constitutes a long (i.e. two-mora) syllable.

2.3.2.1 On-gliding diphthongs

OJ had the following on-gliding diphthongs (Martin 1976):

ye, ya, yo, yu
wi, we, wa, wo

Due to the gradually more general application of processes eliminating prevocalic glides SJ now has only:

ya, yo, yu
wa

and [wa] alternates with [a] in hypoarticulate speech (cf. Chapter 4). The loss of the labial glide before vowels occurred in the following historical order: earliest before the palatal vowels /i,e/ (where it apparently shifted to /y/, which was itself subsequently lost (Martin 1975)); next before /o/ (this is a recent development); and last before /a/, where it is now optionally fronted or deleted depending on the environment. The following two processes account for these developments:(3)

(P1)   GLIDE FRONTING:   W → Y / ___ [V, - lab, ! higher]

Before /i,e/ this process governs the lexicon assuring */wi, we/. Before /a/ and preceded by a palatal vowel it optionally governs surface structure, accounting for [iwa, ewa] → [iya, eya], e.g. /i wa sinai/ → [i ya sinai] 'it is not found', /kore wa/ → [kore ya] 'this (topic)'. The application of this subprocess of P1 is most frequent where the topic marker 'wa' is involved.

(P2)   PRE VOC LAB GLIDE DEL:   W → Ø / ___ [V, - pal, ! higher]

Before /u,o/ this process governs the lexicon assuring */wu,wo/. Before /a/ application is constrained as follows:

(i) Preceded by a non-palatal vowel it optionally governs surface structure, e.g. /boku wa/ → [boku a] 'I (topic)', /soko wa/ → [soko a] 'there (topic)', /siawase/ → [s̆iaase] 'happiness'.

(ii) Preceded by # (word boundary) it usually does not apply, e.g. /waza waza/ → *[aza aza]. (Note however [atas̆i], a female speech version of [watas̆i] 'I'.)

The loss of /ye/ was due to the palatal counterpart of LAB GLIDE DEL and is accounted for by the following process:

(P3)   PRE VOC PAL GLIDE DEL:   Y → Ø / ___ [V, + pal, ! higher]

(P1), (P2), and (P3) together assure */yi, ye, wi, we, wu, wo/. The effects of (P2) and (P3) can be seen in loan phonology. The following data is from Lovins (1973 p. 97):

/iisuto/    'yeast'
/eroo/      'yellow'
/uuru/      'wool'
/ikooru/    'equal'

In Chapter V alternate forms of these loan words are discussed.
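The joint effect of (P1), (P2), and (P3) on the lexicon can be illustrated with a short sketch. It is my own simplification and models only the obligatory, lexicon-governing subprocesses (before /i, e/ for P1 and P3, before /u, o/ for P2); the optional surface applications before /a/ are deliberately left out, and the function name is hypothetical.

```python
def lexical_form(glide, vowel):
    """Apply the lexicon-governing parts of P1-P3 to a glide + vowel sequence."""
    if glide == "w" and vowel in "ie":    # (P1) Glide Fronting
        glide = "y"
    if glide == "w" and vowel in "uo":    # (P2) Prevocalic Labial Glide Deletion
        glide = ""
    if glide == "y" and vowel in "ie":    # (P3) Prevocalic Palatal Glide Deletion
        glide = ""
    return glide + vowel

# Sequences that pass through all three processes unchanged.
survivors = [g + v for g in "yw" for v in "ieaou" if lexical_form(g, v) == g + v]
print(survivors)
```

Only the sequences that emerge unchanged are admissible lexical on-gliding diphthongs; the same function maps the initial sequences of 'yeast', 'yellow', and 'wool' onto the vowel-initial forms cited from Lovins.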

The 'mirror image' of PRE VOC PAL GLIDE DEL applies optionally in hypoarticulate speech to delete [y] after palatal vowels before non-palatals, as in [iyoo̯] → [ioo̯] 'let's stay' (cf. §4.2.3).

Although there is diachronic evidence for Glide Fronting, at least before /e/, synchronic evidence is difficult to come by due to */yi, ye/. However there are some hypoarticulated forms which show its effects, e.g.

/inoue/ → [inouwe] → [inouye] 'Proper Name'
/nihon e/ → [nihõõ e] → [nihõõ we] → [nihõõ ye] 'to Japan'

The /w/ in these forms arises from application of Glide Epenthesis, a fortition process described in §2.3.3. It is not unusual for sequences which are lexically inadmissible (e.g. [we] and [ye] in these examples) to be created on the surface in hypoarticulate speech processing. Stampe (1973) gives the example of the hypoarticulate form [bniθ] 'beneath' which violates the lexical constraint in English against initial stop-nasal clusters.

2.3.2.2 Off-gliding diphthongs

There is general agreement that the off-gliding diphthongs in SJ are [ai̯, ei̯, oi̯, ui̯] (Hattori 1960, Martin 1975, McCawley 1968) with Martin adding [au̯] as a possibility in loan words. But the phonetic basis for these assertions is not clear. Martin (1967) gives only 'morphophonemic' status to bi-moric syllables saying they have 'nothing to do with any assumed physiological manifestations (p. 247).' Later however (Martin 1975) he says the notion of the bi-moric syllable is based on 'auditory impression and on accent behavior; it is assumed that experimental phonetic investigations will prove the existence of an articulatory (motor-production) unit that corresponds to these ... syllables (p. I-2).' If we assume bi-moric syllables with two vowel qualities (i.e. off-gliding diphthongs) then one of the vowels must be [+ syllabic] (the peak) and one [- syllabic] (the offset). It follows that the well-known constraint that only the first mora of a long syllable can bear the accent (McCawley 1968) is due to the fact that accent must fall on the syllable peak. In [tadái̯ma] 'right now' (< [táda] 'just' + [íma] 'now') the accent shift is a prerequisite to treating what were originally two separate vowels as a diphthong. In /siíru/ 'to force' the location of the accent shows the /i/'s are not tautosyllabic. (Cf. Martin (1975 p. 24-5) for a relevant discussion of verb accentuation.)

A criterion for assigning two vowel qualities to the same syllable might be their joint participation in processes such as monophthongization. Historically the substitutions [ai̯, ei̯, oi̯] > [ee̯] have occurred in at least some dialects, and the same may be said for [au̯, ao̯, ou̯] > [oo̯] and for [e̯] > [yoo̯]. In the discussion below vowel sequences which form the input to processes of this sort are assumed to be off-gliding diphthongs.

In the history of Japanese, intramorphemic vowel sequences arose from diachronic developments of the type CVCV > CVV where the elided consonant included among others the palatal and labial glides discussed in §2.3.2.1. The resulting vowel sequences plus subsequent diachronic and synchronic vowel sandhi developments, if any, are indicated below. It will be seen that the same processes are involved in diachronic and synchronic substitutions, the different effects being due to limitations on the application of the process in one case but not the other. (All diachronic evidence cited in 1-20 below is from Martin 1976.)

/Vi̯/ Sequences:

1. /ei̯/ virtually non-occurring in ModJ due to /ei̯/ > /ee̯/ which governs the lexicon for most speakers. e.g. /sensei̯/ > /sensee̯/ 'teacher'; also applies in loan phonology (cf Chapter 5).
2. /ai̯/ occurs in ModJ. Diachronic evidence for /ai̯/ > /ææ̯/ >/ee̯/. Merges with /ae̯/ in hypoarticulate speech (cf Chapter 4) and in some dialects (cf Iitoyo 1976, p. 272 ff).
3. /oi̯/ occurs in ModJ. Diachronic evidence for /oi̯/ > /ee̯/. Merges with /oe̯/ in hypoarticulate speech.
4. /ui̯/ occurs in ModJ. Merges with /ii̯/ in hypoarticulate speech.

/Vu̯/ Sequences:
5. /ou̯/ virtually non-occurring in ModJ due to /ou̯/ > /oo̯/ which governs the lexicon for most speakers, e.g. /kou̯/ > /koo̯/ 'this way'. Also applies in loan phonology.
6. /au̯/ non-occurring (except in recent loans) in ModJ morphemes due to /au̯/ > /oo̯/ e.g. /siyau̯/ > /siyoo̯/ 'a method'.
7. /eu̯/ non-occurring in ModJ due to /eu̯/ > /yoo̯/ e.g. /keu̯/ > /kyoo̯/ 'today'.
8. /iu̯/ occurs in ModJ though there is diachronic evidence (e.g. /yorosiku/ > /yorosiu̯/ > /yorosyuu̯/ 'good') and synchronic evidence (cf Chapters 4 and 5) for /iu̯/ > /yuu̯/.

/Ve̯/ Sequences:
9. /ae̯/ occurs in ModJ though there is diachronic evidence for /ae̯/ > /ee̯/ (e.g. /temae̯/ > /temee̯/ 'you' in some dialects).
10. /oe̯/ occurs in ModJ.
11. /ue̯/ occurs in ModJ. Evidence for [ue̯] → [(w)ee̯] in hypoarticulate speech and loan words.
12. /ie̯/ occurs in ModJ though there is diachronic evidence for /ie̯/ > /ee̯/ e.g. /kie̯ru/ > /kee̯ru/ 'to be extinguished' (Hiroshima dialect) and evidence in hypoarticulate speech for [ie̯] → [(y)ee̯].

/Vo̯/ Sequences:
13. /ao̯/ occurs in ModJ though there is diachronic evidence for /ao̯/ > /oo̯/ e.g. /mao̯su/ > /moo̯su/ 'to say'.
14. /uo̯/ occurs in ModJ. There is evidence in hypoarticulate speech and loan words for [uo̯] → [(w)oo̯].
15. /io̯/ occurs in ModJ. There is evidence in hypoarticulate speech for [io̯] → [yoo̯].
16. /eo̯/ non-occurring in ModJ due to /eo̯/ > /yoo̯/ e.g. /meo̯to/ > /myoo̯to/ 'husband and wife'.

/Va̯/ Sequences:
There are no /Va̯/ sequences from diachronic developments. The few occurrences in ModJ are due to recent loans.
17. /ia̯/ e.g. /kasimia̯/ 'cashmere' but there is evidence for /ia̯/ → /iya/ e.g. /siberiya/ ~ /siberia̯/ 'Siberia', /itariya/ ~ /itaria̯/ 'Italy' and many such pairs. Also school children are known to have spelling difficulty with the distinction. There is also evidence for /ia̯/ → /ya/ e.g. /syamu/ 'Siam' (Lovins 1973 p. 98). There is some diachronic evidence for /ia̯/ > /e/ e.g. /kiyasu/ > /kia̯su/ > /kesu/ 'to erase'.
18. /ea̯/ e.g. /sukuea̯/ 'square' but there is evidence for /ea̯/ → /eya/ e.g. /hea̯/ ~ /heya/ 'hair', /hurea̯/ ~ /hureya/ 'flare' (Lovins ibid p. 77).
19. /oa̯/ e.g. /doa̯/ 'door', /sukoa̯/ 'score'.
20. /ua̯/ e.g. /manikyua̯/ 'manicure', /amatyua̯/ 'amateur'.

2.3.2.2.1 Types of off-gliding diphthongs

Typologically the sequences in 1-16 may be divided into three groups of off-gliding diphthongs:

(I) Up-gliding diphthongs not of mixed color: /ei̯, ai̯, ou̯, au̯, ae̯, ao̯/.
(II) Up-gliding or level diphthongs of mixed color: /oi̯, ui̯, eu̯, iu̯, oe̯, eo̯/.
(III) Down-gliding diphthongs: /ie̯, io̯, ue̯, uo̯/.
Historical and dialectal evidence indicates that where the non-syllabic was higher than the syllabic (an up-gliding diphthong), the non-syllabic sought the height of the syllabic. I will account for this development by the process

(P4)   PROGR HT ASSIM:   [n height]   →   [n - 1]   /   V [m height] ___
Condition: n higher than m

Where the vowel was low and the glide high, the assimilation presumably occurred in two steps through iterative application of the process:

Group I    ei̯ > ee̯
           ou̯ > oo̯
           ai̯ > ae̯ > aæ̯
           ae̯ > ae̯ > aæ̯
           au̯ > ao̯ > aɔ̯
           ao̯ > ao̯ > aɔ̯
Group II   oi̯ > oe̯
           eu̯ > eo̯

The sequences involving low achromatic vowels were then subject to

(P5)   VOWEL COLORING:   V [- color]   →   [α color]   /   ___ V [α color]

e.g. aæ̯ > ææ̯
e.g.   aɔ̯ > ɔɔ̯

[ɔɔ̯] < [au̯] is attested in MidJ (Nishihara 1970) and was subsequently raised to [oo̯]. [ææ̯] is attested in some ModJ dialects (Itoo 1979, Iitoyo 1976) but in SJ was raised to [ee̯].

For Group I sequences the result of these developments was monophthongization which effected the mergers

ai̯, ae̯, ei̯   >   ee̯
au̯, ao̯, ou̯   >   oo̯

Group II sequences underwent further change by Syllabicity Reversal, a process which applies to off-gliding diphthongs whose non-syllabic is not higher than its syllabic:

(P6)   SYL REV:   [+ syll, - cons, n ht] [- syll, - cons, m ht]   →   [- syll, - cons] [+ syll, - cons, + long]
Condition: m not higher than n

The [+long] syllabic on the right side of the arrow is written VV̯ in accordance with the convention proposed in §2.3.1, e.g.:

oe̯   >   o̯ee̯
eo̯   >   e̯oo̯
ui̯   >   wii̯
iu̯   >   yuu̯

The resulting non-high onset glides underwent Raising (the same process proposed in §2.2.1.1 for vowels). Glides are tenser and more chromatic than their homorganic vowels and so are even more subject to Raising. This implicational relationship (i.e. the raising of vowels implies the raising of homorganic glides) establishes that the same process is involved. The following formulation is from Donegan (1978) with a slight revision to include glides:

(P7)   RAISING:   [! - cons, ! n high, ! + chromatic, ! + tense, ! lower]   →   [n + 1 high] + tense
(! = especially)

e.g.   o̯ee̯ → wee̯
       e̯oo̯ → yoo̯

[wee̯] was then subject to Glide Fronting and Palatal Glide Del: [wee̯] → [ yee̯ ] → [ ee̯ ].

The following are complete diachronic derivations exemplifying monophthongization and syllabicity reversal:

                       eu̯     ui̯     oi̯     iu̯     ai̯     au̯     ei̯     ou̯
GLIDE HT ASSIM         eo̯            oe̯            aæ̯     aɔ̯     ee̯     oo̯
VOWEL COLORING                                     ææ̯     ɔɔ̯
SYLLABICITY REVERSAL   e̯oo̯    wii̯    o̯ee̯    yuu̯
RAISING                yoo̯           wee̯           ee̯     oo̯
GLIDE FRONTING                yii̯    yee̯
GLIDE DELETION                ii̯     ee̯
                       [yoo̯]  [ii̯]   [ee̯]   [yuu̯]  [ee̯]   [oo̯]   [ee̯]   [oo̯]
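The ordered application of these processes can be traced mechanically. The sketch below is an illustration of mine under several simplifying assumptions: vowel height is coded on a three-point scale, Syllabicity Reversal is restricted to sequences whose two members differ in quality (so that long vowels are left alone), the dictionaries and the function name are hypothetical, and the output omits the non-syllabic diacritic. It is meant to cover only the eight inputs of the derivation table.

```python
HEIGHT = {"i": 3, "u": 3, "e": 2, "o": 2, "æ": 1, "a": 1, "ɔ": 1}
LOWER = {"i": "e", "e": "æ", "u": "o", "o": "ɔ"}   # one step down, same color
RAISE = {"æ": "e", "ɔ": "o", "e": "i", "o": "u"}   # one step up, same color
PALATAL = {"i", "e", "æ"}

def derive(peak, offglide):
    onglide = ""
    # (P4) Height assimilation, applied iteratively to the non-syllabic.
    while HEIGHT[offglide] > HEIGHT[peak]:
        offglide = LOWER[offglide]
    # (P5) Vowel Coloring: achromatic /a/ takes the color of a low chromatic offglide.
    if peak == "a" and offglide in ("æ", "ɔ"):
        peak = offglide
    # (P6) Syllabicity Reversal: the old peak becomes an onglide, the new peak is long.
    if peak != offglide:
        onglide, peak = peak, offglide
    # (P7) Raising of non-high onglides and of long low chromatic vowels.
    if onglide in ("e", "o"):
        onglide = RAISE[onglide]
    if peak in ("æ", "ɔ"):
        peak = offglide = RAISE[peak]
    onglide = {"i": "y", "u": "w"}.get(onglide, onglide)
    # Glide Fronting, then Palatal Glide Deletion, before palatal vowels.
    if onglide == "w" and peak in PALATAL:
        onglide = "y"
    if onglide == "y" and peak in PALATAL:
        onglide = ""
    return onglide + peak + offglide

for pair in [("e", "u"), ("u", "i"), ("o", "i"), ("i", "u"),
             ("a", "i"), ("a", "u"), ("e", "i"), ("o", "u")]:
    print("".join(pair), "->", derive(*pair))
```

Run on the eight inputs, the function returns the bottom row of the derivation table, which shows that a single fixed ordering of the processes suffices for both the monophthongizing and the reversing sequences.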

2.3.2.2.2 Derivation vs. Borrowing

Stated in their most general form processes often define the goal of a sound change in progress and provide a principled account of the relationship between various dialects in terms of evolution toward that goal. However processes do not always apply in their most general form, and sound changes frequently stop short of completion e.g. there are many [ai̯] sequences which did not monophthongize or which did so only in some dialects. Generative treatments (e.g. Hasegawa 1979, Slawson 1970) often note that only certain lexical items are subject to the 'ai̯, oi̯, ae̯ → ee̯ rule' and that there are certain social constraints on its application. Consider the following pairs from Hasegawa (ibid p. 129)

1. /atarimae̯/    /atarimee̯/    'of course'
2. /nai̯/         /nee̯/         'negative suffix'
3. /sugoi̯/       /sugee̯/       'terrific'
4. /koi̯tu/       */kee̯tu/      'this man'
5. /sai̯go/       */see̯go/      'the last'
6. /kimae̯/       */kimee̯/      'generosity'

The forms on the left occur in SJ. The use of the right hand forms in examples 1, 2, and 3 by standard speakers is restricted to males. The impression is dialectal or slangy. The right hand forms in examples 4, 5, and 6 do not occur at all in standard speech. Under the circumstances it seems clear that the [ai̯] – [ee̯] alternation in the above pairs is no longer a phonetically motivated substitution. The alternants are equally pronounceable by standard speakers and are not sensitive to rate of speech or other factors which might suggest phonetic motivation. For SJ the source of the right hand forms is interdialectal borrowing, not derivation by processes from the forms on the left. In Chapters 4, 5, and 6 evidence from hypoarticulate speech, loan phonology, and Hawaiian Japanese, respectively, will be examined to throw light on the synchronic status of these processes.

The down-gliding diphthongs labelled Group III in §2.3.2.2.1 (i.e. [ie̯, io̯, ue̯, uo̯]) do not seem in general to have undergone SYL REV historically, though they meet the input conditions as formulated for that process. (The development /kie̯ru/ > /kee̯ru/ cited in §2.3.2.2 is an exception, however.) Apparently SYL REV applied primarily to sequences where the syllabic and non-syllabic were the same height, as in Group II. However, we shall see that the application of SYL REV does include down-gliding diphthongs in hypoarticulate speech (cf Chapter 4) and loan words (cf Chapter 5).

2.3.3 Glide Epenthesis

Besides the lenition processes so far considered in this chapter there are also fortition processes which act to eliminate vowel sequences and restore a CVCV pattern. Glide Epenthesis is an optional fortition which interposes a non-syllabic between two vowels. This prevents the lenition of vowels in a sequence thus maximizing perception, the typical result of fortition processing.

(P8)   GLIDE EPENTHESIS:   Ø   →   [- cons, - syll, α color]   /   [+ syll, α color, ! higher] ___ [+ syll, - α color, ! lower]

Examples:

1. /siawase/ → [siyawase] 'happiness'
2. /sio/ → [siyo] 'salt'
3. /iu/ → [iyu] 'to say'
4. /kore o/ → [koreyo] 'this (accusative)'
5. /o atuu̯/ → [owatsuu̯] 'hot (honorific)'

The application of Glide Epenthesis can result in mergers of the following sort:

6. /kioo̯/ 'the past' → [kiyoo̯] (merges with /kiyoo̯/ 'dexterity').
7. /kare ni aru/ 'belongs to him' → [kare ni yaru] (merges with /kare ni yaru/ 'give to him').
8. /uti ni oru/ 'be at home' → [utsi ni yoru] (merges with /uti ni yoru/ 'call at home').
9. /o ari desu ka/ 'does it exist?' → [owari des ka] (merges with /owari desu ka/ 'are you finished?').
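The core of Glide Epenthesis can be sketched in a few lines. This is an illustration of mine rather than a full statement of (P8): it ignores the '!' (especially) conditions on height and the optionality of the process, treats every input as a single unspaced string of one-letter segments, and the table and function names are hypothetical.

```python
GLIDE_FOR = {"i": "y", "e": "y", "u": "w", "o": "w"}     # glide sharing the vowel's color
COLOR = {"i": "pal", "e": "pal", "u": "lab", "o": "lab", "a": None}

def epenthesize(word):
    """Insert a homorganic glide after a chromatic vowel when the next
    segment is a vowel of a different color (or achromatic /a/)."""
    out = []
    for seg, nxt in zip(word, word[1:] + " "):
        out.append(seg)
        if seg in GLIDE_FOR and nxt in COLOR and COLOR[nxt] != COLOR[seg]:
            out.append(GLIDE_FOR[seg])
    return "".join(out)

for form in ["siawase", "sio", "iu", "koreo", "oari"]:
    print(form, "->", epenthesize(form))
```

Because the inserted glide copies the color of the first vowel, /i, e/ yield [y] and /u, o/ yield [w], which is what produces the mergers listed above.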


NOTES

(1) daben 'idle chat' is a Sino-Japanese compound.
(2) I owe much of my understanding of vowel processes to Donegan (1978) and (Donegan) Miller (1972, 1973).
(3) The notation used in the formalization of processes will be alphabetical where no sacrifice in clarity will result. Where feature notation is used it will vary from cover terms such as 'glide,' 'height,' 'color' to the feature terms themselves. In all cases the allusion is to articulatory gestures. The notational device '!' (read 'especially') is from (Donegan) Miller (1972). The feature matrix for consonants will appear in Chapter III. That for vowels and glides is as follows:

        i   e   a   o   u     y(i̯)  e̯   a̯   o̯   w(u̯)
cons    -   -   -   -   -      -    -   -   -    -
syll    +   +   +   +   +      -    -   -   -    -
pal     +   +   -   -   -      +    +   -   -    -
lab     -   -   -   +   +      -    -   -   +    +
high    +   -   -   -   +      +    -   -   -    +
low     -   -   +   -   -      -    -   +   -    -


 

CHAPTER THREE

CONSONANTS

3.0 Consonants

3.1 Japanese has the following twelve underlying consonants:

Table 3.1

Underlying Consonant Inventory

p t s k h
b d dz g  
m n r    

Consonants will be referred to either segmentally, as above, or by feature specification. The following articulatorily based features will be necessary:

Table 3.2

Consonant Feature Matrix

p b t d s dz k g h m n r
cons + + + + + + + + + + + +
syll - - - - - - - - - - - -
son - - - - - - - - - + + +
high - - - - - - + + - - - -
back - - - - - - + + - - - -
low - - - - - - - - + - - -
ant + + + + + + - - - + + +
cor - - + + + + - - - + + +
voi - + - + - + - + - + + +
cont - - - - + - - - + - - -
nas - - - - - - - - - + + -
str - - - - + + - - - - - -
del rel - - - - ? + - - - - - -
flap - - - - - - - - - - - +
lab + + - - - - - - - + - -
pal - - - - - - - - - - - -

3.1.1 Underlying /dz/

The choice of /dz/ as the voiced counterpart of /s/ requires some comment. Most analyses posit /z/ with an accompanying statement that for most SJ speakers it is realized as [dz] initially and [z] intervocalically. Structurally, the /z/ analysis is more symmetrical and in terms of language universals it is usually true that languages with affricates have the corresponding fricatives as well. Nevertheless, from the point of view of NP it seems more natural to make the stronger form underlying and provide for the occurrence of the weaker 'allophone' by a lenition process. The NP analysis involves a context-sensitive fortition z → dz /.__ (. = syllable boundary) which governs the lexicon and a context-sensitive lenition dz → z / V-V ordered after it which governs derived structure. Han (1960) describes the distribution of the two variants as follows:

[dz] and [z] do not contrast. They are in free variation with some speakers and with others they are in complementary distribution. In the speech of the author, a number of spectrographic experiments revealed that [dz] occurs initially as in [dzuibun] (considerably) [dzoo] (elephant), whereas [z] occurs in non-initial position as in [suzus̆ii] (is cool) [s̆izuka] (is quiet). However, in slow careful speech [dz] may occur in non-initial position. (pp 49-50)

There is also data from non-standard dialects (cf. Iitoyo 1974) showing [dz] in all positions. Frequent substitutions in children's speech of the sort [yowamus̆i] → [yowamuts̆i] 'sissy' suggest that the z → dz fortition may be part of a more general process s, z → ts, dz with the voiceless portion eventually suppressed by the child as he approximates mature speech. Historically Arisaka (1957) claims that sibilants in ModJ were affricates in OJ. If so, the more general form of the process governed the OJ lexicon.(1)

3.1.2 Syllable initial (onset) distribution:

The twelve consonants combine with the five simple vowels /a, i, u, e, o/ and the rising diphthongs /ya, yu, yo, wa/ (each of which constitutes a short syllable itself) to form the following 100 short syllables:

Table 3.3

Short Syllable Inventory

a    pa    ta    ka    ba    da    ga    sa    dza    ha    ma    na    ra
i    pi    ti    ki    bi          gi    si    dzi    hi    mi    ni    ri
u    pu    tu    ku    bu          gu    su    dzu    hu    mu    nu    ru
e    pe    te    ke    be    de    ge    se    dze    he    me    ne    re
o    po    to    ko    bo    do    go    so    dzo    ho    mo    no    ro
ya   pya   tya   kya   bya         gya   sya   dzya   hya   mya   nya   rya
yu   pyu   tyu   kyu   byu         gyu   syu   dzyu   hyu   myu   nyu   ryu
yo   pyo   tyo   kyo   byo         gyo   syo   dzyo   hyo   myo   nyo   ryo
wa

/wa/ may not be preceded by a tautosyllabic consonant in SJ due to POST CONS LAB GLIDE DEL which governs the lexicon, but /kwa/ and /gwa/ occur in some dialects.

(P9)   POST CONS LAB GLIDE DEL:   /w/ → Ø / .C__

For an explanation of */di, du, dya, dyu, dyo/ see 3.1.4.2.5.
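The size of the short syllable inventory can be checked by enumeration. The sketch below is my own illustration: it assumes that all twelve consonants of Table 3.1, including /t/, as well as the zero onset combine freely with the nine nuclei, subject only to the two gaps stated in the text, namely no consonant before /wa/ (P9) and no */di, du, dya, dyu, dyo/. The names used are hypothetical.

```python
ONSETS = ["", "p", "t", "k", "b", "d", "g", "s", "dz", "h", "m", "n", "r"]
NUCLEI = ["a", "i", "u", "e", "o", "ya", "yu", "yo", "wa"]

def admissible(onset, nucleus):
    if nucleus == "wa":
        return onset == ""                    # (P9): no tautosyllabic C before /wa/
    if onset == "d":
        return nucleus in ("a", "e", "o")     # */di, du, dya, dyu, dyo/
    return True

short_syllables = [o + n for n in NUCLEI for o in ONSETS if admissible(o, n)]
print(len(short_syllables))
```

Under these assumptions the enumeration yields exactly the 100 short syllables stated above.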

3.1.3 Syllable final (offset) distribution:

The consonants /p, t, k, s, n/ may also occur in syllable final position. As offset segments they have the value of one mora (they are often referred to as 'syllabic' or 'mora' consonants) and the syllables in which they occur are long syllables.(2) Thus the 100 short syllables of Table 3.3 form the basis for 500 possible long syllables when combined with each of the five offset consonants, e.g. /ap, at, ak, as, an, tap, tat, tak, tas, tan/.(3)

3.1.3.1 Distribution of offset nasal

Offset /n/ may be followed by any segment or by # (word boundary), e.g. /den.po/ 'telegram', /han.tai̯/ 'opposite', /den.wa/ 'telephone', /hon.ya/ 'bookstore', /tan.i/ 'a unit', /gan#/ 'cancer'.

3.1.3.2 Distribution of offset obstruents

There is a sequential constraint on the occurrence of offset obstruents /p, t, k, s/ such that they may not be followed by # and that the following syllable must begin with an identical obstruent. The result is inter-syllabic geminates as in /ip.pai̯/ 'full', /ot.to/ 'husband', /kek.kyo.ku/ 'after all', /is.syo/ 'together'.
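The sequential constraint on offset obstruents amounts to a simple well-formedness check over syllabified forms. The following sketch is mine; it assumes the word is supplied as a list of syllables in plain romanization, and the function name is hypothetical.

```python
OBSTRUENT_OFFSETS = ("p", "t", "k", "s")

def offsets_ok(syllables):
    """True if every offset obstruent is followed by a syllable
    beginning with the identical obstruent (and hence never by #)."""
    for i, syl in enumerate(syllables):
        if syl.endswith(OBSTRUENT_OFFSETS):
            nxt = syllables[i + 1] if i + 1 < len(syllables) else ""
            if not nxt.startswith(syl[-1]):
                return False
    return True

print(offsets_ok(["kek", "kyo", "ku"]))   # geminate across the syllable boundary
print(offsets_ok(["it", "pai"]))          # offset /t/ before /p/
```

Offset /n/ is not subject to the check, in keeping with its free distribution described in §3.1.3.1.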

3.1.3.3 NP and GP analyses of geminate consonants compared

Most occurrences of geminate obstruents in Japanese are in vocabulary of Chinese origin (cf. Kuroda 1964 for a thorough study of this and other sources). Japanese has many bi-morphemic lexical items borrowed from Chinese where the first element in the compound is a bi-syllabic morpheme ending in /-tu/ or /-ti/, e.g. /zetu/ 'tongue', /niti/ 'day', and the second element in the compound begins with a voiced sound, e.g.

/zetu + on/   'lingual sound'
/niti + dzyoo/   'every day'

However, if the second morpheme of the compound begins with a voiceless sound (i.e. /k, t, h(<p), s/), the compound will have geminate /kk, tt, pp, ss/ at the morpheme boundary, e.g.

(i) /zetu/ 'tongue'              (ii) /niti/ 'day'
a) /zek + ken/ 'dorsum'          a) /nik + ken/ 'daily'
b) /zet + too/ 'apex'            b) /nit + tee/ 'daily routine'
c) /zep + poo/ 'tongue'          c) /nip + poo/ 'daily report'
d) /zes + sen/ 'word war'        d) /nis + si/ 'diary'

McCawley (1968) formulates a classic generative analysis of the data in (i) and (ii). He represents these Sino-Japanese compounds in the lexicon in their ungeminated forms and provides for gemination by 1) deleting the high vowel between /t/ and /k, t, h, s/ and 2) regressively assimilating the point and manner of articulation of the then morpheme-final /t/ to the following voiceless obstruent.
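The two steps of the generative derivation as summarized here can be sketched as follows. This is my own illustration of the summary, not McCawley's formalism: the function name is hypothetical, the second morphemes are given in plain romanization, and an /h/-initial second element such as 'hoo' is an assumed input that is restored to /p/ in line with the notation h(<p) used above.

```python
VOICELESS = ("k", "t", "h", "s")

def mccawley_compound(first, second):
    """Derive a Sino-Japanese compound from ungeminated underlying forms."""
    if first[-2:] in ("tu", "ti") and second.startswith(VOICELESS):
        stem = first[:-1]                   # 1) high vowel deletion: zetu -> zet
        if second[0] == "h":
            second = "p" + second[1:]       # /h/ continues earlier /p/
        # 2) regressive assimilation of morpheme-final /t/ to the next obstruent
        return stem[:-1] + second[0] + second
    return first + second                   # voiced-initial second element: no change

for pair in [("zetu", "ken"), ("zetu", "hoo"), ("niti", "si"), ("zetu", "on")]:
    print(pair, "->", mccawley_compound(*pair))
```

The sketch makes explicit that both steps are conditioned by the identity of the morphemes involved rather than by rate or style of speech.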

The NP analysis I am proposing has lexical representations less abstract than those proposed by McCawley. Compounds with geminate consonants will appear as such in the lexicon. On this analysis McCawley's obligatory high vowel deletion and regressive obstruent assimilation rules play no role in the underlying-to-surface derivation of Sino-Japanese compounds. High vowel deletion is, in fact, a live process in Japanese, but its application is optional and not limited to Sino-Japanese compounds. It manifests itself in the processing of hypoarticulate speech under purely phonological conditions, as I will show in §4.1.1. As for regressive obstruent assimilation, it governs the lexicon assuring */tp, ts, tk/ and applies optionally, though less generally than high vowel deletion, in hypoarticulate speech (cf. §4.3.2). McCawley's analysis is rejected because processes which apply optionally and in a very general fashion under conditions implying phonetic motivation (i.e. in hypoarticulate speech) would have to be specified under non-phonetic conditions as applying obligatorily to a portion of the lexicon.

An alternative analysis suggested to me by Robert Cheng would have just two allomorphs – a vowel-final one (/zetu/, /niti/) and a consonant-final one (/zet/, /nit/). On this analysis the underlying representation of the above compounds would be as follows:

/zet+ken/ /nit+ken/ /zet+too/ /nit+tee/ /zet+poo/ /nit+poo/ /zet+sen/ /nit+si/

This analysis requires that REGR OBS ASSIM, which governed the lexicon under the original analysis, be constrained to apply only derivationally to yield superficial geminates. This cost would be worthwhile if all geminate obstruents in the language could be derived from /-t/ plus /p, t, k, s/, for we would then gain the advantage of allowing only one syllable-final obstruent (/t/) in the lexicon. This would be a simpler phonology than one allowing four, since in the former case only the process(es) responsible for barring syllable-final /t/ must be constrained so as to allow it, whereas in the latter case the process(es) barring syllable-final /p, k, s/ must also be constrained so as to allow them. However, there are some geminate obstruents which according to the principles of NP cannot be derived from an underlying /t/. Consider the following alternations:

[gakubu] 'faculty'
[gakusei] 'student'
[gakuhu] 'seat of learning'
[gakuto] 'student'
[gakkoo] 'school'

Unlike the earlier set of /t/ morphemes, those with /k/ do not geminate except when followed by homorganic /k/ as in [gakkoo]. There are thus only two allomorphs, [gaku] and [gak], and the 'morphophonemic' representation of the latter as /gat/ is not justified since there is no alternation and the phonetic representation can be derived from a 'phonemic' one. The 'economy' of the alternative analysis is thus lost along with the justification for choosing it over the less abstract one originally proposed.

3.1.4 Obligatory Consonant Lenitions

The twelve underlying consonants are rendered pronounceable by the obligatory application of a number of lenition processes.

3.1.4.1 Offset /n/ lenition:

Offset /n/ is particularly susceptible to lenition, occurring as it does in a 'weak' (syllable final) position. If it is followed by a continuant or # it will be realized as a nasalized continuant, i.e. an offglide homorganic with the preceding vowel, e.g. /tan.i/ [tãã̯.i] 'a unit', /hon.ya/ [hõõ̯ya] 'bookstore', /ben.si/ [bẽẽ̯.s̆i] 'narrator', /bin.sen/ [bĩĩ̯.sẽẽ̯] 'stationery'. The following process provides for this case:

(P10)   PROG OFFSET /n/ ASSIM:   /n/   →   [- syl, + nas, α F]   /   [V, α F] ___ { [+ cont], # }
 
(F = all features not specified)

If offset /n/ is followed by a non-continuant it will be realized as a nasal non-continuant homorganic with the following non-continuant, e.g. /sen.bee̯/ [sẽm.bee̯] 'cracker', /hon.too̯/ [hõn.too̯] 'true', /dzin.koo̯/ [dz̆iŋ.koo̯] 'population'. The following process provides for this case:

(P11)   REGR OFFSET /n/ ASSIM:   /n/ → [α position]   /   ___ [- cont, α position]

The vowel preceding offset /n/ is regressively nasalized by the following process:

(P12)   TAUTO-SYL REGR VOWEL NAS:   /V/ → [ + nas ]   /   ___ [ + nas ]

(The contradictory context-free process

(P13)   VOWEL DENAS:   /V/ → [ - nas ]

governs the lexicon.)
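The choice between (P10) and (P11) depends only on the segment that follows offset /n/. The sketch below is an illustration of mine: it treats vowels, glides, /s/, and /h/ as continuants, follows Table 3.2 in treating /r/ and /dz/ as non-continuant coronals, and writes the nasalized offglide as the preceding vowel plus a combining tilde. The function name and the set names are hypothetical.

```python
LABIAL = set("pbm")
VELAR = set("kg")
CONTINUANT = set("aiueoywsh")

def realize_offset_n(prev_vowel, following):
    """Return the surface offset for /n/ given the preceding vowel and the
    following onset (or '#' for a word boundary)."""
    first = following[0]
    if first == "#" or first in CONTINUANT:
        return prev_vowel + "\u0303"      # (P10): nasalized offglide
    if first in LABIAL:
        return "m"                        # (P11): homorganic nasal stop
    if first in VELAR:
        return "ŋ"
    return "n"                            # remaining non-continuants are coronal

print(realize_offset_n("e", "b"))   # as in /sen.bee/
print(realize_offset_n("i", "k"))   # as in /dzin.koo/
print(realize_offset_n("a", "#"))   # as in /gan#/
```

The nasalization of the preceding vowel itself (P12) is not modelled here.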

3.1.4.2 Onset consonant lenitions

3.1.4.2.1 /h/ is pronounced [ɸ] (a voiceless bilabial fricative) before /u/.

(P14)   H LAB:   /h/ → [ɸ]   /   ___ u.   e.g. /huu̯too̯/ [ɸuu̯too̯] 'envelope'

3.1.4.2.2 /h/ is pronounced [ç] (a voiceless palatal fricative) before /i, y/ by the process PAL to be described in 3.1.4.2.5 below.

/h/ → [ç] / __ i, y   e.g. /hidoi̯/ [çidoi̯] 'severe'

3.1.4.2.3 /dz/ is pronounced [z] intervocalically:

(P15)   AFFRIC WEAKENING:   /dz/ → [z] / V_V

3.1.4.2.4 /g/ is pronounced [ŋ] intervocalically:

(P16)   STOP NAS: /g/ → [ŋ] /   V_V
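The three context-sensitive lenitions (P14) to (P16) can be applied together over a segmented form. The sketch below is mine: it takes a list of segments in which /dz/ counts as one segment, ignores palatalization of /h/ (which is handled by PAL), and uses hypothetical names. The input ['a', 'g', 'a'] in the usage line is an abstract sequence, not a cited word.

```python
VOWELS = set("aiueo")

def lenite(segments):
    """Apply H LAB (P14), AFFRIC WEAKENING (P15), and STOP NAS (P16)."""
    out = []
    for i, seg in enumerate(segments):
        prev = segments[i - 1] if i > 0 else None
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        intervocalic = prev in VOWELS and nxt in VOWELS
        if seg == "h" and nxt == "u":
            seg = "ɸ"                       # (P14)
        elif seg == "dz" and intervocalic:
            seg = "z"                       # (P15)
        elif seg == "g" and intervocalic:
            seg = "ŋ"                       # (P16)
        out.append(seg)
    return out

print(lenite(["h", "u", "u", "t", "o", "o"]))
print(lenite(["s", "i", "dz", "u", "k", "a"]))
print(lenite(["a", "g", "a"]))
```

Word-initial /dz/ and /g/ are left untouched because the intervocalic condition is not met.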

3.1.4.2.5 Palatalization

A number of substitution types are frequently subsumed under the term 'palatalization'. Bhat (1974) gives numerous examples from a wide variety of languages of the fronting, raising, and/or spirantization of consonants under the influence of (usually) palatal vowels or glides. In Japanese 'palatalization' velar consonants are fronted to prevelar, labial consonants are raised, and coronal consonants are raised, affricated, and become [-anterior] all before /y, i/. /y/ after [-anterior] coronals is then deleted. As Bhat shows, these substitutions apply quite independently and in various combinations in the world's languages. In the following analysis based on principles of NP each substitution type will be attributed to the application of a distinct process. At appropriate points I will draw contrasts with two well-known alternative treatments – Bloch (1950) and McCawley (1968).

Palatalization is distinctive only before non-palatal vowels. Letting /s/ stand for all consonants the phonetic distribution is as follows:

Plain:        [si  se  sa  so  su]
Palatalized:  [s̆i  se  s̆a  s̆o  s̆u]

There is paradigmatic evidence in the verb morphology for underlying plain consonants before /i/ e.g.

[hanas̆i]       /hanas + i/       infinitive of 'to speak'
[hanase]       /hanas + e/       imperative        "
[hanasanai̯]    /hanas + anai̯/    negative          "
[hanasoo̯]      /hanas + oo̯/      tentative         "
[hanasu]       /hanas + u/       non-past          "

Palatalization is then derived via a context-sensitive lenition process:

PAL   [ + cons ] → [ + pal ]   /   ___ [+ syll, + pal, + high]

(Note: For consonants [+pal] implies [+high, -back] i.e. either the fronting of [+back] or the raising of [-back] consonants.)
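The derivation of the verb paradigm from a single underlying stem can be sketched as follows. This is a minimal illustration of mine that models PAL only for stem-final /s/ before the suffix vowel /i/; the constant and function names are hypothetical, the palatalized sibilant is written as s plus a combining breve, and the non-syllabic diacritic is omitted from the suffixes.

```python
S_PAL = "s\u0306"    # palatalized sibilant, written s + combining breve

def surface(stem, suffix):
    """Attach a suffix to a consonant-final stem, applying PAL before /i/."""
    if stem.endswith("s") and suffix.startswith("i"):
        stem = stem[:-1] + S_PAL
    return stem + suffix

for suffix in ["i", "e", "anai", "oo", "u"]:
    print(surface("hanas", suffix))
```

Only the infinitive shows the palatalized alternant, which is the paradigmatic evidence for an underlying plain consonant before /i/.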

This process is attested in loan phonology where foreign words with plain consonants before /i/ are borrowed with [s̆], e.g. Eng 'sea' = Jpn [s̆I]. Palatalized consonants followed by back vowels are also analyzed as underlyingly plain with palatalization due to the presence of a following palatal glide, e.g. [s̆a, s̆o, s̆u] = /sya, syo, syu/. On this analysis, which is also reflected in the native orthography, coronal consonants pattern with non-coronals, e.g. [p'ya, s̆a, k'ya] = /pya, sya, kya/. To provide for palatalization by /y/ in addition to /i/, PAL must be slightly generalized as follows:

(P17)   PAL   [ + cons ] → [ + pal ]   /   ___ [+ pal, + high] (4)