  • Original Article
  • Open access

The effect of speaker gender and talker proficiency on the realization of Taiwan Min /dz/ among young speakers

Abstract

This study examined how speaker gender and talker proficiency affected the realization of Taiwan Min /dz/ among young speakers of the 漳 Chiang and Mix dialects. Ten males and seven females with adequate proficiency were recruited to perform a paragraph reading task. Five major variant categories were observed: dental sibilant, dental nonsibilant, velar obstruent, lateral, and retroflex. Gender played a role in some variant choices. Males used dental sibilants and retroflexes more often than females, while females used liquids more often than males. Dental nonsibilants and velar obstruents showed more complex distributions involving both gender and speaker dialect. An effect of talker proficiency was also found. High-level speakers generally used velar obstruents and retroflexes more often, and dental nonsibilants less often, than their mid-level counterparts. Dental sibilants and liquids showed distributions that were less straightforward and involved intricate interactions among gender, proficiency, and speaker dialect. Despite the range of variant choices, talkers were rather conservative, and most limited their choices to only a couple of variants for /dz/ realization. Males generally showed high intra- and inter-syllabic consistency regardless of their proficiency levels, while females showed a proficiency split. High-level females had high intra-syllabic consistency but low inter-syllabic consistency, while mid-level females had low intra- and inter-syllabic consistency. This study thus demonstrated that talkers’ realization of the variable Min /dz/ involves a complex interaction among speaker gender, dialect, and proficiency.

1 Background

Taiwan Min, a language spoken in Taiwan, is genealogically derived from Southern Min, a Chinese language spoken in southeastern provinces of China (Norman 1988). It is the second largest language spoken in Taiwan, and more than 70% of the population claim to have at least some passive knowledge of the language (Huang 黃宣範 1993). According to Ang 洪惟仁 (2013), there are three major dialects of Taiwan Min, 漳 Chiang, 泉 Chôan, and Mix, with uneven distributions.Footnote 1 漳 Chiang speakers are dominant in the west central inland areas, while 泉 Chôan speakers mainly reside in the northern part and the west central coastal strip. Speakers in the southwestern part of Taiwan predominantly speak a variety of Min that is a more balanced mixture of 漳 Chiang and 泉 Chôan, and is thus appropriately called the Mix dialect (Ang 洪惟仁 2005; Li 李仲民 2009).Footnote 2

According to Ang 洪惟仁 (2003, 2012), there are 17 onset consonants in Taiwan Min, as shown in Table 1. Among them, the voiced sibilant /dz/ has attracted much attention. Phonotactically, it only occurs before high vowels /i u/ or glides /j w/. Phonetically, it can be realized as an affricate [dz] or a fricative [z]. Ogawa 小川尚義 (1907) and Ang 洪惟仁 (1997, 2003) argued that the two variants occur in a dialect-dependent fashion, with [dz] being more commonly found in the 泉 Chôan dialect and [z] being more frequently observed in the 漳 Chiang and Mix dialects. However, all three dialects show a palatalized realization of [ʑ]/[dʑ] when the sound precedes /i/ or /j/. In order to avoid clutter, this study followed the tradition in the literature and used /dz/ to refer to this sound when its cross-dialectal phonemic status is intended [e.g., Ang 洪惟仁 (2012), Wang 王薈雯 (2014), and Yao 姚榮松 (1988)]. However, readers should keep in mind that variable pronunciations ranging from [z] to [dz] are represented by this symbol.

Table 1 The onset consonant inventory of Taiwan Min (Ang 洪惟仁 1997, 2003, 2012; Ogawa 小川尚義 1907). Please see text for explanation

The dialectal split of [z] and [dz] is not the only variability found in /dz/. A new variant [l] has evolved in the 泉 Chôan dialect and has become the dominant realization of /dz/ (Ang 洪惟仁 2003, 2012; Ang and Chang 洪惟仁, 張素蓉 2008; Hung 洪慧鈺 2007; Thoo 涂文欽 2009; Wang 王薈雯 2014). For young speakers, [l] is almost always used in spontaneous speech (Wang 王薈雯 2014) and solely used in certain syllable types in read speech (Ang 洪惟仁 2003, 2012; Ang and Chang 洪惟仁, 張素蓉 2008; Hung 洪慧鈺 2007). For older speakers, the realization rate of [l] is not as high, but it is still the predominant variant (> 75%) (Ang 洪惟仁 2003, 2012; Hung 洪慧鈺 2007). In other words, the [z]/[dz] realization is rapidly losing ground in an age-dependent manner, and [l] has become the new favorite across all age groups.

The situation is more complex in the 漳 Chiang dialect. Aside from [z]/[dz] and [l], there is also an additional [ɡ] variant, which occurs only before /i j/ (Ang 洪惟仁 2003, 2012), due to language contact with Hakka (Ang 洪惟仁 2012; Chuang et al. 莊雅雯等 2009), the third largest language spoken in Taiwan (Huang 黃宣範 1993). In other words, in addition to the two-way competition between [z]/[dz] and [l] before /u w/, which is similarly observed in the 泉 Chôan dialect, a three-way competition among [z]/[dz], [l], and [ɡ] before /i j/ is found for the 漳 Chiang dialect. This three-way competition is age-dependent and is especially intense between the [z]/[dz] and [l], with [ɡ] being a rather minor form (Ang 洪惟仁 2003, 2012). Like their 泉 Chôan counterparts, older 漳 Chiang speakers tend to adopt [z]/[dz], while younger speakers tend to adopt [l]. However, [z]/[dz] is still a robust variant for even the youngest speakers, and for middle-aged speakers, [z]/[dz] can still be a strong match for [l] in certain syllable types. On the other hand, [ɡ] is mainly used by young adults and middle-aged speakers and is not as preferred by other age groups.

Findings regarding the Mix dialect are more varied and less clear-cut. Most studies agreed that older speakers demonstrate a predominant usage of [z]/[dz], in addition to marginal realizations of [ɡ] and [l] (Ang 洪惟仁 1997; Chen 陳淑娟 1995; Lin 林珠彩 1995). However, Chen 陳雅玲 (2010, 2012) claimed that [l] is always one of the dominant realizations of /dz/ for older speakers, and whether they adopt [z]/[dz] or [ɡ] as an additional main competing variant is location-dependent, leaving the remaining third possibility in a relatively marginal status. For younger speakers, [z]/[dz] is much less preferred, and its distribution is age- (Chen 陳淑娟 1995; Chen 陳雅玲 2010, 2012; Lin 林珠彩 1995) and location-dependent (Chen 陳雅玲 2010, 2012), although some did not find a consistent effect (Khng 康韶真 2014). Findings regarding [ɡ] and [l] are less clear. Most agreed that [ɡ] is a predominant realization (Chen 陳淑娟 1995; Lin 林珠彩 1995), but some suggested that its use is also dwindling with age (Chen 陳雅玲 2010, 2012; Khng 康韶真 2014). Most studies found that the use of [l] is rising among younger speakers (Chen 陳淑娟 1995; Chen 陳雅玲 2010, 2012; Khng 康韶真 2014; Lin 林珠彩 1995), although some claimed that they did not find any use of [l] at all (Ang 洪惟仁 1997). Scholars also disagreed on the distribution of [ɡ] and [l]. Some claimed that they are in free variation before /i j/ and are in direct competition with each other (Chen 陳淑娟 1995; Chen 陳雅玲 2010, 2012; Khng 康韶真 2014), while others argued that they are in complementary distribution (Lin 林珠彩 1995), with [ɡ] being the sole realization before /i j/ and [l] being the sole realization before /u w/.

The above inconsistencies could at least partially be reconciled by time. Most earlier studies tended to observe a robust realization of [ɡ] [e.g., Ang 洪惟仁 (1997), Chen 陳淑娟 (1995), and Lin 林珠彩 (1995)], while more recent ones tended to find [ɡ] being gradually replaced by a rising trend of [l] [e.g., Chen 陳雅玲 (2010, 2012) and Khng 康韶真 (2014)]. It is thus possible that [ɡ] was once a fairly robust realization of /dz/ for the Mix dialect, but has been gradually losing ground to [l], following in the footsteps of the 漳 Chiang dialect.

A more recent study on all three dialects of Min implies that the realization of /dz/ might have further evolved. Using spectrographic data, Chuang and Fon (2017) found that young fluent speakers of Min still maintain their /dz/ realizations in a dialect-specific manner. [z]/[dz] remains robust for 漳 Chiang and Mix, and [ɡ] is still strong for Mix, even though [l] is the most predominant realization for all three dialects. 泉 Chôan is the only dialect that completely forgoes [z]/[dz] and does not use [ɡ] at all. In addition, two new realization variants have been identified. A dental nonsibilant realization ranging from [d] to [ð] was found to be commonly adopted by all three dialects. Since Min does not have /d/ in its inventory, the addition of this realization was conjectured to fill the “void” of the system so that the plosive set becomes more “balanced” (cf. Table 1). A second realization that receives little mention in previous studies is a retroflex variant similar to Mandarin realizations of /ʐ/, which is found in all three dialects, but is more robust in 漳 Chiang and 泉 Chôan. This is assumed to be a negative transfer from Mandarin due to a cross-linguistic analogy between the phonological systems of Mandarin and Min. Since /dz/ and /ʐ/ are the only voiced sibilants in Min and Mandarin, respectively, bilingual speakers of the two languages might have made an abstract connection between the two and thus realize Min /dz/ in a fashion similar to Mandarin /ʐ/.Footnote 3 As a consequence, Mandarin retroflex realizations ranging from [ʐ] to [ɻ] are also found to be possible realizations for Min /dz/. Generally speaking, despite the incongruences across the previous studies, it seems safe to say that with regard to /dz/ realization, 泉 Chôan is the most innovative and adopts new variants most willingly, while Mix is the most conservative and tends to forcefully preserve the old forms. 漳 Chiang is somewhere in-between.

Some of the variants mentioned above are genealogically related (Fig. 1). Ang 洪惟仁 (2003, 2012) argued that [dz] is the oldest realization of /dz/ but has undergone some dialect-dependent simplifications due to articulatory difficulty. The 泉 Chôan dialect tends to simplify it into a stop [d], which is later converted into a lateral [l], following a general diachronic lenition trend in the language (Ang 洪惟仁 2012; Yang 楊秀芳 1982). The 漳 Chiang dialect has experienced a different route by simplifying [dz] into [z], which is later also transformed into [l]. The converging development of the two dialects has made [l] currently the most preferred form. The [ɡ] and [ʐ] variants are not related to [l] in their development and arose through language contact (Ang 洪惟仁 2003, 2012; Chuang and Fon 2017). The newly observed [d] variant is likely an effort to reverse the diachronic /d/→[l] lenition so as to maintain the phonemic balance of the system (Chuang and Fon 2017). Of the variants attested in the literature, /dz/→[z], /dz/→[l], and /dz/→[d] could be regarded as internally motivated sound changes, while /dz/→[ɡ] and /dz/→[ʐ] are best characterized as externally motivated sound changes [cf. Hickey (2012)].

Fig. 1

An illustration showing the reconstructed sound change process of /dz/. The solid arrows indicate those proposed by Ang 洪惟仁 (2003, 2012), and the dashed arrows indicate those proposed by Chuang and Fon (2017). Sounds surrounded by a single dotted square are variants attested in the literature. The double dotted square signals the currently most preferred sound variant. Block arrows indicate the sources for the externally motivated sound changes [cf. Hickey (2012)]. Please see text for explanation

Among the four variants, [dz], [z], [ɡ], and [l], Ang 洪惟仁 (2003, 2012) predicted that the former three are gradually phasing out and [l] will eventually be the sole realization for Min /dz/ in a dialect- and syllable-dependent manner, with 泉 Chôan moving at a faster pace than 漳 Chiang, and syllables with a rounded vowel/glide or a final nasal being more compliant to the sound change than those without. However, Chuang and Fon (2017) showed that the newly developed dental nonsibilant is a potential rising competitor for [l]. Even though [l] is currently still the most robust realization, almost all speakers from all three dialects adopt the dental nonsibilant as part of their inventory in realizing /dz/. Therefore, it is likely that the journey of /dz/ might not end with /dz/ being merged with /l/, as was predicted by Ang 洪惟仁 (2003, 2012), but /dz/ might further meander into a dental nonsibilant, given enough time.

Ang 洪惟仁 (2012) attributed the motivation for the variation of /dz/ to three factors. First of all, there is a universal tendency to avoid voiced sibilants in languages around the world. The UCLA Phonological Segment Inventory Database (Maddieson 1984) showed that as many as 92% of the 451 languages investigated incorporate voiceless sibilants in their inventory, while only 51% of the languages include at least one voiced sibilant.Footnote 4 Similarly, 93% of the 629 languages in the P-base database (Mielke 2012) have voiceless sibilants in their inventory, while only 55% include at least one voiced sibilant.Footnote 5 This general avoidance might have an articulatory basis (Ohala 1983). Maintaining voicing and friction noise at the same time requires a delicate balance of air pressure in the vocal system so that the intraoral pressure is reliably lower than the subglottal pressure, but substantially higher than the atmospheric pressure. Perceptually, voicing also reduces the RMS frication noise of sibilants, so that voiced sibilants become less distinguishable from their nonsibilant counterparts, resulting in a lower identification rate for voiced sibilants (Balise and Diehl 1994; Miller and Nicely 1955; Singh and Black 1966; Wang and Bilger 1973). Therefore, cross-linguistically, voiced sibilants tend to be shorter in duration and are often restricted to marginal use in the language (Żygis 2008). They also have more varied realizations and are more likely to become voiceless sibilants [English: Smith (1997); Dutch: Gussenhoven and Bremmer Jr. (1983); Portuguese: Jesus and Shadle (2003)] or voiced approximants [Gulf Arabic: Holes (1990); Haitian Creole: Tinelli (1981)]. It is thus conjectured that the predominant realization of [l] for Taiwan Min /dz/ is likely following the same universal tendency.

Secondly, Ang 洪惟仁 (2012) claimed that /dz/ has a low functional load in Taiwan Min, and is rather restricted phonotactically, as it can only occur before /i j u w/, but not before non-high vowels. Even within the allowed combinations, there exist many gaps, and there are thus far fewer lexical entries using the sound than its voiceless counterparts. According to the 臺灣閩南語常用詞辭典 Taiwan Southern Min Common Word Dictionary (National Languages Committee 國語推行委員會 2011), there are 2927 entries for words containing /si/-, /sj/-, /su/-, or /sw/-initial syllables, and 2111 entries for words containing /tsi/-, /tsj/-, /tsu/-, or /tsw/-initial syllables, but only 711 entries for words containing /dzi/-, /dzj/-, /dzu/-, or /dzw/-initial syllables, which is approximately a 4:3:1 ratio. In addition, Ang 洪惟仁 (2012) asserted that all /dz/-initial syllables belong to the sound category of 讀冊音 thák-chheh-im ‘pronunciation of study’, which is used mainly in literary contexts, and have a lower frequency of use in oral communication.Footnote 6 Since the major communication channel for Min is oral, speakers are likely to have far fewer chances of using the sound than the raw counts in the dictionary suggest.Footnote 7 As a result, relatively little confusion would be incurred if /dz/ were to be merged with other categories.
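The approximate 4:3:1 ratio quoted above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original study; only the three entry counts come from the text, and the variable and function names are illustrative assumptions.

```python
# Entry counts quoted in the text for the Taiwan Southern Min Common Word Dictionary.
# Keys are shorthand labels for the /s/-, /ts/-, and /dz/-initial syllable groups.
entry_counts = {"s": 2927, "ts": 2111, "dz": 711}


def ratios_relative_to(counts, baseline):
    """Divide every count by the count of the baseline onset."""
    base = counts[baseline]
    return {onset: n / base for onset, n in counts.items()}


ratios = ratios_relative_to(entry_counts, "dz")
for onset, value in ratios.items():
    # Prints each onset's count as a multiple of the /dz/ count.
    print(f"/{onset}/: {value:.2f}")
```

Rounding the resulting multiples to whole numbers gives the 4:3:1 figure cited in the paragraph.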

Finally, Ang 洪惟仁 (2003, 2012) argued that the phonological system can well withstand the potential impact imposed by the /dz/→/l/ merger without major reorganization. As indicated in Table 1, stops are the only obstruents that have a full series of places of articulation in Min, while oral affricates and fricatives occur exclusively at the coronal position. This implies that changes occurring in oral affricates and fricatives would not jeopardize the phonological stability of the system as changes in stops would. Even if /dz/ is completely merged with /l/, there are no other oral affricate (泉 Chôan) or fricative (漳 Chiang and Mix) series that might be at stake. On the other hand, if any of the stop series, say, /p/, is to be merged with another category, say, /pʰ/, then the system runs the risk of facing a cascading impact on stops in other places of articulation as well, which is likely not welcomed by the system, as it implies major phonological reorganization. However, since merging /dz/ with /l/ incurs only minimal cost on Min phonology, Ang 洪惟仁 (2003, 2012) asserted that it is more likely to be tolerated.

2 Specific aims

There are three specific aims in this study. First, this study would like to examine whether gender is an underlying factor accounting for the variability of this sound change. Previous studies on /dz/ variation mainly focused on geographical/dialectal differences [e.g., Ang 洪惟仁 (2003, 2012), Ang and Chang 洪惟仁, 張素蓉 (2008), Chen 陳淑娟 (1995), Chen 陳雅玲 (2010, 2012), Chuang and Fon (2017), Hung 洪慧鈺 (2007), Khng 康韶真 (2014), Lin 林珠彩 (1995), Thoo 涂文欽 (2009), and Wang 王薈雯 (2014)], and little has been discussed on the potential effect of gender. However, as gender has always been a prominent factor for sound changes and variations [e.g., English: Holmes (1999); Japanese: Kong et al. (2012); Mandarin: Baran (2014), Fon et al. (2011), and Zhang (2005); Tibetan: Reynolds (2012); Yami: Rau et al. (2009)], it is surmised that there might also be a gender difference in the realization of Min /dz/. Previous research showed that the effect of gender on sound changes is not always straightforward and has more to do with the nature of the change and how it interacts with the two genders (Labov 1990; Maclagan et al. 1999; Trudgill 2000). In general, females are more conservative than males when it comes to variations with stable social stratifications and favor the standard form over the nonstandard form (Labov’s Principle I). However, when facing changes that are still volatile, their attitude depends largely on the nature of the changes. For conscious changes (termed “changes from above”), females also choose the prestigious form (Labov’s Principle Ia), while for unconscious changes (termed “changes from below”), they are often innovators and tend to adopt new forms more fervently than males (Labov’s Principle II). This is generally called the “gender paradox” of sound change.

In the case of Min /dz/, Ang 洪惟仁 (2003) surveyed 769 university freshmen with regard to the realization variant with which they most identify (Table 2). Results showed that young 泉 Chôan speakers identified with [l] more than [z]/[dz] for all syllable types. However, such a strong trend was not observed among 漳 Chiang and Mix speakers, whose preference for [l] was rather syllable-dependent. In some phonotactic environments, [l] and [z]/[dz] even showed equal preferences. As for the [ɡ] variant, the identification rates were unanimously low across all three dialects, with the Mix dialect having the highest identification rate and the 泉 Chôan dialect the lowest. In any case, no more than 15% was found for this variant.

Table 2 A summary of Ang’s 洪惟仁 (2003) survey on the realization variants of /dz/ with which college students most identify. The last row is the actual realization rates of young speakers from Ang 洪惟仁 (2003, 2012). CV+RX-N: syllables with a rounded vowel/glide but without a final nasal; CV-RX-N: syllables with an unrounded vowel/glide but without a final nasal; CV-RX+N: syllables with both an unrounded vowel/glide and a final nasal. Prod: production results from Ang 洪惟仁 (2003, 2012)

Although it is difficult to outright equate identification with prestige, a comparison with the actual production results from Ang 洪惟仁 (2003) showed that some kind of value judgment has indeed intervened, as there was a large discrepancy between the identification rates and the actual realization rates. In particular, even though the identification rates for [l] in 漳 Chiang and 泉 Chôan speakers were only 50 and 64%, respectively, the realization rates were much higher, around 73% for 漳 Chiang and 97% for 泉 Chôan. In contrast, although the identification rates for [z]/[dz] among 漳 Chiang and 泉 Chôan speakers were 45 and 34%, respectively, the variant was relatively minor in the former (26%) and could only be considered as extremely marginal in the latter (4%) when production is considered. [ɡ] was the only variant that showed comparable identification and realization rates. In other words, it is probably rather safe to say that the identification rates reported in Ang 洪惟仁 (2003) to a large extent reflected the prestige judgment of the listeners, rather than just pure identification of neutral dialectal indicators (cf. Labov 1971).
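The size of the discrepancy described above can be made explicit by subtracting each identification rate from the corresponding realization rate. The sketch below is illustrative only: the percentages are the ones quoted in the paragraph (the realization rates for [l] are given there as approximate), while the dictionary layout, the romanized labels, and the function name are assumptions introduced here.

```python
# Percentages quoted in the text: identification rates from the survey and
# realization rates from the production data (the [l] realization rates are approximate).
rates = {
    ("Chiang", "l"): {"identification": 50, "realization": 73},
    ("Choan", "l"): {"identification": 64, "realization": 97},
    ("Chiang", "z/dz"): {"identification": 45, "realization": 26},
    ("Choan", "z/dz"): {"identification": 34, "realization": 4},
}


def gap(entry):
    """Realization minus identification, in percentage points."""
    return entry["realization"] - entry["identification"]


gaps = {key: gap(entry) for key, entry in rates.items()}
for (dialect, variant), points in gaps.items():
    # A positive gap means the variant is produced more often than it is identified with.
    print(f"{dialect} [{variant}]: {points:+d} percentage points")
```

Under these figures, [l] is produced more often than speakers identify with it in both dialects, while [z]/[dz] shows the opposite pattern, which is the asymmetry the argument relies on.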

Therefore, based on the results in Table 2, one could deduce that [l] only has a clear prestige in the 泉 Chôan dialect. This prestige is somewhat diminished in the Mix dialect and is virtually nonexistent in the 漳 Chiang dialect, in which both [z]/[dz] and [l] seem to be of approximately the same standing. The [ɡ] variant is considered the least prestigious regardless of dialects. If one combines Ang’s 洪惟仁 (2003) findings with Labov’s (1990) predictions on volatile sound changes regarding gender, then one would expect female 漳 Chiang speakers to take the lead and prefer [l] to [z]/[dz] more than their male counterparts, as females are partial to innovative forms (i.e., Principle II). One would also expect female 泉 Chôan and Mix speakers to similarly prefer [l] to [z]/[dz] more than their male peers since females favor variants of higher social status (i.e., Principle Ia). However, female speakers of all three dialects should unanimously avoid the [ɡ] variant more than male speakers in general due to its lower social status (i.e., Principle Ia).

As for the two new variants mentioned in Chuang and Fon (2017), the dental nonsibilant seems to be a novel form that does not have any attached negative connotation, and might thus be considered a Principle II case. As a consequence, it is predicted that females would be more likely to use this variant than male speakers. On the other hand, the retroflex variants are likely due to a negative transfer from Mandarin, and might thus be considered a Principle Ia case. Therefore, females might be more likely to avoid such a variant than their male counterparts. However, since Ang 洪惟仁 (2003) has not included these two variants in his judgment study, it is less clear whether these two variants are indeed perceived as conjectured above.

Secondly, this study would like to examine whether speaker proficiency would affect /dz/ realization. Previous studies mostly recruited speakers based on the place of origin and residency of the speakers themselves and/or those of the speakers’ parents [e.g., Ang 洪惟仁 (2003, 2012), Ang and Chang 洪惟仁, 張素蓉 (2008), Chen 陳淑娟 (1995), Chen 陳雅玲 (2010, 2012), Chuang and Fon (2017), Hung 洪慧鈺 (2007), Khng 康韶真 (2014), Lin 林珠彩 (1995), Thoo 涂文欽 (2009), and Wang 王薈雯 (2014)]. In other words, speakers’ Min level is assumed to be not only proficient but also homogeneous in these studies. However, ever since the implementation of the Mandarin-only policy in 1946 in Taiwan, non-national languages such as Min have been largely restricted to private domains (Huang 黃宣範 1993), and even in this realm, Mandarin is gradually encroaching. According to the 2010 Population and Housing Census conducted by the Department of Census in Taiwan 行政院主計總處 (2013), the ratio of using Mandarin as a home language is negatively correlated with speaker age, while that of Min is positively correlated. More than 90% of speakers below age 45 reported using Mandarin at home, while less than 70% of speakers under age 14 claimed to use Min.

Even for those who speak Min, their proficiency is also on the decline. According to a poll conducted by the United Daily News Poll Center 聯合報系民意調查中心 (2002), Min proficiency is positively correlated with speaker age.Footnote 8 For speakers below age 30, only 43% of them are able to speak Min fluently, compared to 74% for those over age 40. In addition, for Min parents having children under age 20, even though 70% of them claimed to be proficient in Min, only 27% reported that their children are also fluent in the language, and 8% of them admitted that their children cannot speak Min at all. In other words, one should expect Min proficiency among young adult speakers in their 20s to be highly variable, which might be one of the underlying factors contributing to the current Min sound changes. Young Min speakers who are more fluent in the language might be more conservative than their less fluent counterparts in maintaining the older realizations of /dz/ (Chuang and Fon 2017), resulting in more usages of the older variants [z]/[dz] and [ɡ], which appeared more than 20 years ago [cf. Chen 陳淑娟 (1995)]. Newer variants, such as the retroflex and the dental nonsibilant (Chuang and Fon 2017), might be more commonly found among speakers who are less proficient and thus more innovative. As for [l], since it is regarded as a result of phonological simplification [cf. Ang 洪惟仁 (2012)], one suspects that more proficient talkers would be less likely to use this form due to conservativeness.

Alternatively, intra-talker relative strength between Mandarin and Min might have an impact. Lai and Hsu (2013) studied a contact-induced sound change of [ɮ]→[l] among young speakers of Yami, an Austronesian language spoken on the Orchid Island in Taiwan, due to influences of Mandarin.Footnote 9 They found that the rate of change is largely dependent on speakers’ Mandarin competence and frequency of use, as Mandarin has [l] but not [ɮ] in its inventory. Therefore, if an analogous situation could be found in the realization of Min /dz/, then one would expect speakers with lower Min proficiency (and thus lower frequency of use) to be more inclined to realize Min /dz/ as [l] or a retroflex variant, both of which are likely to gain support from the Mandarin inventory [cf. Duanmu (2007)]. Those with higher Min proficiency would be more inclined to use [ɡ] and the dental nonsibilant, which are nonexistent in Mandarin. The prediction for [z]/[dz] is less straightforward. Although [z]/[dz] is not part of the Mandarin phonemic inventory, Min-accented Mandarin in Taiwan tends to realize the Mandarin retroflex /ʐ/ as [z] due to Min influence (Chan 1984; Chuang et al. 2012; Chuang et al. 2015). This implies that more proficient Min speakers might in fact be more likely to gain “reverse” support from Mandarin for their [z] realizations in Min, as their Mandarin should presumably be more Min-accented, and thus more [z] is used. Regardless of the account, one would expect more proficient speakers to be more inclined to produce [z]/[dz] and [ɡ], and less proficient speakers to be more inclined to produce [l] and the retroflex variant.

Finally, this study would like to investigate how consistent speakers of different backgrounds are in realizing /dz/. For sound changes in progress, there is usually a high level of inter-speaker variability, as individual speakers might be at different stages of a sound change (Janson 1983). However, it is less clear whether individual speakers are consistent when realizing sound changes in progress. Previous studies are fairly ambivalent regarding this issue. Gósy (2013) found that in an ongoing vowel change in Hungarian, inter- and intra-speaker variability are equally high. On the other hand, Yu et al. (2015) showed that English-speaking individuals are not likely to differ in their coarticulatory phonetic realizations across time, showing high within-speaker consistency. Although Chuang and Fon (2017) found that young Min speakers of all three dialects showed relatively high intra-speaker consistency with regard to the realization of /dz/, it is unclear whether this consistency is gender-specific, or whether it is a more general phenomenon, as they included only female speakers. Since females are generally more sensitive to sound changes than males (Labov 1990; Maclagan et al. 1999; Trudgill 2000), it is possible that intra-speaker consistency is restricted to females only, and more variations are likely to be found among male language users.

Moreover, one also suspects that speaker proficiency might interact with intra-speaker variability. Although relatively few studies have dealt with this issue, research from second language speakers seems to suggest a negative correlation between speaker proficiency and performance variability [e.g., Nip and Blumenfeld (2015)]. If this correlation also holds for first language speakers who are less fluent in the language due to dwindling frequency of use, then one would expect to see higher intra-speaker variability among speakers of lower proficiency. On the other hand, if the first language acquisition process can somehow make the native speakers become more resistant to the influence of speaker proficiency on within-speaker variability, then one would find the overall intra-speaker variability to be fairly comparable across speakers of different proficiency levels. This would be especially interesting for the case of Taiwan Min because speaker gender generally covaries with language proficiency and females tend to be less proficient in Min than males (Huang and Fon 2007).

This study investigated the abovementioned three specific aims by using long connected speech. Except for Wang 王薈雯 (2014), who adopted spontaneous monologs as the target of study, and Chuang and Fon (2017), who used paragraphs, most of the previous research on Min /dz/ realization was restricted to employing syllables, word lists, and/or isolated sentences only [e.g., Ang and Chang 洪惟仁, 張素蓉 (2008), Chen 陳淑娟 (1995), Chen 陳雅玲 (2010, 2012), Hung 洪慧鈺 (2007), Khng 康韶真 (2014), Lin 林珠彩 (1995), and Thoo 涂文欽 (2009)]. Although using more controlled stimuli already allows one a glimpse of how the sound might vary, it does not provide a panoramic view of how speakers’ articulation might fluctuate. Therefore, this study followed the design of Chuang and Fon (2017) and utilized extended paragraphs to allow talkers to reveal a fuller spectrum of their realization variability so that interaction among speaker gender, Min proficiency, and within-speaker consistency could be more clearly analyzed.

3 Method

3.1 Participants

Seventeen fluent Mandarin-Min bilinguals, aged between 20 and 26 at the time of recording (\( \overline{X} \) = 22.76, SD = 1.50), participated in the experiment. All speakers were university students and did not suffer from any speech and language disorders based on self-report. Ten of the participants were male and seven were female.Footnote 10 As all of the participants speak Min and Mandarin natively and have acquired both languages before age 8, they can unanimously be considered as early bilinguals [cf. Beardsmore (1986)]. Among them, ten could be deemed as simultaneous bilinguals, as they acquired both languages at birth, while seven of them were deemed as sequential bilinguals [cf. De Houwer (1995)]. Of the latter group, five of them acquired Min first and two of them acquired Mandarin first. Table 3 shows the distribution of the participants with regard to their acquisition order of Min and Mandarin.

Table 3 Distribution of acquisition order of Min and Mandarin

Participants were asked to self-rate their Min and Mandarin proficiency levels and Min frequency of use on a seven-point Likert scale, with 1 being the least proficient/frequent and 7 being the most proficient/frequent. Nine of the speakers were also asked to rate their Mandarin frequency of use. As shown in Fig. 2, speakers demonstrated higher proficiency in Mandarin than Min [t(16) = 3.85, p = .001] and used Mandarin much more frequently than Min in their everyday life [t(8) = 2.71, p < .05]. Min proficiency and Min frequency of use were highly correlated [r(15) = .73, p < .001]. Those who used Min more frequently had higher Min proficiency. There was a near-significant gender effect with regard to Min proficiency. Male speakers were in general more proficient in Min than female speakers [t(15) = 2.05, p = .06]. However, their Min and Mandarin frequencies of use and Mandarin proficiency levels were about the same [Min frequency: t(15) = − .16, ns.; Mandarin frequency: t(7) = .51, ns.; Mandarin proficiency: t(6.00) = 1.00, ns.].Footnote 11 There was no significant difference between simultaneous bilinguals and sequential bilinguals who acquired Min first with regard to Min proficiency and Min frequency of use [proficiency: t(13) = − .58, ns.; frequency: t(13) = −.98, ns.]. In general, despite inter-speaker variability in the self-ratings, it is safe to assume that all speakers had an above-average Min proficiency level, as none of the speakers self-rated him-/herself as lower than 4, and all except for one self-rated themselves as 5 or above.
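For readers who wish to relate the reported correlation to a test statistic, the sketch below applies the standard conversion from a Pearson correlation to a t value, t = r·sqrt(df) / sqrt(1 − r²) with df = n − 2. This is an illustration under the assumption that the reported r(15) = .73 is a Pearson coefficient over the 17 participants; the text does not describe the exact procedure, and the function name is hypothetical.

```python
import math


def t_from_r(r, df):
    """Convert a Pearson correlation r with df = n - 2 into the equivalent t statistic."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)


n_speakers = 17          # number of participants reported in the text
df = n_speakers - 2      # degrees of freedom for a correlation, matching r(15)
t_value = t_from_r(0.73, df)
print(f"t({df}) = {t_value:.2f}")
```

The degrees of freedom obtained from the sample size agree with the value in parentheses in the reported statistic.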

Fig. 2 Min and Mandarin self-rated Likert averages for (a) proficiency level and (b) frequency of use. Numbers inside each bar indicate average scores on the Likert scale. The error bars represent standard errors. Asterisks indicate significance in post hoc tests and the one in parentheses indicates near-significance

The Min dialect group to which each participant belonged was coded according to self-provided demographic and residential information. Speaker dialect was mainly determined by parental dialects, as Min is considered to be predominantly a language of private domains (Huang 黃宣範 1993), and thus, the home should be a major setting for Min usage. A survey on eight of the participants regarding their domain of Min usage using a seven-point Likert scale supported this assumption. Speakers were more likely to use the language when conversing with their immediate family members (parents, grandparents, and siblings) than when talking to other acquaintances (extended family members, professors/teachers, and friends/classmates) [\( \overline{X} \) = 5.47 vs. 3.17, t(7) = 3.32, p = .01].Footnote 12 Parental dialects were derived from mapping parental hometowns onto Ang’s 洪惟仁 (2013) Min dialectal map. In cases where the two parents came from different dialectal areas, speaker residency was used as an operational secondary criterion for dialect determination. Only two speakers, one male and one female, were judged according to the second criterion. The 22-year-old male, whose father grew up in Changhua City (a 漳 Chiang dialect area) and mother grew up in Tainan (a Mix dialect area), had lived in Changhua City for 15 years, and was thus coded as belonging to the 漳 Chiang dialect. The 21-year-old female, whose father grew up in Beidou Township of Changhua County (a 漳 Chiang dialect area) and mother grew up in Lugang Township of Changhua County (a 泉 Chôan dialect area), had lived in the Wuri District of Taichung City (a 漳 Chiang dialect area) for 19 years, and was thus coded as belonging to the 漳 Chiang dialect. A double check on the recordings of the two speakers showed that the dialectal codings were congruent with their speech samples. 
In total, there were ten 漳 Chiang dialect speakers and seven Mix dialect speakers (Table 4).Footnote 13 Independent t tests showed no dialectal effect regarding Min and Mandarin proficiency, or Min and Mandarin frequency of use [Min proficiency: t(15) = − .86, ns.; Mandarin proficiency: t(15) = − .83, ns.; Min frequency: t(15) = − .93, ns.; Mandarin frequency: t(2.00) = 1.00, ns.].

Table 4 Distribution of dialect groups

3.2 Materials

The material adopted in this study was the same as that used in Chuang and Fon (2017), which was a short paragraph embedding six /dz/-initial target syllables. Two of the syllables, 如 /zu/ ‘person’s name’ and 熱 jόah /zwaʔ/ ‘hot’, were followed by a rounded segment /u/ or /w/, and four of the syllables, 二/字 /zi/ ‘two; Chinese character’, 入 jíp /zip/ ‘to enter’, 柔 jiû /zju/ ‘gentle’, and 忍 jím /zim/ ‘to put up with’, were followed by an unrounded segment /i/ or /j/. Among the six, three of them, 二/字, 入 jíp, and 如, appeared more than once, with the former two each occurring twice and the latter one occurring five times. As a consequence, there were in total six tokens preceding /i/ or /j/ and six preceding /u/ or /w/. The paragraph was presented in Chinese characters conforming as closely as possible to the recommended characters promoted by the Ministry of Education in Taiwan (National Languages Committee 國語推行委員會 2011) with minor revisions, which were necessary to facilitate smooth reading.Footnote 14 Please refer to Chuang and Fon (2017) for more details of the paragraph.

3.3 Equipment

Recordings were done using a KORG MR-1000 digital recorder with a Sennheiser HMD 25-1 head-mounted microphone at a sampling rate of 44,100 Hz and were later downsampled to 22,050 Hz using Adobe Audition CS6.

3.4 Procedure

Participants were seated comfortably in a sound-treated room wearing a head-mounted microphone and were asked to complete a questionnaire on their language background before the recording started. A native Mandarin-Min bilingual experimenter, who was not one of the authors, checked with each of the speakers to ensure they could fluently and correctly read out the paragraph before the actual recording began. Participants were asked to read the paragraph in a natural manner. Six of the speakers (two males and four females) were requested to read the text a second time due to random omission and/or unclear pronunciation of the target syllables.Footnote 15 All analyses were done on only the last rendition of the reading. The recording session lasted less than 15 min, and speakers were compensated for their participation with a small monetary reward.

3.5 Transcription

Phonetic realizations of all the target syllables were independently labeled by the two authors, both of whom were Mandarin-Min bilinguals as well as trained phoneticians, using the Praat software (Boersma and Weenink 2017). The inter-labeler reliability was fairly high for broad phonetic categories, with a level of agreement of .80 and a Cohen’s kappa of .73 (p < .0001). As the second author used narrower phonetic categories, the following analyses adopted her labels in order to provide a more complete and detailed picture of /dz/ realization.
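The two reliability figures reported above (raw agreement and Cohen's kappa) can be illustrated with a minimal sketch. The label sequences below are hypothetical toy data, not the study's transcriptions, and the function name `cohen_kappa` is introduced here for illustration only; the authors do not state which software they used for this computation.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Return (observed agreement, Cohen's kappa) for two labelers."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Proportion of tokens on which the two labelers agree.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each labeler's marginal proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa

# Hypothetical broad-category labels for ten tokens (NOT the study's data).
labeler_1 = ["L", "L", "Z", "D", "R", "L", "Z", "G", "D", "L"]
labeler_2 = ["L", "L", "Z", "Z", "R", "L", "D", "G", "D", "L"]
agreement, kappa = cohen_kappa(labeler_1, labeler_2)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Kappa is lower than raw agreement because it discounts the agreement two labelers would reach by chance given how often each uses each category.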

4 Results

4.1 Realization of /dz/

Four speakers (two males and two females) inadvertently omitted the syllable 入 jíp in one of the 入去 jíp-khì ‘to enter’ tokens. One speaker produced 入去 jíp-khì twice due to a repeat. One speaker substituted one of the 阿如 A-jû ‘person’s name’ tokens with the third person pronoun 伊 i ‘3rd person singular pronoun’, and another replaced 溫柔 un-jiû ‘gentle’ with a synonymous word 幼秀 iù-siù ‘graceful’. As a consequence, there were 6 (tokens/condition) × 2 (vowel environments) × 17 (speakers) − 4 (omissions) + 1 (repeat) − 2 (substitutions) = 199 tokens of /dz/-initial syllables. Among them, 101 syllables contained /i/ or /j/ and 98 syllables contained /u/ or /w/.
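The token count in the preceding paragraph can be retraced step by step. The sketch below only restates the arithmetic given in the text; the variable names are chosen here for readability.

```python
# Design: 6 tokens per vowel environment, 2 environments, 17 speakers.
tokens_per_environment = 6
environments = 2
speakers = 17
planned = tokens_per_environment * environments * speakers

# Deviations from the planned design, as reported in the text.
omissions = 4      # one jíp token skipped by each of four speakers
repeats = 1        # one extra jíp-khì token from a repeated phrase
substitutions = 2  # two target words replaced by non-target words

observed = planned - omissions + repeats - substitutions
print(planned, observed)
```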

There were in total 23 different phonetic variants found for the realization of /dz/, which could be broadly categorized into seven types. As shown in Table 5, laterals (hereafter [L]) were the most popular among speakers, accounting for 42% of the data. Laterals were more likely to appear before a rounded than an unrounded vowel/glide, and all [L]s except for two tokens were realized as [l].

Table 5 Distribution of /dz/ realizations with regard to roundedness. The first three realizations ([Z], [G], [L]) are those that were commonly discussed in the previous literature, while the rest are not. [Z]: dental sibilants; [G]: velar obstruents; [L]: laterals; [D]: dental nonsibilants; [R]: retroflexes; [V]: labials; Ø: total deletion. Asterisks indicate significance in post hoc tests, and asterisks in parentheses indicate near-significance. Subscript numbers after each phonetic variant indicate its frequency of occurrence

Dental sibilants (hereafter [Z]) were the next largest category, accounting for 21% of the data. Unlike [L], they were more likely to occur before an unrounded vowel/glide. Five phonetic realizations were found for [Z], among which, [ʑ], [z], and [dʑ] were the most common, together accounting for 95% of all [Z] tokens.

The next largest category was dental nonsibilants (hereafter [D]), which accounted for 16% of the data. The [D] category was more likely to appear before an unrounded vowel/glide, and varied widely, with seven different phonetic variations observed. Among them, [d]-like sounds were the most common, accounting for 78% of the total nonsibilant tokens, and [ð]-like sounds were the second most common, accounting for 16% of the variant.

There was also a substantial category of retroflexes (hereafter [R]), accounting for 13% of the total data. [R] only occurred before a rounded vowel/glide, and the majority was realized as a retroflex approximant [ɻ], accounting for 76% of the category. There were also five instances of a retroflex fricative [ʐ], accounting for 20% of the variant. Together, the four categories of [L], [Z], [D], and [R] accounted for 90% of the total /dz/ realization.

Though less common, there were also several tokens of velar obstruents (hereafter [G]), accounting for 7% of the total data. [G] only occurred before an unrounded vowel/glide, and [ɡ]-like sounds were the most common, accounting for 71% of the category.

Finally, there were two minor categories, labials (hereafter [V]) and total deletion (hereafter Ø). Both categories only appeared before an unrounded vowel, and all three tokens of [V] were realized as a labiodental fricative [v]. Both categories were contributed by only one speaker and were likely idiosyncratic variations.

As sound categories showed differential preferences with regard to the roundedness of the following segment (Table 5), a likelihood ratio test was performed to confirm this observation. Results showed that the distribution was significant [χ2(6) = 75.30, p < .0001]. Post hoc analyses using Beasley and Schumacker’s (1995) method indicated that [L] and [R] were more likely to occur before rounded segments, while [Z], [D], [G], and [V] were more likely to occur before unrounded ones ([L]: p < .01; [Z]: p = .09; [D]: p = .001; [V]: p = .08; [R] and [G]: p < .0001).Footnote 16 In the following section, except for consistency analyses, only categories contributed by multiple talkers were included to avoid confounding of idiosyncrasies. In other words, [V] and Ø were excluded from both gender and proficiency analyses.
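For readers unfamiliar with the likelihood ratio test, the statistic can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the counts are hypothetical placeholders (Table 5 holds the real ones), and the function name is an assumption. With seven categories and two contexts, the degrees of freedom come out as (7 − 1) × (2 − 1) = 6, matching the reported χ2(6).

```python
import math

def likelihood_ratio_statistic(table):
    """Return the G statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed == 0:
                continue  # a zero cell contributes nothing to the sum
            expected = row_totals[i] * col_totals[j] / total
            g += observed * math.log(observed / expected)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return 2 * g, df

# Hypothetical counts (NOT the study's data). Rows: [L], [Z], [D], [R], [G],
# [V], and deletion; columns: rounded and unrounded contexts.
counts = [[50, 30], [10, 30], [8, 20], [25, 0], [0, 12], [0, 3], [0, 1]]
g, df = likelihood_ratio_statistic(counts)
print(f"G = {g:.2f}, df = {df}")
```

The statistic is then compared against a chi-squared distribution with `df` degrees of freedom to obtain the p value.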

4.2 Realization of /dz/ vs. gender

Generally speaking, there was not much difference in the variant categories employed by different speaker groups (Fig. 3). Except for Mix males, all the other speaker groups showed similar categories, [Z], [L], [D], and [R] in the rounded environment, and [Z], [G], [L], and [D] in the unrounded environment. Mix males had only three categories for both environments, lacking [D] in the rounded and [G] in the unrounded.

Fig. 3 Distribution of /dz/ realizations with regard to gender in (a) rounded and (b) unrounded environments for the 漳 Chiang dialect and (c) rounded and (d) unrounded environments for the Mix dialect. Numbers at the upper right-hand corners of the bars indicate the total number of tokens. Numbers inside each bar section represent the percentages. The thin lines inside the [Z] sections indicate the divide between affricate (left) and fricative (right) realizations. [Z] sections without a thin line indicate fricative realizations only. Asterisks indicate significance in post hoc tests, and the asterisk in parentheses indicates near-significance. Please see text for explanation

The difference between the two genders mainly lay in their preferences for the variant categories. As shown in Fig. 3, there was an overall gender preference for [Z], [R], and [L]. Males were more likely to adopt [Z] and [R] than females, while females were more likely to adopt [L] than males, especially in the rounded environment. Pearson’s chi-squared tests performed on the overall distribution and the rounded environment showed that gender was a significant factor [overall: χ2(4) = 16.74, p < .01; rounded: χ2(3) = 17.36, p < .001].Footnote 17 Post hoc analyses confirmed the above observations for [Z] (p < .05), [R] (p < .01), and [L] in the rounded environment (p < .001). The trend for [L] in the unrounded environment was not significant. It is also interesting to note that for the 漳 Chiang dialect, male speakers showed both affricate and fricative realizations of [Z], while female speakers only had fricative realizations. However, for the Mix dialect, both genders showed some affricate realizations of [Z], and female speakers showed even more such realizations than males.

The distributions for [G] and [D] were more complex, and seemed to be affected by a combination of factors. [G] only occurred in the unrounded environment (Table 5) and was generally preferred by 漳 Chiang males and Mix females. Separate chi-squared tests were executed for the two dialects in the unrounded environment in order to confirm this. Results showed that gender was a significant factor for both dialects [漳 Chiang: χ2(3) = 9.58, p < .05; Mix: χ2(3) = 11.17, p < .05]. Post hoc analyses confirmed that 漳 Chiang males indeed adopted significantly more [G] than 漳 Chiang females (p < .05), while Mix females adopted significantly more [G] than Mix males (p < .01).

[D] was a viable option for both the rounded and the unrounded environment (Table 5). However, it was less common in the former than the latter. Post hoc analyses regarding the chi-squared test for the rounded environment also showed that gender did not play a role in the distribution of [D]. However, for the unrounded environment, [D] reflected a complex interaction of gender and dialect, which was similar to [G], yet in the opposite direction. It was generally preferred by 漳 Chiang females and Mix males instead. Post hoc analyses confirmed this observation (漳 Chiang: p < .01; Mix: p = .067).

Due to the complex interactions among dialect, gender, and vowel context, different speaker groups showed different dominant variant categories. For the rounded environment, [L] was the most common variant for 漳 Chiang speakers and female Mix speakers, while [R] was the most common for male Mix speakers. However, no single variant was predominant in the unrounded environment, and multiple options seemed to be equally strong. Generally speaking, 漳 Chiang males and Mix females were more similar. They both had three equally robust categories, [Z], [G], and [L]. On the other hand, 漳 Chiang females and Mix males were more alike. The former had two equally robust categories, [L] and [D], while the latter had three, [Z], [L], and [D].

There was another interesting interaction between gender and dialect. For male speakers, dialectal distinction was maintained in both rounded and unrounded environments. Likelihood ratios showed that dialect was a significant factor for determining variant distribution in both contexts [rounded: χ2(3) = 10.82, p = .01; unrounded: χ2(3) = 12.28, p < .01]. For the rounded environment, 漳 Chiang males adopted more [L] and [D], but fewer [R], than Mix males ([L]: p < .05; [D]: p = .09; [R]: p = .01). For the unrounded environment, 漳 Chiang males adopted more [G], but fewer [D], than Mix males (p < .05). On the other hand, dialectal distinction was maintained only in the unrounded environment for female speakers [rounded: χ2(3) = .82, ns.; unrounded: χ2(3) = 8.28, p < .05]. 漳 Chiang females adopted more [D], but fewer [G], than Mix females (p < .05). No dialectal difference was found for the rounded environment.

4.3 Realization of /dz/ vs. Min proficiency

Talker proficiency was not comparable across gender (Table 6). Males were generally more proficient than females, which is a tendency that was also found in previous research [cf. Huang and Fon (2007)] and more or less reflects the status quo of young Min speakers nowadays. In order to examine how Min proficiency interacted with variant realizations, talkers were divided into two proficiency subgroups based on their self-rated seven-point Likert scale by using the median as an operational dividing point in order to facilitate analyses and maximize power. As shown in Table 6, the median for males lay between 6 and 7 and the median for females lay between 5 and 6. Although the dividing criteria were different for the two genders, the main purpose of the analyses was to examine the effect of relative proficiency on the realization of /dz/, and thus, the potential effect of absolute Likert scale ratings should not be of much concern. For the purpose of conciseness, the above-median subgroups were conveniently referred to as the high proficiency groups while the below-median subgroups were labeled as the mid proficiency groups in this study. However, it is worth noting that all speakers had sufficient command of the language to successfully perform the reading task and carry out everyday conversation in Min. In the following, separate analyses were carried out for the two genders.
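The within-gender median split described above might be sketched as follows. The ratings are hypothetical, and the tie rule (a rating strictly above the median counts as high, anything else as mid) is an assumption made here; the text only says that the median served as the dividing point.

```python
from statistics import median

def median_split(ratings):
    """Label each speaker 'high' or 'mid' relative to the group median."""
    cut = median(ratings.values())
    # Assumed tie rule: ratings equal to the median fall into the mid group.
    return {spk: ("high" if r > cut else "mid") for spk, r in ratings.items()}

# Hypothetical self-ratings on the seven-point scale (NOT the study's data).
male_ratings = {"M1": 7, "M2": 7, "M3": 7, "M4": 7, "M5": 7,
                "M6": 6, "M7": 6, "M8": 6, "M9": 5, "M10": 4}
female_ratings = {"F1": 6, "F2": 6, "F3": 6, "F4": 5,
                  "F5": 5, "F6": 5, "F7": 5}

# The split is applied separately per gender, so the cut point may differ.
male_groups = median_split(male_ratings)
female_groups = median_split(female_ratings)
print(male_groups)
print(female_groups)
```

Because the cut is computed within each gender, "high" and "mid" are relative labels: a rating of 6 is mid among these hypothetical males but high among the females.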

Table 6 Distribution of speakers’ proficiency levels with regard to dialect and gender. The dotted squares indicate the subgroups. “H”: high; “M”: mid. Please see text for explanation

Figure 4 shows the distribution of /dz/ realizations regarding speaker proficiency for males. It is interesting to find that males of different proficiency levels tended to have different realizations of /dz/. Mid-level males were consistent across dialects. They had [Z], [L], and [R] for the rounded environment, and [Z], [L], and [D] for the unrounded environment. High-level males showed more dialectal differences. For the 漳 Chiang dialect, high proficiency males had four categories for both environments, [Z], [L], [D], and [R] for the rounded, and [Z], [G], [L], and [D] for the unrounded. However, for the Mix dialect, high-level males had only two for both environments, [Z] and [R] for the rounded, and [Z] and [L] for the unrounded.

Fig. 4 Distribution of /dz/ realizations with regard to proficiency in (a) rounded and (b) unrounded environments for male 漳 Chiang speakers and (c) rounded and (d) unrounded environments for male Mix speakers. Numbers at the upper right-hand corners of the bars indicate the total number of tokens. Numbers in the bar sections represent the percentages. The thin lines inside the [Z] sections indicate the divide between affricate (left) and fricative (right) realizations. [Z] sections without a thin line indicate fricative realizations only. Asterisks indicate significance in post hoc tests, and asterisks in parentheses indicate near significance. Please see text for explanation

Preferences for different realizations were not only proficiency-dependent, but could also be dialect- and vowel-dependent. As shown in Fig. 4, high-level 漳 Chiang males used fewer [Z] than their mid-level counterparts, while high-level Mix males used more [Z] than their mid-level counterparts instead. In addition, mid-level 漳 Chiang males and high-level Mix males adopted both fricative and affricate realizations of [Z], while high-level 漳 Chiang males and mid-level Mix males used fricative realizations only. Likelihood ratios showed that proficiency was a significant factor in determining the realization of /dz/ for both dialects [漳 Chiang: χ2(4) = 15.12, p < .01; Mix: χ2(3) = 17.32, p < .001],Footnote 18 and post hoc analyses confirmed the observed trends [漳 Chiang: p = .059; Mix: p = .01].

Preferences for [L] did not seem as clear-cut. Although there seemed to be a trend for mid-level talkers to use [L] more often than their high-level counterparts, it was found only in specific vowel environments in the two dialects. For 漳 Chiang speakers, this was found in the unrounded context, while for Mix speakers, this was found in the rounded context. Likelihood ratios showed that proficiency was indeed significant in determining /dz/ realization in these two environments [漳 Chiang-unrounded: χ2(3) = 13.90, p < .01; Mix-rounded: χ2(2) = 8.31, p < .05], and post hoc analyses confirmed the above observations (漳 Chiang-unrounded: p = .15; Mix-rounded: p = .01). For the 漳 Chiang dialect in the rounded environment, and the Mix dialect in the unrounded environment, although proficiency still played a significant role in determining /dz/ realization [漳 Chiang-rounded: χ2(3) = 9.48, p < .05; Mix-unrounded: χ2(2) = 15.05, p < .001], post hoc tests regarding the distributions of [L] were not significant.

[G] was only observed among 漳 Chiang male speakers and only in the unrounded environment. As shown in Fig. 4, all of the [G] tokens were contributed by high-level male talkers. In fact, almost half of the tokens from high-level males were realized as [G]. Post hoc analyses confirmed this observation and indicated that high-level talkers indeed used more [G] than their mid-level counterparts (p = .001).

Like [G], [R] also occurred in only one of the environments. The distribution in Fig. 4 seemed to indicate that high-level talkers were more likely to use [R] than mid-level ones, especially for the Mix dialect. However, post hoc analyses showed that the distribution was not significant (p = .22).

Similar to [Z] and [L], [D] can occur in both environments. However, its interaction with talker proficiency seemed to align more with vowel context than with speaker dialect. As indicated in Fig. 4, both 漳 Chiang and Mix males showed a similar trend in the unrounded environment. Mid-level males were more likely to use [D] than high-level ones. The pattern was especially obvious for the Mix dialect. A likelihood ratio across the two dialects showed that proficiency was a significant factor in determining /dz/ realization in the unrounded environment [χ2(3) = 21.83, p < .0001], and post hoc analyses confirmed the above observations (p < .01). For the rounded environment, an opposite trend was found, at least for the 漳 Chiang talkers. Mid-level speakers were less likely to use [D] instead. This was confirmed by post hoc analyses (p < .05).

For females, the pattern was somewhat different, and proficiency seemed to exert a more homogeneous effect across vowel contexts and speaker dialects (Fig. 5). In general, high proficiency females were more likely to employ [L] and less likely to employ [Z] than mid proficiency ones regardless of vowel contexts. Also, affricate realizations of [Z] were found exclusively in the unrounded environments of the mid-level Mix female. The rest showed fricative realizations only. An overall likelihood ratio across dialects and vowel contexts showed that proficiency was indeed a significant factor in determining the realization of /dz/ [χ2(4) = 26.01, p < .0001], and post hoc analyses confirmed the above observations ([L]: p = .09; [Z]: p < .001). The effect for [L] was the strongest in the rounded context of the Mix females. A likelihood ratio for this combination showed that proficiency was again a significant factor for /dz/ realization [χ2(3) = 8.37, p < .05], and post hoc analyses indicated that high-level talkers adopted significantly more [L] than the mid-level talker (p < .05).

Fig. 5 Distribution of /dz/ realizations with regard to proficiency in (a) rounded and (b) unrounded environments for female 漳 Chiang speakers and (c) rounded and (d) unrounded environments for female Mix speakers. Numbers at the upper right-hand corners of the bars indicate the total number of tokens. Numbers in the bar sections represent the percentages. The thin line inside one of the [Z] sections indicates the divide between affricate (left) and fricative (right) realizations. [Z] sections without a thin line indicate fricative realizations only. Asterisks indicate significance in post hoc tests, and asterisks in parentheses indicate near significance. Please see text for explanation

[R] and [G] patterned in a similar fashion as [L]. Across the two dialects, high-level females were more likely to use [R] and [G] than the mid-level females. Likelihood ratios indicated that proficiency was a significant factor for /dz/ realization for both the rounded and the unrounded environments [rounded: χ2(3) = 17.04, p < .001; unrounded: χ2(3) = 14.14, p < .01], and post hoc analyses confirmed the above observations ([R]: p = .069; [G]: p < .05).

[D] seemed to be the only one that was more context- and dialect-dependent for female speakers. Although there was an overall near-significant trend for mid-level talkers to produce more [D] than their high-level counterparts (p = .09), this was mainly due to 漳 Chiang females in the rounded environment, and Mix females in general. Proficiency did not seem to play a role in determining realizations of [D] for 漳 Chiang females in the unrounded environment. Therefore, two separate analyses were run for the proficiency effect. One was for the Mix dialect in general, and the other was for the rounded environment across the two dialects. A likelihood ratio for the former showed significance [χ2(4) = 20.09, p < .001]. Post hoc analyses indicated that there was a tendency for mid-level females to use more [D] than their high-level counterparts (p = .08). For the latter, post hoc analyses similarly confirmed a significant effect of proficiency, and mid-level females again used [D] more often than high-level females (p < .05).

4.4 Realization of /dz/ vs. speaker consistency

Speaker consistency was operationally defined by the total number of variants a speaker adopted. The more variants a talker used, the less consistent he/she was in /dz/ realization. Although as many as four and six variants were observed as viable options in the rounded and unrounded environments, respectively (cf. Table 5), the majority used one to two variants in the rounded environment and two to three variants in the unrounded environment at most (Fig. 6). In fact, only one speaker adopted as many as four variants and only in the unrounded context.
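The consistency measure defined above amounts to counting distinct variant categories per speaker and vowel environment. A minimal sketch, using hypothetical tokens and an illustrative function name:

```python
from collections import defaultdict

def variants_per_speaker(tokens):
    """Map (speaker, environment) to the number of distinct variant categories."""
    seen = defaultdict(set)
    for speaker, environment, variant in tokens:
        seen[(speaker, environment)].add(variant)
    return {key: len(variants) for key, variants in seen.items()}

# Hypothetical tokens (NOT the study's data): speaker, environment, category.
tokens = [
    ("S1", "rounded", "L"), ("S1", "rounded", "L"), ("S1", "rounded", "R"),
    ("S1", "unrounded", "Z"), ("S1", "unrounded", "G"), ("S1", "unrounded", "L"),
    ("S2", "rounded", "L"), ("S2", "rounded", "L"),
    ("S2", "unrounded", "D"), ("S2", "unrounded", "L"),
]
counts = variants_per_speaker(tokens)
print(counts)
```

A lower count means a more consistent speaker; in this toy example S2 is fully consistent in the rounded environment, using a single category.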

Fig. 6 The overall distribution of the number of variants used by talkers

In order to examine whether gender and proficiency affected consistency, two sets of analyses were performed, one at a general level, and the other at a syllable-specific level. The former refers to the number of variants used for all stimuli, while the latter refers to the number of variants used for syllables with multiple tokens only, i.e., 如 ‘person’s name’, 字/二 ‘Chinese character; two’, and 入 jíp ‘to enter’.

Figure 7 shows the effect of gender and proficiency at the general level. It is clear from the figure that male speakers did not show much effect of proficiency. High- and mid-level speakers alike showed similar degrees of consistency. Only vowel environment seemed to be a potential factor. /dz/s in the unrounded environment were slightly more variable than those in the rounded one. However, a vowel (2) × proficiency (2) two-way mixed-design ANOVA showed that none of the effects reached significance. Statistically, male speakers remained relatively consistent across proficiency levels and vowel environments.

Fig. 7 The effect of proficiency on speaker consistency with regard to the number of variants used in /dz/ realization for all syllables in (a) male and (b) female talkers. Red asterisks indicate statistical significance. Error bars represent standard errors. Please see text for explanation

For females, both vowel environments and proficiency levels seemed to be affecting speaker consistency (Fig. 7b). Unrounded vowels again seemed to have elicited larger variability than rounded ones, but mid proficiency speakers also showed more variability than their high proficiency counterparts. This trend was found to be more prominent in the rounded than the unrounded environment. A vowel (2) × proficiency (2) two-way mixed-design ANOVA showed that all three effects were (near-) significant [vowel: F(1, 5) = 15.17, p < .05, \( {\widehat{\eta}}^2 \) = .75; proficiency: F(1, 5) = 5.03, p = .07, \( {\widehat{\eta}}^2 \) = .50; vowel × proficiency: F(1, 5) = 5.08, p = .07, \( {\widehat{\eta}}^2 \) = .50]. Post hoc pairwise comparisons concerning the interaction effect showed that proficiency was a potent factor only in the rounded environment. Mid proficiency speakers were significantly more variable than high proficiency speakers and on average used about one more variant than their more proficient counterparts (2.67 vs. 1.50, p < .05). In the unrounded environment, the proficiency effect did not reach significance. Similarly, although there seemed to be a general trend for unrounded environments to have more variable /dz/ realizations than rounded ones, the effect was only significant among the high proficiency speakers (p < .01). For the mid proficiency speakers, the difference was not significant.

Figure 8 shows the effect of gender and proficiency at the syllable level. For males, proficiency did not seem to be a potent factor. Regardless of syllable, high- and mid-level speakers alike showed similar numbers of variants. However, different syllables did seem to show slightly different degrees of variability, with 入 jíp ‘to enter’ being the most consistent. All speakers used only one variant to realize this syllable. A syllable (3) × proficiency (2) two-way mixed-design ANOVA indicated that only the main effect of syllable was near-significant [F(2, 16) = 2.89, p = .08, \( {\widehat{\eta}}^2 \) = .27]. Post hoc pairwise comparisons showed that jíp was (near-) significantly realized with fewer variants than both of the other syllables, 如 and 字/二 (p < .05 for one comparison and p = .09 for the other). No other effect was significant.

Fig. 8 The effect of proficiency on speaker consistency with regard to the number of variants used in /dz/ realization for the three syllables of multiple tokens, 如 ‘person’s name’, 字/二 ‘Chinese character; two,’ and 入 jíp ‘to enter’, in (a) male and (b) female talkers. Asterisks indicate statistical significance, and asterisks in parentheses indicate near-significance. Error bars represent standard errors. Please see text for explanation

For female speakers, both syllable type and speaker proficiency seemed to have exerted a larger effect (Fig. 8b). For both 如 ‘person’s name’ and 入 jíp ‘to enter’, mid proficiency speakers showed more variability than their high proficiency counterparts. The effect was much attenuated in 字/二 ‘Chinese character; two’. 入 jíp was again the most consistent syllable for all speakers, while 如 was the most variable syllable for mid proficiency speakers. A similar syllable (3) × proficiency (2) two-way mixed-design ANOVA was performed. Results showed that both main effects were (near-) significant [syllable: F(2, 10) = 5.58, p < .05, \( {\widehat{\eta}}^2 \) = .53; proficiency: F(1, 5) = 5.03, p = .07, \( {\widehat{\eta}}^2 \) = .50]. The interaction effect was not significant. Post hoc pairwise comparisons indicated that talkers used significantly fewer variants in realizing 入 jíp than 如 (p < .05), and mid proficiency speakers were near-significantly more variable than their high proficiency counterparts (p = .07).
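The effect sizes reported in this section can be cross-checked against the F statistics. The sketch below assumes that the reported \( {\widehat{\eta}}^2 \) values are partial eta squared, which the text does not state explicitly; under that assumption, each value follows from F and its two degrees of freedom.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (label, F, df_effect, df_error) as reported in Section 4.4.
reported = [
    ("females, vowel", 15.17, 1, 5),
    ("females, proficiency", 5.03, 1, 5),
    ("females, vowel x proficiency", 5.08, 1, 5),
    ("males, syllable", 2.89, 2, 16),
    ("females, syllable", 5.58, 2, 10),
]
effect_sizes = {
    label: round(partial_eta_squared(f, df1, df2), 2)
    for label, f, df1, df2 in reported
}
for label, value in effect_sizes.items():
    print(f"{label}: {value:.2f}")
```

Under this assumption the recomputed values agree with the reported ones to two decimal places, which suggests the F statistics, degrees of freedom, and effect sizes are internally consistent.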

5 Discussion

5.1 Realization of /dz/

The variety found in /dz/ realization in this study was much like what was observed in Chuang and Fon (2017) and more. In addition to the already reported [Z], [G], [L], [D], and [R], there were also speakers that used [V] and Ø. Although the latter two were contributed by single speakers only, it is worth noting that the speaker that contributed [V] produced three tokens of it, indicating its within-talker robustness. Therefore, it is unclear whether the occurrence of [V] was merely due to idiosyncrasy, or it was a newly rising variant and the talker specificity found here was in fact a result of sampling error. On the other hand, Ø was more likely to be random inadvertence, as there was only one token observed. At any rate, the wide spectrum of variants for /dz/ realization found in this study implies that /dz/ realization in Taiwan Min is currently in a volatile state among young speakers, regardless of gender. Although this is more in line with Chuang and Fon’s (2017) view, it is in stark contrast with most earlier studies [cf. Ang 洪惟仁 (2003, 2012), Chen 陳淑娟 (1995), Chen 陳雅玲 (2010, 2012), Khng 康韶真 (2014)], in which only [Z], [G], and [L] were observed. One suspects that the current findings might have been due to a serious decline in proficiency among young Min speakers (Directorate-General of Budget 行政院主計總處 2013; The United Daily News Group Poll Center 聯合報系民意調查中心 2002). As speakers become less proficient in the language, they might become less certain about how a particular phoneme should be realized, resulting in larger variability. If this is the case, then one would expect even larger variability to be observed for future generations to come. Alternatively, the wide variety of /dz/ realization might have stemmed from a new surge of sound change that was motivated by the articulatorily demanding /dz/ (Ang 洪惟仁 2012; Ohala 1983). Specifically, the two main classical realization categories of /dz/, [Z] and [G], did not have any phonemic correspondences in Mandarin, the most predominant language among young Min speakers nowadays (Huang 黃宣範 1993). As a consequence, in addition to adopting [L], /dz/ might be forced to further evolve into something that is even more “Mandarin-friendly”. The inclusion of [R] could be deemed as part of this process. More studies would be needed in order to provide a longitudinal documentation of the development of /dz/ realization.

5.2 Realization of /dz/ vs. gender

The effect of gender was robust in realizing all five major variant categories of /dz/, but in a manner more complicated than expected. Interestingly, only [Z] and [R] showed a simple gender effect. [L], [G], and [D] all demonstrated an intricate interaction with speaker dialect and/or vowel context to various degrees.

Being the classic old form for /dz/ realization (Ang 洪惟仁 2003, 2012), [Z] showed a distribution that was fairly consistent with one’s predictions, with male speakers preferring this variant more than their female counterparts. This is fairly understandable based on Labov’s (1990) framework, since females are more likely than males to abandon an old form that does not have a particularly positive connotation [cf. Ang 洪惟仁 (2003)]. As a result, males became the main conservers of this variant.

The distribution of [R] also showed a similar pattern, with males being more likely to adopt this variant than females. In fact, the Mix male speakers even used [R] as the predominant variant in realizing /dz/ in the rounded environment. However, the underlying cause was likely to be completely different from that of [Z]. As [R] was considered a new realization of /dz/ likely arising from Mandarin negative transfer (Chen 陳雅玲 2010, 2012; Chuang and Fon 2017), a negative connotation might have been attached, to which female speakers are more sensitive than their male counterparts [cf. Labov (1990), Maclagan et al. (1999), Trudgill (2000)]. As a consequence, female speakers were less likely to adopt this new variant, even though they welcomed others (e.g., [D]). Since Ang 洪惟仁 (2003) did not include [R] as part of his judgment study, future judgment tests would be needed in order to confirm this conjecture.

The distribution of [L] displayed an opposite pattern, with females showing a stronger preference than males. Since [L] was judged to be a variant with which speakers largely identify (Ang 洪惟仁 2003), this is in support of one’s prediction. Females should indeed show more usages of this variant. What is of peculiar interest is that such gender patterning was only found in the rounded environment. For the unrounded environment, gender preference was not obvious. This could not possibly be due to differential connotations across vowel contexts, which were not observed in Ang 洪惟仁 (2003) [cf. Table 2]. Instead, one suspects that this might have something to do with the stability of the two environments. As Ang 洪惟仁 (2003, 2012) mentioned, the shift from [Z] to [L] in the realization of Min /dz/ is more advanced and closer to complete in the rounded than the unrounded environment. It is thus possible that even though female speakers tend to adopt more eagerly new variants that are neutral or positive (Labov 1990), they would only do so in environments that are more stable and facilitating. In environments that are still volatile, females do not necessarily venture out as much. In order to have a clearer understanding of how gender preference interacts with rule progression (and thus phonological context), more studies on other sound variations would be needed.

[G] and [D] formed an interesting pair in the unrounded environment. For the 漳 Chiang dialect, females were more inclined to use the new form of [D], but less inclined to use the old form of [G], than males. This was generally in line with one’s prediction, as Labov’s (1990) model would predict that female speakers tend to favor non-negative new forms (i.e., [D]) but avoid non-positive old forms (i.e., [G]) [cf. Ang 洪惟仁 (2003, 2012), Table 2]. What is intriguing, however, is the pattern found for the Mix dialect, which was the exact opposite. Mix males avoided [G] altogether and showed a great liking for [D], while Mix females showed a robust usage of [G], but did not like [D] as much. In other words, the sound preference of the Mix speakers seemed to go directly against one’s predictions based on Labov’s (1990) model.

One suspects that this might be due to intricate interaction with dialect identity and dialect attitude. As [G] has been a signature variant for this major dialect in Taiwan (Ang 洪惟仁 1997; Chen 陳淑娟 1995; Chen 陳雅玲 2010, 2012; Chuang and Fon 2017; Khng 康韶真 2014; Lin 林珠彩 1995), counteracting the general trend of forgoing [G] altogether [cf. Chen 陳雅玲 (2010, 2012), Khng 康韶真 (2014)] might thus be socially favorable among fluent peer-speakers of this conservative dialect [cf. Chuang and Fon (2017)]. On the other hand, since [G] is already phasing out and is not tied to dialect identity among 漳 Chiang speakers (Ang 洪惟仁 2003, 2012), females of this more innovative dialect were more eager to follow the general trend by diminishing the use of [G] than their male counterparts.

The differential attitudes between genders across the two dialects could also be observed in the realization of [D]. As this is a rather new variant possibly with neutral connotation (Chuang and Fon 2017), females from a more innovative 漳 Chiang dialect were rather open in adopting the category. On the other hand, Mix females were more reserved towards this variant, likely not because they have attached anything seriously negative to it, but because they were from a more conservative dialect in general. As a result, they were more hesitant in adopting new variants. In other words, it is possible that female talkers in a conservative dialect tend to be more conservative than their male counterparts, while female talkers of an innovative dialect tend to be more innovative, ceteris paribus. This reverse patterning could also be observed in the realization of [Z]. Although there was an overall gender effect on the realization of [Z], the proportions for affricative-vs.-fricative realizations of [Z] were different in the two dialects (Fig. 3). In the more innovative 漳 Chiang dialect, females did not use the older affricative realization at all. Only males did. On the other hand, in the more conservative Mix dialect, females not only used the affricative realization of [Z], but also used it more often than their male counterparts. This supports the conjecture that there was indeed a difference in attitude for the two genders across the two dialects.

Although intricate interactions between gender and dialect have not been previously reported for realizations of Min /dz/, the general principle the speakers abided by was rather consistent with Labov’s (1990) claim of females being the leader of a non-stigmatized dialect change. The only minor revision that might be needed here is that a conservative dialect could potentially disfavor new variants across the board and thus all new variants are considered as at least slightly stigmatized, which would tend to be more vigorously avoided by female talkers. The flip side of this is that females from a conservative dialect seemed to more fervently support the “signature” sound of the dialect than their fellow male speakers. More studies would be needed in order to confirm this.

In summary, different sound categories seemed to be associated with different identifications. [Z], [R], and [L] were mainly attached to gender. Males tended to use [Z] and [R] while females tended to use [L]. This implies that the /Z/→/L/ sound change proposed by Ang 洪惟仁 (2003, 2012) was likely led by female speakers, while the /Z/→/R/ realization observed in Chuang and Fon (2017) was likely led by males. Since the rounded environment was more felicitous to both rules, the gender difference in the usage of [L] and [R] was more prominent in such a context. On the other hand, [G] and [D] were more tied to attitudes towards sound changes. More conservative speaker groups tended to use [G] while more innovative ones tended to use [D]. Since conservativeness interacted with gender and dialect in an intricate manner, the use of these two variants reflected such a convolution. In a conservative dialect like Mix, females were more adamant in maintaining the traditions of the language, and thus more [G] and fewer [D] were used. However, in a more innovative dialect like 漳 Chiang, males were the conservers of the tradition instead, and thus it was the males who used more [G] and fewer [D]. In other words, because males and females have differential sensitivity towards the inclination of the language, their reactions towards [G] and [D] are much dependent on their gender and dialect.

5.3 Realization of /dz/ vs. Min proficiency

Like gender, talker proficiency was a potent factor in shaping Min /dz/ realization. However, the pattern observed in this study did not always go as predicted. For some variants like [L], there were additional interactions with talker gender, while for variants like [D] and [R], vowel context, talker gender, talker proficiency, and talker dialect worked together in a complex manner.

Among the five variants, the distribution of [G] was the most straightforward. Except for Mix males, who did not use the variant at all, the variant was exclusively found among high-level talkers for all the other speaker groups. In particular, proficient 漳 Chiang males and proficient Mix females were the main forces in conserving the variant, and approximately half of their /dz/ tokens were realized as [G]. This is in line with one’s prediction. As [G] is an articulatorily challenging old form [cf. Maddieson (2013); Ohala (1983)], which is absent from Mandarin phonology [cf. Duanmu (2007)], it is not surprising that only high-level talkers were found to have adopted it.

The distribution of [Z] showed an interaction among speaker proficiency, speaker gender, and talker dialect. Except for Mix males, all the other speaker groups showed a negative correlation between the use of [Z] and speaker proficiency, and high-level talkers were less likely to use [Z] than mid-level talkers. The trend was especially prominent among female speakers, as all tokens except for one were produced by mid-level talkers only. This is surprisingly in direct contrast with one’s prediction, as one had originally assumed that high-level talkers are generally more conservative and are thus more reluctant to let go of an old form like [Z]. Instead, it seemed that forgoing [Z] is a trend not caused by incompetency of Min users, but rather, it is a movement for ease of articulation actively led by high-level talkers. On the other hand, Mix males were the only ones that showed a positive correlation between the use of [Z] and speaker proficiency, fulfilling one’s original prediction. In fact, [Z] was even the dominant variant for /dz/ realization among high-level Mix users in the unrounded environment, accounting for 67% of the total data (Fig. 4d). This shows that there might be two speaker-group-dependent trends for [Z]. Except for Mix males, most proficient talkers are participating actively in eradicating the use of [Z]. This is especially true among high-level females. For proficient Mix males, however, they are fighting the dominant trend and trying to conserve [Z] instead.

For [L], the distribution showed a gender split. Female speakers demonstrated an overall positive correlation between the use of [L] and talker proficiency, and [L] was more likely found in more proficient speakers. In contrast, although its pattern was less consistent, [L] was more often observed in less proficient speakers for males, showing a negative correlation instead. This was inconsistent with one’s prediction, as one had originally predicted that high-level talkers should generally be less likely to adopt a variant arising from ease of articulation regardless of gender. However, it seemed that for a non-negative variant like [L] [cf. Ang 洪惟仁 (2003)], proficient females tended to take the lead and adopt the variant in full force, while proficient males were more conservative and less accepting. If one assumes that high-level speakers of a talker group are better representatives of the attitude of the group as a whole than mid-level speakers, then this gender-dependent interaction with talker proficiency is consistent with the gender-dependent attitude observed in the gender effect above. As females were more open to adopting [L] than their male counterparts, proficient females and proficient males tended to intensify this difference by becoming even more and less accepting to [L] adoption, respectively. This also implies that the process of [Z] abolition and [L] adoption (i.e., the [Z]→[L] change) is likely not led by female speakers in general, as suggested above, but by proficient females in particular. Males and less proficient females were mere followers of this trend.

[R] was only found in the rounded environment, and except for 漳 Chiang males, who did not show much of a proficiency effect, all the other groups demonstrated a positive correlation between talker proficiency and the use of [R]. This is rather intriguing, as one’s original prediction was the exact opposite. Since [R] is new and likely carries a negative connotation due to a Mandarin transfer [cf. Chen 陳雅玲 (2010, 2012); Chuang and Fon (2017)], one would have expected speakers who are more proficient in Min to be more conservative about and thus more immune to negative phonological transfers from Mandarin. One suspects that the underlying cause for this seemingly unexpected result might be due to an intricate inter-language interaction. Previous studies have shown that [L] and [R] are both likely realizations for Mandarin /ʐ/ among Mandarin-Min bilinguals whose Mandarin is heavily influenced by Min (Chan 1984; Chuang et al. 2015).Footnote 19 This connection between [L] and [R] might later be carried over to realizations for Min /dz/ due to analogy between the two languages, as /dz/ and /ʐ/ are both the sole voiced sibilant in Min and Mandarin inventory, respectively, and [L] is a common realization for the two phonemes [cf. Chuang and Fon (2017)]. Since one would expect that high-level Min speakers are more likely to show a stronger impact of Min on their Mandarin due to more robust phonological representation of Min, they might have also established a stronger connection between [L] and [R] via realizations for Mandarin /ʐ/ than their less proficient counterparts, resulting in more frequent usages of [R] for Min /dz/. In other words, even though [R] realization is due to a negative transfer from Mandarin, this negative transfer is more likely to happen when speakers use heavily Min-accented Mandarin, which is more often found among proficient than less proficient Min talkers. This implies that [R] might have paradoxically become a variant that implies high Min proficiency. As shown in Fig. 4, the use of [R] was especially prominent among proficient Mix males, and more than two thirds of their /dz/ tokens in the rounded context were realized as [R]. Less proficient Mix males came as a close second, with 42% of their /dz/ tokens in the rounded context being realized as [R]. Therefore, one suspects that the adoption of [R] was likely initiated by Mix males in general, and their proficient members in particular, while female talkers and 漳 Chiang males were followers of this trend. Since ratings regarding the connotation of [R] are lacking, more studies would be needed in order to confirm the current conjecture. Whether [R] would eventually become a dominant realization for all speaker groups would also merit further research.

Finally, [D] seemed to show a general negative correlation with talker proficiency. Except for 漳 Chiang females in the unrounded context and 漳 Chiang males in the rounded context, mid-level speakers showed greater preference for [D] than their high-level counterparts. This tendency is especially strong among Mix males, as their mid-level talkers realized 73% of their /dz/ as [D] in the unrounded context, while no single instance of [D] was found among high-level Mix males. In contrast, high-level 漳 Chiang males seemed to demonstrate a stronger preference for [D] in the rounded environment than their mid-level counterparts, who showed no sign of using [D]. For 漳 Chiang females in the unrounded context, no proficiency effect was found, and both high- and mid-level speakers showed equally strong preferences for [D], accounting for 42% of the total tokens. The above pattern was largely in line with one’s predictions. Since [D] is a rather new variant and does not have a clear positive connotation, the conservative attitude held by high-level talkers might have prevented them from embracing the variant wholeheartedly. The distribution also seemed to imply that 漳 Chiang females and less proficient Mix males were the ones that were leading the use of [D]. Other speakers were only followers of the trend. This might also explain why no proficiency effect was found for 漳 Chiang females. Even if a proficiency effect had existed, it might have long been mitigated through time. More studies would be needed in order to understand the relationship between language proficiency and attitude on the one hand, and listeners’ attached connotation of [D] on the other. Whether 漳 Chiang males in the rounded context showed a genuine opposite trend of the major tendency or whether it is a pattern due primarily to sampling error would also require further studies.

The distribution above seemed to imply that although variant newness and connotation do interact with talker proficiency, the mapping between the two is not as simple as one had originally conjectured, and some qualifications are necessary. In general, proficient talkers were indeed more conservative, and were more likely to use old forms than their mid-level counterparts. However, the results of [G] and [Z] suggest that the choice of which old forms to conserve seemed to be group-dependent and somewhat coincidental in nature. Proficient Mix males showed a strong preference for [Z], while proficient Mix females and proficient 漳 Chiang males showed a strong preference for [G]. For all three speaker groups, the old forms were in fact the predominant realizations in the unrounded context, constituting about half of the tokens. For newer forms, proficiency seemed to interact more with social connotation. Variants that imply high proficiency are more likely to be adopted by proficient talkers of both genders, as was the case with [R], while variants with neutral-to-slightly positive connotation [cf. Ang 洪惟仁 (2003)] were more likely adopted by females across dialects, as was the case with [L]. Finally, for new forms that do not have clear positive connotation, proficient talkers tend to avoid using the variant due to their conservative nature, as was the case with [D]. In other words, although proficient talkers are generally more cautious in adopting new variants, as one had originally predicted, their unwelcoming attitude could be attenuated by positive social connotation of the variant, and the effect is much stronger on females than on males.

Although one would usually consider proficient speakers as leaders of a language, since they are the ones that are more likely to use the language more frequently and thus exert a larger impact in shaping it, the situation might not be as clear-cut in the case of Min. As mentioned above, due to the strong influence of the official language Mandarin, the average Min proficiency and frequency of use are on a drastic decline, especially among younger speakers (Directorate-General of Budget 行政院主計總處 2013; Huang 黃宣範 1993; The United Daily News Group Poll Center 聯合報系民意調查中心 2002). As a consequence, high-level young talkers are drastically outnumbered by their mid- and low-level counterparts and might thus lose their critical impact on the language. If this trend is left to itself without any intervention, young mid-level talkers might become the ones that are taking the lead in shaping the language, since unlike their low-level counterparts, they still use Min on a daily basis. Future studies are necessary in order to observe whether this trend is going to be realized as predicted.

5.4 Realization of /dz/ vs. speaker consistency

Even though as many as four and six variants were observed across all speakers in the rounded and unrounded environments, respectively, individual talkers were rather prudent in their choices of variants and mostly limited themselves to only one to two variants in the rounded environment and two to three variants in the unrounded environment. The effect of vowel context on the number of variants used was likely not genuine but was instead due to the design of the study. This could be evidenced by the fact that the vowel context effect disappeared (or was even somewhat reversed) when only individual syllables were compared (Fig. 8). Since there are only two unique syllables in the rounded environment, but as many as five unique syllables in the unrounded one, the higher variability found in the latter context implies that talkers in general showed higher intra- than inter-syllabic consistency.

However, this tendency was much affected by speaker gender and talker proficiency. Male speakers were in general more consistent and their number of variants adopted showed little impact of these language-internal and/or language-external factors. Intra- and inter-syllabic consistency were comparably high across the two levels of male talkers, as speakers in general adopted only one or two variants for a given context/syllable. On the other hand, much variability was observed among female speakers of different Min proficiency levels. For high-level speakers, intra-syllabic consistency was still high and quite comparable to that of their male counterparts, but their inter-syllabic consistency was quite low. For mid-level females, both their intra- and inter-syllabic consistency were comparably low.

The gender-dependent pattern in variant consistency was likely due not (only) to gender per se, but to differential proficiency levels underlying gender. As mentioned above, previous studies showed that young female speakers tended to be less proficient in Min than their male counterparts (Huang and Fon 2007). This was also true in this study based on self-rated data (see Fig. 2a). Therefore, it is possible that there might be a proficiency effect independent of the interaction between gender and proficiency. More proficient speakers are more likely to demonstrate high consistency at both inter- and intra-syllabic level in variable sound realizations. As speakers’ proficiency level declines, inter-syllabic consistency is likely the first to suffer, but intra-syllabic consistency could still be maintained to a large extent. Once the proficiency level lowers below a certain threshold, intra-syllabic consistency is also likely to be affected, creating even more variability.
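
The distinction between intra- and inter-syllabic consistency discussed in this section can be illustrated with a minimal sketch. The token list below is entirely hypothetical (it is not data from this study), and the syllable labels and the helper `variants_per_key` are placeholders introduced for illustration. The sketch counts the number of distinct variants a single talker uses within each syllable and within each vowel context.

```python
from collections import defaultdict

# Hypothetical tokens for one talker: (vowel context, syllable, variant).
# These values are invented for illustration and are not from the study.
tokens = [
    ("unrounded", "syl_a", "G"),
    ("unrounded", "syl_a", "G"),
    ("unrounded", "syl_b", "D"),
    ("unrounded", "syl_b", "D"),
    ("unrounded", "syl_c", "Z"),
    ("rounded", "syl_d", "L"),
    ("rounded", "syl_d", "L"),
    ("rounded", "syl_e", "L"),
]


def variants_per_key(tokens, key_index):
    """Count the distinct variants observed for each value of the chosen key.

    key_index 0 groups tokens by vowel context; key_index 1 groups by syllable.
    """
    groups = defaultdict(set)
    for token in tokens:
        groups[token[key_index]].add(token[2])
    return {key: len(variants) for key, variants in groups.items()}


per_syllable = variants_per_key(tokens, 1)  # intra-syllabic view
per_context = variants_per_key(tokens, 0)   # pooled across syllables

print(per_syllable)
print(per_context)
```

In this invented example, the talker uses a single variant within every syllable (high intra-syllabic consistency) but three different variants across the unrounded syllables (lower inter-syllabic consistency). A context with more unique syllables therefore offers more room for the pooled variant count to grow, which is the design effect noted above.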

6 Conclusion

Although previous research on Min /dz/ realization focused mainly on the geographical distributions of different variants [e.g., Ang 洪惟仁 (2003, 2012), Ang and Chang 洪惟仁, 張素蓉 (2008), Chen 陳淑娟 (1995), Chen 陳雅玲 (2010, 2012), Chuang and Fon (2017), Hung 洪慧鈺 (2007), Khng 康韶真 (2014), Lin 林珠彩 (1995), Thoo 涂文欽 (2009), and Wang 王薈雯 (2014)], this study demonstrated that, like variation in other languages, Min /dz/ variation, and perhaps also Min variation in general, is not immune to more language-universal factors like gender and proficiency. With some necessary qualifications, it is generally safe to say that females are innovators for non-negative new forms and conservers for positive old forms while males are innovators for negative new forms and conservers for non-positive old forms, conforming to the Labovian model for the most part (Labov 1990; Maclagan et al. 1999; Trudgill 2000). However, the connotation of a variant could be rather fluid and demonstrate dialect-dependent patterns. Different speaker groups thus reacted to the same variants according to their dialect-specific connotations.

On the other hand, high-level talkers are generally conservers of old forms and innovators of new forms that imply high proficiency, while mid-level talkers are innovators of new forms that lack clear positive connotation. For new forms that are slightly positive, the effect of gender might also need to be factored in. However, not all old forms are uniformly conserved by high-level talkers, and different speaker groups might demonstrate idiosyncratic preferences for their choices. This shows that the effect of proficiency for Taiwan Min might be more complicated than what was found for Yami [cf. Lai and Hsu (2013)].

Since Min females are generally less proficient than Min males in Taiwan (Huang and Fon 2007), it is tempting to overlay the effects of gender and proficiency and align males with high-level talkers and females with mid-level talkers. Although there indeed seemed to be some supporting evidence for such an alignment, as in the case of [R] and [L], patterns of other variants indicated that a simple alignment could not have been the whole story. For example, although high-level talkers were more likely to preserve [G] and avoid [D], there was no clear-cut gender effect for the two variants. Instead, an intertwined gender- and dialect-dependent pattern was found. Similarly, although males were more likely to preserve [Z] than females, the expected proficiency effect was only observed among Mix males, and not among other speaker groups. This suggests that the two factors of gender and proficiency should at least be partially independent of each other, even though sometimes they may indeed align. This could also be clearly observed in the consistency effect. Although high-level females were on a par with high-level males in terms of intra-syllabic consistency, their inter-syllabic consistency was much lower and patterned more with mid-level females, who showed both low intra- and inter-syllabic consistency. In other words, speaker gender and talker proficiency not only affect one’s variant choices, but also affect one’s consistency with regard to these choices.

It is possible that the interactive pattern between gender and proficiency is not exclusive to the realization of Min /dz/, but might also apply to other variant realizations in Min, or even other languages in general. This would be especially interesting to observe for non-official languages with declining use like Min. Females tend to lead non-negative new sound changes (Labov 1990; Maclagan et al. 1999; Trudgill 2000), and in a language with declining use, the speakers are likely dominated by mid-level talkers at best, most of whom are in fact females. How this would mold the manifestation of language variation is worth further investigation, as it would deepen our understanding of the essence of language change itself.

Notes

  1. 漳 Chiang and 泉 Chôan are more commonly referred to by their Mandarin transliterations, Zhang and Quan, respectively. However, this study uses the Taiwan Min terms to refer to these two dialects so as to show respect to the speakers of the language. Church Romanization is adopted for Min throughout the article.

  2. Min speakers mainly reside in the west half of Taiwan. Between 漳 Chiang and 泉 Chôan, the former is more prevalent in Taiwan. However, most speakers show various degrees of Chiang-Chôan mixture due to frequent travels within the island (Ang 洪惟仁 1985), and thus, the Mix dialect has currently become the most dominant variety on the island (National Languages Committee 國語推行委員會 2011).

  3. As Mandarin has been the official language in Taiwan since 1945, all Min speakers born after that are naturally bilingual and are fluent in Mandarin also (Huang 黃宣範 1993).

  4. The source of the database is http://archive.phonetics.ucla.edu/. Accessed 9 May 2017.

  5. Mielke (2012) indicated a language count of 549 in the P-base database. However, the actual number in fact added up to 629 for some unknown reason. The latter was used as the denominator for calculating ratios. The source of the database is http://pbase.phon.chass.ncsu.edu/query_inventory. Accessed 9 May 2017.

  6. Like many Chinese languages, a great number of Taiwan Min words have two context-dependent pronunciations (Yang 楊秀芳 1982). 讀冊音 Thák-chheh-im is used in literary contexts while 講話音 kng-ōe-im ‘pronunciation of speech’ is used in colloquial contexts.

  7. Except for a few Min scholars and writers, and congregations affiliated with the Presbyterian Church in Taiwan, which uses the Min Bible and Min Hymns in the majority of its services in plain churches, most Min speakers use Min only in oral communication and are not familiar with the written form of Min. According to the statistics provided by the Presbyterian Church in Taiwan 台灣基督長老教會總會 (2015a, 2015b), its congregation accounted for 1.1% of the total population in 2015, and about 78% of the congregation from plain presbyteries attended services conducted in Min.

  8. The United Daily News Poll Center has undergone reorganization and has been renamed as The United Marketing Research Company since 2002.

  9. For older speakers of Yami, both [l] and [ɮ] are allophones of /l/. The latter is used before a high vowel /i/ and the former is realized elsewhere. This distinction is disappearing for speakers below age 40 (Lai and Hsu 2013).

  10. Results of the female speakers have already been reported in a dialectal study of /dz/ in Chuang and Fon (2017). The current paper focused only on gender and proficiency effects and the interaction between the two.

  11. Adjusted df’s were adopted throughout this study when Levene’s tests for equality of variances were significant.

  12. Interestingly, Pearson correlation showed that speakers’ Min proficiency was also correlated with their Min frequency of use with their immediate family members only [family: r(6) = .93, p < .001; others: r(6) = .24, ns.]. Similarly, speakers only judged their Min frequency of use based on how frequently they conversed with their family members in Min [family: r(6) = .86, p < .01; others: r(6) = .24, ns.].

  13. Speakers from the 泉 Chôan dialect were not included because there were fewer fluent young speakers available at the time of the experiment. This likely reflected the vitality of the dialect among young speakers.

  14. Although the Ministry of Education in Taiwan has recommended a character-based Min transcription system (National Languages Committee 國語推行委員會 2011), the pronunciations of some characters were not transparent to young native speakers in a pretest. In order to facilitate smooth elicitation, alternatives based on a make-do system widely used in Min lyrics and creative writing were adopted for these specific characters.

  15. Since reading Min out loud is not an activity commonly performed by Min speakers, many of the participants tended to paraphrase what they read instead of reading verbatim for the first pass. As a consequence, they were asked to read again, as many of the omissions/alterations were in fact target words.

  16. Post hoc analyses for chi-square tests were executed using Beasley and Schumacker’s (1995) method throughout the paper.

  17. Unlike many variation studies, which mainly used linear mixed-effects (LME) modeling analyses, chi-square was chosen in this study as a means to statistically evaluate the effect of gender on /dz/ realization. This was mainly due to two reasons. First of all, the number of cases included was too small for an ideal overall LME analysis, and the model failed to converge. Secondly, there were five main realization categories of interest, which would not easily fit into a binary LME analysis. However, in order to check for a potential random subject effect, LME models were run on the current dataset by collapsing realizations into two categories, the older variants, including [Z] and [G], and the newer variants, including [L], [D], and [R]. The results were largely in accord with the chi-square analyses. For a detailed description of the LME analyses on the gender effect, please see Appendix 1.

  18. LME analyses examining the proficiency effect were run to check for a potential random subject effect. Please see Appendix 2 for a detailed description of the results.

  19. The use of [L] for Mandarin /ʐ/ realization is thought to be due to a negative Min transfer (Chan 1984; Kubler 1985a, 1985b).

References

  • Ang, Uijin 洪惟仁. 1985. A study on Taiwan Holo tone 臺灣河佬話聲調研究. Taipei: Zili Wanbao.

  • Ang, Uijin 洪惟仁. 1997. Southern Min dialects in Kaohsiung County 高雄縣閩南語方言. Kaohsiung: Kaohsiung County Government.

  • Ang, Uijin 洪惟仁. 2003. The motivation and direction of sound change: On the competition of Minnan dialects Chang-chou and Chuan-chou and the emergence of General Taiwanese 音變的動機與方向:漳泉競爭與台灣普通腔的形成. Ph.D. dissertation. Hsinchu: National Tsing Hua University.

  • Ang, Uijin 洪惟仁. 2005. Looking at the changes of Taiwan Southern Min from dialect maps graphed in two different eras 從兩個時期製作的方言地圖看台灣閩南語的變化. Paper presented at the 9th Min Dialect International Symposium 第九屆閩方言國際學術研討會. Fuzhou, China.

  • Ang, Uijin 洪惟仁. 2012. The drift of change of the initial /j-/ of Southern Min 閩南語入字頭(日母)的音變潮流. Journal of Taiwanese Languages and Literature 臺灣語文研究 7: 1–32.

  • Ang, Uijin 洪惟仁. 2013. The distribution and regionalization of varieties in Taiwan 台灣的語種分布與分區. Language and Linguistics 語言暨語言學 14: 313–367.

  • Ang, Uijin, and Su-Rong Chang 洪惟仁, 張素蓉. 2008. The gradient distribution of Quanzhou dialect in the Haixian area of the Taichung County: A socio-geodialectological study 台中縣海線地區泉州腔的漸層分布──一個社會地理方言學的研究. In Proceeding of sociolinguistics and functional grammar 社會語言學與功能語法論文集, ed. Hsu S. Wang and Fu-mei Hsu 王旭, 徐富美, 13–43. Taipei: Crane.

  • Balise, Raymond R., and Randy L. Diehl. 1994. Some distributional facts about fricatives and a perceptual explanation. Phonetica: International Journal of Speech Science 51(1–3): 99–110.

  • Baran, Dominika. 2014. Linguistic practice and identity work: Variation in Taiwan Mandarin at a Taipei County high school. Journal of Sociolinguistics 18(1): 32–59.

  • Beardsmore, Hugo B. 1986. Bilingualism: Basic principles (Vol. 2). Clevedon: Multilingual Matters.

  • Beasley, T. Mark, and Randall E. Schumacker. 1995. Multiple regression approach to analyzing contingency tables: Post hoc and planned comparison procedures. The Journal of Experimental Education 64(1): 79–93.

  • Boersma, Paul, and David Weenink. 2017. Praat: Doing phonetics by computer (Version 6.0) [Computer Program]. Available at http://www.praat.org/. Accessed 19 Mar 2017.

  • Chan, Hui-chen. 1984. The phonetic development of Mandarin /ʐ/ in Taiwan: A sociolinguistic study. M.A. Thesis. Taipei: Fu Jen Catholic University.

  • Chen, Su-Chuan 陳淑娟. 1995. A study on the sound change from the chu-rhyme to the shi-rhyme in Guanmiao dialect 關廟方言「出歸時」的研究. M.A. Thesis. Taipei: National Taiwan University.

  • Chen, Ya-ling 陳雅玲. 2010. Phonetic variation in Chuanchou-accented Southern Min as spoken in coastal Kaohsiung City 高雄市海岸地帶偏泉腔閩南語的語音變異. M.A. Thesis. Hsinchu: National Hsinchu University of Education.

  • Chen, Ya-ling 陳雅玲. 2012. The variation and change of j-initial syllables in Kaohsiung Southern Min 高雄市閩南語入字頭的變異與變化. Paper presented at the 9th Workshop on the Relationship between the Racial Migration and Distribution of Languages or Dialects 第九屆語言文化分佈與族群遷徙工作坊. Taipei, Taiwan.

  • Chuang, Ya-Wen, Chong-Wei Feng, and Ru-Yi Chen 莊雅雯, 馮鐘緯, 陳如億. 2009. The difference of the variants of j-initial syllables between Helauke and non-Helauke areas 〈入〉字頭「g」變體在鶴佬客地區與非鶴佬客地區之差異. Paper presented at the 3rd Workshop on the Relationship between the Racial Migration and Distribution of Languages or Dialects 第三屆語言文化分佈與族群遷徙工作坊. Kaohsiung, Taiwan.

  • Chuang, Yu-Ying, Ying-Jie Chiang, and Janice Fon. 2012. The effects of context and Min dialect on the realizations of /ʐ/ variants in Taiwan Mandarin. Paper presented at the 2nd Workshop on Sound Change. Kloster Seeon, Germany.

  • Chuang, Yu-Ying, and Janice Fon. 2017. On the dialectal variations of voiced sibilant /dz/ in Taiwan Min young speakers. Lingua Sinica 3: 1–34.

  • Chuang, Yu-Ying, Sheng-Fu Wang, and Janice Fon. 2015. Cross-linguistic interaction between two voiced fricatives in Mandarin-Min simultaneous bilinguals. In Proceedings of the 18th International Congress of Phonetic Sciences, Paper number 0311, ed. The Scottish Consortium for ICPhS 2015, 1–4. Glasgow, UK: The University of Glasgow. Available at https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2015/Papers/ICPHS0311.pdf. Accessed 12 July 2017.

  • De Houwer, Annick. 1995. Bilingual language acquisition. In The handbook of child language, ed. Paul Fletcher and Brian MacWhinney, 219–250. Oxford: Wiley-Blackwell.

  • Directorate-General of Budget, Accounting and Statistics, Executive Yuan 行政院主計總處. 2013. The 2010 population and housing census: A general summary report on the statistic results and analyses 99年人口及住宅普查總報告提要分析. Taipei: Directorate-General of Budget, Accounting and Statistics, Executive Yuan. Available at http://www.dgbas.gov.tw/ct.asp?xItem=31969&ctNode=3272&mp=1. Accessed 17 July 2017.

  • Duanmu, San. 2007. The phonology of standard Chinese (2nd ed.). Oxford: Oxford University Press.

  • Fon, Janice, Jui-mei Hung, Yi-Hsuan Huang, and Hui-ju Hsu. 2011. Dialectal variations on syllable-final nasal mergers in Taiwan Mandarin. Language and Linguistics 12(2): 273–311.

  • Gósy, Mária. 2013. Inter-speaker and intra-speaker variability indicating a synchronous speech sound change. In VL1xx: Papers in linguistics presented to László Varga on his 70th birthday, ed. Péter Szigetvári, 313–332. Budapest: Tinta Publishing House.

  • Gussenhoven, Carlos, and Rolf H. Bremmer Jr. 1983. Voiced fricatives in Dutch: Sources and present-day usage. North-Western European Language Evolution 2: 55–71.

  • Hickey, Raymond. 2012. Internally and externally motivated language change. In The handbook of historical sociolinguistics, ed. Juan M. Hernández-Campoy and Juan C. Conde-Silvestre, 401–421. Malden, MA: Wiley-Blackwell.

  • Holes, Clive. 1990. Gulf Arabic. London: Routledge.

  • Holmes, Janet. 1999. Setting new standards: Sound changes and gender in New Zealand English. Cuadernos de Filología Inglesa 8: 147–175.

  • Huang, Shuanfan 黃宣範. 1993. Language, society, and ethnic identity: A study on language sociology in Taiwan 語言, 社會與族群意識: 台灣語言社會學的研究. Taipei: Crane.

  • Huang, Yi-Hsuan, and Janice Fon. 2007. The effect of acquisition order and word relatedness on code-switching in balanced bilingual speakers. In Proceedings of the 16th international congress of phonetic sciences, ed. Jürgen Trouvain and William J. Barry, 1577–1580. Saarbrücken: Univ. des Saarlandes.

  • Hung, Hui-yu 洪慧鈺. 2007. An investigation and analysis on the sound change of j-initial syllables in Fangyuan township, Changhua County 彰化縣芳苑鄉〈入〉字頭音變的調查與分析. Annual of Graduate School of Chinese Literature Soochow University 有鳳初鳴年刊 3: 119–133.

  • Janson, Tore. 1983. Sound change in perception and production. Language 59(1): 18–34.

  • Jesus, Luis M.T., and Christine H. Shadle. 2003. Temporal and devoicing analysis of European Portuguese fricatives. In Proceedings of the International Congress of Phonetic Sciences, ed. Maria-Josep Sole, Daniel Recasens, and Joaquin Romero, 779–782.

  • Khng, Siau-tsin 康韶真. 2014. A phonetic survey and variation analyze of Taiwanese in Kaohsiung 高雄台語語音的調查佮演變分析. M.A. Thesis. Taipei: National Taiwan Normal University.

  • Kong, Eun Jong, Mary E. Beckman, and Jan Edwards. 2012. Voice onset time is necessary but not always sufficient to describe acquisition of voiced stops: The cases of Greek and Japanese. Journal of Phonetics 40(6): 725–744.

  • Kubler, Cornelius C. 1985a. The development of Mandarin in Taiwan: A case study of language contact. Taipei: Student Book.

  • Kubler, Cornelius C. 1985b. The influence of Southern Min on the Mandarin of Taiwan. Anthropological Linguistics 27(2): 156–176.

  • Labov, William. 1971. The study of language in its social context. In Advances in the sociology of language, ed. Joshua A. Fishman, vol. 1, 152–216. The Hague: Mouton.

  • Labov, William. 1990. The intersection of sex and social class in the course of linguistic change. Language Variation and Change 2(2): 205–254.

  • Lai, Li-Fang, and Huiju Hsu. 2013. Contact-induced sound change: Analysis of the alveolar lateral fricative in Yami. Eesti ja soome-ugri keeleteaduse ajakiri - Journal of Estonian and Finno-Ugric Linguistics 4: 31–50.

  • Li, Zhongmin 李仲民. 2009. A retrospective and perspective view on the postwar geolinguistics of Taiwan Southern Min 戰後臺灣閩南語地理語言學的回顧與展望. Journal of Taiwanese Languages and Literature 臺灣語文研究 4: 107–151.

  • Lin, Ju-Tsai 林珠彩. 1995. A preliminary investigation on the phonetics of some lexical items comparing across three generations of Taiwan Min speakers—using the Lin family in Xiaogang, Kaohsiung City as an example 台灣閩南語三代間語音詞彙的初步調查與比較──以高雄市小港區林家為例. M.A. thesis. Taipei: National Taiwan Normal University.

  • Maclagan, Margaret A., Elizabeth Gordon, and Gillian Lewis. 1999. Women and sound change: Conservative and innovative behavior by the same speaker. Language Variation and Change 11(1): 19–41.

  • Maddieson, Ian. 1984. Patterns of sounds. Cambridge: Cambridge University Press.

  • Maddieson, Ian. 2013. Voicing and gaps in plosive systems. In The world atlas of language structures online, ed. Matthew S. Dryer and Martin Haspelmath. Leipzig: Max Planck Institute for Evolutionary Anthropology.

  • Mielke, Jeff. 2012. P-base query inventory. Available at http://pbase.phon.chass.ncsu.edu/query_inventory. Accessed 9 May 2017.

  • Miller, George A., and Patricia E. Nicely. 1955. An analysis of perceptual confusions among some English consonants. Journal of the Acoustical Society of America 27(2): 338–352.

  • National Languages Committee 國語推行委員會. 2011. The Taiwan Southern Min common word dictionary 臺灣閩南語常用詞辭典. Taipei: Ministry of Education, Taiwan. Available at http://twblg.dict.edu.tw/holodict_new/index.html. Accessed 19 Mar 2017.

  • Nip, Ignatius S.B., and Henrike K. Blumenfeld. 2015. Proficiency and linguistic complexity influence speech motor control and performance in Spanish language learners. Journal of Speech, Language, and Hearing Research 58(3): 653–668.

  • Norman, Jerry. 1988. Chinese. Cambridge: Cambridge University Press.

  • Ogawa, Naoyoshi 小川尚義 (ed.) 1907. A composite Japanese-Taiwanese dictionary 日臺大辭典. Taipei: Office of the Governor-General of Taiwan.

  • Ohala, John J. 1983. The origin of sound patterns in vocal tract constraints. In The production of speech, ed. Peter F. MacNeilage, 189–216. New York: Springer-Verlag.

  • R Core Team. 2016. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. Available at https://www.R-project.org/. Accessed 14 December 2016.

  • Rau, Victoria D., Hui-Huan A. Chang, and Maa-Neu Dong. 2009. A tale of two diphthongs in an indigenous minority language: Yami of Taiwan. In Variation in indigenous minority languages, ed. James N. Stanford and Dennis R. Preston, 259–279. Amsterdam/Philadelphia: John Benjamins.

  • Reynolds, Jermay J. 2012. Language variation and change in an Amdo Tibetan village: Gender, education and resistance. Ph.D. dissertation. Washington, DC: Georgetown University.

  • Singh, Sadanand, and John W. Black. 1966. Study of twenty-six intervocalic consonants as spoken and recognized by four language groups. Journal of the Acoustical Society of America 39(2): 372–387.

  • Smith, Caroline L. 1997. The devoicing of /z/ in American English: Effects of local and prosodic context. Journal of Phonetics 25(4): 471–500.

  • The Presbyterian Church in Taiwan 台灣基督長老教會總會. 2015a. Church attendance statistics in 2015 2015年禮拜人數. Available at http://pctcontent.pct.org.tw/ModuleControl/downfile/L01-2-%e6%95%99%e5%8b%a2%e7%b5%b1%e8%a8%88-%e7%a6%ae%e6%8b%9c%e4%ba%ba%e6%95%b8(OK).pdf. Accessed 7 July 2017.

  • The Presbyterian Church in Taiwan 台灣基督長老教會總會. 2015b. Congregation population statistics in 2015 2015年信徒總數. Available at http://pctcontent.pct.org.tw/ModuleControl/downfile/L01-4-%e6%95%99%e5%8b%a2%e7%b5%b1%e8%a8%88-%e4%bf%a1%e5%be%92%e7%b8%bd%e6%95%b8(OK).pdf. Accessed 9 May 2017.

  • The United Daily News Group Poll Center 聯合報系民意調查中心. 2002. The legacy and disappearance of mother tongues 母語的傳承與流失, United Daily News 14. Available at http://www.hakkaonline.com/thread-5123-1-1.html. Accessed 7 July 2017.

  • Thoo, Bun-khim 涂文欽. 2009. A geo-dialectological study on Southern Min in Changhua County 彰化縣閩南語地理方言學研究. Paper presented at the 10th National Conference on Linguistics 第十屆全國語言學論文研討會. Taoyuan, Taiwan.

  • Tinelli, Henri. 1981. Creole phonology. The Hague: Mouton.

  • Trudgill, Peter. 2000. Sociolinguistics: An introduction to language and society. New York: Penguin Books.

  • Wang, Hui-Wen 王薈雯. 2014. The performances of the Kinmen dialect of young people in Kinmen using 8 young people born in Kinmen in the 80s as examples 金門青年的金門話表現情形:以八位1980年代出世的金門青年做例. M.A. Thesis. Taipei: National Taiwan Normal University.

  • Wang, Marilyn D., and Robert C. Bilger. 1973. Consonant confusions in noise: A study of perceptual features. Journal of the Acoustical Society of America 54(5): 1248–1266.

  • Yang, Hsiu-fang 楊秀芳. 1982. A study on the literary-colloquial system of Southern Min 閩南語文白系統的研究. Ph.D. dissertation. Taipei: National Taiwan University.

  • Yao, Rong-song 姚榮松. 1988. The phonological system in Hui Yin Miao Wu and some related questions 彙音妙悟的音系及其相關問題. Bulletin of Chinese 國文學報 17: 251–281.

  • Yu, Alan C.L., Carissa Abrego-Collier, Jacob Phillips, Betsy Pillion, and Daniel Chen. 2015. Investigating variation in English vowel-to-vowel coarticulation in a longitudinal phonetic corpus. In Proceedings of the 18th International Congress of Phonetic Sciences, Paper number 0519, ed. The Scottish Consortium for ICPhS 2015, 1–4. Glasgow: The University of Glasgow.

  • Zhang, Qing. 2005. A Chinese yuppie in Beijing: Phonological variation and the construction of a new professional identity. Language in Society 34(3): 431–466.

  • Żygis, Marzena. 2008. On the avoidance of voiced sibilant affricates. ZAS Papers in Linguistics 49: 23–45.

Acknowledgements

This work was supported by the Ministry of Science and Technology in Taiwan (project number MOST104-2410-H-002-160-MY3). The authors would like to thank past and present members of the Phonetics Lab in the Graduate Institute of Linguistics at National Taiwan University for their helpful comments and support, especially Ying-Chieh Chiang, who helped with the recording, and Sheng-fu Wang, who helped with sound labeling. Many thanks go to all the talkers who participated in the study. Special thanks go to two anonymous reviewers, who provided fruitful thoughts to make this work better. Without them, this paper could not have been finished. Naturally, all the faults are ours.

Author information

Contributions

YC designed the experiment and collected data with the help of JF. JF conceived of the scope of this study and interpreted the results. Both authors were involved in data analyses and manuscript preparation. Both authors read and approved the final manuscript. The authors declared that they have no competing interests.

Corresponding author

Correspondence to Janice Fon.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

The glmer function in the lme4 package of R (R Core Team 2016) was used to perform the linear mixed-effects (LME) analyses in this study. Due to the small data size, separate analyses were done for the gender effect (Appendix 1) and the proficiency effect (Appendix 2).

1.1 Appendix 1

LME analyses on the gender effect

As shown in Fig. 3, there was potentially a three-way interaction among vowel (rounded/unrounded), gender (male/female), and dialect (Chiang/Mix). As the size of the dataset makes it difficult for models containing higher-level interactions to converge, separate models were run on rounded and unrounded vowel environments. For the rounded environment, gender and dialect (without the interaction term) were entered into the model as fixed effects, and the intercept for subjects was entered as a random effect. The dependent variable was collapsed into a binary distinction of older (i.e., [Z] and [G], coded 0) and newer variants (i.e., [L], [D], and [R], coded 1). Results showed that none of the fixed effects was significant (Table 7). This was likely due to the fact that there were only three tokens of old realizations for female speakers, even though they were from multiple speakers.

Table 7 Model summary of the effect of gender on realizations in the rounded environment
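The binary coding of the dependent variable described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' R workflow; the function names and the toy tokens are hypothetical and are not data from the study. Only the variant labels and the older/newer grouping come from the text.

```python
from collections import Counter

# Variant labels from the paper; the grouping follows Appendix 1.
OLDER = {"Z", "G"}        # older variants, coded 0
NEWER = {"L", "D", "R"}   # newer variants, coded 1


def code_variant(label):
    """Return 0 for an older variant and 1 for a newer one."""
    if label in OLDER:
        return 0
    if label in NEWER:
        return 1
    raise ValueError(f"unknown variant label: {label!r}")


def tally(tokens):
    """Count coded outcomes per (gender, dialect, code) cell.

    tokens: iterable of (subject, gender, dialect, variant) tuples.
    """
    counts = Counter()
    for _subject, gender, dialect, variant in tokens:
        counts[(gender, dialect, code_variant(variant))] += 1
    return counts


# Hypothetical toy tokens for illustration only, NOT data from the study.
toy_tokens = [
    ("s01", "male", "Chiang", "Z"),
    ("s01", "male", "Chiang", "L"),
    ("s02", "female", "Mix", "L"),
    ("s02", "female", "Mix", "R"),
    ("s03", "female", "Chiang", "G"),
]
toy_counts = tally(toy_tokens)
```

A tally of this kind makes sparse cells easy to spot, such as the very small number of older realizations among female speakers noted above, before any model is fitted.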

For the unrounded environment, gender and dialect, along with their interaction term, were entered into the model as fixed effects, and the intercept for subjects was entered as a random effect. Results showed that the interaction effect was near-significant (p = .10; Table 8). Unlike Chiang males, who used more older forms than their female counterparts, Mix males instead used more newer variants. This was consistent with the results found using chi-square analyses.

Table 8 Model summary of the effect of gender on realizations in the unrounded environment
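For readers who want the two gender models of Appendix 1 in explicit form, they can be written as random-intercept models, as sketched below. The binomial family with a logit link is an assumption based on the binary coding of the dependent variable; the source names only the glmer function and does not state the family or link. The symbols are illustrative: y_{ij} is token i from subject j (1 = newer variant, 0 = older variant), and u_j is the by-subject random intercept.

```latex
\begin{align}
\operatorname{logit}\Pr(y_{ij}=1) &= \beta_0 + \beta_1\,\mathrm{gender}_j + \beta_2\,\mathrm{dialect}_j + u_j
  && \text{(rounded environment)} \\
\operatorname{logit}\Pr(y_{ij}=1) &= \beta_0 + \beta_1\,\mathrm{gender}_j + \beta_2\,\mathrm{dialect}_j
  + \beta_3\,(\mathrm{gender}_j \times \mathrm{dialect}_j) + u_j
  && \text{(unrounded environment)} \\
u_j &\sim \mathcal{N}(0, \sigma_u^2)
\end{align}
```

Under this reading, the near-significant interaction reported for the unrounded environment corresponds to the coefficient \beta_3, and the additive model for the rounded environment simply omits that term.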

1.2 Appendix 2

LME analyses on the proficiency effect

As shown in Figs. 4 and 5, the effect of proficiency was even more complex and involved more higher-level interactions. To arrive at converging models, the data were split into three subsets (Chiang males, Mix males, and females), and separate analyses were run on each so as to focus on the potential proficiency effect. For Chiang males, vowel and proficiency, along with their interaction term, were entered into the model as fixed effects, and the intercept for subjects was entered as a random effect. Results showed that both the main effect of vowel and the interaction effect were (near-) significant (Table 9). Older realizations were more likely to be retained in the unrounded environment. Mid-level speakers were more likely to adopt newer variants in the unrounded environment than their high-level counterparts but were less likely to do so in the rounded environment. This was consistent with the findings in the chi-square analyses.

Table 9 Model summary of the effect of proficiency on the realizations of Chiang males

For Mix males, an analogous model was built to examine the effects. Results showed that none of the effects were significant (Table 10). A closer examination of the data showed that this was likely due to the small size of the data, as there were only two talkers in each speaker group, and [Z] was always found in only one of the talkers within each group. Inclusion of more speakers would be needed in future studies in order to see whether this is merely a sampling error or whether there is indeed much heterogeneity within this speaker group.

Table 10 Model summary of the effect of proficiency on the realizations of Mix males

For females, since Fig. 5 seemed to indicate a similar trend between the two dialects, speakers from both dialects were collapsed together for model testing. Vowel, dialect, and proficiency (without the interaction term) were entered into the model as fixed effects, and the intercept for subjects was entered as a random effect. Results showed that the main effect of proficiency was near-significant (Table 11). Mid-level speakers were more likely to use older forms than high-level speakers. This was consistent with the chi-square results.

Table 11 Model summary of the effect of proficiency on the realizations of females

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Chuang, YY., Fon, J. The effect of speaker gender and talker proficiency on the realization of Taiwan Min /dz/ among young speakers. lingua. sin. 4, 1 (2018). https://doi.org/10.1186/s40655-017-0033-4

  • DOI: https://doi.org/10.1186/s40655-017-0033-4
