Title | Some Studies on Min-Nan Speech Processing |
---|---|
Authors | Kuo, Wei-chih; Ho, Chen-chung; Zhong, Xiang-rui; Liang, Zhen-feng; Yu, Hsiu-min; Wang, Yih-ru; Chen, Sin-horng |
Journal | International Journal of Computational Linguistics & Chinese Language Processing |
Volume:Issue | 12:4, 2007.12 [Minguo 96.12] |
Pages | pp. 391-410 |
Classification number | 312.85 |
Keywords | Min-Nan text-to-speech system; Speech recognition; Model-based tone labeling |
Language | English |
English abstract | This paper presents three studies of Min-Nan speech processing. The first concerns the implementation of a high-performance Min-Nan TTS system. Using the waveform templates of 877 base-syllables as basic synthesis units, together with an RNN-based prosody generation method and the PSOLA algorithm for prosody modification, the system converts texts, represented in both the Han-Luo (漢羅) and Chinese logographic writing systems, into natural Min-Nan speech. An informal subjective listening test confirms that the system performs well and that the synthetic speech sounds natural for well-tokenized Min-Nan texts and for automatically tokenized Chinese logographic texts. The second study concerns the realization of a Min-Nan speech recognizer. It adopts an initial-final-based HMM approach with a simple base-syllable bigram language model and achieves a base-syllable recognition rate of 65.1%. Finally, a model-based tone labeling method is presented. This method uses a statistical model to eliminate the effects of all factors other than tone on the syllable pitch contour for automatic tone labeling. Experimental results confirm that it outperforms the conventional VQ-based approach. |
The Chinese and English abstracts in this system are taken from the content of each published article.