Title | Some Studies on Min-Nan Speech Processing |
---|---|
Authors | Kuo, Wei-chih; Ho, Chen-chung; Zhong, Xiang-rui; Liang, Zhen-feng; Yu, Hsiu-min; Wang, Yih-ru; Chen, Sin-horng |
Journal | International Journal of Computational Linguistics & Chinese Language Processing |
Publication date | 2007-12 |
Volume/Issue | 12:4, 2007.12 [Minguo 96.12] |
Pages | 391-410 |
Classification number | 312.85 |
Language | eng |
Keywords | Min-Nan text-to-speech system; Speech recognition; Model-based tone labeling |
English abstract | In this paper, three studies of Min-Nan speech processing are presented. The first study concerns the implementation of a high-performance Min-Nan TTS system. Using the waveform templates of 877 base-syllables as basic synthesis units, and applying the RNN-based prosody generation method and the PSOLA algorithm for prosody modification, this Min-Nan TTS system can convert texts, represented in both Han-Luo (漢羅) and Chinese logographic writing systems, into natural Min-Nan speech. An informal, subjective listening test confirms that the system performs well and that the synthetic speech sounds natural for well-tokenized Min-Nan texts and for automatically tokenized Chinese logographic texts. The second investigation concerns the realization of a Min-Nan speech recognizer. It adopts the initial-final-based HMM approach with a simple base-syllable bigram language model. A base-syllable recognition rate of 65.1% has been achieved. Finally, a model-based tone labeling method is presented. This method adopts a statistical model to eliminate the effects of all factors other than tone on the syllable pitch contour for automatic tone labeling. Experimental results confirm that this method outperforms the conventional VQ-based approach. |
The abstract information in this system is based primarily on the abstract of the journal article.