Related literature
- 基於VQ/HMM之國語文句翻語音中音節音長與振幅參數產生之方法
- A New Training Method for Speech Recognition
- 隱藏式馬可夫模型應用在以腦電波信號為根據之睡眠自動判讀
- 應用隱藏式馬可夫模型於有考慮雜訊的電力品質干擾事件之辨識
- Vector Quantization of Images with Codeword-Rotation Algorithm
- 以連續型隱藏式馬可夫模型來計算中文簽名之動態相似度值
- A High Fidelity Image Coding Using VQ-BTC
- An Overview of RNN-Based Mandarin Speech Recognition Approaches
- Modified Search Order Coding for Vector Quantization Indexes
- 一個適用於近似週期信號的新自適性向量量化法及其在心電圖資料壓縮上的應用
Title | 基於VQ/HMM之國語文句翻語音中音節音長與振幅參數產生之方法=A VQ/HMM Based Method to Generate Syllable Duration and Amplitude Parameters for Mandarin Text-to-speech |
---|---|
Authors | 古鴻炎; 簡敏昌 | Journal | 電腦學刊 |
Volume/Issue | 13:3 2001.09 [ROC 90.09] |
Pages | pp. 21-30 |
Classification no. | 312.23 |
Keywords | 文句翻語音; 韻律參數; 音節音長; 音節振幅; 向量量化; 隱藏式馬可夫模型; Text-to-speech; Prosodic parameters; Syllable duration; Syllable amplitude; Vector quantization; Hidden Markov model |
Language | Chinese |
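Chinese abstract | This paper proposes a method that uses vector quantization (VQ) classification and hidden Markov models (HMM) to build separate models for generating the duration and amplitude values of Mandarin syllables, for use in a Mandarin text-to-speech system. The two models are together called the duration and amplitude hidden Markov model (DA-HMM). For the training phase, we also propose two effective approaches to normalizing the syllable durations and amplitudes of the training sentences. Codebooks for quantization classification are then built, and the classification results of adjacent syllables are combined to form the observation symbol sequence for each training sentence, which is used to train the DA-HMM. In the synthesis phase, a state transition sequence is determined from breath-group and word-boundary information; the auxiliary parameters of the DA-HMM are then looked up according to the state and the observation symbol of each character, which yields the duration and amplitude value of each syllable in the sentence to be synthesized. Test experiments show that in the inside test the DA-HMM's average prediction errors for syllable duration and amplitude are 22 ms and 1.1 dB respectively, while in the outside test they are 44 ms and 2.3 dB. In addition, subjective listening evaluations show that the DA-HMM does improve the naturalness of the synthesized speech. |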
English abstract | In this paper, a method using vector-quantization (VQ) classification and hidden Markov models (HMM) is proposed to model Mandarin syllable duration and amplitude respectively. These two models are to be used in a Mandarin text-to-speech system, and are together called DA-HMM. In the training phase, two effective methods are also developed to normalize the durations and amplitudes of the syllables comprising each training sentence. Then, codebooks for quantization classification are constructed. After classification, the VQ codes of adjacent syllables in a training sentence are combined to form the observation symbol sequence for DA-HMM training. In the synthesis phase, breath-group and word-boundary information is used to determine a state transition sequence. Then, according to the states and encoded observation symbols, the duration and amplitude parameters for each syllable of a sentence to be synthesized can be looked up from the auxiliary parameters of DA-HMM. The results of test experiments show that in the inside test, the average prediction errors of a syllable's duration and amplitude are 22 ms and 1.1 dB respectively, and in the outside test, the average prediction errors are 44 ms and 2.3 dB respectively. In addition, subjective tests also show that DA-HMM can indeed improve the naturalness of the synthesized speech. |
The Chinese and English abstracts in this system are taken from each article as published.