| Title | A Maximum Entropy Approach for Semantic Language Modeling |
|---|---|
| Authors | Chueh, Chuang-hua; Wang, Hsin-min; Chien, Jen-tzung |
| Journal | International Journal of Computational Linguistics & Chinese Language Processing |
| Volume/Issue | 11:1 (Mar. 2006) |
| Pages | 37-55 |
| Classification No. | 312.13 |
| Keywords | Language modeling; Latent semantic analysis; Maximum entropy; Speech recognition |
| Language | English |
| Abstract | The conventional n-gram language model exploits only the immediate context of historical words without exploring long-distance semantic information. In this paper, we present a new information source extracted from latent semantic analysis (LSA) and adopt the maximum entropy (ME) principle to integrate it into an n-gram language model. With the ME approach, each information source serves as a set of constraints, which should be satisfied to estimate a hybrid statistical language model with maximum randomness. For comparative study, we also carry out knowledge integration via linear interpolation (LI). In the experiments on the TDT2 Chinese corpus, we find that the ME language model that combines the features of trigram and semantic information achieves a 17.9% perplexity reduction compared to the conventional trigram language model, and it outperforms the LI language model. Furthermore, in evaluation on a Mandarin speech recognition task, the ME and LI language models reduce the character error rate by 16.9% and 8.5%, respectively, over the bigram language model. |
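The linear interpolation (LI) baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the probability values, and the weight `lam` are all assumptions chosen for the example; the paper's actual models are trained on the TDT2 corpus.

```python
def interpolate(p_ngram, p_semantic, lam=0.7):
    """Linearly interpolate an n-gram probability with a semantic
    (e.g. LSA-based) probability for the same predicted word.

    lam is the interpolation weight on the n-gram model; the values
    here are illustrative, not taken from the paper."""
    return lam * p_ngram + (1.0 - lam) * p_semantic

# Combine a hypothetical trigram estimate with an LSA-based estimate.
p_combined = interpolate(0.02, 0.05, lam=0.7)
```

By contrast, the ME approach described in the abstract does not average the two distributions; it treats each information source as a set of expectation constraints and solves for the single maximum-entropy distribution satisfying all of them.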