Title | Building a Bracketed Corpus Using Φ² Statistics |
---|---|
Authors | Lee, Yue-shi; Chen, Hsin-hsi |
Journal | International Journal of Computational Linguistics & Chinese Language Processing |
Publication date | 19970800 |
Volume/Issue | 2:2, 1997.08 [Minguo 86.08] |
Pages | pp. 1-23 |
Classification number | 310.153 |
Language | eng |
Keywords | Natural language applications; Computer language; Bracketed corpus; Probabilistic chunkers; Treebank; Φ² statistics |
English abstract | Research based on treebanks is ongoing for many natural language applications. However, the work involved in building a large-scale treebank is laborious and time-consuming. Thus, speeding up the process of building a treebank has become an important task. This paper proposes two versions of probabilistic chunkers to aid the development of a bracketed corpus. The basic version partitions part-of-speech sequences into chunk sequences, which form a partially bracketed corpus. Applying the chunking action recursively, the recursive version generates a fully bracketed corpus. Rather than using a treebank as a training corpus, a corpus, which is tagged with part-of-speech information only, is used. The experimental results show that the probabilistic chunker has a correct rate of more than 94% in producing a partially bracketed corpus and also gives very encouraging results in generating a fully bracketed corpus. These two versions of chunkers are simple but effective and can also be applied to many natural language applications. |
The abstract information in this system is based primarily on the abstract of the journal article itself.