| Title | 新聞事件偵測與追蹤之分群分類演算法研究 (A Study of Clustering and Classification Algorithms for News Event Detection and Tracking) |
|---|---|
| Authors | 黃純敏; 陳聰宜; 詹雅筑 |
| Journal | 資訊科技國際期刊 |
| Volume/Issue | 8:1, 2014.06 [民103.06] |
| Pages | 70-78 |
| Classification no. | 312.13 |
| Keywords | 事件偵測與追蹤; 中文斷詞; 分群; 分類; News event detection and tracking; Chinese term segmentation; Cluster; Classification |
| Language | Chinese |
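| Chinese abstract | Previous studies that used lexicon-based word segmentation for document clustering have mostly relied on CKIP for Chinese term segmentation. Because CKIP strictly limits the volume of data it accepts per transmission and tends to segment text too finely, such studies had to upload text in multiple batches and then filter and merge the segmentation results. This study uses parallel processing to compare CKIP with a self-developed Chinese segmentation system (Chinese Corpus Segmentation, CCS) combined with the National Central Library subject headings, as preprocessing for document clustering. The results confirm that using a professional lexicon improves clustering performance, with an event detection accuracy of 85%. In the event tracking experiments, three classification algorithms (SVM, KNN, and Naive Bayes) were evaluated; SVM performed best, with a classification accuracy of 91.33%. |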
| English abstract | Numerous studies relied on CKIP to process Chinese term segmentation as a preprocessing for cluster analysis. Due to its strict limitation of transmission volume and the need of further processing of term filtering and merging, this study adopted a professional corpus composed of subject headings along with a self-developed Chinese Corpus Segmentation (CCS). The results showed that CCS outperforms CKIP in terms of performance and term quality in processing cluster analysis with a high precision rate of 85%. Furthermore, in order to provide high quality news tracking results, we compared SVM, KNN, and Naïve Bayes with regard to the accuracy of classification result. Results showed that SVM was the best among the others, with a high precision rate of 92%. |
The Chinese and English abstracts in this system are taken from the content of each article as published.