Title | 銅活字中文辨識之序列比對演算法研究 = A Study of Sequence Alignment Algorithms on Movable Copper Type Chinese Recognition |
---|---|
Authors | 侯玉松; 吳德玲; 林世勇; 謝智偉; 林婉婷 |
Journal | 長庚科技學刊 |
Volume/Issue | 7, 2007.12 [ROC year 96.12] |
Pages | 123-137 |
Classification No. | 312.84 |
Keywords | 古籍數位化; 光學中文辨識; 廣域比對; 生物資訊學; Digitization of ancient books; Chinese optical character recognition; Global alignment; Bioinformatics |
Language | Chinese |
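Chinese abstract (translated) | The volume of ancient Chinese documents is enormous, so digitizing ancient books is a key project of the National Central Library. Existing Chinese optical character recognition software struggles to match text in ancient books printed with movable copper type, which makes the problem worth studying. Following the principle of gene sequence alignment, the method compresses the data according to the characteristics of movable copper type and handles the 0/1 sequences converted from the type characters through insertion, substitution, and deletion. Methods for comparing similar DNA sequences, such as global alignment, can then be applied to measure the similarity between the 0/1 sequences of two characters and thereby recognize them. This paper studies the feasibility of applying DNA sequence alignment methods from bioinformatics to searching and matching against a character library, and examines recognition accuracy and execution speed. The experimental results show that the proposed algorithm reaches a recognition accuracy of about 90%, indicating that it can be applied to movable copper type recognition. |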
English abstract | Digitization of ancient books is an important project of the National Central Library, since ancient Chinese documents are huge in quantity. Nevertheless, ancient books printed with movable copper type cannot be recognized effectively by current optical character recognition systems, so the problem is important as a research topic. We use sequence alignment algorithms to recognize Chinese characters printed with movable copper type. First, the image of a Chinese character is translated into a 0/1 sequence in row-major order; two characters can then be compared using methods for comparing DNA sequences, such as the global alignment method. An unknown character can therefore be recognized by obtaining the similarity between two characters. We study the application of DNA sequence comparison methods from bioinformatics to searching character libraries, and analyze the correct rate and execution time. In the experimental results, the correct recognition rate of our algorithm was about 90%. Our algorithm could therefore be applied effectively to movable copper type Chinese recognition in the future. |
The Chinese and English abstracts in this system are taken from the content published with each article.
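The abstracts describe flattening a character image into a row-major 0/1 sequence and comparing sequences by global alignment. The following is a minimal sketch of that idea, not the authors' implementation: the abstract does not give the paper's compression step or scoring scheme, so the scores (`match=1`, `mismatch=-1`, `gap=-1`) and the names `to_row_major`, `global_alignment_score`, and `recognize` are assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Bitmap = List[List[int]]  # binary glyph image: rows of 0/1 pixels


def to_row_major(bitmap: Bitmap) -> List[int]:
    """Flatten a binary glyph image into one 0/1 sequence, row by row."""
    return [pixel for row in bitmap for pixel in row]


def global_alignment_score(
    a: Sequence[int],
    b: Sequence[int],
    match: int = 1,      # assumed score for equal symbols
    mismatch: int = -1,  # assumed score for a substitution
    gap: int = -1,       # assumed score for an insertion or deletion
) -> int:
    """Needleman-Wunsch global alignment score, keeping two DP rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]


def recognize(unknown: Bitmap, templates: Dict[str, Bitmap]) -> str:
    """Return the label of the template whose sequence aligns best."""
    seq = to_row_major(unknown)
    return max(
        templates,
        key=lambda label: global_alignment_score(seq, to_row_major(templates[label])),
    )
```

With these assumed scores, two identical sequences score their own length, and each substitution, insertion, or deletion lowers the score by comparison with a match. A real character library would use much larger bitmaps, where the quadratic cost of the alignment is what makes the execution time mentioned in the abstract a concern.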