Title | Approaches to Enhancing the Performance of Information Retrieval Based on Semantic Web and Automated Tagging (以語意網與自動化標註提升資料檢索效能之研究) |
Author | 陳志達; | Journal | 南臺學報 |
Volume/Issue | 40:1 2015.03 [ROC 104.03] |
Pages | pp. 1-14 |
Classification No. | 028.7 |
Keywords | 語意網; 本體論; 語意標籤; 語意搜尋; 中心語主導原則; Semantic web; Ontology; Semantic tagging; Semantic search; Head-driven principle; |
Language | Chinese |
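Chinese Abstract | Convenient and widespread information technology allows all kinds of information to reach every corner of the world. However, the volume of information keeps growing, and most current search mechanisms still rely on keyword matching, so users spend a great deal of time and effort browsing and filtering web pages. When looking up local food and tourist attractions, for example, the sheer volume of data, semantically ambiguous keywords entered by the user, or an overly complicated query interface and page layout can lead to inaccurate information and results that mostly fail to meet the user's needs, leaving users to visit only a few well-known attractions. To address this problem, this study starts with data preprocessing: it uses semantic web and automated tagging techniques to build a standard data conversion mechanism that compensates for the fact that most data on the Internet is unstructured or semi-structured. A user-friendly query mechanism is another goal of this study, intended to improve query results and broaden the age range of users. To that end, a semantic query system that parses the meaning of natural-language sentences is applied to the user's input, rather than searching with keywords and Boolean operators alone. In addition, so that knowledge can be reused, the study not only builds an ontology of food and travel but also incorporates valuable web pages found on the Internet. To reduce storage costs, only semantic annotation documents and links to the original texts are stored, so that knowledge is reused rather than the volume of data simply being increased. |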
English Abstract | In an era of rapidly developing information technology, all kinds of information can easily be transmitted anywhere. Because the volume of information is growing rapidly, current information retrieval mechanisms based on keyword matching are relatively slow at finding relevant data on the web. For example, when users search for local sightseeing places and famous cuisine, they will probably need to spend considerable time going through an excessive amount of unwanted data. Inevitably, users will find only some of the information they actually need. To address this problem, this study introduces a system that uses data pre-processing, semantic web, and automatic tagging techniques to create an efficient data conversion mechanism. Most non-structured or semi-structured web information is processed before the semantic search takes place. In addition, this study is committed to providing a user-friendly query mechanism, with the aim of making search results more useful and broadening the age range of users. Furthermore, the system builds an ontology of tourism and stores valuable information in a knowledge base so that knowledge can be reused. To reduce storage costs, the system produces only semantic tagging files and stores links to the original text of valuable information. Knowledge obtained through the system will thus be useful and reusable, rather than blindly increasing the quantity of useless data clogging the Internet. |