Title | A Missing Data Treatment for Data Mining Applications in Medical Information Systems = 醫療資訊系統資料分析之遺漏值處理 |
---|---|
Authors | 廖上智; 李憶農 |
Journal | The Kaohsiung Journal of Medical Sciences |
Volume/Issue | 17:4, 2001.04 [ROC 90.04] |
Pages | 198-206 |
Classification No. | 419.21 |
Keywords | Medical information systems; Data analysis; Missing values; Missing data; Data mining; Classification; Discriminant analysis; Neural networks |
Language | English |
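Chinese Abstract | When part of the data in a medical information system is extracted for analysis, missing values are common: each patient's course of diagnosis and treatment differs, so the data content is not standardized. The quantitative analysis software most widely available on the market either drops every record that contains a missing value, which loses much valuable data, or applies a simple replacement method, which biases the results. This study proposes a simple missing-value treatment technique and tests it with two quantitative analysis methods built on different theoretical foundations: statistics and neural networks (artificial intelligence). The results show that the technique outperforms the simple replacement method (classification rate of the discriminant technique: 0.997 versus 0.974) and requires no complicated calculation. It can therefore serve all potential users of medical information systems, including hospital administrators, clinicians, and researchers, even those unfamiliar with advanced quantitative analysis methods, allowing them to obtain usable data in a short time and thereby advancing the development of medical information systems. |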
English Abstract | Missing data that result from an auto-stored medical information system should be handled with user-friendly, easily operated, and accessible tools that satisfy general users from different disciplines (e.g., statistics and machine learning) and, in turn, support medical information system development. This study develops a new logic separation inference method and applies it to a database formatted like most real-world medical records, which contain many missing data and miscellaneous variables. The method is expected to perform better than currently accessible methods. The newly developed logic separation inference method shows a classification power of 0.997 (the elimination method scores 1), which is better than the simple replacing method (replacement by the mode scores 0.974). Both inference methods (mode and mean) have classification power superior to that of the simple replacing method. The missing data treatment processes introduced in this study can be completed on an MS Excel spreadsheet without any complicated calculation, so they can satisfy general users. The new method was applied only to data with up to 60% missing values (missing at random). However, when a large amount of data is available, the method is expected to be applicable to a database with more than 60% missing values as well. |
The Chinese and English abstract information in this system is taken from the published content of each article.
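
The abstract compares its method against two baselines: the elimination method, which drops incomplete records, and the simple replacing method, which fills gaps with the mode or mean. The following is a minimal sketch of those two baselines only, assuming hypothetical toy records and the illustrative helper names `eliminate` and `replace_simple`. The abstract does not specify the logic separation inference method, so it is not reproduced here.

```python
from statistics import mean, mode

# Hypothetical toy records (not from the study); None marks a missing value.
records = [
    {"age": 54, "sex": "F", "smoker": "no"},
    {"age": None, "sex": "M", "smoker": "yes"},
    {"age": 61, "sex": "M", "smoker": None},
    {"age": 47, "sex": None, "smoker": "no"},
    {"age": 61, "sex": "M", "smoker": "no"},
]


def eliminate(rows):
    """Elimination (listwise deletion): keep only rows with no missing field."""
    return [r for r in rows if all(v is not None for v in r.values())]


def replace_simple(rows, numeric=("age",)):
    """Simple replacement: fill gaps with the column mean (numeric columns)
    or the column mode (all other columns), computed from observed values."""
    fills = {}
    for col in rows[0]:
        observed = [r[col] for r in rows if r[col] is not None]
        fills[col] = mean(observed) if col in numeric else mode(observed)
    return [
        {col: (fills[col] if val is None else val) for col, val in r.items()}
        for r in rows
    ]


complete = eliminate(records)     # only the fully observed records survive
filled = replace_simple(records)  # every record survives, gaps are filled
```

In this toy set, elimination keeps two of the five records, which illustrates the data loss the abstract describes, while replacement keeps all five at the cost of inserting values that were never observed.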