Title | Nonparametric Regression Estimation When the Response Values Have Erroneous Expectations (當反應變數值具有錯誤期望值時候的無母數迴歸估計) |
---|---|
Author | 黃景祿 |
Journal | 中國統計學報 (Journal of the Chinese Statistical Association) |
Volume/Issue | 33:4, 1995.12 [Minguo 84.12] |
Pages | 465-478 |
Classification No. | 319.51 |
Keywords | Nonparametric regression (無母數迴歸分析); Double sampling scheme (雙重取樣方法); Kernel estimator (核估計量) |
Language | Chinese |
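Chinese Abstract (English translation) | For nonparametric regression under random sampling, this study examines how well the kernel estimator estimates the population regression function when the response values have erroneous expectations. Such cases arise, for example, when respondents tend to understate their income or their alcohol consumption, or to overstate their living expenses or the severity of their injuries in a car accident. In these situations, the expectation of the response actually observed does not equal the true expectation of the response. Applying the kernel estimator rashly to such erroneous samples clearly yields a biased estimator of the population regression function. We study how large this bias is, and we also consider how to correct it. To this end, we use a double sampling scheme. From a main sample of n units, k units are drawn at random, 1 ≤ k ≤ n. These k subsample units are measured in detail to obtain responses with correct expectations. Using the responses with correct and with incorrect expectations, both of which are available on the k subsample units, we estimate the degree of error of the responses in the main sample and correct those responses accordingly. Applying the kernel estimator to the corrected responses yields an asymptotically unbiased estimator of the population regression function, and this estimator is more efficient than the one that uses only the correct responses on the k subsample units, because the former also uses the information provided by the main sample. |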
English Abstract | For random design nonparametric regression, the performance of the kernel estimator is investigated in the case where the responses have erroneous expectations. This case occurs, for example, in surveys of personal income, alcohol consumption behavior, and injuries where litigation is possible. In this situation, it is easy to see that the resulting kernel estimator is biased for the true regression function. This study has two objectives: to quantify the bias, and to adjust for it. To correct for the bias, the double sampling scheme is considered. At the first stage, a main sample of n units is measured by the fallible device. At the second stage, a validation subsample of k units is randomly drawn from the main sample and measured by a true device, where 1<k<n. Using both sets of measurements obtained by the fallible and true devices, the mismeasurement of the responses in the main sample is estimated, and the responses in the main sample are corrected accordingly. Applying the kernel estimator to the corrected data yields an asymptotically unbiased estimator of the true regression function. We show that the resulting estimator is more efficient than the one using only the responses with correct expectations in the validation subsample. This is because the former uses the information provided by both the main sample and the validation subsample, whereas the latter uses only the information contained in the validation subsample. |
The Chinese and English abstracts in this system are taken from the content as published in each article.
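
The double sampling idea described in the abstracts can be illustrated with a small simulation. The sketch below is not the author's estimator: the abstracts do not say how the mismeasurement is estimated, so it assumes an additive error whose mean is estimated by kernel-smoothing the differences (true minus fallible) over the validation subsample. The regression function, the size of the under-reporting, the Gaussian kernel, the bandwidth `h`, and all function names are hypothetical choices made for illustration only.

```python
import math
import random


def nw_estimate(x0, xs, ys, h):
    """Nadaraya-Watson kernel estimate at x0 (Gaussian kernel, bandwidth h)."""
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def corrected_responses(xs, ys_fallible, val_idx, ys_true_val, h):
    """Shift each fallible response by the estimated mean error at its x.

    Assumption (not from the source): the error is additive, and its mean
    at x is estimated by smoothing (true - fallible) over the validation units.
    """
    val_x = [xs[i] for i in val_idx]
    diffs = [yt - ys_fallible[i] for i, yt in zip(val_idx, ys_true_val)]
    return [y + nw_estimate(x, val_x, diffs, h) for x, y in zip(xs, ys_fallible)]


def simulate(n=400, k=80, h=0.08, seed=0):
    """Compare naive, validation-only, and corrected kernel estimates."""
    rng = random.Random(seed)

    def m(x):
        # Hypothetical true regression function.
        return math.sin(2 * math.pi * x)

    xs = [rng.random() for _ in range(n)]
    ys_true = [m(x) + rng.gauss(0, 0.2) for x in xs]
    # Hypothetical fallible device: under-reports by 0.5 on average.
    ys_fallible = [y - 0.5 + rng.gauss(0, 0.1) for y in ys_true]

    # Second stage: k units drawn at random and measured by the true device.
    val_idx = rng.sample(range(n), k)
    val_x = [xs[i] for i in val_idx]
    ys_true_val = [ys_true[i] for i in val_idx]

    ys_corr = corrected_responses(xs, ys_fallible, val_idx, ys_true_val, h)
    grid = [0.1 + 0.05 * j for j in range(17)]  # evaluation points 0.1 .. 0.9

    def mean_abs_err(px, py):
        return sum(abs(nw_estimate(g, px, py, h) - m(g)) for g in grid) / len(grid)

    return {
        "naive": mean_abs_err(xs, ys_fallible),
        "validation_only": mean_abs_err(val_x, ys_true_val),
        "corrected": mean_abs_err(xs, ys_corr),
    }
```

Calling `simulate()` returns the mean absolute error of each estimate over the grid. In this toy setup, the naive estimate is expected to carry most of the 0.5 under-reporting as bias, while the corrected estimate removes it. A single simulated run says nothing reliable about the efficiency comparison with the validation-only estimate, which the paper establishes analytically.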