Title | Penalized Q-Learning for Dynamic Treatment Regimens |
---|---|
Authors | Song, Rui; Wang, Weiwei; Zeng, Donglin; Kosorok, Michael R. |
Journal | Statistica Sinica |
Volume/Issue | 25:3, 2015.07 [ROC year 104.07] |
Pages | 901-920 |
Classification No. | 410.028 |
Keywords | Dynamic treatment regimen; Individual selection; Multi-stage; Penalized Q-learning; Q-learning; Shrinkage; Two-stage procedure |
Language | English |
Abstract (English) | A dynamic treatment regime effectively incorporates both accrued information and long-term effects of treatment from specially designed clinical trials. As these become more and more popular in conjunction with longitudinal data from clinical studies, the development of statistical inference for optimal dynamic treatment regimes is a high priority. This is very challenging due to the difficulties arising from non-regularities in the treatment effect parameters. In this paper, we propose a new reinforcement learning framework called penalized Q-learning (PQ-learning), under which the non-regularities can be resolved and valid statistical inference established. We also propose a new statistical procedure---individual selection---and corresponding methods for incorporating individual selection within PQ-learning. Extensive numerical studies are presented which compare the proposed methods with existing methods, under a variety of non-regular scenarios, and demonstrate that the proposed approach is both inferentially and computationally superior. The proposed method is demonstrated with the data from a depression clinical trial study. |
The Chinese and English abstracts in this system are taken from each article as published.