TY - JOUR
T1 - Impact of Integrating Machine Learning in Comparative Effectiveness Research of Oral Anticoagulants in Patients with Atrial Fibrillation
AU - Han, Sola
AU - Suh, Hae Sun
N1 - Publisher Copyright: © 2022 by the authors.
PY - 2022/10
Y1 - 2022/10
AB - We aimed to compare the ability to balance baseline covariates and explore the impact of residual confounding between conventional and machine learning approaches to derive propensity scores (PS). The Health Insurance Review and Assessment Service database (January 2012–September 2019) was used. Patients with atrial fibrillation (AF) who initiated oral anticoagulants during July 2015–September 2018 were included. The outcome of interest was stroke/systemic embolism. To estimate PS, we used a logistic regression model (i.e., a conventional approach) and a generalized boosted model (GBM), which is a machine learning approach. Both PS matching and inverse probability of treatment weighting were performed. To evaluate balance achievement, standardized differences, p-values, and boxplots were used. To explore residual confounding, E-values and negative control outcomes were used. In total, 129,434 patients were identified. Although all baseline covariates were well balanced, the distribution of continuous variables seemed more similar when GBM was applied. E-values ranged between 1.75 and 2.70 and were generally higher in GBM. In the negative control outcome analysis, slightly more nonsignificant hazard ratios were observed in GBM. We showed GBM provided a better ability to balance covariates and had a lower impact of residual confounding, compared with the conventional approach in the empirical example of comparative effectiveness analysis.
KW - atrial fibrillation
KW - comparative effectiveness research
KW - machine learning
KW - propensity score
UR - http://www.scopus.com/inward/record.url?scp=85139980370&partnerID=8YFLogxK
U2 - 10.3390/ijerph191912916
DO - 10.3390/ijerph191912916
M3 - Article
C2 - 36232216
AN - SCOPUS:85139980370
SN - 1661-7827
VL - 19
JO - International Journal of Environmental Research and Public Health
JF - International Journal of Environmental Research and Public Health
IS - 19
M1 - 12916
ER -