Review Article
Validation, updating and impact of clinical prediction rules: A review
Introduction
Prediction rules or prediction models, often also referred to as decision rules or risk scores, combine multiple predictors, such as patient characteristics, test results, and other disease characteristics, to estimate the probability that a certain outcome is present (diagnosis) or will occur (prognosis) in an individual patient. They are intended to aid physicians in making medical decisions and in informing patients. Table 1 shows an example of a prediction rule.
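To make the idea of a prediction rule concrete, the sketch below implements a hypothetical points-based diagnostic rule; the predictors, point values, and score-to-probability table are invented for illustration and are not taken from Table 1.

```python
# Hypothetical points-based diagnostic rule (illustrative only; the point
# values and the score-to-probability mapping are invented, not from Table 1).
# Each predictor contributes points; the total score maps to an estimated
# probability that the outcome (e.g., a diagnosis) is present.

POINTS = {
    "age_over_60": 1,      # patient characteristic
    "abnormal_test": 2,    # test result
    "symptom_present": 1,  # disease characteristic
}

# Invented mapping from total score to predicted probability.
SCORE_TO_PROB = {0: 0.05, 1: 0.15, 2: 0.35, 3: 0.60, 4: 0.80}

def predict(patient: dict) -> float:
    """Sum the points for the predictors present and look up the probability."""
    score = sum(pts for name, pts in POINTS.items() if patient.get(name))
    return SCORE_TO_PROB[score]

# Example: a patient over 60 with an abnormal test result scores 1 + 2 = 3 points.
p = predict({"age_over_60": True, "abnormal_test": True})  # -> 0.60
```

Published rules of this form differ only in which predictors they include and how the points were derived (usually from rounded regression coefficients).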
In multivariable prediction research, the literature often distinguishes three phases: (1) development of the prediction rule; (2) external validation of the prediction rule (further referred to as "validation"), that is, testing the rule's accuracy, and thus generalizability, in data that were not used for developing the rule, with subsequent updating if validity is disappointing; and (3) studying the clinical impact of a rule on physicians' behavior and patient outcome (Table 2) [1], [2], [3], [4], [5]. A fourth phase of prediction research may be the actual implementation in daily practice of rules that have passed the first three phases [4]. A quick Medline search using a suggested search strategy [6] demonstrated that the number of scientific articles discussing prediction rules has more than doubled in the last decade: 6,744 published articles in 1995 compared to 15,662 in 2005. Strikingly, these are mainly papers concerning the development of prediction rules. A relatively small number concerns the validation of rules, and there are hardly any publications showing whether an implemented rule has impact on physicians' behavior or patient outcome [3], [4].
This lack of validation and impact studies is unfortunate, because accurate predictions—commonly expressed as good calibration (agreement between predicted probabilities and observed outcome frequencies) and good discrimination (ability to distinguish between patients with and without the outcome)—in the patients that were used to develop a rule are no guarantee of good predictions in new patients, let alone of the rule's use by physicians [1], [3], [4], [7], [8]. In fact, most prediction rules show reduced accuracy when validated in new patients [1], [3], [4], [7], [8]. There are two main possible reasons for this: (1) the rule was inadequately developed, and (2) there were (major) differences between the derivation and validation populations.
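The two accuracy measures named above can be computed directly from a set of predicted probabilities and observed outcomes. The following is a minimal sketch on invented toy data: calibration-in-the-large (one simple aspect of calibration) and the concordance (c) statistic, which for a binary outcome equals the area under the ROC curve.

```python
# Toy illustration of calibration and discrimination; the predicted
# probabilities and outcomes below are invented, not from any study.

def calibration_in_the_large(probs, outcomes):
    """Mean predicted probability minus observed outcome frequency.
    Values near 0 indicate good overall calibration."""
    return sum(probs) / len(probs) - sum(outcomes) / len(outcomes)

def c_statistic(probs, outcomes):
    """Discrimination: the probability that a randomly chosen patient with
    the outcome receives a higher predicted probability than a randomly
    chosen patient without it (ties count 0.5). Equals the ROC area."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
outcomes = [1, 1, 0, 1, 0, 0]
citl = calibration_in_the_large(probs, outcomes)  # mean prediction 0.55 vs prevalence 0.50
auc = c_statistic(probs, outcomes)                # 8 of 9 patient pairs concordant
```

A c statistic of 0.5 corresponds to predictions no better than chance; 1.0 to perfect discrimination.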
Many guidelines regarding the development of prediction rules have been published, covering the number of potential predictors in relation to the number of patients, methods for predictor selection, how to assign the weight per predictor, how to shrink the regression coefficients to prevent overfitting, and how to estimate a rule's optimism using so-called internal validation techniques such as bootstrapping [1], [2], [7], [8], [9], [10], [11], [12], [13], [14].
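The bootstrap internal validation mentioned above can be sketched generically: the apparent performance of a rule is reduced by the average optimism observed when the entire modelling procedure is repeated in bootstrap samples. Below, the "development" step is a deliberately overfitting one (picking the single best-looking of several noise predictors), which is what generates optimism; the data and procedure are invented for illustration.

```python
import random

def c_statistic(scores, outcomes):
    """Concordance statistic (ROC area); 0.5 if one class is absent."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        return 0.5
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def fit(X, y):
    """'Develop' a rule: pick the predictor column with the best apparent
    c statistic. This data-driven selection step causes optimism."""
    return max(range(len(X[0])),
               key=lambda j: c_statistic([row[j] for row in X], y))

def bootstrap_corrected_auc(X, y, n_boot=200, seed=1):
    """Apparent c statistic minus the average optimism across bootstrap
    repetitions of the full modelling procedure."""
    rng = random.Random(seed)
    apparent = c_statistic([row[fit(X, y)] for row in X], y)
    n, optimism = len(y), 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        j = fit(Xb, yb)                                       # refit in bootstrap sample
        auc_boot = c_statistic([row[j] for row in Xb], yb)    # tested where it was fitted
        auc_orig = c_statistic([row[j] for row in X], y)      # tested in original sample
        optimism += (auc_boot - auc_orig) / n_boot
    return apparent - optimism

# Pure-noise data: five predictors unrelated to the outcome.
rng = random.Random(0)
X = [[rng.random() for _ in range(5)] for _ in range(40)]
y = [rng.randrange(2) for _ in range(40)]
corrected = bootstrap_corrected_auc(X, y)
```

With noise predictors, the optimism-corrected c statistic falls back toward 0.5, whereas the apparent c statistic of the selected predictor does not.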
Compared to the literature on the development of prediction rules, the methodology for validating prediction rules and studying their impact is underappreciated [1], [4], [8]. This paper provides a short overview of the types of validation studies, of possible methods to improve or update a previously developed rule in case of disappointing accuracy in a validation study, and of important aspects of impact studies and of the implementation of prediction rules. We focus on prediction rules developed by logistic regression analysis, but the issues largely apply to prediction rules developed by other methods, such as Cox proportional hazards analysis or neural networks. The methodology applies to both diagnostic and prognostic prediction rules and is illustrated with examples from diagnostic and prognostic research.
Examples of disappointing accuracy of prediction rules
Even when internal validation techniques are applied to correct for overfitting and optimism, the accuracy of prediction rules can be substantially lower in new patients than in the patients of the development population. For example, the generalizability of an internally validated prediction rule for diagnosing a serious bacterial infection in children presenting with fever without apparent source was disappointing [15]. In the development study, the area under the …
Updating prediction rules
When a validation study shows disappointing results, researchers are often tempted to reject the rule and proceed directly to developing new rules with the data of the validation population only. However, although the original prediction rules usually have been developed with large data sets, validation studies are frequently conducted with much smaller patient samples. The redeveloped rules are thus also based on smaller samples. Furthermore, it would lead to many prediction rules for the same …
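One alternative to full redevelopment is to update only part of the original rule in the validation sample. The sketch below illustrates the simplest such updating method, re-estimating only the intercept so that the mean predicted probability matches the observed outcome frequency (recalibration-in-the-large), here solved by bisection on the logit scale; the predicted probabilities and outcomes are invented for illustration.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def intercept_update(probs, outcomes, lo=-5.0, hi=5.0):
    """Find the shift 'a' on the logit scale such that the mean of
    inv_logit(logit(p) + a) equals the observed outcome frequency.
    mean_pred(a) is increasing in a, so bisection converges."""
    target = sum(outcomes) / len(outcomes)
    def mean_pred(a):
        return sum(inv_logit(logit(p) + a) for p in probs) / len(probs)
    for _ in range(100):
        mid = (lo + hi) / 2
        if mean_pred(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# The original rule systematically overestimates risk in this validation sample:
probs = [0.8, 0.7, 0.6, 0.5, 0.4]   # mean prediction 0.6
outcomes = [1, 0, 1, 0, 0]          # observed frequency 0.4
shift = intercept_update(probs, outcomes)                 # negative shift
updated = [inv_logit(logit(p) + shift) for p in probs]    # recalibrated predictions
```

More extensive updating methods additionally re-estimate the calibration slope or individual predictor weights, using progressively more information from the validation sample.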
Impact analysis
To ascertain whether a validated diagnostic or prognostic prediction rule will actually be used by physicians, will change or direct physicians' decisions, and will improve clinically relevant process parameters (such as number of bed days, length of hospital stay, or time to diagnosis) or patient outcomes, or reduce costs, an impact study or impact analysis should be performed [3], [4]. In the ideal design of an impact study, physicians or care units are randomized to either the index …
Implementation of prediction rules
The more often a rule has been proven accurate in diverse populations, the more likely it is that the prediction rule can be successfully applied in practice [1], [4], [8]. Yet there are still reasons why a rule may not be as successful in daily practice.
First, physicians may feel that their often implicit estimate of a particular predicted probability is at least as good as the probability calculated with a prediction rule, and may therefore not use or follow the rule's predictions [3]. …
Final comments
We have given an overview of the types of validation studies, of methods to improve or update a previously developed diagnostic or prognostic prediction rule in case of disappointing accuracy in a validation study, and of aspects of impact studies and the implementation of prediction rules. A validated, and if necessary updated, rule may cautiously be applied to new patients who are similar to the patients in the development and validation populations. However, when the user has reasons to believe …
Acknowledgments
We gratefully acknowledge the support by The Netherlands Organization for Scientific Research (ZonMw 016.046.360; ZonMw 945-04-009).
References (60)
- et al. Internal and external validation of predictive models: a simulation study of bias and precision in small samples. J Clin Epidemiol (2003)
- et al. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol (2001)
- et al. External validation is necessary in prediction research: a clinical example. J Clin Epidemiol (2003)
- et al. Early mortality in coronary bypass surgery: the EuroSCORE versus The Society of Thoracic Surgeons risk algorithm. Ann Thorac Surg (2004)
- et al. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol (2005)
- et al. Diagnostic accuracy of D-dimer test for exclusion of venous thromboembolism: a systematic review. J Thromb Haemost (2007)
- Between iatrotropic stimulus and interiatric referral: the domain of primary care research. J Clin Epidemiol (2002)
- et al. Accuracy of clinical assessment of deep-vein thrombosis. Lancet (1995)
- et al. A study to develop clinical decision rules for the use of radiography in acute ankle injuries. Ann Emerg Med (1992)
- et al. Updating methods improved the performance of a clinical prediction model in new patients. J Clin Epidemiol (2008)
- Systematic reviews with individual patient data meta-analysis to evaluate diagnostic tests. Eur J Obstet Gynecol Reprod Biol
- What do we mean by validating a prognostic model? Stat Med
- Clinical prediction rules. A review and suggested modifications of methodological standards. JAMA
- Users' guides to the medical literature: XXII: how to use articles about clinical decision rules. Evidence-Based Medicine Working Group. JAMA
- Translating clinical research into clinical practice: impact of using prediction rules to make decisions. Ann Intern Med
- Clinical prediction rules. Applications and methodological standards. N Engl J Med
- Searching for clinical prediction rules in MEDLINE. J Am Med Inform Assoc
- Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med
- Assessing the generalizability of prognostic information. Ann Intern Med
- Regression, prediction and shrinkage. J R Stat Soc B
- A leisurely look at the bootstrap, the jackknife, and cross-validation. Am Stat
- Regression modelling strategies with applications to linear models, logistic regression, and survival analysis
- Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets. Stat Med
- European system for cardiac operative risk evaluation (EuroSCORE). Eur J Cardiothorac Surg
- Risk factors and outcome in European cardiac surgery: analysis of the EuroSCORE multinational database of 19030 patients. Eur J Cardiothorac Surg
- Risk stratification in heart surgery: comparison of six score systems. Eur J Cardiothorac Surg
- EuroSCORE: a systematic review of international performance. Eur J Cardiothorac Surg
- Risk stratification analysis of operative mortality in heart and thoracic aorta surgery: comparison between Parsonnet and EuroSCORE additive model. Eur J Cardiothorac Surg
- Logistic or additive EuroSCORE for high-risk patients? Eur J Cardiothorac Surg
- Validation of European System for Cardiac Operative Risk Evaluation (EuroSCORE) in North American cardiac surgery. Eur J Cardiothorac Surg