June 6, 2018

Leveraging historical data for high-dimensional regression adjustment, a machine learning approach.


Scientific presentation given in June 2018 at the Promoting Statistical Insight Conference, London, United Kingdom.

The amount of data collected from patients involved in clinical trials is continuously growing. All of these patient characteristics are potential covariates that could be used to improve study analysis and power. At the same time, the development of computerized systems simplifies access to huge amounts of historical data. However, it is still difficult to leverage such big data when dealing with small clinical trials, such as those in Phases I and II, because their restricted number of patients limits the number of covariates that can be included in the analysis. The purpose of this talk is to present how machine learning can overcome this problem by taking advantage of historical data with larger sample sizes. Our approach is to pre-specify the combination of the baseline covariates by building a “meta-covariate”. In small studies, using this meta-covariate alone limits the loss of degrees of freedom while making the best use of all generated data. Fitting the covariates on independent data has two advantages: it frees the modeling from the study constraints, and it limits the risk of overfitting. These advantages are of particular interest with complex data, i.e., data with non-normal distributions or non-linearities. Our experiments show that the gain in power over standard approaches increases as the number of covariates increases or as the study sample size decreases. Using simulated data, we also analyze the benefits of this methodology when the historical data are not representative of the study of interest. Finally, we put the approach in perspective with the regulatory guideline on the use of adjustment for baseline covariates.
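The meta-covariate idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the simulated data, the ridge model, the sample sizes, and every function name are assumptions chosen for illustration. A model is fitted on a large historical data set, its coefficients are frozen, and its prediction is then used as the single adjustment covariate in a small trial.

```python
import random

random.seed(0)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ridge(X, y, lam=1.0):
    """Ridge regression coefficients, intercept first (intercept unpenalized)."""
    Z = [[1.0] + row for row in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) + (lam if i == j and i > 0 else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(A, b)

def predict(coef, row):
    """Meta-covariate: the frozen model's prediction for one patient."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def treatment_effect(design, y):
    """OLS estimate and standard error of the coefficient in column 1."""
    n, p = len(design), len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(design, y))
    unit = [1.0 if i == 1 else 0.0 for i in range(p)]
    var_factor = solve(xtx, unit)[1]  # entry [1][1] of the inverse of X'X
    return beta[1], (rss / (n - p) * var_factor) ** 0.5

def simulate(n, p, effect=0.0):
    """Hypothetical data: outcome = sum of covariates + treatment effect + noise."""
    X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    treat = [i % 2 for i in range(n)]
    y = [sum(row) + effect * t + random.gauss(0, 1) for row, t in zip(X, treat)]
    return X, treat, y

P = 10  # number of baseline covariates (assumed)

# Step 1: fit the covariate model on historical data only (no treatment effect).
X_hist, _, y_hist = simulate(500, P)
coef = fit_ridge(X_hist, y_hist)  # frozen before the trial is analyzed

# Step 2: in the small trial, collapse all covariates into one meta-covariate.
X_trial, treat, y_trial = simulate(40, P, effect=1.0)
meta = [predict(coef, row) for row in X_trial]

# Step 3: compare the unadjusted analysis with adjustment on the meta-covariate.
est_unadj, se_unadj = treatment_effect([[1.0, t] for t in treat], y_trial)
est_adj, se_adj = treatment_effect(
    [[1.0, t, m] for t, m in zip(treat, meta)], y_trial)

print(f"unadjusted:     effect={est_unadj:.2f}, SE={se_unadj:.2f}")
print(f"meta-covariate: effect={est_adj:.2f}, SE={se_adj:.2f}")
```

In this sketch the adjusted analysis spends a single degree of freedom on the meta-covariate, whatever the value of `P`, which is the property the abstract highlights for small studies. In practice a numerical library and a more flexible learner would likely replace the hand-written ridge fit.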
Type: Scientific Presentation
Authors: Samuel Branders, PhD; Guillaume Bernard, PhD; Alvaro Pereira, PhD
Date: June 3, 2018
Conference: Promoting Statistical Insight Conference
