Automatically Labeling Clinical Trial Outcomes: A Large-Scale Benchmark for Drug Development

Chufan Gao, Jathurshan Pradeepkumar, Trisha Das, Shivashankar Thati, Jimeng Sun

University of Illinois Urbana-Champaign



Abstract

Background The cost of drug discovery and development is substantial, with clinical trial outcomes playing a critical role in regulatory approval and patient care. However, access to large-scale, high-quality clinical trial outcome data remains limited, hindering advancements in predictive modeling and evidence-based decision-making.

Methods We present the Clinical Trial Outcome (CTO) benchmark, a fully reproducible, large-scale repository encompassing approximately 125,000 drug and biologics trials. CTO integrates large language model (LLM) interpretations of publications, trial phase progression tracking, sentiment analysis from news sources, stock price movements of trial sponsors, and additional trial-related metrics. Furthermore, we manually annotated a dataset of clinical trials conducted between 2020 and 2024 to enhance the quality and reliability of outcome labels.

Results The trial outcome labels in the CTO benchmark agree strongly with expert annotations, achieving an F1 score of 94 for Phase 3 trials and 91 across all phases. Additionally, benchmarking standard machine learning models on our manually annotated dataset revealed distribution shifts in recent trials, underscoring the necessity of continuously updated labeling approaches.

Conclusions By analyzing CTO's performance on recent clinical trials, we demonstrate the ongoing need for high-quality, up-to-date trial outcome labels. We publicly release the CTO knowledge base and annotated labels at https://chufangao.github.io/CTOD, with regular updates to support research on clinical trial outcomes and inform data-driven improvements in drug development.

Definition: Trial Success

Clinical trial outcomes are multifaceted and have diverse implications. These outcomes can involve meeting the primary endpoint as defined in the study, advancing to the next phase of the trial, obtaining regulatory approval, impacting the financial outcome for the sponsor (either positively or negatively), and influencing patient outcomes such as adverse events and trial dropouts.

Our paper defines the trial outcome as a binary indicator (0 for Failure and 1 for Success), indicating whether the trial achieves its primary endpoints and can progress to the next stage of drug development. For Phase 1 and Phase 2 trials, success may mean advancing to the next phase (from Phase 1 to Phase 2, or from Phase 2 to Phase 3). For Phase 3 trials, success is measured by regulatory approval.
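The binary definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TrialRecord` class and its field names are hypothetical and do not reflect the CTO schema, and the rule that a trial must both meet its primary endpoint and progress is one reading of the definition, not the labeling pipeline used in the paper. The small `f1_score` helper shows how agreement between automated labels and expert annotations could be measured.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TrialRecord:
    """Hypothetical trial record; field names are illustrative only."""
    phase: int                            # 1, 2, or 3
    met_primary_endpoint: bool
    advanced_to_next_phase: bool = False  # relevant for Phase 1 and 2
    regulatory_approval: bool = False     # relevant for Phase 3


def outcome_label(trial: TrialRecord) -> int:
    """Return 1 (Success) or 0 (Failure) under the assumed reading above."""
    if not trial.met_primary_endpoint:
        return 0
    if trial.phase in (1, 2):
        return int(trial.advanced_to_next_phase)
    if trial.phase == 3:
        return int(trial.regulatory_approval)
    raise ValueError(f"unsupported phase: {trial.phase}")


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive (Success) class, as a fraction between 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up trials, purely for illustration.
trials = [
    TrialRecord(phase=1, met_primary_endpoint=True, advanced_to_next_phase=True),
    TrialRecord(phase=2, met_primary_endpoint=True, advanced_to_next_phase=False),
    TrialRecord(phase=3, met_primary_endpoint=True, regulatory_approval=True),
    TrialRecord(phase=3, met_primary_endpoint=False),
]
auto_labels = [outcome_label(t) for t in trials]

# Made-up expert annotations to compare against.
expert_labels = [1, 1, 1, 0]
agreement = f1_score(expert_labels, auto_labels)
```

In practice the inputs to such a rule would come from the signals the benchmark aggregates (publications, phase progression, news, and so on), which this sketch does not model.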

Dataset Viewer

Example Views

Below is a preview of the full raw dataset. The full dataset and its descriptions can be accessed here.

Usage Instructions

Citation

@article{gao2024automatically,
  title={Automatically Labeling Clinical Trial Outcomes: A Large-Scale Benchmark for Drug Development},
  author={Gao, Chufan and Pradeepkumar, Jathurshan and Das, Trisha and Thati, Shivashankar and Sun, Jimeng},
  journal={arXiv preprint arXiv:2406.10292},
  year={2024}
}

Other Material and Related Work

Special Thanks

License

The dataset is licensed under the MIT license.