Automatically Labeling Clinical Trial Outcomes: A Large-Scale Benchmark for Drug Development

Chufan Gao, Jathurshan Pradeepkumar, Trisha Das, Shivashankar Thati, Jimeng Sun

University of Illinois Urbana-Champaign

Abstract

Background The cost of drug discovery and development is substantial, with clinical trial outcomes playing a critical role in regulatory approval and patient care. However, access to large-scale, high-quality clinical trial outcome data remains limited, hindering advancements in predictive modeling and evidence-based decision-making.

Methods We present the Clinical Trial Outcome (CTO) benchmark, a fully reproducible, large-scale repository encompassing approximately 125,000 drug and biologics trials. CTO integrates large language model (LLM) interpretations of publications, trial phase progression tracking, sentiment analysis from news sources, stock price movements of trial sponsors, and additional trial-related metrics. Furthermore, we manually annotated a dataset of clinical trials conducted between 2020 and 2024 to enhance the quality and reliability of outcome labels.

Results The trial outcome labels in the CTO benchmark agree strongly with expert annotations, achieving an F1 score of 94 for Phase 3 trials and 91 across all phases. Additionally, benchmarking standard machine learning models on our manually annotated dataset revealed distribution shifts in recent trials, underscoring the necessity of continuously updated labeling approaches.

Conclusions By analyzing CTO's performance on recent clinical trials, we demonstrate the ongoing need for high-quality, up-to-date trial outcome labels. We publicly release the CTO knowledge base and annotated labels at https://chufangao.github.io/CTOD, with regular updates to support research on clinical trial outcomes and inform data-driven improvements in drug development.

Definition: Trial Success

Clinical trial outcomes are multifaceted and have diverse implications. These outcomes can involve meeting the primary endpoint as defined in the study, advancing to the next phase of the trial, obtaining regulatory approval, impacting the financial outcome for the sponsor (either positively or negatively), and influencing patient outcomes such as adverse events and trial dropouts.

Our paper defines the trial outcome as a binary indicator (0 for Failure and 1 for Success), indicating whether the trial achieves its primary endpoints and can progress to the next stage of drug development. For Phase 1 and Phase 2 trials, success may mean moving to the next phase, such as from Phase 1 to Phase 2 or from Phase 2 to Phase 3. In Phase 3, success is measured by regulatory approval.
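As a rough illustration of this definition only (not of the CTO labeling pipeline, which aggregates the signals described in the abstract), the phase-dependent rule can be sketched as a small function. The function name and both boolean arguments are hypothetical and introduced here for illustration.

```python
def trial_outcome_label(phase: int,
                        advanced_to_next_phase: bool = False,
                        regulatory_approval: bool = False) -> int:
    """Return 1 (Success) or 0 (Failure) under the definition above.

    Hypothetical helper: the argument names are assumptions, not dataset fields.
    """
    if phase in (1, 2):
        # Phase 1 and 2: success means progressing to the next phase.
        return int(advanced_to_next_phase)
    if phase == 3:
        # Phase 3: success is measured by regulatory approval.
        return int(regulatory_approval)
    raise ValueError(f"unsupported phase: {phase}")


print(trial_outcome_label(2, advanced_to_next_phase=True))
print(trial_outcome_label(3, regulatory_approval=False))
```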

Dataset Viewer

Example Views

View the study dates and overall status of the human-labeled trials
```
SELECT
  nct_id,
  study_first_submitted_date,
  study_first_posted_date,
  completion_date,
  overall_status,
  labels
FROM
  human_labels
```
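The human labels can also be used to check agreement with predicted outcomes. Below is a minimal sketch that computes an F1 score with pandas, assuming the predictions carry `nct_id` and `pred_proba` columns, that `labels` is a 0/1 column, and that a 0.5 threshold turns probabilities into binary predictions. These are assumptions for illustration, the toy rows are made up, and this is not necessarily how the reported scores were computed.

```python
import pandas as pd


def f1_against_human_labels(preds: pd.DataFrame,
                            human: pd.DataFrame,
                            threshold: float = 0.5) -> float:
    """F1 of thresholded predictions against human labels, matched on nct_id."""
    merged = preds.merge(human, on="nct_id", how="inner")
    y_pred = (merged["pred_proba"] >= threshold).astype(int)
    y_true = merged["labels"].astype(int)
    tp = int(((y_pred == 1) & (y_true == 1)).sum())
    fp = int(((y_pred == 1) & (y_true == 0)).sum())
    fn = int(((y_pred == 0) & (y_true == 1)).sum())
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


# Made-up toy rows with placeholder identifiers, for illustration only.
toy_preds = pd.DataFrame({"nct_id": ["A", "B", "C", "D"],
                          "pred_proba": [0.9, 0.8, 0.2, 0.6]})
toy_human = pd.DataFrame({"nct_id": ["A", "B", "C", "D"],
                          "labels": [1, 0, 1, 1]})
print(f1_against_human_labels(toy_preds, toy_human))
```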

Get interesting positive CTO predictions, i.e., the Phase 3 trials with the highest predicted probability of success below 100%
```
SELECT *
  FROM phase3_cto_preds
  WHERE pred_proba != 1
  ORDER BY pred_proba DESC
  LIMIT 10
```
These trials are likely to succeed because each one had a positive effect on stock price and could be linked to a previous trial, despite the absence of an explicit p-value.

Below is a preview of the full raw dataset. The full dataset and its descriptions can be accessed here.

Usage Instructions

The latest version is always available on Hugging Face, along with instructions for obtaining the full dataset.

You can also load specific files using the Python pandas library:

```
import pandas as pd

# Load the per-phase CTO prediction files directly from the Hugging Face dataset repository
CTO_phase1_preds = pd.read_csv("https://huggingface.co/datasets/chufangao/CTO/raw/main/phase1_CTO_rf.csv")
CTO_phase2_preds = pd.read_csv("https://huggingface.co/datasets/chufangao/CTO/raw/main/phase2_CTO_rf.csv")
CTO_phase3_preds = pd.read_csv("https://huggingface.co/datasets/chufangao/CTO/raw/main/phase3_CTO_rf.csv")
```
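Once a prediction file is loaded, the Phase 3 query from the Dataset Viewer can be reproduced in pandas. The sketch below assumes the CSV exposes the same `pred_proba` column as the viewer's `phase3_cto_preds` table. The helper name is hypothetical, and the toy frame with placeholder identifiers stands in for the downloaded file so the example runs offline.

```python
import pandas as pd


def top_non_certain_predictions(preds: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Rows with pred_proba != 1, highest probability first, limited to n rows."""
    # SQL's `pred_proba != 1` silently skips NULLs; pandas would keep NaN,
    # so missing probabilities are dropped explicitly first.
    known = preds.dropna(subset=["pred_proba"])
    return (known[known["pred_proba"] != 1]
            .sort_values("pred_proba", ascending=False)
            .head(n))


# Made-up toy rows; replace with CTO_phase3_preds after downloading.
toy = pd.DataFrame({"nct_id": ["T1", "T2", "T3", "T4"],
                    "pred_proba": [1.0, 0.97, 0.42, 0.99]})
print(top_non_certain_predictions(toy, n=2))
```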

Please see Tutorials for examples of getting started with this dataset, including off-the-shelf Google Colab notebooks!

Citation

@article{gao2024automatically,
  title={Automatically Labeling Clinical Trial Outcomes: A Large-Scale Benchmark for Drug Development},
  author={Gao, Chufan and Pradeepkumar, Jathurshan and Das, Trisha and Thati, Shivashankar and Sun, Jimeng},
  journal={arXiv preprint arXiv:2406.10292},
  year={2024}
}

Other Material and Related Work

Special Thanks

A huge thanks to SerpApi for their powerful news search API--an invaluable resource for scalably gathering clinical trial news, making our research faster and more efficient.

License

The dataset is licensed under the MIT license.