Machine Learning Methods for Survival Analysis with Clinical and Transcriptomics Data of Breast Cancer

Research output: Chapter in Book/Report/Conference proceeding (Chapter)


Abstract

Breast cancer is one of the most common cancers in women worldwide and causes an enormous number of deaths annually. However, early diagnosis of breast cancer can improve survival outcomes, enabling simpler and more cost-effective treatments. The recent increase in data availability provides unprecedented opportunities to apply data-driven and machine learning methods to identify early-detection prognostic factors capable of predicting patients' expected survival and potential sensitivity to treatment, with the final aim of enhancing clinical outcomes. This tutorial presents a protocol for applying machine learning models in survival analysis to both clinical and transcriptomic data. We show that integrating clinical and mRNA expression data is essential to explain the multiple biological processes driving cancer progression. Our results reveal that machine-learning-based models such as random survival forests, gradient boosted survival models, and survival support vector machines can outperform the traditional statistical method, namely the Cox proportional hazards model. The highest C-index among the machine learning models was obtained with the survival support vector machine, at 0.688, whereas the C-index of the Cox model was 0.677. Shapley Additive Explanation (SHAP) values were also used to assess the feature importance of the models and the impact of each feature on the prediction outcomes.
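
The C-index values quoted in the abstract measure how well a model ranks patients by risk. As an illustration only, the following is a minimal pure-Python sketch of Harrell's concordance index for right-censored data. It is not the chapter's code: the function name `concordance_index`, the toy data, and the simplification of skipping pairs with tied times are all assumptions made for this example.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index (hypothetical helper, not from the chapter).

    times       : observed follow-up times
    events      : 1 if the event was observed, 0 if right-censored
    risk_scores : model output, higher means higher predicted risk
    Pairs with tied times are skipped for simplicity (an assumption).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            # Shorter time is censored, so the true ordering is unknown.
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk failed earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented toy data: five patients, two of them censored.
times = [5, 8, 12, 15, 20]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.4, 0.2, 0.3, 0.1]

c_index = concordance_index(times, events, risk)
print(f"C-index: {c_index:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the difference between 0.688 and 0.677 reported above is a modest improvement in ranking quality.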

Original language: English
Title of host publication: Computational Biology and Machine Learning for Metabolic Engineering and Synthetic Biology
Editors: Kumar Selvarajoo
Publisher: Springer
Pages: 325-393
Number of pages: 69
Volume: 2553
ISBN (Electronic): 9781071626177
ISBN (Print): 9781071626160
Publication status: Published - 11 May 2022

Publication series

Name: Methods in Molecular Biology
Publisher: Humana Press
ISSN (Print): 1064-3745

Bibliographical note

© 2023. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.

Funding Information:
AO and CA acknowledge the support of Earlier.org through their Research Grant “Application of computational models of breast cancer for early-detection personalised tests.” CA acknowledges the support of EPSRC and The Alan Turing Institute through their Turing Network Development Award, and the Children’s Liver Disease Foundation through their Research Grant.

