Publication:
Analysis of Cross-Combinations of Feature Selection and Machine-Learning Classification Methods Based on [F-18]F-FDG PET/CT Radiomic Features for Metabolic Response Prediction of Metastatic Breast Cancer Lesions

Abstract
Simple Summary: Breast cancer is a leading cause of morbidity and mortality worldwide. Metastatic disease is largely responsible for cancer patient deaths, and its treatment usually involves different therapies. In this context, predicting the response to treatment or identifying treatment-resistant phenotypes is essential in clinical practice, especially in the new era of precision medicine. In this study, we used several combinations of feature selection methods and machine-learning classifiers to construct predictive models of the metabolic response of metastatic breast cancer lesions to treatment. These models were based on clinical variables and radiomic features extracted from 2-deoxy-2-[F-18]fluoro-D-glucose positron emission tomography/computed tomography ([F-18]F-FDG PET/CT) images obtained prior to treatment. Our main goal was to determine whether this prediction was feasible and to identify the combinations with the best predictive performance. We found that several combinations successfully predicted the metabolic response to treatment, of which the least absolute shrinkage and selection operator (Lasso) + support vector machines (SVM) had the best mean performance in terms of area under the curve, in both training and validation cohorts. Model performance depended largely on the selected combination.

Background: This study aimed to identify optimal combinations of feature selection methods and machine-learning classifiers for predicting the metabolic response of individual metastatic breast cancer lesions, based on clinical variables and radiomic features extracted from pretreatment [F-18]F-FDG PET/CT images.

Methods: A total of 48 patients with confirmed metastatic breast cancer, who received different treatments, were included. All patients had an [F-18]F-FDG PET/CT scan before and after treatment. Of the 228 metastatic lesions identified, 127 were categorized as responders (complete or partial metabolic response) and 101 as non-responders (stable or progressive metabolic response), based on the percentage change in SULpeak (peak standardized uptake value normalized for lean body mass). The lesion pool was divided into training (n = 182) and testing (n = 46) cohorts; for each lesion, 101 image features were extracted from each of the PET and CT images (202 features per lesion). These features, along with clinical and pathological information, were used to construct the prediction models by cross-combining seven popular feature selection methods with seven machine-learning (ML) classifiers. Model performance was assessed with receiver operating characteristic (ROC) curve analysis, using the area under the curve (AUC) and accuracy (ACC) metrics.

Results: The combinations least absolute shrinkage and selection operator (Lasso) + support vector machines (SVM) and Lasso + random forest (RF) had the highest AUC in cross-validation, with 0.93 +/- 0.06 and 0.92 +/- 0.03, respectively, whereas Lasso + neural network (NN), Lasso + SVM, and mutual information (MI) + RF had the highest AUC/ACC in the validation cohort, with 0.90/0.72, 0.86/0.76, and 0.87/0.85, respectively. On average, the models with Lasso and the models with SVM had the best mean performance for both AUC and ACC in both training and validation cohorts.

Conclusions: Image features obtained from pretreatment [F-18]F-FDG PET/CT, along with clinical variables, could predict the metabolic response of metastatic breast cancer lesions when incorporated into predictive models, whose performance depends on the selected combination of feature selection method and ML classifier.
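To make the evaluation scheme in the abstract more concrete, the following is a minimal sketch in plain Python of how selector/classifier combinations can be enumerated and how AUC and ACC can be computed for one model's output. It is not the authors' code. The method lists contain only the names mentioned in the abstract (the full sets of seven are not given there), and the labels, scores, and the 0.5 decision threshold are hypothetical values chosen for illustration.

```python
from itertools import product

# Only the methods named in the abstract. The study cross-combined seven
# selectors with seven classifiers, but the other names are not listed there.
SELECTORS = ["Lasso", "MI"]
CLASSIFIERS = ["SVM", "RF", "NN"]


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen responder receives a
    higher score than a randomly chosen non-responder (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one lesion of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of lesions classified correctly at the given cut-off
    (the 0.5 default is an assumption, not a value from the study)."""
    hits = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return hits / len(labels)


# Hypothetical toy data: 1 = responder, 0 = non-responder.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.5]

# Every selector is paired with every classifier (cross-combination).
combinations = list(product(SELECTORS, CLASSIFIERS))

auc = roc_auc(labels, scores)
acc = accuracy(labels, scores)
print(f"{len(combinations)} combinations, AUC = {auc:.3f}, ACC = {acc:.3f}")
```

In a full analysis, each pair in `combinations` would be fitted on the training cohort and scored on the held-out lesions, and the two metrics would then be compared across pairs.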
Description
We acknowledge support from the Spanish Government (RTI2018-095800-A-I00 and RTI2018-098868-B-I00), from Comunidad de Madrid (B2017/BMD-3888 PRONTO-CM), NIH R01-CA215700-5 grant, and University of Pisa (Direzione Area Medica).