Fig. 1 | Journal of Translational Medicine

Fig. 1

From: Total mutational load and clinical features as predictors of the metastatic status in lung adenocarcinoma and squamous cell carcinoma patients

Benchmarking stages. The proposed benchmarking procedure for model comparison has four main stages. First (stage 01), we preprocess the dataset and apply different classification and validation strategies to generate an input dataset. Second (stage 02), we train Random Forest models on different subsets of the input dataset to assess the relative importance of each data stream. In this stage, we also evaluate whether dimensionality reduction (PCA) and different resampling schemes affect model performance. We repeat the experiment on 100 different partitions (training and validation) of the input dataset, obtaining performance distributions instead of single values. Third (stage 03), we analyse the performance distributions and error sources to assess which strategies perform better under each condition. Finally (stage 04), we select the best model for the dataset studied and identify the feature that contributes most to the classification.
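The repeated-partition idea in stage 02 can be illustrated with a minimal sketch, assuming scikit-learn and a synthetic dataset. The function name `performance_distribution`, the 70/30 split, the number of trees, the number of PCA components, and the balanced-accuracy metric are all illustrative assumptions, not settings reported in the caption. Class weighting is used here as a simple stand-in for the resampling schemes, which the caption does not specify.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.pipeline import make_pipeline


def performance_distribution(X, y, use_pca=False, n_partitions=100, seed=0):
    """Return one validation score per random training/validation partition."""
    # Assumed 70/30 stratified split; the caption does not give the ratio.
    splitter = StratifiedShuffleSplit(
        n_splits=n_partitions, test_size=0.3, random_state=seed
    )
    scores = []
    for train_idx, val_idx in splitter.split(X, y):
        steps = []
        if use_pca:
            # Optional dimensionality reduction, fitted on the training part only.
            steps.append(PCA(n_components=5))
        # class_weight="balanced" is a stand-in for a resampling scheme.
        steps.append(
            RandomForestClassifier(
                n_estimators=25, class_weight="balanced", random_state=seed
            )
        )
        model = make_pipeline(*steps)
        model.fit(X[train_idx], y[train_idx])
        predictions = model.predict(X[val_idx])
        scores.append(balanced_accuracy_score(y[val_idx], predictions))
    return np.array(scores)


if __name__ == "__main__":
    # Synthetic, imbalanced data standing in for the real input dataset.
    X, y = make_classification(
        n_samples=200, n_features=20, n_informative=5,
        weights=[0.8, 0.2], random_state=0,
    )
    for use_pca in (False, True):
        dist = performance_distribution(X, y, use_pca=use_pca)
        print(f"PCA={use_pca}: mean={dist.mean():.3f}, std={dist.std():.3f}")
```

Comparing the two score arrays as distributions (for example their spread and overlap) rather than as two single numbers is the point of repeating the experiment over many partitions.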
