Fig. 1
From: A machine learning-based clinical tool for diagnosing myopathy using multi-cohort microarray expression profiles

Model training and validation workflow. The original, augmented, and combined expression profile data are referred to as T0, T1, and T2, respectively. T0 was split into training and test sets at a 2:1 ratio. The training set of T2 was used for feature selection and for training the support vector machine (SVM) classifier. The test set of T0 was used to make predictions and to validate model performance, measured by the multiclass area under the receiver operating characteristic curve (AUC). This workflow was applied to three data augmentation strategies: (a) no class size adjustment, (b) sampling to the mean class size, and (c) sampling to twice the mean class size.
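
The split and the three class-size strategies in the caption can be illustrated with a minimal sketch. It rests on several assumptions not stated in the caption: the 2:1 split is stratified by class, "sampling" is plain random resampling of training indices (with replacement when a class is smaller than the target), and the target is the rounded mean class size of the training set. The function names, the toy labels, and the class sizes are hypothetical; the article's actual augmentation may instead generate new profiles, and feature selection, SVM training, and AUC scoring are not shown here.

```python
import random
from collections import Counter, defaultdict


def split_two_to_one(labels, rng):
    """Stratified 2:1 train/test split (stratification is an assumption)."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        n_test = len(idx) // 3  # hold out roughly one third of each class
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def target_class_size(train_labels, strategy):
    """Per-class target: (a) none, (b) mean class size, (c) twice the mean."""
    counts = Counter(train_labels)
    mean_size = round(sum(counts.values()) / len(counts))
    return {"a": None, "b": mean_size, "c": 2 * mean_size}[strategy]


def resample_indices(train_idx, labels, target, rng):
    """Resample every training class to `target` samples (hypothetical helper)."""
    if target is None:
        return list(train_idx)  # strategy (a): leave class sizes unchanged
    by_class = defaultdict(list)
    for i in train_idx:
        by_class[labels[i]].append(i)
    out = []
    for y in sorted(by_class):
        idx = by_class[y]
        if len(idx) >= target:
            # Large class: draw `target` samples without replacement.
            out.extend(rng.sample(idx, target))
        else:
            # Small class: keep all originals, then add random duplicates.
            out.extend(idx + rng.choices(idx, k=target - len(idx)))
    return out


# Toy labels standing in for T0 (class names and sizes are made up).
labels = ["A"] * 30 + ["B"] * 12 + ["C"] * 6
rng = random.Random(0)

train_idx, test_idx = split_two_to_one(labels, rng)
train_labels = [labels[i] for i in train_idx]

# Only training indices are resampled, so the test set stays untouched.
resampled = {
    s: resample_indices(train_idx, labels, target_class_size(train_labels, s), rng)
    for s in ("a", "b", "c")
}
for s, idx in resampled.items():
    print(s, dict(Counter(labels[i] for i in idx)))
```

Resampling only the training indices keeps the held-out test set free of duplicated or adjusted samples, which matches the caption's use of the T0 test set for validation.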