Table 2 Evaluation of performance of predictive models on validation dataset of 345 viral siRNAs (V345)

From: VIRsiRNApred: a web server for predicting inhibition efficacy of siRNAs targeting human viruses

Pearson correlation coefficient* on validation (V345) dataset# during 10-fold cross validation, by method (SVM, ANN, KNN, REP Tree):

| Predictive model no. | siRNA features | No. of siRNA features | SVM | ANN | KNN | REP Tree |
|---|---|---|---|---|---|---|
| 1 | Mononucleotide frequency | 4 | 0.16 | 0.08 | 0.09 | 0.08 |
| 2 | Dinucleotide frequency | 16 | 0.30 | 0.23 | 0.22 | 0.24 |
| 3 | Trinucleotide frequency | 64 | 0.39 | 0.25 | 0.24 | 0.26 |
| 4 | Tetranucleotide frequency | 256 | 0.40 | 0.26 | 0.27 | 0.28 |
| 5 | Pentanucleotide frequency | 1024 | 0.42 | 0.27 | 0.28 | 0.29 |
| 6 | Binary | 76 | 0.03 | 0.02 | 0.02 | 0.01 |
| 7 | Thermodynamic features | 21 | 0.19 | 0.15 | 0.18 | 0.15 |
| 8 | Secondary structure | 28 | 0.02 | 0.02 | 0.02 | 0.02 |
| 9 | 1 + 2 + 3 + 4 + 5 | 1364 | 0.48 | 0.32 | 0.34 | 0.30 |
| 10 | 6 + 9 | 1440 | 0.48 | 0.32 | 0.34 | 0.32 |
| 11 | 6 + 7 + 9 | 1461 | 0.50 | 0.33 | 0.35 | 0.31 |
| 12 | 6 + 7 + 8 + 9 | 1489 | 0.45 | 0.32 | 0.33 | 0.30 |

  1. *Pearson Correlation Coefficient (PCC) is the correlation between experimental and predicted viral siRNA efficacy.
  2. #V345 is the validation dataset of experimental viral siRNAs not used in training. Predictive models 1-8 were developed on individual siRNA features, while models 9-12 were based on hybrid siRNA features.
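
The feature counts in the table follow from the number of possible k-mers over a four-letter alphabet (4, 16, 64, 256 and 1024 for k = 1 to 5, giving 1364 for hybrid model 9), and the reported metric is the Pearson correlation between experimental and predicted efficacy. The Python sketch below illustrates both ideas. It is not the authors' implementation: the function names, the example sequence, the RNA alphabet `ACGU`, and the normalisation of counts by the number of sliding windows are all assumptions made for illustration, since the table does not specify them.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGU"  # assumption: sequences are written in the RNA alphabet


def kmer_frequencies(seq, k):
    """Return the frequency of every possible k-mer in a fixed order (4**k values)."""
    windows = len(seq) - k + 1
    if windows <= 0:
        raise ValueError("sequence is shorter than k")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(windows):
        sub = seq[i:i + k]
        if sub in counts:  # windows with unexpected characters are ignored
            counts[sub] += 1
    # assumption: counts are normalised by the number of sliding windows
    return [counts[m] / windows for m in kmers]


def hybrid_kmer_features(seq, max_k=5):
    """Concatenate mono- to pentanucleotide frequencies (as in model 9)."""
    features = []
    for k in range(1, max_k + 1):
        features.extend(kmer_frequencies(seq, k))
    return features


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical 19-nt sequence, used only to show the feature dimensions.
example = "GCAUGCAUGCAUGCAUGCA"
print(len(kmer_frequencies(example, 2)))   # 16 dinucleotide features
print(len(hybrid_kmer_features(example)))  # 4 + 16 + 64 + 256 + 1024 = 1364
```

In practice, `pearson` would be applied to the vector of experimental efficacies and the vector of model predictions for the 345 validation siRNAs; a value such as 0.50 for model 11 with SVM indicates a moderate linear agreement between the two.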