annotate kmersvm/rocprcurve.xml @ 11:3b0c30b3baf1 draft default tip

Uploaded
author test-svm
date Wed, 08 Aug 2012 19:20:14 -0400
parents 66088269713e
children
rev   line source
0
66088269713e Uploaded all files tracked by git
test-svm
parents:
diff changeset
<tool id="ROC-PR Curve" name="ROC-PR Curve">
  <description>calculates AUC for ROC and PR curves</description>
  <command interpreter="sh">r_wrapper.sh $script_file</command>
  <inputs>
    <param format="tabular" name="cvpred_data" type="data" label="CV Predictions"/>
  </inputs>
  <outputs>
    <!--
    <data format="pdf" name="rocprc.pdf" from_work_dir="rocprc.pdf" label="ROC-PR Curve" />
    -->
    <data format="png" name="rocprc.png" from_work_dir="rocprc.png" />
  </outputs>

  <configfiles>
    <configfile name="script_file">

rm(list = objects() )

########## calculate auprc #########
auPRC &lt;- function (perf) {
  rec &lt;- perf@x.values
  prec &lt;- perf@y.values
  result &lt;- list()
  for (i in 1:length(perf@x.values)) {
    n &lt;- length(rec[[i]])
    # Step-wise area: each recall increment times the precision at its right endpoint.
    # Dropping the first precision value skips the point with no predicted positives.
    result[i] &lt;- list(sum((rec[[i]][2:n] - rec[[i]][1:(n-1)])*prec[[i]][-1]))
  }
  return(result)
}

########## plot ROC and PR-Curve #########
rocprc &lt;- function(x) {
  sink(NULL,type="message")
  options(warn=-1)
  suppressMessages(suppressWarnings(library('ROCR')))
  svmresult &lt;- data.frame(x)
  colnames(svmresult) &lt;- c("Seqid","Pred","Label", "CV")

  linewd &lt;- 1
  wd &lt;- 4
  ht &lt;- 4
  fig.nrows &lt;- 1
  fig.ncols &lt;- 2
  pt &lt;- 10
  cex.general &lt;- 1
  cex.lab &lt;- 0.9
  cex.axis &lt;- 0.9
  cex.main &lt;- 1.2
  cex.legend &lt;- 0.8


  #pdf("rocprc.pdf", width=wd*fig.ncols, height=ht*fig.nrows)
  png("rocprc.png", width=wd*fig.ncols, height=ht*fig.nrows, units="in", res=100)

  par(xaxs="i", yaxs="i", mar=c(3.5,3.5,2,2)+0.1, mgp=c(2,0.8,0), mfrow=c(fig.nrows, fig.ncols))

  CVs &lt;- unique(svmresult[["CV"]])
  preds &lt;- list()
  labs &lt;- list()
  auc &lt;- c()
  # Collect scores and labels fold by fold, using the fold ids found in the data
  # rather than assuming they are numbered 0, 1, 2, ...
  for(i in 1:length(CVs)) {
    preds[i] &lt;- subset(svmresult, CV==CVs[i], select=c(Pred))
    labs[i] &lt;- subset(svmresult, CV==CVs[i], select=c(Label))
  }

  pred &lt;- prediction(preds, labs)
  perf_roc &lt;- performance(pred, 'tpr', 'fpr')
  perf_prc &lt;- performance(pred, 'prec', 'rec')

  perf_auc &lt;- performance(pred, 'auc')
  prcs &lt;- auPRC(perf_prc)
  avgauc &lt;- 0
  avgprc &lt;- 0

  # Average both areas over the CV folds.
  for(j in 1:length(CVs)) {
    avgauc &lt;- avgauc + perf_auc@y.values[[j]]
    avgprc &lt;- avgprc + prcs[[j]]
  }

  avgauc &lt;- avgauc/length(CVs)
  avgprc &lt;- avgprc/length(CVs)

  #preds_merged &lt;- unlist(preds)
  #labs_merged &lt;- unlist(labs)
  #pred_merged &lt;- prediction(preds_merged, labs_merged)
  #perf_merged_auc &lt;- performance(pred_merged, 'auc')

  plot(perf_roc, colorize=TRUE, main="ROC curve", spread.estimate="stderror",
       xlab="1-Specificity", ylab="Sensitivity", cex.lab=1.2)
  text(0.2, 0.1, paste("AUC=", format(avgauc, digits=3, nsmall=3)))

  plot(perf_prc, colorize=TRUE, main="P-R curve", spread.estimate="stderror",
       xlab="Recall", ylab="Precision", cex.lab=1.2, xlim=c(0,1), ylim=c(0,1))
  text(0.2, 0.1, paste("AUC=", format(avgprc, digits=3, nsmall=3)))

  dev.off()
}

############## main function #################
d &lt;- read.table("${cvpred_data}")

rocprc(d)

    </configfile>
  </configfiles>

  <help>

**Note**

This tool is based on the ROCR library. If you use this tool please cite:

Tobias Sing, Oliver Sander, Niko Beerenwinkel, Thomas Lengauer.
ROCR: visualizing classifier performance in R.
Bioinformatics 21(20):3940-3941 (2005).

----

**What it does**

Takes cross-validation predictions as input and plots the ROC curve and the PR curve, each annotated with its area under the curve (AUC). The input is a tabular file with four columns: sequence ID, SVM prediction score, class label, and CV fold. The AUC values shown are averaged over the CV folds.
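
A minimal sketch of the input layout in R, assuming the four column names used by the wrapper script (Seqid, Pred, Label, CV), labels coded as 1 and -1, and folds numbered from 0. The values are hypothetical, and assignment is written with ``=`` so the snippet needs no XML escaping in the tool file::

    # Hypothetical toy table: sequence id, SVM score, class label, CV fold.
    d = data.frame(
      Seqid = c("s1", "s2", "s3", "s4", "s5", "s6"),
      Pred  = c(1.3, -0.4, 0.8, -1.1, 0.2, -0.7),
      Label = c(1, -1, 1, -1, 1, -1),
      CV    = c(0, 0, 0, 1, 1, 1)
    )
    # One vector of scores and one vector of labels per fold.
    preds = split(d[["Pred"]], d[["CV"]])
    labs  = split(d[["Label"]], d[["CV"]])
    length(preds)   # number of folds

Each fold then contributes one ROC curve and one PR curve.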

----

**Results**

ROC Curve: Receiver Operating Characteristic Curve. Plots the true positive rate (sensitivity) against the false positive rate (1 - specificity).

PR Curve: Precision-Recall Curve. Plots precision (the fraction of sequences classified as positive that are true positives) against recall (the fraction of all positive sequences that are classified as positive; same as sensitivity).

AUC: Area Under the Curve. For the ROC curve, this is the probability that, for a randomly selected positive/negative pair, the trained SVM scores the positive higher than the negative. For the PR curve, it summarizes precision across all recall levels.
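
The pairwise reading of the ROC AUC can be checked by hand on a few scores. A minimal sketch in base R with made-up scores (not tool output); counting ties as one half is a common convention assumed here::

    # Illustrative scores only.
    pos = c(0.9, 0.8, 0.4)   # scores of positive sequences
    neg = c(0.7, 0.3, 0.1)   # scores of negative sequences
    # Compare every positive with every negative.
    wins = outer(pos, neg, ">")
    ties = outer(pos, neg, "==")
    # Fraction of pairs in which the positive is scored higher.
    auc = mean(wins + 0.5 * ties)
    auc   # 8 of the 9 pairs are ordered correctly

Here only the pair (0.4, 0.7) is ordered wrongly, so the value is 8/9.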

.. class:: infomark

Both curves measure SVM performance, but ROC curves can give an overly optimistic picture when the class distribution is highly skewed. For more information see:

Jesse Davis, Mark Goadrich.
The Relationship Between Precision-Recall and ROC Curves.
Proceedings of the 23rd Annual International Conference on Machine Learning.
Pittsburgh, PA, 2006.
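
To illustrate the effect of class skew, the sketch below holds one ROC operating point fixed and changes only the number of negatives. The rates and counts are hypothetical, chosen for round numbers::

    # One fixed ROC operating point (assumed values).
    tpr = 0.8
    fpr = 0.1
    n_pos = 100
    n_neg = c(100, 10000)   # balanced set vs. heavily skewed set
    tp = tpr * n_pos        # true positives, same in both cases
    fp = fpr * n_neg        # false positives grow with the negatives
    precision = tp / (tp + fp)
    round(precision, 3)     # 0.889 0.074

The ROC point is identical in both cases, while precision falls from about 0.89 to about 0.07.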

----

**Example**

.. image:: ./static/images/sample_roc_chen.png
  </help>
</tool>