annotate templateLibrary.py @ 7:6f20d6bc1dd3 draft default tip

planemo upload commit b5238645b0390bc72071841b6af1eba8fdc24ab1
author anmoljh
date Sat, 26 May 2018 17:49:59 -0400
parents ab806d671e22
children
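The template below contains placeholders such as `$METHOD` and `$RDATA`, and it doubles every literal dollar sign (`$$Version`, `$$bestTune`). That pattern is consistent with Python's `string.Template`, although this page does not show the rendering code, so the following is only a minimal sketch under that assumption. The names `mini_template` and `render` are hypothetical and do not come from the original file.

```python
from string import Template

# Hypothetical miniature of the template: "$METHOD" is a placeholder,
# while "$$" is an escaped literal dollar sign (needed for R's `$` operator).
mini_template = Template(r'''\title{Classification Model Script using $METHOD}
modName <- "$METHOD"
versionTest <- packageDescription("caret")$$Version
''')


def render(method):
    """Fill in the placeholder; substitute() raises KeyError if one is missing."""
    return mini_template.substitute(METHOD=method)


rendered = render("pls")
print(rendered)
```

If the file does use `string.Template`, any new R code added to the template would need its `$` operators written as `$$` in the same way.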
def __template4Rnw():

    template4Rnw = r'''%% Classification Modeling Script
%% Max Kuhn (max.kuhn@pfizer.com, mxkuhn@gmail.com)
%% Version: 1.00
%% Created on: 2010/10/02
%%
%% The original file has been improved by
%% Deepak Bharti, Andrew M. Lynn, Anmol J. Hemrom
%% Version: 1.01
%% Created on: 2014/08/12
%% This is a Sweave template for building and describing
%% classification models. It mixes R and LaTeX code. The document can
%% be processed using R's Sweave function to produce a tex file.
%%
%% The inputs are:
%%  - the initial data set in a data frame called 'rawData'
%%  - a factor column in the data set called 'outcome'; this is the
%%    outcome variable
%%  - all other columns in rawData should be predictor variables
%%  - the type of model should be in a variable called 'modName'.
%%
%% The script attempts to make some intelligent choices based on the
%% model being used. For example, if modName is "pls", the script will
%% automatically center and scale the predictor data. There are
%% situations where these choices can (and should) be changed.
%%
%% There are other options that may make sense to change. For example,
%% the user may want to adjust the type of resampling. To find these
%% parts of the script, search for the string 'OPTION'. These parts of
%% the code document the options.

\documentclass[14pt]{report}
\usepackage{amsmath}
\usepackage[pdftex]{graphicx}
\usepackage{color}
\usepackage{ctable}
\usepackage{xspace}
\usepackage{fancyvrb}
\usepackage{fancyhdr}
\usepackage{lastpage}
\usepackage{longtable}
\usepackage{algorithm2e}
\usepackage[
         colorlinks=true,
         linkcolor=blue,
         citecolor=blue,
         urlcolor=blue]
         {hyperref}
\usepackage{lscape}
\usepackage{Sweave}
\SweaveOpts{keep.source = TRUE}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% define new colors for use
\definecolor{darkgreen}{rgb}{0,0.6,0}
\definecolor{darkred}{rgb}{0.6,0.0,0}
\definecolor{lightbrown}{rgb}{1,0.9,0.8}
\definecolor{brown}{rgb}{0.6,0.3,0.3}
\definecolor{darkblue}{rgb}{0,0,0.8}
\definecolor{darkmagenta}{rgb}{0.5,0,0.5}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\newcommand{\bld}[1]{\mbox{\boldmath $$#1$$}}
\newcommand{\shell}[1]{\mbox{$$#1$$}}
\renewcommand{\vec}[1]{\mbox{\bf {#1}}}

\newcommand{\ReallySmallSpacing}{\renewcommand{\baselinestretch}{.6}\Large\normalsize}
\newcommand{\SmallSpacing}{\renewcommand{\baselinestretch}{1.1}\Large\normalsize}

\newcommand{\halfs}{\frac{1}{2}}

\setlength{\oddsidemargin}{-.25 truein}
\setlength{\evensidemargin}{0truein}
\setlength{\topmargin}{-0.2truein}
\setlength{\textwidth}{7 truein}
\setlength{\textheight}{8.5 truein}
\setlength{\parindent}{0.20truein}
\setlength{\parskip}{0.10truein}

\setcounter{LTchunksize}{50}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\pagestyle{fancy}
\lhead{}
%% OPTION Report header name
\chead{Classification Model Script}
\rhead{}
\lfoot{}
\cfoot{}
\rfoot{\thepage\ of \pageref{LastPage}}
\renewcommand{\headrulewidth}{1pt}
\renewcommand{\footrulewidth}{1pt}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%% OPTION Report title and modeler name
\title{Classification Model Script using $METHOD}
\author{"Lynn Group with M. Kuhn, SCIS, JNU, New Delhi"}

\begin{document}

\maketitle

\thispagestyle{empty}
<<dummy, eval=TRUE, echo=FALSE, results=hide>>=
# sets values for variables used later in the program to prevent the \Sexpr error on parsing with Sweave
numSamples=''
classDistString=''
missingText=''
numPredictors=''
numPCAcomp=''
pcaText=''
nzvText=''
corrText=''
ppText=''
varText=''
splitText="Dummy Text"
nirText="Dummy Text"
# pctTrain is a variable that is initialised in Data splitting, and reused later in testPred
pctTrain=0.8
Smpling=''
nzvText1=''
classDistString1=''
dwnsmpl=''
upsmpl=''

@
<<startup, eval= TRUE, results = hide, echo = FALSE>>=
library(Hmisc)
library(caret)
library(pROC)
versionTest <- compareVersion(packageDescription("caret")$$Version,
                              "4.65")
if(versionTest < 0) stop("caret version 4.65 or later is required")

library(RColorBrewer)


## Join the elements of x into an English list, e.g. "a, b and c"
listString <- function (x, period = FALSE, verbose = FALSE)
{
  if (verbose) cat("\n entering listString\n")
  flush.console()
  if (!is.character(x))
    x <- as.character(x)
  numElements <- length(x)
  out <- if (length(x) > 0) {
    switch(min(numElements, 3), x, paste(x, collapse = " and "),
           {
             x <- paste(x, c(rep(",", numElements - 2), " and", ""), sep = "")
             paste(x, collapse = " ")
           })
  }
  else ""
  if (period) out <- paste(out, ".", sep = "")
  if (verbose) cat(" leaving listString\n\n")
  flush.console()
  out
}

## Describe the resampled performance at the best tuning parameters as text
resampleStats <- function(x, digits = 3)
{
  bestPerf <- x$$bestTune
  colnames(bestPerf) <- gsub("^\\.", "", colnames(bestPerf))
  out <- merge(x$$results, bestPerf)
  out <- out[, colnames(out) %in% x$$perfNames]
  names(out) <- gsub("ROC", "area under the ROC curve", names(out), fixed = TRUE)
  names(out) <- gsub("Sens", "sensitivity", names(out), fixed = TRUE)
  names(out) <- gsub("Spec", "specificity", names(out), fixed = TRUE)
  names(out) <- gsub("Accuracy", "overall accuracy", names(out), fixed = TRUE)
  names(out) <- gsub("Kappa", "Kappa statistics", names(out), fixed = TRUE)

  out <- format(out, digits = digits)
  listString(paste(names(out), "was", out))
}

## Summary function for two-class models that do not produce class probabilities
twoClassNoProbs <- function (data, lev = NULL, model = NULL)
{
  out <- c(sensitivity(data[, "pred"], data[, "obs"], lev[1]),
           specificity(data[, "pred"], data[, "obs"], lev[2]),
           confusionMatrix(data[, "pred"], data[, "obs"])$$overall["Kappa"])

  names(out) <- c("Sens", "Spec", "Kappa")
  out
}



##OPTION: model name: see ?train for more values/models
modName <- "$METHOD"


load("$RDATA")
rawData <- dataX
rawData$$outcome <- dataY

@


\section*{Data Sets}\label{S:data}

%% OPTION: provide some background on the problem, the experimental
%% data, how the compounds were selected etc

<<getDataInfo, eval = $GETDATAINFOEVAL, echo = $GETDATAINFOECHO, results = $GETDATAINFORESULT>>=
if(!any(names(rawData) == "outcome")) stop("a variable called outcome should be in the data set")
if(!is.factor(rawData$$outcome)) stop("the outcome should be a factor vector")

## OPTION: when there are only two classes, the first level of the
##         factor is used as the "positive" or "event" for calculating
##         sensitivity and specificity. Adjust the outcome factor accordingly.
numClasses <- length(levels(rawData$$outcome))
numSamples <- nrow(rawData)
numPredictors <- ncol(rawData) - 1
predictorNames <- names(rawData)[names(rawData) != "outcome"]

isNum <- apply(rawData[,predictorNames, drop = FALSE], 2, is.numeric)
if(any(!isNum)) stop("all predictors in rawData should be numeric")

## all.equal() returns a character vector (not FALSE) on a mismatch, so wrap it in isTRUE()
classTextCheck <- isTRUE(all.equal(levels(rawData$$outcome), make.names(levels(rawData$$outcome))))
if(!classTextCheck) warning("the class levels are not valid R variable names; this may cause errors")

c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
224 ## Get the class distribution
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
225 classDist <- table(rawData$$outcome)
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
226 classDistString <- paste("``",
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
227 names(classDist),
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
228 "'' ($$n$$=",
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
229 classDist,
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
230 ")",
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
231 sep = "")
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
232 classDistString <- listString(classDistString)
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
233 @
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
234
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
<<missingFilter, eval = $MISSINGFILTEREVAL, echo = $MISSINGFILTERECHO, results = $MISSINGFILTERRESULT>>=
colRate <- apply(rawData[, predictorNames, drop = FALSE],
                 2, function(x) mean(is.na(x)))

## OPTION: thresholds can be changed
colExclude <- colRate > $MISSINGFILTERTHRESHC
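## Illustration only, with hypothetical numbers: if the column threshold is
## 0.25 and colRate is c(a = 0.00, b = 0.30), then colExclude is
## c(a = FALSE, b = TRUE) and only predictor b is dropped.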

missingText <- ""

if(any(colExclude))
  {
    missingText <- paste(missingText,
                         ifelse(sum(colExclude) > 1,
                                " There were ",
                                " There was "),
                         sum(colExclude),
                         ifelse(sum(colExclude) > 1,
                                " predictors ",
                                " predictor "),
                         "with an excessive number of ",
                         "missing values. ",
                         ifelse(sum(colExclude) > 1,
                                " These were excluded. ",
                                " This was excluded. "))
    predictorNames <- predictorNames[!colExclude]
    rawData <- rawData[, names(rawData) %in% c("outcome", predictorNames), drop = FALSE]
  }


rowRate <- apply(rawData[, predictorNames, drop = FALSE],
                 1, function(x) mean(is.na(x)))

rowExclude <- rowRate > $MISSINGFILTERTHRESHR


if(any(rowExclude)) {
    missingText <- paste(missingText,
                         ifelse(sum(rowExclude) > 1,
                                " There were ",
                                " There was "),
                         ## count the excluded samples (rows), not the predictors
                         sum(rowExclude),
                         ifelse(sum(rowExclude) > 1,
                                " samples ",
                                " sample "),
                         "with an excessive number of ",
                         "missing values. ",
                         ifelse(sum(rowExclude) > 1,
                                " These were excluded. ",
                                " This was excluded. "),
                         "After filtering, ",
                         sum(!rowExclude),
                         " samples remained.")
    rawData <- rawData[!rowExclude, ]
    ## NOTE: in this branch hasMissing holds the per-row fraction of missing
    ## values; the else branch stores a logical flag instead
    hasMissing <- apply(rawData[, predictorNames, drop = FALSE],
                        1, function(x) mean(is.na(x)))
  } else {
    hasMissing <- apply(rawData[, predictorNames, drop = FALSE],
                        1, function(x) any(is.na(x)))
    missingText <- paste(missingText,
                         ifelse(missingText == "",
                                "There ",
                                "Subsequently, there "),
                         ifelse(sum(hasMissing) == 1,
                                "was ",
                                "were "),
                         ifelse(sum(hasMissing) > 0,
                                sum(hasMissing),
                                "no"),
                         ifelse(sum(hasMissing) == 1,
                                "sample ",
                                "samples "),
                         "with missing values.")

    rawData <- rawData[complete.cases(rawData),]

  }

## Split predictors and outcome; this assumes "outcome" is the last column.
## Note that 1:length(rawData)-1 parses as (1:length(rawData)) - 1, so the
## column index is written explicitly here.
rawData1 <- rawData[, seq_len(ncol(rawData) - 1), drop = FALSE]
rawData2 <- rawData[, ncol(rawData)]

set.seed(222)
nzv1 <- nearZeroVar(rawData1)
if(length(nzv1) > 0)
  {
    nzvVars1 <- names(rawData1)[nzv1]
    ## drop = FALSE keeps a data frame even if a single predictor remains
    rawData <- rawData1[, -nzv1, drop = FALSE]
    rawData$$outcome <- rawData2
    nzvText1 <- paste("There were ",
                      length(nzv1),
                      " predictors that were removed from the original data due to",
                      " severely unbalanced distributions that",
                      " could negatively affect the model fit",
                      ifelse(length(nzv1) > 10,
                             ".",
                             paste(": ",
                                   listString(nzvVars1),
                                   ".",
                                   sep = "")),
                      sep = "")

  } else {
    rawData <- rawData1
    rawData$$outcome <- rawData2
    nzvText1 <- ""

  }

remove("rawData1")
remove("rawData2")

@

The initial data set consisted of \Sexpr{numSamples} samples and
\Sexpr{numPredictors} predictor variables. The breakdown of the
outcome data classes was: \Sexpr{classDistString}.

\Sexpr{missingText}

\Sexpr{nzvText1}

<<pca, eval = $PCAEVAL, echo = $PCAECHO, results = $PCARESULT>>=

predictorNames <- names(rawData)[names(rawData) != "outcome"]
numPredictors <- length(predictorNames)
predictors <- rawData[, predictorNames, drop = FALSE]
## PCA will fail with predictors having fewer than 2 unique values
isZeroVar <- apply(predictors, 2,
                   function(x) length(unique(x)) < 2)
if(any(isZeroVar)) predictors <- predictors[, !isZeroVar, drop = FALSE]
## For whatever reason, only the formula interface to prcomp
## handles missing values
pcaForm <- as.formula(
                      paste("~",
                            paste(names(predictors), collapse = "+")))
pca <- prcomp(pcaForm,
              data = predictors,
              center = TRUE,
              scale. = TRUE,
              na.action = na.omit)
## OPTION: the number of components plotted/discussed can be set
numPCAcomp <- $PCACOMP
pctVar <- pca$$sdev^2/sum(pca$$sdev^2)*100
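## Worked example with hypothetical values: sdev = c(2, 1) gives variances
## c(4, 1), so pctVar would be c(80, 20), i.e. 80 and 20 percent of the
## total variance for the first and second component.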
pcaText <- paste(round(pctVar[1:numPCAcomp], 1),
                 "\\\\%",
                 sep = "")
pcaText <- listString(pcaText)
@

To get an initial assessment of the separability of the classes,
principal component analysis (PCA) was used to distill the
\Sexpr{numPredictors} predictors down into \Sexpr{numPCAcomp}
surrogate variables (i.e. the principal components) in a manner that
attempts to maximize the amount of information preserved from the
original predictor set. Figure \ref{F:inititalPCA} contains plots of
the first \Sexpr{numPCAcomp} components, which accounted for
\Sexpr{pcaText} percent of the variability in the original predictors
(respectively).


%% OPTION: remark on how well (or poorly) the data separated

\setkeys{Gin}{width = 0.8\textwidth}
\begin{figure}[p]
\begin{center}

<<pcaPlot, eval = $PCAPLOTEVAL, echo = $PCAPLOTECHO, results = $PCAPLOTRESULT, fig = $PCAPLOTFIG, width = 8, height = 8>>=
trellis.par.set(caretTheme(), warn = TRUE)
if(numPCAcomp == 2)
  {
    axisRange <- extendrange(pca$$x[, 1:2])
    print(
          xyplot(PC1 ~ PC2,
                 data = as.data.frame(pca$$x),
                 type = c("p", "g"),
                 groups = rawData$$outcome,
                 auto.key = list(columns = 2),
                 xlim = axisRange,
                 ylim = axisRange))
  } else {
    axisRange <- extendrange(pca$$x[, 1:numPCAcomp])
    print(
          splom(~as.data.frame(pca$$x)[, 1:numPCAcomp],
                type = c("p", "g"),
                groups = rawData$$outcome,
                auto.key = list(columns = 2),
                as.table = TRUE,
                prepanel.limits = function(x) axisRange
                ))

  }

@

\caption[PCA Plot]{A plot of the first \Sexpr{numPCAcomp}
  principal components for the original data set.}
\label{F:inititalPCA}
\end{center}
\end{figure}



<<initialDataSplit, eval = $INITIALDATASPLITEVAL, echo = $INITIALDATASPLITECHO, results = $INITIALDATASPLITRESULT>>=

## OPTION: with small sample sizes, you may not want to set aside a
## test set and instead focus on the resampling results.

set.seed(1234)
## The outcome is taken from the last column of rawData
dataX <- rawData[, seq_len(ncol(rawData) - 1)]
dataY <- rawData[, ncol(rawData)]

Smpling <- "$SAAMPLING"

## Optionally balance the classes by down- or up-sampling
if(Smpling == "downsampling")
  {
    dwnsmpl <- downSample(dataX, dataY)
    rawData <- dwnsmpl[, seq_len(ncol(dwnsmpl) - 1)]
    rawData$$outcome <- dwnsmpl[, ncol(dwnsmpl)]
    remove("dwnsmpl")
    remove("dataX")
    remove("dataY")
  } else if(Smpling == "upsampling") {
    upsmpl <- upSample(dataX, dataY)
    rawData <- upsmpl[, seq_len(ncol(upsmpl) - 1)]
    rawData$$outcome <- upsmpl[, ncol(upsmpl)]
    remove("upsmpl")
    remove("dataX")
    remove("dataY")
  } else {
    remove("dataX")
    remove("dataY")
  }

numSamples <- nrow(rawData)

predictorNames <- names(rawData)[names(rawData) != "outcome"]
numPredictors <- length(predictorNames)

classDist1 <- table(rawData$$outcome)
classDistString1 <- paste("``",
                          names(classDist1),
                          "'' ($$n$$=",
                          classDist1,
                          ")",
                          sep = "")
classDistString1 <- listString(classDistString1)

pctTrain <- $PERCENT

if(pctTrain < 1)
  {
    ## OPTION: seed number can be changed
    set.seed(1)
    inTrain <- createDataPartition(rawData$$outcome,
                                   p = pctTrain,
                                   list = FALSE)
    trainX <- rawData[ inTrain, predictorNames]
    testX  <- rawData[-inTrain, predictorNames]
    trainY <- rawData[ inTrain, "outcome"]
    testY  <- rawData[-inTrain, "outcome"]
    splitText <- paste("The original data were split into ",
                       "a training set ($$n$$=",
                       nrow(trainX),
                       ") and a test set ($$n$$=",
                       nrow(testX),
                       ") in a manner that preserved the ",
                       "distribution of the classes.",
                       sep = "")
    isZeroVar <- apply(trainX, 2,
                       function(x) length(unique(x)) < 2)
    if(any(isZeroVar))
      {
        trainX <- trainX[, !isZeroVar, drop = FALSE]
        testX <- testX[, !isZeroVar, drop = FALSE]
      }

  } else {
    trainX <- rawData[, predictorNames]
    testX  <- NULL
    trainY <- rawData[, "outcome"]
    testY  <- NULL
    splitText <- "The entire data set was used as the training set."
  }
trainDist <- table(trainY)
nir <- max(trainDist)/length(trainY)*100
niClass <- names(trainDist)[which.max(trainDist)]
nirText <- paste("The non--information rate is the accuracy that can be ",
                 "achieved by predicting all samples using the most ",
                 "dominant class. For these data, the rate is ",
                 round(nir, 2), "\\\\% using the ``",
                 niClass,
                 "'' class.",
                 sep = "")

remove("rawData")

if((!is.null(testX)) && (!is.null(testY))){
  #save(trainX,trainY,testX,testY,file="datasets.RData")
} else {
  save(trainX,trainY,file="datasets.RData")
}

@

\Sexpr{splitText}

\Sexpr{nirText}

The data set for model building consisted of \Sexpr{numSamples} samples and
\Sexpr{numPredictors} predictor variables. The breakdown of the
outcome data classes was: \Sexpr{classDistString1}.

<<nzv, eval = $NZVEVAL, results = $NZVRESULT, echo = $NZVECHO>>=
## OPTION: other pre-processing steps can be used
ppSteps <- caret:::suggestions(modName)

set.seed(2)
if(ppSteps["nzv"])
  {
    nzv <- nearZeroVar(trainX)
    if(length(nzv) > 0)
      {
        nzvVars <- names(trainX)[nzv]
        ## drop = FALSE keeps a data frame even if one predictor remains
        trainX <- trainX[, -nzv, drop = FALSE]
        nzvText <- paste("There were ",
                         length(nzv),
                         " predictors that were removed from the training set due to",
                         " severely unbalanced distributions that",
                         " could negatively affect the model",
                         ifelse(length(nzv) > 10,
                                ".",
                                paste(": ",
                                      listString(nzvVars),
                                      ".",
                                      sep = "")),
                         sep = "")
        ## testX is NULL when the entire data set is used for training
        if(!is.null(testX)) testX <- testX[, -nzv, drop = FALSE]
      } else nzvText <- ""
  } else nzvText <- ""
@

\Sexpr{nzvText}


<<corrFilter, eval = $CORRFILTEREVAL, results = $CORRFILTERRESULT, echo = $CORRFILTERECHO>>=
if(ppSteps["corr"])
  {
    ## OPTION: the correlation threshold can be changed
    corrThresh <- $THRESHHOLDCOR
    highCorr <- findCorrelation(cor(trainX, use = "pairwise.complete.obs"),
                                corrThresh)
    if(length(highCorr) > 0)
      {
        corrVars <- names(trainX)[highCorr]
        trainX <- trainX[, -highCorr, drop = FALSE]
        ## List the removed predictors by name rather than by column index
        corrText <- paste("There were ",
                          length(highCorr),
                          " predictors that were removed due to",
                          " large between--predictor correlations that",
                          " could negatively affect the model fit",
                          ifelse(length(highCorr) > 10,
                                 ".",
                                 paste(": ",
                                       listString(corrVars),
                                       ".",
                                       sep = "")),
                          " Removing these predictors forced",
                          " all pair--wise correlations to be",
                          " less than ",
                          corrThresh,
                          ".",
                          sep = "")
        ## testX is NULL when the entire data set is used for training
        if(!is.null(testX)) testX <- testX[, -highCorr, drop = FALSE]
      } else corrText <- "No between--predictor correlations exceeded the given threshold."
  } else corrText <- ""
@

\Sexpr{corrText}

<<preProc, eval = $PREPROCEVAL, echo = $PREPROCECHO, results = $PREPROCRESULT>>=
ppMethods <- NULL
if(ppSteps["center"]) ppMethods <- c(ppMethods, "center")
if(ppSteps["scale"]) ppMethods <- c(ppMethods, "scale")
if(any(hasMissing)) ppMethods <- c(ppMethods, "knnImpute")
## OPTION: other methods, such as spatial sign, can be added to this list

if(length(ppMethods) > 0)
  {
    ppInfo <- preProcess(trainX, method = ppMethods)
    trainX <- predict(ppInfo, trainX)
    if(pctTrain < 1) testX <- predict(ppInfo, testX)
    ppText <- paste("The following pre--processing methods were",
                    " applied to the training",
                    ifelse(pctTrain < 1, " and test", ""),
                    " data: ",
                    listString(ppMethods),
                    ".",
                    sep = "")
    ppText <- gsub("center", "mean centering", ppText)
    ppText <- gsub("scale", "scaling to unit variance", ppText)
    ppText <- gsub("knnImpute",
                   paste(ppInfo$$k, "--nearest neighbor imputation", sep = ""),
                   ppText)
    ppText <- gsub("spatialSign", "the spatial sign transformation", ppText)
    ppText <- gsub("pca", "principal component feature extraction", ppText)
    ppText <- gsub("ica", "independent component feature extraction", ppText)
  } else {
    ppInfo <- NULL
    ppText <- ""
  }

predictorNames <- names(trainX)
if(nzvText != "" || corrText != "" || ppText != "")
  {
    varText <- paste("After pre--processing,",
                     ncol(trainX),
                     "predictors remained for modeling.")
  } else varText <- ""

@

\Sexpr{ppText}
\Sexpr{varText}

\clearpage

\section*{Model Building}

<<setupWorkers, eval = TRUE, echo = $SETUPWORKERSECHO, results = $SETUPWORKERSRESULT>>=
numWorkers <- $NUMWORKERS
## OPTION: turn up numWorkers to use MPI
if(numWorkers > 1)
  {
    mpiCalcs <- function(X, FUN, ...)
      {
        theDots <- list(...)
        parLapply(theDots$$cl, X, FUN)
      }

    library(snow)
    cl <- makeCluster(numWorkers, "MPI")
  }
anmoljh
parents:
diff changeset
678 @
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset

<<setupResampling, echo = $SETUPRESAMPLINGECHO, results = $SETUPRESAMPLINGRESULT>>=
##OPTION: the resampling options can be changed. See
## ?trainControl for details

resampName <- "$RESAMPNAME"
resampNumber <- $RESAMPLENUMBER
numRepeat <- $NUMREPEAT
resampP <- $RESAMPLENUMBERPERCENT

modelInfo <- modelLookup(modName)

if(numClasses == 2)
  {
    foo <- if(any(modelInfo$$probModel)) twoClassSummary else twoClassNoProbs
  } else foo <- defaultSummary

set.seed(3)
ctlObj <- trainControl(method = resampName,
                       number = resampNumber,
                       repeats = numRepeat,
                       p = resampP,
                       classProbs = any(modelInfo$$probModel),
                       summaryFunction = foo)


##OPTION select other performance metrics as needed
optMetric <- if(numClasses == 2 & any(modelInfo$$probModel)) "ROC" else "Kappa"

if(numWorkers > 1)
  {
    ctlObj$$workers <- numWorkers
    ctlObj$$computeFunction <- mpiCalcs
    ctlObj$$computeArgs <- list(cl = cl)
  }
@

<<setupGrid, results = $SETUPGRIDRESULT, echo = $SETUPGRIDECHO>>=
##OPTION expand or contract these grids as needed (or
## add more models)

gridSize <- $SETUPGRIDSIZE

if(modName %in% c("svmPoly", "svmRadial", "svmLinear", "lvq", "ctree2", "ctree")) gridSize <- 5
if(modName %in% c("earth", "fda")) gridSize <- 7
if(modName %in% c("knn", "rocc", "glmboost", "rf", "nodeHarvest")) gridSize <- 10

if(modName %in% c("nb")) gridSize <- 2
if(modName %in% c("pam", "rpart")) gridSize <- 15
if(modName %in% c("pls")) gridSize <- min(20, ncol(trainX))

if(modName == "gbm")
  {
    tGrid <- expand.grid(.interaction.depth = -1 + (1:5)*2,
                         .n.trees = (1:10)*20,
                         .shrinkage = .1)
  }

if(modName == "nnet")
  {
    tGrid <- expand.grid(.size = -1 + (1:5)*2,
                         .decay = c(0, .001, .01, .1))
  }

if(modName == "ada")
  {
    tGrid <- expand.grid(.maxdepth = 1, .iter = c(100,200,300,400), .nu = 1)

  }


@

<<fitModel, results = $FITMODELRESULT, echo = $FITMODELECHO, eval = $FITMODELEVAL>>=
##OPTION alter as needed

set.seed(4)
modelFit <- switch(modName,
                   gbm =
                   {
                     mix <- sample(seq(along = trainY))
                     train(
                           trainX[mix,], trainY[mix], modName,
                           verbose = FALSE,
                           bag.fraction = .9,
                           metric = optMetric,
                           trControl = ctlObj,
                           tuneGrid = tGrid)
                   },

                   multinom =
                   {
                     ## nnet's iteration limit is named `maxit`; `maxiter` would be
                     ## silently absorbed by `...` and leave the default in place
                     train(
                           trainX, trainY, modName,
                           trace = FALSE,
                           metric = optMetric,
                           maxit = 1000,
                           MaxNWts = 5000,
                           trControl = ctlObj,
                           tuneLength = gridSize)
                   },

                   nnet =
                   {
                     train(
                           trainX, trainY, modName,
                           metric = optMetric,
                           linout = FALSE,
                           trace = FALSE,
                           maxit = 1000,
                           MaxNWts = 5000,
                           trControl = ctlObj,
                           tuneGrid = tGrid)

                   },

                   svmRadial =, svmPoly =, svmLinear =
                   {
                     train(
                           trainX, trainY, modName,
                           metric = optMetric,
                           scaled = TRUE,
                           trControl = ctlObj,
                           tuneLength = gridSize)
                   },
                   {
                     train(trainX, trainY, modName,
                           trControl = ctlObj,
                           metric = optMetric,
                           tuneLength = gridSize)
                   })

@

<<modelDescr, echo = $MODELDESCRECHO, results = $MODELDESCRRESULT>>=
summaryText <- ""

resampleName <- switch(tolower(modelFit$$control$$method),
                       boot = paste("the bootstrap (", length(modelFit$$control$$index), " reps)", sep = ""),
                       boot632 = paste("the bootstrap 632 rule (", length(modelFit$$control$$index), " reps)", sep = ""),
                       cv = paste("cross-validation (", modelFit$$control$$number, " fold)", sep = ""),
                       repeatedcv = paste("cross-validation (", modelFit$$control$$number, " fold, repeated ",
                         modelFit$$control$$repeats, " times)", sep = ""),
                       ## trainControl's p is assumed to be a proportion (e.g. 0.75),
                       ## so it is scaled to a percentage before the percent sign
                       lgocv = paste("repeated train/test splits (", length(modelFit$$control$$index), " reps, ",
                         round(100 * modelFit$$control$$p, 2), "$$\\%$$)", sep = ""))

tuneVars <- latexTranslate(tolower(modelInfo$$label))
tuneVars <- gsub("\\#", "the number of ", tuneVars, fixed = TRUE)
if(ncol(modelFit$$bestTune) == 1 && colnames(modelFit$$bestTune) == ".parameter")
  {
    summaryText <- paste(summaryText,
                         "\n\n",
                         "There are no tuning parameters associated with this model.",
                         "To characterize the model performance on the training set,",
                         resampleName,
                         "was used.",
                         "Table \\\\ref{T:resamps} and Figure \\\\ref{F:profile}",
                         "show summaries of the resampling results. ")

  } else {
    summaryText <- paste("There",
                         ifelse(nrow(modelInfo) > 1, "are", "is"),
                         nrow(modelInfo),
                         ifelse(nrow(modelInfo) > 1, "tuning parameters", "tuning parameter"),
                         "associated with this model:",
                         listString(tuneVars, period = TRUE))



    paramNames <- gsub(".", "", names(modelFit$$bestTune), fixed = TRUE)
    ## for(i in seq(along = paramNames))
    ##   {
    ##     check <- modelInfo$$parameter %in% paramNames[i]
    ##     if(any(check))
    ##       {
    ##         paramNames[i] <- modelInfo$$label[which(check)]
    ##       }
    ##   }

    paramNames <- gsub("#", "the number of ", paramNames, fixed = TRUE)
    ## Check to see if there was only one combination fit
    summaryText <- paste(summaryText,
                         "To choose",
                         ifelse(nrow(modelInfo) > 1,
                                "appropriate values of the tuning parameters,",
                                "an appropriate value of the tuning parameter,"),
                         resampleName,
                         "was used to generate a profile of performance across the",
                         nrow(modelFit$$results),
                         ifelse(nrow(modelInfo) > 1,
                                "combinations of the tuning parameters.",
                                "candidate values."),

                         "Table \\\\ref{T:resamps} and Figure \\\\ref{F:profile} show",
                         "summaries of the resampling profile. ", "The final model fitted to the entire training set was:",
                         listString(paste(latexTranslate(tolower(paramNames)), "=", modelFit$$bestTune[1,]), period = TRUE))

  }
@
878
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
879 \Sexpr{summaryText}
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
880
c3cbddd52970 planemo upload commit e713bcfa1b1690f9a21ad0bd796c2d385f646e66-dirty
anmoljh
parents:
diff changeset
<<resampTable, echo = $RESAMPTABLEECHO, results = $RESAMPTABLERESULT>>=
tableData <- modelFit$$results

## Models without tuning parameters: drop the placeholder "parameter" column.
if(all(modelInfo$$parameter == "parameter") && resampName == "boot632")
{
  tableData <- tableData[,-1, drop = FALSE]
  colNums <- c(length(modelFit$$perfNames), length(modelFit$$perfNames), length(modelFit$$perfNames))
  colLabels <- c("Mean", "Standard Deviation", "Apparent")
  constString <- ""
  isConst <- NULL
} else if (all(modelInfo$$parameter == "parameter") && resampName %in% c("boot", "cv", "repeatedcv")) {
  tableData <- tableData[,-1, drop = FALSE]
  colNums <- c(length(modelFit$$perfNames), length(modelFit$$perfNames))
  colLabels <- c("Mean", "Standard Deviation")
  constString <- ""
  isConst <- NULL
} else if (all(modelInfo$$parameter == "parameter") && resampName == "LOOCV") {
  tableData <- tableData[,-1, drop = FALSE]
  colNums <- length(modelFit$$perfNames)
  colLabels <- c("Measures")
  constString <- ""
  isConst <- NULL
} else {
  if (all(modelInfo$$parameter != "parameter") && resampName == "boot632") {
    ## Flag tuning parameters that took a single value across the grid.
    isConst <- apply(tableData[, modelInfo$$parameter, drop = FALSE],
                     2,
                     function(x) length(unique(x)) == 1)

    numParamInTable <- sum(!isConst)

    if(any(isConst))
    {
      constParam <- modelInfo$$parameter[isConst]
      constValues <- format(tableData[, constParam, drop = FALSE], digits = 4)[1,,drop = FALSE]
      tableData <- tableData[, !(names(tableData) %in% constParam), drop = FALSE]
      constString <- paste("The tuning",
                           ifelse(sum(isConst) > 1,
                                  "parameters",
                                  "parameter"),
                           listString(paste("``", names(constValues), "''", sep = "")),
                           ifelse(sum(isConst) > 1,
                                  "were",
                                  "was"),
                           "held constant at",
                           ifelse(sum(isConst) > 1,
                                  "values of",
                                  "a value of"),
                           listString(constValues[1,]))

    } else constString <- ""

    cn <- colnames(tableData)
    ## for(i in seq(along = cn))
    ## {
    ##   check <- modelInfo$$parameter %in% cn[i]
    ##   if(any(check))
    ##   {
    ##     cn[i] <- modelInfo$$label[which(check)]
    ##   }
    ## }
    ## colnames(tableData) <- cn

    colNums <- c(numParamInTable,
                 length(modelFit$$perfNames),
                 length(modelFit$$perfNames),
                 length(modelFit$$perfNames))
    colLabels <- c("", "Mean", "Standard Deviation", "Apparent")

  } else if (all(modelInfo$$parameter != "parameter") && resampName %in% c("boot", "repeatedcv", "cv")) {
    ## Flag tuning parameters that took a single value across the grid.
    isConst <- apply(tableData[, modelInfo$$parameter, drop = FALSE],
                     2,
                     function(x) length(unique(x)) == 1)

    numParamInTable <- sum(!isConst)

    if(any(isConst))
    {
      constParam <- modelInfo$$parameter[isConst]
      constValues <- format(tableData[, constParam, drop = FALSE], digits = 4)[1,,drop = FALSE]
      tableData <- tableData[, !(names(tableData) %in% constParam), drop = FALSE]
      constString <- paste("The tuning",
                           ifelse(sum(isConst) > 1,
                                  "parameters",
                                  "parameter"),
                           listString(paste("``", names(constValues), "''", sep = "")),
                           ifelse(sum(isConst) > 1,
                                  "were",
                                  "was"),
                           "held constant at",
                           ifelse(sum(isConst) > 1,
                                  "values of",
                                  "a value of"),
                           listString(constValues[1,]))

    } else constString <- ""

    cn <- colnames(tableData)
    ## for(i in seq(along = cn))
    ## {
    ##   check <- modelInfo$$parameter %in% cn[i]
    ##   if(any(check))
    ##   {
    ##     cn[i] <- modelInfo$$label[which(check)]
    ##   }
    ## }
    ## colnames(tableData) <- cn

    colNums <- c(numParamInTable,
                 length(modelFit$$perfNames),
                 length(modelFit$$perfNames))
    colLabels <- c("", "Mean", "Standard Deviation")

  } else if (all(modelInfo$$parameter != "parameter") && resampName == "LOOCV") {
    ## Flag tuning parameters that took a single value across the grid.
    isConst <- apply(tableData[, modelInfo$$parameter, drop = FALSE],
                     2,
                     function(x) length(unique(x)) == 1)

    numParamInTable <- sum(!isConst)

    if(any(isConst))
    {
      constParam <- modelInfo$$parameter[isConst]
      constValues <- format(tableData[, constParam, drop = FALSE], digits = 4)[1,,drop = FALSE]
      tableData <- tableData[, !(names(tableData) %in% constParam), drop = FALSE]
      constString <- paste("The tuning",
                           ifelse(sum(isConst) > 1,
                                  "parameters",
                                  "parameter"),
                           listString(paste("``", names(constValues), "''", sep = "")),
                           ifelse(sum(isConst) > 1,
                                  "were",
                                  "was"),
                           "held constant at",
                           ifelse(sum(isConst) > 1,
                                  "values of",
                                  "a value of"),
                           listString(constValues[1,]))

    } else constString <- ""

    cn <- colnames(tableData)
    ## for(i in seq(along = cn))
    ## {
    ##   check <- modelInfo$$parameter %in% cn[i]
    ##   if(any(check))
    ##   {
    ##     cn[i] <- modelInfo$$label[which(check)]
    ##   }
    ## }
    ## colnames(tableData) <- cn

    colNums <- c(numParamInTable,
                 length(modelFit$$perfNames))
    colLabels <- c("", "Measures")

  }

}

colnames(tableData) <- gsub("SD$$", "", colnames(tableData))
colnames(tableData) <- gsub("Apparent$$", "", colnames(tableData))
colnames(tableData) <- latexTranslate(colnames(tableData))
rownames(tableData) <- latexTranslate(rownames(tableData))

latex(tableData,
      rowname = NULL,
      file = "",
      cgroup = colLabels,
      n.cgroup = colNums,
      where = "h!",
      digits = 4,
      longtable = nrow(tableData) > 30,
      caption = paste(resampleName, "results from the model fit.", constString),
      label = "T:resamps")
@

\setkeys{Gin}{ width = 0.9\textwidth}
\begin{figure}[b]
\begin{center}

<<profilePlot, echo = $PROFILEPLOTECHO, fig = $PROFILEPLOTFIG, width = 8, height = 6>>=
trellis.par.set(caretTheme(), warn = TRUE)
if(all(modelInfo$$parameter == "parameter") || all(isConst) || modName == "nb")
{
  ## Nothing was tuned: show the distribution of the resampled performance.
  resultsPlot <- resampleHist(modelFit)
  plotCaption <- paste("Distributions of model performance from the ",
                       "training set estimated using ",
                       resampleName)
} else {
  if(modName %in% c("svmPoly", "svmRadial", "svmLinear"))
  {
    ## SVM tuning parameters are plotted on a log10 scale.
    resultsPlot <- plot(modelFit,
                        metric = optMetric,
                        xTrans = function(x) log10(x))
    resultsPlot <- update(resultsPlot,
                          type = c("g", "p", "l"),
                          ylab = paste(optMetric, " (", resampleName, ")", sep = ""))

  } else {
    resultsPlot <- plot(modelFit,
                        metric = optMetric)
    resultsPlot <- update(resultsPlot,
                          type = c("g", "p", "l"),
                          ylab = paste(optMetric, " (", resampleName, ")", sep = ""))
  }
  plotCaption <- paste("A plot of the estimates of the",
                       optMetric,
                       "values calculated using",
                       resampleName)
}
print(resultsPlot)
@
\caption[Performance Plot]{\Sexpr{plotCaption}.}
\label{F:profile}
\end{center}
\end{figure}


<<stopWorkers, echo = $STOPWORKERSECHO, results = $STOPWORKERSRESULT>>=
if(numWorkers > 1) stopCluster(cl)
@

<<testPred, results = $TESTPREDRESULT, echo = $TESTPREDECHO>>=
# Save the training data (and the test data, when present) for reuse
if((!is.null(testX)) && (!is.null(testY))){
  save(trainX,trainY,testX,testY,file="datasets.RData")
} else {
  save(trainX,trainY,file="datasets.RData")
}

if(pctTrain < 1)
{
  cat("\\clearpage\n\\section*{Test Set Results}\n\n")
  # Predict the held-out samples and summarise performance
  classPred <- predict(modelFit, testX)
  cm <- confusionMatrix(classPred, testY)
  values <- cm$$overall[c("Accuracy", "Kappa", "AccuracyPValue", "McnemarPValue")]

  values <- values[!is.na(values) & !is.nan(values)]
  values <- c(format(values[1:2], digits = 3),
              format.pval(values[-(1:2)], digits = 5))
  nms <- c("the overall accuracy", "the Kappa statistic",
           "the $$p$$--value that accuracy is greater than the no--information rate",
           "the $$p$$--value of concordance from McNemar's test")
  nms <- nms[seq(along = values)]
  names(values) <- nms

  if(any(modelInfo$$probModel))
  {
    # Class probabilities for the test set only
    classProbs <- extractProb(list(fit = modelFit),
                              testX = testX,
                              testY = testY)
    classProbs <- subset(classProbs, dataType == "Test")
    if(numClasses == 2)
    {
      tmp <- twoClassSummary(classProbs, lev = levels(classProbs$$obs))
      tmp <- c(format(tmp, digits = 3))
      names(tmp) <- c("the area under the ROC curve", "the sensitivity", "the specificity")

      values <- c(values, tmp)

    }
    probPlot <- plotClassProbs(classProbs)
  }
  # Build the sentence that is inserted after this chunk via \Sexpr
  testString <- paste("Based on the test set of",
                      nrow(testX),
                      "samples,",
                      listString(paste(names(values), "was", values), period = TRUE),
                      "The confusion matrix for the test set is shown in Table",
                      "\\\\ref{T:cm}.")
  testString <- paste(testString,
                      " Using ", resampleName,
                      ", the training set estimates were ",
                      resampleStats(modelFit),
                      ".",
                      sep = "")

  # paste0 keeps the closing period attached to the figure reference
  if(any(modelInfo$$probModel)) testString <- paste(testString,
                                                    "Histograms of the class probabilities",
                                                    "for the test set samples are shown in",
                                                    paste0("Figure \\\\ref{F:probs}",
                                                           ifelse(numClasses == 2,
                                                                  " and the test set ROC curve is in Figure \\\\ref{F:roc}.",
                                                                  ".")))

  latex(cm$$table,
        title = "",
        file = "",
        where = "h",
        cgroup = "Observed Values",
        n.cgroup = numClasses,
        caption = "The confusion matrix for the test set",
        label = "T:cm")

} else testString <- ""
@
\Sexpr{testString}


<<classProbsTex, results = $CLASSPROBSTEXRESULT, echo = $CLASSPROBSTEXECHO>>=
# Emit the LaTeX figure for the class probability histograms
if(any(modelInfo$$probModel) && pctTrain < 1 ) {
  cat(
    paste("\\begin{figure}[p]\n",
          "\\begin{center}\n",
          "\\includegraphics{classProbs}",
          "\\caption[Class Probabilities Plot]{Class probabilities",
          "for the test set. Each panel contains ",
          "separate classes}\n",
          "\\label{F:probs}\n",
          "\\end{center}\n",
          "\\end{figure}"))
}
# Emit the LaTeX figure for the ROC curve (two-class models only)
if(any(modelInfo$$probModel) && numClasses == 2 && pctTrain < 1 )
{
  cat(
    paste("\\begin{figure}[p]\n",
          "\\begin{center}\n",
          "\\includegraphics[clip, width = .8\\textwidth]{roc}",
          "\\caption[ROC Plot]{ROC Curve",
          "for the test set.}\n",
          "\\label{F:roc}\n",
          "\\end{center}\n",
          "\\end{figure}"))
} else {
  cat (paste(""))
}

@
<<classProbsTex, results = $CLASSPROBSTEXRESULT1, echo = $CLASSPROBSTEXECHO1 >>=
# Write the class probability histograms to classProbs.pdf
if(any(modelInfo$$probModel) && pctTrain < 1) {
  pdf("classProbs.pdf", height = 7, width = 7)
  trellis.par.set(caretTheme(), warn = FALSE)
  print(probPlot)
  dev.off()
}
# Write the test set ROC curve to roc.pdf (two-class models only)
if(any(modelInfo$$probModel) && numClasses == 2 && pctTrain < 1) {
  resPonse<-testY
  preDictor<-classProbs[, levels(trainY)[1]]
  pdf("roc.pdf", height = 8, width = 8)
  # from pROC example at http://web.expasy.org/pROC/screenshots.htm
  plot.roc(resPonse, preDictor, # data
           percent=TRUE, # show all values in percent
           partial.auc=c(100, 90), partial.auc.correct=TRUE, # define a partial AUC (pAUC)
           print.auc=TRUE, #display pAUC value on the plot with following options:
           print.auc.pattern="Corrected pAUC (100-90%% SP):\n%.1f%%", print.auc.col="#1c61b6",
           auc.polygon=TRUE, auc.polygon.col="#1c61b6", # show pAUC as a polygon
           max.auc.polygon=TRUE, max.auc.polygon.col="#1c61b622", # also show the 100% polygon
           main="Partial AUC (pAUC)")
  plot.roc(resPonse, preDictor,
           percent=TRUE, add=TRUE, type="n", # add to plot, but don't re-add the ROC itself (useless)
           partial.auc=c(100, 90), partial.auc.correct=TRUE,
           partial.auc.focus="se", # focus pAUC on the sensitivity
           print.auc=TRUE, print.auc.pattern="Corrected pAUC (100-90%% SE):\n%.1f%%", print.auc.col="#008600",
           print.auc.y=40, # do not print auc over the previous one
           auc.polygon=TRUE, auc.polygon.col="#008600",
           max.auc.polygon=TRUE, max.auc.polygon.col="#00860022")
  dev.off()
} else {
  cat("")
}

@

\section*{Versions}

<<versions, echo = FALSE, results = tex>>=
toLatex(sessionInfo())

@

<<save-data, echo = $SAVEDATAECHO, results = $SAVEDATARESULT>>=
## Save the fitted model under the name Fit
Fit <- modelFit
## cm is only created when a test set was held out (pctTrain < 1);
## without this guard, save() would pick up the unrelated grDevices function cm
if(!exists('cm', inherits = FALSE)) cm <- NULL
if(exists('ppInfo') && !is.null(ppInfo)){
  save(Fit,ppInfo,cm,file="$METHOD-Fit.RData")
} else {
  save(Fit,cm,file="$METHOD-Fit.RData")
}

@
The model was built using $METHOD and is saved as $METHOD Model for reuse. The saved file contains the variable Fit.

\end{document}'''

    return template4Rnw