annotate EDeN_train.xml @ 0:99091a5d5c84 draft

Uploaded
author bgruening
date Wed, 04 Sep 2013 05:10:04 -0400
parents
children a3edc97e056c
99091a5d5c84 Uploaded
bgruening
parents:
diff changeset
<tool id="bg_eden_train" name="EDeN Train" version="0.1">
    <description></description>
    <requirements>
    </requirements>
    <command>
        EDeN --action TRAIN

        --input_data_file_name $infile
        --file_type "SPARSE_VECTOR"
        --binary_file_type

        ## TODO: we need a tool that creates such a file, maybe from the metadata of an SDF file
        ## target_file_name is a file with 1 or -1 in each row, indicating the class
        --target_file_name $target_infile
        --model_file_name $model_outfile

        --lambda $lambda ##??? notation?
        --epochs $epoch

        --sparsification_num_iterations $sparsification_num_iterations
        --topological_regularization_num_neighbors $topological_regularization_num_neighbors
        --topological_regularization_decay_rate $topological_regularization_decay_rate

        --num_iterations $num_iterations
        --threshold $threshold
        --only_positive $only_positive
        --only_negative $only_negative

        --random_seed $random_seed

    </command>
    <inputs>
        <param format="eden_sparse_vector" name="infile" type="data" label="Input Graph" help=""/>
        <param format="txt" name="target_infile" type="data" label="Target file" help=""/>

        <param name="kernel_type" type="select" display="radio" label="Type of the Kernel">
            <option value="NSPDK">NSPDK</option>
            <option value="WDK">WDK</option>
            <option value="PBK">PBK</option>
            <option value="USPK">USPK</option>
            <option value="DDK">DDK</option>
            <option value="NSDDK">ANSDDK</option>
            <option value="SK">SK [NSPDK]</option>
        </param>

        <param name="graph_type" type="select" display="radio" label="Type of Graph">
            <option value="DIRECTED">directed</option>
            <option value="UNDIRECTED">undirected</option>
        </param>

        <param name="epoch" type="integer" value="10" label="Number of epochs for the stochastic gradient descent algorithm" help="">
            <validator type="in_range" min="1" />
        </param>
        <param name="lambda" type="text" value="1e-4" label="Lambda parameter of the stochastic gradient descent algorithm" help="" />

    </inputs>
    <outputs>
        <data format="txt" name="model_outfile" label="Train Model from ${on_string}"/>
    </outputs>
    <tests>
        <test>
            <param name="infile" value="3_molceuls.sdf" />
            <output name="outfile" file="3_molecules.gspan" />
        </test>
    </tests>
    <help>

.. class:: infomark

**What it does**

The linear model is induced using the accelerated stochastic gradient descent technique by Léon Bottou and Yann LeCun.
When the target of an instance is 0, a self-training algorithm is used to impute a positive or negative class to that unlabeled instance.
If the target information is imbalanced, a minority class resampling technique is used to rebalance the training set.

This tool is part of the EDeN (Explicit Decomposition with Neighborhoods) suite, developed by Fabrizio Costa.


REFERENCES
==========

The code for Stochastic Gradient Descent SVM is adapted from http://leon.bottou.org/projects/sgd.

Léon Bottou and Yann LeCun, "Large Scale Online Learning", Advances in Neural Information Processing Systems 16, edited by Sebastian Thrun, Lawrence Saul and Bernhard Schölkopf, MIT Press, Cambridge, MA, 2004.

    </help>
</tool>
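
The help text says the linear model is trained with stochastic gradient descent, with `--lambda` and `--epochs` as its parameters and a target file holding one label per row. The sketch below illustrates that general idea for a linear SVM with hinge loss. It is not EDeN's code: the function names (`parse_targets`, `train_sgd_svm`, `predict`), the dict-based sparse vectors, the `eta0` step-size parameter, and the choice to skip instances with target 0 instead of self-training are all assumptions made for illustration.

```python
import random


def parse_targets(lines):
    """Parse one integer label per non-empty row (1, -1, or 0 for unlabeled)."""
    return [int(line) for line in lines if line.strip()]


def dot(weights, x):
    """Dot product between a weight dict and a sparse vector dict."""
    return sum(weights.get(f, 0.0) * v for f, v in x.items())


def train_sgd_svm(instances, targets, lam=1e-4, epochs=10, eta0=0.1, seed=1):
    """Train a linear SVM by SGD on the L2-regularized hinge loss.

    instances: list of sparse vectors, each a dict {feature_id: value}
    targets:   list of labels in {1, -1, 0}
    Returns (weights, bias), where weights is a dict {feature_id: weight}.
    """
    rng = random.Random(seed)
    weights, bias = {}, 0.0
    # Assumption: instances with target 0 are skipped; no self-training here.
    order = [i for i, y in enumerate(targets) if y != 0]
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = instances[i], targets[i]
            eta = eta0 / (1.0 + lam * eta0 * t)  # slowly decaying step size
            t += 1
            violated = y * (dot(weights, x) + bias) < 1.0
            # Gradient step on the L2 penalty: shrink every weight.
            shrink = 1.0 - eta * lam
            for f in weights:
                weights[f] *= shrink
            # Gradient step on the hinge loss, only if the margin is violated.
            if violated:
                for f, v in x.items():
                    weights[f] = weights.get(f, 0.0) + eta * y * v
                bias += eta * y
    return weights, bias


def predict(weights, bias, x):
    """Return 1 or -1 depending on the sign of the linear score."""
    return 1 if dot(weights, x) + bias >= 0.0 else -1


if __name__ == "__main__":
    # Toy data: feature 1 marks the positive class, feature 2 the negative one.
    data = [{1: 1.0}, {1: 2.0, 3: 1.0}, {2: 1.0}, {2: 2.0, 3: 1.0}]
    labels = parse_targets(["1", "1", "-1", "-1"])
    w, b = train_sgd_svm(data, labels, lam=1e-4, epochs=50)
    print([predict(w, b, x) for x in data])
```

In this sketch `lam` plays the role of the regularization strength and `epochs` the number of passes over the training set; whether EDeN interprets `--lambda` and `--epochs` in exactly this way is not confirmed by the tool definition.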