annotate GALAXY_FILES/tools/PreProcess_Expression_Data.xml @ 0:1ef24fd0c914

Uploaded
author mmaiensc
date Wed, 29 Feb 2012 14:46:05 -0500
rev   line source
0
1ef24fd0c914 Uploaded
mmaiensc
1 <tool id="prep_data" name="PreProcess Expression Data" version="1.3">
2 <description>Combines gene expression data</description>
3 <command interpreter="perl">PreProcess_Expression_Data.pl -i $data -c $compslist -a $annot -o $output -p $thresh -l $log -v n</command>
4 <inputs>
5 <param format="txt" name="data" type="data" label="Expression data"/>
6 <param format="txt" name="compslist" type="data" label="Comparison list"/>
7 <param format="txt" name="annot" type="data" label="Annotation file"/>
8 <param name="thresh" type="float" min="0" max="1" label="Percentile threshold" value="0.63" optional="true"/>
9 <param name="log" type="select" label="Log transform data?">
10 <option value="n" selected="true">No</option>
11 <option value="y">Yes</option>
12 </param>
13 </inputs>
14 <outputs>
15 <data format="txt" name="output"/>
16 </outputs>
17
18 <tests>
19 <test>
20 <param name="data" value="EMBER/expression.txt"/>
21 <param name="compslist" value="EMBER/comparisons_list.txt"/>
22 <param name="annot" value="EMBER/annotation.txt"/>
23 <param name="thresh" value="0.63"/>
24 <param name="log" value="n"/>
25 <output name="output" file="EMBER/expression_profiles.txt"/>
26 </test>
27 </tests>
28
29 <help>
30
31 This tool discretizes the gene expression data and adds genomic annotations.
32
33 More options for the EMBER tools (especially for the main program, EMBER, including searching for multiple expression patterns) are provided in the command-line version, available at http://dinner-group.uchicago.edu/downloads.html. That package also includes test data and sample outputs.
34
35 When using any of the EMBER tools, please cite: M Maienschein-Cline, J Zhou, KP White, R Sciammas, and AR Dinner. Discovering transcription factor regulatory targets using gene expression and binding data. *Bioinformatics*, 28:206-213 (2012).
36
37 -----
38
39 Description of inputs:
40
41 *Expression Data*:
42
43 Microarray data, with data from N experiments (and at least 2 replicates per condition).
44
45 *Format (N+1 columns)*: [ID] [expt 1 value] [expt 2 value] ... [expt N value]
46
47 IMPORTANT: the first line should be a title line, first field "#ID", and subsequent fields giving the condition/replicate for each column, i.e.,
48
49 #ID [condition]#[replicate]...
50
51 where [condition] matches the values in the Comparison List and [replicate] is the replicate number of that column. [condition] and [replicate] are delimited by a "#", so do not use that character in the condition name.
52
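The title line convention described above can be checked before a file is uploaded. The following is only a minimal sketch, not the parsing done by PreProcess_Expression_Data.pl: the function names, the tab delimiter, and the condition names "wt" and "ko" are assumptions made for illustration.

```python
from collections import Counter


def parse_title_line(line, delimiter="\t"):
    """Split a '#ID cond#rep ...' title line into (condition, replicate) pairs.

    Assumption: fields are tab-delimited; the help text does not state the delimiter.
    """
    fields = line.rstrip("\n").split(delimiter)
    if fields[0] != "#ID":
        raise ValueError("the first field of the title line must be '#ID'")
    columns = []
    for field in fields[1:]:
        # The condition name may not contain '#', so the first '#' is the separator.
        condition, sep, replicate = field.partition("#")
        if not (sep and condition and replicate):
            raise ValueError(f"expected [condition]#[replicate], got {field!r}")
        columns.append((condition, replicate))
    return columns


def under_replicated(columns, minimum=2):
    """Return the conditions that have fewer replicate columns than `minimum`."""
    counts = Counter(condition for condition, _ in columns)
    return sorted(c for c, n in counts.items() if minimum > n)


# Hypothetical title line with two conditions, one of them lacking a second replicate.
columns = parse_title_line("#ID\twt#1\twt#2\tko#1")
print(columns)                    # [('wt', '1'), ('wt', '2'), ('ko', '1')]
print(under_replicated(columns))  # ['ko']
```

A non-empty result from `under_replicated` flags conditions that do not meet the requirement of at least 2 replicates per condition.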
53 *Comparison List*:
54
55 List of behavior dimension definitions. Each [condition] should match a condition name used in the title line of the Expression Data.
56
57 *Format (2 columns)*: [condition1] [condition2]
58
59 *Annotation File*:
60
61 Gives the genomic coordinates of each probe set.
62
63 *Format (6 columns)*: [probe id] [gene name] [chromosome] [start] [end] [strand]
64
65 *Percentile Threshold* (p):
66
67 Used to eliminate genes that are consistently expressed at a very low level. All data are concatenated into one list, and the pth percentile of that list is taken as the threshold. A probe set is then removed if its value is less than the threshold in ALL conditions.
68
69 p = 1.0 means all probes are retained, p = 0.0 means none are. However, note that this does NOT necessarily imply that 0.63 means 63% of probe sets are retained.
70
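The filtering rule above can be illustrated with a small sketch. It is not the code in PreProcess_Expression_Data.pl, and it rests on two assumptions: the percentile is counted from the largest pooled value downward (the reading under which p = 1.0 retains every probe set and p = 0.0 retains none), and every column value is compared directly rather than a per-condition average. The function names and the toy data are hypothetical.

```python
import math


def percentile_cutoff(rows, p):
    """Cutoff chosen so that about a fraction p of all pooled values lie at or above it.

    Assumption: the percentile is counted from the largest value down.
    """
    pooled = sorted((v for values in rows.values() for v in values), reverse=True)
    if p == 0 or not pooled:
        return math.inf  # no value can reach an infinite cutoff
    k = min(len(pooled), math.ceil(p * len(pooled)))
    return pooled[k - 1]


def filter_probes(rows, p):
    """Drop a probe set only when every one of its values is below the cutoff."""
    cutoff = percentile_cutoff(rows, p)
    return {probe: values for probe, values in rows.items()
            if any(v >= cutoff for v in values)}


# Hypothetical toy data: probe id -> expression values across columns.
rows = {"a": [1, 2], "b": [5, 6], "c": [3, 9], "d": [0.5, 0.2]}
print(sorted(filter_probes(rows, 0.25)))  # ['b', 'c']
print(len(filter_probes(rows, 1.0)))      # 4
print(len(filter_probes(rows, 0.0)))      # 0
```

With p = 0.25 the toy data keeps 2 of 4 probe sets, which shows why p is not the retained fraction: the cutoff is computed over pooled values, while removal is decided per probe set.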
71 *Log Transform*: whether or not to take the log of the data before discretization.
72
73 </help>
74
75 </tool>
76