annotate GALAXY_FILES/tools/EMBER/PreProcess_Expression_Data.xml @ 1:e62b2ba92070 default tip

Uploaded
author mmaiensc
date Thu, 22 Mar 2012 13:19:59 -0400
<tool id="prep_data" name="PreProcess Expression Data" version="1.3.1">
  <description>Step 1 of analysis: discretizes expression data</description>
  <command interpreter="perl">PreProcess_Expression_Data.pl -i $data -c $compslist -a $annot -o $output -p $thresh -l $log -v n</command>
  <inputs>
    <param format="txt" name="data" type="data" label="Expression data"/>
    <param format="txt" name="compslist" type="data" label="Comparison list"/>
    <param format="txt" name="annot" type="data" label="Annotation file"/>
    <param name="thresh" type="float" min="0" max="1" label="Percentile threshold" value="0.63" optional="true"/>
    <param name="log" type="select" label="Log transform data?">
      <option value="n" selected="true">No</option>
      <option value="y">Yes</option>
    </param>
  </inputs>
  <outputs>
    <data format="txt" name="output"/>
  </outputs>

  <tests>
    <test>
      <param name="data" value="EMBER/expression.txt"/>
      <param name="compslist" value="EMBER/comparisons_list.txt"/>
      <param name="annot" value="EMBER/annotation.txt"/>
      <param name="thresh" value="0.63"/>
      <param name="log" value="n"/>
      <output name="output" file="EMBER/expression_profiles.txt"/>
    </test>
  </tests>

<help>

This tool discretizes the gene expression data and adds genomic annotations.

More options for the EMBER tools (especially for the main program, EMBER, including searching for multiple expression patterns) are available in the command-line version, which can be downloaded from http://dinner-group.uchicago.edu/downloads.html. That package also includes test data and sample outputs.

When using any of the EMBER tools, please cite: M Maienschein-Cline, J Zhou, KP White, R Sciammas, and AR Dinner. Discovering transcription factor regulatory targets using gene expression and binding data. *Bioinformatics*, 28:206-213 (2012).

-----

Description of inputs:

*Expression Data*:

Microarray data from N experiments, with at least 2 replicates per condition.

*Format (N+1 columns)*: [ID] [expt 1 value] [expt 2 value] ... [expt N value]

IMPORTANT: the first line should be a title line whose first field is "#ID" and whose subsequent fields give the condition and replicate for each column, i.e.,

#ID [condition]#[replicate]...

where [condition] matches the values in the Comparison List and [replicate] is the replicate number of that column. [condition] and [replicate] are delimited by a "#" (so don't use that character in the condition name).
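
As an illustration of the title-line convention, the following minimal Python sketch splits a title line into (condition, replicate) pairs. It is not part of the EMBER package: the function name and the sample condition names are hypothetical, and it assumes the fields are separated by whitespace.

```python
def parse_title_line(line):
    """Split a title line into (condition, replicate) pairs, one per data column."""
    fields = line.split()  # assumption: fields are whitespace-delimited
    if not fields or fields[0] != "#ID":
        raise ValueError('title line must start with "#ID"')
    columns = []
    for field in fields[1:]:
        # Exactly one "#" separates the condition name from the replicate label,
        # which is why the condition name itself must not contain "#".
        if field.count("#") != 1:
            raise ValueError("expected [condition]#[replicate], got %r" % field)
        condition, replicate = field.split("#")
        columns.append((condition, replicate))
    return columns


# Hypothetical condition names, for illustration only.
print(parse_title_line("#ID wt#1 wt#2 ko#1 ko#2"))
```

A condition name that itself contains "#" would produce a field with more than one delimiter, which this sketch rejects.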

*Comparison List*:

List of behavior dimension definitions, one pair of conditions per line. Each [condition] should match a condition name in the title line of the expression data.

*Format (2 columns)*: [condition1] [condition2]

*Annotation File*:

Gives the genomic coordinates of each probe set.

*Format (6 columns)*: [probe id] [gene name] [chromosome] [start] [end] [strand]

*Percentile Threshold* (p):

Used to eliminate genes that are consistently expressed at a very low level. All data are concatenated into one list, and the pth percentile of that list is taken as the threshold. Then a probe set is removed if its value is less than the threshold in ALL conditions.

p = 1.0 means all probes are retained, p = 0.0 means none are. However, note that this does NOT necessarily imply that 0.63 means 63% of probe sets are retained.
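
The filtering rule can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the logic of PreProcess_Expression_Data.pl: the function name and data are hypothetical, every value in a row is treated as its own condition (replicate handling is not shown), and the cutoff is placed at the (1 - p) quantile of the pooled values so that p = 1.0 retains every probe set, as described above. At p = 0.0 this sketch still keeps the probe set holding the maximum value, so the real script may treat the extremes differently.

```python
def filter_probes(rows, p):
    """Keep probe sets with at least one value at or above the threshold.

    rows maps a probe id to its list of expression values.
    """
    # Pool every value from every probe set into one sorted list.
    pooled = sorted(v for values in rows.values() for v in values)
    # Assumption: the cutoff sits at the (1 - p) quantile, so p = 1.0 puts it
    # at the minimum and every probe set survives.
    index = int(round((1.0 - p) * (len(pooled) - 1)))
    threshold = pooled[index]
    # A probe set is removed only if all of its values are below the threshold.
    kept = {pid: vals for pid, vals in rows.items() if max(vals) >= threshold}
    return threshold, kept


# Made-up expression values, for illustration only.
rows = {"a": [1, 2], "b": [3, 10], "c": [4, 5], "d": [1, 1]}
threshold, kept = filter_probes(rows, 0.63)
print(threshold, sorted(kept))
```

With these made-up values, p = 0.63 gives a threshold of 2 and keeps three of the four probe sets (75%), which illustrates why p is not the retained fraction: a probe set survives as long as any one of its values reaches the threshold.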

*Log Transform*: whether or not to take the log of the data before discretization.

</help>

</tool>