<tool name="GCA: Gene centered annotation" id="ceas_gca" version="0.1.0">
  <description>Find the nearest interval in the given interval set for every annotated coding gene</description>
  <macros>
    <import>ceas_macros.xml</import>
  </macros>
  <expand macro="requirements" />
  <command>
    gca -b $bfile --span=$span
    #include source=$gtpath_ceasdb_ref#
    --name=$name &amp;> $log
  </command>
  <inputs>
    <param name="name" type="hidden" value="gca_out"/>
    <param ftype="bed" format="bed" name="bfile" type="data" label="BED file (100,000 lines max)">
      <validator type="unspecified_build" />
    </param>
    <expand macro="ceasdb_ref" />
    <param name="span" type="integer" label="Span" value="3000">
      <validator type="in_range" max="1000000" min="100" message="Span is out of range. Span must be between 100 and 1000000." />
    </param>
  </inputs>
  <outputs>
    <data format="xls" name="output" from_work_dir="gca_out.xls"/>
    <data format="txt" name="log" label="GCA job log"/>
  </outputs>
  <expand macro="stdio" />
  <tests>
    <test maxseconds="3600" name="GCA_1">
      <param name="bfile" value="peaks.bed" />
      <param name="span" value="3000" />
      <param name="refsrc" value="history"/>
      <param name="gdb" ftype="ceasdb" value="mm9.refGene.ceasdb"/>
      <output name="output">
        <assert_contents>
          <has_text_matching expression="NM_013495\tchr19\t3323300\t3385733\t\+\t2994\t754\t31798\t224353\t0\.07\t0\.26\t0\.12\t0\.03\t0\.0\t0\.0\t0\.0" />
        </assert_contents>
      </output>
    </test>
  </tests>
  <help>
This tool finds, for every annotated coding gene, the nearest binding
site in the given BED file. It is a module of the CEAS package, which
was written by Hyunjin Gene Shin and published in Bioinformatics
(PubMed ID: 19689956).

@EXTERNAL_DOCUMENTATION@

@CITATION_SECTION@

.. class:: warningmark

**NEED IMPROVEMENT**

-----


**Parameters**

- **BED file** contains the transcription factor binding sites,
  typically the peak BED file produced by a peak-calling tool
  (100,000 lines max).
- **Span** is the distance in bp from the TSS and TTS within which
  ChIP regions are searched. The default is 3000; allowed values
  range from 100 to 1000000.
- **Genome Annotation Version** specifies the gene annotation that
  matches the data set. The annotations are downloaded from the
  UCSC genome site.

-----


**Output**

- **XLS file** is a tab-delimited text file containing the
  gene-centered annotation.
- **GCA job log** is a text file with the messages printed by gca
  during the run.

-----


**script parameter list of GCA**

Options:
  --version             show program's version number and exit
  -h, --help            Show this help message and exit.
  -b BED, --bed=BED     BED file of ChIP regions.
  -g GDB, --gt=GDB      Gene annotation table. This can be a sqlite3 local db
                        file, BED file or genome version of UCSC. The BED file
                        must have an extension of '.bed'
  --span=SPAN           Span in search of ChIP regions from TSS and TTS,
                        DEFAULT=3000bp
  --name=NAME           Experiment name. This will be used to name the output
                        file. If an experiment name is not given, input BED
                        file name will be used instead.
  --gn-group=GN_GROUP   A particular group of genes of interest. If a txt file
                        with one column of gene names (eg RefSeq IDs in case of
                        using a refGene table) is given, gca returns the
                        gene-centered annotation of this particular gene group.
  --gname2=NAME2        The gene names of --gn-group will be regarded as
                        'name2.' See the schema of the gene annotation table.

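
**Example command line**

A minimal sketch of the call this wrapper assembles, assuming ``gca`` is on the PATH and that the selected reference is passed through the ``-g`` option listed above. The file names ``peaks.bed`` and ``mm9.refGene.ceasdb`` are placeholders for your own inputs::

    gca -b peaks.bed -g mm9.refGene.ceasdb --span=3000 --name=gca_out

With ``--name=gca_out`` the wrapper expects the annotation table in ``gca_out.xls`` in the working directory.
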
  </help>
</tool>