comparison annotation_profiler.xml @ 0:4414f0739808 draft default tip
Imported from capsule None
| author | devteam |
|---|---|
| date | Mon, 19 May 2014 10:59:42 -0400 |
| parents | |
| children | |
comparison
| -1:000000000000 | 0:4414f0739808 |
|---|---|
| 1 <tool id="Annotation_Profiler_0" name="Profile Annotations" version="1.0.0"> | |
| 2 <description>for a set of genomic intervals</description> | |
| 3 <requirements> | |
| 4 <requirement type="package" version="0.7.1">bx-python</requirement> | |
| 5 </requirements> | |
| 6 <command interpreter="python">annotation_profiler_for_interval.py -i $input1 -c ${input1.metadata.chromCol} -s ${input1.metadata.startCol} -e ${input1.metadata.endCol} -o $out_file1 $keep_empty -p ${GALAXY_DATA_INDEX_DIR}/annotation_profiler/$dbkey $summary -b 3 -t $table_names</command> | |
| 7 <inputs> | |
| 8 <param format="interval" name="input1" type="data" label="Choose Intervals"> | |
| 9 <validator type="dataset_metadata_in_file" filename="annotation_profiler_valid_builds.txt" metadata_name="dbkey" metadata_column="0" message="Profiling is not currently available for this species."/> | |
| 10 </param> | |
| 11 <param name="keep_empty" type="select" label="Keep Region/Table Pairs with 0 Coverage"> | |
| 12 <option value="-k">Keep</option> | |
| 13 <option value="" selected="true">Discard</option> | |
| 14 </param> | |
| 15 <param name="summary" type="select" label="Output per Region/Summary"> | |
| 16 <option value="-S">Summary</option> | |
| 17 <option value="" selected="true">Per Region</option> | |
| 18 </param> | |
| 19 <param name="table_names" type="drill_down" display="checkbox" hierarchy="recurse" multiple="true" label="Choose Tables to Use" help="Selecting no tables will result in using all tables." from_file="annotation_profiler_options.xml"/> | |
| 20 </inputs> | |
| 21 <outputs> | |
| 22 <data format="input" name="out_file1"> | |
| 23 <change_format> | |
| 24 <when input="summary" value="-S" format="tabular" /> | |
| 25 </change_format> | |
| 26 </data> | |
| 27 </outputs> | |
| 28 <tests> | |
| 29 <test> | |
| 30 <param name="input1" value="4.bed" dbkey="hg18"/> | |
| 31 <param name="keep_empty" value=""/> | |
| 32 <param name="summary" value=""/> | |
| 33 <param name="table_names" value="acembly,affyGnf1h,knownAlt,knownGene,mrna,multiz17way,multiz28way,refGene,snp126"/> | |
| 34 <output name="out_file1" file="annotation_profiler_1.out" /> | |
| 35 </test> | |
| 36 <test> | |
| 37 <param name="input1" value="3.bed" dbkey="hg18"/> | |
| 38 <param name="keep_empty" value=""/> | |
| 39 <param name="summary" value="Summary"/> | |
| 40 <param name="table_names" value="acembly,affyGnf1h,knownAlt,knownGene,mrna,multiz17way,multiz28way,refGene,snp126"/> | |
| 41 <output name="out_file1" file="annotation_profiler_2.out" /> | |
| 42 </test> | |
| 43 </tests> | |
| 44 <help> | |
| 45 **What it does** | |
| 46 | |
| 47 Takes an input set of intervals and, for each interval, determines its base coverage by a set of features (tables) available from UCSC. Genomic regions within each feature table have been merged where they overlap or are directly adjacent (e.g. a table having ranges of 1-10, 6-12, 12-20 and 25-28 results in two merged ranges: 1-20 and 25-28). | |
| 48 | |
| 49 By default, this tool will check the coverage of your intervals against all available features; you may, however, choose to select only those tables that you want to include. Selecting a section heading will effectively cause all of its children to be selected. | |
| 50 | |
| 51 You may alternatively choose to receive a summary across all of the intervals that you provide. | |
| 52 | |
| 53 ----- | |
| 54 | |
| 55 **Example** | |
| 56 | |
| 57 Using the interval below and selecting several tables:: | |
| 58 | |
| 59 chr1 4558 14764 uc001aab.1 0 - | |
| 60 | |
| 61 results in:: | |
| 62 | |
| 63 chr1 4558 14764 uc001aab.1 0 - snp126Exceptions 151 142 | |
| 64 chr1 4558 14764 uc001aab.1 0 - genomicSuperDups 10206 1 | |
| 65 chr1 4558 14764 uc001aab.1 0 - chainOryLat1 3718 1 | |
| 66 chr1 4558 14764 uc001aab.1 0 - multiz28way 10206 1 | |
| 67 chr1 4558 14764 uc001aab.1 0 - affyHuEx1 3553 32 | |
| 68 chr1 4558 14764 uc001aab.1 0 - netXenTro2 3050 1 | |
| 69 chr1 4558 14764 uc001aab.1 0 - intronEst 10206 1 | |
| 70 chr1 4558 14764 uc001aab.1 0 - xenoMrna 10203 1 | |
| 71 chr1 4558 14764 uc001aab.1 0 - ctgPos 10206 1 | |
| 72 chr1 4558 14764 uc001aab.1 0 - clonePos 10206 1 | |
| 73 chr1 4558 14764 uc001aab.1 0 - chainStrPur2Link 1323 29 | |
| 74 chr1 4558 14764 uc001aab.1 0 - affyTxnPhase3HeLaNuclear 9011 8 | |
| 75 chr1 4558 14764 uc001aab.1 0 - snp126orthoPanTro2RheMac2 61 58 | |
| 76 chr1 4558 14764 uc001aab.1 0 - snp126 205 192 | |
| 77 chr1 4558 14764 uc001aab.1 0 - chainEquCab1 10206 1 | |
| 78 chr1 4558 14764 uc001aab.1 0 - netGalGal3 3686 1 | |
| 79 chr1 4558 14764 uc001aab.1 0 - phastCons28wayPlacMammal 10172 3 | |
| 80 | |
| 81 Where:: | |
| 82 | |
| 83 The first added column is the table name. | |
| 84 The second added column is the number of bases covered by the table. | |
| 85 The third added column is the number of regions from the table that overlap the interval. | |
| 86 | |
| 87 Alternatively, requesting a summary, using the intervals below and selecting several tables:: | |
| 88 | |
| 89 chr1 4558 14764 uc001aab.1 0 - | |
| 90 chr1 4558 19346 uc001aac.1 0 - | |
| 91 | |
| 92 results in:: | |
| 93 | |
| 94 #tableName tableSize tableRegionCount allIntervalCount allIntervalSize allCoverage allTableRegionsOverlaped allIntervalsOverlapingTable nrIntervalCount nrIntervalSize nrCoverage nrTableRegionsOverlaped nrIntervalsOverlapingTable | |
| 95 snp126Exceptions 133601 92469 2 24994 388 359 2 1 14788 237 217 1 | |
| 96 genomicSuperDups 12268847 657 2 24994 24994 2 2 1 14788 14788 1 1 | |
| 97 chainOryLat1 70337730 2542 2 24994 7436 2 2 1 14788 3718 1 1 | |
| 98 affyHuEx1 15703901 112274 2 24994 7846 70 2 1 14788 4293 38 1 | |
| 99 netXenTro2 111440392 1877 2 24994 6100 2 2 1 14788 3050 1 1 | |
| 100 snp126orthoPanTro2RheMac2 700436 690674 2 24994 124 118 2 1 14788 63 60 1 | |
| 101 intronEst 135796064 2332 2 24994 24994 2 2 1 14788 14788 1 1 | |
| 102 xenoMrna 129031327 1586 2 24994 20406 2 2 1 14788 10203 1 1 | |
| 103 snp126 956976 838091 2 24994 498 461 2 1 14788 293 269 1 | |
| 104 clonePos 224999719 39 2 24994 24994 2 2 1 14788 14788 1 1 | |
| 105 chainStrPur2Link 7948016 119841 2 24994 2646 58 2 1 14788 1323 29 1 | |
| 106 affyTxnPhase3HeLaNuclear 136797870 140244 2 24994 22601 17 2 1 14788 13590 9 1 | |
| 107 multiz28way 225928588 38 2 24994 24994 2 2 1 14788 14788 1 1 | |
| 108 ctgPos 224999719 39 2 24994 24994 2 2 1 14788 14788 1 1 | |
| 109 chainEquCab1 246306414 141 2 24994 24994 2 2 1 14788 14788 1 1 | |
| 110 netGalGal3 203351973 461 2 24994 7372 2 2 1 14788 3686 1 1 | |
| 111 phastCons28wayPlacMammal 221017670 22803 2 24994 24926 6 2 1 14788 14754 3 1 | |
| 112 | |
| 113 Where:: | |
| 114 | |
| 115 tableName is the name of the table | |
| 116 tableChromosomeCoverage is the number of positions existing in the table for only the chromosomes that were referenced by the interval file | |
| 117 tableChromosomeCount is the number of regions existing in the table for only the chromosomes that were referenced by the interval file | |
| 118 tableRegionCoverage is the number of positions existing in the table between the minimal and maximal bounding regions that were referenced by the interval file | |
| 119 tableRegionCount is the number of regions existing in the table between the minimal and maximal bounding regions that were referenced by the interval file | |
| 120 | |
| 121 allIntervalCount is the number of provided intervals | |
| 122 allIntervalSize is the sum of the lengths of the provided intervals | |
| 123 allCoverage is the sum of the coverage for each provided interval | |
| 124 allTableRegionsOverlapped is the sum of the number of regions of the table (non-unique) that were overlapped for each interval | |
| 125 allIntervalsOverlappingTable is the number of provided intervals which overlap the table | |
| 126 | |
| 127 nrIntervalCount is the number of non-redundant intervals | |
| 128 nrIntervalSize is the sum of the lengths of non-redundant intervals | |
| 129 nrCoverage is the sum of the coverage of non-redundant intervals | |
| 130 nrTableRegionsOverlapped is the number of regions of the table (unique) that were overlapped by the non-redundant intervals | |
| 131 nrIntervalsOverlappingTable is the number of non-redundant intervals which overlap the table | |
| 132 | |
| 133 | |
| 134 .. class:: infomark | |
| 135 | |
| 136 **TIP:** non-redundant (nr) refers to the set of intervals that remains after the intervals provided have been merged to resolve overlaps | |
| 137 | |
| 138 ------ | |
| 139 | |
| 140 **Citation** | |
| 141 | |
| 142 For the underlying data, please see http://genome.ucsc.edu/cite.html for the proper citation. | |
| 143 | |
| 144 If you use this tool in Galaxy, please cite Blankenberg D, et al. *In preparation.* | |
| 145 | |
| 146 </help> | |
| 147 </tool> |
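
The help text says table regions are merged by overlap or direct adjacency, and that each output row reports the bases covered and the number of table regions involved. The following is a minimal Python sketch of that idea, not the code in `annotation_profiler_for_interval.py`. The function names `merge_ranges` and `coverage` are hypothetical, and half-open coordinates are an assumption (consistent with the example, where 14764 - 4558 = 10206).

```python
def merge_ranges(ranges):
    """Merge (start, end) ranges that overlap or are directly adjacent.

    Assumes half-open coordinates, so (6, 12) and (12, 20) are adjacent.
    """
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def coverage(interval, merged):
    """Return (bases covered, merged regions overlapped) for one interval."""
    i_start, i_end = interval
    bases = 0
    regions = 0
    for start, end in merged:
        overlap = min(i_end, end) - max(i_start, start)
        if overlap > 0:
            bases += overlap
            regions += 1
    return bases, regions


if __name__ == "__main__":
    table = merge_ranges([(1, 10), (6, 12), (12, 20), (25, 28)])
    print(table)                     # [(1, 20), (25, 28)]
    print(coverage((5, 26), table))  # (16, 2)
```

The linear scan in `coverage` keeps the sketch short. An indexed lookup would be needed for whole-genome tables.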

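
The summary columns prefixed `all` and `nr` differ only in whether the provided intervals are merged first. The sketch below reproduces the interval-side numbers from the summary example (2 intervals totalling 24994 bases, 1 non-redundant interval of 14788 bases). It is an illustration under stated assumptions: the helper names are hypothetical, coordinates are taken as half-open, and adjacent intervals are merged along with overlapping ones, which the tool itself may or may not do.

```python
from collections import defaultdict


def merge_ranges(ranges):
    """Merge overlapping or adjacent (start, end) ranges (half-open assumed)."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def interval_summary(intervals):
    """Compute the interval-only summary columns for (chrom, start, end) tuples."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))
    # Non-redundant (nr) intervals: merge per chromosome.
    nr = {chrom: merge_ranges(r) for chrom, r in by_chrom.items()}
    return {
        "allIntervalCount": len(intervals),
        "allIntervalSize": sum(end - start for _, start, end in intervals),
        "nrIntervalCount": sum(len(r) for r in nr.values()),
        "nrIntervalSize": sum(e - s for r in nr.values() for s, e in r),
    }


if __name__ == "__main__":
    example = [("chr1", 4558, 14764), ("chr1", 4558, 19346)]
    print(interval_summary(example))
```

The table-dependent columns (`allCoverage`, `nrCoverage` and the overlap counts) would follow by intersecting each interval set with the merged table regions.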