<tool name="Insertion size metrics" id="PicardInsertSize" version="1.106.0">
  <description>for PAIRED data</description>
  <requirements><requirement type="package" version="1.106.0">picard</requirement></requirements>
  <command interpreter="python">
    picard_wrapper.py -i "${input_file}" -n "${out_prefix}" --tmpdir "${__new_file_path__}" --deviations "${deviations}"
    --histwidth "${histWidth}" --minpct "${minPct}" --malevel "${malevel}"
    -j "\$JAVA_JAR_PATH/CollectInsertSizeMetrics.jar" -d "${html_file.files_path}" -t "${html_file}"
  </command>
  <inputs>
    <param format="bam,sam" name="input_file" type="data" label="SAM/BAM dataset to generate statistics for"
      help="If empty, upload or import a SAM/BAM dataset."/>
    <param name="out_prefix" value="Insertion size metrics" type="text"
      label="Title for the output file" help="Use this to remind you what the job was for" size="120" />
    <param name="deviations" value="10.0" type="float"
      label="Deviations" size="5"
      help="See Picard documentation: Generate mean, sd and plots by trimming the data down to MEDIAN + DEVIATIONS*MEDIAN_ABSOLUTE_DEVIATION" />
    <param name="histWidth" value="0" type="integer"
      label="Histogram width" size="5"
      help="Explicitly sets the histogram width option - leave 0 to ignore" />
    <param name="minPct" value="0.05" type="float"
      label="Minimum percentage" size="5"
      help="Discard any data categories (out of FR, TANDEM, RF) that have fewer than this percentage of overall reads" />
    <param name="malevel" value="0" type="select" multiple="true" label="Metric Accumulation Level"
      help="Level(s) at which metrics will be accumulated">
      <option value="ALL_READS" selected="true">All reads (default)</option>
      <option value="SAMPLE" default="true">Sample</option>
      <option value="LIBRARY" default="true">Library</option>
      <option value="READ_GROUP" default="true">Read group</option>
    </param>
  </inputs>
  <outputs>
    <data format="html" name="html_file" label="InsertSize_${out_prefix}.html"/>
  </outputs>
  <tests>
    <test>
      <param name="input_file" value="picard_input_tiny.sam" />
      <param name="out_prefix" value="Insertion size metrics" />
      <param name="deviations" value="10.0" />
      <param name="histWidth" value="0" />
      <param name="minPct" value="0.01" />
      <param name="malevel" value="ALL_READS" />
      <output name="html_file" file="picard_output_insertsize_tinysam.html" ftype="html" compare="contains" lines_diff="40" />
    </test>
  </tests>
  <help>


.. class:: infomark

**Purpose**

Reads a SAM or BAM file and describes the distribution
of insert sizes (excluding duplicates) with metrics and a histogram plot.

**Picard documentation**

This is a Galaxy wrapper for CollectInsertSizeMetrics, part of the external package Picard-tools_.

.. _Picard-tools: http://www.google.com/search?q=picard+samtools

.. class:: warningmark

**Useful for paired data only**

This tool works for paired data only and can be expected to fail for single-end data.

-----

.. class:: infomark

**Inputs, outputs, and parameters**

Picard documentation says (reformatted for Galaxy):

.. csv-table::
   :header-rows: 1

   Option,Description
   "INPUT=File","SAM or BAM file. Required."
   "OUTPUT=File","File to write insert size metrics to. Required."
   "HISTOGRAM_FILE=File","File to write insert size histogram chart to. Required."
   "TAIL_LIMIT=Integer","When calculating mean and stdev, stop when the bins in the tail of the distribution contain fewer than mode/TAIL_LIMIT items. This also limits how much data goes into each data category of the histogram."
   "HISTOGRAM_WIDTH=Integer","Explicitly sets the histogram width, overriding the TAIL_LIMIT option. Also, when calculating mean and stdev, only bins less than or equal to HISTOGRAM_WIDTH will be included."
   "MINIMUM_PCT=Float","When generating the histogram, discard any data categories (out of FR, TANDEM, RF) that have fewer than this percentage of overall reads. (Range: 0 to 1) Default value: 0.01."
   "STOP_AFTER=Integer","Stop after processing N reads, mainly for debugging. Default value: 0."
   "CREATE_MD5_FILE=Boolean","Whether to create an MD5 digest for any BAM files created. Default value: false."

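The Deviations and Minimum percentage settings both amount to simple numeric filters. The sketch below is only an illustration of the rules as worded in this help text (trim to MEDIAN + DEVIATIONS*MEDIAN_ABSOLUTE_DEVIATION; drop FR, TANDEM or RF categories holding fewer than the given fraction of reads). It is not Picard's implementation, and the function names ``trim_insert_sizes`` and ``keep_categories`` are hypothetical.

```python
from statistics import median


def trim_insert_sizes(sizes, deviations=10.0):
    """Keep sizes no larger than MEDIAN + DEVIATIONS * MEDIAN_ABSOLUTE_DEVIATION.

    Hypothetical helper: an assumption about the rule, not Picard code.
    """
    med = median(sizes)
    # Median absolute deviation: the median distance of each value from the median.
    mad = median([abs(s - med) for s in sizes])
    cutoff = med + deviations * mad
    return [s for s in sizes if cutoff >= s]


def keep_categories(counts, min_pct=0.05):
    """Drop orientation categories holding fewer than min_pct (0 to 1) of all reads.

    Hypothetical helper: an assumption about the rule, not Picard code.
    """
    total = sum(counts.values())
    return {name: n for name, n in counts.items() if n >= min_pct * total}


if __name__ == "__main__":
    # One extreme insert size (5000) is trimmed away with the default of 10 deviations.
    print(trim_insert_sizes([200, 210, 190, 205, 195, 5000]))
    # TANDEM holds 0.5% of the reads, so a 1% threshold discards it.
    print(keep_categories({"FR": 980, "RF": 15, "TANDEM": 5}, min_pct=0.01))
```

In the first call the median is 202.5 and the median absolute deviation is 7.5, so the cutoff is 277.5 and only the 5000 value is removed.
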
.. class:: warningmark

**Warning on SAM/BAM quality**

Many SAM/BAM files produced externally and uploaded to Galaxy do not fully conform to the SAM/BAM specifications. Galaxy deals with this by using the **LENIENT**
flag when it runs Picard, which allows reads to be discarded if they're empty or don't map. This appears
to be the only way to deal with SAM/BAM files that cannot be parsed.

  </help>
</tool>