Mercurial > repos > bgruening > bedtools_test_bag
changeset 2:662c1741c22d draft
Uploaded
author | bgruening |
---|---|
date | Fri, 10 Jan 2014 11:51:13 -0500 |
parents | 3b3e7774f51a |
children | 07390b1a7bdc |
files | README.rst bamToBed.xml bedtools-galaxy/bamToBed.xml bedtools-galaxy/coverageBed_counts.xml bedtools-galaxy/genomeCoverageBed_bedgraph.xml bedtools-galaxy/genomeCoverageBed_histogram.xml bedtools-galaxy/intersectBed.xml bedtools-galaxy/multiIntersectBed.xml bedtools-galaxy/sortBed.xml bedtools-galaxy/tool_dependencies.xml bedtools-galaxy/unionBedGraphs.xml coverageBed_counts.xml genomeCoverageBed_bedgraph.xml genomeCoverageBed_histogram.xml intersectBed.xml multiIntersectBed.xml sortBed.xml tool_dependencies.xml unionBedGraphs.xml |
diffstat | 19 files changed, 959 insertions(+), 950 deletions(-) |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/README.rst Fri Jan 10 11:51:13 2014 -0500 @@ -0,0 +1,1 @@ +This repository houses Galaxy wrappers for BEDTools. \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/bamToBed.xml Fri Jan 10 11:51:13 2014 -0500 @@ -0,0 +1,66 @@ +<tool id="bedtools_bamtobed" name="Convert from BAM to BED" version="0.2.0"> + + <description> + </description> + + <requirements> + <requirement type="package" version="2.17.0_5e4507c54355a4a38c6d3e7497a2836a123c6655">bedtools</requirement> + </requirements> + +<command> + bamToBed $option $ed_score -i '$input' > '$output' + #if str($tag): + -tag $tag + #end if +</command> + +<inputs> + <param format="bam" name="input" type="data" label="Convert the following BAM file to BED"/> + <param name="option" type="select" label="What type of BED output would you like"> + <option value="">Create a 6-column BED file.</option> + <option value="-bed12">Create a full, 12-column "blocked" BED file.</option> + <option value="-bedpe">Create a paired-end, BEDPE format.</option> + </param> + <param name="split" type="boolean" label="Report spliced BAM alignments as separate BED entries" truevalue="-split" falsevalue="" checked="false"/> + <param name="ed_score" type="boolean" label="Use alignment's edit-distance for BED score" truevalue="-ed" falsevalue="" checked="false"/> + <param name="tag" type="text" optional="true" label="Use other NUMERIC BAM alignment tag as the BED score"/> +</inputs> + +<outputs> + <data format="bed" name="output" metadata_source="input" label="${input.name} (as BED)"/> +</outputs> + +<help> + +**What it does** + +This tool converts a BAM file to a BED file. The end coordinate is computed +by inspecting the CIGAR string. The QNAME for the alignment is used as the +BED name field and, by default, the MAPQ is used as the BED score. + +.. class:: infomark + +The "Report spliced BAM alignment..." option breaks BAM alignments with the "N" (splice) operator into distinct BED entries. For example, using this option on a CIGAR such as 50M1000N50M would, by default, produce a single BED record that spans 1100bp. 
However, using this option, it would create two separate BED records that are each 50bp in size and are separated by 1000bp (the size of the N operation). This is important for RNA-seq and structural variation experiments. + + +.. class:: warningmark + +If using a custom BAM alignment TAG as the BED score, note that this must be a numeric tag (e.g., type "i" as in NM:i:0). + +.. class:: warningmark + +If creating a BEDPE output (see output formatting options), the BAM file should be sorted by query name. + + +------ + +This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ + + .. __: http://code.google.com/p/bedtools/ + .. __: http://cphg.virginia.edu/quinlan/ + .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short + + + +</help> +</tool>
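Note on the help text above: the effect of the "Report spliced BAM alignments" option on a CIGAR such as 50M1000N50M can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the bedtools implementation; the function name `cigar_to_blocks` and its arguments are hypothetical, and treating only the N operator as a split point is an assumption taken from the help text.

```python
import re

def cigar_to_blocks(chrom, start, cigar, split=False):
    """Return BED-style (chrom, start, end) tuples for one alignment.

    Hypothetical helper for illustration only. Coordinates are zero-based,
    half-open, as in BED.
    """
    blocks = []
    block_start = start
    pos = start
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op in "MD=X":
            # These operators consume reference bases.
            pos += length
        elif op == "N":
            if split:
                # Close the current block and start a new one after the gap.
                blocks.append((chrom, block_start, pos))
                block_start = pos + length
            pos += length
        # I, S, H and P do not advance the reference position.
    blocks.append((chrom, block_start, pos))
    return blocks

default_blocks = cigar_to_blocks("chr1", 0, "50M1000N50M")
split_blocks = cigar_to_blocks("chr1", 0, "50M1000N50M", split=True)
```

Without splitting, the sketch yields one 1100 bp record; with splitting, it yields two 50 bp records separated by 1000 bp, as the help text describes.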
--- a/bedtools-galaxy/bamToBed.xml Tue Jan 08 08:54:50 2013 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,66 +0,0 @@ -<tool id="bedtools_bamtobed" name="Convert from BAM to BED" version="0.2.0"> - - <description> - </description> - - <requirements> - <requirement type="package" version="2.17.0_5e4507c54355a4a38c6d3e7497a2836a123c6655">bedtools</requirement> - </requirements> - -<command> - bamToBed $option $ed_score -i '$input' > '$output' - #if str($tag): - -tag $tag - #end if -</command> - -<inputs> - <param format="bam" name="input" type="data" label="Convert the following BAM file to BED"/> - <param name="option" type="select" label="What type of BED output would you like"> - <option value="">Create a 6-column BED file.</option> - <option value="-bed12">Create a full, 12-column "blocked" BED file.</option> - <option value="-bedpe">Create a paired-end, BEDPE format.</option> - </param> - <param name="split" type="boolean" label="Report spliced BAM alignments as separate BED entries" truevalue="-split" falsevalue="" checked="false"/> - <param name="ed_score" type="boolean" label="Use alignment's edit-distance for BED score" truevalue="-ed" falsevalue="" checked="false"/> - <param name="tag" type="text" optional="true" label="Use other NUMERIC BAM alignment tag as the BED score"/> -</inputs> - -<outputs> - <data format="bed" name="output" metadata_source="input" label="${input.name} (as BED)"/> -</outputs> - -<help> - -**What it does** - -This tool converts a BAM file to a BED file. The end coordinate is computed -by inspecting the CIGAR string. The QNAME for the alignment is used as the -BED name field and, by default, the MAPQ is used as the BED score. - -.. class:: infomark - -The "Report spliced BAM alignment..." option breaks BAM alignments with the "N" (splice) operator into distinct BED entries. For example, using this option on a CIGAR such as 50M1000N50M would, by default, produce a single BED record that spans 1100bp. 
However, using this option, it would create two separate BED records that are each 50bp in size and are separated by 1000bp (the size of the N operation). This is important for RNA-seq and structural variation experiments. - - -.. class:: warningmark - -If using a custom BAM alignment TAG as the BED score, note that this must be a numeric tag (e.g., type "i" as in NM:i:0). - -.. class:: warningmark - -If creating a BEDPE output (see output formatting options), the BAM file should be sorted by query name. - - ------- - -This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ - - .. __: http://code.google.com/p/bedtools/ - .. __: http://cphg.virginia.edu/quinlan/ - .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short - - - -</help> -</tool>
--- a/bedtools-galaxy/coverageBed_counts.xml Tue Jan 08 08:54:50 2013 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,65 +0,0 @@ -<tool id="bedtools_coveragebed_counts" name="Count intervals in one file overlapping intervals in another file" version="0.2.0"> - - <description> - </description> - - <requirements> - <requirement type="package" version="2.17.0_5e4507c54355a4a38c6d3e7497a2836a123c6655">bedtools</requirement> - </requirements> - -<command> - coverageBed - #if $inputA.ext == "bam" - -abam '$inputA' - #else - -a '$inputA' - #end if - -b '$inputB' - -counts - $split - $strand - | sort -k1,1 -k2,2n - > '$output' -</command> - -<inputs> - <param format="bed,bam" name="inputA" type="data" label="Count how many intervals in this BED or BAM file (source)"> - <validator type="unspecified_build" /> - </param> - <param format="bed" name="inputB" type="data" label="overlap the intervals in this BED file (target)"> - <validator type="unspecified_build" /> - </param> - <param name="split" type="boolean" checked="false" truevalue="-split" falsevalue="" label="Treat split/spliced BAM or BED12 entries as distinct BED intervals when computing coverage." help="If set, the coverage will be calculated based the spliced intervals only. For BAM files, this inspects the CIGAR N operation to infer the blocks for computing coverage. For BED12 files, this inspects the BlockCount, BlockStarts, and BlockEnds fields (i.e., columns 10,11,12). If this option is not set, coverage will be calculated based on the interval's START/END coordinates, and would include introns in the case of RNAseq data." 
/> - - <param name="strand" type="select" label="Count"> - <option value="">overlaps on either strand</option> - <option value="-s">only overlaps occurring on the **same** strand.</option> - <option value="-S">only overlaps occurring on the **opposite** strand.</option> - </param> -</inputs> - -<outputs> - <data format="bed" name="output" metadata_source="inputB" label="count of overlaps in ${inputA.name} on ${inputB.name}"/> -</outputs> - -<help> - -**What it does** - -This tool converts counts the number of intervals in a BAM or BED file (the source) that overlap another BED file (the target). - -.. class:: infomark - -The output file will be comprised of each interval from your original target BED file, plus an additional column indicating the number of intervals in your source file that overlapped that target interval. - - ------- - -This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ - - .. __: http://code.google.com/p/bedtools/ - .. __: http://cphg.virginia.edu/quinlan/ - .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short - -</help> -</tool>
--- a/bedtools-galaxy/genomeCoverageBed_bedgraph.xml Tue Jan 08 08:54:50 2013 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,112 +0,0 @@ -<tool id="bedtools_genomecoveragebed_bedgraph" name="Create a BedGraph of genome coverage" version="0.2.0"> - - <description> - </description> - - <requirements> - <requirement type="package" version="2.17.0_5e4507c54355a4a38c6d3e7497a2836a123c6655">bedtools</requirement> - </requirements> - - <command>genomeCoverageBed - #if $input.ext == "bam" - -ibam '$input' - #else - -i '$input' - -g ${chromInfo} - #end if - - #if str($scale): - -scale $scale - #end if - - -bg - $zero_regions - $split - $strand - > '$output' - </command> - - <inputs> - <param format="bed,bam" name="input" type="data" label="The BAM or BED file from which coverage should be computed"> - <validator type="unspecified_build" /> - </param> - - <param name="zero_regions" type="boolean" checked="true" truevalue="-bga" falsevalue="" label="Report regions with zero coverage" help="If set, regions without any coverage will also be reported." /> - - <param name="split" type="boolean" checked="false" truevalue="-split" falsevalue="" label="Treat split/spliced BAM or BED12 entries as distinct BED intervals when computing coverage." help="If set, the coverage will be calculated based the spliced intervals only. For BAM files, this inspects the CIGAR N operation to infer the blocks for computing coverage. For BED12 files, this inspects the BlockCount, BlockStarts, and BlockEnds fields (i.e., columns 10,11,12). If this option is not set, coverage will be calculated based on the interval's START/END coordinates, and would include introns in the case of RNAseq data." 
/> - - <param name="strand" type="select" label="Calculate coverage based on"> - <option value="">both strands combined</option> - <option value="-strand +">positive strand only</option> - <option value="-strand -">negative strand only</option> - </param> - - <param name="scale" type="text" optional="true" label="Scale the coverage by a constant factor" help="Each BEDGRAPH coverage value is multiplied by this factor before being reported. Useful for normalizing coverage by, e.g., reads per million (RPM)"/> - </inputs> - - <outputs> - <data format="bedgraph" name="output" metadata_source="input" label="${input.name} (Genome Coverage BedGraph)" /> - </outputs> - <help> - - -**What it does** - -This tool calculates the genome-wide coverage of intervals defined in a BAM or BED file and reports them in BedGraph format. - -.. class:: warningmark - -The input BED or BAM file must be sorted by chromosome name (but doesn't necessarily have to be sorted by start position). - ------ - -**Example 1** - -Input (BED format)- -Overlapping, un-sorted intervals:: - - chr1 140 176 - chr1 100 130 - chr1 120 147 - - -Output (BedGraph format)- -Sorted, non-overlapping intervals, with coverage value on the 4th column:: - - chr1 100 120 1 - chr1 120 130 2 - chr1 130 140 1 - chr1 140 147 2 - chr1 147 176 1 - ------ - -**Example 2 - with ZERO-Regions selected (assuming hg19)** - -Input (BED format)- -Overlapping, un-sorted intervals:: - - chr1 140 176 - chr1 100 130 - chr1 120 147 - - -Output (BedGraph format)- -Sorted, non-overlapping intervals, with coverage value on the 4th column:: - - chr1 0 100 0 - chr1 100 120 1 - chr1 120 130 2 - chr1 130 140 1 - chr1 140 147 2 - chr1 147 176 1 - chr1 176 249250621 0 - - -This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ - - .. 
__: http://code.google.com/p/bedtools/ - .. __: http://cphg.virginia.edu/quinlan/ - .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short -</help> -</tool>
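Note on the BedGraph examples in the help text above: the two example outputs can be reproduced with a small sweep over interval start and end points. The sketch below is a simplified illustration and not how genomeCoverageBed is implemented; the function name `bedgraph` and the `chrom_sizes` argument are hypothetical, and chromosomes without any intervals are not reported here.

```python
from collections import defaultdict

def bedgraph(intervals, chrom_sizes=None):
    """Compute (chrom, start, end, depth) rows from (chrom, start, end) intervals.

    Hypothetical helper. If chrom_sizes is given, zero-coverage regions are
    reported as well (the "Report regions with zero coverage" case).
    """
    events = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in intervals:
        events[chrom][start] += 1   # coverage rises at each start
        events[chrom][end] -= 1     # and falls at each end
    rows = []
    for chrom in sorted(events):
        depth = 0
        prev = 0
        for point in sorted(events[chrom]):
            delta = events[chrom][point]
            if delta == 0:
                continue  # depth does not change here, so keep the row open
            if point > prev and (depth > 0 or chrom_sizes is not None):
                rows.append((chrom, prev, point, depth))
            depth += delta
            prev = point
        if chrom_sizes is not None and prev < chrom_sizes[chrom]:
            rows.append((chrom, prev, chrom_sizes[chrom], 0))
    return rows

example = [("chr1", 140, 176), ("chr1", 100, 130), ("chr1", 120, 147)]
covered_only = bedgraph(example)
with_zero = bedgraph(example, chrom_sizes={"chr1": 249250621})
```

Here `covered_only` corresponds to Example 1 and `with_zero` to Example 2, where the chr1 length is the hg19 value used in the help text.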
--- a/bedtools-galaxy/genomeCoverageBed_histogram.xml Tue Jan 08 08:54:50 2013 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,77 +0,0 @@ -<tool id="bedtools_genomecoveragebed_histogram" name="Create a histogram of genome coverage" version="0.2.0"> - <description> - </description> - - <requirements> - <requirement type="package" version="2.17.0_5e4507c54355a4a38c6d3e7497a2836a123c6655">bedtools</requirement> - </requirements> - - <command>genomeCoverageBed - #if $input.ext == "bam" - -ibam '$input' - #else - -i '$input' - -g ${chromInfo} - #end if - #if str($max): - -max $max - #end if - > '$output' - </command> - - <inputs> - <param format="bed,bam" name="input" type="data" label="The BAM or BED file from which coverage should be computed"></param> - <param name="max" type="text" optional="true" label="Max depth" help="Combine all positions with a depth >= max into a single bin in the histogram."/> - </inputs> - - <outputs> - <data format="tabular" name="output" metadata_source="input" label="${input.name} (Genome Coverage Histogram)" /> - </outputs> - -<help> -**What it does** - -This tool calculates a histogram of genome coverage depth based on mapped reads in BAM format or intervals in BED format. - - ------- - - -.. class:: infomark - -The output file will contain five columns: - - * 1. Chromosome name (or 'genome' for whole-genome coverage) - * 2. Coverage depth - * 3. The number of bases on chromosome (or genome) with depth equal to column 2. - * 4. The size of chromosome (or entire genome) in base pairs - * 5. The fraction of bases on chromosome (or entire genome) with depth equal to column 2. 
- -**Example Output**:: - - chr2L 0 1379895 23011544 0.0599653 - chr2L 1 837250 23011544 0.0363839 - chr2L 2 904442 23011544 0.0393038 - chr2L 3 913723 23011544 0.0397072 - chr2L 4 952166 23011544 0.0413778 - chr2L 5 967763 23011544 0.0420555 - chr2L 6 986331 23011544 0.0428624 - chr2L 7 998244 23011544 0.0433801 - chr2L 8 995791 23011544 0.0432735 - chr2L 9 996398 23011544 0.0432999 - - - - ------- - -This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ - - .. __: http://code.google.com/p/bedtools/ - .. __: http://cphg.virginia.edu/quinlan/ - .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short - - - -</help> -</tool>
--- a/bedtools-galaxy/intersectBed.xml Tue Jan 08 08:54:50 2013 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,99 +0,0 @@ -<tool id="bedtools_intersectbed" name="Intersect interval files" version="0.2.0"> - <description> - </description> - <requirements> - <requirement type="package" version="2.17.0_5e4507c54355a4a38c6d3e7497a2836a123c6655">bedtools</requirement> - </requirements> - - <command> - intersectBed - #if $inputA.ext == "bam": - -abam $inputA - #else: - -a $inputA - #end if - - -b $inputB - $split - $strand - #if str($fraction): - -f $fraction - #end if - $reciprocal - $invert - $once - $header - $overlap_mode - > $output - </command> - - <inputs> - <param format="bed,bam,vcf,gff,gff3" name="inputA" type="data" label="BED or BAM file"/> - <param format="bed" name="inputB" type="data" label="overlap intervals in this BED file?"/> - - <param name="strand" type="select" label="Calculate coverage based on"> - <option value="">Overlaps on either strand</option> - <option value="-s">Only overlaps occurring on the **same** strand.</option> - <option value="-S">Only overlaps occurring on the **opposite** strand.</option> - </param> - - <param name="overlap_mode" type="select" label="Calculate coverage based on"> - <option value="">Overlaps on either strand</option> - <option value="-wa">Write the original entry in A for each overlap.</option> - <option value="-wb">Write the original entry in B for each overlap. Useful for knowing what A overlaps. Restricted by the fraction- and reciprocal option.</option> - <option value="-wo">Write the original A and B entries plus the number of base pairs of overlap between the two features. Only A features with overlap are reported. Restricted by the fraction- and reciprocal option.</option> - <option value="-wao">Write the original A and B entries plus the number of base pairs of overlap between the two features. However, A features w/o overlap are also reported with a NULL B feature and overlap = 0. 
Restricted by the fraction- and reciprocal option.</option> - <option value="-loj">Perform a "left outer join". That is, for each feature in A report each overlap with B. If no overlaps are found, report a NULL feature for B.</option> - </param> - - <param name="split" type="boolean" checked="true" truevalue="-split" falsevalue="" label="Treat split/spliced BAM or BED12 entries as distinct BED intervals when computing coverage." help="If set, the coverage will be calculated based the spliced intervals only. For BAM files, this inspects the CIGAR N operation to infer the blocks for computing coverage. For BED12 files, this inspects the BlockCount, BlockStarts, and BlockEnds fields (i.e., columns 10,11,12). If this option is not set, coverage will be calculated based on the interval's START/END coordinates, and would include introns in the case of RNAseq data." /> - <!-- -f --> - <param name="fraction" type="text" optional="true" label="Minimum overlap required as a fraction of the BAM alignment" help="Alignments are only retained if the overlap with the an interval in the BED file comprises at least this fraction of the BAM alignment's length. For example, to require that the overlap affects 50% of the BAM alignment, use 0.50"/> - <!-- -r --> - <param name="reciprocal" type="boolean" checked="false" truevalue="-r" falsevalue="" label="Require reciprocal overlap." help="If set, the overlap between the BAM alignment and the BED interval must affect the above fraction of both the alignment and the BED interval." /> - <!-- -v --> - <param name="invert" type="boolean" checked="false" truevalue="-v" falsevalue="" label="Report only those alignments that **do not** overlap the BED file."/> - <!-- -u --> - <param name="once" type="boolean" checked="false" truevalue="-u" falsevalue="" label="Write the original A entry _once_ if _any_ overlaps found in B." help="Just report the fact >=1 hit was found." 
/> - <!-- -c --> - <param name="count" type="boolean" checked="false" truevalue="-c" falsevalue="" label="For each entry in A, report the number of overlaps with B." help="Reports 0 for A entries that have no overlap with B." /> - <!-- -sorted Use the "chromsweep" algorithm for sorted (-k1,1 -k2,2n) input --> - - <!-- header --> - <param name="header" type="boolean" checked="false" truevalue="-header" falsevalue="" label="Print the header from the A file prior to results." /> - - - </inputs> - - <outputs> - <data format_source="inputA" name="output" metadata_source="inputA" label="Intersection of ${inputA.name} and ${inputB.name}"/> - </outputs> - -<help> - -**What it does** - -It allows one to screen for overlaps between two sets of genomic features. Moreover, it allows one to have -fine control as to how the intersections are reported. intersectBed works with both BED/GFF/VCF -and BAM files as input. -Example usage would be to cull a BAM file from an exome capture experiment to include on the "on-target" alignments. - -.. class:: infomark - -Note that each BAM alignment is treated individually. Therefore, if one end of a paired-end alignment overlaps an interval in the BED file, yet the other end does not, the output file will only include the overlapping end. - -.. class:: infomark - -Note that a BAM alignment will be sent to the output file **once** even if it overlaps more than one interval in the BED file. - - ------- - -This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ - - .. __: http://code.google.com/p/bedtools/ - .. __: http://cphg.virginia.edu/quinlan/ - .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short - -</help> -</tool>
--- a/bedtools-galaxy/multiIntersectBed.xml Tue Jan 08 08:54:50 2013 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,205 +0,0 @@ -<tool id="bedtools_multiintersectbed" name="Intersect multiple sorted BED files" version="0.2.0"> - <description> - </description> - - <requirements> - <requirement type="package" version="2.17.0_5e4507c54355a4a38c6d3e7497a2836a123c6655">bedtools</requirement> - </requirements> - - <command>multiIntersectBed - $header - #if $zero.value == True: - -empty - -g ${chromInfo} - #end if - - -i '$input1' - '$input2' - #for $q in $beds - '${q.input}' - #end for - - -names - #if $name1.choice == "tag": - '${input1.name}' - #else - '${name1.custom_name}' - #end if - - #if $name2.choice == "tag": - '${input2.name}' - #else - '${name2.custom_name}' - #end if - - #for $q in $beds - #if $q.name.choice == "tag": - '${q.input.name}' - #else - '${q.input.custom_name}' - #end if - #end for - > '$output' - </command> - - <inputs> - <!-- Make it easy for the user, first two input files are always shown --> - <!-- INPUT 1 --> - <param name="input1" format="bed" type="data" label="First sorted BED file" /> - - <conditional name="name1"> - <param name="choice" type="select" label="Sample name"> - <option value="tag" selected="true">Use input's tag</option> - <option value="custom">Enter custom table name</option> - </param> - <when value="tag"> - </when> - <when value="custom"> - <param name="custom_name" type="text" area="false" label="Custom sample name"/> - </when> - </conditional> - - <!-- INPUT 2 --> - <param name="input2" format="bed" type="data" label="Second sorted BED file" /> - - <conditional name="name2"> - <param name="choice" type="select" label="Sample name"> - <option value="tag" selected="true">Use input's tag</option> - <option value="custom">Enter custom table name</option> - </param> - <when value="tag"> - </when> - <when value="custom"> - <param name="custom_name" type="text" area="false" label="Custom sample name"/> - </when> - 
</conditional> - - <!-- Additional files, if the user needs more --> - <repeat name="beds" title="Add'l sorted BED files" > - <param name="input" format="bed" type="data" label="BED file" /> - - <conditional name="name"> - <param name="choice" type="select" label="Sample name"> - <option value="tag" selected="true">Use input's tag</option> - <option value="custom">Enter custom table name</option> - </param> - <when value="tag"> - </when> - <when value="custom"> - <param name="custom_name" type="text" area="false" label="Custom sample name"/> - </when> - </conditional> - </repeat> - - <param name="header" type="boolean" checked="true" truevalue="-header" falsevalue="" label="Print header line" help="The first line will include the name of each sample." /> - - <param name="zero" type="boolean" checked="true" label="Report regions that are not covered by any of the files" help="If set, regions that are not overlapped by any file will also be reported. Requires a valid organism key for all input datasets" /> - - </inputs> - - <outputs> - <data format="tabular" name="output" metadata_source="input1" label="Common intervals identified from among ${input1.name}, ${input2.name} and so on." /> - </outputs> - <help> - -**What it does** - -This tool identifies common intervals among multiple, sorted BED files. Intervals can be common among 0 to N of the N input BED files. The pictorial and raw data examples below illustrate the behavior of this tool more clearly. - - -.. image:: http://people.virginia.edu/~arq5x/files/bedtools-galaxy/mbi.png - - -.. class:: warningmark - -This tool requires that each BED file is reference-sorted (chrom, then start). - - -.. class:: infomark - -The output file will contain five fixed columns, plus additional columns for each BED file: - - * 1. Chromosome name (or 'genome' for whole-genome coverage). - * 2. The zero-based start position of the interval. - * 3. The one-based end position of the interval. - * 4. 
The number of input files that had at least one feature overlapping this interval. - * 5. A list of input files or labels that had at least one feature overlapping this interval. - * 6. For each input file, an indication (1 = Yes, 0 = No) of whether or not the file had at least one feature overlapping this interval. - ------- - -**Example input**:: - - # a.bed - chr1 6 12 - chr1 10 20 - chr1 22 27 - chr1 24 30 - - # b.bed - chr1 12 32 - chr1 14 30 - - # c.bed - chr1 8 15 - chr1 10 14 - chr1 32 34 - - ------- - -**Example without a header and without reporting intervals with zero coverage**:: - - - chr1 6 8 1 1 1 0 0 - chr1 8 12 2 1,3 1 0 1 - chr1 12 15 3 1,2,3 1 1 1 - chr1 15 20 2 1,2 1 1 0 - chr1 20 22 1 2 0 1 0 - chr1 22 30 2 1,2 1 1 0 - chr1 30 32 1 2 0 1 0 - chr1 32 34 1 3 0 0 1 - - -**Example adding a header line**:: - - - chrom start end num list a.bed b.bed c.bed - chr1 6 8 1 1 1 0 0 - chr1 8 12 2 1,3 1 0 1 - chr1 12 15 3 1,2,3 1 1 1 - chr1 15 20 2 1,2 1 1 0 - chr1 20 22 1 2 0 1 0 - chr1 22 30 2 1,2 1 1 0 - chr1 30 32 1 2 0 1 0 - chr1 32 34 1 3 0 0 1 - - -**Example adding a header line and custom file labels**:: - - - chrom start end num list joe bob sue - chr1 6 8 1 joe 1 0 0 - chr1 8 12 2 joe,sue 1 0 1 - chr1 12 15 3 joe,bob,sue 1 1 1 - chr1 15 20 2 joe,bob 1 1 0 - chr1 20 22 1 bob 0 1 0 - chr1 22 30 2 joe,bob 1 1 0 - chr1 30 32 1 bob 0 1 0 - chr1 32 34 1 sue 0 0 1 - - ------ - - -This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ - - .. __: http://code.google.com/p/bedtools/ - .. __: http://cphg.virginia.edu/quinlan/ - .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short - - - - -</help> -</tool>
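Note on the multi-intersect example in the help text above: the headerless output table can be reproduced by cutting the chromosome at every interval boundary and recording which files cover each segment. The sketch below is a naive illustration under that assumption, not the multiIntersectBed algorithm; the function name `multi_intersect` is hypothetical, and no attempt is made at efficiency or at reporting regions that no file covers.

```python
def multi_intersect(files):
    """files: list of interval lists, each holding (chrom, start, end) tuples.

    Hypothetical helper. Returns rows of
    [chrom, start, end, num_files, list_of_file_indices, flag_1, ..., flag_N].
    """
    chroms = sorted({c for f in files for c, _, _ in f})
    rows = []
    for chrom in chroms:
        # Every start and end on this chromosome is a potential breakpoint.
        points = sorted({p for f in files for c, s, e in f if c == chrom for p in (s, e)})
        for left, right in zip(points, points[1:]):
            flags = [
                int(any(c == chrom and s <= left and right <= e for c, s, e in f))
                for f in files
            ]
            if not any(flags):
                continue  # segment not covered by any file
            last = rows[-1] if rows else None
            if last and last[0] == chrom and last[2] == left and last[5:] == flags:
                last[2] = right  # same file set as the previous segment: extend it
            else:
                hits = [str(i + 1) for i, flag in enumerate(flags) if flag]
                rows.append([chrom, left, right, len(hits), ",".join(hits)] + flags)
    return rows

a = [("chr1", 6, 12), ("chr1", 10, 20), ("chr1", 22, 27), ("chr1", 24, 30)]
b = [("chr1", 12, 32), ("chr1", 14, 30)]
c = [("chr1", 8, 15), ("chr1", 10, 14), ("chr1", 32, 34)]
rows = multi_intersect([a, b, c])
```

With the a.bed, b.bed and c.bed intervals from the help text, `rows` contains the same eight rows as the example without a header.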
--- a/bedtools-galaxy/sortBed.xml Tue Jan 08 08:54:50 2013 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,57 +0,0 @@ -<tool id="bedtools_sortbed" name="Sort BED files" version="0.2.0"> - -<description> -</description> - - <requirements> - <requirement type="package" version="2.17.0_5e4507c54355a4a38c6d3e7497a2836a123c6655">bedtools</requirement> - </requirements> - -<command> - sortBed -i $input $option > $output -</command> - -<inputs> - <param format="bed" name="input" type="data" label="Sort the following BED file"/> - <param name="option" type="select" label="Sort by"> - <!-- sort -k 1,1 -k2,2 -n a.bed --> - <option value="">chromosome, then by start position (asc)</option> - <option value="-sizeA">feature size in ascending order.</option> - <option value="-sizeD">feature size in descending order.</option> - <option value="-chrThenSizeA">chromosome, then by feature size (asc).</option> - <option value="-chrThenSizeD">chromosome, then by feature size (desc).</option> - <option value="-chrThenScoreA">chromosome, then by score (asc).</option> - <option value="-chrThenScoreD">chromosome, then by score (desc).</option> - </param> - -</inputs> - -<outputs> - <data format="bed" name="output" metadata_source="input" label="${input.name} (as BED)"/> -</outputs> - -<help> - -**What it does** - -Sorts a feature file by chromosome and other criteria. - - -.. class:: warningmark - -It should be noted that sortBed is merely a convenience utility, as the UNIX sort utility -will sort BED files more quickly while using less memory. For example, UNIX sort will sort a BED file -by chromosome then by start position in the following manner: sort -k 1,1 -k2,2 -n a.bed - ------- - -This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ - - .. __: http://code.google.com/p/bedtools/ - .. 
__: http://cphg.virginia.edu/quinlan/ - .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short - - - -</help> -</tool>
--- a/bedtools-galaxy/tool_dependencies.xml Tue Jan 08 08:54:50 2013 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,24 +0,0 @@ -<?xml version="1.0"?> -<tool_dependency> - <package name="bedtools" version="2.17.0_5e4507c54355a4a38c6d3e7497a2836a123c6655"> - <install version="1.0"> - <actions> - <action type="shell_command">git clone --recursive https://github.com/arq5x/bedtools.git</action> - <action type="shell_command">git reset --hard 5e4507c54355a4a38c6d3e7497a2836a123c6655</action> - <action type="shell_command">make</action> - <action type="move_directory_files"> - <source_directory>bin</source_directory> - <destination_directory>$INSTALL_DIR/bin</destination_directory> - </action> - <action type="set_environment"> - <environment_variable name="PATH" action="prepend_to">$INSTALL_DIR/bin</environment_variable> - </action> - </actions> - </install> - <readme>FreeBayes requires g++ and the standard C and C++ development libraries. - </readme> - </package> -</tool_dependency> - - -
--- a/bedtools-galaxy/unionBedGraphs.xml Tue Jan 08 08:54:50 2013 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,245 +0,0 @@ -<tool id="bedtools_mergebedgraph" name="Merge BedGraph files" version="0.2.0"> - <description> - </description> - - - <requirements> - <requirement type="package" version="2.17.0_5e4507c54355a4a38c6d3e7497a2836a123c6655">bedtools</requirement> - </requirements> - - <command>unionBedGraphs - $header - -filler '$filler' - #if $zero.value == True: - -empty - -g ${chromInfo} - #end if - - -i '$input1' - '$input2' - #for $q in $bedgraphs - '${q.input}' - #end for - - -names - #if $name1.choice == "tag": - '${input1.name}' - #else - '${name1.custom_name}' - #end if - - #if $name2.choice == "tag": - '${input2.name}' - #else - '${name2.custom_name}' - #end if - - #for $q in $bedgraphs - #if $q.name.choice == "tag": - '${q.input.name}' - #else - '${q.input.custom_name}' - #end if - #end for - > '$output' - </command> - - <inputs> - <!-- Make it easy for the user, first two input files are always shown --> - <!-- INPUT 1 --> - <param name="input1" format="bedgraph" type="data" label="First BedGraph file" /> - - <conditional name="name1"> - <param name="choice" type="select" label="Sample name"> - <option value="tag" selected="true">Use input's tag</option> - <option value="custom">Enter custom table name</option> - </param> - <when value="tag"> - </when> - <when value="custom"> - <param name="custom_name" type="text" area="false" label="Custom sample name"/> - </when> - </conditional> - - <!-- INPUT 2 --> - <param name="input2" format="bedgraph" type="data" label="Second BedGraph file" /> - - <conditional name="name2"> - <param name="choice" type="select" label="Sample name"> - <option value="tag" selected="true">Use input's tag</option> - <option value="custom">Enter custom table name</option> - </param> - <when value="tag"> - </when> - <when value="custom"> - <param name="custom_name" type="text" area="false" label="Custom sample name"/> - 
</when> - </conditional> - - <!-- Additional files, if the user needs more --> - <repeat name="bedgraphs" title="Add'l BedGraph files" > - <param name="input" format="bedgraph" type="data" label="BedGraph file" /> - - <conditional name="name"> - <param name="choice" type="select" label="Sample name"> - <option value="tag" selected="true">Use input's tag</option> - <option value="custom">Enter custom table name</option> - </param> - <when value="tag"> - </when> - <when value="custom"> - <param name="custom_name" type="text" area="false" label="Custom sample name"/> - </when> - </conditional> - </repeat> - - <param name="header" type="boolean" checked="true" truevalue="-header" falsevalue="" label="Print header line" help="The first line will include the name of each sample." /> - - <param name="zero" type="boolean" checked="true" label="Report regions with zero coverage" help="If set, regions without any coverage will also be reported. Requires a valid organism key for all input datasets" /> - - <param name="filler" type="text" value="0" label="Text to use for no-coverage value" help="Can be 0.0, N/A, - or any other value." /> - </inputs> - - <outputs> - <data format="tabular" name="output" metadata_source="input1" label="Merged BedGraphs of ${input1.name}, ${input2.name} and so on." /> - </outputs> - <help> - -**What it does** - -This tool merges multiple BedGraph files, allowing direct and fine-scale coverage comparisons among many samples/files. The BedGraph files need not represent the same intervals; the tool will identify both common and file-specific intervals. In addition, the BedGraph values need not be numeric: one can use any text as the BedGraph value and the tool will compare the values from multiple files. - -.. image:: http://people.virginia.edu/~arq5x/files/bedtools-galaxy/ubg.png - - -.. 
class:: warningmark - -This tool requires that each BedGraph file is reference-sorted (chrom, then start) and contains non-overlapping intervals (within a given file). - - ------- - -**Example input**:: - - # 1.bedgraph - chr1 1000 1500 10 - chr1 2000 2100 20 - - # 2.bedgraph - chr1 900 1600 60 - chr1 1700 2050 50 - - # 3.bedgraph - chr1 1980 2070 80 - chr1 2090 2100 20 - - ------- - -**Examples using the Zero Coverage checkbox** - -Output example (*without* checking "Report regions with zero coverage"):: - - chr1 900 1000 0 60 0 - chr1 1000 1500 10 60 0 - chr1 1500 1600 0 60 0 - chr1 1700 1980 0 50 0 - chr1 1980 2000 0 50 80 - chr1 2000 2050 20 50 80 - chr1 2050 2070 20 0 80 - chr1 2070 2090 20 0 0 - chr1 2090 2100 20 0 20 - - -Output example (*with* checking "Report regions with zero coverage"). The lines marked with (*) are not covered in any input file, but are still reported (The asterisk marking does not appear in the file).:: - - chr1 0 900 0 0 0 (*) - chr1 900 1000 0 60 0 - chr1 1000 1500 10 60 0 - chr1 1500 1600 0 60 0 - chr1 1600 1700 0 0 0 (*) - chr1 1700 1980 0 50 0 - chr1 1980 2000 0 50 80 - chr1 2000 2050 20 50 80 - chr1 2050 2070 20 0 80 - chr1 2070 2090 20 0 0 - chr1 2090 2100 20 0 20 - chr1 2100 247249719 0 0 0 (*) - - ------- - -**Examples adjusting the "Filler value" for no-covered intervals** - -The default value is '0', but you can use any other value. 
- -Output example with **filler = N/A**:: - - chr1 900 1000 N/A 60 N/A - chr1 1000 1500 10 60 N/A - chr1 1500 1600 N/A 60 N/A - chr1 1600 1700 N/A N/A N/A - chr1 1700 1980 N/A 50 N/A - chr1 1980 2000 N/A 50 80 - chr1 2000 2050 20 50 80 - chr1 2050 2070 20 N/A 80 - chr1 2070 2090 20 N/A N/A - chr1 2090 2100 20 N/A 20 - - ------- - -**Examples using the "sample name" labels**:: - - chrom start end WT-1 WT-2 KO-1 - chr1 900 1000 N/A 60 N/A - chr1 1000 1500 10 60 N/A - chr1 1500 1600 N/A 60 N/A - chr1 1600 1700 N/A N/A N/A - chr1 1700 1980 N/A 50 N/A - chr1 1980 2000 N/A 50 80 - chr1 2000 2050 20 50 80 - chr1 2050 2070 20 N/A 80 - chr1 2070 2090 20 N/A N/A - chr1 2090 2100 20 N/A 20 - - ------- - -**Non-numeric values** - -The input BedGraph files can contain any kind of value in the fourth column, not necessarily a numeric value. - -Input Example:: - - File-1 File-2 - chr1 200 300 Sample1 chr1 100 240 0.75 - chr1 400 450 Sample1 chr1 250 700 0.43 - chr1 530 600 Sample2 - -Output Example:: - - chr1 100 200 0 0.75 - chr1 200 240 Sample1 0.75 - chr1 240 250 Sample1 0 - chr1 250 300 Sample1 0.43 - chr1 300 400 0 0.43 - chr1 400 450 Sample1 0.43 - chr1 450 530 0 0.43 - chr1 530 600 Sample2 0.43 - chr1 600 700 0 0.43 - - ------ - -This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ - - .. __: http://code.google.com/p/bedtools/ - .. __: http://cphg.virginia.edu/quinlan/ - .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short - - - - -</help> -</tool>
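The merge behaviour described in the help text above can be sketched in a few lines of Python. This is a toy illustration, not the unionBedGraphs implementation: the function name `union_bedgraphs` and the in-memory record layout are assumptions made for the example, and it models only the default case (no `-empty`), where segments covered by no file are skipped.

```python
from typing import List, Tuple

Record = Tuple[str, int, int, str]  # (chrom, start, end, value)


def union_bedgraphs(files: List[List[Record]], filler: str = "0") -> List[tuple]:
    """Split all intervals at every breakpoint and report one value per file.

    Hypothetical helper: assumes each file is sorted and non-overlapping,
    as the tool's help text requires.
    """
    rows = []
    chroms = sorted({chrom for records in files for chrom, _, _, _ in records})
    for chrom in chroms:
        # Every start and end coordinate on this chromosome is a breakpoint.
        points = sorted({
            pos
            for records in files
            for c, start, end, _ in records
            if c == chrom
            for pos in (start, end)
        })
        for seg_start, seg_end in zip(points, points[1:]):
            values = []
            covered = False
            for records in files:
                # Value of the interval in this file that contains the segment.
                hit = next(
                    (v for c, s, e, v in records
                     if c == chrom and s <= seg_start and seg_end <= e),
                    None,
                )
                covered = covered or hit is not None
                values.append(filler if hit is None else hit)
            if covered:  # segments covered by no file are not reported
                rows.append((chrom, seg_start, seg_end, *values))
    return rows


one = [("chr1", 1000, 1500, "10"), ("chr1", 2000, 2100, "20")]
two = [("chr1", 900, 1600, "60"), ("chr1", 1700, 2050, "50")]
three = [("chr1", 1980, 2070, "80"), ("chr1", 2090, 2100, "20")]
for row in union_bedgraphs([one, two, three]):
    print(*row, sep="\t")
```

Values are kept as strings, so the same sketch also handles the non-numeric case; passing `filler="N/A"` mirrors the filler example in the help text.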
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/coverageBed_counts.xml Fri Jan 10 11:51:13 2014 -0500 @@ -0,0 +1,61 @@ +<tool id="bedtools_coveragebed_counts" name="Count intervals" version="2.18.2.0"> + <description>in one file overlapping intervals in another file</description> + <requirements> + <requirement type="package" version="2.18.2">bedtools</requirement> + </requirements> + <version_command>bedtools --version</version_command> + <command> + coverageBed + #if $inputA.ext == "bam" + -abam '$inputA' + #else + -a '$inputA' + #end if + -b '$inputB' + -counts + $split + $strand + | sort -k1,1 -k2,2n + > '$output' + </command> + + <inputs> + <param format="bed,bam" name="inputA" type="data" label="Count how many intervals in this BED or BAM file (source)"> + <validator type="unspecified_build" /> + </param> + <param format="bed" name="inputB" type="data" label="overlap the intervals in this BED file (target)"> + <validator type="unspecified_build" /> + </param> + <param name="split" type="boolean" checked="false" truevalue="-split" falsevalue="" label="Treat split/spliced BAM or BED12 entries as distinct BED intervals when computing coverage." help="If set, the coverage will be calculated based on the spliced intervals only. For BAM files, this inspects the CIGAR N operation to infer the blocks for computing coverage. For BED12 files, this inspects the BlockCount, BlockSizes, and BlockStarts fields (i.e., columns 10,11,12). If this option is not set, coverage will be calculated based on the interval's START/END coordinates, and would include introns in the case of RNAseq data." 
 /> + + <param name="strand" type="select" label="Count"> + <option value="">overlaps on either strand</option> + <option value="-s">only overlaps occurring on the **same** strand.</option> + <option value="-S">only overlaps occurring on the **opposite** strand.</option> + </param> + </inputs> + + <outputs> + <data format="bed" name="output" metadata_source="inputB" label="count of overlaps in ${inputA.name} on ${inputB.name}"/> + </outputs> + <help> + + **What it does** + + This tool counts the number of intervals in a BAM or BED file (the source) that overlap another BED file (the target). + + .. class:: infomark + + The output file will consist of each interval from your original target BED file, plus an additional column indicating the number of intervals in your source file that overlapped that target interval. + + + ------ + + This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ + + .. __: http://code.google.com/p/bedtools/ + .. __: http://cphg.virginia.edu/quinlan/ + .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short + + </help> +</tool>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/genomeCoverageBed_bedgraph.xml Fri Jan 10 11:51:13 2014 -0500 @@ -0,0 +1,109 @@ +<tool id="bedtools_genomecoveragebed_bedgraph" name="Create a BedGraph" version="2.18.2.0"> + <description>of genome coverage</description> + <requirements> + <requirement type="package" version="2.18.2">bedtools</requirement> + </requirements> + <version_command>bedtools --version</version_command> + <command>genomeCoverageBed + #if $input.ext == "bam" + -ibam '$input' + #else + -i '$input' + -g ${chromInfo} + #end if + + #if str($scale): + -scale $scale + #end if + + -bg + $zero_regions + $split + $strand + > '$output' + </command> + + <inputs> + <param format="bed,bam" name="input" type="data" label="The BAM or BED file from which coverage should be computed"> + <validator type="unspecified_build" /> + </param> + + <param name="zero_regions" type="boolean" checked="true" truevalue="-bga" falsevalue="" label="Report regions with zero coverage" help="If set, regions without any coverage will also be reported." /> + + <param name="split" type="boolean" checked="false" truevalue="-split" falsevalue="" label="Treat split/spliced BAM or BED12 entries as distinct BED intervals when computing coverage." help="If set, the coverage will be calculated based on the spliced intervals only. For BAM files, this inspects the CIGAR N operation to infer the blocks for computing coverage. For BED12 files, this inspects the BlockCount, BlockSizes, and BlockStarts fields (i.e., columns 10,11,12). If this option is not set, coverage will be calculated based on the interval's START/END coordinates, and would include introns in the case of RNAseq data." 
/> + + <param name="strand" type="select" label="Calculate coverage based on"> + <option value="">both strands combined</option> + <option value="-strand +">positive strand only</option> + <option value="-strand -">negative strand only</option> + </param> + + <param name="scale" type="text" optional="true" label="Scale the coverage by a constant factor" help="Each BEDGRAPH coverage value is multiplied by this factor before being reported. Useful for normalizing coverage by, e.g., reads per million (RPM)"/> + </inputs> + + <outputs> + <data format="bedgraph" name="output" metadata_source="input" label="${input.name} (Genome Coverage BedGraph)" /> + </outputs> + <help> + + +**What it does** + +This tool calculates the genome-wide coverage of intervals defined in a BAM or BED file and reports them in BedGraph format. + +.. class:: warningmark + +The input BED or BAM file must be sorted by chromosome name (but doesn't necessarily have to be sorted by start position). + +----- + +**Example 1** + +Input (BED format)- +Overlapping, un-sorted intervals:: + + chr1 140 176 + chr1 100 130 + chr1 120 147 + + +Output (BedGraph format)- +Sorted, non-overlapping intervals, with coverage value on the 4th column:: + + chr1 100 120 1 + chr1 120 130 2 + chr1 130 140 1 + chr1 140 147 2 + chr1 147 176 1 + +----- + +**Example 2 - with ZERO-Regions selected (assuming hg19)** + +Input (BED format)- +Overlapping, un-sorted intervals:: + + chr1 140 176 + chr1 100 130 + chr1 120 147 + + +Output (BedGraph format)- +Sorted, non-overlapping intervals, with coverage value on the 4th column:: + + chr1 0 100 0 + chr1 100 120 1 + chr1 120 130 2 + chr1 130 140 1 + chr1 140 147 2 + chr1 147 176 1 + chr1 176 249250621 0 + + +This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ + + .. 
__: http://code.google.com/p/bedtools/ + .. __: http://cphg.virginia.edu/quinlan/ + .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short + </help> +</tool>
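Example 1 in the help text above can be reproduced with a small sweep over interval start and end points. The sketch below is an illustration only, not how genomeCoverageBed is implemented; the function name `bedgraph_coverage` is made up for the example, and it covers only the case without zero-coverage regions.

```python
from collections import defaultdict


def bedgraph_coverage(intervals):
    """Return (chrom, start, end, depth) rows for regions with depth > 0.

    Hypothetical helper: intervals are half-open (chrom, start, end) tuples
    and do not need to be sorted.
    """
    # +1 at each start and -1 at each end, grouped by chromosome.
    events = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in intervals:
        events[chrom][start] += 1
        events[chrom][end] -= 1

    rows = []
    for chrom in sorted(events):
        depth = 0
        prev = None
        for pos in sorted(events[chrom]):
            if depth > 0 and prev is not None:
                last = rows[-1] if rows else None
                if last and last[0] == chrom and last[2] == prev and last[3] == depth:
                    # Same depth as the adjacent previous row: extend it.
                    rows[-1] = (chrom, last[1], pos, depth)
                else:
                    rows.append((chrom, prev, pos, depth))
            depth += events[chrom][pos]
            prev = pos
    return rows


example = [("chr1", 140, 176), ("chr1", 100, 130), ("chr1", 120, 147)]
for row in bedgraph_coverage(example):
    print(*row, sep="\t")
```

Reporting zero-coverage regions as in Example 2 would additionally need the chromosome sizes, which the wrapper passes through `-g ${chromInfo}`.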
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/genomeCoverageBed_histogram.xml Fri Jan 10 11:51:13 2014 -0500 @@ -0,0 +1,72 @@ +<tool id="bedtools_genomecoveragebed_histogram" name="Create a histogram" version="2.18.2.0"> + <description>of genome coverage</description> + <requirements> + <requirement type="package" version="2.18.2">bedtools</requirement> + </requirements> + <version_command>bedtools --version</version_command> + <command>genomeCoverageBed + #if $input.ext == "bam" + -ibam '$input' + #else + -i '$input' + -g ${chromInfo} + #end if + #if str($max): + -max $max + #end if + > '$output' + </command> + + <inputs> + <param format="bed,bam" name="input" type="data" label="The BAM or BED file from which coverage should be computed"></param> + <param name="max" type="text" optional="true" label="Max depth" help="Combine all positions with a depth >= max into a single bin in the histogram."/> + </inputs> + + <outputs> + <data format="tabular" name="output" metadata_source="input" label="${input.name} (Genome Coverage Histogram)" /> + </outputs> + <help> +**What it does** + +This tool calculates a histogram of genome coverage depth based on mapped reads in BAM format or intervals in BED format. + + +------ + + +.. class:: infomark + +The output file will contain five columns: + + * 1. Chromosome name (or 'genome' for whole-genome coverage) + * 2. Coverage depth + * 3. The number of bases on chromosome (or genome) with depth equal to column 2. + * 4. The size of chromosome (or entire genome) in base pairs + * 5. The fraction of bases on chromosome (or entire genome) with depth equal to column 2. 
+ +**Example Output**:: + + chr2L 0 1379895 23011544 0.0599653 + chr2L 1 837250 23011544 0.0363839 + chr2L 2 904442 23011544 0.0393038 + chr2L 3 913723 23011544 0.0397072 + chr2L 4 952166 23011544 0.0413778 + chr2L 5 967763 23011544 0.0420555 + chr2L 6 986331 23011544 0.0428624 + chr2L 7 998244 23011544 0.0433801 + chr2L 8 995791 23011544 0.0432735 + chr2L 9 996398 23011544 0.0432999 + + + + +------ + +This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ + + .. __: http://code.google.com/p/bedtools/ + .. __: http://cphg.virginia.edu/quinlan/ + .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short + + </help> +</tool>
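The five output columns described above can be illustrated with a toy histogram calculation. This is a sketch under stated assumptions: `coverage_histogram` and its `max_depth` argument are names invented for the example (the latter mimics the "Max depth" option), the chromosome size of 200 is made up, and the per-base array is only practical for tiny inputs.

```python
from collections import Counter


def coverage_histogram(intervals, chrom_sizes, max_depth=None):
    """Return (chrom, depth, bases, chrom_size, fraction) rows per chromosome.

    Hypothetical helper: intervals are half-open (chrom, start, end) tuples,
    chrom_sizes maps chromosome name to length in base pairs.
    """
    rows = []
    for chrom in sorted(chrom_sizes):
        size = chrom_sizes[chrom]
        depth = [0] * size  # one counter per base; fine for toy data only
        for c, start, end in intervals:
            if c == chrom:
                for pos in range(start, min(end, size)):
                    depth[pos] += 1
        if max_depth is not None:
            # Combine all positions with depth >= max_depth into one bin.
            depth = [min(d, max_depth) for d in depth]
        hist = Counter(depth)
        for d in sorted(hist):
            rows.append((chrom, d, hist[d], size, hist[d] / size))
    return rows


example = [("chr1", 140, 176), ("chr1", 100, 130), ("chr1", 120, 147)]
for row in coverage_histogram(example, {"chr1": 200}):
    print(*row, sep="\t")
```

The sketch omits the whole-genome summary rows (labelled 'genome' in column 1) that the help text mentions.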
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/intersectBed.xml Fri Jan 10 11:51:13 2014 -0500 @@ -0,0 +1,155 @@ +<tool id="bedtools_intersectbed" name="Intersect interval files" version="2.18.2.0"> + <description> + </description> + <requirements> + <requirement type="package" version="2.18.2">bedtools</requirement> + </requirements> + <version_command>bedtools --version</version_command> + <command> + intersectBed + + #if $intype.inselect == "bam" + -abam $intype.inputBam -b $input $intype.bed + #else + -a $intype.inputBed -b $input + #end if + + #if $output_opt.output_opt_select == "yes" + $output_opt.overlap_mode + $output_opt.u + $output_opt.c + $output_opt.v + #end if + #if $overlap_opt.overlap_opt_select == "yes" + #if str($overlap_opt.f.value) != "None" + -f $overlap_opt.f + #end if + $overlap_opt.r + #end if + $split + $header + $strand + > $output + </command> + <inputs> + <conditional name="intype"> + <param name="inselect" type="select" label="Select input file type"> + <option value="bed" selected="true">BED, GFF, Interval, VCF</option> + <option value="bam">BAM</option> + </param> + <when value="bed"> + <param name="inputBed" type="data" format="bed,gff,interval,vcf" label="Input A" help="Each feature in A is compared to B in search of overlaps."/> + </when> + <when value="bam"> + <param name="inputBam" type="data" format="bam" label="Input A" help="Each BAM alignment in A is compared to B in search of overlaps."/> + <param name="bed" type="boolean" truevalue="-bed" falsevalue="" checked="False" label="Write output as BED" help="The default is to write output in BAM when using BAM as input."/> + </when> + </conditional> + + <param name="input" type="data" format="bed,gff,vcf" label="Input B"/> + + <conditional name="output_opt"> + <param name="output_opt_select" type="select" label="Show output options"> + <option value="no" selected="true">No</option> + <option value="yes">Yes</option> + </param> + <when value="yes"> + <param name="overlap_mode" 
type="select" label="Calculate coverage based on"> + <option value="">Overlaps on either strand</option> + <option value="-wa">Write the original entry in A for each overlap.</option> + <option value="-wb">Write the original entry in B for each overlap. Useful for knowing what A overlaps. Restricted by the fraction- and reciprocal option.</option> + <option value="-wo">Write the original A and B entries plus the number of base pairs of overlap between the two features. Only A features with overlap are reported. Restricted by the fraction- and reciprocal option.</option> + <option value="-wao">Write the original A and B entries plus the number of base pairs of overlap between the two features. However, A features w/o overlap are also reported with a NULL B feature and overlap = 0. Restricted by the fraction- and reciprocal option.</option> + <option value="-loj">Perform a "left outer join". That is, for each feature in A report each overlap with B. If no overlaps are found, report a NULL feature for B.</option> + </param> + <param name="u" type="boolean" truevalue="-u" falsevalue="" checked="False" label="Write original A entry once if any overlaps found in B" help="Frequently a feature in "A" will overlap with multiple features in "B". By default, intersectBed will report each overlap as a separate output line. However, one may want to simply know that there is at least one overlap (or none). When one uses this option, "A" features that overlap with one or more "B" features are reported once. 
 Those that overlap with no "B" features are not reported at all."/> + <param name="c" type="boolean" truevalue="-c" falsevalue="" checked="False" label="For each entry in A, report the number of hits in B while restricting to the minimum overlap fraction" help="Reports a column after each "A" feature indicating the number (0 or more) of overlapping features found in "B"."/> + <param name="v" type="boolean" truevalue="-v" falsevalue="" checked="False" label="Only report those entries in A that have no overlap in B"/> + </when> + <when value="no"/> + </conditional> + + <conditional name="overlap_opt"> + <param name="overlap_opt_select" type="select" label="Show options for overlap definition"> + <option value="no" selected="true">No</option> + <option value="yes">Yes</option> + </param> + <when value="yes"> + <param name="f" type="float" min="0" max="1" value="" optional="true" label="Minimum overlap required as a fraction of A" help="By default, intersectBed will report an overlap between A and B so long as at least one base pair overlaps. Yet sometimes you may want to restrict reported overlaps between A and B to cases where the feature in B overlaps at least X% (e.g. 50%) of the A feature. This option does exactly this. Default is 1E-9 (i.e. 1bp). For example, to require that the overlap affects 50% of the BAM alignment, use 0.50."/> + <param name="r" type="boolean" truevalue="-r" falsevalue="" checked="False" label="Require that the fraction of overlap be reciprocal for A and B" help="In other words, if the minimum overlap fraction is 0.90 and this option is used, it requires that B overlap at least 90% of A and that A also overlaps at least 90% of B."/> + </when> + <when value="no"/> + </conditional> + <param name="split" type="boolean" truevalue="-split" falsevalue="" checked="false" label="Treat split/spliced BAM or BED12 entries as distinct BED intervals when computing coverage." 
help="If set, the coverage will be calculated based on the spliced intervals only. For BAM files, this inspects the CIGAR N operation to infer the blocks for computing coverage. For BED12 files, this inspects the BlockCount, BlockSizes, and BlockStarts fields (i.e., columns 10,11,12). If this option is not set, coverage will be calculated based on the interval's START/END coordinates, and would include introns in the case of RNAseq data." /> + <param name="header" type="boolean" checked="false" truevalue="-header" falsevalue="" label="Print the header from the A file prior to results." /> + <param name="strand" type="select" label="Calculate coverage based on"> + <option value="">Overlaps on either strand</option> + <option value="-s">Only overlaps occurring on the **same** strand.</option> + <option value="-S">Only overlaps occurring on the **opposite** strand.</option> + </param> + + </inputs> + <outputs> + <data name="output" format="bed" label="Intersection of ${input.name}"> + <change_format> + <when input="intype.bed" value="" format="bam"/> + </change_format> + </data> + </outputs> + +<help> + +.. class:: infomark + +Note that each BAM alignment is treated individually. Therefore, if one end of a paired-end alignment overlaps an interval in the BED file, +yet the other end does not, the output file will only include the overlapping end. + +.. class:: infomark + +Note that a BAM alignment will be sent to the output file **once** even if it overlaps more than one interval in the BED file. + + +**What it does** + +By far, the most common question asked of two sets of genomic features is whether or not any of the +features in the two sets "overlap" with one another. This is known as feature intersection. intersectBed +allows one to screen for overlaps between two sets of genomic features. Moreover, it allows one to have +fine control as to how the intersections are reported. intersectBed works with both BED/GFF +and BAM files as input. 
 + +By default, if an overlap is found, intersectBed reports the shared interval between the two +overlapping features. + + +**Default behavior when using BAM input** + +When comparing alignments in BAM format to features in BED format, intersectBed +will, by default, write the output in BAM format. That is, each alignment in the BAM file that meets +the user's criteria will be written in BAM format. This serves as a mechanism to +create subsets of BAM alignments that are of biological interest, etc. Note that each mate of a paired-end +alignment is compared to the BED file individually. Thus, if only one end of a paired-end sequence overlaps with a +feature in B, then that end will be written to the BAM output. By contrast, the other mate for the +pair will not be written. One should use pairToBed if one wants each BAM alignment +for a pair to be written to BAM output. + + +**Output BED format when using BAM input** + +When comparing alignments in BAM format to features in BED format, intersectBed +will optionally write the output in BED format. That is, each alignment in the BAM file is converted +to a 6 column BED feature and if overlaps are found (or not) based on the user's criteria, the BAM +alignment will be reported in BED format. The BED "name" field is taken from the QNAME field in +the BAM alignment. If mate information is available, the mate (e.g., "/1" or "/2") field will be +appended to the name. The "score" field is the mapping quality score from the BAM alignment. + + +------ + +This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ + + .. __: http://code.google.com/p/bedtools/ + .. __: http://cphg.virginia.edu/quinlan/ + .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short + +</help> +</tool>
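The "minimum overlap fraction" (`-f`) and "reciprocal" (`-r`) options described in this wrapper amount to a simple test on two half-open intervals. The following is a minimal sketch of that test, assuming non-empty intervals; the function name `passes_overlap` and its parameters are hypothetical and are not part of bedtools.

```python
def passes_overlap(a, b, min_fraction=1e-9, reciprocal=False):
    """Decide whether intervals a and b overlap under the -f / -r rules.

    Hypothetical helper: a and b are half-open (start, end) pairs on the
    same chromosome, each with end > start.
    """
    a_start, a_end = a
    b_start, b_end = b
    shared = min(a_end, b_end) - max(a_start, b_start)
    if shared <= 0:
        return False  # no shared base pair at all
    if shared / (a_end - a_start) < min_fraction:
        return False  # overlap covers too little of A
    if reciprocal and shared / (b_end - b_start) < min_fraction:
        return False  # with -r, the overlap must also cover enough of B
    return True


a, b = (100, 200), (150, 400)           # 50 bp shared: 50% of A, 20% of B
print(passes_overlap(a, b))             # default: any shared base counts
print(passes_overlap(a, b, 0.5))        # 50% of A is covered
print(passes_overlap(a, b, 0.5, True))  # only 20% of B is covered
```

The default threshold of 1e-9 is taken from the help text, where it stands for "at least 1 bp".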
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/multiIntersectBed.xml Fri Jan 10 11:51:13 2014 -0500 @@ -0,0 +1,200 @@ +<tool id="bedtools_multiintersectbed" name="Intersect" version="2.18.2.0"> + <description>multiple sorted BED files</description> + <requirements> + <requirement type="package" version="2.18.2">bedtools</requirement> + </requirements> + <version_command>bedtools --version</version_command> + <command>multiIntersectBed + $header + #if $zero.value == True: + -empty + -g ${chromInfo} + #end if + + -i '$input1' + '$input2' + #for $q in $beds + '${q.input}' + #end for + + -names + #if $name1.choice == "tag": + '${input1.name}' + #else + '${name1.custom_name}' + #end if + + #if $name2.choice == "tag": + '${input2.name}' + #else + '${name2.custom_name}' + #end if + + #for $q in $beds + #if $q.name.choice == "tag": + '${q.input.name}' + #else + '${q.input.custom_name}' + #end if + #end for + > '$output' + </command> + <inputs> + <!-- Make it easy for the user, first two input files are always shown --> + <!-- INPUT 1 --> + <param name="input1" format="bed" type="data" label="First sorted BED file" /> + + <conditional name="name1"> + <param name="choice" type="select" label="Sample name"> + <option value="tag" selected="true">Use input's tag</option> + <option value="custom">Enter custom table name</option> + </param> + <when value="tag"> + </when> + <when value="custom"> + <param name="custom_name" type="text" area="false" label="Custom sample name"/> + </when> + </conditional> + + <!-- INPUT 2 --> + <param name="input2" format="bed" type="data" label="Second sorted BED file" /> + + <conditional name="name2"> + <param name="choice" type="select" label="Sample name"> + <option value="tag" selected="true">Use input's tag</option> + <option value="custom">Enter custom table name</option> + </param> + <when value="tag"> + </when> + <when value="custom"> + <param name="custom_name" type="text" area="false" label="Custom sample name"/> + </when> + 
</conditional> + + <!-- Additional files, if the user needs more --> + <repeat name="beds" title="Add'l sorted BED files" > + <param name="input" format="bed" type="data" label="BED file" /> + + <conditional name="name"> + <param name="choice" type="select" label="Sample name"> + <option value="tag" selected="true">Use input's tag</option> + <option value="custom">Enter custom table name</option> + </param> + <when value="tag"> + </when> + <when value="custom"> + <param name="custom_name" type="text" area="false" label="Custom sample name"/> + </when> + </conditional> + </repeat> + + <param name="header" type="boolean" checked="true" truevalue="-header" falsevalue="" label="Print header line" help="The first line will include the name of each sample." /> + + <param name="zero" type="boolean" checked="true" label="Report regions that are not covered by any of the files" help="If set, regions that are not overlapped by any file will also be reported. Requires a valid organism key for all input datasets" /> + + </inputs> + + <outputs> + <data format="tabular" name="output" metadata_source="input1" label="Common intervals identified from among ${input1.name}, ${input2.name} and so on." /> + </outputs> + <help> + +**What it does** + +This tool identifies common intervals among multiple, sorted BED files. Intervals can be common among 0 to N of the N input BED files. The pictorial and raw data examples below illustrate the behavior of this tool more clearly. + + +.. image:: http://people.virginia.edu/~arq5x/files/bedtools-galaxy/mbi.png + + +.. class:: warningmark + +This tool requires that each BED file is reference-sorted (chrom, then start). + + +.. class:: infomark + +The output file will contain five fixed columns, plus additional columns for each BED file: + + * 1. Chromosome name (or 'genome' for whole-genome coverage). + * 2. The zero-based start position of the interval. + * 3. The one-based end position of the interval. + * 4. 
The number of input files that had at least one feature overlapping this interval. + * 5. A list of input files or labels that had at least one feature overlapping this interval. + * 6. For each input file, an indication (1 = Yes, 0 = No) of whether or not the file had at least one feature overlapping this interval. + +------ + +**Example input**:: + + # a.bed + chr1 6 12 + chr1 10 20 + chr1 22 27 + chr1 24 30 + + # b.bed + chr1 12 32 + chr1 14 30 + + # c.bed + chr1 8 15 + chr1 10 14 + chr1 32 34 + + +------ + +**Example without a header and without reporting intervals with zero coverage**:: + + + chr1 6 8 1 1 1 0 0 + chr1 8 12 2 1,3 1 0 1 + chr1 12 15 3 1,2,3 1 1 1 + chr1 15 20 2 1,2 1 1 0 + chr1 20 22 1 2 0 1 0 + chr1 22 30 2 1,2 1 1 0 + chr1 30 32 1 2 0 1 0 + chr1 32 34 1 3 0 0 1 + + +**Example adding a header line**:: + + + chrom start end num list a.bed b.bed c.bed + chr1 6 8 1 1 1 0 0 + chr1 8 12 2 1,3 1 0 1 + chr1 12 15 3 1,2,3 1 1 1 + chr1 15 20 2 1,2 1 1 0 + chr1 20 22 1 2 0 1 0 + chr1 22 30 2 1,2 1 1 0 + chr1 30 32 1 2 0 1 0 + chr1 32 34 1 3 0 0 1 + + +**Example adding a header line and custom file labels**:: + + + chrom start end num list joe bob sue + chr1 6 8 1 joe 1 0 0 + chr1 8 12 2 joe,sue 1 0 1 + chr1 12 15 3 joe,bob,sue 1 1 1 + chr1 15 20 2 joe,bob 1 1 0 + chr1 20 22 1 bob 0 1 0 + chr1 22 30 2 joe,bob 1 1 0 + chr1 30 32 1 bob 0 1 0 + chr1 32 34 1 sue 0 0 1 + + +----- + + +This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ + + .. __: http://code.google.com/p/bedtools/ + .. __: http://cphg.virginia.edu/quinlan/ + .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short + + + </help> +</tool>
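The example tables above can be reproduced with a short sweep over all interval boundaries. This is an illustrative sketch rather than the multiIntersectBed algorithm itself; `multi_intersect` is a name invented for the example, and it models the case without a header and without zero-coverage regions.

```python
def multi_intersect(files):
    """Return rows (chrom, start, end, num, list, flag_1, ..., flag_N).

    Hypothetical helper: each file is a list of half-open
    (chrom, start, end) tuples.
    """
    rows = []
    chroms = sorted({c for records in files for c, _, _ in records})
    for chrom in chroms:
        points = sorted({
            pos
            for records in files
            for c, start, end in records
            if c == chrom
            for pos in (start, end)
        })
        for seg_start, seg_end in zip(points, points[1:]):
            # 1 if any interval of the file contains this segment, else 0.
            flags = tuple(
                int(any(c == chrom and s <= seg_start and seg_end <= e
                        for c, s, e in records))
                for records in files
            )
            if not any(flags):
                continue  # not covered by any file
            last = rows[-1] if rows else None
            if last and last[0] == chrom and last[2] == seg_start and last[5:] == flags:
                # Same set of files as the adjacent previous row: extend it.
                rows[-1] = (chrom, last[1], seg_end) + last[3:]
            else:
                hits = [str(i + 1) for i, flag in enumerate(flags) if flag]
                rows.append((chrom, seg_start, seg_end, len(hits), ",".join(hits)) + flags)
    return rows


a_bed = [("chr1", 6, 12), ("chr1", 10, 20), ("chr1", 22, 27), ("chr1", 24, 30)]
b_bed = [("chr1", 12, 32), ("chr1", 14, 30)]
c_bed = [("chr1", 8, 15), ("chr1", 10, 14), ("chr1", 32, 34)]
for row in multi_intersect([a_bed, b_bed, c_bed]):
    print(*row, sep="\t")
```

Replacing the 1-based file indices in the fifth column with labels would give the "custom file labels" variant shown above.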
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/sortBed.xml Fri Jan 10 11:51:13 2014 -0500 @@ -0,0 +1,49 @@ +<tool id="bedtools_sortbed" name="Sort BED" version="2.18.2.0"> + <description>files</description> + <requirements> + <requirement type="package" version="2.18.2">bedtools</requirement> + </requirements> + <version_command>bedtools --version</version_command> + <command> + sortBed -i $input $option > $output + </command> + <inputs> + <param format="bed" name="input" type="data" label="Sort the following BED file"/> + <param name="option" type="select" label="Sort by"> + <!-- sort -k1,1 -k2,2n a.bed --> + <option value="">chromosome, then by start position (asc)</option> + <option value="-sizeA">feature size in ascending order.</option> + <option value="-sizeD">feature size in descending order.</option> + <option value="-chrThenSizeA">chromosome, then by feature size (asc).</option> + <option value="-chrThenSizeD">chromosome, then by feature size (desc).</option> + <option value="-chrThenScoreA">chromosome, then by score (asc).</option> + <option value="-chrThenScoreD">chromosome, then by score (desc).</option> + </param> + </inputs> + + <outputs> + <data format="bed" name="output" metadata_source="input" label="${input.name} (as BED)"/> + </outputs> + <help> + +**What it does** + +Sorts a feature file by chromosome and other criteria. + + +.. class:: warningmark + +Note that sortBed is merely a convenience utility, as the UNIX sort utility +will sort BED files more quickly while using less memory. For example, UNIX sort will sort a BED file +by chromosome then by start position in the following manner: sort -k1,1 -k2,2n a.bed + +------ + +This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR, and Hall I.M. BEDTools: A flexible framework for comparing genomic features. Bioinformatics, 2010, 26, 6.`__ + + .. __: http://code.google.com/p/bedtools/ + .. 
__: http://cphg.virginia.edu/quinlan/ + .. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short + + </help> +</tool>
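The sort orders offered by this wrapper can be expressed as sort keys. The sketch below is an illustration only: `sort_bed` is a hypothetical function, the dictionary keys simply reuse the option values from the wrapper, records are assumed to be 5-column BED tuples, and the way sortBed breaks ties is not modelled.

```python
def sort_bed(records, option=""):
    """Sort (chrom, start, end, name, score) records by the chosen criterion.

    Hypothetical helper: chromosome names are compared as plain strings,
    so "chr10" sorts before "chr2".
    """
    def size(rec):
        return rec[2] - rec[1]

    keys = {
        "": lambda r: (r[0], r[1]),                 # chromosome, then start
        "-sizeA": lambda r: size(r),                # feature size, ascending
        "-sizeD": lambda r: -size(r),               # feature size, descending
        "-chrThenSizeA": lambda r: (r[0], size(r)),
        "-chrThenSizeD": lambda r: (r[0], -size(r)),
        "-chrThenScoreA": lambda r: (r[0], r[4]),
        "-chrThenScoreD": lambda r: (r[0], -r[4]),
    }
    return sorted(records, key=keys[option])


records = [
    ("chr2", 100, 200, "a", 5.0),
    ("chr1", 500, 1000, "b", 1.0),
    ("chr1", 50, 60, "c", 9.0),
]
for rec in sort_bed(records, "-chrThenScoreD"):
    print(*rec, sep="\t")
```

Each key is a tuple ordered from the most significant criterion to the least, which is the same idea as giving `sort` one `-k` option per column.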
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tool_dependencies.xml Fri Jan 10 11:51:13 2014 -0500 @@ -0,0 +1,6 @@ +<?xml version="1.0"?> +<tool_dependency> + <package name="bedtools" version="2.18.2"> + <repository changeset_revision="044d68a0c07f" name="package_bedtools_2_18" owner="iuc" toolshed="http://testtoolshed.g2.bx.psu.edu" /> + </package> +</tool_dependency>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/unionBedGraphs.xml Fri Jan 10 11:51:13 2014 -0500 @@ -0,0 +1,240 @@ +<tool id="bedtools_mergebedgraph" name="Merge BedGraph" version="2.18.2.0"> + <description>files</description> + <requirements> + <requirement type="package" version="2.18.2">bedtools</requirement> + </requirements> + <version_command>bedtools --version</version_command> + <command>unionBedGraphs + $header + -filler '$filler' + #if $zero.value == True: + -empty + -g ${chromInfo} + #end if + + -i '$input1' + '$input2' + #for $q in $bedgraphs + '${q.input}' + #end for + + -names + #if $name1.choice == "tag": + '${input1.name}' + #else + '${name1.custom_name}' + #end if + + #if $name2.choice == "tag": + '${input2.name}' + #else + '${name2.custom_name}' + #end if + + #for $q in $bedgraphs + #if $q.name.choice == "tag": + '${q.input.name}' + #else + '${q.input.custom_name}' + #end if + #end for + > '$output' + </command> + + <inputs> + <!-- Make it easy for the user, first two input files are always shown --> + <!-- INPUT 1 --> + <param name="input1" format="bedgraph" type="data" label="First BedGraph file" /> + + <conditional name="name1"> + <param name="choice" type="select" label="Sample name"> + <option value="tag" selected="true">Use input's tag</option> + <option value="custom">Enter custom table name</option> + </param> + <when value="tag"> + </when> + <when value="custom"> + <param name="custom_name" type="text" area="false" label="Custom sample name"/> + </when> + </conditional> + + <!-- INPUT 2 --> + <param name="input2" format="bedgraph" type="data" label="Second BedGraph file" /> + + <conditional name="name2"> + <param name="choice" type="select" label="Sample name"> + <option value="tag" selected="true">Use input's tag</option> + <option value="custom">Enter custom table name</option> + </param> + <when value="tag"> + </when> + <when value="custom"> + <param name="custom_name" type="text" area="false" label="Custom sample name"/> + </when> 
+        </conditional>
+
+        <!-- Additional files, if the user needs more -->
+        <repeat name="bedgraphs" title="Additional BedGraph files" >
+            <param name="input" format="bedgraph" type="data" label="BedGraph file" />
+
+            <conditional name="name">
+                <param name="choice" type="select" label="Sample name">
+                    <option value="tag" selected="true">Use input's tag</option>
+                    <option value="custom">Enter custom sample name</option>
+                </param>
+                <when value="tag">
+                </when>
+                <when value="custom">
+                    <param name="custom_name" type="text" area="false" label="Custom sample name"/>
+                </when>
+            </conditional>
+        </repeat>
+
+        <param name="header" type="boolean" checked="true" truevalue="-header" falsevalue="" label="Print header line" help="The first line will include the name of each sample." />
+
+        <param name="zero" type="boolean" checked="true" label="Report regions with zero coverage" help="If set, regions without any coverage will also be reported. Requires a valid organism key for all input datasets." />
+
+        <param name="filler" type="text" value="0" label="Text to use for no-coverage value" help="Can be 0.0, N/A, - or any other value." />
+    </inputs>
+
+    <outputs>
+        <data format="tabular" name="output" metadata_source="input1" label="Merged BedGraphs of ${input1.name}, ${input2.name} and so on." />
+    </outputs>
+    <help>
+
+**What it does**
+
+This tool merges multiple BedGraph files, allowing direct and fine-scale coverage comparisons among many samples/files. The BedGraph files need not represent the same intervals; the tool will identify both common and file-specific intervals. In addition, the BedGraph values need not be numeric: one can use any text as the BedGraph value and the tool will compare the values from multiple files.
+
+.. image:: http://people.virginia.edu/~arq5x/files/bedtools-galaxy/ubg.png
+
+
+.. class:: warningmark
+
+This tool requires that each BedGraph file is reference-sorted (chrom, then start) and contains non-overlapping intervals (within a given file).
+
+
+------
+
+**Example input**::
+
+  # 1.bedgraph
+  chr1  1000  1500  10
+  chr1  2000  2100  20
+
+  # 2.bedgraph
+  chr1  900   1600  60
+  chr1  1700  2050  50
+
+  # 3.bedgraph
+  chr1  1980  2070  80
+  chr1  2090  2100  20
+
+
+------
+
+**Examples using the Zero Coverage checkbox**
+
+Output example (*without* checking "Report regions with zero coverage")::
+
+  chr1  900   1000  0   60  0
+  chr1  1000  1500  10  60  0
+  chr1  1500  1600  0   60  0
+  chr1  1700  1980  0   50  0
+  chr1  1980  2000  0   50  80
+  chr1  2000  2050  20  50  80
+  chr1  2050  2070  20  0   80
+  chr1  2070  2090  20  0   0
+  chr1  2090  2100  20  0   20
+
+
+Output example (*with* checking "Report regions with zero coverage"). The lines marked with (*) are not covered by any input file but are still reported; the asterisk marking does not appear in the file::
+
+  chr1  0     900        0   0   0   (*)
+  chr1  900   1000       0   60  0
+  chr1  1000  1500       10  60  0
+  chr1  1500  1600       0   60  0
+  chr1  1600  1700       0   0   0   (*)
+  chr1  1700  1980       0   50  0
+  chr1  1980  2000       0   50  80
+  chr1  2000  2050       20  50  80
+  chr1  2050  2070       20  0   80
+  chr1  2070  2090       20  0   0
+  chr1  2090  2100       20  0   20
+  chr1  2100  247249719  0   0   0   (*)
+
+
+------
+
+**Examples adjusting the "Filler value" for uncovered intervals**
+
+The default value is '0', but you can use any other value.
+
+Output example with **filler = N/A**::
+
+  chr1  900   1000  N/A  60   N/A
+  chr1  1000  1500  10   60   N/A
+  chr1  1500  1600  N/A  60   N/A
+  chr1  1600  1700  N/A  N/A  N/A
+  chr1  1700  1980  N/A  50   N/A
+  chr1  1980  2000  N/A  50   80
+  chr1  2000  2050  20   50   80
+  chr1  2050  2070  20   N/A  80
+  chr1  2070  2090  20   N/A  N/A
+  chr1  2090  2100  20   N/A  20
+
+
+------
+
+**Examples using the "sample name" labels**::
+
+  chrom  start  end   WT-1  WT-2  KO-1
+  chr1   900    1000  N/A   60    N/A
+  chr1   1000   1500  10    60    N/A
+  chr1   1500   1600  N/A   60    N/A
+  chr1   1600   1700  N/A   N/A   N/A
+  chr1   1700   1980  N/A   50    N/A
+  chr1   1980   2000  N/A   50    80
+  chr1   2000   2050  20    50    80
+  chr1   2050   2070  20    N/A   80
+  chr1   2070   2090  20    N/A   N/A
+  chr1   2090   2100  20    N/A   20
+
+
+------
+
+**Non-numeric values**
+
+The input BedGraph files can contain any kind of value in the fourth column, not necessarily a numeric value.
+
+Input Example::
+
+  File-1                    File-2
+  chr1  200  300  Sample1   chr1  100  240  0.75
+  chr1  400  450  Sample1   chr1  250  700  0.43
+  chr1  530  600  Sample2
+
+Output Example::
+
+  chr1  100  200  0        0.75
+  chr1  200  240  Sample1  0.75
+  chr1  240  250  Sample1  0
+  chr1  250  300  Sample1  0.43
+  chr1  300  400  0        0.43
+  chr1  400  450  Sample1  0.43
+  chr1  450  530  0        0.43
+  chr1  530  600  Sample2  0.43
+  chr1  600  700  0        0.43
+
+
+-----
+
+This tool is part of the `bedtools package`__ from the `Quinlan laboratory`__. If you use this tool, please cite `Quinlan AR and Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 2010, 26(6):841-842.`__
+
+.. __: http://code.google.com/p/bedtools/
+.. __: http://cphg.virginia.edu/quinlan/
+.. __: http://bioinformatics.oxfordjournals.org/content/26/6/841.short
+
+
+    </help>
+</tool>
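
The merging behaviour described in the help text of unionBedGraphs.xml can be illustrated with a small Python sketch. This is not how bedtools implements `unionBedGraphs`; it is a naive reimplementation for illustration only. The function name `union_bedgraphs`, the tuple layout `(chrom, start, end, value)`, and the lexicographic chromosome ordering are all assumptions made here. As in the tool, each track is assumed to be sorted and free of overlapping intervals.

```python
def union_bedgraphs(tracks, filler="0"):
    """Merge BedGraph-like tracks into rows of (chrom, start, end, v1, v2, ...).

    Hypothetical helper, not part of bedtools. Each track is a list of
    (chrom, start, end, value) tuples with non-overlapping intervals.
    Segments covered by no track are skipped (the behaviour without -empty).
    """
    # Collect every interval boundary per chromosome.
    breakpoints = {}
    for track in tracks:
        for chrom, start, end, _value in track:
            breakpoints.setdefault(chrom, set()).update((start, end))

    rows = []
    for chrom in sorted(breakpoints):  # assumption: plain lexicographic order
        cuts = sorted(breakpoints[chrom])
        # Each pair of adjacent boundaries is one candidate output segment.
        for seg_start, seg_end in zip(cuts, cuts[1:]):
            values = []
            covered = False
            for track in tracks:
                value = filler
                for c, start, end, v in track:
                    # A segment lies entirely inside or outside any interval,
                    # because every interval boundary is a cut point.
                    if c == chrom and start <= seg_start and seg_end <= end:
                        value = v
                        covered = True
                        break
                values.append(value)
            if covered:
                rows.append((chrom, seg_start, seg_end, *values))
    return rows


# The three sample tracks from the "Example input" section of the help text.
track1 = [("chr1", 1000, 1500, "10"), ("chr1", 2000, 2100, "20")]
track2 = [("chr1", 900, 1600, "60"), ("chr1", 1700, 2050, "50")]
track3 = [("chr1", 1980, 2070, "80"), ("chr1", 2090, 2100, "20")]

for row in union_bedgraphs([track1, track2, track3]):
    print(*row, sep="\t")
```

With the three sample tracks, the sketch yields the same nine rows as the first output example in the help text (the one without "Report regions with zero coverage"). Passing `filler="N/A"` changes only the placeholder written for tracks that do not cover a segment. Values are kept as strings so that non-numeric fourth columns work unchanged.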