<tool id="base_BWASplitSam_2output" name="BWASplitSAM" version="21.1.13" python_template_version="3.6">
    <description>creates a SAM file of uniquely mapping reads from a BWA-MEM alignment</description>
    <macros>
        <import>macros.xml</import>
    </macros>
    <expand macro="requirements"/>
    <command><![CDATA[
    bwa_split_sam_seonly_2output.py
          --sam='$sam'
          --uniq='$uniq'
          --summ='$summ'
]]>
    </command>
    <inputs>
      <param name="sam" type="data" format="sam" label="Select SAM file or collection of SAM files" help="Each SAM file is subset to only its uniquely mapping reads. The SAM file MUST be generated in single-end mode."/>
    </inputs>
    <outputs>
      <data name="uniq" format="sam" label="${tool.name} on ${on_string}: Uniquely Mapped Reads SAM"/>
      <data name="summ" format="tabular" label="${tool.name} on ${on_string}: Summary of Aligned Reads"/>
    </outputs>
    <tests>
        <test>
            <param name="sam" ftype="sam" value="align_and_counts_test_data/bam_to_sam_BASE_test_data.sam"/>
            <output name="uniq" ftype="sam" file="align_and_counts_test_data/W1118_G1_unique_sam_for_BASE.sam"/>
            <output name="summ" file="align_and_counts_test_data/W1118_G1_BWASplitSAM_summary.tabular"/>
        </test>
    </tests>
    <help><![CDATA[
**Tool Description**

This tool subsets a SAM file based on values in the FLAG field. FLAG values are used to sort the reads into categories according to how they align to the reference genome. The tool outputs a SAM file containing only uniquely mapping reads, along with a summary TSV file containing read counts for each alignment category. Uniquely mapping reads are defined as reads that map to a single location in the genome. Together, the 'Mapped' and 'Opposite' categories make up the uniquely mapping reads.

The types of alignment classifications are: Mapped, Unmapped, Opposite, Ambiguous, Chimeric, and Not_primary. More details on these classifications can be found below.

Input SAM files must be generated from single-end alignments. The BWASplitSAM tool works with SAM files generated by BWA-MEM. If a different aligner is used, another tool may be substituted for BWASplitSAM to generate a SAM file of uniquely mapping reads.

**Input**

(1) A SAM file from a **single-end (SE) alignment generated using BWA-MEM**.

**Output**

The following output files are generated based on the SAM FLAG values in the input SAM file:

(1) **Uniquely_mapped_reads.sam**: A SAM file containing only uniquely mapped reads.

(2) **Summary TSV file**: A tab-separated file containing the number of reads in each of the following categories:

    (1) **Mapped**: Reads that map uniquely to the forward strand
    (2) **Unmapped**: Reads that do not align
    (3) **Opposite**: Reads that map uniquely to the reverse strand
    (4) **Ambiguous**: Reads that align to more than one location
    (5) **Chimeric**: Reads that align to distinct positions in the genome (non-linear alignments)
    (6) **Not_primary**: Reads with non-primary alignments (the read has one or more secondary alignments)
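
The categories above can be illustrated with a minimal Python sketch. This is not the code in bwa_split_sam_seonly_2output.py: the function name `classify_alignment` is hypothetical, and the rules used here for Ambiguous (presence of BWA's `XA:Z:` alternative-hits tag) and Chimeric (the supplementary FLAG bit or an `SA:Z:` tag) are assumptions made for illustration. Only the FLAG bit values come from the SAM specification.

```python
# Illustrative sketch only; the real tool may apply different rules.
# FLAG bit values as defined in the SAM specification.
UNMAPPED = 0x4
REVERSE = 0x10
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify_alignment(sam_line):
    """Return one category name for a single-end SAM alignment record."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2 of a SAM record is FLAG
    tags = fields[11:]      # optional TAG:TYPE:VALUE fields start at column 12

    if flag & UNMAPPED:
        return "Unmapped"
    if flag & SECONDARY:
        return "Not_primary"
    # Assumption: a supplementary record or an SA tag marks a chimeric read.
    if flag & SUPPLEMENTARY or any(t.startswith("SA:Z:") for t in tags):
        return "Chimeric"
    # Assumption: BWA's XA tag (alternative hits) marks an ambiguous read.
    if any(t.startswith("XA:Z:") for t in tags):
        return "Ambiguous"
    # What remains maps to a single location; the strand decides the category.
    return "Opposite" if flag & REVERSE else "Mapped"


if __name__ == "__main__":
    record = "r1\t16\tchr2L\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0"
    print(classify_alignment(record))
```

Under these assumptions, the reads written to the uniquely mapped SAM output would be those classified as Mapped or Opposite.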

Example summary output file::


    +---------------+---------------------+---------------------------------------+---------------------+---------------------+----------------------+---------------------+-----------------+
    |   Name        |  count_total_reads  | count_mapped_read_opposite_strand     | count_unmapped_read |  count_mapped_read  | count_ambiguous_read |count_chimeric_read  | count_notprimary|
    +===============+=====================+=======================================+=====================+=====================+======================+=====================+=================+
    | dataset_2215  | 14                  | 5                                     | 0                   | 9                   | 0                    | 0                   | 0               |
    +---------------+---------------------+---------------------------------------+---------------------+---------------------+----------------------+---------------------+-----------------+
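
As a hedged usage sketch, the summary file can be read in Python to compute the fraction of uniquely mapping reads. It assumes the file is tab-delimited with a header row whose column names match the example table; the helper name `unique_fraction` is hypothetical, and the inline text stands in for a real summary file.

```python
import csv
import io

# Column names copied from the example table; their order is an assumption.
COLUMNS = [
    "Name",
    "count_total_reads",
    "count_mapped_read_opposite_strand",
    "count_unmapped_read",
    "count_mapped_read",
    "count_ambiguous_read",
    "count_chimeric_read",
    "count_notprimary",
]

# Stand-in for the contents of a summary TSV file, using the example values.
summary_text = (
    "\t".join(COLUMNS) + "\n"
    + "\t".join(["dataset_2215", "14", "5", "0", "9", "0", "0", "0"]) + "\n"
)


def unique_fraction(row):
    """Fraction of reads that are uniquely mapped (Mapped + Opposite)."""
    unique = int(row["count_mapped_read"]) + int(row["count_mapped_read_opposite_strand"])
    total = int(row["count_total_reads"])
    return unique / total if total else 0.0


rows = list(csv.DictReader(io.StringIO(summary_text), delimiter="\t"))
for row in rows:
    print(row["Name"], unique_fraction(row))
```

For a file on disk, the same reader can be built from `open(path, newline="")` in place of the `io.StringIO` object.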

    ]]></help>
    <citations>
            <citation type="bibtex">@ARTICLE{Miller20BASE,
            author = {Brecca Miller and Alison M. Morse and Elyse Borgert and Zihao Liu and Kelsey Sinclair and Gavin Gamble and Fei Zou and Jeremy Newman and Luis Leon Novello and Fabio Marroni and Lauren M. McIntyre},
            title = {Testcrosses are an efficient strategy for identifying cis regulatory variation: Bayesian analysis of allele imbalance among conditions (BASE)},
            journal = {????},
            year = {submitted for publication}
            }</citation>
        </citations>
</tool>
author	malex
date	Thu, 14 Jan 2021 21:31:44 +0000