# HG changeset patch
# User nick
# Date 1448317688 18000
# Node ID 4633a25d8c19e033c7fafb0325f1997b4d8537dd
planemo upload commit 801bf168032a13f6405518bddb35a24c9e9a8cd4-dirty
diff -r 000000000000 -r 4633a25d8c19 align_families.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/align_families.xml Mon Nov 23 17:28:08 2015 -0500
@@ -0,0 +1,59 @@
+
+
+ from duplex sequencing data
+ align_families.py $input > $output
+
+
+
+
+
+
+
+
+ mafft
+ duplex
+
+
+
+
+
+
+
+
+
+**What it does**
+
+This tool processes duplex sequencing data. It performs a multiple sequence alignment on each single-stranded family of reads.
+
+-----
+
+**Input**
+
+This expects the output format of the "Make families" tool.
+
+-----
+
+**Output**
+
+The output is a tabular file where each line corresponds to a (single) read.
+
+The columns are::
+
+ 1: barcode (both tags)
+ 2: tag order in barcode ("ab" or "ba")
+ 3: read mate ("1" or "2")
+ 4: read name
+ 5: read sequence, aligned ("-" for gaps)
+ 6: read quality scores, aligned (" " for gaps)
+
+-----
+
+**Alignments**
+
+The alignments are done using MAFFT, specifically the command
+::
+
+ $ mafft --nuc --quiet family.fa > family.aligned.fa
+
+
+
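Note on the align_families.xml output format: the six columns listed in the help text above can be read back roughly as sketched here. This is an illustrative Python sketch and not part of the changeset. The function name `group_aligned_reads` and the sample rows are hypothetical. It assumes the columns are tab-separated and that a family is identified by barcode, tag order, and read mate together.

```python
from collections import defaultdict


def group_aligned_reads(lines):
    """Group rows by (barcode, order, mate), assumed here to identify one family."""
    families = defaultdict(list)
    for line in lines:
        # Strip only the line ending: column 6 uses " " for gaps, so a
        # general strip() could remove meaningful trailing spaces.
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 6:
            continue  # assumption: skip malformed rows rather than fail
        barcode, order, mate, name, seq, qual = fields
        families[(barcode, order, mate)].append((name, seq, qual))
    return dict(families)


# Hypothetical sample rows, made up for illustration only.
sample = [
    "ATGCCT\tab\t1\tread_a\tAC-GT\tII II\n",
    "ATGCCT\tab\t1\tread_b\tACGGT\tIIIII\n",
    "ATGCCT\tba\t1\tread_c\tACGG-\tIIII \n",
]
families = group_aligned_reads(sample)
for key, reads in sorted(families.items()):
    print(key, len(reads))
```

Keeping the trailing space in the quality column matters because aligned sequence and quality strings are expected to stay the same length.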
diff -r 000000000000 -r 4633a25d8c19 duplex.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/duplex.xml Mon Nov 23 17:28:08 2015 -0500
@@ -0,0 +1,50 @@
+
+
+ from duplex sequencing data
+ duplex.fa
+ && awk -f $__tool_directory__/utils/outconv.awk -v target=1 duplex.fa > $output1
+ && awk -f $__tool_directory__/utils/outconv.awk -v target=2 duplex.fa > $output2
+ ]]>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ keep_sscs
+
+
+
+
+**What it does**
+
+This is for processing duplex sequencing data. It creates single-strand and duplex consensus reads from aligned read families.
+
+-----
+
+**Input**
+
+This expects the output format of the "Align families" tool.
+
+-----
+
+**Output**
+
+The tool writes the final duplex consensus reads to two FASTA files, one for the first read of each pair and one for the second. Optionally, the single-strand consensus reads can also be saved to a separate FASTA file.
+
+
+
diff -r 000000000000 -r 4633a25d8c19 make_families.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/make_families.xml Mon Nov 23 17:28:08 2015 -0500
@@ -0,0 +1,79 @@
+
+
+ from duplex sequencing data
+ paste $fastq1 $fastq2
+ | paste - - - -
+ | awk -f $__tool_directory__/make-barcodes.awk -v TAG_LEN=$taglen -v INVARIANT=$invariant
+ | sort
+ > $output
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+**What it does**
+
+This tool processes raw duplex sequencing data. It removes the barcodes from the reads and uses them to group the reads into families that come from the same fragment.
+
+-----
+
+**Output**
+
+The output will be a tabular file where each line corresponds to a pair of input reads.
+
+The columns are::
+
+ 1: barcode (both tags joined and ordered)
+ 2: tag order in barcode ("ab" or "ba")
+ 3: read1 name
+ 4: read1 sequence (minus the tag and invariant sequences)
+ 5: read1 quality scores (minus the same tag and invariant)
+ 6: read2 name
+ 7: read2 sequence (minus the tag and invariant sequences)
+ 8: read2 quality scores (minus the same tag and invariant)
+
+-----
+
+**Barcode creation**
+
+For each pair, the tool removes the tag at the beginning of each read and creates a barcode by concatenating the two tags. The tags are ordered by string comparison, so the same barcode is produced whichever read carries which tag. The original tag order is recorded in the second column ("ab" or "ba").
+
+Since pairs from opposite strands have the same tags in the reverse order, this produces the same barcode for all reads from the same fragment, regardless of strand. A simple sort then groups all reads from the same fragment together, with the two strands separated by their different "order" values.
+
+Examples::
+
+ +---------------+-----------------+
+ | input tags | output |
+ +-------+-------+-------+---------+
+ | read1 | read2 | order | barcode |
+ +-------+-------+-------+---------+
+ | ATG | CCT | ab | ATGCCT |
+ +-------+-------+-------+---------+
+ | CCT | ATG | ba | ATGCCT |
+ +-------+-------+-------+---------+
+
+
+
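Note on the barcode rule in the make_families.xml help above: it can be sketched in a few lines of Python. This is only an illustration and not part of the changeset. The real logic lives in make-barcodes.awk, which this patch does not show, so the function name `make_barcode` and the handling of identical tags are assumptions.

```python
def make_barcode(tag1, tag2):
    """Return (barcode, order) for the tags taken from read1 and read2."""
    # Order the tags by plain string comparison so that both strands of a
    # fragment yield the same barcode.
    # Assumption: identical tags are labelled "ab"; the help text does not say.
    if tag1 <= tag2:
        return tag1 + tag2, "ab"
    return tag2 + tag1, "ba"


# The two rows of the examples table in the help text.
print(make_barcode("ATG", "CCT"))
print(make_barcode("CCT", "ATG"))
```

Both calls give the barcode ATGCCT and differ only in the order value, which is what lets a plain sort place the two strands of a fragment next to each other.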
diff -r 000000000000 -r 4633a25d8c19 tool_dependencies.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_dependencies.xml Mon Nov 23 17:28:08 2015 -0500
@@ -0,0 +1,18 @@
+
+
+
+
+
+
+
+
+ https://github.com/makrutenko/duplex/archive/master.tar.gz
+ duplex-master
+ make
+
+ $INSTALL_DIR
+
+
+
+
+