annotate lastz_d.xml @ 4:0acd9701676b draft

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/lastz commit a7e9d5b3906b7ebb35b1c29c3a8e8203b2cefccd
author iuc
date Fri, 18 May 2018 16:58:24 -0400
parents c3767eaae954
children ec4affe27298
rev   line source
4
0acd9701676b planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/lastz commit a7e9d5b3906b7ebb35b1c29c3a8e8203b2cefccd
iuc
parents: 3
diff changeset
1 <tool id="lastz_d_wrapper" name="LASTZ_D" version="1.3.2">
2
8e9252994649 planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/lastz commit c5379af63b23648020a4709f8ed9d9eac26582aa
iuc
parents:
diff changeset
2 <description>: estimate substitution scores matrix</description>
3 <macros>
4 <import>lastz_macros.xml</import>
5 </macros>
6 <requirements>
7 <requirement type="package" version="@LASTZ_CONDA_VERSION@">lastz</requirement>
3
c3767eaae954 planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/lastz commit 13e9724b44888b0de9535ac7b561ad9686038413
iuc
parents: 2
diff changeset
8 <requirement type="package" version="1.0.6">bzip2</requirement>
2
9 </requirements>
10 <command detect_errors="exit_code"><![CDATA[
11 lastz_D
12 @TARGET_INPUT_COMMAND_LINE@
3
13 @query_input@
2
14 #if $score_file:
15 '--inferonly=${score_file}'
16 #else:
17 --inferonly
18 #end if
19 '--infscores=${output}'
20 ]]>
21 </command>
22 <inputs>
23 <expand macro="target_input"/>
24 <param name="query" format="fasta,fastq" type="data" label="Select QUERY sequence(s)" help="These are the sequences that you are aligning against TARGET"/>
25 <param name="score_file" type="data" format="txt" optional="true" label="Control file for inference" argument="--inferonly[=control_file]" help="Optional control file. If nothing is selected, LASTZ_D uses the defaults described in the manual"/>
26 </inputs>
27 <outputs>
28 <data format="txt" name="output" label="${tool.name} on ${on_string}: substitution matrix"/>
29 </outputs>
30 <tests>
31 <test>
32 <param name="ref_source" value="history" />
33 <param name="target" value="chrM_human.fa" />
34 <param name="query" value="chrM_mouse.fa" />
35 <output name="output" value="lastz_d_test1.out" />
36 </test>
37 <test>
38 <param name="ref_source" value="history" />
39 <param name="target" value="chrM_human.fa" />
40 <param name="query" value="chrM_mouse.fa" />
41 <param name="score_file" value="lastz_d_ctrl_file.txt" />
42 <output name="output" value="lastz_d_test2.out" />
43 </test>
44 </tests>
45
46 <help><![CDATA[
47
4
48 **What it does**
2
49
4
50 LASTZ_D is a non-integer (**D** stands for Double) version of LASTZ that can be used to estimate a substitution matrix for scoring alignments. It was developed by `Bob Harris <http://www.bx.psu.edu/~rsharris/>`_ in the lab of Webb Miller at Penn State as a part of LASTZ. The matrix computed by this tool is intended for use with LASTZ (see below).
2
51
4
52 .. class:: warningmark
2
53
4
54 **Read the documentation** before proceeding. LASTZ is a complex tool with many parameter options. Fortunately, there is a `great manual <https://lastz.github.io/lastz/>`_ maintained by its author. The two sections that are particularly relevant to the inference of a substitution matrix are `Inferring Score Sets <http://www.bx.psu.edu/~rsharris/lastz/README.lastz-1.04.00.html#adv_inference>`_ and `Inference Control File <http://www.bx.psu.edu/~rsharris/lastz/README.lastz-1.04.00.html#fmt_inference>`_.
2
55
4
56 **Notes on the inference**
2
57
4
58 Inference is achieved by computing the probability of each of the 18 different alignment events (gap open, gap extend, and 16 substitutions). These probabilities are estimated from alignments of the sequences. Of course, at first we don't have alignments, so the process begins by creating alignments with a generic scoring set, inferring scores from those, then realigning, and so on, until the scores stabilize or "converge". Ungapped alignments are performed until the substitution scores converge, then gapped alignments are performed (holding the substitution scores constant) until the gap penalties converge. In the end you get a matrix like this::
2
59
4
60 # (a LASTZ scoring set, created by "LASTZ --infer")
2
61
4
62 bad_score = X:-1781 # used for sub[X][*] and sub[*][X]
63 fill_score = -178 # used when sub[*][*] not otherwise defined
64 gap_open_penalty = 400
65 gap_extend_penalty = 30
2
66
4
67 A C G T
68 A 72 -79 -49 -97
69 C -79 100 -178 -49
70 G -49 -178 100 -79
71 T -97 -49 -79 72
2
72
4
73 This dataset can then be used as an input to the **Read the substitution scores** parameter of LASTZ (Parameter section *Scoring*).
2
74
4
75 The iterative process can fail if there is not much sequence to align. For example, if after the 4th iteration there is nothing in the central 50%, the denominators go to zero and the process fails.
2
76
4
77 If the sequences you are aligning have a GC content different from the usual ACGT 30-20-20-30 split, scoring inference should discover this and give you better alignments.
2
78
79
4
80 ]]>
2
81 </help>
82 <expand macro="citations"/>
83 </tool>
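
The scoring set shown in the tool's help text is plain text, so a downstream script can read it directly. The following is a minimal sketch, assuming the output always follows the layout of that example: `key = value` lines, `#` comments, then a header row followed by one row per nucleotide. `parse_scoring_set` and `EXAMPLE` are hypothetical names used for illustration; they are not part of LASTZ or of this wrapper. Scores are parsed as floats on the assumption that LASTZ_D may write non-integer values.

```python
# Hypothetical helper: reads a LASTZ-style scoring set laid out like the
# example in the help text. Not part of LASTZ or of the Galaxy wrapper.

EXAMPLE = """\
# (a LASTZ scoring set, created by "LASTZ --infer")

bad_score = X:-1781 # used for sub[X][*] and sub[*][X]
fill_score = -178 # used when sub[*][*] not otherwise defined
gap_open_penalty = 400
gap_extend_penalty = 30

A C G T
A 72 -79 -49 -97
C -79 100 -178 -49
G -49 -178 100 -79
T -97 -49 -79 72
"""


def parse_scoring_set(text):
    """Return (settings, matrix) parsed from a scoring-set string.

    settings maps each key to its raw string value.
    matrix maps row symbol -> column symbol -> score (float).
    """
    settings = {}
    matrix = {}
    columns = None
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if "=" in line:
            # Assumption: every "key = value" line holds exactly one setting.
            key, value = (part.strip() for part in line.split("=", 1))
            settings[key] = value
        elif columns is None:
            # The first non-setting line is taken to be the column header.
            columns = line.split()
        else:
            row, *scores = line.split()
            matrix[row] = {col: float(s) for col, s in zip(columns, scores)}
    return settings, matrix


settings, matrix = parse_scoring_set(EXAMPLE)
print(settings["gap_open_penalty"], settings["gap_extend_penalty"])
print(matrix["C"]["G"])
```

Keeping the settings as raw strings avoids guessing at value formats such as `X:-1781`; only the matrix cells are converted to numbers.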