changeset 12:28459eecd18c draft
Deleted selected files
author | jasper |
---|---|
date | Fri, 03 Feb 2017 13:45:14 -0500 |
parents | 680d842fa17a |
children | 1df4c5372e07 |
files | README.rst align_back_trans.xml shed.yml tool_dependencies.xml |
diffstat | 4 files changed, 0 insertions(+), 291 deletions(-) |
--- a/README.rst Fri Feb 03 13:13:17 2017 -0500
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,130 +0,0 @@
-Galaxy tool to back-translate a protein alignment to nucleotides
-================================================================
-
-This tool is copyright 2012-2015 by Peter Cock, The James Hutton Institute
-(formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
-See the licence text below (MIT licence).
-
-This tool is a short Python script (using Biopython library functions) to
-load a protein alignment, and matching nucleotide FASTA file of unaligned
-sequences, which are threaded onto the protein alignment in order to produce
-a codon aware nucleotide alignment - which can be viewed as a back translation.
-
-This tool is available from the Galaxy Tool Shed at:
-
-* http://toolshed.g2.bx.psu.edu/view/peterjc/align_back_trans
-
-The underlying Python script can also be used outside of Galaxy, for
-details run::
-
-    $ python align_back_trans.py
-
-Automated Installation
-======================
-
-This should be straightforward using the Galaxy Tool Shed, which should be
-able to automatically install the dependency on Biopython, and then install
-this tool and run its unit tests.
-
-
-Manual Installation
-===================
-
-There are just two files to install to use this tool from within Galaxy:
-
-* ``align_back_trans.py`` (the Python script)
-* ``align_back_trans.xml`` (the Galaxy tool definition)
-
-The suggested location is in a dedicated ``tools/align_back_trans`` folder.
-
-You will also need to modify the ``tools_conf.xml`` file to tell Galaxy to offer
-the tool. One suggested location is in the multiple alignments section. Simply
-add the line::
-
-    <tool file="align_back_trans/align_back_trans.xml" />
-
-You will also need to install Biopython 1.62 or later.
-
-If you wish to run the unit tests, also move/copy the ``test-data/`` files
-under Galaxy's ``test-data/`` folder. Then::
-
-    ./run_tests.sh -id align_back_trans
-
-That's it.
-
-
-History
-=======
-
-======= ======================================================================
-Version Changes
-------- ----------------------------------------------------------------------
-v0.0.1  - Initial version, based on a previously written Python script
-v0.0.2  - Optionally check the translation is consistent
-v0.0.3  - First official release
-v0.0.4  - Simplified XML to apply input format to output data.
-        - Fixed error message when sequence length not a multiple of three.
-v0.0.5  - More explicit error messages when seqences lengths do not match.
-        - Tool definition now embeds citation information.
-v0.0.6  - Reorder XML elements (internal change only).
-        - Use ``format_source=...`` tag.
-        - Planemo for Tool Shed upload (``.shed.yml``, internal change only).
-======= ======================================================================
-
-
-Developers
-==========
-
-This script was initially developed on this repository:
-https://github.com/peterjc/picobio/blob/master/align/align_back_trans.py
-
-With the addition of a Galaxy wrapper, developement moved here:
-https://github.com/peterjc/pico_galaxy/tree/master/tools/align_back_trans
-
-For pushing a release to the test or main "Galaxy Tool Shed", use the following
-Planemo commands (which requires you have set your Tool Shed access details in
-``~/.planemo.yml`` and that you have access rights on the Tool Shed)::
-
-    $ planemo shed_update --shed_target testtoolshed --check_diff ~/repositories/pico_galaxy/tools/align_back_trans/
-    ...
-
-or::
-
-    $ planemo shed_update --shed_target toolshed --check_diff ~/repositories/pico_galaxy/tools/align_back_trans/
-    ...
-
-To just build and check the tar ball, use::
-
-    $ planemo shed_upload --tar_only ~/repositories/pico_galaxy/tools/align_back_trans/
-    ...
-    $ tar -tzf shed_upload.tar.gz
-    test-data/demo_nucs.fasta
-    test-data/demo_nucs_trailing_stop.fasta
-    test-data/demo_prot_align.fasta
-    test-data/demo_nuc_align.fasta
-    tools/align_back_trans/README.rst
-    tools/align_back_trans/align_back_trans.py
-    tools/align_back_trans/align_back_trans.xml
-    tools/align_back_trans/tool_dependencies.xml
-
-
-Licence (MIT)
-=============
-
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in
-all copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
-THE SOFTWARE.
--- a/align_back_trans.xml Fri Feb 03 13:13:17 2017 -0500
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,129 +0,0 @@
-<tool id="align_back_trans" name="Thread nucleotides onto a protein alignment (back-translation)" version="0.0.6">
-    <description>Gives a codon aware alignment</description>
-    <requirements>
-        <requirement type="package" version="1.63">biopython</requirement>
-        <requirement type="python-module">Bio</requirement>
-    </requirements>
-    <stdio>
-        <!-- Anything other than zero is an error -->
-        <exit_code range="1:" />
-        <exit_code range=":-1" />
-    </stdio>
-    <version_command interpreter="python">align_back_trans.py --version</version_command>
-    <command interpreter="python">
-align_back_trans.py $prot_align.ext "$prot_align" "$nuc_file" "$out_nuc_align" "$table"
-    </command>
-    <inputs>
-        <param name="prot_align" type="data" format="fasta,muscle,clustal" label="Aligned protein file" help="Mutliple sequence file in FASTA, ClustalW or PHYLIP format." />
-        <param name="table" type="select" label="Genetic code" help="Tables from the NCBI, these determine the start and stop codons">
-            <option value="1">1. Standard</option>
-            <option value="2">2. Vertebrate Mitochondrial</option>
-            <option value="3">3. Yeast Mitochondrial</option>
-            <option value="4">4. Mold, Protozoan, Coelenterate Mitochondrial and Mycoplasma/Spiroplasma</option>
-            <option value="5">5. Invertebrate Mitochondrial</option>
-            <option value="6">6. Ciliate Macronuclear and Dasycladacean</option>
-            <option value="9">9. Echinoderm Mitochondrial</option>
-            <option value="10">10. Euplotid Nuclear</option>
-            <option value="11">11. Bacterial</option>
-            <option value="12">12. Alternative Yeast Nuclear</option>
-            <option value="13">13. Ascidian Mitochondrial</option>
-            <option value="14">14. Flatworm Mitochondrial</option>
-            <option value="15">15. Blepharisma Macronuclear</option>
-            <option value="16">16. Chlorophycean Mitochondrial</option>
-            <option value="21">21. Trematode Mitochondrial</option>
-            <option value="22">22. Scenedesmus obliquus</option>
-            <option value="23">23. Thraustochytrium Mitochondrial</option>
-            <option value="0">Don't check the translation</option>
-        </param>
-        <param name="nuc_file" type="data" format="fasta" label="Unaligned nucleotide sequences" help="FASTA format, using same identifiers as your protein alignment" />
-    </inputs>
-    <outputs>
-        <data name="out_nuc_align" format_source="prot_align" metadata_source="prot_align" label="${prot_align.name} (back-translated)"/>
-    </outputs>
-    <tests>
-        <test>
-            <param name="prot_align" value="demo_prot_align.fasta" />
-            <param name="nuc_file" value="demo_nucs.fasta" />
-            <param name="table" value="0" />
-            <output name="out_nuc_align" file="demo_nuc_align.fasta" />
-        </test>
-        <test>
-            <param name="prot_align" value="demo_prot_align.fasta" />
-            <param name="nuc_file" value="demo_nucs_trailing_stop.fasta" />
-            <param name="table" value="11" />
-            <output name="out_nuc_align" file="demo_nuc_align.fasta" />
-        </test>
-    </tests>
-    <help>
-**What it does**
-
-Takes an input file of aligned protein sequences (typically FASTA or Clustal
-format), and a matching file of unaligned nucleotide sequences (FASTA format,
-using the same identifiers), and threads the nucleotide sequences onto the
-protein alignment to produce a codon aware nucleotide alignment - which can
-be viewed as a back translation.
-
-If you specify one of the standard NCBI genetic codes (recommended), then the
-translation is verified. This will allow fuzzy matching if stop codons in the
-protein sequence have been reprented as X, and will allow for a trailing stop
-codon present in the nucleotide sequences but not the protein.
-
-Note - the protein and nucleotide sequences must use the same identifers.
-
-Note - If no translation table is specified, the provided nucleotide sequences
-should be exactly three times the length of the protein sequences (exluding the gaps).
-
-Note - the nucleotide FASTA file may contain extra sequences not in the
-protein alignment, they will be ignored. This can be useful if for example
-you have a nucleotide FASTA file containing all the genes in an organism,
-while the protein alignment is for a specific gene family.
-
-**Example**
-
-Given this protein alignment in FASTA format::
-
-    >Alpha
-    DEER
-    >Beta
-    DE-R
-    >Gamma
-    D--R
-
-and this matching unaligned nucleotide FASTA file::
-
-    >Alpha
-    GATGAGGAACGA
-    >Beta
-    GATGAGCGU
-    >Gamma
-    GATCGG
-
-the tool would return this nucleotide alignment::
-
-    >Alpha
-    GATGAGGAACGA
-    >Beta
-    GATGAG---CGU
-    >Gamma
-    GAT------CGG
-
-Notice that all the gaps are multiples of three in length.
-
-
-**Citation**
-
-This tool uses Biopython, so if you use this Galaxy tool in work leading to a
-scientific publication please cite the following paper:
-
-Cock et al (2009). Biopython: freely available Python tools for computational
-molecular biology and bioinformatics. Bioinformatics 25(11) 1422-3.
-http://dx.doi.org/10.1093/bioinformatics/btp163 pmid:19304878.
-
-This tool is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/align_back_trans
-    </help>
-    <citations>
-        <citation type="doi">10.7717/peerj.167</citation>
-        <citation type="doi">10.1093/bioinformatics/btp163</citation>
-    </citations>
-</tool>
--- a/shed.yml Fri Feb 03 13:13:17 2017 -0500
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,26 +0,0 @@
-name: align_back_trans
-owner: peterjc
-homepage_url: https://github.com/peterjc/pico_galaxy/tree/master/tools/align_back_trans
-remote_repository_url: https://github.com/peterjc/pico_galaxy/tree/master/tools/align_back_trans
-description: Thread nucleotides onto a protein alignment (back-translation)
-long_description: |
-  Takes an input file of aligned protein sequences (typically FASTA or Clustal
-  format), and a matching file of unaligned nucleotide sequences (FASTA format,
-  using the same identifiers), and threads the nucleotide sequences onto the
-  protein alignment to produce a codon aware nucleotide alignment - which can
-  be viewed as a back translation.
-categories:
-- Fasta Manipulation
-- Sequence Analysis
-type: unrestricted
-include:
-- strip_components: 2
-  source:
-  - ../../test-data/demo_nuc_align.fasta
-  - ../../test-data/demo_nucs.fasta
-  - ../../test-data/demo_nucs_trailing_stop.fasta
-  - ../../test-data/demo_prot_align.fasta
-  - ../../tools/align_back_trans/README.rst
-  - ../../tools/align_back_trans/align_back_trans.py
-  - ../../tools/align_back_trans/align_back_trans.xml
-  - ../../tools/align_back_trans/tool_dependencies.xml
--- a/tool_dependencies.xml Fri Feb 03 13:13:17 2017 -0500
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,6 +0,0 @@
-<?xml version="1.0"?>
-<tool_dependency>
-    <package name="biopython" version="1.63">
-        <repository changeset_revision="a5c49b83e983" name="package_biopython_1_63" owner="biopython" toolshed="https://toolshed.g2.bx.psu.edu" />
-    </package>
-</tool_dependency>
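
Note on the removed help text: the deleted align_back_trans.xml describes threading unaligned nucleotide sequences onto a gapped protein alignment, so that every residue takes one codon and every protein gap becomes three gap characters. The script that does this (align_back_trans.py) is not part of this changeset, so the following is only a minimal Python sketch of that idea and not the tool's implementation. The function name thread_nucleotides is hypothetical. It corresponds to the "Don't check the translation" option: no genetic code is consulted, and the nucleotide length must be exactly three times the ungapped protein length.

```python
def thread_nucleotides(aligned_protein, nucleotides, gap="-"):
    """Thread an unaligned nucleotide sequence onto one gapped protein row.

    Hypothetical helper for illustration only. Each residue consumes the
    next three nucleotides; each protein gap becomes three gap characters.
    The translation itself is not checked.
    """
    residues = len(aligned_protein) - aligned_protein.count(gap)
    if len(nucleotides) != 3 * residues:
        raise ValueError(
            "Expected %d nucleotides for %d residues, got %d"
            % (3 * residues, residues, len(nucleotides))
        )
    pieces = []
    position = 0  # index of the next unused nucleotide
    for letter in aligned_protein:
        if letter == gap:
            pieces.append(gap * 3)
        else:
            pieces.append(nucleotides[position:position + 3])
            position += 3
    return "".join(pieces)


# The example sequences from the removed help text.
protein_alignment = {"Alpha": "DEER", "Beta": "DE-R", "Gamma": "D--R"}
nucleotide_seqs = {
    "Alpha": "GATGAGGAACGA",
    "Beta": "GATGAGCGU",
    "Gamma": "GATCGG",
}

for name, row in protein_alignment.items():
    print(">%s" % name)
    print(thread_nucleotides(row, nucleotide_seqs[name]))
```

Run on the help text's example, the sketch reproduces the nucleotide alignment shown there, with all gaps a multiple of three in length. Handling of a trailing stop codon and the fuzzy matching of X, both mentioned in the help text, are left out.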