annotate README @ 1:e0a835a2f74a draft default tip

planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 1a82c0609363fe0541e6dbdd46308a67f30ca9e1-dirty
author jjohnson
date Mon, 20 May 2019 15:30:06 -0400
parents 63f23d5db27c
children
63f23d5db27c planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 2c2fd38cb761ec57bac7a0bd376e6aa2b88265d0-dirty
jjohnson
parents:
diff changeset
The DeFuse Galaxy tool is based on DeFuse_Version_0.6.2
http://sourceforge.net/apps/mediawiki/defuse/index.php?title=Main_Page
https://bitbucket.org/dranew/defuse

DeFuse is a software package for gene fusion discovery using RNA-Seq data. The software uses clusters of discordant paired-end alignments to inform a split-read alignment analysis for finding fusion boundaries. The software also employs a number of heuristic filters in an attempt to reduce the number of false positives and produces a fully annotated output for each predicted fusion.

Manual:
http://sourceforge.net/apps/mediawiki/defuse/index.php?title=DeFuse_Version_0.6.2

The included tool_dependencies.xml will download and install the defuse code.
It sets the environment variable "DEFUSE_PATH" to the location of the defuse install.
The tool_dependencies.xml also downloads bowtie.

The defuse.pl command relies on a configuration file to specify options, the location of reference data, and the other applications that it depends upon: bowtie, bowtie-build, samtools, gmap, blat, faToTwoBit, R, and Rscript.

The DeFuse Galaxy tool can either construct the config.txt file that is mentioned in the defuse manual, or use an existing config.txt file from the user's history.
When constructing the config.txt file, the DeFuse tool uses the values selected in tool-data/defuse.loc.
The dictionary field in tool-data/defuse.loc can be used to set fields in the config.txt file, including the site-specific location of reference data and the paths to the other application binaries.
The "Defuse parameter settings" are used to alter options in the config.txt file.
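
The override step described above can be sketched in Python. This is only an illustration under two assumptions that this README does not state: that the dictionary field holds a Python-style dict literal, and that config.txt uses "key = value" lines. The helper names parse_dictionary_field and apply_overrides are hypothetical and are not part of the tool.

```python
import ast
import re


def parse_dictionary_field(field):
    """Parse the dictionary field into a dict of strings.

    Assumption: the field is a Python dict literal such as
    "{'bowtie_bin': '/opt/bowtie/bowtie'}".
    """
    value = ast.literal_eval(field) if field.strip() else {}
    if not isinstance(value, dict):
        raise ValueError("dictionary field must evaluate to a dict")
    return {str(k): str(v) for k, v in value.items()}


def apply_overrides(config_text, overrides):
    """Return config_text with `key = value` lines replaced or appended.

    Assumption: config.txt consists of `key = value` lines; lines that
    do not look like an assignment (comments, blanks) are kept as-is.
    """
    remaining = dict(overrides)
    out = []
    for line in config_text.splitlines():
        match = re.match(r"\s*([A-Za-z_][A-Za-z0-9_]*)\s*=", line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            out.append(f"{key} = {remaining.pop(key)}")
        else:
            out.append(line)
    # Keys that were not present in the file are appended at the end.
    for key, value in remaining.items():
        out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"
```

Keeping existing lines in place and appending only the missing keys leaves the rest of the file, including comments, untouched.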

The DeFuse Galaxy tool also generates a bash script to run defuse.
That script will attempt to edit the config.txt file to specify any unset paths to applications that defuse relies upon:
bowtie, bowtie-build, samtools, blat, faToTwoBit, R, and Rscript
The script uses the shell "which" command to discover each application path, so the required applications should be on the PATH environment variable.
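
The path discovery described above can be sketched in Python, where shutil.which plays the role of the shell "which" command. This is not the generated bash script itself. It assumes that config.txt uses "key = value" lines and that "unset" means an empty value. The *_bin key names are the ones this README mentions, and the function name fill_unset_paths is hypothetical.

```python
import re
import shutil

# Config key -> executable name looked up on PATH.
BIN_KEYS = {
    "bowtie_bin": "bowtie",
    "bowtie_build_bin": "bowtie-build",
    "samtools_bin": "samtools",
    "blat_bin": "blat",
    "fatotwobit_bin": "faToTwoBit",
    "r_bin": "R",
    "rscript_bin": "Rscript",
}


def fill_unset_paths(config_text, which=shutil.which):
    """Fill empty *_bin entries with the path found on PATH.

    Assumption: an entry is "unset" when its value is empty. Entries
    that already have a value, or whose executable cannot be found,
    are left unchanged.
    """
    out = []
    for line in config_text.splitlines():
        match = re.match(r"\s*(\w+)\s*=\s*(.*?)\s*$", line)
        if match and match.group(1) in BIN_KEYS and not match.group(2):
            found = which(BIN_KEYS[match.group(1)])
            if found:
                line = f"{match.group(1)} = {found}"
        out.append(line)
    return "\n".join(out) + "\n"
```

Passing the lookup function as a parameter makes the sketch easy to try without the real applications installed.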

Generate Reference Datasets as described in the Manual:

Reference Dataset
The reference dataset setup process has been simplified as of deFuse 0.6.0, and deFuse now automatically downloads all required files.
The create_reference_dataset.pl script will download the genome and other source files, and build any derivative files including bowtie indices, gmap indices, and 2bit files. Run the following command. Expect this step to take at least 12 hours.
    create_reference_dataset.pl -c config.txt

These datasets should be referenced in the tool-data/defuse.loc file.

The create_reference_dataset tool runs the create_reference_dataset.pl script to generate deFuse genome reference data in a Galaxy dataset.
This should be made available in the future as a Galaxy DataManager.

Galaxy will try to auto-install dependencies:

External Tools ( http://sourceforge.net/apps/mediawiki/defuse/index.php?title=DeFuse_Version_0.6.2 )
deFuse relies on other publicly available tools as part of its pipeline. Some of these tools are not included with the deFuse download. Obtain these tools as detailed below.
Download samtools
The latest version of samtools can be downloaded from sourceforge: https://sourceforge.net/projects/samtools/files/samtools.
Set the samtools_bin entry in config.txt to the fully qualified path of the samtools binary.
Download bowtie
The latest version of bowtie can be downloaded from sourceforge: http://sourceforge.net/projects/bowtie-bio/files/bowtie/. deFuse has been tested on version 0.12.5.
Set the bowtie_bin and bowtie_build_bin entries in config.txt to the fully qualified paths of the bowtie and bowtie-build binaries.
Download blat and faToTwoBit
The latest blat tool suite can be downloaded from the UCSC website: http://hgdownload.cse.ucsc.edu/admin/exe/. Download blat and faToTwoBit and set the blat_bin and fatotwobit_bin entries in config.txt to the fully qualified paths of the blat and faToTwoBit binaries.
Download GMAP
The latest version of GMAP can be downloaded here: http://research-pub.gene.com/gmap/. Build with a default configuration. Do not worry about the `--with-gmapdb` build flag; deFuse will request a specific directory for the database anyway.
Download R
The latest version of R can be downloaded from the R project website: http://www.r-project.org/. Install R, locate the R and Rscript executables, and set the r_bin and rscript_bin entries in config.txt to the paths of those executables.
Install the ada package. Run R, then at the prompt type install.packages("ada").
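
Once the entries above are set, a quick sanity check can confirm that each one is a fully qualified path to an executable. The Python sketch below is an illustration only and is not part of deFuse or the Galaxy tool. The function name check_bin_entries is hypothetical, and the input is assumed to be a dict of config keys to values that was read from config.txt by some other means.

```python
import os


def check_bin_entries(entries):
    """Return {key: problem} for entries that are not usable binaries.

    `entries` maps config keys (for example "samtools_bin") to the
    values found in config.txt. An empty result means every entry is
    an absolute path to an existing executable file.
    """
    problems = {}
    for key, path in entries.items():
        if not path:
            problems[key] = "not set"
        elif not os.path.isabs(path):
            problems[key] = "not a fully qualified path"
        elif not (os.path.isfile(path) and os.access(path, os.X_OK)):
            problems[key] = "not an executable file"
    return problems
```

Reporting every problem in one pass, instead of stopping at the first, lets an administrator fix all entries before starting a long deFuse run.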
Reference Dataset
The reference dataset setup process has been simplified as of deFuse 0.6.0, and deFuse now automatically downloads all required files.
The create_reference_dataset.pl script will download the genome and other source files, and build any derivative files including bowtie indices, gmap indices, and 2bit files. Run the following command. Expect this step to take at least 12 hours.
    create_reference_dataset.pl -c config.txt

defuse_trinity_analysis.py - Validating deFuse predictions using Trinity de novo assembled transcripts

DeFuse provides a total fusion sequence of 200-500 nucleotides (nts) around the fusion breakpoint. This may be insufficient to predict the effect of the fusion on protein production. To get a view of the full transcript containing the fusion, Trinity de novo transcripts from the RNA-seq data are compared with the deFuse fusion sequences using a subsequence around the deFuse-identified fusion breakpoint. The Trinity transcriptToOrfs output provides potential proteins from the projected fusion transcript.
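
The comparison described above can be sketched in Python. This is not the code of defuse_trinity_analysis.py. It assumes that the fusion sequence marks the breakpoint with a "|" character, uses an arbitrary default flank of 12 nts on each side, and checks both strands by exact substring match. All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def breakpoint_probe(fusion_seq, flank=12, marker="|"):
    """Return up to `flank` nts on each side of the breakpoint marker.

    Assumption: the fusion sequence contains `marker` at the breakpoint.
    """
    if flank < 1:
        raise ValueError("flank must be at least 1")
    left, sep, right = fusion_seq.partition(marker)
    if not sep:
        raise ValueError("no breakpoint marker in fusion sequence")
    return left[-flank:] + right[:flank]


def find_supporting_transcripts(fusion_seq, transcripts, flank=12):
    """Return (name, strand) for transcripts containing the probe.

    `transcripts` maps transcript names to sequences. Strand is "+"
    for a forward match and "-" for a reverse-complement match.
    """
    probe = breakpoint_probe(fusion_seq, flank).upper()
    probe_rc = reverse_complement(probe)
    hits = []
    for name, seq in transcripts.items():
        upper = seq.upper()
        if probe in upper:
            hits.append((name, "+"))
        elif probe_rc in upper:
            hits.append((name, "-"))
    return hits
```

A transcript that contains the probe spans the predicted breakpoint, so it is a candidate for the full fusion transcript. Exact matching is the simplest choice here, and a real analysis may need to tolerate mismatches.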