annotate README @ 18:547d8db4673e

Update create_reference_dataset for non human genome builds
author Jim Johnson <jj@umn.edu>
date Sat, 15 Jun 2013 14:36:47 -0500
parents 06675bd664ee
children 225750bf3770
The DeFuse Galaxy tool is based on DeFuse_Version_0.6.1
http://sourceforge.net/apps/mediawiki/defuse/index.php?title=Main_Page

DeFuse is a software package for gene fusion discovery using RNA-Seq data. The software uses clusters of discordant paired-end alignments to inform a split-read alignment analysis for finding fusion boundaries. The software also employs a number of heuristic filters in an attempt to reduce the number of false positives and produces a fully annotated output for each predicted fusion.

Manual:
http://sourceforge.net/apps/mediawiki/defuse/index.php?title=DeFuse_Version_0.6.1

The included tool_dependencies.xml downloads and installs the deFuse code.
It sets the environment variable DEFUSE_PATH to the location of the deFuse install.
The tool_dependencies.xml also includes the download for bowtie.
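
As a quick sanity check after installation, something like the following sketch could confirm that DEFUSE_PATH is set and points at a directory. The function name check_defuse_path is hypothetical and is not part of the Galaxy tool or of deFuse.

```shell
#!/bin/sh
# Hypothetical helper (not part of deFuse or the Galaxy tool):
# verify that DEFUSE_PATH is set and refers to an existing directory.
check_defuse_path() {
    if [ -z "${DEFUSE_PATH:-}" ]; then
        echo "DEFUSE_PATH is not set" >&2
        return 1
    fi
    if [ ! -d "$DEFUSE_PATH" ]; then
        echo "DEFUSE_PATH does not point to a directory: $DEFUSE_PATH" >&2
        return 1
    fi
    echo "deFuse install found at $DEFUSE_PATH"
}
```

Calling check_defuse_path in the environment Galaxy uses for the tool returns a non-zero status when the variable is missing or wrong.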

The defuse.pl command relies on a configuration file to specify options, the location of reference data, and the other applications that it depends upon: bowtie, bowtie-build, samtools, gmap, blat, faToTwoBit, R, and Rscript.

The DeFuse Galaxy tool can either construct the config.txt file described in the deFuse manual or use an existing config.txt file selected from the user's history.
When constructing the config.txt file, the DeFuse tool uses the values selected in tool-data/defuse.loc.
The dictionary field in tool-data/defuse.loc can be used to set fields in the config.txt file, including the site-specific location of reference data and the paths to the other application binaries.
The "Defuse parameter settings" are used to alter options in the config.txt file.

The DeFuse Galaxy tool also generates a bash script to run deFuse.
That script will attempt to edit the config.txt file to specify any unset paths to applications that deFuse relies upon:
bowtie, bowtie-build, samtools, blat, faToTwoBit, R, and Rscript
The script uses the shell "which" command to discover each application path, so the required applications should be in the PATH environment variable.
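
The path discovery described above can be sketched roughly as follows. This is not the script the Galaxy tool generates. It assumes, for illustration only, that config.txt holds one "key = value" entry per line and that an unset entry has an empty value. The helper name fill_tool_path is hypothetical, and it uses the POSIX "command -v" lookup in place of "which".

```shell
#!/bin/sh
# Hypothetical helper: fill in an empty "key = " entry in a config file
# with the path found for a tool in PATH.
# Assumption: an unset entry is a line of the form "key = " (no value).
fill_tool_path() {
    key=$1
    tool=$2
    config=$3
    # command -v is the POSIX counterpart of "which"
    path=$(command -v "$tool") || {
        echo "warning: $tool not found in PATH" >&2
        return 1
    }
    awk -v key="$key" -v path="$path" '
        # Rewrite only lines where the key has no value yet
        $0 ~ "^" key "[ \t]*=[ \t]*$" { print key " = " path; next }
        { print }
    ' "$config" > "$config.tmp" && mv "$config.tmp" "$config"
}
```

For example, fill_tool_path samtools_bin samtools config.txt would set samtools_bin only if it is still empty; entries that already have a value are left untouched.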

Generate Reference Datasets as described in the Manual:

Reference Dataset
The reference dataset setup process has been simplified as of deFuse 0.6.0, and deFuse now automatically downloads all required files.
The create_reference_dataset.pl script will download the genome and other source files, and build any derivative files including bowtie indices, gmap indices, and 2bit files. Run the following command. Expect this step to take at least 12 hours.
create_reference_dataset.pl -c config.txt

These datasets should be referenced in the tool-data/defuse.loc file.

The create_reference_dataset tool runs the create_reference_dataset.pl script to generate deFuse genome reference data in a Galaxy dataset.
This should be made available in the future as a Galaxy DataManager.

Galaxy will try to auto-install dependencies:

External Tools ( http://sourceforge.net/apps/mediawiki/defuse/index.php?title=DeFuse_Version_0.6.1 )
deFuse relies on other publicly available tools as part of its pipeline. Some of these tools are not included with the deFuse download. Obtain these tools as detailed below.
Download samtools
The latest version of samtools can be downloaded from sourceforge: https://sourceforge.net/projects/samtools/files/samtools.
Set the samtools_bin entry in config.txt to the fully qualified path of the samtools binary.
Download bowtie
The latest version of bowtie can be downloaded from sourceforge: http://sourceforge.net/projects/bowtie-bio/files/bowtie/. deFuse has been tested on version 0.12.5.
Set the bowtie_bin and bowtie_build_bin entries in config.txt to the fully qualified paths of the bowtie and bowtie-build binaries.
Download blat and faToTwoBit
The latest blat tool suite can be downloaded from the UCSC website: http://hgdownload.cse.ucsc.edu/admin/exe/. Download blat and faToTwoBit and set the blat_bin and fatotwobit_bin entries in config.txt to the fully qualified paths of the blat and faToTwoBit binaries.
Download GMAP
The latest version of GMAP can be downloaded here: http://research-pub.gene.com/gmap/. Build with a default configuration. Do not worry about the `--with-gmapdb` build flag; deFuse will request a specific directory for the database anyway.
Download R
The latest version of R can be downloaded from the R project website: http://www.r-project.org/. Install R, locate the R and Rscript executables, and set the r_bin and rscript_bin entries in config.txt to the paths of those executables.
Install the ada package: run R, then at the prompt type install.packages("ada")
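
Once the *_bin entries above have been set, a check along these lines could catch mistyped paths before a long run. It is only a sketch: it assumes config.txt holds one "key = value" entry per line with "#" starting a comment, and the function name check_config_bins is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report every *_bin entry in a config file whose
# value is not an executable file.
# Assumption: one "key = value" entry per line; lines starting with "#"
# are comments.
check_config_bins() {
    config=$1
    status=0
    while IFS='=' read -r key value; do
        # Strip blanks from the key and surrounding blanks from the value
        key=$(printf '%s' "$key" | tr -d ' \t')
        value=$(printf '%s' "$value" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
        case $key in
            \#*)
                ;;  # skip comment lines
            *_bin)
                if [ ! -f "$value" ] || [ ! -x "$value" ]; then
                    echo "$key: not an executable file: $value" >&2
                    status=1
                fi
                ;;
        esac
    done < "$config"
    return $status
}
```

Running check_config_bins config.txt prints one message per problem entry and returns a non-zero status if any entry fails the check.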