Mercurial > repos > drosofff > msp_sr_signature
annotate signature.xml @ 1:9274c7b1e85c
Fixed issue: the plot now properly reflects a subset of analysed overlaps, i.e. 5 to 15 nucleotides of overlap.
Also added a working test case (<tests> and </tests> were missing, and output files must be referenced with file=, not value=).
| author | chris <drosofff@gmail.com> | 
|---|---|
| date | Mon, 16 Feb 2015 12:08:18 +0100 | 
| parents | d613dbee3ce4 | 
| children | 2b30861d95f4 | 
| rev | line source | 
|---|---|
| 1 | 1 <tool id="signature" name="Small RNA Signatures" version="2.0.1"> | 
| 0 | 2 <description></description> | 
| 3 <requirements> | |
| 4 <requirement type="package" version="0.12.7">bowtie</requirement> | |
| 5 <requirement type="package" version="0.1.18">samtools</requirement> | |
| 6 <requirement type="package" version="0.7.7">pysam</requirement> | |
| 7 <requirement type="package" version="2.14">biocbasics</requirement> | |
| 8 <requirement type="package" version="3.0.3">R</requirement> | |
| 9 </requirements> | |
| 10 <command interpreter="python"> | |
| 11 signature.py | |
| 12 --input $refGenomeSource.input | |
| 13 --inputFormat $refGenomeSource.input.ext | |
| 14 --minquery $minquery | |
| 15 --maxquery $maxquery | |
| 16 --mintarget $mintarget | |
| 17 --maxtarget $maxtarget | |
| 18 --minscope $minscope | |
| 19 --maxscope $maxscope | |
| 20 --outputOverlapDataframe $output | |
| 21 #if $refGenomeSource.genomeSource == "history": | |
| 22 --referenceGenome $refGenomeSource.ownFile | |
| 23 #else: | |
| 24 #silent reference= filter( lambda x: str( x[0] ) == str( $input.dbkey ), $__app__.tool_data_tables[ 'bowtie_indexes' ].get_fields() )[0][-1] | |
| 25 --referenceGenome $reference | |
| 26 --extract_index | |
| 27 #end if | |
| 28 --graph $graph_type | |
| 29 --rcode $sigplotter | |
| 30 </command> | |
| 31 | |
| 32 <inputs> | |
| 33 <conditional name="refGenomeSource"> | |
| 34 <param name="genomeSource" type="select" label="Will you select a reference genome from your history or use a built-in index?" help="Built-ins were indexed using default options"> | |
| 35 <option value="indexed">Use a built-in index</option> | |
| 36 <option value="history">Use one from the history</option> | |
| 37 </param> | |
| 38 <when value="indexed"> | |
| 39 <param name="input" type="data" format="tabular,sam,bam" label="Compute signature from this bowtie standard output"> | |
| 40 <validator type="dataset_metadata_in_data_table" table_name="bowtie_indexes" metadata_name="dbkey" metadata_column="0" message="database not set for this bowtie output. Select the database(=genome used for matching) manually, or select a reference fasta from your history."/> | |
| 41 </param> | |
| 42 </when> | |
| 43 <when value="history"> | |
| 44 <param name="ownFile" type="data" format="fasta" label="Select the fasta reference" /> | |
| 45 <param name="input" type="data" format="tabular,sam,bam" label="Compute signature from this bowtie standard output"/> | |
| 46 </when> | |
| 47 </conditional> <!-- refGenomeSource --> | |
| 48 <param name="minquery" type="integer" size="3" value="23" label="Min size of query small RNAs" help="'23' = 23 nucleotides"/> | |
| 49 <param name="maxquery" type="integer" size="3" value="29" label="Max size of query small RNAs" help="'29' = 29 nucleotides"/> | |
| 50 <param name="mintarget" type="integer" size="3" value="23" label="Min size of target small RNAs" help="'23' = 23 nucleotides"/> | |
| 51 <param name="maxtarget" type="integer" size="3" value="29" label="Max size of target small RNAs" help="'29' = 29 nucleotides"/> | |
| 52 <param name="minscope" type="integer" size="3" value="1" label="Minimal relative overlap analyzed" help="'1' = 1 nucleotide overlap"/> | |
| 53 <param name="maxscope" type="integer" size="3" value="26" label="Maximal relative overlap analyzed" help="'26' = 26 nucleotides overlap"/> | |
| 54 <param name="graph_type" type="select" label="Graph type" help="Signature can be computed globally or by item present in the alignment file"> | |
| 55 <option value="global" selected="True">Global</option> | |
| 56 <option value="lattice">Lattice</option> | |
| 57 </param> | |
| 58 </inputs> | |
| 59 | |
| 60 <configfiles> | |
| 61 <configfile name="sigplotter"> | |
| 62 graph_type = "${graph_type}" | |
| 63 | |
| 64 globalgraph = function () { | |
| 65 ## Setup R error handling to go to stderr | |
| 66 options( show.error.messages=F, | |
| 67 error = function () { cat( geterrmessage(), file=stderr() ); q( "no", 1, F ) } ) | |
| 68 signature = read.delim("${output}", header=TRUE) | |
| 1 | 69 signaturez=data.frame(signature[,1], (signature[,2] -mean(signature[,2]))/sd(signature[,2])) | 
| 1 | 70 overlap_prob_z=data.frame(signature[,1], (signature[,3] -mean(signature[,3]))/sd(signature[,3])) | 
| 0 | 71 YLIM=max(signature[,2]) | 
| 72 | |
| 73 ## Open output2 PDF file | |
| 74 pdf( "${output2}" ) | |
| 75 par(mfrow=c(2,2),oma = c(0, 0, 3, 0)) | |
| 76 | |
| 77 plot(signature[,1:2], type = "h", main="Numbers of pairs", cex.main=1, xlab="overlap (nt)", ylim=c(0,YLIM), ylab="Numbers of pairs", col="darkslateblue", lwd=4) | |
| 78 | |
| 1 | 79 plot(signaturez, type = "l", main="Number of pairs Z-scores", cex.main=1, xlab="overlap (nt)", ylab="z-score", pch=19, cex=0.2, col="darkslateblue", lwd=2) | 
| 0 | 80 | 
| 81 plot(signature[,1], signature[,3]*100, type = "l", main="Overlap probabilities", | |
| 82 cex.main=1, xlab="overlap (nt)", ylab="Probability [%]", ylim=c(0,50), | |
| 1 | 83 pch=19, col="darkslateblue", lwd=2) | 
| 0 | 84 | 
| 1 | 85 plot(overlap_prob_z, type = "l", main="Overlap Probability Z-scores", cex.main=1, xlab="overlap (nt)", ylab="z-score", pch=19, cex=0.2, col="darkslateblue", lwd=2) | 
| 1 | 86 | 
| 0 | 87 mtext("Overlap Signatures of ${minquery}-${maxquery} against ${mintarget}-${maxtarget}nt small RNAs", outer = TRUE, cex=1) | 
| 88 devname = dev.off() | |
| 89 ## Close the PDF file | |
| 90 } | |
| 91 | |
| 92 treillisgraph = function () { | |
| 93 ## Open output2 PDF file | |
| 94 pdf( "${output2}", paper="special", height=11.69, width=8.2677 ) | |
| 95 signature = read.delim("${output}", header=TRUE) | |
| 96 options( show.error.messages=F, | |
| 97 error = function () { cat( geterrmessage(), file=stderr() ); q( "no", 1, F ) } ) | |
| 98 library(lattice) | |
| 1 | 99 print (xyplot(signature[,3]*100~signature[,1]|signature[,4], type = "l", xlim=c(${minscope},${maxscope}), main="ping-pong Signature of ${minquery}-${maxquery} against ${mintarget}-${maxtarget}nt small RNAs", | 
| 0 | 100 par.strip.text=list(cex=.5), strip=strip.custom(which.given=1, bg="lightblue"), scales=list(cex=0.5), | 
| 101 cex.main=1, cex=.5, xlab="overlap (nt)", ylab="ping-pong signal [%]", | |
| 102 pch=19, col="darkslateblue", lwd =1.5, cex.lab=1.2, cex.axis=1.2, | |
| 103 layout=c(4,12), as.table=TRUE, newpage = T) ) | |
| 104 devnname = dev.off() | |
| 105 } | |
| 106 | |
| 107 if (graph_type=="global") { | |
| 1 | 108 globalgraph() | 
| 0 | 109 | 
| 110 } | |
| 111 if(graph_type=="lattice") { | |
| 112 treillisgraph() | |
| 113 } | |
| 114 </configfile> | |
| 115 </configfiles> | |
| 116 | |
| 117 <outputs> | |
| 118 <data name="output" format="tabular" label = "signature data frame"/> | |
| 119 <data name="output2" format="pdf" label="Overlap probabilities"/> | |
| 120 </outputs> | |
| 121 | |
| 122 <help> | |
| 123 | |
| 124 **What it does** | |
| 125 | |
| 126 This tool computes, from a bowtie output file, the number of read pairs in each overlap class (in nt), the z-scores calculated from these numbers of pairs, and the ping-pong signal as described in Brennecke et al. (2009) Science. | |
| 127 The numerical options set the minimum and maximum sizes of the query small RNA class and of the target small RNA class, and the range of overlaps analyzed. | |
| 128 Three types of signals are plotted in the output PDF file: the number of pairs found, the z-scores calculated from these numbers of pairs, and the ping-pong signal. | |
| 129 | |
| 130 </help> | |
| 131 | |
| 1 | 132 <tests> | 
| 0 | 133 <test> | 
| 134 <param name="genomeSource" value="history" /> | |
| 1 | 135 <param name="ownFile" value ="ensembl.fa" ftype="fasta" /> | 
| 1 | 136 <param name="input" value="sr_bowtie.bam" ftype="bam" /> | 
| 0 | 137 <param name="minquery" value="23" /> | 
| 138 <param name="maxquery" value="29" /> | |
| 139 <param name="mintarget" value="23" /> | |
| 140 <param name="maxtarget" value="29" /> | |
| 1 | 141 <param name="minscope" value="5" /> | 
| 1 | 142 <param name="maxscope" value="15" /> | 
| 0 | 143 <param name="graph_type" value="global" /> | 
| 1 | 144 <output name="output" ftype="tabular" file="signature.tab"/> | 
| 1 | 145 <output name="output2" ftype="pdf" file="signature.pdf"/> | 
| 0 | 146 </test> | 
| 1 | 147 </tests> | 
| 0 | 148 | 
| 149 | |
| 150 </tool> | |
| 151 | 
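
The z-score columns built on lines 69-70 of the R config file (each value minus the column mean, divided by the column standard deviation) can be sketched in Python as below. This is only an illustration, not the code of signature.py: the function name `z_scores` and the pair counts are hypothetical, and the sample standard deviation (n - 1 denominator) is used because that is what R's `sd()` computes.

```python
import statistics


def z_scores(values):
    """Return (v - mean) / sd for each value.

    statistics.stdev() uses the n - 1 denominator, like R's sd().
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical pair counts for overlaps of 5 to 15 nt (made-up numbers,
# chosen only so that one overlap class stands out).
overlaps = list(range(5, 16))
pair_counts = [12, 15, 11, 14, 13, 90, 16, 12, 10, 14, 13]

scores = z_scores(pair_counts)

# The overlap class with the highest z-score is the candidate signature peak.
peak_overlap = overlaps[scores.index(max(scores))]
print(peak_overlap)  # prints 10
```

With these made-up counts the 10-nt overlap class has the highest z-score, which is the kind of peak a ping-pong signature would show.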
