comparison tools/protein_analysis/signalp3.xml @ 1:9a8a7f680dd6

Migrated tool version 0.0.3 from old tool shed archive to new tool shed repository
author peterjc
date Tue, 07 Jun 2011 17:38:05 -0400
parents a2eeeaa6f75e
children fe10f448d641
0:a2eeeaa6f75e 1:9a8a7f680dd6
1 <tool id="signalp3" name="SignalP 3.0" version="0.0.1"> 1 <tool id="signalp3" name="SignalP 3.0" version="0.0.3">
2 <description>Find signal peptides in protein sequences</description> 2 <description>Find signal peptides in protein sequences</description>
3 <command interpreter="python"> 3 <command interpreter="python">
4 signalp3.py $organism $truncate 8 $fasta_file $tabular_file 4 signalp3.py $organism $truncate 8 $fasta_file $tabular_file
5 ##I want the number of threads to be a Galaxy config option... 5 ##I want the number of threads to be a Galaxy config option...
6 </command> 6 </command>
24 <tests> 24 <tests>
25 <test> 25 <test>
26 <param name="fasta_file" value="four_human_proteins.fasta" ftype="fasta"/> 26 <param name="fasta_file" value="four_human_proteins.fasta" ftype="fasta"/>
27 <param name="organism" value="euk"/> 27 <param name="organism" value="euk"/>
28 <param name="truncate" value="0"/> 28 <param name="truncate" value="0"/>
29 <output name="tabular_file" file="four_human_proteins.signalp3.tsv" ftype="tabular"/> 29 <output name="tabular_file" file="four_human_proteins.signalp3.tabular" ftype="tabular"/>
30 </test>
31 <test>
32 <param name="fasta_file" value="empty.fasta" ftype="fasta"/>
33 <param name="organism" value="euk"/>
34 <param name="truncate" value="60"/>
35 <output name="tabular_file" file="empty_signalp3.tabular" ftype="tabular"/>
36 </test>
37 <test>
38 <param name="fasta_file" value="empty.fasta" ftype="fasta"/>
39 <param name="organism" value="gram+"/>
40 <param name="truncate" value="80"/>
41 <output name="tabular_file" file="empty_signalp3.tabular" ftype="tabular"/>
42 </test>
43 <test>
44 <param name="fasta_file" value="empty.fasta" ftype="fasta"/>
45 <param name="organism" value="gram-"/>
46 <param name="truncate" value="0"/>
47 <output name="tabular_file" file="empty_signalp3.tabular" ftype="tabular"/>
30 </test> 48 </test>
31 </tests> 49 </tests>
32 <help> 50 <help>
33 51
34 **What it does** 52 **What it does**
35 53
36 This calls the SignalP v3.0 tool for prediction of signal peptides, which uses both a neural network (NN) and Hidden Markmov Model (HMM) to produce two sets of scores. 54 This calls the SignalP v3.0 tool for prediction of signal peptides, which uses both a Neural Network (NN) and Hidden Markov Model (HMM) to produce two sets of scores.
37 55
38 The input is a FASTA file of protein sequences, and the output is tabular with twenty columns (one row per protein): 56 The input is a FASTA file of protein sequences, and the output is tabular with twenty columns (one row per protein):
39 57
40 * Sequence identifier 58 * Sequence identifier
41 * Neural Network (NN) predictions (13 columns) 59 * Neural Network (NN) predictions (13 columns)
55 73
56 Y-max is a derivative of the C-score combined with the S-score resulting in a better cleavage site prediction than the raw C-score alone. This is due to the fact that multiple high-peaking C-scores can be found in one sequence, where only one is the true cleavage site. The cleavage site is assigned from the Y-score where the slope of the S-score is steep and a significant C-score is found. 74 Y-max is a derivative of the C-score combined with the S-score resulting in a better cleavage site prediction than the raw C-score alone. This is due to the fact that multiple high-peaking C-scores can be found in one sequence, where only one is the true cleavage site. The cleavage site is assigned from the Y-score where the slope of the S-score is steep and a significant C-score is found.
57 75
58 The S-mean is the average of the S-score, ranging from the N-terminal amino acid to the amino acid assigned with the highest Y-max score, thus the S-mean score is calculated for the length of the predicted signal peptide. The S-mean score was in SignalP version 2.0 used as the criteria for discrimination of secretory and non-secretory proteins. 76 The S-mean is the average of the S-score, ranging from the N-terminal amino acid to the amino acid assigned with the highest Y-max score, thus the S-mean score is calculated for the length of the predicted signal peptide. The S-mean score was in SignalP version 2.0 used as the criteria for discrimination of secretory and non-secretory proteins.
59 77
60 The D-score is introduced in SignalP version 3.0 and is a simple average of the S-mean and Y-max score. The score shows superior discrimination performance of secretory and non-secretory proteins to that of the S-mean score which was used in SignalP version 1 and 2. 78 The D-score was introduced in SignalP version 3.0 and is a simple average of the S-mean and Y-max score. The score shows superior discrimination performance of secretory and non-secretory proteins to that of the S-mean score which was used in SignalP version 1 and 2.
61 79
62 For non-secretory proteins all the scores represented in the SignalP3-NN output should ideally be very low. 80 For non-secretory proteins all the scores represented in the SignalP3-NN output should ideally be very low.
63 81
64 **Hidden Markov Model Scores** 82 **Hidden Markov Model Scores**
65 83