annotate rgFastQC.xml @ 10:5b253fafd35a draft

Uploaded
author bgruening
date Thu, 06 Jun 2013 02:44:50 -0400
parents f126b49e93e7
children
rev   line source
9
f126b49e93e7 Uploaded
bgruening
parents:
diff changeset
<tool name="FastQC: Comprehensive QC" id="fastqc" version="0.53">
  <description>reporting for short read sequence</description>
  <command interpreter="python">
    rgFastQC.py -i "$input_file" -d "$html_file.files_path" -o "$html_file" -n "$out_prefix" -f "$input_file.ext" -j "$input_file.name"
    #if $contaminants.dataset and str($contaminants) > ''
        -c "$contaminants"
    #end if
    -e fastqc
  </command>
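  <!--
    Illustration only, not part of the tool definition. Assuming hypothetical
    Galaxy dataset paths, an input named reads.fastq of type fastqsanger, the
    default title and no contaminants file, the template above would expand to
    roughly the following single command line (Galaxy also resolves the script
    path relative to the tool directory):

    python rgFastQC.py -i "/galaxy/files/dataset_1.dat" -d "/galaxy/files/dataset_2_files" -o "/galaxy/files/dataset_2.dat" -n "FastQC" -f "fastqsanger" -j "reads.fastq" -e fastqc
  -->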
  <requirements>
    <requirement type="package" version="0.10.1">fastqc_dist_0_10_1_dependency</requirement>
  </requirements>
  <inputs>
    <param format="fastqsanger,fastq,bam,sam" name="input_file" type="data" label="Short read data from your current history" />
    <param name="out_prefix" value="FastQC" type="text" label="Title for the output file - to remind you what the job was for" size="80"
           help="Letters and numbers only please - other characters will be removed">
      <sanitizer invalid_char="">
        <valid initial="string.letters,string.digits"/>
      </sanitizer>
    </param>
    <param name="contaminants" type="data" format="tabular" optional="true" label="Contaminant list"
           help="Tab-delimited file with 2 columns: name and sequence. For example: Illumina Small RNA RT Primer CAAGCAGAAGACGGCATACGA"/>
  </inputs>
  <outputs>
    <data format="html" name="html_file" label="${out_prefix}_${input_file.name}.html" />
  </outputs>
  <tests>
    <test>
      <param name="input_file" value="1000gsample.fastq" />
      <param name="out_prefix" value="fastqc_out" />
      <param name="contaminants" value="fastqc_contaminants.txt" ftype="tabular" />
      <output name="html_file" file="fastqc_report.html" ftype="html" lines_diff="100"/>
    </test>
  </tests>
  <help>

.. class:: infomark

**Purpose**

FastQC aims to provide a simple way to do some quality control checks on raw
sequence data coming from high-throughput sequencing pipelines.
It provides a modular set of analyses which you can use to get a quick
impression of whether your data has any problems of
which you should be aware before doing any further analysis.

The main functions of FastQC are:

- Import of data from BAM, SAM or FastQ files (any variant)
- Providing a quick overview to tell you in which areas there may be problems
- Summary graphs and tables to quickly assess your data
- Export of results to an HTML-based permanent report
- Offline operation to allow automated generation of reports without running the interactive application


-----


.. class:: infomark

**FastQC**

This is a Galaxy wrapper. It merely exposes the external package FastQC_, which is documented on the FastQC_ website.
Kindly acknowledge FastQC as well as this tool if you use it.
FastQC incorporates the Picard-tools_ libraries for SAM/BAM processing.

The contaminants file parameter was borrowed from the independently developed
fastqcwrapper contributed to the Galaxy Community Tool Shed by J. Johnson.

-----

.. class:: infomark

**Inputs and outputs**

FastQC_ is the best place to look for documentation - it's very good.
A summary follows below for those in a tearing hurry.

This wrapper accepts a Galaxy fastq, sam or bam dataset as the input read file to check.
It also takes an optional contaminants list, in the form of
a tab-delimited file with 2 columns: name and sequence.
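
As a sketch of that format, a contaminants file holding only the example entry from the Contaminant list parameter help would contain the following line, with a single tab character between the name and the sequence::

    Illumina Small RNA RT Primer	CAAGCAGAAGACGGCATACGA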

The tool produces a single HTML output file that contains all of the results, including the following:

- Basic Statistics
- Per base sequence quality
- Per sequence quality scores
- Per base sequence content
- Per base GC content
- Per sequence GC content
- Per base N content
- Sequence Length Distribution
- Sequence Duplication Levels
- Overrepresented sequences
- Kmer Content

All except Basic Statistics and Overrepresented sequences are plots.

.. _FastQC: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/
.. _Picard-tools: http://picard.sourceforge.net/index.shtml

  </help>
</tool>