diff preprocess.xml @ 11:ab9e3c8ab443 draft

planemo upload for repository https://github.com/geraldinepascal/FROGS-wrappers/ commit f1287ef131de1eb33c9d59c1b66312fe854a8c64-dirty
author oinizan
date Fri, 13 May 2022 15:33:27 +0000
parents 7bf54edaba24
children 613b7551f28b
--- a/preprocess.xml	Thu May 12 11:40:39 2022 +0000
+++ b/preprocess.xml	Fri May 13 15:33:27 2022 +0000
@@ -340,7 +340,7 @@
    :Files: One sequence file per sample (format `FASTQ <https://en.wikipedia.org/wiki/FASTQ_format>`_)
    :Example: splA.fastq.gz,  splB.fastq.gz
 
-Remark: In an archive if you use R1 and R2 files their names must end with *_R1* and *_R2*. The part upstram from this tag (_R1 and _R2) will be consider as sample name.
+Remark: In an archive, if you use R1 and R2 files, their names must end with *_R1* and *_R2*. The part upstream of this tag (_R1 or _R2) will be considered as the sample name.
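
The naming rule in the remark above can be sketched in Python. This is only an illustration, not FROGS code: the helper name ``split_sample_and_read`` and the list of recognised extensions are assumptions, as is the choice to strip the extension before looking for the tag.

```python
import re

# Hypothetical helper: not part of FROGS, it only illustrates the naming rule.
_READ_TAG = re.compile(r"^(?P<sample>.+)_(?P<read>R[12])$")
# Assumed set of extensions; the original text does not list them.
_EXTENSIONS = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def split_sample_and_read(filename):
    """Return (sample_name, read_tag) for a paired-end file name."""
    stem = filename
    for ext in _EXTENSIONS:
        if stem.endswith(ext):
            stem = stem[: -len(ext)]
            break
    match = _READ_TAG.match(stem)
    if match is None:
        raise ValueError(f"{filename!r} does not end with _R1 or _R2")
    # Everything upstream of the tag is taken as the sample name.
    return match.group("sample"), match.group("read")
```

With this sketch, ``splA_R1.fastq.gz`` and ``splA_R2.fastq.gz`` both map to the sample name ``splA``, while a name without the tag is rejected.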
 
 .. class:: h3
 
@@ -348,29 +348,29 @@
 
 **Sequence file** (dereplicated.fasta):
 
- Only one file with all samples sequences (format `FASTA <https://en.wikipedia.org/wiki/FASTA_format>`_). These sequences are dereplicated: strictly identical sequence are represented only once, the initial count by sample is kept in count file (see bellowà) and the total count is added in the sequence header. A "FROGS_combined" suffix will be added to un-merged pair sequence if you want to keep them.
+ Only one file with the sequences of all samples (format `FASTA <https://en.wikipedia.org/wiki/FASTA_format>`_). These sequences are dereplicated: strictly identical sequences are represented only once, the initial count by sample is kept in the count file (see below) and the total count is added in the sequence header. A "FROGS_combined" suffix will be added to un-merged read pair sequences if you choose to keep them.
 
 **Count file** (count.tsv):
 
  This file contains the count of all unique sequences in each sample (format `TSV <https://en.wikipedia.org/wiki/Tab-separated_values>`_).
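
As a rough illustration of how the sequence file and the count file relate, the sketch below dereplicates in-memory reads. It is a minimal sketch under stated assumptions, not the FROGS implementation: the function name ``dereplicate`` and the dictionary-based input are hypothetical.

```python
from collections import defaultdict


def dereplicate(reads_by_sample):
    """Collapse strictly identical sequences across samples.

    reads_by_sample: dict mapping a sample name to an iterable of sequences
    (hypothetical in-memory input, for illustration only).
    Returns (counts, totals): counts[seq][sample] is the per-sample count
    (the content of a count file), totals[seq] is the overall count
    (the value that would go in the sequence header).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sample, reads in reads_by_sample.items():
        for seq in reads:
            counts[seq][sample] += 1
    totals = {seq: sum(by_sample.values()) for seq, by_sample in counts.items()}
    return {seq: dict(by_sample) for seq, by_sample in counts.items()}, totals
```

Each unique sequence appears once in ``totals``, and ``counts`` keeps the initial count by sample.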
 
-**Summary file** (report.html):
+**Report file** (report.html):
 
  This file reports the number of remaining sequences after each filter (format `HTML <https://en.wikipedia.org/wiki/HTML>`_). Depending on the tool configuration, there will be more or fewer filtering steps, and therefore more or fewer bars in the barplot.
 
- .. image:: static/images/FROGS_preprocess_summary_v3.png
+ .. image:: FROGS_preprocess_summary_v3.png
      :height: 850
      :width: 831
 
  It also presents the length distribution of the full amplicon sequences after the merging step and after the filtering steps.
 
- .. image:: static/images/FROGS_preprocess_lengthsSamples_v3.png
+ .. image:: FROGS_preprocess_lengthsSamples_v3.png
      :height: 379
      :width: 364
 
 .. class:: infomark page-header h2
 
-How it works
+How does it work?
 
 .. csv-table::
    :header: "Steps", "Illumina", "454"
@@ -378,8 +378,8 @@
    :class: table table-striped
 
    "1", "For un-merged data: Merges R1 and R2 with a maximum of M% mismatch in the overlapped region (`VSEARCH <https://github.com/torognes/vsearch/>`_ or `FLASH <http://ccb.jhu.edu/software/FLASH/>`_ or optionally `PEAR <https://sco.h-its.org/exelixis/web/software/pear/>`_) with a minimum of 10 bp in the overlap region. Resulting un-merged reads may optionally be artificially combined by adding 100 N between the reads", "/"
-   "2", "If sequencing protocol is the illumina standard protocol : Removes sequences where the two primers are not present and removes primers in the remaining sequence (`cutadapt <http://cutadapt.readthedocs.org/en/latest/guide.html>`_). The primer search accepts 10% of differences", "Removes sequences where the two primers are not present, removes primers sequence from amplicon sequence and reverse complement the sequences on strand -  (`cutadapt <http://cutadapt.readthedocs.org/en/latest/guide.html>`_). The primer search accepts 10% of differences"
-   "3", "Filters sequences with ambiguous nucleotides and for merged sequences filters on their length which must be range between 'Minimum amplicon size - primer length' and 'Maximum amplicon size - primer length'", "Removes sequences with at least one homopolymer with more than seven nucleotides and with a distance of less than or equal to 10 nucleo-tides between two poor quality positions, i.e. with a Phred quality score lesser than 10"
+   "2", "If the sequencing protocol is the Illumina standard protocol: Removes sequences where the two primers are not present and removes primers in the remaining sequences (`cutadapt <http://cutadapt.readthedocs.org/en/latest/guide.html>`_). The primer search accepts 10% of differences", "Removes sequences where the two primers are not present, removes primer sequences from the amplicon sequences and reverse-complements the sequences on the - strand (`cutadapt <http://cutadapt.readthedocs.org/en/latest/guide.html>`_). The primer search accepts 10% of differences"
+   "3", "Filters out sequences with ambiguous nucleotides and, for merged sequences, filters on their length, which must range between 'Minimum amplicon size - primer length' and 'Maximum amplicon size - primer length'", "Removes sequences with at least one homopolymer of more than seven nucleotides and with a distance of less than or equal to 10 nucleotides between two poor quality positions, i.e. with a Phred quality score lower than 10"
    "4", "Dereplicates sequences", "Dereplicates sequences"
 
 
@@ -391,7 +391,8 @@
 
 Keeping or not un-merged paired reads
 
-This option is usefull when and only when, targeted amplicon is longer than the sequencing technology can provide (ITS amplicon for example). In other case, carefully, you will only keep noise in your analysis.
+.. class:: warningmark
+This option is useful when, and only when, **the targeted amplicon is longer than the sequencing technology can provide** (ITS amplicons or the V1-V4 region of 16S, for example). In other cases, be careful: you will only keep noise in your analysis.
 
 .. class:: h3
 
@@ -399,13 +400,13 @@
 
 - **Case of a sequencing of overlapping sequences: case of 16S V3-V4 amplicon MiSeq sequencing**
 
-.. image:: static/images/FROGS_preprocess_overlapped_sequence.png
+.. image:: FROGS_preprocess_overlapped_sequence.png
      :height: 261
      :width: 531
 
 - **Case of a sequencing of non-overlapping sequences: case of ITS1 amplicon MiSeq sequencing**
 
-.. image:: static/images/FROGS_preprocess_combined_sequence1.png
+.. image:: FROGS_preprocess_combined_sequence1.png
      :height: 279
      :width: 797
 
@@ -413,7 +414,7 @@
 
 **“FROGS combined” warning points**
 
-Reads pair are not merged because:
+Read pairs are not merged because:
 
     - the real amplicon length is greater than the number of sequenced bases (490 bp for MiSeq 2x250 bp; remember the minimum 10 bp overlap)
     - the overlapped region is smaller than 10 bp (fixed parameter in FROGS).
@@ -421,7 +422,7 @@
 Thus, “FROGS combined” sequences are artificial and have particular features, especially regarding their size.
 Imagine a MiSeq 2x250 bp sequencing run with reads that cannot overlap: the length of the resulting “FROGS combined” sequences will be fixed at 600 bp.
 
-.. image:: static/images/FROGS_preprocess_combined_sequence2.png
+.. image:: FROGS_preprocess_combined_sequence2.png
      :height: 357
      :width: 798
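
The two lengths quoted above (490 bp and 600 bp) follow from simple arithmetic, sketched here in Python. The function names are hypothetical; the 10 bp minimum overlap and the 100 N spacer are the values stated in this help text, and the second read is assumed to be passed already in the orientation in which it should be appended.

```python
def max_mergeable_length(read_length, min_overlap=10):
    """Longest amplicon that two reads can still cover with the minimum overlap."""
    return 2 * read_length - min_overlap


def combined_length(read_length, spacer=100):
    """Length of an artificially combined pair: R1 + spacer of N + R2."""
    return 2 * read_length + spacer


def combine_pair(r1, r2_oriented, spacer=100):
    """Join two non-overlapping reads with a run of N (illustrative only)."""
    return r1 + "N" * spacer + r2_oriented
```

For 2x250 bp reads, ``max_mergeable_length(250)`` gives 490 and ``combined_length(250)`` gives 600, which is why all “FROGS combined” sequences from such a run share the same length.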
 
@@ -429,7 +430,7 @@
 
 Primers parameters
 
-The (`Kozich et al. 2013 <http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3753973/>`_ ) protocol uses custom sequencing primers which are also the PCR primers. In this case the reads do not contain the PCR primers.
+The (`Kozich et al. 2013 <http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3753973/>`_ ) protocol uses custom sequencing primers that are also the PCR primers. In this case, the reads do not contain the PCR primers.
 
 In the case of the Illumina standard protocol, the primers must be provided in 5' to 3' orientation.
 
@@ -445,19 +446,19 @@
 
 .. class:: h3
 
-FLASH : Amplicons sizes parameters
+FLASH : Amplicon size parameters
 
  .. class:: infomark
 
- We now recommend to use PEAR if availbale (only for accademic user) or Vsearch.
+ We now recommend using PEAR if available (for academic users only) or VSEARCH. PEAR is available only from the command line.
 
- The two following images show two examples of perfect values fors sizes parameters.
+ The following two images show two examples of ideal values for the size parameters.
 
- .. image:: static/images/FROGS_preprocess_ampliconSize_unimodal_v3.png
+ .. image:: FROGS_preprocess_ampliconSize_unimodal_v3.png
     :height: 415
     :width: 676
 
- .. image:: static/images/FROGS_preprocess_ampliconSize_multimodal_v3.png
+ .. image:: FROGS_preprocess_ampliconSize_multimodal_v3.png
     :height: 415
     :width: 676