changeset 27:558a46a635a9 draft

Uploaded
author cropgeeks
date Sun, 22 Apr 2018 04:44:44 -0400
parents d1d232d4cb2f
children 27777dd17bfe
files ukseed_stage1.xml ukseed_stage2.xml
diffstat 2 files changed, 25 insertions(+), 16 deletions(-)
--- a/ukseed_stage1.xml	Fri Apr 20 14:44:25 2018 -0400
+++ b/ukseed_stage1.xml	Sun Apr 22 04:44:44 2018 -0400
@@ -27,11 +27,17 @@
     </stdio>
 
 <help>
-In **UK-SeeD Data Analysis Infrastructure**, a BBSRC-Newton funded project, we have deployed an advanced computing hardware and software platform for the 
-analysis of large genomics datasets for wheat varieties. The platform integrates computing resources and bioinformatics expertise to 
-enable crop geneticists to implement sophisticated data analysis algorithms to improve the use of genetic resources for wheat and 
-other important crops. The computing platform is distributed across the partners’ sites with hardware deployed at CIMMYT (Mexico) and 
-the Earlham Institute (UK).  
+This pipeline loads a DArT SNP or SilicoDArT dataset, then reports on and filters it by locus call rate, individual call rate and
+locus reproducibility. It can also export the data to other formats such as GDS, PLINK bed and a text file suitable for loading into R
+(a header line, then one line per sample with V+6 fields, where V is the number of variants). Finally, the
+pipeline performs a Principal Coordinates Analysis (PCoA, also called multidimensional scaling, MDS) to explore similarities in the data
+and outputs a VCF file suitable for visualizing in CurlyWhirly.
+
+In **UK-SeeD Data Analysis Infrastructure**, a BBSRC-Newton funded project, we have deployed an advanced computing hardware and software
+platform for the analysis of large genomics datasets for wheat varieties. The platform integrates computing resources and bioinformatics
+expertise to enable crop geneticists to implement sophisticated data analysis algorithms to improve the use of genetic resources for
+wheat and other important crops. The computing platform is distributed across the partners’ sites with hardware deployed at CIMMYT
+(Mexico) and the Earlham Institute (UK).  
 
 |LOGOS|
 
--- a/ukseed_stage2.xml	Fri Apr 20 14:44:25 2018 -0400
+++ b/ukseed_stage2.xml	Sun Apr 22 04:44:44 2018 -0400
@@ -21,14 +21,11 @@
         <param format="csv,txt" name="input" type="data" label="Input file"
             help="Input file of genotype data"/>
 		
-		<param name="gl_call_rate" type="float" value="0.75" label="gl_call_rate"
-			help="gl_call_rate"/>
+		<param name="gl_call_rate" type="float" value="0.75" label="Minimum call rate per locus"/>
 			
-		<param name="gl_final" type="float" value="0.8" label="gl_final"
-			help="gl_final"/>
+		<param name="gl_final" type="float" value="0.8" label="Minimum call rate per individual"/>
 			
-		<param name="gl_rep" type="float" value="0.98" label="gl_rep"
-			help="gl_rep"/>
+		<param name="gl_rep" type="float" value="0.98" label="Minimum locus reproducibility"/>
     </inputs>
 
     <outputs>
@@ -40,11 +37,17 @@
     </stdio>
 
 <help>
-In **UK-SeeD Data Analysis Infrastructure**, a BBSRC-Newton funded project, we have deployed an advanced computing hardware and software platform for the 
-analysis of large genomics datasets for wheat varieties. The platform integrates computing resources and bioinformatics expertise to 
-enable crop geneticists to implement sophisticated data analysis algorithms to improve the use of genetic resources for wheat and 
-other important crops. The computing platform is distributed across the partners’ sites with hardware deployed at CIMMYT (Mexico) and 
-the Earlham Institute (UK).  
+This pipeline loads a DArT SNP or SilicoDArT dataset, then reports on and filters it by locus call rate, individual call rate and
+locus reproducibility. It can also export the data to other formats such as GDS, PLINK bed and a text file suitable for loading into R
+(a header line, then one line per sample with V+6 fields, where V is the number of variants). Finally, the
+pipeline performs a Principal Coordinates Analysis (PCoA, also called multidimensional scaling, MDS) to explore similarities in the data
+and outputs a VCF file suitable for visualizing in CurlyWhirly.
+
+In **UK-SeeD Data Analysis Infrastructure**, a BBSRC-Newton funded project, we have deployed an advanced computing hardware and software
+platform for the analysis of large genomics datasets for wheat varieties. The platform integrates computing resources and bioinformatics
+expertise to enable crop geneticists to implement sophisticated data analysis algorithms to improve the use of genetic resources for
+wheat and other important crops. The computing platform is distributed across the partners’ sites with hardware deployed at CIMMYT
+(Mexico) and the Earlham Institute (UK).  
 
 |LOGOS|