Mercurial > repos > cropgeeks > ukseed
changeset 27:558a46a635a9 draft
Uploaded
author | cropgeeks |
---|---|
date | Sun, 22 Apr 2018 04:44:44 -0400 |
parents | d1d232d4cb2f |
children | 27777dd17bfe |
files | ukseed_stage1.xml ukseed_stage2.xml |
diffstat | 2 files changed, 25 insertions(+), 16 deletions(-) |
--- a/ukseed_stage1.xml Fri Apr 20 14:44:25 2018 -0400
+++ b/ukseed_stage1.xml Sun Apr 22 04:44:44 2018 -0400
@@ -27,11 +27,17 @@
 </stdio>
 <help>
-In **UK-SeeD Data Analysis Infrastructure**, a BBSRC-Newton funded project, we have deployed an advanced computing hardware and software platform for the
-analysis of large genomics datasets for wheat varieties. The platform integrates computing resources and bioinformatics expertise to
-enable crop geneticists to implement sophisticated data analysis algorithms to improve the use of genetic resources for wheat and
-other important crops. The computing platform is distributed across the partners’ sites with hardware deployed at CIMMYT (Mexico) and
-the Earlham Institute (UK).
+This pipeline has been developed for loading a DArT SNP or SilicoDArT, report and apply filters to those datasets based on locus and
+individuals call rates and locus reproducibility. It also allows data export for other formats such as GDS, plink bed and a text file
+with a header line, and then one line per sample with V+6 where V is the number of variants suitable for loading into R. Finally the
+pipeline perform a Principal Coordinates Analysis (PCoA, = Multidimensional scaling, MDS) to explore similarities of data and outputs
+a vcf file suitable for visualizing in CurlyWhirly.
+
+In **UK-SeeD Data Analysis Infrastructure**, a BBSRC-Newton funded project, we have deployed an advanced computing hardware and software
+platform for the analysis of large genomics datasets for wheat varieties. The platform integrates computing resources and bioinformatics
+expertise to enable crop geneticists to implement sophisticated data analysis algorithms to improve the use of genetic resources for
+wheat and other important crops. The computing platform is distributed across the partners’ sites with hardware deployed at CIMMYT
+(Mexico) and the Earlham Institute (UK).
 |LOGOS|
--- a/ukseed_stage2.xml Fri Apr 20 14:44:25 2018 -0400
+++ b/ukseed_stage2.xml Sun Apr 22 04:44:44 2018 -0400
@@ -21,14 +21,11 @@
 <param format="csv,txt" name="input" type="data" label="Input file" help="Input file of genotype data"/>
- <param name="gl_call_rate" type="float" value="0.75" label="gl_call_rate"
- help="gl_call_rate"/>
+ <param name="gl_call_rate" type="float" value="0.75" label="Minimum call rate per locus"/>
- <param name="gl_final" type="float" value="0.8" label="gl_final"
- help="gl_final"/>
+ <param name="gl_final" type="float" value="0.8" label="Minimum call rate per individual"/>
- <param name="gl_rep" type="float" value="0.98" label="gl_rep"
- help="gl_rep"/>
+ <param name="gl_rep" type="float" value="0.98" label="Minimum locus reproducibility"/>
 </inputs>
 <outputs>
@@ -40,11 +37,17 @@
 </stdio>
 <help>
-In **UK-SeeD Data Analysis Infrastructure**, a BBSRC-Newton funded project, we have deployed an advanced computing hardware and software platform for the
-analysis of large genomics datasets for wheat varieties. The platform integrates computing resources and bioinformatics expertise to
-enable crop geneticists to implement sophisticated data analysis algorithms to improve the use of genetic resources for wheat and
-other important crops. The computing platform is distributed across the partners’ sites with hardware deployed at CIMMYT (Mexico) and
-the Earlham Institute (UK).
+This pipeline has been developed for loading a DArT SNP or SilicoDArT, report and apply filters to those datasets based on locus and
+individuals call rates and locus reproducibility. It also allows data export for other formats such as GDS, plink bed and a text file
+with a header line, and then one line per sample with V+6 where V is the number of variants suitable for loading into R. Finally the
+pipeline perform a Principal Coordinates Analysis (PCoA, = Multidimensional scaling, MDS) to explore similarities of data and outputs
+a vcf file suitable for visualizing in CurlyWhirly.
+
+In **UK-SeeD Data Analysis Infrastructure**, a BBSRC-Newton funded project, we have deployed an advanced computing hardware and software
+platform for the analysis of large genomics datasets for wheat varieties. The platform integrates computing resources and bioinformatics
+expertise to enable crop geneticists to implement sophisticated data analysis algorithms to improve the use of genetic resources for
+wheat and other important crops. The computing platform is distributed across the partners’ sites with hardware deployed at CIMMYT
+(Mexico) and the Earlham Institute (UK).
 |LOGOS|
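
The three thresholds relabelled in this changeset (minimum call rate per locus, minimum call rate per individual, minimum locus reproducibility) can be illustrated with a small sketch. This is not the tool's own code, which is not shown in the changeset. The function name `filter_genotypes`, the genotype matrix layout (rows are individuals, columns are loci, `None` marks a missing call), the per-locus `reproducibility` list, and the order of the filters (loci first, then individuals) are all assumptions made for illustration. The default values mirror the `value` attributes of `gl_call_rate`, `gl_final` and `gl_rep` in the diff.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical layout: one row per individual, one column per locus.
# None marks a missing genotype call.
Genotypes = Sequence[Sequence[Optional[int]]]


def filter_genotypes(
    genotypes: Genotypes,
    reproducibility: Optional[Sequence[float]] = None,
    min_locus_call_rate: float = 0.75,       # default of gl_call_rate in the diff
    min_individual_call_rate: float = 0.8,   # default of gl_final in the diff
    min_reproducibility: float = 0.98,       # default of gl_rep in the diff
) -> Tuple[List[int], List[int]]:
    """Return (kept locus indices, kept individual indices).

    Assumed order: loci are filtered first (call rate, then reproducibility),
    and individual call rates are computed on the surviving loci only.
    """
    n_individuals = len(genotypes)
    n_loci = len(genotypes[0]) if n_individuals else 0

    kept_loci = []
    for j in range(n_loci):
        called = sum(1 for row in genotypes if row[j] is not None)
        if called / n_individuals < min_locus_call_rate:
            continue  # too many missing calls at this locus
        if reproducibility is not None and reproducibility[j] < min_reproducibility:
            continue  # locus scored inconsistently across replicates
        kept_loci.append(j)

    kept_individuals = []
    for i, row in enumerate(genotypes):
        if not kept_loci:
            break  # nothing left to score individuals against
        called = sum(1 for j in kept_loci if row[j] is not None)
        if called / len(kept_loci) >= min_individual_call_rate:
            kept_individuals.append(i)

    return kept_loci, kept_individuals
```

Calling `filter_genotypes(matrix)` with only a genotype matrix applies the two call-rate filters; passing a `reproducibility` list as well also drops loci below the reproducibility threshold. Filtering loci before individuals means an individual is not penalised for missing calls at loci that would be discarded anyway, but the real pipeline may order these steps differently.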