# HG changeset patch
# User rico
# Date 1333724053 14400
# Node ID 1cfaf6ef61a52f813f007175bba9785255aae48f
Uploaded

diff -r 000000000000 -r 1cfaf6ef61a5 evaluate_population_numbers.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/evaluate_population_numbers.xml Fri Apr 06 10:54:13 2012 -0400
@@ -0,0 +1,54 @@
+
+ possible numbers of populations
+
+
+ evaluate_population_numbers.bash "${input.extra_files_path}/admix.ped" "$output" "$max_populations"
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+**What it does**
+
+The user selects a set of data generated by the Galaxy tool to "prepare
+to look for population structure". For all possible numbers K of ancestral
+populations, from 1 up to a user-specified maximum, this tool produces values
+that indicate how well the data can be explained as genotypes from individuals
+derived from K ancestral populations. These values are computed by a 5-fold
+cross-validation procedure, so that a good choice for K will exhibit a low
+cross-validation error compared with other potential settings for K.
+
+**Acknowledgments**
+
+We use the program "Admixture", downloaded from
+
+http://www.genetics.ucla.edu/software/admixture/
+
+and described in the paper "Fast model-based estimation of ancestry in
+unrelated individuals" by David H. Alexander, John Novembre and Kenneth Lange,
+Genome Research 19 (2009), pp. 1655-1664. Admixture is called with the "--cv"
+flag to produce these values.
+
+
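
The procedure described in the help text above (run Admixture with "--cv" for each K from 1 up to a maximum, then compare the cross-validation errors) can be sketched roughly as follows. This is an illustrative sketch only and is not the contents of evaluate_population_numbers.bash, which this changeset does not include. The function names `run_cv` and `best_k` and the log file argument are hypothetical, and the sketch assumes that Admixture prints one summary line per run of the form `CV error (K=2): 0.53190`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; NOT the actual evaluate_population_numbers.bash.

# run_cv INPUT MAX_K LOG
# Run Admixture with --cv (5-fold by default) for K = 1..MAX_K and
# collect the "CV error" summary lines in LOG.
# Assumes an `admixture` binary is on PATH and INPUT is a file it accepts.
run_cv() {
    local input="$1" max_k="$2" log="$3"
    local k
    : > "$log"                                   # start with an empty log
    for ((k = 1; k <= max_k; k++)); do
        admixture --cv "$input" "$k" | grep '^CV error' >> "$log"
    done
}

# best_k
# Read lines such as "CV error (K=2): 0.53190" on stdin and print the K
# with the lowest cross-validation error (nothing if no such line is seen).
best_k() {
    awk '
        /^CV error/ {
            k = $3                    # third field looks like "(K=2):"
            gsub(/[^0-9]/, "", k)     # keep only the digits of K
            err = $4 + 0              # fourth field is the error value
            if (best == "" || err < min) { min = err; best = k }
        }
        END { if (best != "") print best }
    '
}
```

Under these assumptions, `run_cv admix.ped 5 cv.log` followed by `best_k < cv.log` would print the K with the lowest error. The tool itself reports the values for every K rather than a single choice, so `best_k` is only a convenience for reading them.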