# HG changeset patch
# User greg
# Date 1334600432 14400
# Node ID 93ef08cd452bb7a2c4dbcba3a2d87c9c36d5cb13
Uploaded

diff -r 000000000000 -r 93ef08cd452b evaluate_population_numbers.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/evaluate_population_numbers.xml Mon Apr 16 14:20:32 2012 -0400
@@ -0,0 +1,54 @@
+
+ possible numbers of populations
+
+
+ evaluate_population_numbers.bash "${input.extra_files_path}/admix.ped" "$output" "$max_populations"
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+**What it does**
+
+The user selects a set of data generated by the Galaxy tool to "prepare
+to look for population structure". For all possible numbers K of ancestral
+populations, from 1 up to a user-specified maximum, this tool produces values
+that indicate how well the data can be explained as genotypes from individuals
+derived from K ancestral populations. These values are computed by a 5-fold
+cross-validation procedure, so that a good choice for K will exhibit a low
+cross-validation error compared with other potential settings for K.
+
+**Acknowledgments**
+
+We use the program "Admixture", downloaded from
+
+http://www.genetics.ucla.edu/software/admixture/
+
+and described in the paper "Fast model-based estimation of ancestry in
+unrelated individuals" by David H. Alexander, John Novembre and Kenneth Lange,
+Genome Research 19 (2009), pp. 1655-1664. Admixture is called with the "--cv"
+flag to produce these values.
+
+
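
The help text above says that a good K is one with a low cross-validation error relative to the other candidates. As a rough illustration of that selection step, here is a minimal Python sketch. It is not part of this changeset, and it assumes that each Admixture run with "--cv" reports a line of the form `CV error (K=3): 0.45`. The function names `parse_cv_errors` and `best_k` and the sample numbers are hypothetical.

```python
import re

# Assumption: each run with --cv logs one line such as "CV error (K=3): 0.45".
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.eE+-]+)")


def parse_cv_errors(log_text):
    """Return a dict mapping K to its cross-validation error."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def best_k(cv_errors):
    """Return the K whose cross-validation error is lowest."""
    if not cv_errors:
        raise ValueError("no CV error lines found")
    return min(cv_errors, key=cv_errors.get)


# Made-up values, for illustration only.
sample_log = """\
CV error (K=1): 0.60
CV error (K=2): 0.50
CV error (K=3): 0.45
CV error (K=4): 0.47
"""

errors = parse_cv_errors(sample_log)
print(best_k(errors))
```

In practice the error curve is often flat near its minimum, so it is worth inspecting the values for neighbouring K rather than relying on the single lowest one.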