# HG changeset patch
# User rico
# Date 1333653689 14400
# Node ID 83806667ff3a5f78ecbb43f582ed18218b4b83e2
# Parent 40244cd272faea1da3f947ad956e2734ac713978
Uploaded

diff -r 40244cd272fa -r 83806667ff3a rank_pathways.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/rank_pathways.xml Thu Apr 05 15:21:29 2012 -0400
@@ -0,0 +1,74 @@
+
+  affected KEGG pathways
+
+
+    #if str($output_format) == 'a'
+      calctfreq.py
+    #else if str($output_format) == 'b'
+      calclenchange.py
+    #end if
+      "--loc_file=${GALAXY_DATA_INDEX_DIR}/gd.rank.loc"
+      "--species=${input.metadata.dbkey}"
+      "--input=${input}"
+      "--output=${output}"
+      "--posKEGGclmn=${input.metadata.kegg_path}"
+      "--KEGGgeneposcolmn=${input.metadata.kegg_gene}"
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+**What it does**
+
+This tool produces a table ranking the pathways based on the percentage
+of genes in an input dataset, out of the total in each pathway.
+Alternatively, the tool ranks the pathways based on the change in
+length and number of paths connecting sources and sinks. This change is
+calculated between graphs representing each pathway with and without
+the nodes that represent the genes in an input list. Sources are all
+the nodes representing the initial reactants/products in the pathway.
+Sinks are all the nodes representing the final reactants/products in
+the pathway.
+
+If pathways are ranked by percentage of genes affected, the output is
+a tabular dataset with the following columns:
+
+  1. number of genes in the pathway present in the input dataset
+  2. percentage of the total genes in the pathway included in the input dataset
+  3. rank of the frequency (from high to low frequency)
+  4. name of the pathway
+
+If pathways are ranked by change in length and number of paths, the
+output is a tabular dataset with the following columns:
+
+  1. change in the mean length of paths between sources and sinks
+  2. mean length of paths between sources and sinks in the pathway including the genes in the input dataset. If the pathway does not have sources/sinks, the length is assumed to be infinite (I)
+  3. mean length of paths between sources and sinks in the pathway excluding the genes in the input dataset. If the pathway does not have sources/sinks, the length is assumed to be infinite (I)
+  4. rank of the change in the mean length of paths between sources and sinks (from high change to low change)
+  5. change in the number of paths between sources and sinks
+  6. number of paths between sources and sinks in the pathway including the genes in the input dataset. If the pathway does not have sources/sinks, it is assumed to be a circuit (C)
+  7. number of paths between sources and sinks in the pathway excluding the genes in the input dataset. If the pathway does not have sources/sinks, it is assumed to be a circuit (C)
+  8. rank of the change in the number of paths between sources and sinks (from high change to low change)
+  9. name of the pathway
+
+
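The two rankings described in the help text can be sketched in Python roughly as follows. This is a minimal illustration, not the code in `calctfreq.py` or `calclenchange.py`, whose internals are not shown in this patch. The function names `rank_by_gene_fraction` and `path_stats` are hypothetical, and the sketch assumes that a source is a node with no incoming edge, a sink is a node with no outgoing edge, both are determined on the full pathway graph, path length is counted in edges, and only simple paths are enumerated.

```python
from collections import defaultdict


def rank_by_gene_fraction(pathway_genes, input_genes):
    """Return rows (count, percent, rank, pathway), highest percentage first.

    pathway_genes maps a pathway name to a non-empty set of gene ids
    (hypothetical input shape, chosen for illustration).
    """
    rows = []
    for name, genes in pathway_genes.items():
        hits = len(genes & input_genes)
        rows.append((hits, 100.0 * hits / len(genes), name))
    rows.sort(key=lambda r: r[1], reverse=True)
    ranked = []
    for i, (hits, pct, name) in enumerate(rows):
        # Tied percentages share the rank of the first row in the tie.
        rank = ranked[-1][2] if i and pct == rows[i - 1][1] else i + 1
        ranked.append((hits, pct, rank, name))
    return ranked


def path_stats(edges, excluded=frozenset()):
    """Return (mean_length, n_paths) over simple source-to-sink paths.

    Returns ("I", "C") when the pathway has no sources or no sinks,
    mirroring the markers described in the help text. Returns (None, 0)
    when sources and sinks exist but no path survives the exclusion
    (an assumption of this sketch).
    """
    succ = defaultdict(set)
    nodes, has_incoming = set(), set()
    for a, b in edges:
        succ[a].add(b)
        nodes.update((a, b))
        has_incoming.add(b)
    sources = nodes - has_incoming
    sinks = {n for n in nodes if not succ.get(n)}
    if not sources or not sinks:
        return "I", "C"

    lengths = []

    def walk(node, depth, seen):
        # Depth-first enumeration of simple paths; depth counts edges.
        if node in sinks:
            lengths.append(depth)
            return
        for nxt in succ[node]:
            if nxt not in excluded and nxt not in seen:
                walk(nxt, depth + 1, seen | {nxt})

    for s in sources - set(excluded):
        walk(s, 0, {s})
    if not lengths:
        return None, 0
    return sum(lengths) / len(lengths), len(lengths)


# Toy pathway: A -> B -> D and A -> C -> E -> D.
edges = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")]
with_genes = path_stats(edges)
without_genes = path_stats(edges, excluded={"B"})
```

In the toy pathway, excluding node `B` removes the shorter of the two routes from `A` to `D`, so the mean path length rises while the number of paths falls. The differences between `with_genes` and `without_genes` correspond to columns 1 and 5 of the second output table. Exhaustive path enumeration can be slow on large graphs, so a real implementation may well use a different strategy.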