changeset 2:b4929d26d955 draft
Uploaded
--- a/classifier/classifier.xml Mon Mar 14 09:37:07 2016 -0400
+++ b/classifier/classifier.xml Wed Mar 16 09:47:17 2016 -0400
@@ -1,7 +1,7 @@
 <tool id="classifier5" name="Classify eQTLs" version="5.0.0">
 <description> as cis or trans</description>
 <command interpreter="python">
- classifier.py --rscript \$R_SCRIPT_PATH/eqtl_genes_positions_plot.txt --input1 $input1 --input2 $input2 --input3 $input3 --input4 $input4 --output1 $output1 --output2 $output2 --output3 $output3 --output4 $output4 --output5 $output5 --output6 $output6 --output7 $output7 --output8 $output8
+ classifier.py --rscript \$R_SCRIPT_PATH/classifier/eqtl_genes_positions_plot.txt --input1 $input1 --input2 $input2 --input3 $input3 --input4 $input4 --output1 $output1 --output2 $output2 --output3 $output3 --output4 $output4 --output5 $output5 --output6 $output6 --output7 $output7 --output8 $output8
 </command>
 <inputs>
 <param label="eQTL results file" name="input1" type="data" format="tabular" help="A tabular file with the mapped eQTLs and its associated statistics"></param>
@@ -30,9 +30,11 @@

 **What it does**

-Classifies an eQTL as 'cis' if it maps within a distance of +/- 5 cM of its target gene's location, then that gene is claimed to be cis-regulated by the eQTL.
+Calculates the average genetic interval size across all eQTLs.

-Classifies an eQTL as 'trans' if it maps to a different region on the genome than the location of the target gene (further away than 5 cM from the target gene).
+Classifies an eQTL as 'cis' if it maps within half the above mentioned interval size of the gene exhibiting the eQTL.
+
+Classifies an eQTL as 'trans' if it maps to a different region on the genome than the location of the gene exhibiting the eQTL (further away than half the above mentioned interval size from the gene).

 Classifies an eQTL as 'no_result' if the location of the target gene is not known.

@@ -40,17 +42,14 @@

 **Example input files**

-eQTL results file, each row correspond to an eQTL (13 columns; only a part of the file is shown)::
+eQTL results file, each row correspond to an eQTL (21 columns; only a part of the file is shown)::

- gene index chr start_marker start_int end_marker end_int peak_marker peak_int peakLR rsq rtsq parent_up_reg
- geneA 1 6 13 1.5139 15 1.6431 13 1.5539 12.7532485 0.1337606 0.3630217 parentA
- geneC 2 9 5 0.8106 6 0.9614 6 0.9214 20.344489 0.1559524 0.3123026 parentB
- geneC 3 9 8 1.2052 8 1.2452 8 1.2052 16.6822024 0.1244943 0.314542 parentA
- geneD 4 9 1 0.0001 2 0.2395 1 0.1201 19.531317 0.1753893 0.4300621 parentA
- geneH 5 1 1 0.0001 1 0.1001 1 0.0001 19.5727096 0.1373944 0.392982 parentB
- geneH 6 1 9 1.0268 11 1.2164 10 1.1261 13.5560176 0.095168 0.4823061 parentB
- geneH 7 6 14 1.5977 15 1.8031 15 1.7231 19.8953622 0.3181244 0.3909106 parentB
- geneI 8 9 7 1.0982 9 1.3079 8 1.2052 20.3966235 0.1305025 0.4233788 parentA
+trait_name trait_number eQTL_number chr peak_marker peak_position peak_LR peak_LOD R2 TR2 S additive dominance LOD1_L_m LOD1_L_pos LOD1_R_m LOD1_R_pos LOD2_L_m LOD2_L_pos LOD2_R_m LOD2_R_pos
+ geneA 106 2 10 4 0.5206 13.0002477 2.821053751 0.1067186 0.2802598 2741.216084 -80.0805117 0 3 0.4045 5 0.6791 3 0.3583 5 0.7505
+ geneB 434 3 6 3 0.1455 13.000651 2.821141267 0.0881461 0.3710748 38.650035 502.7692948 0 2 0.0847 3 0.2153 1 0.0112 3 0.2763
+ geneC 343 2 4 10 1.1039 13.0012249 2.821265803 0.1168611 0.3068127 42.9667077 -101.8310204 0 10 1.0217 10 1.1078 9 0.9838 10 1.1118
+ geneD 384 1 1 19 2.3414 13.0022994 2.82149897 0.1372476 0.1985604 2.1933164 -688.0268455 0 19 2.1956 20 2.4956 19 2.0883 20 2.5488
+ geneD 267 2 9 8 1.2052 13.0026682 2.821578999 0.0862225 0.3794662 55.4157254 278.1351403 0 7 1.2023 8 1.2277 7 1.1994 8 1.2466

 Chromosome summary file, each row correspond to a chromosome (6 columns; only a part of the file is shown). The last row gives the total across the genome::

--- a/classifier/readme.txt Mon Mar 14 09:37:07 2016 -0400
+++ b/classifier/readme.txt Wed Mar 16 09:47:17 2016 -0400
@@ -4,13 +4,16 @@

 ### This is the second tool in the eQTL backend pipeline: lookup, classification, frequency, sliding window frequency, hotspots, GO enrichment

-Classifies an eQTL as 'cis' if it maps within a distance of +/- 5 cM of its target gene's location, then that gene is claimed to be cis-regulated by the eQTL.
+Link to the workflow (for import into Galaxy): http://chewbacca.bi.up.ac.za:8080/u/nanette/w/back-end-workflow-2
+
+Calculates the average genetic interval size across all eQTLs.

-Classifies an eQTL as 'trans' if it maps to a different region on the genome than the location of the target gene (further away than 5 cM from the target gene).
+Classifies an eQTL as 'cis' if it maps within half the above mentioned interval size of the gene exhibiting the eQTL.
+
+Classifies an eQTL as 'trans' if it maps to a different region on the genome than the location of the gene exhibiting the eQTL (further away than half the above mentioned interval size from the gene).

 Classifies an eQTL as 'no_result' if the location of the target gene is not known.

-
 ---------------
 Installation
 ---------------
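
Note on the new classification rule: the updated description replaces the fixed +/- 5 cM window with a window derived from the data (half the average eQTL interval size). A minimal sketch of that rule follows. It is not the code in classifier.py. The field names (`chr`, `peak`, `left`, `right`), the choice of interval bounds, the treatment of an eQTL on a different chromosome as 'trans', and the inclusive comparison are all assumptions made for illustration, and the sample values are invented.

```python
from statistics import mean


def average_interval_size(eqtls):
    """Mean of (right - left) over all eQTL intervals.

    Assumption: each eQTL is a dict with hypothetical 'left' and 'right'
    keys holding the interval bounds in the same genetic-map units.
    """
    return mean(e["right"] - e["left"] for e in eqtls)


def classify(eqtl, gene_location, avg_interval):
    """Return 'cis', 'trans', or 'no_result' for one eQTL.

    gene_location is a (chromosome, position) tuple, or None when the
    location of the gene is not known.
    """
    if gene_location is None:
        return "no_result"
    gene_chr, gene_pos = gene_location
    # Assumption: a different chromosome always counts as a different region.
    if eqtl["chr"] != gene_chr:
        return "trans"
    # Assumption: "within" is inclusive of the half-interval boundary.
    if abs(eqtl["peak"] - gene_pos) <= avg_interval / 2:
        return "cis"
    return "trans"


# Invented sample data, for illustration only.
eqtls = [
    {"chr": 1, "peak": 1.25, "left": 1.0, "right": 1.5},
    {"chr": 2, "peak": 2.5, "left": 2.0, "right": 3.0},
]
avg = average_interval_size(eqtls)

print(classify(eqtls[0], (1, 1.5), avg))
print(classify(eqtls[0], (1, 2.0), avg))
print(classify(eqtls[1], None, avg))
```

Because the window is computed once from all eQTLs, the same threshold applies to every eQTL in the run; a per-eQTL window would be a different design and is not what the description above states.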