| Previous changeset 27:ddd76b6db251 (2013-08-07) Next changeset 29:ca87f891210c (2013-08-07) |
|
Commit message:
Uploaded |
|
added:
rgedgeRpaired.xml.camera |
|
removed:
rgedgeRpaired.xml |
| diff -r ddd76b6db251 -r c4ee2e69d691 rgedgeRpaired.xml --- a/rgedgeRpaired.xml Wed Aug 07 02:10:19 2013 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
| b'@@ -1,1084 +0,0 @@\n-<tool id="rgDifferentialCount" name="Differential_Count" version="0.20">\n- <description>models using BioConductor packages</description>\n- <requirements>\n- <requirement type="package" version="2.12">biocbasics</requirement>\n- <requirement type="package" version="3.0.1">r3</requirement>\n- <requirement type="package" version="1.3.18">graphicsmagick</requirement>\n- <requirement type="package" version="9.07">ghostscript</requirement>\n- </requirements>\n- \n- <command interpreter="python">\n- rgToolFactory.py --script_path "$runme" --interpreter "Rscript" --tool_name "DifferentialCounts" \n- --output_dir "$html_file.files_path" --output_html "$html_file" --make_HTML "yes"\n- </command>\n- <inputs>\n- <param name="input1" type="data" format="tabular" label="Select an input matrix - rows are contigs, columns are counts for each sample"\n- help="Use the HTSeq based count matrix preparation tool to create these matrices from BAM/SAM files and a GTF file of genomic features"/>\n- <param name="title" type="text" value="Differential Counts" size="80" label="Title for job outputs" \n- help="Supply a meaningful name here to remind you what the outputs contain">\n- <sanitizer invalid_char="">\n- <valid initial="string.letters,string.digits"><add value="_" /> </valid>\n- </sanitizer>\n- </param>\n- <param name="treatment_name" type="text" value="Treatment" size="50" label="Treatment Name"/>\n- <param name="Treat_cols" label="Select columns containing treatment." type="data_column" data_ref="input1" numerical="True" \n- multiple="true" use_header_names="true" size="120" display="checkboxes">\n- <validator type="no_options" message="Please select at least one column."/>\n- </param>\n- <param name="control_name" type="text" value="Control" size="50" label="Control Name"/>\n- <param name="Control_cols" label="Select columns containing control." 
type="data_column" data_ref="input1" numerical="True" \n- multiple="true" use_header_names="true" size="120" display="checkboxes" optional="true">\n- </param>\n- <param name="subjectids" type="text" optional="true" size="120" value = ""\n- label="IF SUBJECTS NOT ALL INDEPENDENT! Enter comma separated strings to indicate sample labels for (eg) pairing - must be one for every column in input"\n- help="Leave blank if no pairing, but eg if data from sample id A99 is in columns 2,4 and id C21 is in 3,5 then enter \'A99,C21,A99,C21\'">\n- <sanitizer>\n- <valid initial="string.letters,string.digits"><add value="," /> </valid>\n- </sanitizer>\n- </param>\n- <param name="fQ" type="float" value="0.3" size="5" label="Non-differential contig count quantile threshold - zero to analyze all non-zero read count contigs"\n- help="May be a good or a bad idea depending on the biology and the question. EG 0.3 = sparsest 30% of contigs with at least one read are removed before analysis"/>\n- <param name="useNDF" type="boolean" truevalue="T" falsevalue="F" checked="false" size="1" \n- label="Non differential filter - remove contigs below a threshold (1 per million) for half or more samples"\n- help="May be a good or a bad idea depending on the biology and the question. This was the old default. Quantile based is available as an alternative"/>\n-\n- <conditional name="edgeR">\n- <param name="doedgeR" type="select" \n- label="Run this model using edgeR"\n- help="edgeR uses a negative binomial model and seems to be powerful, even with few replicates">\n- <option value="F">Do not run edgeR</option>\n- <option value="T" selected="true">Run edgeR</option>\n- </param>\n- <when value="T">\n- <param name="edgeR_priordf" type="integer" value="20" size="3" \n- label="prior.df for tagwise dispersion - lower value = more emphasis on each tag\'s variance. 
Replaces prior.n and prior.df = prior.n * residual.df"\n'..b'Preprint.pdf\n-\n-See Also\n-\n-A voom case study is given in the edgeR User\'s Guide.\n-\n-vooma is a similar function but for microarrays instead of RNA-seq.\n-\n-\n-***old rant on changes to Bioconductor package variable names between versions***\n-\n-The edgeR authors made a small cosmetic change in the name of one important variable (from p.value to PValue) \n-breaking this and all other code that assumed the old name for this variable, \n-between edgeR2.4.4 and 2.4.6 (the version for R 2.14 as at the time of writing). \n-This means that all code using edgeR is sensitive to the version. I think this was a very unwise thing \n-to do because it wasted hours of my time to track down and will similarly cost other edgeR users dearly\n-when their old scripts break. This tool currently now works with 2.4.6.\n-\n-**Note on prior.N**\n-\n-http://seqanswers.com/forums/showthread.php?t=5591 says:\n-\n-*prior.n*\n-\n-The value for prior.n determines the amount of smoothing of tagwise dispersions towards the common dispersion. \n-You can think of it as like a "weight" for the common value. (It is actually the weight for the common likelihood \n-in the weighted likelihood equation). The larger the value for prior.n, the more smoothing, i.e. the closer your \n-tagwise dispersion estimates will be to the common dispersion. If you use a prior.n of 1, then that gives the \n-common likelihood the weight of one observation.\n-\n-In answer to your question, it is a good thing to squeeze the tagwise dispersions towards a common value, \n-or else you will be using very unreliable estimates of the dispersion. I would not recommend using the value that \n-you obtained from estimateSmoothing()---this is far too small and would result in virtually no moderation \n-(squeezing) of the tagwise dispersions. How many samples do you have in your experiment? \n-What is the experimental design? 
If you have few samples (less than 6) then I would suggest a prior.n of at least 10. \n-If you have more samples, then the tagwise dispersion estimates will be more reliable, \n-so you could consider using a smaller prior.n, although I would hesitate to use a prior.n less than 5. \n-\n-\n-From Bioconductor Digest, Vol 118, Issue 5, Gordon writes:\n-\n-Dear Dorota,\n-\n-The important settings are prior.df and trend.\n-\n-prior.n and prior.df are related through prior.df = prior.n * residual.df,\n-and your experiment has residual.df = 36 - 12 = 24. So the old setting of\n-prior.n=10 is equivalent for your data to prior.df = 240, a very large\n-value. Going the other way, the new setting of prior.df=10 is equivalent\n-to prior.n=10/24.\n-\n-To recover old results with the current software you would use\n-\n- estimateTagwiseDisp(object, prior.df=240, trend="none")\n-\n-To get the new default from old software you would use\n-\n- estimateTagwiseDisp(object, prior.n=10/24, trend=TRUE)\n-\n-Actually the old trend method is equivalent to trend="loess" in the new\n-software. You should use plotBCV(object) to see whether a trend is\n-required.\n-\n-Note you could also use\n-\n- prior.n = getPriorN(object, prior.df=10)\n-\n-to map between prior.df and prior.n.\n-\n-----\n-\n-**Attributions**\n-\n-edgeR - edgeR_ \n-\n-VOOM/limma - limma_VOOM_ \n-\n-DESeq2 - DESeq2_ for details\n-\n-See above for Bioconductor package documentation for packages exposed in Galaxy by this tool and app store package.\n-\n-Galaxy_ (that\'s what you are using right now!) for gluing everything together \n-\n-Otherwise, all code and documentation comprising this tool was written by Ross Lazarus and is \n-licensed to you under the LGPL_ like other rgenetics artefacts\n-\n-.. _LGPL: http://www.gnu.org/copyleft/lesser.html\n-.. _HTSeq: http://www-huber.embl.de/users/anders/HTSeq/doc/index.html\n-.. _edgeR: http://www.bioconductor.org/packages/release/bioc/html/edgeR.html\n-.. 
_DESeq2: http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html\n-.. _limma_VOOM: http://www.bioconductor.org/packages/release/bioc/html/limma.html\n-.. _Galaxy: http://getgalaxy.org\n-</help>\n-\n-</tool>\n-\n-\n' |
| diff -r ddd76b6db251 -r c4ee2e69d691 rgedgeRpaired.xml.camera --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/rgedgeRpaired.xml.camera Wed Aug 07 02:41:40 2013 -0400 |
| b'@@ -0,0 +1,1084 @@\n+<tool id="rgDifferentialCount" name="Differential_Count" version="0.20">\n+ <description>models using BioConductor packages</description>\n+ <requirements>\n+ <requirement type="package" version="2.12">biocbasics</requirement>\n+ <requirement type="package" version="3.0.1">r3</requirement>\n+ <requirement type="package" version="1.3.18">graphicsmagick</requirement>\n+ <requirement type="package" version="9.07">ghostscript</requirement>\n+ </requirements>\n+ \n+ <command interpreter="python">\n+ rgToolFactory.py --script_path "$runme" --interpreter "Rscript" --tool_name "DifferentialCounts" \n+ --output_dir "$html_file.files_path" --output_html "$html_file" --make_HTML "yes"\n+ </command>\n+ <inputs>\n+ <param name="input1" type="data" format="tabular" label="Select an input matrix - rows are contigs, columns are counts for each sample"\n+ help="Use the HTSeq based count matrix preparation tool to create these matrices from BAM/SAM files and a GTF file of genomic features"/>\n+ <param name="title" type="text" value="Differential Counts" size="80" label="Title for job outputs" \n+ help="Supply a meaningful name here to remind you what the outputs contain">\n+ <sanitizer invalid_char="">\n+ <valid initial="string.letters,string.digits"><add value="_" /> </valid>\n+ </sanitizer>\n+ </param>\n+ <param name="treatment_name" type="text" value="Treatment" size="50" label="Treatment Name"/>\n+ <param name="Treat_cols" label="Select columns containing treatment." type="data_column" data_ref="input1" numerical="True" \n+ multiple="true" use_header_names="true" size="120" display="checkboxes">\n+ <validator type="no_options" message="Please select at least one column."/>\n+ </param>\n+ <param name="control_name" type="text" value="Control" size="50" label="Control Name"/>\n+ <param name="Control_cols" label="Select columns containing control." 
type="data_column" data_ref="input1" numerical="True" \n+ multiple="true" use_header_names="true" size="120" display="checkboxes" optional="true">\n+ </param>\n+ <param name="subjectids" type="text" optional="true" size="120" value = ""\n+ label="IF SUBJECTS NOT ALL INDEPENDENT! Enter comma separated strings to indicate sample labels for (eg) pairing - must be one for every column in input"\n+ help="Leave blank if no pairing, but eg if data from sample id A99 is in columns 2,4 and id C21 is in 3,5 then enter \'A99,C21,A99,C21\'">\n+ <sanitizer>\n+ <valid initial="string.letters,string.digits"><add value="," /> </valid>\n+ </sanitizer>\n+ </param>\n+ <param name="fQ" type="float" value="0.3" size="5" label="Non-differential contig count quantile threshold - zero to analyze all non-zero read count contigs"\n+ help="May be a good or a bad idea depending on the biology and the question. EG 0.3 = sparsest 30% of contigs with at least one read are removed before analysis"/>\n+ <param name="useNDF" type="boolean" truevalue="T" falsevalue="F" checked="false" size="1" \n+ label="Non differential filter - remove contigs below a threshold (1 per million) for half or more samples"\n+ help="May be a good or a bad idea depending on the biology and the question. This was the old default. Quantile based is available as an alternative"/>\n+\n+ <conditional name="edgeR">\n+ <param name="doedgeR" type="select" \n+ label="Run this model using edgeR"\n+ help="edgeR uses a negative binomial model and seems to be powerful, even with few replicates">\n+ <option value="F">Do not run edgeR</option>\n+ <option value="T" selected="true">Run edgeR</option>\n+ </param>\n+ <when value="T">\n+ <param name="edgeR_priordf" type="integer" value="20" size="3" \n+ label="prior.df for tagwise dispersion - lower value = more emphasis on each tag\'s variance. 
Replaces prior.n and prior.df = prior.n * residual.df"\n'..b'Preprint.pdf\n+\n+See Also\n+\n+A voom case study is given in the edgeR User\'s Guide.\n+\n+vooma is a similar function but for microarrays instead of RNA-seq.\n+\n+\n+***old rant on changes to Bioconductor package variable names between versions***\n+\n+The edgeR authors made a small cosmetic change in the name of one important variable (from p.value to PValue) \n+breaking this and all other code that assumed the old name for this variable, \n+between edgeR2.4.4 and 2.4.6 (the version for R 2.14 as at the time of writing). \n+This means that all code using edgeR is sensitive to the version. I think this was a very unwise thing \n+to do because it wasted hours of my time to track down and will similarly cost other edgeR users dearly\n+when their old scripts break. This tool currently now works with 2.4.6.\n+\n+**Note on prior.N**\n+\n+http://seqanswers.com/forums/showthread.php?t=5591 says:\n+\n+*prior.n*\n+\n+The value for prior.n determines the amount of smoothing of tagwise dispersions towards the common dispersion. \n+You can think of it as like a "weight" for the common value. (It is actually the weight for the common likelihood \n+in the weighted likelihood equation). The larger the value for prior.n, the more smoothing, i.e. the closer your \n+tagwise dispersion estimates will be to the common dispersion. If you use a prior.n of 1, then that gives the \n+common likelihood the weight of one observation.\n+\n+In answer to your question, it is a good thing to squeeze the tagwise dispersions towards a common value, \n+or else you will be using very unreliable estimates of the dispersion. I would not recommend using the value that \n+you obtained from estimateSmoothing()---this is far too small and would result in virtually no moderation \n+(squeezing) of the tagwise dispersions. How many samples do you have in your experiment? \n+What is the experimental design? 
If you have few samples (less than 6) then I would suggest a prior.n of at least 10. \n+If you have more samples, then the tagwise dispersion estimates will be more reliable, \n+so you could consider using a smaller prior.n, although I would hesitate to use a prior.n less than 5. \n+\n+\n+From Bioconductor Digest, Vol 118, Issue 5, Gordon writes:\n+\n+Dear Dorota,\n+\n+The important settings are prior.df and trend.\n+\n+prior.n and prior.df are related through prior.df = prior.n * residual.df,\n+and your experiment has residual.df = 36 - 12 = 24. So the old setting of\n+prior.n=10 is equivalent for your data to prior.df = 240, a very large\n+value. Going the other way, the new setting of prior.df=10 is equivalent\n+to prior.n=10/24.\n+\n+To recover old results with the current software you would use\n+\n+ estimateTagwiseDisp(object, prior.df=240, trend="none")\n+\n+To get the new default from old software you would use\n+\n+ estimateTagwiseDisp(object, prior.n=10/24, trend=TRUE)\n+\n+Actually the old trend method is equivalent to trend="loess" in the new\n+software. You should use plotBCV(object) to see whether a trend is\n+required.\n+\n+Note you could also use\n+\n+ prior.n = getPriorN(object, prior.df=10)\n+\n+to map between prior.df and prior.n.\n+\n+----\n+\n+**Attributions**\n+\n+edgeR - edgeR_ \n+\n+VOOM/limma - limma_VOOM_ \n+\n+DESeq2 - DESeq2_ for details\n+\n+See above for Bioconductor package documentation for packages exposed in Galaxy by this tool and app store package.\n+\n+Galaxy_ (that\'s what you are using right now!) for gluing everything together \n+\n+Otherwise, all code and documentation comprising this tool was written by Ross Lazarus and is \n+licensed to you under the LGPL_ like other rgenetics artefacts\n+\n+.. _LGPL: http://www.gnu.org/copyleft/lesser.html\n+.. _HTSeq: http://www-huber.embl.de/users/anders/HTSeq/doc/index.html\n+.. _edgeR: http://www.bioconductor.org/packages/release/bioc/html/edgeR.html\n+.. 
_DESeq2: http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html\n+.. _limma_VOOM: http://www.bioconductor.org/packages/release/bioc/html/limma.html\n+.. _Galaxy: http://getgalaxy.org\n+</help>\n+\n+</tool>\n+\n+\n' |
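The help text carried in both diffs quotes the relation prior.df = prior.n * residual.df, with the worked case residual.df = 36 - 12 = 24. As a rough illustration of that arithmetic only, here is a minimal Python sketch. The function names are hypothetical and are not part of edgeR or of this tool; exact fractions are used so that 10/24 is not rounded.

```python
from fractions import Fraction


def prior_df_from_prior_n(prior_n, residual_df):
    """Hypothetical helper: prior.df = prior.n * residual.df."""
    if residual_df <= 0:
        raise ValueError("residual_df must be positive")
    return prior_n * residual_df


def prior_n_from_prior_df(prior_df, residual_df):
    """Hypothetical helper: the inverse mapping, prior.n = prior.df / residual.df."""
    if residual_df <= 0:
        raise ValueError("residual_df must be positive")
    return Fraction(prior_df) / residual_df


# Numbers taken from the quoted example in the help text.
residual_df = 36 - 12

# Old setting prior.n = 10 expressed as prior.df.
print(prior_df_from_prior_n(10, residual_df))

# New setting prior.df = 10 expressed as prior.n (10/24 in lowest terms).
print(prior_n_from_prior_df(10, residual_df))
```

This only mirrors the arithmetic in the quoted message. The tool's own parameter is the `edgeR_priordf` integer input, and the quoted text names `getPriorN(object, prior.df=10)` as the way to do the same mapping inside edgeR.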