view README.txt @ 21:0ee2b06ea304
fix destination names of test files to match test generator schema
author | ross lazarus ross.lazarus@gmail.com |
---|---|
date | Tue, 05 Jun 2012 22:18:30 +1000 |
parents | a87a262220a4 |
children | 8289ebc513ab |
# WARNING before you start
# Install on a private Galaxy ONLY
# Please NEVER on a public or production instance

*Read on if*

You use a production Galaxy;
Your users sometimes take data out of Galaxy, process it with ugly little perl/awk/sed/R... scripts and put it back;
They do this when they can't do some transformation in Galaxy (the 90/10 rule);
You don't have enough developer resources for wrapping dozens of even relatively simple tools;
Your institution would be better off if those nasty, feral scripts were all tucked safely in a local toolshed.

*The good news*

If it can be trivially scripted, it can be running safely in your local Galaxy via your own local toolshed. That's what this tool does. You paste a simple script and the tool returns a new, real Galaxy tool, ready to be installed from the local toolshed to local servers. Scripts can be wrapped and online literally within minutes.

*To fully and safely exploit the awesome power* of Galaxy with this tool installed, you should be a developer installing this tool on a personal/scratch local instance, i.e. a private site, because then, if you break it, you get to keep all the pieces.
See https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home

*To make the tool work*

If not already there, please add:

<datatype extension="toolshed.gz" type="galaxy.datatypes.binary:Binary" mimetype="multipart/x-gzip" subclass="True" />

to your local datatypes_conf.xml.

Then PUT some IDs in the list in the XML before you restart Galaxy to load this new tool. Otherwise, the tool won't run for anybody.

*What it does*

This is a tool factory for simple scripts in python, R or whatever ails ye.
LIMITED to simple scripts that read one input from the history.
Optionally can write one new history dataset, and optionally collect any number of outputs into links on an autogenerated HTML page.
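
For orientation, a script of the shape described here (one input from the history, at most one new output dataset) might look like the minimal Python sketch below. It assumes, as the examples later in this README do, that the input path arrives as the first command line argument and the output path as the second. The names `transform` and `process` are hypothetical, and the whitespace-stripping step is only a stand-in for your own logic.

```python
# Hypothetical skeleton for a script pasted into the tool factory form.
# Assumption: the input path is the first command line argument and the
# output path is the second, as in the examples later in this README.
import sys


def transform(fields):
    """Placeholder step: strip stray whitespace from every field in a row."""
    return [f.strip() for f in fields]


def process(in_path, out_path):
    """Read one tabular input, write one tabular output, return rows written."""
    n = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            fields = line.rstrip("\r\n").split("\t")
            dst.write("\t".join(transform(fields)) + "\n")
            n += 1
    return n


# Only run when exactly an input path and an output path were supplied.
if __name__ == "__main__" and len(sys.argv) == 3:
    process(sys.argv[1], sys.argv[2])
```

Keeping the per-row logic in its own function means you can swap in a different transformation without touching the file handling, which helps when rerunning and editing the script on test data.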
Generated tools can be edited and enhanced like any Galaxy tool, so start small and build up.
A generated script gets you a serious leg up to a more complex one.

*What you do*

You paste and run your script, you fix the syntax errors, and eventually it runs. That's pretty good, because you can use the redo button to rerun it and edit the script as you debug. Cool, but now the power really kicks in: once the script works on some test data, you can generate a toolshed compatible gzip file containing your script, neatly wrapped and hidden safely inside an ordinary Galaxy script in your local toolshed. That means safe and largely automated installation in any production Galaxy configured to use your toolshed or the tool.

Automated build for tests still being worked on - should be done soon.

*Generated tool Security*

Once you install a generated tool, it's just another tool - assuming the script is safe. Generated tools run normally and their users cannot do anything unusually insecure, but please, practice safe toolshed. Read the fucking code before you install any tool. Especially this one - it is really scary.

If you opt for an HTML output, you get all the script outputs arranged as a single HTML history item - all output files are linked, with thumbnails for all the pdfs. Ugly but really inexpensive.

Patches welcome please.

Long route to the June 2012 product, derived from an integrated script model called rgBaseScriptWrapper.py

Note to the unwary:
This tool allows arbitrary scripting on your Galaxy as the Galaxy user.
There is nothing stopping a malicious user doing whatever they choose.
Extremely dangerous!!
Totally insecure.
So, trusted users only.

Copyright ross lazarus (ross stop lazarus at gmail stop com) May 2012
All rights reserved
Licensed under the LGPL. If you want to improve it, feel free:
https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home

Material for our more enthusiastic readers continues below

**Motivation**

Simple transformation, filtering or reporting scripts get written, run and lost every day in most busy labs - even ours, where Galaxy is in use. This 'dark script matter' is pervasive and generally not reproducible.

**Benefits**

For our group, this allows Galaxy to fill that important dark script gap - all those "small" bioinformatics tasks. Once a user has a working R (or python or perl) script that does something Galaxy cannot currently do (e.g. transpose a tabular file) and takes parameters the way Galaxy supplies them (see example below), they:

1. Install the tool factory on a personal private instance

2. Upload a small test data set

3. Paste the script into the 'script' text box and iteratively run the insecure tool on test data until it works right - there is absolutely no reason to do this anywhere other than on a personal private instance.

4. Once it works right, set the 'Generate toolshed gzip' option and run it again.

5. A toolshed style gzip appears, ready to upload and install like any other Toolshed entry.

6. Upload the new tool to the toolshed

7. Ask the local admin to check the new tool to confirm it's not evil and install it in the local production galaxy

**Simple examples on the tool form**

A simple Rscript "filter" showing how the command line parameters can be handled. It takes an input file, does something (transpose in this case) and writes the results to a new tabular file::

    # transpose a tabular input file and write as a tabular output file
    # first argument is the input path, second is the output path
    ourargs = commandArgs(TRUE)
    inf = ourargs[1]
    outf = ourargs[2]
    inp = read.table(inf,header=FALSE,row.names=NULL,sep='\t')
    outp = t(inp)
    write.table(outp,outf, quote=FALSE, sep="\t",row.names=FALSE,col.names=FALSE)

A more complex Rscript example takes no input file but generates a random heatmap pdf - you must make sure the option to create an HTML output file is turned on for this to work. The heatmap will be presented as a thumbnail linked to the pdf in the resulting HTML page::

    # note this script takes NO input or output because it generates random data
    foo = data.frame(a=runif(100),b=runif(100),c=runif(100),d=runif(100),e=runif(100),f=runif(100))
    bar = as.matrix(foo)
    pdf( "heattest.pdf" )
    heatmap(bar,main='Random Heatmap')
    dev.off()

A Python example that reverses each row of a tabular file. You'll need to remove the leading spaces for this to work if cut and pasted into the script box. Note that you can already do this in Galaxy by setting up the cut columns tool with the correct number of columns in reverse order, but this script will work for any number of columns so is completely generic::

    # reverse order of columns in a tabular file
    import sys
    inp = sys.argv[1]
    outp = sys.argv[2]
    i = open(inp,'r')
    o = open(outp,'w')
    for row in i:
        # strip only the line ending so empty trailing columns are kept
        rs = row.rstrip('\r\n').split('\t')
        rs.reverse()
        o.write('\t'.join(rs))
        o.write('\n')
    i.close()
    o.close()

**Attribution**

Copyright Ross Lazarus (ross period lazarus at gmail period com) May 2012

All rights reserved.

Licensed under the LGPL_

**Obligatory screenshot**

http://bitbucket.org/fubar/galaxytoolmaker/src/fda8032fe989/images/dynamicScriptTool.png