# HG changeset patch
# User ross lazarus ross.lazarus@gmail.com
# Date 1338597788 -36000
# Node ID 7221619caefa0554e01f69d19af7d407b626fec2
# Parent 78044a3d4a2145c4c7c9d6095a52b5c2df4695ab
Updated name and added crude gzip generator for toolshed
TODO: add tests and new XML tool descriptor as soon as Greg has it nailed down.
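
For reviewers, a minimal sketch of the kind of thing the "crude gzip generator" does, assuming the archive only needs the pasted script plus an XML descriptor. The helper names (make_tool_tarball, list_tarball), the file names and the archive layout are hypothetical illustrations, not the code or layout in rgToolFactory.py. It is written for Python 3 using only the standard tarfile module:

```python
import io
import tarfile
import tempfile


def make_tool_tarball(tool_name, script_text, xml_text, out_dir, script_ext=".R"):
    """Write <out_dir>/<tool_name>.tar.gz holding a script and an XML descriptor.

    Hypothetical helper: names and layout are assumptions for illustration only.
    """
    archive_path = "%s/%s.tar.gz" % (out_dir, tool_name)
    members = {
        tool_name + script_ext: script_text,
        tool_name + ".xml": xml_text,
    }
    # "w:gz" writes a gzip-compressed tar archive
    with tarfile.open(archive_path, "w:gz") as tar:
        for name, text in sorted(members.items()):
            data = text.encode("utf-8")
            # Build each member in memory so nothing else needs to touch disk
            info = tarfile.TarInfo(name="%s/%s" % (tool_name, name))
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return archive_path


def list_tarball(archive_path):
    """Return the sorted member names of a gzipped tar archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = make_tool_tarball("transpose", "outp = t(inp)\n", "<tool/>\n", tmp)
        print(list_tarball(path))
```

Building the members in memory keeps the sketch independent of wherever the job's working directory happens to be; a real generator would also need to write a proper tool XML rather than the placeholder used here.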
diff -r 78044a3d4a21 -r 7221619caefa README.txt
--- a/README.txt Thu May 31 10:49:22 2012 +1000
+++ b/README.txt Sat Jun 02 10:43:08 2012 +1000
@@ -1,28 +1,29 @@
-= WARNING before you start =
-This tool should only ever be installed on a private instance - NEVER use it on a public Galaxy because the risks are too awful to contemplate let alone manage. You have been warned.
+= WARNING before you start =
+This tool should only ever be installed on a private Galaxy instance - NEVER use it on a public
+Galaxy because the risks are too awful to contemplate let alone manage. You have been warned.
-== Motivation ==
-Simple transformation, filtering or reporting scripts get written, run and lost every day in most busy labs - even ours where Galaxy is in use. This 'dark script matter' is pervasive and generally not reproducible.
+== Motivation ==
+Simple transformation, filtering or reporting scripts get written, run and lost every day in most busy labs
+- even ours where Galaxy is in use. This 'dark script matter' is pervasive and generally not reproducible.
-After considerable nagging, I wrote and installed a new tool locked down to allow only two other trusted bioinformatician users to paste and run (NO sandbox!) arbitrary scripts - see screenshot attached.
+== Benefits ==
+For our group, this allows Galaxy to fill that important dark script gap - all those "small" bioinformatics
+tasks. Once a user has a working R (or python or perl) script that does something Galaxy cannot currently do (eg transpose a
+tabular file) and takes parameters the way Galaxy supplies them (see example below), they can:
-== Benefits ==
-For our group, this allows Galaxy to fill that important dark script gap - all those "small" bioinformatics tasks - because once a trusted user has a working R (or python or perl) script that takes parameters the way Galaxy supplies them (see example below), they:
-
-1) run the new tool
+1. Install the tool factory on a personal private instance
-2) paste their code into the tool 'script' text box
+2. Upload a small test data set
-3) select the optional history input (and some other odds and ends - see screen shot) and
+3. Paste the script into the 'script' text box and iteratively run the insecure tool on test data until it works right -
+there is absolutely no reason to do this anywhere other than on a personal private instance.
-4) run the tool and thus the script.
+4. Once it works right, set the 'Generate toolshed gzip' option and run it again.
-== Proposal - a Galaxy toolshed interface ==
-Rerunning the output of the existing script reruns the same script of course, so we're now better off than we were before. But what about adding some code to this script runner tool to generate a new Galaxy tool as a ready to install toolshed entry?
+5. A toolshed style gzip appears ready to upload and install like any other Toolshed entry.
+
+6. Upload the new tool to the toolshed
-This (imho) will be a very low impedence way to generate new simple Galaxy tools - run them until they work then package them up and deploy/distribute for any user to use - no new security risks.
+7. Ask the local admin to check the new tool to confirm it's not evil and install it in the local production Galaxy.
-Does this seem like a good idea to anyone else?
+New mantra: Galaxy can efficiently soak up all your lab's dark script matter and make it reproducible and shareable.
== Proof of concept ==
@@ -30,14 +31,15 @@
[[http://bitbucket.org/fubar/galaxytoolmaker/src/fda8032fe989/images/dynamicScriptTool.png|proof of concept screengrab]]
=== Sample Rscript ===
-As a working example, this trivial Rscript replaces the score column in a bed file with a random number:
+As a working example, this trivial Rscript transposes a tabular file:
+
{{{
ourargs = commandArgs(TRUE)
inf = ourargs[1]
outf = ourargs[2]
inp = read.table(inf,head=F,row.names=NULL,sep='\t')
- inp[,5] = runif ( nrow(inp) )
- write.table(inp,outf, quote=FALSE, sep="\t",row.names=F,col.names=F)
+ outp = t(inp)
+ write.table(outp,outf, quote=FALSE, sep="\t",row.names=F,col.names=F)
}}}
== Licensing ==
diff -r 78044a3d4a21 -r 7221619caefa rgDynamicScriptWrapper.py
--- a/rgDynamicScriptWrapper.py Thu May 31 10:49:22 2012 +1000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,254 +0,0 @@
-# rgDynamicScriptWrapper.py
-# derived from
-# rgBaseScriptWrapper.py
-# to run some user supplied code
-# extremely dangerous
-# trusted users only - private site only
-# a list in the xml is searched - only users in the list can run this tool.
-#
-# copyright ross lazarus (ross.lazarus@gmail.com) May 2012
-#
-# all rights reserved
-# Licensed under the LGPL for your pleasure
-# Derived from rgDGE.py in May 2012
-# generalized to run required interpreter
-# to make your own tools based on a given script and interpreter such as perl or python
-# clone this and the corresponding xml wrapper
-# replace the parameters/inputs/outputs and the configfile contents with your script
-# Use the $foo syntax to place your parameter values inside the script to assign them - at run time, the script will be used as a template
-# and returned as part of the output to the user - with the right values for all the parameters.
-# Note that this assumes you want all the outputs arranged as a single Html file output
-# after this generic script runner runs your script with the specified interpreter,
-# it will collect all output files into the specified output_html, making thumbnails for all the pdfs it finds and making links for all the other files.
-
-import sys
-import shutil
-import subprocess
-import os
-import time
-import tempfile
-import optparse
-
-progname = os.path.split(sys.argv[0])[1]
-myversion = 'V000.1 May 2012'
-verbose = False
-debug = False
-
-# characters that are allowed but need to be escaped
-# also a test sandboxing of any R system commands
-# ultimately futile - we need to generate a new tool
-# which will have no new security problems!
-mapped_chars = { '>' :'__gt__',
- '<' :'__lt__',
- "'" :'__sq__',
- '"' :'__dq__',
- '{' :'__oc__',
- '}' :'__cc__',
- '@' : '__at__',
- '\n' : '__cn__',
- '\r' : '__cr__',
- '\t' : '__tc__',
- '#' : '__pd__',
- '[' :'__ob__',
- ']' :'__cb__',
- '\t' : 'Xt',
- 'systemCallsAreNotAllowed' : 'system'
- }
-
-galhtmlprefix = """
-
-
-
\n')
- html.append(galhtmlattr % (progname,timenow()))
- html.append(galhtmlpostfix)
- htmlf = file(self.opts.output_html,'w')
- htmlf.write('\n'.join(html))
- htmlf.write('\n')
- htmlf.close()
- return retval
-
-
-def main():
-    u = """
-    This is a Galaxy wrapper. It expects to be called by a special purpose tool.xml as:
-    rgBaseScriptWrapper.py --script_path "$scriptPath" --tool_name "foo" --interpreter "Rscript"
-
-    """
-    permitted_users = ['rlazarus@bakeridi.edu.au','akaspi@bakeridi.edu.au','mziemann@bakeridi.edu.edu']
-    op = optparse.OptionParser()
-    a = op.add_option
-    a('--script_path',default=None)
-    a('--tool_name',default=None)
-    a('--interpreter',default=None)
-    a('--output_dir',default=None)
-    a('--output_html',default=None)
-    a('--input_tab',default='NONE')
-    a('--output_tab',default='NONE')
-    a('--user_email',default=None)
-    a('--bad_user',default=None)
-    opts, args = op.parse_args()
-    assert not opts.bad_user,'%s is NOT authorized to use this tool. Please ask your friendly admin' % opts.bad_user
-    assert opts.tool_name,'## Dynamic script wrapper expects a tool name - eg --tool_name=DESeq'
-    assert opts.interpreter,'## Dynamic script wrapper expects an interpreter - eg --interpreter=Rscript'
-    assert os.path.isfile(opts.script_path),'## Dynamic script wrapper expects a script path - eg --script_path=foo.R'
-    if opts.output_dir:
-        try:
-            os.makedirs(opts.output_dir)
-        except:
-            pass
-    r = ScriptRunner(opts)
-    retcode = r.run()
-    if retcode:
-        print >> sys.stderr,'Executing your script %s failed - return code was %d' % (opts.tool_name,retcode)
-        sys.exit(retcode) # indicate failure to job runner
-
-
-if __name__ == "__main__":
-    main()
-
-
diff -r 78044a3d4a21 -r 7221619caefa rgDynamicScriptWrapper.xml
--- a/rgDynamicScriptWrapper.xml Thu May 31 10:49:22 2012 +1000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,94 +0,0 @@
-
- DIY scripting
-
-#if ( $__user_email__ not in ['rlazarus@bakeridi.edu.au','mziemann@bakeridi.edu.au','akaspi@bakeridi.edu.au'] ):
- rgDynamicScriptWrapper.py --bad_user $__user_email__
- #else:
- rgDynamicScriptWrapper.py --script_path "$runme" --interpreter "$interpreter"
- --tool_name "$tool_name" --input_tab "$input1" --user_email "${__user_email__}"
- #if $makeHTML.value=="yes":
- --output_dir "$html_file.files_path" --output_html "$html_file"
- #end if
- #if $makeTAB.value=="yes":
- --output_tab "$tab_file"
- #end if
-#end if
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- makeTAB=="yes"
-
-
- makeHTML=="yes"
-
-
-
-
-${dynScript}
-
-
-
-**What it does**
-This tool enables a user to paste and submit an arbitrary R/python/perl script to run in Galaxy.
-This is (extremely) insecure.
-
-**Restrictions**
-This tool will ONLY work if your user id has been added to the local copy's list of permitted users.
-Ask your friendly Galaxy administrator to edit this tool's source for you if you need this.
-
-**Note to system administrators**
-Under no circumstances should you allow any user to use this tool unless you really, really trust them to do
-no harm.
-
-**Use on public servers**
-is STRONGLY discouraged for obvious reasons
-
-**Scripting conventions**
-The pasted script will be executed.
-It will get the path to the (optional) input tabular data file path or NONE if you do not select one
-as the first command line parameter
-
-The script must write it's output as tab delimited text to the path found as the second command line parameter
-Note that if an optional HTML output is selected, all the output files spewed by your script will be nicely presented as links to the user.
-Any pdf images will automagically be converted to show thumbnails in that output.
-This can be handy for complex scripts creating lots of output.
-
-**Simple Rscript example**
-
-A simple "filter" that takes an input file, does something and writes the results to a new tabular file might look like this::
-
- ourargs = commandArgs(TRUE)
- inf = ourargs[1]
- outf = ourargs[2]
- inp = read.table(inf,head=F,row.names=NULL,sep='\t')
- inp[,5] = runif ( nrow(inp) )
- write.table(inp,outf, quote=FALSE, sep="\t",row.names=F,col.names=F)
-
-
-
-
-
-
-
diff -r 78044a3d4a21 -r 7221619caefa rgToolFactory.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/rgToolFactory.py Sat Jun 02 10:43:08 2012 +1000
@@ -0,0 +1,286 @@
+# rgToolFactory.py
+# derived from
+# rgBaseScriptWrapper.py
+# to run some user supplied code
+# extremely dangerous
+# trusted users only - private site only
+# a list in the xml is searched - only users in the list can run this tool.
+#
+# copyright ross lazarus (ross.lazarus@gmail.com) May 2012
+#
+# all rights reserved
+# Licensed under the LGPL for your pleasure
+# Derived from rgDGE.py in May 2012
+# generalized to run required interpreter
+# to make your own tools based on a given script and interpreter such as perl or python
+# clone this and the corresponding xml wrapper
+# replace the parameters/inputs/outputs and the configfile contents with your script
+# Use the $foo syntax to place your parameter values inside the script to assign them - at run time, the script will be used as a template
+# and returned as part of the output to the user - with the right values for all the parameters.
+# Note that this assumes you want all the outputs arranged as a single Html file output
+# after this generic script runner runs your script with the specified interpreter,
+# it will collect all output files into the specified output_html, making thumbnails for all the pdfs it finds and making links for all the other files.
+
+import sys
+import shutil
+import subprocess
+import os
+import time
+import tempfile
+import optparse
+import tarfile
+import re
+progname = os.path.split(sys.argv[0])[1]
+myversion = 'V000.1 May 2012'
+verbose = False
+debug = False
+
+
+galhtmlprefix = """
+
+
+
+
+
+
+
+
+
\n')
+ html.append(galhtmlattr % (progname,timenow()))
+ html.append(galhtmlpostfix)
+ htmlf = file(self.opts.output_html,'w')
+ htmlf.write('\n'.join(html))
+ htmlf.write('\n')
+ htmlf.close()
+ self.html = html
+
+
+    def run(self):
+        """Pipe the script to the interpreter's stdin and return its exit code.
+        """
+        if self.opts.output_dir or self.opts.makeTool:
+            sto = open(self.tlog,'w')
+            p = subprocess.Popen(' '.join(self.cl),shell=True,stdout=sto,stderr=sto,stdin=subprocess.PIPE,cwd=self.opts.output_dir)
+        else:
+            p = subprocess.Popen(' '.join(self.cl),shell=True,stdin=subprocess.PIPE)
+        p.stdin.write(self.script)
+        p.stdin.close()
+        retval = p.wait()
+        if self.opts.output_dir or self.opts.makeTool:
+            sto.close()
+            self.makeHtml()
+        return retval
+
+
+def main():
+    u = """
+    This is a Galaxy wrapper. It expects to be called by a special purpose tool.xml as:
+    rgToolFactory.py --script_path "$scriptPath" --tool_name "foo" --interpreter "Rscript"
+
+    """
+    op = optparse.OptionParser()
+    a = op.add_option
+    a('--script_path',default=None)
+    a('--tool_name',default=None)
+    a('--interpreter',default=None)
+    a('--output_dir',default=None)
+    a('--output_html',default=None)
+    a('--input_tab',default='NONE')
+    a('--output_tab',default='NONE')
+    a('--user_email',default=None)
+    a('--bad_user',default=None)
+    a('--makeTool',default=None)
+    opts, args = op.parse_args()
+    assert not opts.bad_user,'%s is NOT authorized to use this tool. Please ask your friendly admin' % opts.bad_user
+    assert opts.tool_name,'## Tool Factory expects a tool name - eg --tool_name=DESeq'
+    assert opts.interpreter,'## Tool Factory wrapper expects an interpreter - eg --interpreter=Rscript'
+    assert os.path.isfile(opts.script_path),'## Tool Factory wrapper expects a script path - eg --script_path=foo.R'
+    if opts.output_dir:
+        try:
+            os.makedirs(opts.output_dir)
+        except OSError: # directory already exists
+            pass
+    r = ScriptRunner(opts)
+    if opts.makeTool:
+        retcode = r.makeTooltar()
+    else:
+        retcode = r.run()
+    if retcode:
+        sys.exit(retcode) # indicate failure to job runner
+
+
+if __name__ == "__main__":
+    main()
+
+
diff -r 78044a3d4a21 -r 7221619caefa rgToolFactory.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/rgToolFactory.xml Sat Jun 02 10:43:08 2012 +1000
@@ -0,0 +1,101 @@
+
+ Makes scripts into tools
+
+#if ( $__user_email__ not in ['ross.lazarus@gmail.com',] ):
+ rgToolFactory.py --bad_user $__user_email__
+ #else:
+ rgToolFactory.py --script_path "$runme" --interpreter "$interpreter"
+ --tool_name "$tool_name" --input_tab "$input1" --user_email "${__user_email__}"
+ #if $makeHTML.value=="yes" or $makeTool.value=="yes":
+ --output_dir "$html_file.files_path" --output_html "$html_file"
+ #end if
+ #if $makeTAB.value=="yes":
+ --output_tab "$tab_file"
+ #end if
+ #if $makeTool.value=="yes":
+ --makeTool "$makeTool"
+ #end if
+#end if
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ makeTAB=="yes"
+
+
+ makeHTML=="yes" or makeTool=="yes"
+
+
+
+
+${dynScript}
+
+
+
+**What it does**
+This tool enables a user to paste and submit an arbitrary R/python/perl script to run in Galaxy.
+This is (extremely) insecure.
+
+**Restrictions**
+This tool will ONLY work if your user id has been added to the local copy's list of permitted users.
+Ask your friendly Galaxy administrator to edit this tool's source for you if you need this.
+
+**Note to system administrators**
+Under no circumstances should you allow any user to use this tool unless you really, really trust them to do
+no harm.
+
+**Use on public servers**
+is STRONGLY discouraged for obvious reasons
+
+**Scripting conventions**
+The pasted script will be executed.
+Its first command line parameter is the path to the (optional) input tabular data file,
+or NONE if you do not select one.
+
+The script must write its output as tab delimited text to the path given as the second command line parameter.
+Note that if an optional HTML output is selected, all the output files spewed by your script will be nicely presented as links to the user.
+Any pdf images will automagically be converted to show thumbnails in that output.
+This can be handy for complex scripts creating lots of output.
+
+**Simple Rscript example**
+
+A simple "filter" that takes an input file, does something and writes the results to a new tabular file might look like this::
+
+ ourargs = commandArgs(TRUE)
+ inf = ourargs[1]
+ outf = ourargs[2]
+ inp = read.table(inf,head=F,row.names=NULL,sep='\t')
+ outp = t(inp)
+ write.table(outp,outf, quote=FALSE, sep="\t",row.names=F,col.names=F)
+
+
+
+
+
+