comparison README @ 0:631dfde45073 draft default tip

First tool-shed public version
author gordon
date Tue, 09 Oct 2012 18:48:06 -0400
-1:000000000000 0:631dfde45073
1 These are Galaxy wrappers for common unix text-processing tools.
2
3 Source:
4 http://hannonlab.cshl.edu/galaxy_unix_tools/index.html
5
6 Contact: gordon at cshl dot edu
7
8 NOTE: You must install some programs manually. See below for details.
9
10 The tools are:
11
12 * awk - The AWK programming language ( http://www.gnu.org/software/gawk/ )
13 * sed - Stream Editor ( http://sed.sf.net )
14 * grep - Search files ( http://www.gnu.org/software/grep/ )
15 * GNU Coreutils programs ( http://www.gnu.org/software/coreutils/ ):
16   * sort - sort files
17   * join - join two files, based on common key field.
18   * cut - keep/discard fields from a file
19   * uniq - keep unique/duplicated lines in a file
20   * head - keep the first X lines in a file.
21   * tail - keep the last X lines in a file.
22
23 A few improvements over the standard tools:
24
25 * EasyJoin - A Join tool that does not require pre-sorting the files ( https://github.com/agordon/filo/blob/scripts/src/scripts/easyjoin )
26 * Multi-Join - Join multiple (>2) files ( https://github.com/agordon/filo/blob/scripts/src/scripts/multijoin )
27 * Sort-Header - Sort a file, while maintaining the first line as header line ( https://github.com/agordon/filo/blob/scripts/src/scripts/sort-header )
28 * Find_and_Replace - Find/Replace text in a line or specific column.
29 * Grep with Perl syntax - uses grep with Perl-Compatible regular expressions.
30 * HTML'd Grep - grep text in a file and produce highlighted HTML output, for easier viewing ( uses https://github.com/agordon/filo/blob/scripts/src/scripts/sort-header )
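
As a rough illustration of what EasyJoin and Sort-Header spare the user from doing by hand, the same effects can be approximated with plain coreutils. This is only a sketch, not the code of the linked scripts; the function names `sort_keep_header` and `join_unsorted` are hypothetical, and both helpers assume whitespace-separated files keyed on the first field.

```shell
#!/bin/sh
# Hypothetical helpers; NOT the implementation of the easyjoin / sort-header scripts.

# Sort a file on its first field while keeping the first line (the header) on top.
sort_keep_header() {
    head -n 1 "$1"
    tail -n +2 "$1" | sort -k1,1
}

# Join two unsorted files on their first field by sorting temporary copies first,
# because plain "join" expects both inputs to be sorted on the join field.
join_unsorted() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
    sort -k1,1 "$1" > "$tmp_a"
    sort -k1,1 "$2" > "$tmp_b"
    join "$tmp_a" "$tmp_b"
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return "$status"
}
```

For example, `join_unsorted a.txt b.txt` would print the joined lines of two unsorted files, and `sort_keep_header table.txt` would print the header line followed by the sorted body.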
31
32
33 Requirements
34 ============
35 1. Coreutils version 8.19 or later.
36 2. AWK version 4.0.1 or later.
37 3. SED version 4.2 *with* a special patch
38 4. Grep with PCRE support
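
One possible way to check an installed tool against the minimum versions listed above is sketched here. The helper name `version_ge` is hypothetical, the comparison relies on the version-sort option (`-V`) of GNU sort, and the parsing assumes the usual GNU `--version` banner whose first line ends with the version number.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper; relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: compare the installed coreutils (via "sort") with the required 8.19.
# Assumes the first line of "sort --version" ends with the version number.
installed=$(sort --version 2>/dev/null | head -n 1 | awk '{ print $NF }')
if version_ge "$installed" 8.19; then
    echo "coreutils $installed: OK"
else
    echo "coreutils '$installed': older than 8.19 or not GNU coreutils"
fi
```

The same function can be reused for the AWK and SED requirements by swapping in the matching command and minimum version.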
39
40
41 NOTE About Security
42 ===================
43 The included tools are secure (barring unintentional bugs):
44 The main concern might be executing system commands with awk's "system" and sed's "e" commands,
45 or reading/writing arbitrary files with awk's redirection and sed's "r/w" commands.
46 These commands are DISABLED using the "--sandbox" parameter to awk and sed.
47
48 A user trying to run an awk program similar to:
49 BEGIN { system("ls") }
50 will get an error (in Galaxy) saying:
51 fatal: 'system' function not allowed in sandbox mode.
52
53 A user trying to run a sed program similar to:
54 1els
55 will get an error (in Galaxy) saying:
56 sed: -e expression #1, char 2: e/r/w commands disabled in sandbox mode
57
58 That being said, if you do find a vulnerability in these tools, please let me know and I'll fix it.
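
To confirm on the command line that an installed gawk really enforces the sandbox, a check along these lines may help. It is a sketch under assumptions: the function name `gawk_sandbox_blocks_system` is hypothetical, the binary is assumed to be on the PATH as `gawk`, and only the awk side is covered, since the sed sandbox described here depends on the patched build.

```shell
#!/bin/sh
# Hypothetical check: does gawk refuse system() when started with --sandbox?
# Return codes: 0 = blocked, 1 = NOT blocked, 2 = gawk missing or --sandbox unsupported.
gawk_sandbox_blocks_system() {
    command -v gawk >/dev/null 2>&1 || return 2
    # A harmless program must run, otherwise the flag itself is not accepted.
    gawk --sandbox 'BEGIN { exit 0 }' >/dev/null 2>&1 || return 2
    # In sandbox mode this should stop with a fatal error (non-zero exit status).
    if gawk --sandbox 'BEGIN { system("true") }' >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

gawk_sandbox_blocks_system
case $? in
    0) echo "gawk sandbox: system() is blocked" ;;
    1) echo "gawk sandbox: system() was NOT blocked" ;;
    *) echo "gawk sandbox: gawk not found or --sandbox not supported" ;;
esac
```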
59
60
61 Installation
62 ============
63
64 ## GNU coreutils
65 wget http://ftp.gnu.org/gnu/coreutils/coreutils-8.19.tar.xz
66 tar -xJf coreutils-8.19.tar.xz
67 cd coreutils-8.19
68 ./configure --prefix=/INSTALL/PATH
69 make
70 sudo make install
71
72
73 ## AWK
74 wget http://ftp.gnu.org/gnu/gawk/gawk-4.0.1.tar.gz
75 tar -xf gawk-4.0.1.tar.gz
76 cd gawk-4.0.1
77 ./configure --prefix=/INSTALL/PATH
78 make
79 sudo make install
80
81 ## SED
82 wget ftp://ftp.gnu.org/gnu/sed/sed-4.2.tar.gz
83 wget http://cancan.cshl.edu/labmembers/gordon/files/sed-4.2-sandbox.patch
84 tar -xf sed-4.2.tar.gz
85 patch -p0 < sed-4.2-sandbox.patch
86 cd sed-4.2
87 ./configure --prefix=/INSTALL/PATH
88 make
89 sudo make install
90
91 ## Grep
92 wget ftp://ftp.gnu.org/gnu/grep/grep-2.14.tar.xz
93 tar -xJf grep-2.14.tar.xz
94 cd grep-2.14
95 ./configure --enable-perl-regexp --prefix=/INSTALL/PATH
96 make
97 sudo make install
98
99