<tool id="cshl_sort_header" name="Sort" version="0.1.1">
  <command interpreter="perl">sort-header
    --header $header
    $unique
    $ignore_case
    --stable
    -t ' '
    #for $key in $sortkeys
      '-k ${key.column}${key.order}${key.style},${key.column}'
    #end for
    --output '$out_file1'
    '$input1'
  </command>

  <inputs>
    <param format="txt" name="input1" type="data" label="Sort Query" />

    <!-- The header line is boolean for now, but the values are 1 or 0.
         In the future, we can use Galaxy's number-of-comment-lines variable. -->
    <param name="header" type="boolean" checked="false" truevalue="1" falsevalue="0"
           label="First line is a header line" help="Use if first line contains column headers. It will not be sorted." />

    <param name="unique" type="boolean" checked="false" truevalue="--unique" falsevalue=""
           label="Output unique values" help="Print only unique values (based on sorted key columns). See help section for details." />

    <param name="ignore_case" type="boolean" checked="false" truevalue="-i" falsevalue="" label="Ignore case" help="Sort key column values regardless of upper/lower case letters." />

    <repeat name="sortkeys" title="sort key">
      <param name="column" label="on column" type="data_column" data_ref="input1" accept_default="true" />
      <param name="order" type="select" display="radio" label="in">
        <option value="">Ascending order</option>
        <option value="r">Descending order</option>
      </param>
      <param name="style" type="select" display="radio" label="Flavor">
        <option value="n">Fast numeric sort ([-n])</option>
        <option value="g">General numeric sort (scientific notation, [-g])</option>
        <option value="V">Natural/Version sort ([-V])</option>
        <option value="">Alphabetical sort</option>
        <option value="h">Human-readable numbers ([-h])</option>
        <option value="R">Random order</option>
      </param>
    </repeat>
  </inputs>
  <tests>
  </tests>
  <outputs>
    <data format="input" name="out_file1" metadata_source="input1" />
  </outputs>
  <help>

**What it does**

This tool sorts the lines of an input file on one or more key columns. The first line can optionally be kept in place as a header.

-----

**Sorting Styles**

* **Fast Numeric**: Sort by numeric values. Handles integer values (e.g. 43, 134) and decimal-point values (e.g. 3.14). *Does not* handle scientific notation (e.g. -2.32e2).
* **General Numeric**: Sort by numeric values. Handles all numeric notations (including scientific notation). Slower than *fast numeric*, so use only when necessary.
* **Natural Sort**: Sort in 'natural' order (natural to humans, not to computers). See example below.
* **Alphabetical sort**: Sort in strict alphabetical order. See example below.
* **Human-readable numbers**: Sort human-readable numbers with size suffixes (e.g. 1G > 2M > 3K > 400).
* **Random order**: Return lines in random order.
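
As a rough command-line illustration of the difference between the two numeric styles, consider a column that contains one value in scientific notation. This is only a sketch: it assumes GNU ``sort`` is installed and that the ``-n`` and ``-g`` flags named in the option labels behave as they do in GNU coreutils; it does not show how this tool invokes ``sort`` internally::

    # fast numeric: "1e3" is read as 1, so it sorts first (1e3, 3.5, 20)
    printf '1e3\n20\n3.5\n' | LC_ALL=C sort -n

    # general numeric: "1e3" is read as 1000, so it sorts last (3.5, 20, 1e3)
    printf '1e3\n20\n3.5\n' | LC_ALL=C sort -g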

------

**Example - Header line**

**Input file** (note that the first line is a header line and should not be sorted)::

    Fruit    Color   Price
    Banana   Yellow  4.1
    Avocado  Green   8.0
    Apple    Red     3.0
    Melon    Green   6.1

**Sorting** by **numeric order** on column **3**, with **header**, will return::

    Fruit    Color   Price
    Apple    Red     3.0
    Banana   Yellow  4.1
    Melon    Green   6.1
    Avocado  Green   8.0
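
The same result can be approximated outside Galaxy with standard tools. This is only a sketch under stated assumptions: ``fruits.txt`` is a hypothetical tab-separated copy of the input file above, and GNU ``sort`` is assumed to be available::

    # print the header line unchanged, then sort the remaining lines numerically on column 3
    { head -n 1 fruits.txt; tail -n +2 fruits.txt | sort -t "$(printf '\t')" -k 3,3n; }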


-----

**Example - Natural vs. Alphabetical sorting**

Given the following list::

    chr4
    chr13
    chr1
    chr10
    chr20
    chr2

**Alphabetical sort** would produce the following sorted list::

    chr1
    chr10
    chr13
    chr2
    chr20
    chr4

**Natural Sort** would produce the following sorted list::

    chr1
    chr2
    chr4
    chr10
    chr13
    chr20


.. class:: infomark

If you're planning to use the file with another tool that expects sorted files (such as *join*), you should use the **Alphabetical sort**, not the **Natural Sort**. Natural sort order is easier for humans, but is unnatural for computer programs.
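
The two orderings can also be compared on the command line. The sketch below assumes GNU ``sort``, whose ``-V`` flag is the one named in the *Natural/Version sort* option label; ``LC_ALL=C`` is set so that the plain sort compares bytes rather than following locale collation rules::

    # alphabetical: chr1, chr10, chr13, chr2, chr20, chr4
    printf 'chr4\nchr13\nchr1\nchr10\nchr20\nchr2\n' | LC_ALL=C sort

    # natural/version: chr1, chr2, chr4, chr10, chr13, chr20
    printf 'chr4\nchr13\nchr1\nchr10\nchr20\nchr2\n' | LC_ALL=C sort -V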

-----

*sort-header* was written by A. Gordon (gordon at cshl dot edu)

  </help>
</tool>