<tool id="unixtools_multijoin_tool" name="Multi-Join" version="0.1.1">
  <description>(combine multiple files)</description>
  <command interpreter="perl">multijoin
    --key '$key_column'
    --values '$value_columns'
    --filler '$filler'
    $ignore_dups
    $output_header
    $input_header
    #for $file in $files
      '$file.filename'
    #end for
    > '$output'
  </command>

  <inputs>
    <repeat name="files" title="file to join">
      <param name="filename" label="Add file" type="data" format="txt" />
    </repeat>

    <param name="key_column" label="Common key column" type="integer"
      value="1" help="Usually a gene ID or another value shared by all files" />

    <param name="value_columns" label="Columns with values to preserve" type="text"
      value="2,3,4" help="Enter a comma-separated list of columns, e.g. 3,6,8">
      <sanitizer>
        <valid initial="string.printable">
          <remove value="'"/>
        </valid>
      </sanitizer>
    </param>

    <param name="output_header" type="boolean" checked="false" truevalue="--out-header" falsevalue="" label="Add header line to the output file" help="" />
    <param name="input_header" type="boolean" checked="false" truevalue="--in-header" falsevalue="" label="Input files contain a header line (as first line)" help="" />
    <param name="ignore_dups" type="boolean" checked="false" truevalue="--ignore-dups" falsevalue="" label="Ignore duplicated keys" help="If not set, duplicated keys in the same file will cause an error." />
    <param name="filler" type="text" size="20" value="0" label="Value to put in unpaired (empty) fields">
      <sanitizer>
        <valid initial="string.printable">
          <remove value="'"/>
        </valid>
      </sanitizer>
    </param>

  </inputs>
  <outputs>
    <data name="output" format="input" metadata_source="input1" />
  </outputs>

  <help>
**What it does**

This tool joins multiple tabular files based on a common key column. For each key, the selected value columns of every file are written side by side on one output line, and fields for files that lack the key are filled with the filler value.

-----

**Example**

To join three files on the 4th column while keeping the 7th, 8th, and 9th columns:

**First file (AAA)**::

    chr4  888449   890171   FBtr0308778  0  +  266    1527   1722
    chr4  972167   979017   FBtr0310651  0  -  3944   6428   6850
    chr4  972186   979017   FBtr0089229  0  -  3944   6428   6831
    chr4  972186   979017   FBtr0089231  0  -  3944   6428   6831
    chr4  972186   979017   FBtr0089233  0  -  3944   6428   6831
    chr4  995793   996435   FBtr0111046  0  +  7      166    642
    chr4  995793   997931   FBtr0111044  0  +  28     683    2138
    chr4  995793   997931   FBtr0111045  0  +  28     683    2138
    chr4  1034029  1047719  FBtr0089223  0  -  5293   13394  13690
    ...

**Second file (BBB)**::

    chr4  90286   134453  FBtr0309803  0  +  657  29084  44167
    chr4  251355  266499  FBtr0089116  0  +  56   1296   15144
    chr4  252050  266506  FBtr0308086  0  +  56   1296   14456
    chr4  252050  266506  FBtr0308087  0  +  56   1296   14456
    chr4  252053  266528  FBtr0300796  0  +  56   1296   14475
    chr4  252053  266528  FBtr0300800  0  +  56   1296   14475
    chr4  252055  266528  FBtr0300798  0  +  56   1296   14473
    chr4  252055  266528  FBtr0300799  0  +  56   1296   14473
    chr4  252541  266528  FBtr0300797  0  +  56   1296   13987
    ...

**Third file (CCC)**::

    chr4  972167   979017   FBtr0310651  0  -  9927   6738   6850
    chr4  972186   979017   FBtr0089229  0  -  9927   6738   6831
    chr4  972186   979017   FBtr0089231  0  -  9927   6738   6831
    chr4  972186   979017   FBtr0089233  0  -  9927   6738   6831
    chr4  995793   996435   FBtr0111046  0  +  5      304    642
    chr4  995793   997931   FBtr0111044  0  +  17     714    2138
    chr4  995793   997931   FBtr0111045  0  +  17     714    2138
    chr4  1034029  1047719  FBtr0089223  0  -  17646  13536  13690
    ...

**Joining** the files, using **key column 4**, **value columns 7,8,9** and a **header line**, will return::

    key          AAA__V7  AAA__V8  AAA__V9  BBB__V7  BBB__V8  BBB__V9  CCC__V7  CCC__V8  CCC__V9
    FBtr0089116  0        0        0        56       1296     15144    0        0        0
    FBtr0089223  5293     13394    13690    0        0        0        17646    13536    13690
    FBtr0089229  3944     6428     6831     0        0        0        9927     6738     6831
    FBtr0089231  3944     6428     6831     0        0        0        9927     6738     6831
    FBtr0089233  3944     6428     6831     0        0        0        9927     6738     6831
    FBtr0111044  28       683      2138     0        0        0        17       714      2138
    FBtr0111045  28       683      2138     0        0        0        17       714      2138
    FBtr0111046  7        166      642      0        0        0        5        304      642
    FBtr0300796  0        0        0        56       1296     14475    0        0        0
    ...

Note: input files do not need to be sorted.

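The join in the example above can be sketched in Python as follows. This is a minimal illustration of the documented behavior, not the multijoin implementation. The function name ``multijoin_sketch`` and its arguments are hypothetical, tables are passed as in-memory lists of fields rather than files, and sorting the keys is an assumption made only so the result lines up with the example output.

```python
def multijoin_sketch(tables, key_col, value_cols, filler="0"):
    """Join named tables on a 1-based key column, keeping 1-based value columns.

    `tables` maps a table name to a list of rows; each row is a list of fields.
    """
    joined = {}  # key -> {table name -> list of kept values}
    for name, rows in tables.items():
        seen = set()
        for row in rows:
            key = row[key_col - 1]
            if key in seen:
                # Mirrors the documented default: a duplicated key is an error.
                raise ValueError("duplicate key %r in %s" % (key, name))
            seen.add(key)
            joined.setdefault(key, {})[name] = [row[c - 1] for c in value_cols]

    # Header follows the NAME__Vn pattern shown in the example output.
    header = ["key"] + ["%s__V%d" % (name, c) for name in tables for c in value_cols]
    blank = [filler] * len(value_cols)

    body = []
    for key in sorted(joined):  # assumption: sorted for a deterministic result
        line = [key]
        for name in tables:
            # Tables that lack this key contribute filler values.
            line.extend(joined[key].get(name, blank))
        body.append(line)
    return [header] + body


# One row per file, copied from the example files above.
tables = {
    "AAA": ["chr4 995793 996435 FBtr0111046 0 + 7 166 642".split()],
    "BBB": ["chr4 251355 266499 FBtr0089116 0 + 56 1296 15144".split()],
    "CCC": ["chr4 995793 996435 FBtr0111046 0 + 5 304 642".split()],
}
result = multijoin_sketch(tables, key_col=4, value_cols=[7, 8, 9])
for line in result:
    print("\t".join(line))
```

With these three rows, the sketch yields the same ``FBtr0089116`` and ``FBtr0111046`` lines as the joined output in the example, with ``0`` filling the fields of files that lack the key.
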
-----

*multijoin* was written by A. Gordon (gordon at cshl dot edu)

  </help>
</tool>