NAME
    MergeTextFiles.pl - Merge multiple CSV or TSV text files into a single
    text file

SYNOPSIS
    MergeTextFiles.pl TextFiles...

    MergeTextFiles.pl [-h, --help] [--indelim comma | semicolon] [-c,
    --columns colnum,...;... | collabel,...;...] [-k, --keys colnum,...;...
    | collabel,...;...] [-m, --mode colnum | collabel] [-o, --overwrite]
    [--outdelim comma | tab | semicolon] [-q, --quote yes | no] [-r, --root
    rootname] [-s, --startcol colnum | collabel] [--startcolmode before |
    after] [-w, --workingdir dirname] TextFiles...

DESCRIPTION
    Merge multiple CSV or TSV *TextFiles* into the first *TextFile* to
    generate a single text file. Unless the -k, --keys option is used, data
    rows from the other *TextFiles* are added to the first *TextFile* in
    sequential order, and the number of rows in the first *TextFile*
    determines how many rows of data are added from the other *TextFiles*.

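The sequential merge described above can be sketched in a few lines of Python. This is only an illustration of the row-by-row behavior, not the script's actual code: the function name merge_sequential is hypothetical, and padding short files with empty strings is an assumption, since the text does not say what happens when another file has fewer rows than the first.

```python
def merge_sequential(first_rows, *other_tables):
    """Append the columns of each other table to first_rows, row by row.

    The row count of first_rows governs the result. Rows beyond that count
    in the other tables are ignored; missing rows are padded with empty
    strings (an assumption, not documented behavior).
    """
    merged = []
    for i, row in enumerate(first_rows):
        new_row = list(row)
        for table in other_tables:
            width = len(table[0]) if table else 0
            new_row.extend(table[i] if i < len(table) else [""] * width)
        merged.append(new_row)
    return merged


first = [["ID", "MW"], ["1", "180.2"], ["2", "46.1"]]
second = [["LogP"], ["1.2"], ["-0.3"], ["9.9"]]  # the extra row is ignored
merged = merge_sequential(first, second)
```

Here the fourth row of the second table is dropped because the first table has only three rows.
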
    Multiple *TextFiles* names are separated by spaces. The valid file
    extensions are *.csv* and *.tsv* for comma/semicolon and tab delimited
    text files respectively. All other file names are ignored. All the text
    files in the current directory can be specified by **.csv*, **.tsv*, or
    the current directory name. The --indelim option determines the format
    of *TextFiles*. Any file which doesn't correspond to the format
    indicated by the --indelim option is ignored.

OPTIONS
    -h, --help
        Print this help message.

    --indelim *comma | semicolon*
        Input delimiter for CSV *TextFile(s)*. Possible values: *comma or
        semicolon*. Default value: *comma*. For TSV files, this option is
        ignored and *tab* is used as a delimiter.

    -c, --columns *colnum,...;... | collabel,...;...*
        This value is mode specific. It is a list of columns to merge into
        the first text file, specified by column numbers or labels for each
        text file and delimited by ";". All specified text files are merged
        into the first text file.

        Default value: *all;all;...*. By default, all columns from the
        specified text files are merged into the first text file.

        For *colnum* mode, input value format is:
        *colnum,...;colnum,...;...*. Example:

            "1,2;1,3,4;7,8,9"

        For *collabel* mode, input value format is:
        *collabel,...;collabel,...;...*. Example:

            "MW,SumNO;SumNHOH,ClogP,PSA;MolName,Mol_Id,Extreg"

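To make the -c, --columns value format concrete, the following Python sketch splits such a value into one entry per text file. The function name parse_column_spec is hypothetical and the code is not taken from the script; it only mirrors the ";" and "," separators and the "all" keyword described above.

```python
def parse_column_spec(spec, mode="colnum"):
    """Split a --columns style value into one entry per text file.

    Each ";"-separated group belongs to one file. A group is either the
    keyword "all" or a ","-separated list of column numbers or labels.
    """
    per_file = []
    for group in spec.split(";"):
        items = [item.strip() for item in group.split(",") if item.strip()]
        if items == ["all"]:
            per_file.append("all")
        elif mode == "colnum":
            per_file.append([int(item) for item in items])
        else:
            per_file.append(items)
    return per_file


by_number = parse_column_spec("1,2;1,3,4;7,8,9")
by_label = parse_column_spec(
    "MW,SumNO;SumNHOH,ClogP,PSA;MolName,Mol_Id,Extreg", mode="collabel"
)
```

The sketch treats a group consisting only of "all" as the keyword in both modes, so a column that is literally labeled "all" would need separate handling.
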
    -k, --keys *colnum,...;... | collabel,...;...*
        This value is mode specific. It specifies the column keys to use for
        merging all specified text files into the first text file. The
        column keys are specified by column numbers or labels for each text
        file, delimited by ";".

        By default, data rows from text files are merged into the first file
        in the order they appear.

        For *colnum* mode, input value format is:
        *colkeynum;colkeynum;...*. Example:

            "1;3;7"

        For *collabel* mode, input value format is:
        *colkeylabel;colkeylabel;...*. Example:

            "Mol_Id;Mol_Id;Cmpd_Id"

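A key-based merge differs from the default sequential merge in that rows are paired by the value in a key column rather than by position. The Python sketch below shows the idea for data rows only (header rows would be handled separately). The function name merge_by_key is hypothetical, and several choices are assumptions rather than documented behavior: key indices are 0-based, the first row wins when a key repeats, and unmatched rows are padded with empty strings.

```python
def merge_by_key(first_rows, first_key, other_rows, other_key):
    """Append the row of other_rows whose key equals the key in first_rows.

    first_key and other_key are 0-based column indices. Rows of first_rows
    without a match are padded with empty strings (an assumption).
    """
    width = len(other_rows[0]) if other_rows else 0
    lookup = {}
    for row in other_rows:
        # Keep the first row seen for each key value.
        lookup.setdefault(row[other_key], row)
    merged = []
    for row in first_rows:
        match = lookup.get(row[first_key])
        extra = list(match) if match is not None else [""] * width
        merged.append(list(row) + extra)
    return merged


first = [["M1", "C2H6O"], ["M2", "CH4"], ["M3", "H2O"]]
other = [["M3", "18.02"], ["M1", "46.07"]]
merged_by_key = merge_by_key(first, 0, other, 0)
```

Row order follows the first table, so the differing row order in the second table does not matter.
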
    -m, --mode *colnum | collabel*
        Specify how to merge text files: using column numbers or column
        labels. Possible values: *colnum or collabel*. Default value:
        *colnum*.

    -o, --overwrite
        Overwrite existing files.

    --outdelim *comma | tab | semicolon*
        Output text file delimiter. Possible values: *comma, tab, or
        semicolon*. Default value: *comma*.

    -q, --quote *yes | no*
        Put quotes around column values in output text file. Possible
        values: *yes or no*. Default value: *yes*.

    -r, --root *rootname*
        New text file name is generated using the root: <Root>.<Ext>.
        Default file name: <FirstTextFileName>1To<Count>Merged.<Ext>. The
        csv and tsv <Ext> values are used for comma/semicolon and tab
        delimited text files respectively.

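The output naming rule for -r, --root can be written out as a small Python sketch. The function name output_file_name is hypothetical, and two readings are assumptions: that <Count> is the number of input text files, and that <FirstTextFileName> is the first file's name without its extension.

```python
import os


def output_file_name(text_files, root=None, outdelim="comma"):
    """Build the output file name from the root and the output delimiter.

    Assumes <Count> is the number of input files and <FirstTextFileName>
    is the first file's base name without extension.
    """
    ext = "tsv" if outdelim == "tab" else "csv"
    if root:
        return f"{root}.{ext}"
    first_root = os.path.splitext(os.path.basename(text_files[0]))[0]
    return f"{first_root}1To{len(text_files)}Merged.{ext}"


files = ["Sample1.csv", "Sample2.csv", "Sample3.csv"]
named = output_file_name(files, root="NewSample", outdelim="tab")
default_name = output_file_name(files)
```

Under these assumptions, the first call yields NewSample.tsv and the second yields Sample11To3Merged.csv.
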
    -s, --startcol *colnum | collabel*
        This value is mode specific. It specifies the column in the first
        text file at which merging of the other text files starts. For
        *colnum* mode, specify a column number; for *collabel* mode, specify
        a column label.

        Default value: *last*. Start merge after the last column.

    --startcolmode *before | after*
        Start the merge before or after the -s, --startcol value. Possible
        values: *before or after*. Default value: *after*.

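How -s, --startcol and --startcolmode combine to pick an insertion point can be sketched in Python as follows. The function names insert_position and splice are hypothetical, and the sketch assumes column numbers are 1-based, which is consistent with the examples in this document but is not stated explicitly.

```python
def insert_position(header, startcol="last", startcolmode="after", mode="colnum"):
    """Return the 0-based index at which merged columns would be inserted."""
    if startcol == "last":
        index = len(header) - 1
    elif mode == "colnum":
        index = int(startcol) - 1  # column numbers are assumed to be 1-based
    else:
        index = header.index(startcol)
    return index + 1 if startcolmode == "after" else index


def splice(row, new_values, position):
    """Insert new_values into row at position without modifying row."""
    return row[:position] + new_values + row[position:]


header = ["ID", "Name", "MW", "LogP"]
default_pos = insert_position(header)
before_third = insert_position(header, "3", "before")
spliced = splice(header, ["A", "B"], before_third)
```

With the defaults, the new columns land after the last column; with a start column of 3 and mode before, they land between the second and third columns.
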
    -w, --workingdir *dirname*
        Location of working directory. Default: current directory.

EXAMPLES
    To merge Sample2.csv and Sample3.csv into Sample1.csv and generate
    NewSample.csv, type:

        % MergeTextFiles.pl -r NewSample -o Sample1.csv Sample2.csv
          Sample3.csv

    To merge all Sample*.csv files and generate NewSample.tsv, type:

        % MergeTextFiles.pl -r NewSample --indelim comma --outdelim tab -o
          Sample*.csv

    To merge column numbers "1,2" and "3,4,5" from Sample2.csv and
    Sample3.csv into Sample1.csv starting before column number 3 in
    Sample1.csv and to generate NewSample.csv without quoting column data,
    type:

        % MergeTextFiles.pl -s 3 --startcolmode before -r NewSample -q no
          -m colnum -c "all;1,2;3,4,5" -o Sample1.csv Sample2.csv
          Sample3.csv

    To merge columns "Mol_ID,Formula,MolWeight" and "Mol_ID,NAME,ChemBankID"
    from Sample2.csv and Sample3.csv into Sample1.csv using "Mol_ID" as the
    column key, starting after the last column, and to generate
    NewSample.tsv, type:

        % MergeTextFiles.pl -r NewSample --outdelim tab -k "Mol_ID;Mol_ID;
          Mol_ID" -m collabel -c "all;Mol_ID,Formula,MolWeight;Mol_ID,NAME,
          ChemBankID" -o Sample1.csv Sample2.csv Sample3.csv

AUTHOR
    Manish Sud <msud@san.rr.com>

SEE ALSO
    JoinTextFiles.pl, MergeTextFilesWithSD.pl, ModifyTextFilesFormat.pl,
    SplitTextFiles.pl

COPYRIGHT
    Copyright (C) 2015 Manish Sud. All rights reserved.

    This file is part of MayaChemTools.

    MayaChemTools is free software; you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation; either version 3 of the License, or (at
    your option) any later version.