comparison docs/scripts/txt/MergeTextFiles.txt @ 0:4816e4a8ae95 draft default tip
Uploaded
author | deepakjadmin |
---|---|
date | Wed, 20 Jan 2016 09:23:18 -0500 |
NAME
    MergeTextFiles.pl - Merge multiple CSV or TSV text files into a single
    text file

SYNOPSIS
    MergeTextFiles.pl TextFiles...

    MergeTextFiles.pl [-h, --help] [--indelim comma | semicolon] [-c,
    --columns colnum,...;... | collabel,...;...] [-k, --keys colnum,...;...
    | collabel,...;...] [-m, --mode colnum | collabel] [-o, --overwrite]
    [--outdelim comma | tab | semicolon] [-q, --quote yes | no] [-r, --root
    rootname] [-s, --startcol colnum | collabel] [--startcolmode before |
    after] [-w, --workingdir dirname] TextFiles...

DESCRIPTION
    Merge multiple CSV or TSV *TextFiles* into the first *TextFile* to
    generate a single text file. Unless the -k, --keys option is used, data
    rows from the other *TextFiles* are added to the first *TextFile* in
    sequential order, and the number of rows in the first *TextFile*
    determines how many rows of data are added from the other *TextFiles*.

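    The sequential behavior described above can be pictured with a small
    Python sketch. It is not the tool's Perl code: the function name
    merge_sequential is hypothetical, and padding short files with empty
    values is an assumption, since the text only says that the first file
    sets the row count.

```python
def merge_sequential(first_rows, *other_tables):
    """Append the columns of each other table to first_rows, row by row.

    Every table is a list of rows (lists of strings), header row included.
    The number of output rows always equals len(first_rows).
    """
    merged = []
    for i, row in enumerate(first_rows):
        new_row = list(row)
        for table in other_tables:
            width = len(table[0]) if table else 0
            # Assumption: a file with fewer rows contributes empty values.
            extra = table[i] if i < len(table) else [""] * width
            new_row.extend(extra)
        merged.append(new_row)
    return merged


first = [["ID", "MW"], ["1", "180.2"], ["2", "46.1"]]
second = [["Name"], ["aspirin"], ["ethanol"], ["extra"]]
merged = merge_sequential(first, second)
```

    Here the fourth row of second is dropped because first has only three
    rows.
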
    Multiple *TextFiles* names are separated by spaces. The valid file
    extensions are *.csv* and *.tsv* for comma/semicolon and tab delimited
    text files respectively. All other file names are ignored. All the text
    files in the current directory can be specified by **.csv*, **.tsv*, or
    the current directory name. The --indelim option determines the format
    of *TextFiles*. Any file which doesn't correspond to the format
    indicated by the --indelim option is ignored.

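    As a rough illustration of the extension rules above, the following
    Python sketch maps a file name to its input delimiter. The function
    name input_delimiter is hypothetical, exact lowercase extension
    matching is an assumption, and the tool's check of file contents
    against --indelim is not reproduced here.

```python
import os


def input_delimiter(filename, indelim="comma"):
    """Return the delimiter for filename, or None if the file is ignored."""
    ext = os.path.splitext(filename)[1]
    if ext == ".tsv":
        # For TSV files --indelim is ignored and tab is always used.
        return "\t"
    if ext == ".csv":
        return {"comma": ",", "semicolon": ";"}[indelim]
    # Any other extension is skipped.
    return None


files = ["Sample1.csv", "Sample2.tsv", "Notes.txt"]
delimiters = {name: input_delimiter(name) for name in files}
```
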
OPTIONS
    -h, --help
        Print this help message.

    --indelim *comma | semicolon*
        Input delimiter for CSV *TextFile(s)*. Possible values: *comma or
        semicolon*. Default value: *comma*. For TSV files, this option is
        ignored and *tab* is used as a delimiter.

    -c, --columns *colnum,...;... | collabel,...;...*
        This value is mode specific. It is a list of columns to merge into
        first text file specified by column numbers or labels for each text
        file delimited by ";". All specified text files are merged into
        first text file.

        Default value: *all;all;...*. By default, all columns from specified
        text files are merged into first text file.

        For *colnum* mode, input value format is:
        *colnum,...;colnum,...;...*. Example:

            "1,2;1,3,4;7,8,9"

        For *collabel* mode, input value format is:
        *collabel,...;collabel,...;...*. Example:

            "MW,SumNO;SumNHOH,ClogP,PSA;MolName,Mol_Id,Extreg"

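        The per-file column specification above could be parsed along
        these lines. This is a Python sketch, not the tool's own parser:
        parse_columns and select_columns are hypothetical names, and
        treating column numbers as 1-based is an assumption.

```python
def parse_columns(spec, mode="colnum"):
    """Split a --columns value into one entry per text file."""
    per_file = []
    for part in spec.split(";"):
        if part == "all":
            per_file.append("all")
        elif mode == "colnum":
            per_file.append([int(c) for c in part.split(",")])
        else:
            per_file.append(part.split(","))
    return per_file


def select_columns(rows, cols, mode="colnum"):
    """Keep only the requested columns; rows[0] is the header row."""
    if cols == "all":
        return [list(r) for r in rows]
    if mode == "colnum":
        indices = [c - 1 for c in cols]  # assumption: 1-based numbers
    else:
        indices = [rows[0].index(label) for label in cols]
    return [[r[i] for i in indices] for r in rows]


by_number = parse_columns("1,2;1,3,4;7,8,9")
by_label = parse_columns("MW,SumNO;SumNHOH,ClogP,PSA", mode="collabel")
```
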
    -k, --keys *colnum,...;... | collabel,...;...*
        This value is mode specific. It specifies column keys to use for
        merging all specified text files into first text file. The column
        keys are specified by column numbers or labels for each text file
        delimited by ";".

        By default, data rows from text files are merged into first file in
        the order they appear.

        For *colnum* mode, input value format is: *colkeynum,
        colkeynum;...*. Example:

            "1;3;7"

        For *collabel* mode, input value format is: *colkeylabel,
        colkeylabel;...*. Example:

            "Mol_Id;Mol_Id;Cmpd_Id"

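        A key-based merge of the kind described above can be sketched in
        Python as a lookup on the key column. The name merge_by_key is
        hypothetical, keys are passed as 0-based indices for brevity, and
        two behaviors are assumptions because the text does not specify
        them: unmatched rows get empty values, and the first row wins
        when a key is duplicated.

```python
def merge_by_key(first_rows, first_key, other_rows, other_key):
    """Merge other_rows into first_rows by matching key column values.

    Both tables include a header row; keys are 0-based column indices.
    """
    lookup = {}
    for row in other_rows[1:]:
        # Assumption: keep the first row seen for a duplicated key.
        lookup.setdefault(row[other_key], row)
    width = len(other_rows[0])
    merged = [list(first_rows[0]) + list(other_rows[0])]
    for row in first_rows[1:]:
        # Assumption: rows without a match are padded with empty values.
        match = lookup.get(row[first_key], [""] * width)
        merged.append(list(row) + list(match))
    return merged


first = [["Mol_Id", "MW"], ["m1", "180.2"], ["m2", "46.1"], ["m3", "18.0"]]
other = [["Cmpd_Id", "Name"], ["m2", "ethanol"], ["m1", "aspirin"]]
merged = merge_by_key(first, 0, other, 0)
```

        Unlike the sequential default, row order in the other file does
        not matter here.
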
    -m, --mode *colnum | collabel*
        Specify how to merge text files: using column numbers or column
        labels. Possible values: *colnum or collabel*. Default value:
        *colnum*.

    -o, --overwrite
        Overwrite existing files.

    --outdelim *comma | tab | semicolon*
        Output text file delimiter. Possible values: *comma, tab, or
        semicolon*. Default value: *comma*.

    -q, --quote *yes | no*
        Put quotes around column values in output text file. Possible
        values: *yes or no*. Default value: *yes*.

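        The effect of --outdelim and --quote on the output can be
        approximated with Python's standard csv module. The function name
        format_rows is hypothetical, and mapping "no" to csv.QUOTE_MINIMAL
        (which still quotes values that contain the delimiter) is a
        choice made for this sketch, not documented tool behavior.

```python
import csv
import io


def format_rows(rows, outdelim="comma", quote="yes"):
    """Render rows as delimited text using the chosen output settings."""
    delimiter = {"comma": ",", "tab": "\t", "semicolon": ";"}[outdelim]
    quoting = csv.QUOTE_ALL if quote == "yes" else csv.QUOTE_MINIMAL
    buffer = io.StringIO()
    writer = csv.writer(
        buffer, delimiter=delimiter, quoting=quoting, lineterminator="\n"
    )
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["Mol_Id", "MW"], ["m1", "180.2"]]
quoted_csv = format_rows(rows)
plain_tsv = format_rows(rows, outdelim="tab", quote="no")
```
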
    -r, --root *rootname*
        New text file name is generated using the root: <Root>.<Ext>.
        Default file name: <FirstTextFileName>1To<Count>Merged.<Ext>. The
        csv and tsv <Ext> values are used for comma/semicolon and tab
        delimited text files respectively.

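        The naming rule above can be written out as a short Python
        sketch. The function name output_name is hypothetical. It assumes
        that <FirstTextFileName> is the first file's name without its
        directory and extension, and that <Count> is the number of input
        files; neither is spelled out in the text.

```python
import os


def output_name(text_files, root=None, outdelim="comma"):
    """Build the merged file name from --root or from the first file."""
    # tsv for tab delimited output, csv for comma or semicolon.
    ext = "tsv" if outdelim == "tab" else "csv"
    if root:
        return f"{root}.{ext}"
    # Assumption: strip the directory and extension of the first file.
    first_root = os.path.splitext(os.path.basename(text_files[0]))[0]
    # Assumption: <Count> is the number of input files.
    return f"{first_root}1To{len(text_files)}Merged.{ext}"


files = ["Sample1.csv", "Sample2.csv", "Sample3.csv"]
named = output_name(files, root="NewSample", outdelim="tab")
default = output_name(files)
```
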
    -s, --startcol *colnum | collabel*
        This value is mode specific. It specifies the column in the first
        text file at which merging of the other text files starts. For
        *colnum* mode, specify a column number, and for *collabel* mode,
        specify a column label.

        Default value: *last*. Start merge after the last column.

    --startcolmode *before | after*
        Start the merge before or after the -s, --startcol value. Possible
        values: *before or after*. Default value: *after*.

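        How -s, --startcol and --startcolmode place the merged columns
        can be shown on a single row. This Python sketch uses the
        hypothetical name insert_columns, takes the start column as a
        number only, and assumes 1-based column numbering.

```python
def insert_columns(first_row, new_values, startcol=None, startcolmode="after"):
    """Insert new_values into first_row relative to the start column.

    startcol is a 1-based column number; None stands for the default
    value "last", the last column of the first text file.
    """
    if startcol is None:
        startcol = len(first_row)
    # "before" inserts ahead of the start column, "after" behind it.
    position = startcol - 1 if startcolmode == "before" else startcol
    return first_row[:position] + new_values + first_row[position:]


row = ["a", "b", "c", "d"]
appended = insert_columns(row, ["X", "Y"])
before_third = insert_columns(row, ["X", "Y"], startcol=3, startcolmode="before")
after_third = insert_columns(row, ["X", "Y"], startcol=3)
```
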
    -w, --workingdir *dirname*
        Location of working directory. Default: current directory.

EXAMPLES
    To merge Sample2.csv and Sample3.csv into Sample1.csv and generate
    NewSample.csv, type:

        % MergeTextFiles.pl -r NewSample -o Sample1.csv Sample2.csv
          Sample3.csv

    To merge all Sample*.csv files and generate NewSample.tsv, type:

        % MergeTextFiles.pl -r NewSample --indelim comma --outdelim tab -o
          Sample*.csv

    To merge column numbers "1,2" and "3,4,5" from Sample2.csv and
    Sample3.csv into Sample1.csv starting before column number 3 in
    Sample1.csv and to generate NewSample.csv without quoting column data,
    type:

        % MergeTextFiles.pl -s 3 --startcolmode before -r NewSample -q no
          -m colnum -c "all;1,2;3,4,5" -o Sample1.csv Sample2.csv
          Sample3.csv

    To merge columns "Mol_ID,Formula,MolWeight" and "Mol_ID,NAME,ChemBankID"
    from Sample2.csv and Sample3.csv into Sample1.csv using "Mol_ID" as the
    column key, starting after the last column, and to generate
    NewSample.tsv, type:

        % MergeTextFiles.pl -r NewSample --outdelim tab
          -k "Mol_ID;Mol_ID;Mol_ID" -m collabel
          -c "all;Mol_ID,Formula,MolWeight;Mol_ID,NAME,ChemBankID"
          -o Sample1.csv Sample2.csv Sample3.csv

AUTHOR
    Manish Sud <msud@san.rr.com>

SEE ALSO
    JoinTextFiles.pl, MergeTextFilesWithSD.pl, ModifyTextFilesFormat.pl,
    SplitTextFiles.pl

COPYRIGHT
    Copyright (C) 2015 Manish Sud. All rights reserved.

    This file is part of MayaChemTools.

    MayaChemTools is free software; you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation; either version 3 of the License, or (at
    your option) any later version.