comparison docs/scripts/txt/MergeTextFiles.txt @ 0:4816e4a8ae95 draft default tip
Uploaded
| author | deepakjadmin |
|---|---|
| date | Wed, 20 Jan 2016 09:23:18 -0500 |
| parents | |
| children |
| -1:000000000000 | 0:4816e4a8ae95 |
|---|---|
| 1 NAME | |
| 2 MergeTextFiles.pl - Merge multiple CSV or TSV text files into a single | |
| 3 text file | |
| 4 | |
| 5 SYNOPSIS | |
| 6 MergeTextFiles.pl TextFiles... | |
| 7 | |
| 8 MergeTextFiles.pl [-h, --help] [--indelim comma | semicolon] [-c, | |
| 9 --columns colnum,...;... | collabel,...;...] [-k, --keys colnum,...;... | |
| 10 | collabel,...;...] [-m, --mode colnum | collabel] [-o, --overwrite] | |
| 11 [--outdelim comma | tab | semicolon] [-q, --quote yes | no] [-r, --root | |
| 12 rootname] [-s, --startcol colnum | collabel] [--startcolmode before | | |
| 13 after] [-w, --workingdir dirname] TextFiles... | |
| 14 | |
| 15 DESCRIPTION | |
| 16 Merge multiple CSV or TSV *TextFiles* into the first *TextFile* to | |
| 17 generate a single text file. Unless the -k, --keys option is used, data | |
| 18 rows from the other *TextFiles* are added to the first *TextFile* in | |
| 19 sequential order, and the number of rows in the first *TextFile* | |
| 20 determines how many rows of data are added from the other *TextFiles*. | |
| 21 | |
| 22 Multiple *TextFile* names are separated by spaces. The valid file | |
| 23 extensions are *.csv* and *.tsv* for comma/semicolon and tab delimited | |
| 24 text files respectively. All other file names are ignored. All the text | |
| 25 files in the current directory can be specified by **.csv*, **.tsv*, or | |
| 26 the current directory name. The --indelim option determines the format | |
| 27 of *TextFiles*. Any file that doesn't correspond to the format | |
| 28 indicated by the --indelim option is ignored. | |
| 29 | |
| 30 OPTIONS | |
| 31 -h, --help | |
| 32 Print this help message. | |
| 33 | |
| 34 --indelim *comma | semicolon* | |
| 35 Input delimiter for CSV *TextFile(s)*. Possible values: *comma or | |
| 36 semicolon*. Default value: *comma*. For TSV files, this option is | |
| 37 ignored and *tab* is used as a delimiter. | |
| 38 | |
| 39 -c, --columns *colnum,...;... | collabel,...;...* | |
| 40 This value is mode specific. It is a list of columns to merge into | |
| 41 first text file specified by column numbers or labels for each text | |
| 42 file delimited by ";". All specified text files are merged into | |
| 43 first text file. | |
| 44 | |
| 45 Default value: *all;all;...*. By default, all columns from specified | |
| 46 text files are merged into first text file. | |
| 47 | |
| 48 For *colnum* mode, input value format is: | |
| 49 *colnum,...;colnum,...;...*. Example: | |
| 50 | |
| 51 "1,2;1,3,4;7,8,9" | |
| 52 | |
| 53 For *collabel* mode, input value format is: | |
| 54 *collabel,...;collabel,...;...*. Example: | |
| 55 | |
| 56 "MW,SumNO;SumNHOH,ClogP,PSA;MolName,Mol_Id,Extreg" | |
| 57 | |
| 58 -k, --keys *colnum,...;... | collabel,...;...* | |
| 59 This value is mode specific. It specifies column keys to use for | |
| 60 merging all specified text files into first text file. The column | |
| 61 keys are specified by column numbers or labels for each text file | |
| 62 delimited by ";". | |
| 63 | |
| 64 By default, data rows from text files are merged into first file in | |
| 65 the order they appear. | |
| 66 | |
| 67 For *colnum* mode, input value format is: *colkeynum, colkeynum;...*. | |
| 68 Example: | |
| 69 | |
| 70 "1;3;7" | |
| 71 | |
| 72 For *collabel* mode, input value format is: *colkeylabel, | |
| 73 colkeylabel;...*. Example: | |
| 74 | |
| 75 "Mol_Id;Mol_Id;Cmpd_Id" | |
| 76 | |
| 77 -m, --mode *colnum | collabel* | |
| 78 Specify how to merge text files: using column numbers or column | |
| 79 labels. Possible values: *colnum or collabel*. Default value: | |
| 80 *colnum*. | |
| 81 | |
| 82 -o, --overwrite | |
| 83 Overwrite existing files. | |
| 84 | |
| 85 --outdelim *comma | tab | semicolon* | |
| 86 Output text file delimiter. Possible values: *comma, tab, or | |
| 87 semicolon*. Default value: *comma*. | |
| 88 | |
| 89 -q, --quote *yes | no* | |
| 90 Put quotes around column values in output text file. Possible | |
| 91 values: *yes or no*. Default value: *yes*. | |
| 92 | |
| 93 -r, --root *rootname* | |
| 94 New text file name is generated using the root: <Root>.<Ext>. | |
| 95 Default file name: <FirstTextFileName>1To<Count>Merged.<Ext>. The | |
| 96 csv and tsv <Ext> values are used for comma/semicolon and tab | |
| 97 delimited text files respectively. | |
| 98 | |
| 99 -s, --startcol *colnum | collabel* | |
| 100 This value is mode specific. It specifies the column in the first | |
| 101 text file at which merging of other text files starts. For *colnum* | |
| 102 mode, specify a column number, and for *collabel* mode, specify a column | |
| 103 label. | |
| 104 | |
| 105 Default value: *last*. Start merge after the last column. | |
| 106 | |
| 107 --startcolmode *before | after* | |
| 108 Start the merge before or after the -s, --startcol value. Possible | |
| 109 values: *before or after*. Default value: *after*. | |
| 110 | |
| 111 -w, --workingdir *dirname* | |
| 112 Location of working directory. Default: current directory. | |
| 113 | |
| 114 EXAMPLES | |
| 115 To merge Sample2.csv and Sample3.csv into Sample1.csv and generate | |
| 116 NewSample.csv, type: | |
| 117 | |
| 118 % MergeTextFiles.pl -r NewSample -o Sample1.csv Sample2.csv | |
| 119 Sample3.csv | |
| 120 | |
| 121 To merge all Sample*.csv files and generate NewSample.tsv file, type: | |
| 122 | |
| 123 % MergeTextFiles.pl -r NewSample --indelim comma --outdelim tab -o | |
| 124 Sample*.csv | |
| 125 | |
| 126 To merge column numbers "1,2" and "3,4,5" from Sample2.csv and | |
| 127 Sample3.csv into Sample1.csv starting before column number 3 in | |
| 128 Sample1.csv and to generate NewSample.csv without quoting column data, | |
| 129 type: | |
| 130 | |
| 131 % MergeTextFiles.pl -s 3 --startcolmode before -r NewSample -q no | |
| 132 -m colnum -c "all;1,2;3,4,5" -o Sample1.csv Sample2.csv | |
| 133 Sample3.csv | |
| 134 | |
| 135 To merge columns "Mol_ID,Formula,MolWeight" and "Mol_ID,NAME,ChemBankID" | |
| 136 from Sample2.csv and Sample3.csv into Sample1.csv using "Mol_ID" as the | |
| 137 column key, starting after the last column, and to generate | |
| 138 NewSample.tsv, type: | |
| 139 | |
| 140 % MergeTextFiles.pl -r NewSample --outdelim tab -m collabel | |
| 141 -k "Mol_ID;Mol_ID;Mol_ID" -c "all;Mol_ID,Formula,MolWeight;Mol_ID,NAME,ChemBankID" | |
| 142 -o Sample1.csv Sample2.csv Sample3.csv | |
| 143 | |
| 144 AUTHOR | |
| 145 Manish Sud <msud@san.rr.com> | |
| 146 | |
| 147 SEE ALSO | |
| 148 JoinTextFiles.pl, MergeTextFilesWithSD.pl, ModifyTextFilesFormat.pl, | |
| 149 SplitTextFiles.pl | |
| 150 | |
| 151 COPYRIGHT | |
| 152 Copyright (C) 2015 Manish Sud. All rights reserved. | |
| 153 | |
| 154 This file is part of MayaChemTools. | |
| 155 | |
| 156 MayaChemTools is free software; you can redistribute it and/or modify it | |
| 157 under the terms of the GNU Lesser General Public License as published by | |
| 158 the Free Software Foundation; either version 3 of the License, or (at | |
| 159 your option) any later version. | |
| 160 |
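
The difference between the default sequential merge and the key-based merge selected by -k, --keys (see DESCRIPTION and OPTIONS) may be easier to see in a small sketch. The Python below is not MergeTextFiles.pl's implementation; it only illustrates the documented behavior on in-memory rows. The function name `merge_tables`, the 0-based key indices, padding unmatched rows with empty cells, and letting the last duplicate key win are all assumptions made for this illustration.

```python
def merge_tables(first, others, key_cols=None):
    """Merge the columns of `others` into `first`.

    Each table is a list of rows; row 0 is the header line.
    key_cols: None for a sequential merge, otherwise one 0-based key
    column index per table, first table included (hypothetical API).
    """
    merged = [list(row) for row in first]
    for t, table in enumerate(others, start=1):
        header, data = table[0], table[1:]
        # Assumption: rows without a partner are padded with empty cells.
        blank = [""] * len(header)
        merged[0].extend(header)
        if key_cols is None:
            # Sequential: row i of the other table joins row i of the first;
            # the first table's row count decides how many rows are used.
            for i, row in enumerate(merged[1:]):
                row.extend(data[i] if i < len(data) else blank)
        else:
            # Keyed: look up each first-table row by its key value.
            # Assumption: for duplicate keys the last row wins.
            by_key = {r[key_cols[t]]: r for r in data}
            for row in merged[1:]:
                row.extend(by_key.get(row[key_cols[0]], blank))
    return merged


first = [["Mol_ID", "MW"], ["m1", "10"], ["m2", "20"], ["m3", "30"]]
second = [["Mol_ID", "Formula"], ["m3", "C3"], ["m1", "C1"]]

sequential = merge_tables(first, [second])
keyed = merge_tables(first, [second], key_cols=[0, 0])
```

In the sequential result the second data row pairs `m2` with the unrelated `m1` row of the second table, while the keyed result lines rows up by `Mol_ID` and leaves `m2` padded, which is why a key column is worth specifying whenever the input files are not already in the same row order.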
