diff docs/scripts/txt/MergeTextFiles.txt @ 0:4816e4a8ae95 draft default tip
Uploaded
author | deepakjadmin |
---|---|
date | Wed, 20 Jan 2016 09:23:18 -0500 |
parents | |
children |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/scripts/txt/MergeTextFiles.txt Wed Jan 20 09:23:18 2016 -0500
@@ -0,0 +1,160 @@
+NAME
+    MergeTextFiles.pl - Merge multiple CSV or TSV text files into a single
+    text file
+
+SYNOPSIS
+    MergeTextFiles.pl TextFiles...
+
+    MergeTextFiles.pl [-h, --help] [--indelim comma | semicolon] [-c,
+    --columns colnum,...;... | collabel,...;...] [-k, --keys colnum,...;...
+    | collabel,...;...] [-m, --mode colnum | collabel] [-o, --overwrite]
+    [--outdelim comma | tab | semicolon] [-q, --quote yes | no] [-r, --root
+    rootname] [-s, --startcol colnum | collabel] [--startcolmode before |
+    after] [-w, --workingdir dirname] TextFiles...
+
+DESCRIPTION
+    Merge multiple CSV or TSV *TextFiles* into first *TextFile* to generate
+    a single text file. Unless -k --keys option is used, data rows from
+    other *TextFiles* are added to first *TextFile* in a sequential order,
+    and the number of rows in first *TextFile* is used to determine how many
+    rows of data are added from other *TextFiles*.
+
+    Multiple *TextFiles* names are separated by space. The valid file
+    extensions are *.csv* and *.tsv* for comma/semicolon and tab delimited
+    text files respectively. All other file names are ignored. All the text
+    files in a current directory can be specified by **.csv*, **.tsv*, or
+    the current directory name. The --indelim option determines the format
+    of *TextFiles*. Any file which doesn't correspond to the format
+    indicated by --indelim option is ignored.
+
+OPTIONS
+    -h, --help
+        Print this help message.
+
+    --indelim *comma | semicolon*
+        Input delimiter for CSV *TextFile(s)*. Possible values: *comma or
+        semicolon*. Default value: *comma*. For TSV files, this option is
+        ignored and *tab* is used as a delimiter.
+
+    -c, --columns *colnum,...;... | collabel,...;...*
+        This value is mode specific. It is a list of columns to merge into
+        first text file specified by column numbers or labels for each text
+        file delimited by ";". All specified text files are merged into
+        first text file.
+
+        Default value: *all;all;...*. By default, all columns from specified
+        text files are merged into first text file.
+
+        For *colnum* mode, input value format is:
+        *colnum,...;colnum,...;...*. Example:
+
+            "1,2;1,3,4;7,8,9"
+
+        For *collabel* mode, input value format is:
+        *collabel,...;collabel,...;...*. Example:
+
+            "MW,SumNO;SumNHOH,ClogP,PSA;MolName,Mol_Id,Extreg"
+
+    -k, --keys *colnum,...;... | collabel,...;...*
+        This value is mode specific. It specifies column keys to use for
+        merging all specified text files into first text file. The column
+        keys are specified by column numbers or labels for each text file
+        delimited by ";".
+
+        By default, data rows from text files are merged into first file in
+        the order they appear.
+
+        For *colnum* mode, input value format is:*colkeynum, colkeynum;...*.
+        Example:
+
+            "1;3;7"
+
+        For *collabel* mode, input value format is:*colkeylabel,
+        colkeylabel;...*. Example:
+
+            "Mol_Id;Mol_Id;Cmpd_Id"
+
+    -m, --mode *colnum | collabel*
+        Specify how to merge text files: using column numbers or column
+        labels. Possible values: *colnum or collabel*. Default value:
+        *colnum*.
+
+    -o, --overwrite
+        Overwrite existing files.
+
+    --outdelim *comma | tab | semicolon*
+        Output text file delimiter. Possible values: *comma, tab, or
+        semicolon* Default value: *comma*.
+
+    -q, --quote *yes | no*
+        Put quotes around column values in output text file. Possible
+        values: *yes or no*. Default value: *yes*.
+
+    -r, --root *rootname*
+        New text file name is generated using the root: <Root>.<Ext>.
+        Default file name: <FirstTextFileName>1To<Count>Merged.<Ext>. The
+        csv, and tsv <Ext> values are used for comma/semicolon, and tab
+        delimited text files respectively.
+
+    -s, --startcol *colnum | collabel*
+        This value is mode specific. It specifies the column in first text
+        file which is used for start merging other text files.For *colnum*
+        mode, specify column number and for *collabel* mode, specify column
+        label.
+
+        Default value: *last*. Start merge after the last column.
+
+    --startcolmode *before | after*
+        Start the merge before or after the -s, --startcol value. Possible
+        values: *before or after* Default value: *after*.
+
+    -w, --workingdir *dirname*
+        Location of working directory. Default: current directory.
+
+EXAMPLES
+    To merge Sample2.csv and Sample3.csv into Sample1.csv and generate
+    NewSample.csv, type:
+
+        % MergeTextFiles.pl -r NewSample -o Sample1.csv Sample2.csv
+        Sample3.csv
+
+    To merge all Sample*.tsv and generate NewSample.tsv file, type:
+
+        % MergeTextFiles.pl -r NewSample --indelim comma --outdelim tab -o
+        Sample*.csv
+
+    To merge column numbers "1,2" and "3,4,5" from Sample2.csv and
+    Sample3.csv into Sample1.csv starting before column number 3 in
+    Sample1.csv and to generate NewSample.csv without quoting column data,
+    type:
+
+        % MergeTextFiles.pl -s 3 --startcolmode before -r NewSample -q no
+        -m colnum -c "all;1,2;3,4,5" -o Sample1.csv Sample2.csv
+        Sample3.csv
+
+    To merge column "Mol_ID,Formula,MolWeight" and "Mol_ID,NAME,ChemBankID"
+    from Sample2.csv and Sample3.csv into Sample1.csv using "Mol_ID" as a
+    column keys starting after the last column and to generate
+    NewSample.tsv, type:
+
+        % MergeTextFiles.pl -r NewSample --outdelim tab -k "Mol_ID;Mol_ID;
+        Mol_ID" -m collabel -c "all;Mol_ID,Formula,MolWeight;Mol_ID,NAME,
+        ChemBankID" -o Sample1.csv Sample2.csv Sample3.csv
+
+AUTHOR
+    Manish Sud <msud@san.rr.com>
+
+SEE ALSO
+    JoinTextFiles.pl, MergeTextFilesWithSD.pl, ModifyTextFilesFormat.pl,
+    SplitTextFiles.pl
+
+COPYRIGHT
+    Copyright (C) 2015 Manish Sud. All rights reserved.
+
+    This file is part of MayaChemTools.
+
+    MayaChemTools is free software; you can redistribute it and/or modify it
+    under the terms of the GNU Lesser General Public License as published by
+    the Free Software Foundation; either version 3 of the License, or (at
+    your option) any later version.
+