comparison docs/scripts/txt/MergeTextFilesWithSD.txt @ 0:4816e4a8ae95 draft default tip
Uploaded
| author | deepakjadmin |
|---|---|
| date | Wed, 20 Jan 2016 09:23:18 -0500 |
| -1:000000000000 | 0:4816e4a8ae95 |
|---|---|
| 1 NAME | |
| 2 MergeTextFilesWithSD.pl - Merge CSV or TSV TextFile(s) into SDFile | |
| 3 | |
| 4 SYNOPSIS | |
| 5 MergeTextFilesWithSD.pl SDFile TextFile(s)... | |
| 6 | |
| 7 MergeTextFilesWithSD.pl [-h, --help] [--indelim comma | semicolon] [-c, | |
| 8 --columns colnum,...;... | collabel,...;...] [-k, --keys colkeynum;... | | |
| 9 colkeylabel;...] [-m, --mode colnum | collabel] [-o, --overwrite] [-r, | |
| 10 --root rootname] [-s, --sdkey sdfieldname] [-w, --workingdir dirname] | |
| 11 SDFile TextFile(s)... | |
| 12 | |
| 13 DESCRIPTION | |
| 14 Merge multiple CSV or TSV *TextFile(s)* into *SDFile*. Unless the -k, | |
| 15 --keys option is used, data rows from all *TextFile(s)* are added to | |
| 16 *SDFile* in sequential order, and the number of compounds in *SDFile* | |
| 17 determines how many rows of data are added from *TextFile(s)*. | |
| 18 | |
| 19 Multiple *TextFile(s)* names are separated by spaces. The valid file | |
| 20 extensions are *.csv* and *.tsv* for comma/semicolon and tab delimited | |
| 21 text files respectively. All other file names are ignored. All the text | |
| 22 files in the current directory can be specified by **.csv*, **.tsv*, or | |
| 23 the current directory name. The --indelim option determines the format | |
| 24 of *TextFile(s)*. Any file that doesn't match the format indicated by | |
| 25 the --indelim option is ignored. | |
| 26 | |
| 27 OPTIONS | |
| 28 -h, --help | |
| 29 Print this help message. | |
| 30 | |
| 31 --indelim *comma | semicolon* | |
| 32 Input delimiter for CSV *TextFile(s)*. Possible values: *comma or | |
| 33 semicolon*. Default value: *comma*. For TSV files, this option is | |
| 34 ignored and *tab* is used as a delimiter. | |
| 35 | |
| 36 -c, --columns *colnum,...;... | collabel,...;...* | |
| 37 This value is mode specific. It is a list of columns to merge into | |
| 38 *SDFile* specified by column numbers or labels for each text file | |
| 39 delimited by ";". All *TextFile(s)* are merged into *SDFile*. | |
| 40 | |
| 41 Default value: *all;all;...*. By default, all columns from | |
| 42 TextFile(s) are merged into *SDFile*. | |
| 43 | |
| 44 For *colnum* mode, input value format is: | |
| 45 *colnum,...;colnum,...;...*. Example: | |
| 46 | |
| 47 "1,2;1,3,4;7,8,9" | |
| 48 | |
| 49 For *collabel* mode, input value format is: | |
| 50 *collabel,...;collabel,...;...*. Example: | |
| 51 | |
| 52 "MW,SumNO;SumNHOH,ClogP,PSA;MolName,Mol_Id,Extreg" | |
| 53 | |
| 54 -k, --keys *colkeynum;... | colkeylabel;...* | |
| 55 This value is mode specific. It specifies column keys to use for | |
| 56 merging *TextFile(s)* into *SDFile*. The column keys, delimited by | |
| 57 ";", are specified by column numbers or labels for *TextFile(s)*. | |
| 58 | |
| 59 By default, data rows from *TextFile(s)* are merged into *SDFile* in | |
| 60 the order they appear. | |
| 61 | |
| 62 For *colnum* mode, input value format is: *colkeynum;colkeynum;...*. | |
| 63 Example: | |
| 64 | |
| 65 "1;3;7" | |
| 66 | |
| 67 For *collabel* mode, input value format is: | |
| 68 *colkeylabel;colkeylabel;...*. Example: | |
| 69 | |
| 70 "Mol_Id;Mol_Id;Cmpd_Id" | |
| 71 | |
| 72 -m, --mode *colnum | collabel* | |
| 73 Specify how to merge *TextFile(s)* into *SDFile*: using column | |
| 74 numbers or column labels. Possible values: *colnum or collabel*. | |
| 75 Default value: *colnum*. | |
| 76 | |
| 77 -o, --overwrite | |
| 78 Overwrite existing files. | |
| 79 | |
| 80 -r, --root *rootname* | |
| 81 New SD file name is generated using the root: <Root>.sdf. Default | |
| 82 file name: | |
| 83 <InitialSDFileName>MergedWith<FirstTextFileName>1To<Count>.sdf. | |
| 84 | |
| 85 -s, --sdkey *sdfieldname* | |
| 86 *SDFile* data field name used as a key to merge data from | |
| 87 TextFile(s). By default, data rows from *TextFile(s)* are merged | |
| 88 into *SDFile* in the order they appear. | |
| 89 | |
| 90 -w, --workingdir *dirname* | |
| 91 Location of working directory. Default: current directory. | |
| 92 | |
| 93 EXAMPLES | |
| 94 To merge Sample1.csv and Sample2.csv into Sample.sdf and generate | |
| 95 NewSample.sdf, type: | |
| 96 | |
| 97 % MergeTextFilesWithSD.pl -r NewSample -o Sample.sdf | |
| 98 Sample1.csv Sample2.csv | |
| 99 | |
| 100 To merge all Sample*.tsv into Sample.sdf and generate NewSample.sdf | |
| 101 file, type: | |
| 102 | |
| 103 % MergeTextFilesWithSD.pl -r NewSample -o Sample.sdf | |
| 104 Sample*.tsv | |
| 105 | |
| 106 To merge column numbers "1,2" and "3,4,5" from Sample1.csv and | |
| 107 Sample2.csv into Sample.sdf and generate NewSample.sdf, type: | |
| 108 | |
| 109 % MergeTextFilesWithSD.pl -r NewSample -m colnum -c "1,2;3,4,5" | |
| 110 -o Sample.sdf Sample1.csv Sample2.csv | |
| 111 | |
| 112 To merge columns "Mol_ID,Formula,MolWeight" and "Mol_ID,ChemBankID,NAME" | |
| 113 from Sample1.csv and Sample2.csv into Sample.sdf, using "Mol_ID" as the SD | |
| 114 key and the column key for both files, and generate NewSample.sdf, type: | |
| 115 | |
| 116 % MergeTextFilesWithSD.pl -r NewSample -s Mol_ID -k "Mol_ID;Mol_ID" | |
| 117 -m collabel -c "Mol_ID,Formula,MolWeight;Mol_ID,ChemBankID,NAME" | |
| 118 -o Sample.sdf Sample1.csv Sample2.csv | |
| 119 | |
| 120 AUTHOR | |
| 121 Manish Sud <msud@san.rr.com> | |
| 122 | |
| 123 SEE ALSO | |
| 124 ExtractFromSDFiles.pl, FilterSDFiles.pl, InfoSDFiles.pl, JoinSDFiles.pl, | |
| 125 JoinTextFiles.pl, MergeTextFiles.pl, ModifyTextFilesFormat.pl, | |
| 126 SplitSDFiles.pl, SplitTextFiles.pl | |
| 127 | |
| 128 COPYRIGHT | |
| 129 Copyright (C) 2015 Manish Sud. All rights reserved. | |
| 130 | |
| 131 This file is part of MayaChemTools. | |
| 132 | |
| 133 MayaChemTools is free software; you can redistribute it and/or modify it | |
| 134 under the terms of the GNU Lesser General Public License as published by | |
| 135 the Free Software Foundation; either version 3 of the License, or (at | |
| 136 your option) any later version. | |
| 137 |

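The DESCRIPTION and the -k and -s options describe two ways of attaching text-file rows to compounds: sequentially, limited by the number of compounds, or by matching a text-file key column against an SD data field. The Python sketch below illustrates both under stated assumptions. It models each compound as a plain dict of data fields, the name `merge_rows` and its parameters are hypothetical, and letting text-file values overwrite existing fields of the same name is a choice made for this example, not documented behavior of the script.

```python
import csv
import io


def merge_rows(sd_records, text, key_label=None, sd_key=None, delimiter=","):
    """Merge rows of delimited text into per-compound data-field dicts.

    sd_records: list of dicts, one per compound (field name -> value).
    If key_label and sd_key are both given, rows are matched on the key
    value; otherwise row i goes to compound i and extra rows are dropped.
    """
    rows = list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
    if key_label and sd_key:
        by_key = {row[key_label]: row for row in rows}
        matched = [by_key.get(rec.get(sd_key)) for rec in sd_records]
    else:
        # Sequential mode: pad with None when there are fewer rows than compounds.
        padding = [None] * max(0, len(sd_records) - len(rows))
        matched = rows[: len(sd_records)] + padding
    merged = []
    for rec, row in zip(sd_records, matched):
        out = dict(rec)
        if row is not None:
            out.update(row)  # Assumption: text-file values win on name clashes.
        merged.append(out)
    return merged


compounds = [{"Mol_ID": "C2"}, {"Mol_ID": "C1"}]

keyed = merge_rows(compounds, "Mol_ID,Formula\nC1,H2O\nC2,CO2\nC3,CH4\n",
                   key_label="Mol_ID", sd_key="Mol_ID")
print(keyed)

sequential = merge_rows(compounds, "MW\n18.0\n44.0\n16.0\n")
print(sequential)
```

In the keyed call, compound C2 receives the CO2 row even though it comes first in the compound list; in the sequential call, the third row is ignored because there are only two compounds.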