comparison docs/scripts/txt/MergeTextFilesWithSD.txt @ 0:4816e4a8ae95 draft default tip

Uploaded
author deepakjadmin
date Wed, 20 Jan 2016 09:23:18 -0500
-1:000000000000 0:4816e4a8ae95
1 NAME
2 MergeTextFilesWithSD.pl - Merge CSV or TSV TextFile(s) into SDFile
3
4 SYNOPSIS
5 MergeTextFilesWithSD.pl SDFile TextFile(s)...
6
7 MergeTextFilesWithSD.pl [-h, --help] [--indelim comma | semicolon] [-c,
8 --columns colnum,...;... | collabel,...;...] [-k, --keys colkeynum;... |
9 colkeylabel;...] [-m, --mode colnum | collabel] [-o, --overwrite] [-r,
10 --root rootname] [-s, --sdkey sdfieldname] [-w, --workingdir dirname]
11 SDFile TextFile(s)...
12
13 DESCRIPTION
14 Merge multiple CSV or TSV *TextFile(s)* into *SDFile*. Unless the -k,
15 --keys option is used, data rows from all *TextFile(s)* are added to
16 *SDFile* in sequential order, and the number of compounds in *SDFile*
17 determines how many rows of data are added from *TextFile(s)*.
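
The sequential behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the script's actual Perl implementation: the function name `merge_sequential` and the dict-based data layout are hypothetical. The handling of a text file with fewer rows than compounds (the compound is left unchanged) and of clashing field names (the text value wins) are assumptions that the documentation does not specify.

```python
def merge_sequential(compounds, text_rows):
    """Attach the i-th text row to the i-th compound (hypothetical helper).

    compounds: list of dicts, one per SD record (data field name -> value)
    text_rows: list of dicts, one per text file row (column label -> value)
    """
    merged = []
    # The number of compounds drives the loop, so surplus text rows are
    # never read; this mirrors "the number of compounds ... determines
    # how many rows of data are added".
    for index, fields in enumerate(compounds):
        new_fields = dict(fields)  # copy, so the input is not modified
        if index < len(text_rows):
            # Assumption: on a name clash the text file value wins.
            new_fields.update(text_rows[index])
        merged.append(new_fields)
    return merged


if __name__ == "__main__":
    compounds = [{"Mol_Id": "1"}, {"Mol_Id": "2"}]
    rows = [{"MW": "18.02"}, {"MW": "46.07"}, {"MW": "78.11"}]
    # The third row is ignored because there are only two compounds.
    print(merge_sequential(compounds, rows))
```
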
18
19 Multiple *TextFile(s)* names are separated by spaces. The valid file
20 extensions are *.csv* and *.tsv* for comma/semicolon and tab delimited
21 text files respectively. All other file names are ignored. All the text
22 files in the current directory can be specified by **.csv*, **.tsv*, or
23 the current directory name. The --indelim option determines the format
24 of *TextFile(s)*. Any file which doesn't correspond to the format
25 indicated by --indelim option is ignored.
26
27 OPTIONS
28 -h, --help
29 Print this help message.
30
31 --indelim *comma | semicolon*
32 Input delimiter for CSV *TextFile(s)*. Possible values: *comma or
33 semicolon*. Default value: *comma*. For TSV files, this option is
34 ignored and *tab* is used as a delimiter.
35
36 -c, --columns *colnum,...;... | collabel,...;...*
37 This value is mode specific. It is a list of columns to merge into
38 *SDFile* specified by column numbers or labels for each text file
39 delimited by ";". All *TextFile(s)* are merged into *SDFile*.
40
41 Default value: *all;all;...*. By default, all columns from
42 TextFile(s) are merged into *SDFile*.
43
44 For *colnum* mode, input value format is:
45 *colnum,...;colnum,...;...*. Example:
46
47 "1,2;1,3,4;7,8,9"
48
49 For *collabel* mode, input value format is:
50 *collabel,...;collabel,...;...*. Example:
51
52 "MW,SumNO;SumNHOH,ClogP,PSA;MolName,Mol_Id,Extreg"
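
To make the --columns value format concrete, here is a minimal Python sketch that splits such a value into one column list per text file. The helper name `parse_columns_spec` is hypothetical and the code is not taken from the script; it assumes only what is stated above: ";" separates the files, "," separates the columns within a file, and "all" selects every column.

```python
def parse_columns_spec(spec, mode="colnum"):
    """Split a --columns style value into one entry per text file.

    Hypothetical helper: returns a list whose n-th element describes the
    columns requested from the n-th text file, either the string "all"
    or a list of column numbers (colnum) or labels (collabel).
    """
    per_file = []
    for group in spec.split(";"):
        if group.strip().lower() == "all":
            per_file.append("all")
            continue
        items = [item.strip() for item in group.split(",") if item.strip()]
        if mode == "colnum":
            # Column numbers are kept 1-based, as in the examples above.
            per_file.append([int(item) for item in items])
        else:
            per_file.append(items)
    return per_file


if __name__ == "__main__":
    print(parse_columns_spec("1,2;1,3,4;7,8,9"))
    # [[1, 2], [1, 3, 4], [7, 8, 9]]
    print(parse_columns_spec("MW,SumNO;SumNHOH,ClogP,PSA", mode="collabel"))
    # [['MW', 'SumNO'], ['SumNHOH', 'ClogP', 'PSA']]
```
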
53
54 -k, --keys *colkeynum;... | colkeylabel;...*
55 This value is mode specific. It specifies column keys to use for
56 merging *TextFile(s)* into *SDFile*. The column keys, delimited by
57 ";", are specified by column numbers or labels for *TextFile(s)*.
58
59 By default, data rows from *TextFile(s)* are merged into *SDFile* in
60 the order they appear.
61
62 For *colnum* mode, input value format is: *colkeynum;colkeynum;...*.
63 Example:
64
65 "1;3;7"
66
67 For *collabel* mode, input value format is:
68 *colkeylabel;colkeylabel;...*. Example:
69
70 "Mol_Id;Mol_Id;Cmpd_Id"
71
72 -m, --mode *colnum | collabel*
73 Specify how to merge *TextFile(s)* into *SDFile*: using column
74 numbers or column labels. Possible values: *colnum or collabel*.
75 Default value: *colnum*.
76
77 -o, --overwrite
78 Overwrite existing files.
79
80 -r, --root *rootname*
81 New SD file name is generated using the root: <Root>.sdf. Default
82 file name:
83 <InitialSDFileName>MergedWith<FirstTextFileName>1To<Count>.sdf.
84
85 -s, --sdkey *sdfieldname*
86 *SDFile* data field name used as a key to merge data from
87 TextFile(s). By default, data rows from *TextFile(s)* are merged
88 into *SDFile* in the order they appear.
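
When an SD key and a column key are both given, rows are matched by value rather than by position. The following Python sketch illustrates that idea for a single text file. It is not the script's implementation: `merge_by_key` is a hypothetical name, and the choices made for duplicate keys (first row wins) and for compounds without a matching row (left unchanged) are assumptions the documentation does not cover.

```python
def merge_by_key(compounds, text_rows, sd_key, column_key):
    """Merge text rows into compounds by matching key values.

    compounds:  list of dicts, one per SD record (data field name -> value)
    text_rows:  list of dicts, one per text file row (column label -> value)
    sd_key:     SD data field holding the key (compare -s, --sdkey)
    column_key: text file column holding the key (compare -k, --keys)
    """
    # Index the text rows by key value for constant-time lookup.
    rows_by_key = {}
    for row in text_rows:
        # Assumption: if a key occurs more than once, the first row wins.
        rows_by_key.setdefault(row[column_key], row)

    merged = []
    for fields in compounds:
        new_fields = dict(fields)  # copy, so the input is not modified
        match = rows_by_key.get(fields.get(sd_key))
        if match is not None:
            new_fields.update(match)
        # Assumption: compounds without a matching row are kept as-is.
        merged.append(new_fields)
    return merged


if __name__ == "__main__":
    compounds = [{"Mol_ID": "C2"}, {"Mol_ID": "C1"}]
    rows = [
        {"Mol_ID": "C1", "Formula": "H2O"},
        {"Mol_ID": "C2", "Formula": "C2H6O"},
    ]
    # Row order in the text file does not matter; matching is by Mol_ID.
    print(merge_by_key(compounds, rows, "Mol_ID", "Mol_ID"))
```
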
89
90 -w, --workingdir *dirname*
91 Location of working directory. Default: current directory.
92
93 EXAMPLES
94 To merge Sample1.csv and Sample2.csv into Sample.sdf and generate
95 NewSample.sdf, type:
96
97 % MergeTextFilesWithSD.pl -r NewSample -o Sample.sdf
98 Sample1.csv Sample2.csv
99
100 To merge all Sample*.tsv into Sample.sdf and generate NewSample.sdf
101 file, type:
102
103 % MergeTextFilesWithSD.pl -r NewSample -o Sample.sdf
104 Sample*.tsv
105
106 To merge column numbers "1,2" and "3,4,5" from Sample1.csv and
107 Sample2.csv into Sample.sdf and to generate NewSample.sdf, type:
108
109 % MergeTextFilesWithSD.pl -r NewSample -m colnum -c "1,2;3,4,5"
110 -o Sample.sdf Sample1.csv Sample2.csv
111
112 To merge columns "Mol_ID,Formula,MolWeight" and "Mol_ID,ChemBankID,NAME"
113 from Sample1.csv and Sample2.csv into Sample1.sdf using "Mol_ID" as SD
114 and column keys to generate NewSample.sdf, type:
115
116 % MergeTextFilesWithSD.pl -r NewSample -s Mol_ID -k "Mol_ID;Mol_ID"
117 -m collabel -c "Mol_ID,Formula,MolWeight;Mol_ID,ChemBankID,NAME"
118 -o Sample1.sdf Sample1.csv Sample2.csv
119
120 AUTHOR
121 Manish Sud <msud@san.rr.com>
122
123 SEE ALSO
124 ExtractFromSDFiles.pl, FilterSDFiles.pl, InfoSDFiles.pl, JoinSDFiles.pl,
125 JoinTextFiles.pl, MergeTextFiles.pl, ModifyTextFilesFormat.pl,
126 SplitSDFiles.pl, SplitTextFiles.pl
127
128 COPYRIGHT
129 Copyright (C) 2015 Manish Sud. All rights reserved.
130
131 This file is part of MayaChemTools.
132
133 MayaChemTools is free software; you can redistribute it and/or modify it
134 under the terms of the GNU Lesser General Public License as published by
135 the Free Software Foundation; either version 3 of the License, or (at
136 your option) any later version.
137