comparison docs/scripts/txt/SplitSDFiles.txt @ 0:4816e4a8ae95 draft default tip
Uploaded
author | deepakjadmin |
---|---|
date | Wed, 20 Jan 2016 09:23:18 -0500 |
parents | |
children |
-1:000000000000 | 0:4816e4a8ae95 |
---|---|
1 NAME | |
2 SplitSDFiles.pl - Split SDFile(s) into multiple SD files | |
3 | |
4 SYNOPSIS | |
5 SplitSDFiles.pl SDFile(s)... | |
6 | |
7 SplitSDFiles.pl [-c, --CmpdsMode DataField | MolName | RootPrefix] [-d, | |
8 --DataField DataFieldName] [-h, --help] [-m, --mode Cmpds | Files] [-n, | |
9 --numfiles number] [--numcmpds number] [-o, --overwrite] [-r, --root | |
10 rootname] [-w,--workingdir dirname] SDFile(s)... | |
11 | |
12 DESCRIPTION | |
13 Split *SDFile(s)* into multiple SD files. Each new SDFile contains a | |
14 compound subset of similar size from the initial file. Multiple | |
15 *SDFile(s)* names are separated by space. The valid file extensions are | |
16 *.sdf* and *.sd*. All other file names are ignored. All the SD files in | |
17 the current directory can be specified either by **.sdf* or the current | |
18 directory name. | |
19 | |
20 OPTIONS | |
21 -c, --CmpdsMode *DataField | MolName | RootPrefix* | |
22 This option is only used during *Cmpds* value of <-m, --mode> option | |
23 with specified --numcmpds value of 1. | |
24 | |
25 Specify how to generate new file names during *Cmpds* value of <-m, | |
26 --mode> option: use *SDFile(s)* datafield value or molname line for | |
27 a specific compound; generate a sequential ID using root prefix | |
28 specified by -r, --root option. | |
29 | |
30 Possible values: *DataField | MolName | RootPrefix*. | |
31 Default: *RootPrefix*. | |
32 | |
33 For empty *MolName* and *DataField* values during these specified | |
34 modes, file name is automatically generated using *RootPrefix*. | |
35 | |
36 For *RootPrefix* value of -c, --CmpdsMode option, new file names are | |
37 generated by appending compound record number to value of -r, | |
38 --root option. For example: *RootName*Cmd<RecordNumber>.sdf. | |
39 | |
40 Allowed characters in file names are: a-zA-Z0-9_. All other | |
41 characters in datafield values, molname line, and root prefix are | |
42 ignored during generation of file names. | |
43 | |
44 -d, --DataField *DataFieldName* | |
45 This option is only used during *DataField* value of <-c, | |
46 --CmpdsMode> option. | |
47 | |
48 Specify *SDFile(s)* datafield label name whose value is used for | |
49 generation of new file name for a specific compound. Default value: | |
50 *None*. | |
51 | |
52 -h, --help | |
53 Print this help message. | |
54 | |
55 -m, --mode *Cmpds | Files* | |
56 Specify how to split *SDFile(s)*: split into files with each file | |
57 containing specified number of compounds or split into a specified | |
58 number of files. | |
59 | |
60 Possible values: *Cmpds | Files*. Default: *Files*. | |
61 | |
62 For *Cmpds* value of -m, --mode option, value of --numcmpds option | |
63 sets the number of compounds in each new file. And value of -n, | |
64 --numfiles option sets the number of new files for *Files* value of | |
65 -m, --mode option. | |
66 | |
67 -n, --numfiles *number* | |
68 Number of new files to generate for each *SDFile(s)*. Default: *2*. | |
69 | |
70 This value is only used during *Files* value of -m, --mode option. | |
71 | |
72 --numcmpds *number* | |
73 Number of compounds in each new file corresponding to each | |
74 *SDFile(s)*. Default: *1*. | |
75 | |
76 This value is only used during *Cmpds* value of -m, --mode option. | |
77 | |
78 -o, --overwrite | |
79 Overwrite existing files. | |
80 | |
81 -r, --root *rootname* | |
82 New SD file names are generated using the root: <Root>Part<Count>.sdf. | |
83 Default new file names: <InitialSDFileName>Part<Count>.sdf. | |
84 This option is ignored for multiple input files. | |
85 | |
86 -w,--workingdir *dirname* | |
87 Location of working directory. Default: current directory. | |
88 | |
89 EXAMPLES | |
90 To split each SD file into 5 new SD files, type: | |
91 | |
92 % SplitSDFiles.pl -n 5 -o Sample1.sdf Sample2.sdf | |
93 % SplitSDFiles.pl -n 5 -o *.sdf | |
94 | |
95 To split Sample1.sdf into 10 new NewSample*.sdf files, type: | |
96 | |
97 % SplitSDFiles.pl -m Files -n 10 -r NewSample -o Sample1.sdf | |
98 | |
99 To split Sample1.sdf into new NewSample*.sdf files containing maximum of | |
100 5 compounds in each file, type: | |
101 | |
102 % SplitSDFiles.pl -m Cmpds --numcmpds 5 -r NewSample -o Sample1.sdf | |
103 | |
104 To split Sample1.sdf into new SD files containing one compound each with | |
105 new file names corresponding to molname line, type: | |
106 | |
107 % SplitSDFiles.pl -m Cmpds --numcmpds 1 -c MolName -o Sample1.sdf | |
108 | |
109 To split Sample1.sdf into new SD files containing one compound each with | |
110 new file names corresponding to value of datafield MolID, type: | |
111 | |
112 % SplitSDFiles.pl -m Cmpds --numcmpds 1 -c DataField -d MolID \ | |
113 -o Sample1.sdf | |
114 | |
115 AUTHOR | |
116 Manish Sud <msud@san.rr.com> | |
117 | |
118 SEE ALSO | |
119 InfoSDFiles.pl, JoinSDFiles.pl, MolFilesToSD.pl, SDToMolFiles.pl | |
120 | |
121 COPYRIGHT | |
122 Copyright (C) 2015 Manish Sud. All rights reserved. | |
123 | |
124 This file is part of MayaChemTools. | |
125 | |
126 MayaChemTools is free software; you can redistribute it and/or modify it | |
127 under the terms of the GNU Lesser General Public License as published by | |
128 the Free Software Foundation; either version 3 of the License, or (at | |
129 your option) any later version. | |
130 |
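
The splitting arithmetic described under -m, --mode and the file name rules described under -c, --CmpdsMode can be illustrated with a small Python sketch. This is not the code of SplitSDFiles.pl; every function name below is hypothetical. Two behaviors are assumptions rather than documented facts: how leftover compounds are spread across files in *Files* mode (the text only promises subsets "of similar size"), and the fallback to the root prefix when a value becomes empty after disallowed characters are dropped.

```python
import re


def sanitize_name(raw: str) -> str:
    """Keep only the documented allowed characters a-zA-Z0-9_ and drop the rest."""
    return re.sub(r"[^a-zA-Z0-9_]", "", raw)


def cmpds_file_name(record_num: int, root: str, value: str = "") -> str:
    """File name for a one-compound file (Cmpds mode with --numcmpds 1).

    `value` stands for the molname line or datafield value. When it is empty
    the name falls back to the root prefix, as the documentation describes.
    Falling back when the value is empty only after sanitizing is an assumption.
    """
    clean = sanitize_name(value)
    if clean:
        return f"{clean}.sdf"
    # Documented pattern: RootName + "Cmd" + record number + ".sdf"
    return f"{sanitize_name(root)}Cmd{record_num}.sdf"


def plan_by_files(num_cmpds: int, num_files: int) -> list[int]:
    """Files mode: compounds per file for a fixed number of files.

    Assumption: leftover compounds go to the first files, one each.
    """
    base, extra = divmod(num_cmpds, num_files)
    return [base + (1 if i < extra else 0) for i in range(num_files)]


def plan_by_cmpds(num_cmpds: int, per_file: int) -> list[int]:
    """Cmpds mode: at most `per_file` compounds per file; the last file may be short."""
    full, rest = divmod(num_cmpds, per_file)
    return [per_file] * full + ([rest] if rest else [])


if __name__ == "__main__":
    print(plan_by_files(12, 5))
    print(plan_by_cmpds(12, 5))
    print(cmpds_file_name(3, "NewSample"))
    print(cmpds_file_name(3, "NewSample", "CHEMBL 25"))
```

Under these assumptions, 12 compounds with -n 5 give five files of two or three compounds each, while --numcmpds 5 gives three files, the last holding the remaining two compounds.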