comparison docs/scripts/txt/CalculatePhysicochemicalProperties.txt @ 0:4816e4a8ae95 draft default tip
Uploaded
author | deepakjadmin |
---|---|
date | Wed, 20 Jan 2016 09:23:18 -0500 |
parents | |
children |
-1:000000000000 | 0:4816e4a8ae95 |
---|---|
1 NAME | |
2 CalculatePhysicochemicalProperties.pl - Calculate physicochemical | |
3 properties for SD files | |
4 | |
5 SYNOPSIS | |
6 CalculatePhysicochemicalProperties.pl SDFile(s)... | |
7 | |
8 CalculatePhysicochemicalProperties.pl [--AromaticityModel *AromaticityModelType*] | |
9 [--CompoundID DataFieldName or LabelPrefixString] [--CompoundIDLabel | |
10 text] [--CompoundIDMode] [--DataFields "FieldLabel1, FieldLabel2,..."] | |
11 [-d, --DataFieldsMode All | Common | Specify | CompoundID] [-f, --Filter | |
12 Yes | No] [-h, --help] [--HydrogenBonds HBondsType1 | HBondsType2] [-k, | |
13 --KeepLargestComponent Yes | No] [-m, --mode All | RuleOf5 | RuleOf3 | | |
14 "name1, [name2,...]"] [--MolecularComplexity *Name,Value, | |
15 [Name,Value,...]*] [--OutDelim comma | tab | semicolon] [--output SD | | |
16 text | both] [-o, --overwrite] [--Precision | |
17 Name,Number,[Name,Number,..]] [--RotatableBonds Name,Value, | |
18 [Name,Value,...]] [--RuleOf3Violations Yes | No] [--RuleOf5Violations | |
19 Yes | No] [-q, --quote Yes | No] [-r, --root RootName] [-w, --WorkingDir | |
20 dirname] SDFile(s)... | |
21 | |
22 DESCRIPTION | |
23 Calculate physicochemical properties for *SDFile(s)* and create | |
24 appropriate SD or CSV/TSV text file(s) containing calculated properties. | |
25 | |
26 The current release of MayaChemTools supports the calculation of these | |
27 physicochemical properties: | |
28 | |
29 MolecularWeight, ExactMass, HeavyAtoms, Rings, AromaticRings, | |
30 van der Waals MolecularVolume [ Ref 93 ], RotatableBonds, | |
31 HydrogenBondDonors, HydrogenBondAcceptors, LogP and | |
32 Molar Refractivity (SLogP and SMR) [ Ref 89 ], Topological Polar | |
33 Surface Area (TPSA) [ Ref 90 ], Fraction of SP3 carbons (Fsp3Carbons) | |
34 and SP3 carbons (Sp3Carbons) [ Ref 115-116, Ref 119 ], | |
35 MolecularComplexity [ Ref 117-119 ] | |
36 | |
37 Multiple SDFile names are separated by spaces. The valid file extensions | |
38 are *.sdf* and *.sd*. All other file names are ignored. All the SD files | |
39 in the current directory can be specified either by **.sdf* or the current | |
40 directory name. | |
41 | |
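The file-name rule above can be sketched in a few lines of Python. This is only an illustration, not code from MayaChemTools: the function name `select_sd_files` is hypothetical, and treating the extension check as case-insensitive is an assumption.

```python
from pathlib import Path

# Extensions named in the description above.
VALID_EXTENSIONS = {".sdf", ".sd"}


def select_sd_files(names):
    """Keep names ending in .sdf or .sd and ignore all other names.

    Assumption: the extension comparison is case-insensitive.
    """
    return [name for name in names if Path(name).suffix.lower() in VALID_EXTENSIONS]
```

For example, passing `["a.sdf", "b.sd", "notes.txt"]` keeps only the first two names.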
42 The calculation of molecular complexity using *MolecularComplexityType* | |
43 parameter corresponds to the number of bits-set or unique keys [ Ref | |
44 117-119 ] in molecular fingerprints. Default value for | |
45 *MolecularComplexityType*: *MACCSKeys* of size 166. The calculation of | |
46 MACCSKeys is relatively expensive and can take a substantial amount | |
47 of time. | |
48 | |
49 OPTIONS | |
50 --AromaticityModel *MDLAromaticityModel | TriposAromaticityModel | | |
51 MMFFAromaticityModel | ChemAxonBasicAromaticityModel | | |
52 ChemAxonGeneralAromaticityModel | DaylightAromaticityModel | | |
53 MayaChemToolsAromaticityModel* | |
54 Specify aromaticity model to use during detection of aromaticity. | |
55 Possible values in the current release are: *MDLAromaticityModel, | |
56 TriposAromaticityModel, MMFFAromaticityModel, | |
57 ChemAxonBasicAromaticityModel, ChemAxonGeneralAromaticityModel, | |
58 DaylightAromaticityModel or MayaChemToolsAromaticityModel*. Default | |
59 value: *MayaChemToolsAromaticityModel*. | |
60 | |
61 The supported aromaticity model names along with model specific | |
62 control parameters are defined in AromaticityModelsData.csv, which | |
63 is distributed with the current release and is available under | |
64 lib/data directory. Molecule.pm module retrieves data from this file | |
65 during class instantiation and makes it available to method | |
66 DetectAromaticity for detecting aromaticity corresponding to a | |
67 specific model. | |
68 | |
69 --CompoundID *DataFieldName or LabelPrefixString* | |
70 This value is --CompoundIDMode specific and indicates how compound | |
71 ID is generated. | |
72 | |
73 For *DataField* value of --CompoundIDMode option, it corresponds to | |
74 datafield label name whose value is used as compound ID; otherwise, | |
75 it's a prefix string used for generating compound IDs like | |
76 LabelPrefixString<Number>. Default value, *Cmpd*, generates compound | |
77 IDs which look like Cmpd<Number>. | |
78 | |
79 Examples for *DataField* value of --CompoundIDMode: | |
80 | |
81 MolID | |
82 ExtReg | |
83 | |
84 Examples for *LabelPrefix* or *MolNameOrLabelPrefix* value of | |
85 --CompoundIDMode: | |
86 | |
87 Compound | |
88 | |
89 The value specified above generates compound IDs which correspond to | |
90 Compound<Number> instead of default value of Cmpd<Number>. | |
91 | |
92 --CompoundIDLabel *text* | |
93 Specify compound ID column label for CSV/TSV text file(s) used | |
94 during *CompoundID* value of --DataFieldsMode option. Default value: | |
95 *CompoundID*. | |
96 | |
97 --CompoundIDMode *DataField | MolName | LabelPrefix | | |
98 MolNameOrLabelPrefix* | |
99 Specify how to generate compound IDs and write to CSV/TSV text | |
100 file(s) along with calculated physicochemical properties for *text | | |
101 both* values of --output option: use a *SDFile(s)* datafield value; | |
102 use molname line from *SDFile(s)*; generate a sequential ID with | |
103 specific prefix; use combination of both MolName and LabelPrefix | |
104 with usage of LabelPrefix values for empty molname lines. | |
105 | |
106 Possible values: *DataField | MolName | LabelPrefix | | |
107 MolNameOrLabelPrefix*. Default value: *LabelPrefix*. | |
108 | |
109 For *MolNameOrLabelPrefix* value of --CompoundIDMode, molname line | |
110 in *SDFile(s)* takes precedence over sequential compound IDs | |
111 generated using *LabelPrefix* and only empty molname values are | |
112 replaced with sequential compound IDs. | |
113 | |
114 This is only used for *CompoundID* value of --DataFieldsMode option. | |
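The four --CompoundIDMode values described above can be read as the following selection logic. This is a minimal Python sketch under stated assumptions, not the script's own code: the function `make_compound_id` and its argument names are hypothetical, and a missing data field is assumed to yield an empty ID.

```python
def make_compound_id(mode, number, molname="", data_fields=None, compound_id="Cmpd"):
    """Return a compound ID for one molecule.

    mode        -- one of the --CompoundIDMode values from the text
    number      -- sequential compound number (1, 2, ...)
    molname     -- molname line from the SD file entry
    data_fields -- dict of SD data field label -> value
    compound_id -- the --CompoundID value: a field label or a prefix string
    """
    data_fields = data_fields or {}
    if mode == "DataField":
        # --CompoundID names the data field whose value becomes the ID.
        return data_fields.get(compound_id, "")
    if mode == "MolName":
        return molname
    if mode == "LabelPrefix":
        # --CompoundID is a prefix: Cmpd1, Cmpd2, ...
        return f"{compound_id}{number}"
    if mode == "MolNameOrLabelPrefix":
        # The molname line wins; only empty names get a sequential ID.
        return molname if molname.strip() else f"{compound_id}{number}"
    raise ValueError(f"unknown CompoundIDMode: {mode}")
```

With the default prefix, `make_compound_id("LabelPrefix", 3)` gives `Cmpd3`, matching the Cmpd<Number> form described for --CompoundID.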
115 | |
116 --DataFields *"FieldLabel1,FieldLabel2,..."* | |
117 Comma delimited list of *SDFile(s)* data fields to extract and | |
118 write to CSV/TSV text file(s) along with calculated physicochemical | |
119 properties for *text | both* values of --output option. | |
120 | |
121 This is only used for *Specify* value of --DataFieldsMode option. | |
122 | |
123 Examples: | |
124 | |
125 Extreg | |
126 MolID,CompoundName | |
127 | |
128 -d, --DataFieldsMode *All | Common | Specify | CompoundID* | |
129 Specify how data fields in *SDFile(s)* are transferred to output | |
130 CSV/TSV text file(s) along with calculated physicochemical | |
131 properties for *text | both* values of --output option: transfer all | |
132 SD data fields; transfer SD data fields common to all compounds; | |
133 extract specified data fields; generate a compound ID using molname | |
134 line, a compound prefix, or a combination of both. Possible values: | |
135 *All | Common | Specify | CompoundID*. Default value: *CompoundID*. | |
136 | |
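To make the *Common* value of --DataFieldsMode concrete, here is a small Python sketch that finds the data field labels shared by every compound. It is an illustration only: `common_field_labels` is a hypothetical helper, each compound is assumed to be a dict of label to value, and keeping the label order of the first compound is an assumption.

```python
def common_field_labels(compounds):
    """Return the data field labels present in every compound.

    compounds -- list of dicts mapping SD data field label -> value
    Labels are returned in the order they appear in the first compound.
    """
    if not compounds:
        return []
    shared = set(compounds[0])
    for fields in compounds[1:]:
        shared &= set(fields)
    return [label for label in compounds[0] if label in shared]
```

Given one compound with `MolID` and `ExtReg` and another with only `MolID`, the result is `["MolID"]`.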
137 -f, --Filter *Yes | No* | |
138 Specify whether to check and filter compound data in SDFile(s). | |
139 Possible values: *Yes or No*. Default value: *Yes*. | |
140 | |
141 By default, compound data is checked before calculating | |
142 physicochemical properties and compounds containing atom data | |
143 corresponding to non-element symbols or no atom data are ignored. | |
144 | |
145 -h, --help | |
146 Print this help message. | |
147 | |
148 --HydrogenBonds *HBondsType1 | HBondsType2* | |
149 Parameters to control calculation of hydrogen bond donors and | |
150 acceptors. Possible values: *HBondsType1, HydrogenBondsType1, | |
151 HBondsType2, HydrogenBondsType2*. Default value: *HBondsType2* which | |
152 corresponds to RuleOf5 definition for number of hydrogen bond donors | |
153 and acceptors. | |
154 | |
155 The current release of MayaChemTools supports identification of two | |
156 types of hydrogen bond donor and acceptor atoms with these names: | |
157 | |
158 HBondsType1 or HydrogenBondsType1 | |
159 HBondsType2 or HydrogenBondsType2 | |
160 | |
161 The names of these hydrogen bond types are rather arbitrary. | |
162 However, their definitions have specific meaning and are as follows: | |
163 | |
164 HydrogenBondsType1 [ Ref 60-61, Ref 65-66 ]: | |
165 | |
166 Donor: NH, NH2, OH - Any N and O with available H | |
167 Acceptor: N[!H], O - Any N without available H and any O | |
168 | |
169 HydrogenBondsType2 [ Ref 91 ]: | |
170 | |
171 Donor: NH, NH2, OH - N and O with available H | |
172 Acceptor: N, O - Any N and O | |
173 | |
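The difference between the two hydrogen bond definitions above is confined to how nitrogen acceptors are counted. The Python sketch below shows that difference on a simplified atom list. It is not the MayaChemTools implementation: the function `count_hbond_atoms` is hypothetical, each atom is assumed to be an (element symbol, attached hydrogen count) pair, and donors are counted per atom as the definitions are worded.

```python
def count_hbond_atoms(atoms, hbonds_type="HBondsType2"):
    """Count hydrogen bond donor and acceptor atoms.

    atoms -- list of (element_symbol, attached_hydrogen_count) pairs
    Returns a (donors, acceptors) tuple.
    """
    # Both types: any N or O with an available H is a donor.
    donors = sum(1 for symbol, h in atoms if symbol in ("N", "O") and h > 0)
    if hbonds_type in ("HBondsType1", "HydrogenBondsType1"):
        # Any O, and any N without an available H.
        acceptors = sum(
            1 for symbol, h in atoms if symbol == "O" or (symbol == "N" and h == 0)
        )
    elif hbonds_type in ("HBondsType2", "HydrogenBondsType2"):
        # Any N and any O.
        acceptors = sum(1 for symbol, h in atoms if symbol in ("N", "O"))
    else:
        raise ValueError(f"unknown hydrogen bonds type: {hbonds_type}")
    return donors, acceptors
```

For a fragment with one NH2 and one OH, both types report two donors, but HBondsType1 reports one acceptor while HBondsType2 reports two.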
174 -k, --KeepLargestComponent *Yes | No* | |
175 Calculate physicochemical properties for only the largest component | |
176 in molecule. Possible values: *Yes or No*. Default value: *Yes*. | |
177 | |
178 For molecules containing multiple connected components, | |
179 physicochemical properties can be calculated in two different ways: | |
180 use all connected components or just the largest connected | |
181 component. By default, all atoms except for the largest connected | |
182 component are deleted before calculation of physicochemical | |
183 properties. | |
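Keeping only the largest connected component amounts to a graph traversal over atoms and bonds. The following Python sketch shows one way to do it with a breadth-first search. It is an illustration, not the script's code: `largest_component` is a hypothetical name, atoms are assumed to be numbered from 0, and "largest" is assumed to mean the component with the most atoms.

```python
from collections import deque


def largest_component(num_atoms, bonds):
    """Return the sorted atom indices of the largest connected component.

    num_atoms -- number of atoms, indexed 0 .. num_atoms - 1
    bonds     -- list of (atom_index, atom_index) pairs
    Ties are resolved in favor of the component found first.
    """
    adjacency = {index: [] for index in range(num_atoms)}
    for first, second in bonds:
        adjacency[first].append(second)
        adjacency[second].append(first)

    seen = set()
    best = []
    for start in range(num_atoms):
        if start in seen:
            continue
        # Breadth-first search collects one whole component.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            atom = queue.popleft()
            component.append(atom)
            for neighbor in adjacency[atom]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        if len(component) > len(best):
            best = component
    return sorted(best)
```

For a salt-like entry, the atoms outside the returned list are the ones that would be deleted before the properties are calculated.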
184 | |
185 -m, --mode *All | RuleOf5 | RuleOf3 | "name1, [name2,...]"* | |
186 Specify physicochemical properties to calculate for SDFile(s): | |
187 calculate all available physicochemical properties; calculate | |
188 properties corresponding to Rule of 5 or Rule of 3; or use a comma | |
189 delimited list of supported physicochemical properties. Possible | |
190 values: *All | RuleOf5 | RuleOf3 | "name1, [name2,...]"*. | |
191 | |
192 Default value: *MolecularWeight, HeavyAtoms, MolecularVolume, | |
193 RotatableBonds, HydrogenBondDonors, HydrogenBondAcceptors, SLogP, | |
194 TPSA*. These properties are calculated by default. | |
195 | |
196 *RuleOf5* [ Ref 91 ] includes these properties: *MolecularWeight, | |
197 HydrogenBondDonors, HydrogenBondAcceptors, SLogP*. *RuleOf5* states: | |
198 MolecularWeight <= 500, HydrogenBondDonors <= 5, | |
199 HydrogenBondAcceptors <= 10, and logP <= 5. | |
200 | |
201 *RuleOf3* [ Ref 92 ] includes these properties: *MolecularWeight, | |
202 RotatableBonds, HydrogenBondDonors, HydrogenBondAcceptors, SLogP, | |
203 TPSA*. *RuleOf3* states: MolecularWeight <= 300, RotatableBonds <= | |
204 3, HydrogenBondDonors <= 3, HydrogenBondAcceptors <= 3, logP <= 3, | |
205 and TPSA <= 60. | |
206 | |
207 *All* calculates all supported physicochemical properties: | |
208 *MolecularWeight, ExactMass, HeavyAtoms, Rings, AromaticRings, | |
209 MolecularVolume, RotatableBonds, HydrogenBondDonors, | |
210 HydrogenBondAcceptors, SLogP, SMR, TPSA, Fsp3Carbons, Sp3Carbons, | |
211 MolecularComplexity*. | |
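The mapping from a --mode value to a list of property names can be summarized in Python as follows. The property lists are copied from the text above; everything else is a sketch under assumptions: `resolve_mode` is a hypothetical function, and matching the keywords *All*, *RuleOf5*, and *RuleOf3* without regard to case is an assumption.

```python
DEFAULT_PROPERTIES = [
    "MolecularWeight", "HeavyAtoms", "MolecularVolume", "RotatableBonds",
    "HydrogenBondDonors", "HydrogenBondAcceptors", "SLogP", "TPSA",
]
RULE_OF_5_PROPERTIES = [
    "MolecularWeight", "HydrogenBondDonors", "HydrogenBondAcceptors", "SLogP",
]
RULE_OF_3_PROPERTIES = [
    "MolecularWeight", "RotatableBonds", "HydrogenBondDonors",
    "HydrogenBondAcceptors", "SLogP", "TPSA",
]
ALL_PROPERTIES = [
    "MolecularWeight", "ExactMass", "HeavyAtoms", "Rings", "AromaticRings",
    "MolecularVolume", "RotatableBonds", "HydrogenBondDonors",
    "HydrogenBondAcceptors", "SLogP", "SMR", "TPSA", "Fsp3Carbons",
    "Sp3Carbons", "MolecularComplexity",
]


def resolve_mode(mode=None):
    """Translate a --mode value into a list of property names."""
    if mode is None:
        return list(DEFAULT_PROPERTIES)
    keywords = {
        "all": ALL_PROPERTIES,
        "ruleof5": RULE_OF_5_PROPERTIES,
        "ruleof3": RULE_OF_3_PROPERTIES,
    }
    key = mode.strip().lower()
    if key in keywords:
        return list(keywords[key])
    # Otherwise treat the value as a comma delimited list of names.
    names = [name.strip() for name in mode.split(",") if name.strip()]
    unknown = [name for name in names if name not in ALL_PROPERTIES]
    if unknown:
        raise ValueError(f"unsupported property names: {unknown}")
    return names
```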
212 | |
213 --MolecularComplexity *Name,Value, [Name,Value,...]* | |
214 Parameters to control calculation of molecular complexity: it's a | |
215 comma delimited list of parameter name and value pairs. | |
216 | |
217 Possible parameter names: *MolecularComplexityType, | |
218 AtomIdentifierType, AtomicInvariantsToUse, FunctionalClassesToUse, | |
219 MACCSKeysSize, NeighborhoodRadius, MinPathLength, MaxPathLength, | |
220 UseBondSymbols, MinDistance, MaxDistance, UseTriangleInequality, | |
221 DistanceBinSize, NormalizationMethodology*. | |
222 | |
223 The valid parameter values for each parameter name are described in | |
224 the following sections. | |
225 | |
226 The current release of MayaChemTools supports calculation of | |
227 molecular complexity using *MolecularComplexityType* parameter | |
228 corresponding to the number of bits-set or unique keys [ Ref 117-119 | |
229 ] in molecular fingerprints. The valid values for | |
230 *MolecularComplexityType* are: | |
231 | |
232 AtomTypesFingerprints | |
233 ExtendedConnectivityFingerprints | |
234 MACCSKeys | |
235 PathLengthFingerprints | |
236 TopologicalAtomPairsFingerprints | |
237 TopologicalAtomTripletsFingerprints | |
238 TopologicalAtomTorsionsFingerprints | |
239 TopologicalPharmacophoreAtomPairsFingerprints | |
240 TopologicalPharmacophoreAtomTripletsFingerprints | |
241 | |
242 Default value for *MolecularComplexityType*: *MACCSKeys*. | |
243 | |
244 *AtomIdentifierType* parameter name corresponds to atom types used | |
245 during generation of fingerprints. The valid values for | |
246 *AtomIdentifierType* are: *AtomicInvariantsAtomTypes, | |
247 DREIDINGAtomTypes, EStateAtomTypes, FunctionalClassAtomTypes, | |
248 MMFF94AtomTypes, SLogPAtomTypes, SYBYLAtomTypes, TPSAAtomTypes, | |
249 UFFAtomTypes*. *AtomicInvariantsAtomTypes* is not supported for | |
250 the following values of *MolecularComplexityType*: | |
251 *MACCSKeys, TopologicalPharmacophoreAtomPairsFingerprints, | |
252 TopologicalPharmacophoreAtomTripletsFingerprints*. | |
253 *FunctionalClassAtomTypes* is the only valid value for | |
254 *AtomIdentifierType* for topological pharmacophore fingerprints. | |
255 | |
256 Default value for *AtomIdentifierType*: *AtomicInvariantsAtomTypes* | |
257 for all except topological pharmacophore fingerprints where it is | |
258 *FunctionalClassAtomTypes*. | |
259 | |
260 *AtomicInvariantsToUse* parameter name and values are used during | |
261 *AtomicInvariantsAtomTypes* value of parameter *AtomIdentifierType*. | |
262 It's a list of space separated valid atomic invariant atom types. | |
263 | |
264 Possible values for atomic invariants are: *AS, X, BO, LBO, SB, DB, | |
265 TB, H, Ar, RA, FC, MN, SM*. Default values for the | |
266 *AtomicInvariantsToUse* parameter are set differently for different | |
267 fingerprints using *MolecularComplexityType* parameter as shown | |
268 below: | |
269 | |
270 MolecularComplexityType AtomicInvariantsToUse | |
271 | |
272 AtomTypesFingerprints AS X BO H FC | |
273 TopologicalAtomPairsFingerprints AS X BO H FC | |
274 TopologicalAtomTripletsFingerprints AS X BO H FC | |
275 TopologicalAtomTorsionsFingerprints AS X BO H FC | |
276 | |
277 ExtendedConnectivityFingerprints AS X BO H FC MN | |
278 PathLengthFingerprints AS | |
279 | |
280 The atomic invariants abbreviations correspond to: | |
281 | |
282 AS = Atom symbol corresponding to element symbol | |
283 | |
284 X<n> = Number of non-hydrogen atom neighbors or heavy atoms | |
285 BO<n> = Sum of bond orders to non-hydrogen atom neighbors or heavy atoms | |
286 LBO<n> = Largest bond order of non-hydrogen atom neighbors or heavy atoms | |
287 SB<n> = Number of single bonds to non-hydrogen atom neighbors or heavy atoms | |
288 DB<n> = Number of double bonds to non-hydrogen atom neighbors or heavy atoms | |
289 TB<n> = Number of triple bonds to non-hydrogen atom neighbors or heavy atoms | |
290 H<n> = Number of implicit and explicit hydrogens for atom | |
291 Ar = Aromatic annotation indicating whether atom is aromatic | |
292 RA = Ring atom annotation indicating whether atom is in a ring | |
293 FC<+n/-n> = Formal charge assigned to atom | |
294 MN<n> = Mass number indicating isotope other than most abundant isotope | |
295 SM<n> = Spin multiplicity of atom. Possible values: 1 (singlet), 2 (doublet) or | |
296 3 (triplet) | |
297 | |
298 Atom type generated by AtomTypes::AtomicInvariantsAtomTypes class | |
299 corresponds to: | |
300 | |
301 AS.X<n>.BO<n>.LBO<n>.SB<n>.DB<n>.TB<n>.H<n>.Ar.RA.FC<+n/-n>.MN<n>.SM<n> | |
302 | |
303 Except for AS which is a required atomic invariant in atom types, | |
304 all other atomic invariants are optional. Atom type specification | |
305 doesn't include atomic invariants with zero or undefined values. | |
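The atom type format above, with AS required and zero or undefined invariants left out, can be sketched in Python. This is an illustration of the string layout only, not the AtomTypes::AtomicInvariantsAtomTypes module: the function `atomic_invariants_atom_type` and its dict-based input are hypothetical.

```python
# Order of the optional invariants in the atom type string.
INVARIANT_ORDER = ["X", "BO", "LBO", "SB", "DB", "TB", "H", "Ar", "RA", "FC", "MN", "SM"]


def atomic_invariants_atom_type(symbol, invariants, use=("X", "BO", "H", "FC")):
    """Build an atom type string such as C.X3.BO4.

    symbol     -- element symbol (the required AS invariant)
    invariants -- dict of invariant abbreviation -> value
    use        -- invariants to include besides AS
    Invariants that are zero, False, or missing are left out.
    """
    parts = [symbol]
    for name in INVARIANT_ORDER:
        if name not in use:
            continue
        value = invariants.get(name)
        if not value:
            continue
        if name in ("Ar", "RA"):
            # Annotations carry no number.
            parts.append(name)
        elif name == "FC":
            # Formal charge is written with an explicit sign.
            parts.append(f"FC{value:+d}")
        else:
            parts.append(f"{name}{value}")
    return ".".join(parts)
```

A carbon with three heavy atom neighbors, a bond order sum of 4, and no hydrogens would be written as `C.X3.BO4` under these rules.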
306 | |
307 In addition to usage of abbreviations for specifying atomic | |
308 invariants, the following descriptive words are also allowed: | |
309 | |
310 X : NumOfNonHydrogenAtomNeighbors or NumOfHeavyAtomNeighbors | |
311 BO : SumOfBondOrdersToNonHydrogenAtoms or SumOfBondOrdersToHeavyAtoms | |
312 LBO : LargestBondOrderToNonHydrogenAtoms or LargestBondOrderToHeavyAtoms | |
313 SB : NumOfSingleBondsToNonHydrogenAtoms or NumOfSingleBondsToHeavyAtoms | |
314 DB : NumOfDoubleBondsToNonHydrogenAtoms or NumOfDoubleBondsToHeavyAtoms | |
315 TB : NumOfTripleBondsToNonHydrogenAtoms or NumOfTripleBondsToHeavyAtoms | |
316 H : NumOfImplicitAndExplicitHydrogens | |
317 Ar : Aromatic | |
318 RA : RingAtom | |
319 FC : FormalCharge | |
320 MN : MassNumber | |
321 SM : SpinMultiplicity | |
322 | |
323 *AtomTypes::AtomicInvariantsAtomTypes* module is used to assign | |
324 atomic invariant atom types. | |
325 | |
326 *FunctionalClassesToUse* parameter name and values are used during | |
327 *FunctionalClassAtomTypes* value of parameter *AtomIdentifierType*. | |
328 It's a list of space separated valid functional classes. | |
329 | |
330 Possible values for atom functional classes are: *Ar, CA, H, HBA, | |
331 HBD, Hal, NI, PI, RA*. | |
332 | |
333 Default value for *FunctionalClassesToUse* parameter is set to: | |
334 | |
335 HBD HBA PI NI Ar Hal | |
336 | |
337 for all fingerprints except for the following two | |
338 *MolecularComplexityType* fingerprints: | |
339 | |
340 MolecularComplexityType FunctionalClassesToUse | |
341 | |
342 TopologicalPharmacophoreAtomPairsFingerprints HBD HBA PI NI H | |
343 TopologicalPharmacophoreAtomTripletsFingerprints HBD HBA PI NI H Ar | |
344 | |
345 The functional class abbreviations correspond to: | |
346 | |
347 HBD: HydrogenBondDonor | |
348 HBA: HydrogenBondAcceptor | |
349 PI : PositivelyIonizable | |
350 NI : NegativelyIonizable | |
351 Ar : Aromatic | |
352 Hal : Halogen | |
353 H : Hydrophobic | |
354 RA : RingAtom | |
355 CA : ChainAtom | |
356 | |
357 Functional class atom type specification for an atom corresponds to: | |
358 | |
359 Ar.CA.H.HBA.HBD.Hal.NI.PI.RA | |
360 | |
361 *AtomTypes::FunctionalClassAtomTypes* module is used to assign | |
362 functional class atom types. It uses following definitions [ Ref | |
363 60-61, Ref 65-66 ]: | |
364 | |
365 HydrogenBondDonor: NH, NH2, OH | |
366 HydrogenBondAcceptor: N[!H], O | |
367 PositivelyIonizable: +, NH2 | |
368 NegativelyIonizable: -, C(=O)OH, S(=O)OH, P(=O)OH | |
369 | |
370 *MACCSKeysSize* parameter name is only used during *MACCSKeys* value | |
371 of *MolecularComplexityType* and corresponds to the size of MACCS | |
372 key set. Possible values: *166 or 322*. Default value: *166*. | |
373 | |
374 *NeighborhoodRadius* parameter name is only used during | |
375 *ExtendedConnectivityFingerprints* value of | |
376 *MolecularComplexityType* and corresponds to atomic neighborhoods | |
377 radius for generating extended connectivity fingerprints. Possible | |
378 values: positive integer. Default value: *2*. | |
379 | |
380 *MinPathLength* and *MaxPathLength* parameters are only used during | |
381 *PathLengthFingerprints* value of *MolecularComplexityType* and | |
382 correspond to minimum and maximum path lengths to use for generating | |
383 path length fingerprints. Possible values: positive integers. | |
384 Default values: *MinPathLength: 1*; *MaxPathLength: 8*. | |
385 | |
386 *UseBondSymbols* parameter is only used during | |
387 *PathLengthFingerprints* value of *MolecularComplexityType* and | |
388 indicates whether bond symbols are included in atom path strings | |
389 used to generate path length fingerprints. Possible value: *Yes or | |
390 No*. Default value: *Yes*. | |
391 | |
392 *MinDistance* and *MaxDistance* parameters are only used during | |
393 *TopologicalAtomPairsFingerprints* and | |
394 *TopologicalAtomTripletsFingerprints* values of | |
395 *MolecularComplexityType* and correspond to minimum and maximum bond | |
396 distance between atom pairs during topological pharmacophore | |
397 fingerprints. Possible values: positive integers. Default values: | |
398 *MinDistance: 1*; *MaxDistance: 10*. | |
399 | |
400 *UseTriangleInequality* parameter is used during these values for | |
401 *MolecularComplexityType*: *TopologicalAtomTripletsFingerprints* and | |
402 *TopologicalPharmacophoreAtomTripletsFingerprints*. Possible values: | |
403 *Yes or No*. It determines whether to apply triangle inequality to | |
404 distance triplets. Default values: | |
405 *TopologicalAtomTripletsFingerprints: No*; | |
406 *TopologicalPharmacophoreAtomTripletsFingerprints: Yes*. | |
407 | |
408 *DistanceBinSize* parameter is used during | |
409 *TopologicalPharmacophoreAtomTripletsFingerprints* value of | |
410 *MolecularComplexityType* and corresponds to distance bin size used | |
411 for binning distances during generation of topological pharmacophore | |
412 atom triplets fingerprints. Possible value: positive integer. | |
413 Default value: *2*. | |
414 | |
415 *NormalizationMethodology* is only used for these values for | |
416 *MolecularComplexityType*: *ExtendedConnectivityFingerprints*, | |
417 *TopologicalPharmacophoreAtomPairsFingerprints* and | |
418 *TopologicalPharmacophoreAtomTripletsFingerprints*. It corresponds | |
419 to normalization methodology to use for scaling the number of | |
420 bits-set or unique keys during generation of fingerprints. Possible | |
421 values during *ExtendedConnectivityFingerprints*: *None or | |
422 ByHeavyAtomsCount*; Default value: *None*. Possible values during | |
423 topological pharmacophore atom pairs and triplets fingerprints: | |
424 *None or ByPossibleKeysCount*; Default value: *None*. | |
425 *ByPossibleKeysCount* corresponds to total number of possible | |
426 topological pharmacophore atom pairs or triplets in a molecule. | |
427 | |
428 Examples of *MolecularComplexity* name and value parameters: | |
429 | |
430 MolecularComplexityType,AtomTypesFingerprints,AtomIdentifierType, | |
431 AtomicInvariantsAtomTypes,AtomicInvariantsToUse,AS X BO H FC | |
432 | |
433 MolecularComplexityType,ExtendedConnectivityFingerprints, | |
434 AtomIdentifierType,AtomicInvariantsAtomTypes, | |
435 AtomicInvariantsToUse,AS X BO H FC MN,NeighborhoodRadius,2, | |
436 NormalizationMethodology,None | |
437 | |
438 MolecularComplexityType,MACCSKeys,MACCSKeysSize,166 | |
439 | |
440 MolecularComplexityType,PathLengthFingerprints,AtomIdentifierType, | |
441 AtomicInvariantsAtomTypes,AtomicInvariantsToUse,AS,MinPathLength, | |
442 1,MaxPathLength,8,UseBondSymbols,Yes | |
443 | |
444 MolecularComplexityType,TopologicalAtomPairsFingerprints, | |
445 AtomIdentifierType,AtomicInvariantsAtomTypes,AtomicInvariantsToUse, | |
446 AS X BO H FC,MinDistance,1,MaxDistance,10 | |
447 | |
448 MolecularComplexityType,TopologicalAtomTripletsFingerprints, | |
449 AtomIdentifierType,AtomicInvariantsAtomTypes,AtomicInvariantsToUse, | |
450 AS X BO H FC,MinDistance,1,MaxDistance,10,UseTriangleInequality,No | |
451 | |
452 MolecularComplexityType,TopologicalAtomTorsionsFingerprints, | |
453 AtomIdentifierType,AtomicInvariantsAtomTypes,AtomicInvariantsToUse, | |
454 AS X BO H FC | |
455 | |
456 MolecularComplexityType,TopologicalPharmacophoreAtomPairsFingerprints, | |
457 AtomIdentifierType,FunctionalClassAtomTypes,FunctionalClassesToUse, | |
458 HBD HBA PI NI H,MinDistance,1,MaxDistance,10,NormalizationMethodology, | |
459 None | |
460 | |
461 MolecularComplexityType,TopologicalPharmacophoreAtomTripletsFingerprints, | |
462 AtomIdentifierType,FunctionalClassAtomTypes,FunctionalClassesToUse, | |
463 HBD HBA PI NI H Ar,MinDistance,1,MaxDistance,10,NormalizationMethodology, | |
464 None,UseTriangleInequality,Yes, | |
465 DistanceBinSize,2 | |
466 | |
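Options such as --MolecularComplexity, --RotatableBonds, and --TPSA all take a comma delimited list of name and value pairs. A minimal Python sketch of splitting such a list into a dictionary follows. It is illustrative only: `parse_name_value_pairs` is a hypothetical helper, and stripping whitespace around each token is an assumption.

```python
def parse_name_value_pairs(spec, defaults=None):
    """Split "Name,Value,Name,Value,..." into a dict.

    spec     -- the option value as typed on the command line
    defaults -- optional dict of default values to start from
    Values that contain spaces, such as "AS X BO H FC", stay intact
    because only commas separate tokens.
    """
    tokens = [token.strip() for token in spec.split(",")]
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of comma delimited values")
    params = dict(defaults or {})
    for name, value in zip(tokens[0::2], tokens[1::2]):
        params[name] = value
    return params
```

For example, `"MolecularComplexityType,MACCSKeys,MACCSKeysSize,166"` becomes a dictionary with two entries, and values remain strings until a caller converts them.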
467 --OutDelim *comma | tab | semicolon* | |
468 Delimiter for output CSV/TSV text file(s). Possible values: *comma, | |
469 tab, or semicolon*. Default value: *comma*. | |
470 | |
471 --output *SD | text | both* | |
472 Type of output files to generate. Possible values: *SD, text, or | |
473 both*. Default value: *text*. | |
474 | |
475 -o, --overwrite | |
476 Overwrite existing files. | |
477 | |
478 --Precision *Name,Number,[Name,Number,..]* | |
479 Precision of calculated property values in the output file: it's a | |
480 comma delimited list of property name and precision value pairs. | |
481 Possible property names: *MolecularWeight, ExactMass*. Possible | |
482 values: positive integers. Default value: *MolecularWeight,2, | |
483 ExactMass,4*. | |
484 | |
485 Examples: | |
486 | |
487 ExactMass,3 | |
488 MolecularWeight,1,ExactMass,2 | |
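The effect of a --Precision value on the written numbers can be sketched in Python. The defaults are taken from the text above; the function `format_properties` is hypothetical, and leaving all other properties unformatted is an assumption.

```python
def format_properties(values, precision_spec=""):
    """Format property values using a --Precision style specification.

    values         -- dict of property name -> number
    precision_spec -- e.g. "ExactMass,3" or "MolecularWeight,1,ExactMass,2"
    """
    # Defaults from the option description.
    precision = {"MolecularWeight": 2, "ExactMass": 4}
    tokens = [token.strip() for token in precision_spec.split(",") if token.strip()]
    for name, digits in zip(tokens[0::2], tokens[1::2]):
        precision[name] = int(digits)
    formatted = {}
    for name, value in values.items():
        if name in precision:
            formatted[name] = f"{value:.{precision[name]}f}"
        else:
            formatted[name] = str(value)
    return formatted
```

With the specification `ExactMass,3`, an exact mass is written with three decimals while the molecular weight keeps its default of two.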
489 | |
490 -q, --quote *Yes | No* | |
491 Put quotes around column values in output CSV/TSV text file(s). | |
492 Possible values: *Yes or No*. Default value: *Yes*. | |
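The combined effect of --OutDelim and --quote on a text file can be imitated with Python's standard csv module. This sketch only illustrates the output shape and is not the script's writer: `write_rows` is a hypothetical function, and mapping *No* to minimal quoting rather than no quoting at all is an assumption.

```python
import csv
import io

# Delimiter names from --OutDelim mapped to characters.
DELIMITERS = {"comma": ",", "tab": "\t", "semicolon": ";"}


def write_rows(rows, out_delim="comma", quote="Yes"):
    """Return rows rendered as delimited text.

    rows      -- list of lists of column values
    out_delim -- "comma", "tab", or "semicolon"
    quote     -- "Yes" quotes every value; "No" quotes only when needed
    """
    buffer = io.StringIO()
    quoting = csv.QUOTE_ALL if quote == "Yes" else csv.QUOTE_MINIMAL
    writer = csv.writer(
        buffer, delimiter=DELIMITERS[out_delim], quoting=quoting, lineterminator="\n"
    )
    writer.writerows(rows)
    return buffer.getvalue()
```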
493 | |
494 -r, --root *RootName* | |
495 New file name is generated using the root: <Root>.<Ext>. Default for | |
496 new file names: <SDFileName><PhysicochemicalProperties>.<Ext>. The | |
497 file type determines <Ext> value. The sdf, csv, and tsv <Ext> values | |
498 are used for SD, comma/semicolon, and tab delimited text files, | |
499 respectively. This option is ignored for multiple input files. | |
500 | |
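The file naming rules for --root, --output, and --OutDelim can be put together in a short Python sketch. It follows the description above but is not taken from the script: `output_file_names` is a hypothetical function, and it covers a single input file only, since the root name is ignored for multiple input files.

```python
from pathlib import Path


def output_file_names(sd_file, output="text", out_delim="comma", root=None):
    """Return the output file names for one input SD file.

    sd_file   -- input file name, e.g. "Sample.sdf"
    output    -- "SD", "text", or "both"
    out_delim -- "comma", "tab", or "semicolon"
    root      -- optional --root value
    """
    # Default root: <SDFileName>PhysicochemicalProperties
    base = root if root else Path(sd_file).stem + "PhysicochemicalProperties"
    # csv is used for comma and semicolon delimiters, tsv for tab.
    text_ext = "tsv" if out_delim == "tab" else "csv"
    names = []
    if output in ("SD", "both"):
        names.append(f"{base}.sdf")
    if output in ("text", "both"):
        names.append(f"{base}.{text_ext}")
    return names
```

For `Sample.sdf` with default options this yields `SamplePhysicochemicalProperties.csv`, the file name used in the first example of the EXAMPLES section.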
501 --RotatableBonds *Name,Value, [Name,Value,...]* | |
502 Parameters to control calculation of rotatable bonds [ Ref 92 ]: | |
503 it's a comma delimited list of parameter name and value pairs. | |
504 Possible parameter names: *IgnoreTerminalBonds, | |
505 IgnoreBondsToTripleBonds, IgnoreAmideBonds, IgnoreThioamideBonds, | |
506 IgnoreSulfonamideBonds*. Possible parameter values: *Yes or No*. By | |
507 default, value of all parameters is set to *Yes*. | |
508 | |
509 --RuleOf3Violations *Yes | No* | |
510 Specify whether to calculate RuleOf3Violations for SDFile(s). | |
511 Possible values: *Yes or No*. Default value: *No*. | |
512 | |
513 For *Yes* value of RuleOf3Violations, in addition to calculating | |
514 total number of RuleOf3 violations, individual violations for | |
515 compounds are also written to output files. | |
516 | |
517 RuleOf3 [ Ref 92 ] states: MolecularWeight <= 300, RotatableBonds <= | |
518 3, HydrogenBondDonors <= 3, HydrogenBondAcceptors <= 3, logP <= 3, | |
519 and TPSA <= 60. | |
520 | |
521 --RuleOf5Violations *Yes | No* | |
522 Specify whether to calculate RuleOf5Violations for SDFile(s). | |
523 Possible values: *Yes or No*. Default value: *No*. | |
524 | |
525 For *Yes* value of RuleOf5Violations, in addition to calculating | |
526 total number of RuleOf5 violations, individual violations for | |
527 compounds are also written to output files. | |
528 | |
529 RuleOf5 [ Ref 91 ] states: MolecularWeight <= 500, | |
530 HydrogenBondDonors <= 5, HydrogenBondAcceptors <= 10, and logP <= 5. | |
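The RuleOf3 and RuleOf5 limits stated above translate directly into a violation check. The Python sketch below uses those limits as written; the function `rule_violations` and the dict-based property input are hypothetical, and SLogP is used for the logP limit because it is the logP property the script calculates.

```python
# Upper limits from the RuleOf5 and RuleOf3 statements in the text.
RULE_OF_5_LIMITS = {
    "MolecularWeight": 500,
    "HydrogenBondDonors": 5,
    "HydrogenBondAcceptors": 10,
    "SLogP": 5,
}
RULE_OF_3_LIMITS = {
    "MolecularWeight": 300,
    "RotatableBonds": 3,
    "HydrogenBondDonors": 3,
    "HydrogenBondAcceptors": 3,
    "SLogP": 3,
    "TPSA": 60,
}


def rule_violations(properties, limits):
    """Return the names of properties that exceed their limit.

    properties -- dict of property name -> calculated value
    limits     -- RULE_OF_5_LIMITS or RULE_OF_3_LIMITS
    The total number of violations is the length of the returned list.
    """
    return [name for name, limit in limits.items() if properties[name] > limit]
```

A value exactly at a limit, such as a molecular weight of 500 for RuleOf5, is not a violation because the rules are stated with <=.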
531 | |
532 --TPSA *Name,Value, [Name,Value,...]* | |
533 Parameters to control calculation of TPSA: it's a comma delimited | |
534 list of parameter name and value pairs. Possible parameter names: | |
535 *IgnorePhosphorus, IgnoreSulfur*. Possible parameter values: *Yes or | |
536 No*. By default, value of all parameters is set to *Yes*. | |
537 | |
538 By default, TPSA atom contributions from Phosphorus and Sulfur atoms | |
539 are not included during TPSA calculations. [ Ref 91 ] | |
540 | |
541 -w, --WorkingDir *DirName* | |
542 Location of working directory. Default value: current directory. | |
543 | |
544 EXAMPLES | |
545 To calculate default set of physicochemical properties - | |
546 MolecularWeight, HeavyAtoms, MolecularVolume, RotatableBonds, | |
547 HydrogenBondDonors, HydrogenBondAcceptors, SLogP, TPSA - and generate a | |
548 SamplePhysicochemicalProperties.csv file containing sequential compound | |
549 IDs along with properties data, type: | |
550 | |
551 % CalculatePhysicochemicalProperties.pl -o Sample.sdf | |
552 | |
553 To calculate all available physicochemical properties and generate both | |
554 SampleAllProperties.csv and SampleAllProperties.sdf files containing | |
555 sequential compound IDs in CSV file along with properties data, type: | |
556 | |
557 % CalculatePhysicochemicalProperties.pl -m All --output both | |
558 -r SampleAllProperties -o Sample.sdf | |
559 | |
560 To calculate RuleOf5 physicochemical properties and generate a | |
561 SampleRuleOf5Properties.csv file containing sequential compound IDs | |
562 along with properties data, type: | |
563 | |
564 % CalculatePhysicochemicalProperties.pl -m RuleOf5 | |
565 -r SampleRuleOf5Properties -o Sample.sdf | |
566 | |
567 To calculate RuleOf5 physicochemical properties along with counting | |
568 RuleOf5 violations and generate a SampleRuleOf5Properties.csv file | |
569 containing sequential compound IDs along with properties data, type: | |
570 | |
571 % CalculatePhysicochemicalProperties.pl -m RuleOf5 --RuleOf5Violations Yes | |
572 -r SampleRuleOf5Properties -o Sample.sdf | |
573 | |
574 To calculate RuleOf3 physicochemical properties and generate a | |
575 SampleRuleOf3Properties.csv file containing sequential compound IDs | |
576 along with properties data, type: | |
577 | |
578 % CalculatePhysicochemicalProperties.pl -m RuleOf3 | |
579 -r SampleRuleOf3Properties -o Sample.sdf | |
580 | |
581 To calculate RuleOf3 physicochemical properties along with counting | |
582 RuleOf3 violations and generate a SampleRuleOf3Properties.csv file | |
583 containing sequential compound IDs along with properties data, type: | |
584 | |
585 % CalculatePhysicochemicalProperties.pl -m RuleOf3 --RuleOf3Violations Yes | |
586 -r SampleRuleOf3Properties -o Sample.sdf | |
587 | |
588 To calculate a specific set of physicochemical properties and generate a | |
589 SampleProperties.csv file containing sequential compound IDs along with | |
590 properties data, type: | |
591 | |
592 % CalculatePhysicochemicalProperties.pl -m "Rings,AromaticRings" | |
593 -r SampleProperties -o Sample.sdf | |
594 | |
595 To calculate HydrogenBondDonors and HydrogenBondAcceptors using | |
596 HydrogenBondsType1 definition and generate a SampleProperties.csv file | |
597 containing sequential compound IDs along with properties data, type: | |
598 | |
599 % CalculatePhysicochemicalProperties.pl -m "HydrogenBondDonors,HydrogenBondAcceptors" | |
600 --HydrogenBonds HBondsType1 -r SampleProperties -o Sample.sdf | |
601 | |
602 To calculate TPSA using sulfur and phosphorus atoms along with nitrogen | |
603 and oxygen atoms and generate a SampleProperties.csv file containing | |
604 sequential compound IDs along with properties data, type: | |
605 | |
606 % CalculatePhysicochemicalProperties.pl -m "TPSA" --TPSA "IgnorePhosphorus,No, | |
607 IgnoreSulfur,No" -r SampleProperties -o Sample.sdf | |
608 | |
609 To calculate MolecularComplexity using extended connectivity | |
610 fingerprints corresponding to atom neighborhood radius of 2 with atomic | |
611 invariant atom types without any scaling and generate a | |
612 SampleProperties.csv file containing sequential compound IDs along with | |
613 properties data, type: | |
614 | |
615 % CalculatePhysicochemicalProperties.pl -m MolecularComplexity --MolecularComplexity | |
616 "MolecularComplexityType,ExtendedConnectivityFingerprints,NeighborhoodRadius,2, | |
617 AtomIdentifierType, AtomicInvariantsAtomTypes, | |
618 AtomicInvariantsToUse,AS X BO H FC MN,NormalizationMethodology,None" | |
619 -r SampleProperties -o Sample.sdf | |
620 | |
621 To calculate RuleOf5 physicochemical properties along with counting | |
622 RuleOf5 violations and generate a SampleRuleOf5Properties.csv file | |
623 containing compound IDs from molecule name line along with properties | |
624 data, type: | |
625 | |
626 % CalculatePhysicochemicalProperties.pl -m RuleOf5 --RuleOf5Violations Yes | |
627 --DataFieldsMode CompoundID --CompoundIDMode MolName | |
628 -r SampleRuleOf5Properties -o Sample.sdf | |
629 | |
630 To calculate all available physicochemical properties and generate a | |
631 SampleAllProperties.csv file containing compound ID using specified data | |
632 field along with properties data, type: | |
633 | |
634 % CalculatePhysicochemicalProperties.pl -m All | |
635 --DataFieldsMode CompoundID --CompoundIDMode DataField --CompoundID Mol_ID | |
636 -r SampleAllProperties -o Sample.sdf | |
637 | |
638 To calculate all available physicochemical properties and generate a | |
639 SampleAllProperties.csv file containing compound ID using combination of | |
640 molecule name line and an explicit compound prefix along with properties | |
641 data, type: | |
642 | |
643 % CalculatePhysicochemicalProperties.pl -m All | |
644 --DataFieldsMode CompoundID --CompoundIDMode MolNameOrLabelPrefix | |
645 --CompoundID Cmpd --CompoundIDLabel MolID -r SampleAllProperties | |
646 -o Sample.sdf | |
647 | |
648 To calculate all available physicochemical properties and generate a | |
649 SampleAllProperties.csv file containing specific data fields columns | |
650 along with properties data, type: | |
651 | |
652 % CalculatePhysicochemicalProperties.pl -m All | |
653 --DataFieldsMode Specify --DataFields Mol_ID -r SampleAllProperties | |
654 -o Sample.sdf | |
655 | |
656 To calculate all available physicochemical properties and generate a | |
657 SampleAllProperties.csv file containing common data fields columns along | |
658 with properties data, type: | |
659 | |
660 % CalculatePhysicochemicalProperties.pl -m All | |
661 --DataFieldsMode Common -r SampleAllProperties -o Sample.sdf | |
662 | |
663 To calculate all available physicochemical properties and generate both | |
664 SampleAllProperties.csv and SampleAllProperties.sdf files containing all | |
665 data fields columns in the CSV file along with properties data, type: | |
666 | |
667 % CalculatePhysicochemicalProperties.pl -m All | |
668 --DataFieldsMode All --output both -r SampleAllProperties | |
669 -o Sample.sdf | |
670 | |
671 AUTHOR | |
672 Manish Sud <msud@san.rr.com> | |
673 | |
674 SEE ALSO | |
675 ExtractFromSDFiles.pl, ExtractFromTextFiles.pl, InfoSDFiles.pl, | |
676 InfoTextFiles.pl | |
677 | |
678 COPYRIGHT | |
679 Copyright (C) 2015 Manish Sud. All rights reserved. | |
680 | |
681 This file is part of MayaChemTools. | |
682 | |
683 MayaChemTools is free software; you can redistribute it and/or modify it | |
684 under the terms of the GNU Lesser General Public License as published by | |
685 the Free Software Foundation; either version 3 of the License, or (at | |
686 your option) any later version. | |
687 |