comparison mayachemtool/mayachemtools/docs/modules/txt/MolecularComplexityDescriptors.txt @ 0:68300206e90d draft default tip
Uploaded
| author | deepakjadmin |
|---|---|
| date | Thu, 05 Nov 2015 02:41:30 -0500 |
| parents | |
| children |
comparison
| -1:000000000000 | 0:68300206e90d |
|---|---|
| 1 NAME | |
| 2 MolecularComplexityDescriptors | |
| 3 | |
| 4 SYNOPSIS | |
| 5 use MolecularDescriptors::MolecularComplexityDescriptors; | |
| 6 | |
| 7 use MolecularDescriptors::MolecularComplexityDescriptors qw(:all); | |
| 8 | |
| 9 DESCRIPTION | |
| 10 MolecularComplexityDescriptors class provides the following methods: | |
| 11 | |
| 12 new, GenerateDescriptors, GetDescriptorNames, | |
| 13 GetMolecularComplexityTypeAbbreviation, SetMACCSKeysSize, | |
| 14 SetAtomIdentifierType, SetAtomicInvariantsToUse, SetDistanceBinSize, | |
| 15 SetFunctionalClassesToUse, SetMaxDistance, SetMaxPathLength, | |
| 16 SetMinDistance, SetMinPathLength, SetMolecularComplexityType, | |
| 17 SetNeighborhoodRadius, SetNormalizationMethodology, | |
| 18 StringifyMolecularComplexityDescriptors | |
| 19 | |
| 20 MolecularComplexityDescriptors is derived from MolecularDescriptors | |
| 21 class which in turn is derived from ObjectProperty base class that | |
| 22 provides methods not explicitly defined in | |
| 23 MolecularComplexityDescriptors, MolecularDescriptors or ObjectProperty | |
| 24 classes using Perl's AUTOLOAD functionality. These methods are generated | |
| 25 on-the-fly for a specified object property: | |
| 26 | |
| 27 Set<PropertyName>(<PropertyValue>); | |
| 28 $PropertyValue = Get<PropertyName>(); | |
| 29 Delete<PropertyName>(); | |
| 30 | |
| 31 The current release of MayaChemTools supports calculation of molecular | |
| 32 complexity using *MolecularComplexityType* parameter corresponding to | |
| 33 number of bits-set or unique keys [ Ref 117-119 ] in molecular | |
| 34 fingerprints. The valid values for *MolecularComplexityType* are: | |
| 35 | |
| 36 AtomTypesFingerprints | |
| 37 ExtendedConnectivityFingerprints | |
| 38 MACCSKeys | |
| 39 PathLengthFingerprints | |
| 40 TopologicalAtomPairsFingerprints | |
| 41 TopologicalAtomTripletsFingerprints | |
| 42 TopologicalAtomTorsionsFingerprints | |
| 43 TopologicalPharmacophoreAtomPairsFingerprints | |
| 44 TopologicalPharmacophoreAtomTripletsFingerprints | |
| 45 | |
| 46 Default value for *MolecularComplexityType*: *MACCSKeys*. | |
| 47 | |
| 48 *AtomIdentifierType* parameter name corresponds to atom types used | |
| 49 during generation of fingerprints. The valid values for | |
| 50 *AtomIdentifierType* are: *AtomicInvariantsAtomTypes, DREIDINGAtomTypes, | |
| 51 EStateAtomTypes, FunctionalClassAtomTypes, MMFF94AtomTypes, | |
| 52 SLogPAtomTypes, SYBYLAtomTypes, TPSAAtomTypes, UFFAtomTypes*. | |
| 53 *AtomicInvariantsAtomTypes* is not supported for following values of | |
| 54 *MolecularComplexityType*: *MACCSKeys, | |
| 55 TopologicalPharmacophoreAtomPairsFingerprints, | |
| 56 TopologicalPharmacophoreAtomTripletsFingerprints*. | |
| 57 *FunctionalClassAtomTypes* is the only valid value of | |
| 58 *AtomIdentifierType* for topological pharmacophore fingerprints. | |
| 59 | |
| 60 Default value for *AtomIdentifierType*: *AtomicInvariantsAtomTypes* for | |
| 61 all fingerprints; *FunctionalClassAtomTypes* for topological | |
| 62 pharmacophore fingerprints. | |
| 63 | |
| 64 *AtomicInvariantsToUse* parameter name and values are used during | |
| 65 *AtomicInvariantsAtomTypes* value of parameter *AtomIdentifierType*. | |
| 66 It's a list of space separated valid atomic invariant atom types. | |
| 67 | |
| 68 Possible values for atomic invariants are: *AS, X, BO, LBO, SB, DB, TB, | |
| 69 H, Ar, RA, FC, MN, SM*. Default values for *AtomicInvariantsToUse* | |
| 70 parameter are set differently for different fingerprints using | |
| 71 *MolecularComplexityType* parameter as shown below: | |
| 72 | |
| 73 MolecularComplexityType AtomicInvariantsToUse | |
| 74 | |
| 75 AtomTypesFingerprints AS X BO H FC | |
| 76 TopologicalAtomPairsFingerprints AS X BO H FC | |
| 77 TopologicalAtomTripletsFingerprints AS X BO H FC | |
| 78 TopologicalAtomTorsionsFingerprints AS X BO H FC | |
| 79 | |
| 80 ExtendedConnectivityFingerprints AS X BO H FC MN | |
| 81 PathLengthFingerprints AS | |
| 82 | |
| 83 *FunctionalClassesToUse* parameter name and values are used during | |
| 84 *FunctionalClassAtomTypes* value of parameter *AtomIdentifierType*. It's | |
| 85 a list of space separated valid atom functional classes. | |
| 86 | |
| 87 Possible values for atom functional classes are: *Ar, CA, H, HBA, HBD, | |
| 88 Hal, NI, PI, RA*. | |
| 89 | |
| 90 Default value for *FunctionalClassesToUse* parameter is set to: | |
| 91 | |
| 92 HBD HBA PI NI Ar Hal | |
| 93 | |
| 94 for all fingerprints except for the following two | |
| 95 *MolecularComplexityType* fingerprints: | |
| 96 | |
| 97 MolecularComplexityType FunctionalClassesToUse | |
| 98 | |
| 99 TopologicalPharmacophoreAtomPairsFingerprints HBD HBA PI NI H | |
| 100 TopologicalPharmacophoreAtomTripletsFingerprints HBD HBA PI NI H Ar | |
| 101 | |
| 102 *MACCSKeysSize* parameter name is only used during *MACCSKeys* value of | |
| 103 *MolecularComplexityType* and corresponds to size of MACCS key set. | |
| 104 Possible values: *166 or 322*. Default value: *166*. | |
| 105 | |
| 106 *NeighborhoodRadius* parameter name is only used during | |
| 107 *ExtendedConnectivityFingerprints* value of *MolecularComplexityType* | |
| 108 and corresponds to atomic neighborhoods radius for generating extended | |
| 109 connectivity fingerprints. Possible values: positive integer. Default | |
| 110 value: *2*. | |
| 111 | |
| 112 *MinPathLength* and *MaxPathLength* parameters are only used during | |
| 113 *PathLengthFingerprints* value of *MolecularComplexityType* and | |
| 114 correspond to minimum and maximum path lengths to use for generating | |
| 115 path length fingerprints. Possible values: positive integers. Default | |
| 116 value: *MinPathLength - 1*; *MaxPathLength - 8*. | |
| 117 | |
| 118 *UseBondSymbols* parameter is only used during *PathLengthFingerprints* | |
| 119 value of *MolecularComplexityType* and indicates whether bond symbols | |
| 120 are included in atom path strings used to generate path length | |
| 121 fingerprints. Possible values: *Yes or No*. Default value: *Yes*. | |
| 122 | |
| 123 *MinDistance* and *MaxDistance* parameters are only used during | |
| 124 *TopologicalAtomPairsFingerprints* and | |
| 125 *TopologicalAtomTripletsFingerprints* values of | |
| 126 *MolecularComplexityType* and correspond to minimum and maximum bond | |
| 127 distance between atom pairs during generation of these | |
| 128 fingerprints. Possible values: positive integers. Default value: | |
| 129 *MinDistance - 1*; *MaxDistance - 10*. | |
| 130 | |
| 131 *UseTriangleInequality* parameter is used during these values for | |
| 132 *MolecularComplexityType*: *TopologicalAtomTripletsFingerprints* and | |
| 133 *TopologicalPharmacophoreAtomTripletsFingerprints*. Possible values: | |
| 134 *Yes or No*. It determines whether to apply triangle inequality to | |
| 135 distance triplets. Default value: *TopologicalAtomTripletsFingerprints - | |
| 136 No*; *TopologicalPharmacophoreAtomTripletsFingerprints - Yes*. | |
| 137 | |
| 138 *DistanceBinSize* parameter is used during | |
| 139 *TopologicalPharmacophoreAtomTripletsFingerprints* value of | |
| 140 *MolecularComplexityType* and corresponds to distance bin size used for | |
| 141 binning distances during generation of topological pharmacophore atom | |
| 142 triplets fingerprints. Possible value: positive integer. Default value: | |
| 143 *2*. | |
| 144 | |
| 145 *NormalizationMethodology* is only used for these values for | |
| 146 *MolecularComplexityType*: *ExtendedConnectivityFingerprints*, | |
| 147 *TopologicalPharmacophoreAtomPairsFingerprints* and | |
| 148 *TopologicalPharmacophoreAtomTripletsFingerprints*. It corresponds to | |
| 149 normalization methodology to use for scaling the number of bits-set or | |
| 150 unique keys during generation of fingerprints. Possible values during | |
| 151 *ExtendedConnectivityFingerprints*: *None or ByHeavyAtomsCount*; Default | |
| 152 value: *None*. Possible values during topological pharmacophore atom | |
| 153 pairs and triplets fingerprints: *None or ByPossibleKeysCount*; Default | |
| 154 value: *None*. *ByPossibleKeysCount* corresponds to total number of | |
| 155 possible topological pharmacophore atom pairs or triplets in a molecule. | |
| 156 | |
| 157 METHODS | |
| 158 new | |
| 159 $NewMolecularComplexityDescriptors = new MolecularDescriptors:: | |
| 160 MolecularComplexityDescriptors( | |
| 161 %NamesAndValues); | |
| 162 | |
| 163 Using specified *MolecularComplexityDescriptors* property names and | |
| 164 values hash, new method creates a new object and returns a reference | |
| 165 to newly created MolecularComplexityDescriptors object. By default, | |
| 166 the following properties are initialized: | |
| 167 | |
| 168 Molecule = '' | |
| 169 Type = 'MolecularComplexity' | |
| 170 MolecularComplexityType = 'MACCSKeys' | |
| 171 AtomIdentifierType = '' | |
| 172 MACCSKeysSize = 166 | |
| 173 NeighborhoodRadius = 2 | |
| 174 MinPathLength = 1 | |
| 175 MaxPathLength = 8 | |
| 176 UseBondSymbols = 1 | |
| 177 MinDistance = 1 | |
| 178 MaxDistance = 10 | |
| 179 UseTriangleInequality = '' | |
| 180 DistanceBinSize = 2 | |
| 181 NormalizationMethodology = 'None' | |
| 182 @DescriptorNames = ('MolecularComplexity') | |
| 183 @DescriptorValues = ('None') | |
| 184 | |
| 185 Examples: | |
| 186 | |
| 187 $MolecularComplexityDescriptors = new MolecularDescriptors:: | |
| 188 MolecularComplexityDescriptors( | |
| 189 'Molecule' => $Molecule); | |
| 190 | |
| 191 $MolecularComplexityDescriptors = new MolecularDescriptors:: | |
| 192 MolecularComplexityDescriptors(); | |
| 193 | |
| 194 $MolecularComplexityDescriptors->SetMolecule($Molecule); | |
| 195 $MolecularComplexityDescriptors->GenerateDescriptors(); | |
| 196 print "MolecularComplexityDescriptors: $MolecularComplexityDescriptors\n"; | |
| 197 | |
| 198 GenerateDescriptors | |
| 199 $MolecularComplexityDescriptors->GenerateDescriptors(); | |
| 200 | |
| 201 Calculates MolecularComplexity value for a molecule and returns | |
| 202 *MolecularComplexityDescriptors*. | |
| 203 | |
| 204 GetDescriptorNames | |
| 205 @DescriptorNames = $MolecularComplexityDescriptors->GetDescriptorNames(); | |
| 206 @DescriptorNames = MolecularDescriptors::MolecularComplexityDescriptors:: | |
| 207 GetDescriptorNames(); | |
| 208 | |
| 209 Returns all available descriptor names as an array. | |
| 210 | |
| 211 GetMolecularComplexityTypeAbbreviation | |
| 212 $Abbrev = $MolecularComplexityDescriptors-> | |
| 213 GetMolecularComplexityTypeAbbreviation(); | |
| 214 $Abbrev = MolecularDescriptors::MolecularComplexityDescriptors:: | |
| 215 GetMolecularComplexityTypeAbbreviation($ComplexityType); | |
| 216 | |
| 217 Returns the abbreviation for a specified molecular complexity type or | |
| 218 for the type set on a *MolecularComplexityDescriptors* object. | |
| 219 | |
| 220 SetMACCSKeysSize | |
| 221 $MolecularComplexityDescriptors->SetMACCSKeysSize($Size); | |
| 222 | |
| 223 Sets MACCS keys size and returns *MolecularComplexityDescriptors*. | |
| 224 | |
| 225 SetAtomIdentifierType | |
| 226 $MolecularComplexityDescriptors->SetAtomIdentifierType($IdentifierType); | |
| 227 | |
| 228 Sets atom *IdentifierType* to use during fingerprints generation | |
| 229 corresponding to *MolecularComplexityType* and returns | |
| 230 *MolecularComplexityDescriptors*. | |
| 231 | |
| 232 Possible values: *AtomicInvariantsAtomTypes, DREIDINGAtomTypes, | |
| 233 EStateAtomTypes, FunctionalClassAtomTypes, MMFF94AtomTypes, | |
| 234 SLogPAtomTypes, SYBYLAtomTypes, TPSAAtomTypes, UFFAtomTypes*. | |
| 235 | |
| 236 SetAtomicInvariantsToUse | |
| 237 $MolecularComplexityDescriptors->SetAtomicInvariantsToUse($ValuesRef); | |
| 238 $MolecularComplexityDescriptors->SetAtomicInvariantsToUse(@Values); | |
| 239 | |
| 240 Sets atomic invariants to use during *AtomicInvariantsAtomTypes* | |
| 241 value of *AtomIdentifierType* for fingerprints generation and | |
| 242 returns *MolecularComplexityDescriptors*. | |
| 243 | |
| 244 Possible values for atomic invariants are: *AS, X, BO, LBO, SB, DB, | |
| 245 TB, H, Ar, RA, FC, MN, SM*. Default value [ Ref 24 ]: | |
| 246 *AS,X,BO,H,FC,MN*. | |
| 247 | |
| 248 The atomic invariants abbreviations correspond to: | |
| 249 | |
| 250 AS = Atom symbol corresponding to element symbol | |
| 251 | |
| 252 X<n> = Number of non-hydrogen atom neighbors or heavy atoms | |
| 253 BO<n> = Sum of bond orders to non-hydrogen atom neighbors or heavy atoms | |
| 254 LBO<n> = Largest bond order of non-hydrogen atom neighbors or heavy atoms | |
| 255 SB<n> = Number of single bonds to non-hydrogen atom neighbors or heavy atoms | |
| 256 DB<n> = Number of double bonds to non-hydrogen atom neighbors or heavy atoms | |
| 257 TB<n> = Number of triple bonds to non-hydrogen atom neighbors or heavy atoms | |
| 258 H<n> = Number of implicit and explicit hydrogens for atom | |
| 259 Ar = Aromatic annotation indicating whether atom is aromatic | |
| 260 RA = Ring atom annotation indicating whether atom is in a ring | |
| 261 FC<+n/-n> = Formal charge assigned to atom | |
| 262 MN<n> = Mass number indicating isotope other than most abundant isotope | |
| 263 SM<n> = Spin multiplicity of atom. Possible values: 1 (singlet), 2 (doublet) or | |
| 264 3 (triplet) | |
| 265 | |
| 266 Atom type generated by AtomTypes::AtomicInvariantsAtomTypes class | |
| 267 corresponds to: | |
| 268 | |
| 269 AS.X<n>.BO<n>.LBO<n>.SB<n>.DB<n>.TB<n>.H<n>.Ar.RA.FC<+n/-n>.MN<n>.SM<n> | |
| 270 | |
| 271 Except for AS which is a required atomic invariant in atom types, | |
| 272 all other atomic invariants are optional. Atom type specification | |
| 273 doesn't include atomic invariants with zero or undefined values. | |
| 274 | |
| 275 In addition to usage of abbreviations for specifying atomic | |
| 276 invariants, the following descriptive words are also allowed: | |
| 277 | |
| 278 X : NumOfNonHydrogenAtomNeighbors or NumOfHeavyAtomNeighbors | |
| 279 BO : SumOfBondOrdersToNonHydrogenAtoms or SumOfBondOrdersToHeavyAtoms | |
| 280 LBO : LargestBondOrderToNonHydrogenAtoms or LargestBondOrderToHeavyAtoms | |
| 281 SB : NumOfSingleBondsToNonHydrogenAtoms or NumOfSingleBondsToHeavyAtoms | |
| 282 DB : NumOfDoubleBondsToNonHydrogenAtoms or NumOfDoubleBondsToHeavyAtoms | |
| 283 TB : NumOfTripleBondsToNonHydrogenAtoms or NumOfTripleBondsToHeavyAtoms | |
| 284 H : NumOfImplicitAndExplicitHydrogens | |
| 285 Ar : Aromatic | |
| 286 RA : RingAtom | |
| 287 FC : FormalCharge | |
| 288 MN : MassNumber | |
| 289 SM : SpinMultiplicity | |
| 290 | |
| 291 *AtomTypes::AtomicInvariantsAtomTypes* module is used to assign | |
| 292 atomic invariant atom types. | |
| 293 | |
| 294 SetDistanceBinSize | |
| 295 $MolecularComplexityDescriptors->SetDistanceBinSize($BinSize); | |
| 296 | |
| 297 Sets distance bin size used to bin distances between atom pairs in | |
| 298 atom triplets for topological pharmacophore atom triplets | |
| 299 fingerprints generation and returns | |
| 300 *MolecularComplexityDescriptors*. | |
| 301 | |
| 302 SetFunctionalClassesToUse | |
| 303 $MolecularComplexityDescriptors->SetFunctionalClassesToUse($ValuesRef); | |
| 304 $MolecularComplexityDescriptors->SetFunctionalClassesToUse(@Values); | |
| 305 | |
| 306 Sets functional classes to use during | |
| 307 *FunctionalClassAtomTypes* value of *AtomIdentifierType* for | |
| 308 fingerprints generation and returns | |
| 309 *MolecularComplexityDescriptors*. | |
| 310 | |
| 311 Possible values for atom functional classes are: *Ar, CA, H, HBA, | |
| 312 HBD, Hal, NI, PI, RA*. Default value [ Ref 24 ]: | |
| 313 *HBD,HBA,PI,NI,Ar,Hal*. | |
| 314 | |
| 315 The functional class abbreviations correspond to: | |
| 316 | |
| 317 HBD: HydrogenBondDonor | |
| 318 HBA: HydrogenBondAcceptor | |
| 319 PI : PositivelyIonizable | |
| 320 NI : NegativelyIonizable | |
| 321 Ar : Aromatic | |
| 322 Hal : Halogen | |
| 323 H : Hydrophobic | |
| 324 RA : RingAtom | |
| 325 CA : ChainAtom | |
| 326 | |
| 327 Functional class atom type specification for an atom corresponds to: | |
| 328 | |
| 329 Ar.CA.H.HBA.HBD.Hal.NI.PI.RA or None | |
| 330 | |
| 331 *AtomTypes::FunctionalClassAtomTypes* module is used to assign | |
| 332 functional class atom types. It uses following definitions [ Ref | |
| 333 60-61, Ref 65-66 ]: | |
| 334 | |
| 335 HydrogenBondDonor: NH, NH2, OH | |
| 336 HydrogenBondAcceptor: N[!H], O | |
| 337 PositivelyIonizable: +, NH2 | |
| 338 NegativelyIonizable: -, C(=O)OH, S(=O)OH, P(=O)OH | |
| 339 | |
| 340 SetMaxDistance | |
| 341 $MolecularComplexityDescriptors->SetMaxDistance($MaxDistance); | |
| 342 | |
| 343 Sets maximum distance to use during topological atom pairs and | |
| 344 triplets fingerprints generation and returns | |
| 345 *MolecularComplexityDescriptors*. | |
| 346 | |
| 347 SetMaxPathLength | |
| 348 $MolecularComplexityDescriptors->SetMaxPathLength($Length); | |
| 349 | |
| 350 Sets maximum path length to use during path length fingerprints | |
| 351 generation and returns *MolecularComplexityDescriptors*. | |
| 352 | |
| 353 SetMinDistance | |
| 354 $MolecularComplexityDescriptors->SetMinDistance($MinDistance); | |
| 355 | |
| 356 Sets minimum distance to use during topological atom pairs and | |
| 357 triplets fingerprints generation and returns | |
| 358 *MolecularComplexityDescriptors*. | |
| 359 | |
| 360 SetMinPathLength | |
| 361 $MolecularComplexityDescriptors->SetMinPathLength($MinPathLength); | |
| 362 | |
| 363 Sets minimum path length to use during path length fingerprints | |
| 364 generation and returns *MolecularComplexityDescriptors*. | |
| 365 | |
| 366 SetMolecularComplexityType | |
| 367 $MolecularComplexityDescriptors->SetMolecularComplexityType($ComplexityType); | |
| 368 | |
| 369 Sets molecular complexity type to use for calculating its value and | |
| 370 returns *MolecularComplexityDescriptors*. | |
| 371 | |
| 372 SetNeighborhoodRadius | |
| 373 $MolecularComplexityDescriptors->SetNeighborhoodRadius($Radius); | |
| 374 | |
| 375 Sets neighborhood radius to use during extended connectivity | |
| 376 fingerprints generation and returns | |
| 377 *MolecularComplexityDescriptors*. | |
| 378 | |
| 379 SetNormalizationMethodology | |
| 380 $MolecularComplexityDescriptors->SetNormalizationMethodology($Methodology); | |
| 381 | |
| 382 Sets normalization methodology to use during calculation of | |
| 383 molecular complexity corresponding to extended connectivity, | |
| 384 topological pharmacophore atom pairs and triplets fingerprints and | |
| 385 returns *MolecularComplexityDescriptors*. | |
| 386 | |
| 387 StringifyMolecularComplexityDescriptors | |
| 388 $String = $MolecularComplexityDescriptors-> | |
| 389 StringifyMolecularComplexityDescriptors(); | |
| 390 | |
| 391 Returns a string containing information about | |
| 392 *MolecularComplexityDescriptors* object. | |
| 393 | |
| 394 AUTHOR | |
| 395 Manish Sud <msud@san.rr.com> | |
| 396 | |
| 397 SEE ALSO | |
| 398 MolecularDescriptors.pm, MolecularDescriptorsGenerator.pm | |
| 399 | |
| 400 COPYRIGHT | |
| 401 Copyright (C) 2015 Manish Sud. All rights reserved. | |
| 402 | |
| 403 This file is part of MayaChemTools. | |
| 404 | |
| 405 MayaChemTools is free software; you can redistribute it and/or modify it | |
| 406 under the terms of the GNU Lesser General Public License as published by | |
| 407 the Free Software Foundation; either version 3 of the License, or (at | |
| 408 your option) any later version. | |
| 409 |

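The DESCRIPTION section states that Set<PropertyName>, Get<PropertyName> and Delete<PropertyName> methods are generated on-the-fly through Perl's AUTOLOAD functionality. The following is a minimal, self-contained sketch of how such a mechanism can work. The package name SketchObjectProperty and its hash-based storage are assumptions made for illustration only; this is not the MayaChemTools ObjectProperty implementation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for an ObjectProperty-like base class (illustration only).
package SketchObjectProperty;

our $AUTOLOAD;

# Create an object from a hash of property names and values.
sub new {
    my ($Class, %NamesAndValues) = @_;
    my $This = { %NamesAndValues };
    return bless $This, $Class;
}

# Called by Perl for any method that is not explicitly defined.
sub AUTOLOAD {
    my ($This, @Args) = @_;
    my $MethodName = $AUTOLOAD;
    $MethodName =~ s/.*:://;                # strip the package qualifier
    return if $MethodName eq 'DESTROY';     # ignore object cleanup calls

    if ($MethodName =~ /^Set(\w+)$/) {
        $This->{$1} = $Args[0];             # store the property value
        return $This;                       # return the object, as the Set methods above do
    }
    if ($MethodName =~ /^Get(\w+)$/) {
        return $This->{$1};                 # undef when the property is not set
    }
    if ($MethodName =~ /^Delete(\w+)$/) {
        delete $This->{$1};
        return $This;
    }
    die "Error: unknown method $MethodName\n";
}

package main;

my $Object = SketchObjectProperty->new('MACCSKeysSize' => 166);
$Object->SetNeighborhoodRadius(2);
print "MACCSKeysSize: ", $Object->GetMACCSKeysSize(), "\n";
print "NeighborhoodRadius: ", $Object->GetNeighborhoodRadius(), "\n";

$Object->DeleteNeighborhoodRadius();
print "NeighborhoodRadius defined: ",
    (defined $Object->GetNeighborhoodRadius() ? "Yes" : "No"), "\n";
```

In this sketch, none of SetNeighborhoodRadius, GetMACCSKeysSize or DeleteNeighborhoodRadius is written out by hand; each call falls through to AUTOLOAD, which derives the property name from the method name. A real class would likely validate property names rather than accept any suffix.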