comparison docs/scripts/txt/TopologicalPharmacophoreAtomTripletsFingerprints.txt @ 0:4816e4a8ae95 draft default tip
Uploaded
| author | deepakjadmin |
|---|---|
| date | Wed, 20 Jan 2016 09:23:18 -0500 |
| parents | |
| children |
comparison
| -1:000000000000 | 0:4816e4a8ae95 |
|---|---|
| 1 NAME | |
| 2 TopologicalPharmacophoreAtomTripletsFingerprints.pl - Generate | |
| 3 topological pharmacophore atom triplets fingerprints for SD files | |
| 4 | |
| 5 SYNOPSIS | |
| 6 TopologicalPharmacophoreAtomTripletsFingerprints.pl SDFile(s)... | |
| 7 | |
| 8 TopologicalPharmacophoreAtomTripletsFingerprints.pl [--AromaticityModel | |
| 9 *AromaticityModelType*] [--AtomTripletsSetSizeToUse *ArbitrarySize | | |
| 10 FixedSize*] [-a, --AtomTypesToUse *"AtomType1, AtomType2..."*] | |
| 11 [--AtomTypesWeight *"AtomType1, Weight1, AtomType2, Weight2..."*] | |
| 12 [--CompoundID *DataFieldName or LabelPrefixString*] [--CompoundIDLabel | |
| 13 *text*] [--CompoundIDMode] [--DataFields *"FieldLabel1, | |
| 14 FieldLabel2,..."*] [-d, --DataFieldsMode *All | Common | Specify | | |
| 15 CompoundID*] [--DistanceBinSize *number*] [-f, --Filter *Yes | No*] | |
| 16 [--FingerprintsLabelMode *FingerprintsLabelOnly | | |
| 17 FingerprintsLabelWithIDs*] [--FingerprintsLabel *text*] [-h, --help] | |
| 18 [-k, --KeepLargestComponent *Yes | No*] [--MinDistance *number*] | |
| 19 [--MaxDistance *number*] [--OutDelim *comma | tab | semicolon*] | |
| 20 [--output *SD | FP | text | all*] [-o, --overwrite] [-q, --quote *Yes | | |
| 21 No*] [-r, --root *RootName*] [-u, --UseTriangleInequality *Yes | No*] | |
| 22 [-v, --VectorStringFormat *ValuesString | IDsAndValuesString | | |
| 23 IDsAndValuesPairsString | ValuesAndIDsString | ValuesAndIDsPairsString*] | |
| 24 [-w, --WorkingDir dirname] SDFile(s)... | |
| 25 | |
| 26 DESCRIPTION | |
| 27 Generate topological pharmacophore atom triplets fingerprints [ Ref 66, | |
| 28 Ref 68-71 ] for *SDFile(s)* and create appropriate SD, FP or CSV/TSV | |
| 29 text file(s) containing fingerprints vector strings corresponding to | |
| 30 molecular fingerprints. | |
| 31 | |
| 32 Multiple SDFile names are separated by spaces. The valid file extensions | |
| 33 are *.sdf* and *.sd*. All other file names are ignored. All the SD files | |
| 34 in a current directory can be specified either by **.sdf* or the current | |
| 35 directory name. | |
| 36 | |
| 37 Based on the values specified for --AtomTypesToUse, pharmacophore atom | |
| 38 types are assigned to all non-hydrogen atoms in a molecule and a | |
| 39 distance matrix is generated. Using --MinDistance, --MaxDistance, and | |
| 40 --DistanceBinSize values, a binned distance matrix is generated with | |
| 41 lower bound on the distance bin as the distance in distance matrix; the | |
| 42 lower bound on the distance bin is also used as the distance between | |
| 43 atom pairs for generation of atom triplet identifiers. | |
| 44 | |
| 45 A pharmacophore atom triplets basis set is generated for all unique atom | |
| 46 triplets constituting atom pairs binned distances between --MinDistance | |
| 47 and --MaxDistance. The value of --UseTriangleInequality determines | |
| 48 whether the triangle inequality test is applied during generation of | |
| 49 atom triplets basis set. The lower distance bound, along with specified | |
| 50 pharmacophore types, is used during generation of atom triplet IDs. | |
| 51 | |
| 52 Let: | |
| 53 | |
| 54 P = Valid pharmacophore atom type | |
| 55 | |
| 56 Px = Pharmacophore atom x | |
| 57 Py = Pharmacophore atom y | |
| 58 Pz = Pharmacophore atom z | |
| 59 | |
| 60 Dmin = Minimum distance corresponding to number of bonds between two atoms | |
| 61 Dmax = Maximum distance corresponding to number of bonds between two atoms | |
| 62 D = Distance corresponding to number of bonds between two atoms | |
| 63 | |
| 64 Bsize = Distance bin size | |
| 65 Nbins = Number of distance bins | |
| 66 | |
| 67 Dxy = Distance or lower bound of binned distance between Px and Py | |
| 68 Dxz = Distance or lower bound of binned distance between Px and Pz | |
| 69 Dyz = Distance or lower bound of binned distance between Py and Pz | |
| 70 | |
| 71 Then: | |
| 72 | |
| 73 PxDyz-PyDxz-PzDxy = Pharmacophore atom triplet IDs for atom types Px, | |
| 74 Py, and Pz | |
| 75 | |
| 76 For example: H1-H1-H1, H2-HBA-H2 and so on | |
| 77 | |
| 78 For default values of Dmin = 1 , Dmax = 10 and Bsize = 2: | |
| 79 | |
| 80 the number of distance bins, Nbins = 5, are: | |
| 81 | |
| 82 [1, 2] [3, 4] [5, 6] [7, 8] [9, 10] | |
| 83 | |
| 84 and atom triplet basis set size is 2692. | |
| 85 | |
| 86 Atom triplet basis set size for various values of Dmin, Dmax and Bsize in | |
| 87 conjunction with usage of triangle inequality is: | |
| 88 | |
| 89 Dmin Dmax Bsize UseTriangleInequality TripletBasisSetSize | |
| 90 1 10 2 No 4960 | |
| 91 1 10 2 Yes 2692 [ Default ] | |
| 92 2 12 2 No 8436 | |
| 93 2 12 2 Yes 4494 | |
| 94 | |
| 95 Using the binned distance matrix and pharmacophore atom types, occurrences of | |
| 96 unique pharmacophore atom triplets are counted. | |
| 97 | |
| 98 The final pharmacophore atom triplets counts along with atom triplet | |
| 99 identifiers involving all non-hydrogen atoms constitute the pharmacophore | |
| 100 topological atom triplets fingerprints of the molecule. | |
| 101 | |
| 102 For *ArbitrarySize* value of --AtomTripletsSetSizeToUse option, the | |
| 103 fingerprint vector corresponds to only those topological pharmacophore | |
| 104 atom triplets which are present and have non-zero count. However, for | |
| 105 *FixedSize* value of --AtomTripletsSetSizeToUse option, the fingerprint | |
| 106 vector contains all possible valid topological pharmacophore atom | |
| 107 triplets with both zero and non-zero count values. | |
| 108 | |
| 109 Example of *SD* file containing topological pharmacophore atom triplets | |
| 110 fingerprints string data: | |
| 111 | |
| 112 ... ... | |
| 113 ... ... | |
| 114 $$$$ | |
| 115 ... ... | |
| 116 ... ... | |
| 117 ... ... | |
| 118 41 44 0 0 0 0 0 0 0 0999 V2000 | |
| 119 -3.3652 1.4499 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 | |
| 120 ... ... | |
| 121 2 3 1 0 0 0 0 | |
| 122 ... ... | |
| 123 M END | |
| 124 > <CmpdID> | |
| 125 Cmpd1 | |
| 126 | |
| 127 > <TopologicalPharmacophoreAtomTripletsFingerprints> | |
| 128 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:ArbitrarySize: | |
| 129 MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesString;Ar1- | |
| 130 Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1 | |
| 131 -H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA1 H1- | |
| 132 HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 Ar1-...; | |
| 133 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23 | |
| 134 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1 | |
| 135 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ... | |
| 136 | |
| 137 $$$$ | |
| 138 ... ... | |
| 139 ... ... | |
| 140 | |
| 141 Example of *FP* file containing topological pharmacophore atom triplets | |
| 142 fingerprints string data: | |
| 143 | |
| 144 # | |
| 145 # Package = MayaChemTools 7.4 | |
| 146 # Release Date = Oct 21, 2010 | |
| 147 # | |
| 148 # TimeStamp = Fri Mar 11 15:38:58 2011 | |
| 149 # | |
| 150 # FingerprintsStringType = FingerprintsVector | |
| 151 # | |
| 152 # Description = TopologicalPharmacophoreAtomTriplets:ArbitrarySize:M... | |
| 153 # VectorStringFormat = IDsAndValuesString | |
| 154 # VectorValuesType = NumericalValues | |
| 155 # | |
| 156 Cmpd1 696;Ar1-Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1...;;46 106... | |
| 157 Cmpd2 251;H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-H1-NI1...;4 1 3 1 1 2 2... | |
| 158 ... ... | |
| 159 ... .. | |
| 160 | |
| 161 Example of CSV *Text* file containing topological pharmacophore atom | |
| 162 triplets fingerprints string data: | |
| 163 | |
| 164 "CompoundID","TopologicalPharmacophoreAtomTripletsFingerprints" | |
| 165 "Cmpd1","FingerprintsVector;TopologicalPharmacophoreAtomTriplets:Arbitr | |
| 166 arySize:MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesStri | |
| 167 ng;Ar1-Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HB | |
| 168 A1 Ar1-H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA | |
| 169 1 H1-HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 A...; | |
| 170 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23 | |
| 171 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1 | |
| 172 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ... | |
| 173 ... ... | |
| 174 ... ... | |
| 175 | |
| 176 The current release of MayaChemTools generates the following types of | |
| 177 topological pharmacophore atom triplets fingerprints vector strings: | |
| 178 | |
| 179 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:ArbitrarySize: | |
| 180 MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesString;Ar1- | |
| 181 Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1 | |
| 182 -H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA1 H1- | |
| 183 HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 Ar1-...; | |
| 184 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23 | |
| 185 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1 | |
| 186 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ... | |
| 187 | |
| 188 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD | |
| 189 istance1:MaxDistance10;2692;OrderedNumericalValues;ValuesString;46 106 | |
| 190 8 3 0 0 83 11 4 0 0 0 1 0 0 0 0 0 0 0 0 21 5 3 0 0 1 2 2 0 0 1 0 0 0 | |
| 191 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100 101 18 11 0 0 145 132 26 | |
| 192 14 0 0 23 28 3 3 0 0 5 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 45 10 4 0 | |
| 193 0 16 20 7 5 1 0 3 4 5 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 5 ... | |
| 194 | |
| 195 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD | |
| 196 istance1:MaxDistance10;2692;OrderedNumericalValues;IDsAndValuesString; | |
| 197 Ar1-Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-Ar1-NI1 Ar1-Ar1-P | |
| 198 I1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1-H1-HBD1 Ar1-H1-NI1 Ar1-H1-PI1 Ar1-HBA1-HB | |
| 199 A1 Ar1-HBA1-HBD1 Ar1-HBA1-NI1 Ar1-HBA1-PI1 Ar1-HBD1-HBD1 Ar1-HBD1-...; | |
| 200 46 106 8 3 0 0 83 11 4 0 0 0 1 0 0 0 0 0 0 0 0 21 5 3 0 0 1 2 2 0 0 1 | |
| 201 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100 101 18 11 0 0 145 | |
| 202 132 26 14 0 0 23 28 3 3 0 0 5 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 ... | |
| 203 | |
| 204 OPTIONS | |
| 205 --AromaticityModel *MDLAromaticityModel | TriposAromaticityModel | | |
| 206 MMFFAromaticityModel | ChemAxonBasicAromaticityModel | | |
| 207 ChemAxonGeneralAromaticityModel | DaylightAromaticityModel | | |
| 208 MayaChemToolsAromaticityModel* | |
| 209 Specify aromaticity model to use during detection of aromaticity. | |
| 210 Possible values in the current release are: *MDLAromaticityModel, | |
| 211 TriposAromaticityModel, MMFFAromaticityModel, | |
| 212 ChemAxonBasicAromaticityModel, ChemAxonGeneralAromaticityModel, | |
| 213 DaylightAromaticityModel or MayaChemToolsAromaticityModel*. Default | |
| 214 value: *MayaChemToolsAromaticityModel*. | |
| 215 | |
| 216 The supported aromaticity model names along with model specific | |
| 217 control parameters are defined in AromaticityModelsData.csv, which | |
| 218 is distributed with the current release and is available under | |
| 219 lib/data directory. Molecule.pm module retrieves data from this file | |
| 220 during class instantiation and makes it available to method | |
| 221 DetectAromaticity for detecting aromaticity corresponding to a | |
| 222 specific model. | |
| 223 | |
| 224 --AtomTripletsSetSizeToUse *ArbitrarySize | FixedSize* | |
| 225 Atom triplets set size to use during generation of topological | |
| 226 pharmacophore atom triplets fingerprints. | |
| 227 | |
| 228 Possible values: *ArbitrarySize | FixedSize*; Default value: | |
| 229 *ArbitrarySize*. | |
| 230 | |
| 231 For *ArbitrarySize* value of --AtomTripletsSetSizeToUse option, the | |
| 232 fingerprint vector corresponds to only those topological | |
| 233 pharmacophore atom triplets which are present and have non-zero | |
| 234 count. However, for *FixedSize* value of --AtomTripletsSetSizeToUse | |
| 235 option, the fingerprint vector contains all possible valid | |
| 236 topological pharmacophore atom triplets with both zero and non-zero | |
| 237 count values. | |
| 238 | |
| 239 -a, --AtomTypesToUse *"AtomType1,AtomType2,..."* | |
| 240 Pharmacophore atom types to use during generation of topological | |
| 241 pharmacophore atom triplets. It's a list of comma separated valid | |
| 242 pharmacophore atom types. | |
| 243 | |
| 244 Possible values for pharmacophore atom types are: *Ar, CA, H, HBA, | |
| 245 HBD, Hal, NI, PI, RA*. Default value [ Ref 71 ] : | |
| 246 *HBD,HBA,PI,NI,H,Ar*. | |
| 247 | |
| 248 The pharmacophore atom types abbreviations correspond to: | |
| 249 | |
| 250 HBD: HydrogenBondDonor | |
| 251 HBA: HydrogenBondAcceptor | |
| 252 PI : PositivelyIonizable | |
| 253 NI : NegativelyIonizable | |
| 254 Ar : Aromatic | |
| 255 Hal : Halogen | |
| 256 H : Hydrophobic | |
| 257 RA : RingAtom | |
| 258 CA : ChainAtom | |
| 259 | |
| 260 *AtomTypes::FunctionalClassAtomTypes* module is used to assign | |
| 261 pharmacophore atom types. It uses following definitions [ Ref 60-61, | |
| 262 Ref 65-66 ]: | |
| 263 | |
| 264 HydrogenBondDonor: NH, NH2, OH | |
| 265 HydrogenBondAcceptor: N[!H], O | |
| 266 PositivelyIonizable: +, NH2 | |
| 267 NegativelyIonizable: -, C(=O)OH, S(=O)OH, P(=O)OH | |
| 268 | |
| 269 --CompoundID *DataFieldName or LabelPrefixString* | |
| 270 This value is --CompoundIDMode specific and indicates how compound | |
| 271 ID is generated. | |
| 272 | |
| 273 For *DataField* value of --CompoundIDMode option, it corresponds to | |
| 274 datafield label name whose value is used as compound ID; otherwise, | |
| 275 it's a prefix string used for generating compound IDs like | |
| 276 LabelPrefixString<Number>. Default value, *Cmpd*, generates compound | |
| 277 IDs which look like Cmpd<Number>. | |
| 278 | |
| 279 Examples for *DataField* value of --CompoundIDMode: | |
| 280 | |
| 281 MolID | |
| 282 ExtReg | |
| 283 | |
| 284 Examples for *LabelPrefix* or *MolNameOrLabelPrefix* value of | |
| 285 --CompoundIDMode: | |
| 286 | |
| 287 Compound | |
| 288 | |
| 289 The value specified above generates compound IDs which correspond to | |
| 290 Compound<Number> instead of default value of Cmpd<Number>. | |
| 291 | |
| 292 --CompoundIDLabel *text* | |
| 293 Specify compound ID column label for CSV/TSV text file(s) used | |
| 294 during *CompoundID* value of --DataFieldsMode option. Default value: | |
| 295 *CompoundID*. | |
| 296 | |
| 297 --CompoundIDMode *DataField | MolName | LabelPrefix | | |
| 298 MolNameOrLabelPrefix* | |
| 299 Specify how to generate compound IDs and write to FP or CSV/TSV text | |
| 300 file(s) along with generated fingerprints for *FP | text | all* | |
| 301 values of --output option: use a *SDFile(s)* datafield value; use | |
| 302 molname line from *SDFile(s)*; generate a sequential ID with | |
| 303 specific prefix; use combination of both MolName and LabelPrefix | |
| 304 with usage of LabelPrefix values for empty molname lines. | |
| 305 | |
| 306 Possible values: *DataField | MolName | LabelPrefix | | |
| 307 MolNameOrLabelPrefix*. Default value: *LabelPrefix*. | |
| 308 | |
| 309 For *MolNameOrLabelPrefix* value of --CompoundIDMode, molname line | |
| 310 in *SDFile(s)* takes precedence over sequential compound IDs | |
| 311 generated using *LabelPrefix* and only empty molname values are | |
| 312 replaced with sequential compound IDs. | |
| 313 | |
| 314 This is only used for *CompoundID* value of --DataFieldsMode option. | |
| 315 | |
| 316 --DataFields *"FieldLabel1,FieldLabel2,..."* | |
| 317 Comma delimited list of *SDFile(s)* data fields to extract and | |
| 318 write to CSV/TSV text file(s) along with generated fingerprints for | |
| 319 *text | all* values of --output option. | |
| 320 | |
| 321 This is only used for *Specify* value of --DataFieldsMode option. | |
| 322 | |
| 323 Examples: | |
| 324 | |
| 325 Extreg | |
| 326 MolID,CompoundName | |
| 327 | |
| 328 -d, --DataFieldsMode *All | Common | Specify | CompoundID* | |
| 329 Specify how data fields in *SDFile(s)* are transferred to output | |
| 330 CSV/TSV text file(s) along with generated fingerprints for *text | | |
| 331 all* values of --output option: transfer all SD data fields; transfer | |
| 332 SD data fields common to all compounds; extract specified data | |
| 333 fields; generate a compound ID using molname line, a compound | |
| 334 prefix, or a combination of both. Possible values: *All | Common | | |
| 335 Specify | CompoundID*. Default value: *CompoundID*. | |
| 336 | |
| 337 --DistanceBinSize *number* | |
| 338 Distance bin size used to bin distances between atom pairs in atom | |
| 339 triplets. Default value: *2*. Valid values: positive integers. | |
| 340 | |
| 341 For default --MinDistance and --MaxDistance values of 1 and 10 with | |
| 342 --DistanceBinSize of 2 [ Ref 70 ], the following 5 distance bins are | |
| 343 generated: | |
| 344 | |
| 345 [1, 2] [3, 4] [5, 6] [7, 8] [9, 10] | |
| 346 | |
| 347 The lower distance bound on the distance bin is used to bin the | |
| 348 distance between atom pairs in atom triplets. So in the previous | |
| 349 example, atom pairs with distances 1 and 2 fall in first distance | |
| 350 bin, atom pairs with distances 3 and 4 fall in second distance bin | |
| 351 and so on. | |
| 352 | |
| 353 In order to distribute distance bins of equal size, the last bin is | |
| 354 allowed to go past --MaxDistance by up to distance bin size. For | |
| 355 example, --MinDistance and --MaxDistance values of 2 and 10 with | |
| 356 --DistanceBinSize of 2 generates the following 5 distance bins: | |
| 357 | |
| 358 [2, 3] [4, 5] [6, 7] [8, 9] [10, 11] | |
| 359 | |
| 360 -f, --Filter *Yes | No* | |
| 361 Specify whether to check and filter compound data in SDFile(s). | |
| 362 Possible values: *Yes or No*. Default value: *Yes*. | |
| 363 | |
| 364 By default, compound data is checked before calculating fingerprints | |
| 365 and compounds containing atom data corresponding to non-element | |
| 366 symbols or no atom data are ignored. | |
| 367 | |
| 368 --FingerprintsLabelMode *FingerprintsLabelOnly | | |
| 369 FingerprintsLabelWithIDs* | |
| 370 Specify how fingerprints label is generated in conjunction with | |
| 371 --FingerprintsLabel option value: use fingerprints label generated | |
| 372 only by --FingerprintsLabel option value or append topological atom | |
| 373 triplet count value IDs to --FingerprintsLabel option value. | |
| 374 | |
| 375 Possible values: *FingerprintsLabelOnly | FingerprintsLabelWithIDs*. | |
| 376 Default value: *FingerprintsLabelOnly*. | |
| 377 | |
| 378 Topological atom triplets IDs appended to --FingerprintsLabel value | |
| 379 during *FingerprintsLabelWithIDs* value of --FingerprintsLabelMode | |
| 380 correspond to atom triplet count values in fingerprint vector string. | |
| 381 | |
| 382 *FingerprintsLabelWithIDs* value of --FingerprintsLabelMode is | |
| 383 ignored during *ArbitrarySize* value of --AtomTripletsSetSizeToUse | |
| 384 option and topological atom triplets IDs are not appended to the label. | |
| 385 | |
| 386 --FingerprintsLabel *text* | |
| 387 SD data label or text file column label to use for fingerprints | |
| 388 string in output SD or CSV/TSV text file(s) specified by --output. | |
| 389 Default value: *TopologicalPharmacophoreAtomTripletsFingerprints*. | |
| 390 | |
| 391 -h, --help | |
| 392 Print this help message. | |
| 393 | |
| 394 -k, --KeepLargestComponent *Yes | No* | |
| 395 Generate fingerprints for only the largest component in molecule. | |
| 396 Possible values: *Yes or No*. Default value: *Yes*. | |
| 397 | |
| 398 For molecules containing multiple connected components, fingerprints | |
| 399 can be generated in two different ways: use all connected components | |
| 400 or just the largest connected component. By default, all atoms | |
| 401 except for the largest connected component are deleted before | |
| 402 generation of fingerprints. | |
| 403 | |
| 404 --MinDistance *number* | |
| 405 Minimum bond distance between atom pairs corresponding to atom | |
| 406 triplets for generating topological pharmacophore atom triplets. | |
| 407 Default value: *1*. Valid values: positive integers and less than | |
| 408 --MaxDistance. | |
| 409 | |
| 410 --MaxDistance *number* | |
| 411 Maximum bond distance between atom pairs corresponding to atom | |
| 412 triplets for generating topological pharmacophore atom triplets. | |
| 413 Default value: *10*. Valid values: positive integers and greater | |
| 414 than --MinDistance. | |
| 415 | |
| 416 --OutDelim *comma | tab | semicolon* | |
| 417 Delimiter for output CSV/TSV text file(s). Possible values: *comma, | |
| 418 tab, or semicolon*. Default value: *comma*. | |
| 419 | |
| 420 --output *SD | FP | text | all* | |
| 421 Type of output files to generate. Possible values: *SD, FP, text, or | |
| 422 all*. Default value: *text*. | |
| 423 | |
| 424 -o, --overwrite | |
| 425 Overwrite existing files. | |
| 426 | |
| 427 -q, --quote *Yes | No* | |
| 428 Put quote around column values in output CSV/TSV text file(s). | |
| 429 Possible values: *Yes or No*. Default value: *Yes*. | |
| 430 | |
| 431 -r, --root *RootName* | |
| 432 New file name is generated using the root: <Root>.<Ext>. Default for | |
| 433 new file names: | |
| 434 <SDFileName><TopologicalPharmacophoreAtomTripletsFP>.<Ext>. The file | |
| 435 type determines <Ext> value. The sdf, fpf, csv, and tsv <Ext> values | |
| 436 are used for SD, FP, comma/semicolon, and tab delimited text files, | |
| 437 respectively. This option is ignored for multiple input files. | |
| 438 | |
| 439 -u, --UseTriangleInequality *Yes | No* | |
| 440 Specify whether to apply triangle distance inequality test to | |
| 441 distances between atom pairs in atom triplets during generation of | |
| 442 atom triplets basis set. Possible values: *Yes or No*. | |
| 443 Default value: *Yes*. | |
| 444 | |
| 445 Triangle distance inequality test implies that the distance or binned | |
| 446 distance between any atom pair in an atom triplet must be less | |
| 447 than the sum of the distances or binned distances between the other two | |
| 448 atom pairs and greater than the absolute difference of those distances. | |
| 449 | |
| 450 For atom triplet PxDyz-PyDxz-PzDxy to satisfy triangle inequality: | |
| 451 | |
| 452 Dyz > |Dxz - Dxy| and Dyz < Dxz + Dxy | |
| 453 Dxz > |Dyz - Dxy| and Dxz < Dyz + Dxy | |
| 454 Dxy > |Dyz - Dxz| and Dxy < Dyz + Dxz | |
| 455 | |
| 456 -v, --VectorStringFormat *ValuesString | IDsAndValuesString | | |
| 457 IDsAndValuesPairsString | ValuesAndIDsString | ValuesAndIDsPairsString* | |
| 458 Format of fingerprints vector string data in output SD, FP or | |
| 459 CSV/TSV text file(s) specified by --output option. Possible values: | |
| 460 *ValuesString | IDsAndValuesString | IDsAndValuesPairsString | | |
| 461 ValuesAndIDsString | ValuesAndIDsPairsString*. Default value: | |
| 462 *ValuesString*. | |
| 463 | |
| 464 Default value during *FixedSize* value of --AtomTripletsSetSizeToUse | |
| 465 option: *ValuesString*. Default value during *ArbitrarySize* value | |
| 466 of --AtomTripletsSetSizeToUse option: *IDsAndValuesString*. | |
| 467 | |
| 468 *ValuesString* option value is not allowed for *ArbitrarySize* value | |
| 469 of --AtomTripletsSetSizeToUse option. | |
| 470 | |
| 471 Examples: | |
| 472 | |
| 473 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:ArbitrarySize: | |
| 474 MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesString;Ar1- | |
| 475 Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1 | |
| 476 -H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA1 H1- | |
| 477 HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 Ar1-...; | |
| 478 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23 | |
| 479 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1 | |
| 480 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ... | |
| 481 | |
| 482 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD | |
| 483 istance1:MaxDistance10;2692;OrderedNumericalValues;ValuesString;46 106 | |
| 484 8 3 0 0 83 11 4 0 0 0 1 0 0 0 0 0 0 0 0 21 5 3 0 0 1 2 2 0 0 1 0 0 0 | |
| 485 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100 101 18 11 0 0 145 132 26 | |
| 486 14 0 0 23 28 3 3 0 0 5 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 45 10 4 0 | |
| 487 0 16 20 7 5 1 0 3 4 5 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 5 ... | |
| 488 | |
| 489 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD | |
| 490 istance1:MaxDistance10;2692;OrderedNumericalValues;ValuesAndIDsPairsSt | |
| 491 ring;46 Ar1-Ar1-Ar1 106 Ar1-Ar1-H1 8 Ar1-Ar1-HBA1 3 Ar1-Ar1-HBD1 0 Ar1 | |
| 492 -Ar1-NI1 0 Ar1-Ar1-PI1 83 Ar1-H1-H1 11 Ar1-H1-HBA1 4 Ar1-H1-HBD1 0 Ar1 | |
| 493 -H1-NI1 0 Ar1-H1-PI1 0 Ar1-HBA1-HBA1 1 Ar1-HBA1-HBD1 0 Ar1-HBA1-NI1 0 | |
| 494 Ar1-HBA1-PI1 0 Ar1-HBD1-HBD1 0 Ar1-HBD1-NI1 0 Ar1-HBD1-PI1 0 Ar1-NI... | |
| 495 | |
| 496 -w, --WorkingDir *DirName* | |
| 497 Location of working directory. Default value: current directory. | |
| 498 | |
| 499 EXAMPLES | |
| 500 To generate topological pharmacophore atom triplets fingerprints of | |
| 501 arbitrary size corresponding to 5 distance bins spanning distances from | |
| 502 1 through 10 using default atoms with distances satisfying triangle | |
| 503 inequality and create a SampleTPATFP.csv file containing sequential | |
| 504 compound IDs along with fingerprints vector strings data in IDsAndValuesString | |
| 505 format, type: | |
| 506 | |
| 507 % TopologicalPharmacophoreAtomTripletsFingerprints.pl -r SampleTPATFP | |
| 508 -o Sample.sdf | |
| 509 | |
| 510 To generate topological pharmacophore atom triplets fingerprints of | |
| 511 fixed size corresponding to 5 distance bins spanning distances from 1 | |
| 512 through 10 using default atoms with distances satisfying triangle | |
| 513 inequality and create a SampleTPATFP.csv file containing sequential | |
| 514 compound IDs along with fingerprints vector strings data in ValuesString | |
| 515 format, type: | |
| 516 | |
| 517 % TopologicalPharmacophoreAtomTripletsFingerprints.pl | |
| 518 --AtomTripletsSetSizeToUse FixedSize -r SampleTPATFP -o Sample.sdf | |
| 519 | |
| 520 To generate topological pharmacophore atom triplets fingerprints of | |
| 521 arbitrary size corresponding to 5 distance bins spanning distances from | |
| 522 1 through 10 using default atoms with distances satisfying triangle | |
| 523 inequality and create SampleTPATFP.sdf, SampleTPATFP.fpf and | |
| 524 SampleTPATFP.csv files with CSV file containing sequential compound IDs | |
| 525 along with fingerprints vector strings data in IDsAndValuesString format, | |
| 526 type: | |
| 527 | |
| 528 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --output all | |
| 529 -r SampleTPATFP -o Sample.sdf | |
| 530 | |
| 531 To generate topological pharmacophore atom triplets fingerprints of | |
| 532 arbitrary size corresponding to 5 distance bins spanning distances from | |
| 533 1 through 10 using default atoms with distances satisfying triangle | |
| 534 inequality and create a SampleTPATFP.csv file containing sequential | |
| 535 compound IDs along with fingerprints vector strings data in IDsAndValuesString | |
| 536 format and atom triplets IDs in the fingerprint data column label | |
| 537 starting with Fingerprints, type: | |
| 538 | |
| 539 % TopologicalPharmacophoreAtomTripletsFingerprints.pl | |
| 540 --FingerprintsLabelMode FingerprintsLabelWithIDs --FingerprintsLabel | |
| 541 Fingerprints -r SampleTPATFP -o Sample.sdf | |
| 542 | |
| 543 To generate topological pharmacophore atom triplets fingerprints of | |
| 544 arbitrary size corresponding to 5 distance bins spanning distances from | |
| 545 1 through 10 using default atoms with distances not satisfying triangle | |
| 546 inequality and create a SampleTPATFP.csv file containing sequential | |
| 547 compound IDs along with fingerprints vector strings data in IDsAndValuesString | |
| 548 format, type: | |
| 549 | |
| 550 % TopologicalPharmacophoreAtomTripletsFingerprints.pl | |
| 551 --UseTriangleInequality No -r SampleTPATFP -o Sample.sdf | |
| 552 | |
| 553 To generate topological pharmacophore atom triplets fingerprints of | |
| 554 arbitrary size corresponding to 6 distance bins spanning distances from | |
| 555 1 through 12 using default atoms with distances satisfying triangle | |
| 556 inequality and create a SampleTPATFP.csv file containing sequential | |
| 557 compound IDs along with fingerprints vector strings data in IDsAndValuesString | |
| 558 format, type: | |
| 559 | |
| 560 % TopologicalPharmacophoreAtomTripletsFingerprints.pl | |
| 561 --UseTriangleInequality Yes --MinDistance 1 --MaxDistance 12 | |
| 562 --DistanceBinSize 2 -r SampleTPATFP -o Sample.sdf | |
| 563 | |
| 564 To generate topological pharmacophore atom triplets fingerprints of | |
| 565 arbitrary size corresponding to 6 distance bins spanning distances from | |
| 566 1 through 12 using "HBD,HBA,PI,NI,H,Ar" atoms with distances | |
| 567 satisfying triangle inequality and create a SampleTPATFP.csv file | |
| 568 containing sequential compound IDs along with fingerprints vector | |
| 569 strings data in ValuesString format, type: | |
| 570 | |
| 571 % TopologicalPharmacophoreAtomTripletsFingerprints.pl | |
| 572 --AtomTypesToUse "HBD,HBA,PI,NI,H,Ar" --UseTriangleInequality Yes | |
| 573 --MinDistance 1 --MaxDistance 12 --DistanceBinSize 2 | |
| 574 --VectorStringFormat ValuesString -r SampleTPATFP -o Sample.sdf | |
| 575 | |
| 576 To generate topological pharmacophore atom triplets fingerprints of | |
| 577 arbitrary size corresponding to 5 distance bins spanning distances from | |
| 578 1 through 10 using default atoms with distances satisfying triangle | |
| 579 inequality and create a SampleTPATFP.csv file containing sequential | |
| 580 compound IDs from molecule name line along with fingerprints vector | |
| 581 strings data in IDsAndValuesString format, type: | |
| 582 | |
| 583 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode | |
| 584 CompoundID --CompoundIDMode MolName -r SampleTPATFP -o Sample.sdf | |
| 585 | |
| 586 To generate topological pharmacophore atom triplets fingerprints of | |
| 587 arbitrary size corresponding to 5 distance bins spanning distances from | |
| 588 1 through 10 using default atoms with distances satisfying triangle | |
| 589 inequality and create a SampleTPATFP.csv file containing sequential | |
| 590 compound IDs using specified data field along with fingerprints vector | |
| 591 strings data in IDsAndValuesString format, type: | |
| 592 | |
| 593 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode | |
| 594 CompoundID --CompoundIDMode DataField --CompoundID Mol_ID | |
| 595 -r SampleTPATFP -o Sample.sdf | |
| 596 | |
| 597 To generate topological pharmacophore atom triplets fingerprints of | |
| 598 arbitrary size corresponding to 5 distance bins spanning distances from | |
| 599 1 through 10 using default atoms with distances satisfying triangle | |
| 600 inequality and create a SampleTPATFP.csv file containing sequential | |
| 601 compound IDs using combination of molecule name line and an explicit | |
| 602 compound prefix along with fingerprints vector strings data, type: | |
| 603 | |
| 604 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode | |
| 605 CompoundID --CompoundIDMode MolNameOrLabelPrefix | |
| 606 --CompoundID Cmpd --CompoundIDLabel MolID -r SampleTPATFP | |
| 607 -o Sample.sdf | |
| 608 | |
| 609 To generate topological pharmacophore atom triplets fingerprints of | |
| 610 arbitrary size corresponding to 5 distance bins spanning distances from | |
| 611 1 through 10 using default atoms with distances satisfying triangle | |
| 612 inequality and create a SampleTPATFP.csv file containing specific data | |
| 613 fields columns along with fingerprints vector strings data, type: | |
| 614 | |
| 615 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode | |
| 616 Specify --DataFields Mol_ID -r SampleTPATFP -o Sample.sdf | |
| 617 | |
| 618 To generate topological pharmacophore atom triplets fingerprints of | |
| 619 arbitrary size corresponding to 5 distance bins spanning distances from | |
| 620 1 through 10 using default atoms with distances satisfying triangle | |
| 621 inequality and create a SampleTPATFP.csv file containing common data | |
| 622 fields columns along with fingerprints vector strings data, type: | |
| 623 | |
| 624 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode | |
| 625 Common -r SampleTPATFP -o Sample.sdf | |
| 626 | |
| 627 To generate topological pharmacophore atom triplets fingerprints of | |
| 628 arbitrary size corresponding to 5 distance bins spanning distances from | |
| 629 1 through 10 using default atoms with distances satisfying triangle | |
| 630 inequality and create SampleTPATFP.sdf, SampleTPATFP.fpf and | |
| 631 SampleTPATFP.csv files containing all data fields columns in CSV file | |
| 632 along with fingerprints data, type: | |
| 633 | |
| 634 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode | |
| 635 All --output all -r SampleTPATFP -o Sample.sdf | |
| 636 | |
| 637 AUTHOR | |
| 638 Manish Sud <msud@san.rr.com> | |
| 639 | |
| 640 SEE ALSO | |
| 641 InfoFingerprintsFiles.pl, SimilarityMatricesFingerprints.pl, | |
| 642 AtomNeighborhoodsFingerprints.pl, ExtendedConnectivityFingerprints.pl, | |
| 643 MACCSKeysFingerprints.pl, PathLengthFingerprints.pl, | |
| 644 TopologicalAtomPairsFingerprints.pl, | |
| 645 TopologicalAtomTorsionsFingerprints.pl, | |
| 646 TopologicalPharmacophoreAtomPairsFingerprints.pl | |
| 647 | |
| 648 COPYRIGHT | |
| 649 Copyright (C) 2015 Manish Sud. All rights reserved. | |
| 650 | |
| 651 This file is part of MayaChemTools. | |
| 652 | |
| 653 MayaChemTools is free software; you can redistribute it and/or modify it | |
| 654 under the terms of the GNU Lesser General Public License as published by | |
| 655 the Free Software Foundation; either version 3 of the License, or (at | |
| 656 your option) any later version. | |
| 657 |
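
The TripletBasisSetSize values tabulated in DESCRIPTION can be checked by direct enumeration. The Python sketch below is not taken from MayaChemTools. It assumes that each corner of a triplet is labeled by a pharmacophore atom type plus the lower bound of the distance bin of the opposite side (as in PxDyz-PyDxz-PzDxy), that triplets are unordered, and that the triangle inequality is applied strictly to bin lower bounds, as the formulas under --UseTriangleInequality suggest. All function names are hypothetical.

```python
from itertools import combinations_with_replacement

# Default pharmacophore atom types listed under --AtomTypesToUse.
DEFAULT_ATOM_TYPES = ["HBD", "HBA", "PI", "NI", "H", "Ar"]


def bin_lower_bounds(dmin, dmax, bsize):
    """Lower bounds of equal-size distance bins starting at dmin.

    The last bin may extend past dmax, e.g. 2..10 with size 2 gives
    lower bounds 2, 4, 6, 8, 10 (last bin [10, 11]).
    """
    return list(range(dmin, dmax + 1, bsize))


def satisfies_triangle_inequality(d1, d2, d3):
    """Strict test: each side exceeds the absolute difference of the
    other two and is smaller than their sum."""
    return (
        abs(d2 - d3) < d1 < d2 + d3
        and abs(d1 - d3) < d2 < d1 + d3
        and abs(d1 - d2) < d3 < d1 + d2
    )


def basis_set_size(atom_types, dmin, dmax, bsize, use_triangle_inequality):
    """Count unordered triplets of (atom type, binned distance) labels."""
    labels = [
        (atom_type, distance)
        for atom_type in atom_types
        for distance in bin_lower_bounds(dmin, dmax, bsize)
    ]
    count = 0
    for triplet in combinations_with_replacement(labels, 3):
        distances = [distance for _, distance in triplet]
        if use_triangle_inequality and not satisfies_triangle_inequality(*distances):
            continue
        count += 1
    return count


if __name__ == "__main__":
    for dmin, dmax in [(1, 10), (2, 12)]:
        for use_ti in (False, True):
            size = basis_set_size(DEFAULT_ATOM_TYPES, dmin, dmax, 2, use_ti)
            print(dmin, dmax, 2, "Yes" if use_ti else "No", size)
```

Under these assumptions the enumeration gives 4960 and 2692 for Dmin = 1, Dmax = 10, and 8436 and 4494 for Dmin = 2, Dmax = 12, in agreement with the table in DESCRIPTION.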
