comparison docs/scripts/txt/TopologicalPharmacophoreAtomTripletsFingerprints.txt @ 0:4816e4a8ae95 draft default tip

Uploaded
author deepakjadmin
date Wed, 20 Jan 2016 09:23:18 -0500
-1:000000000000 0:4816e4a8ae95
1 NAME
2 TopologicalPharmacophoreAtomTripletsFingerprints.pl - Generate
3 topological pharmacophore atom triplets fingerprints for SD files
4
5 SYNOPSIS
6 TopologicalPharmacophoreAtomTripletsFingerprints.pl SDFile(s)...
7
8 TopologicalPharmacophoreAtomTripletsFingerprints.pl [--AromaticityModel
9 *AromaticityModelType*] [--AtomTripletsSetSizeToUse *ArbitrarySize |
10 FixedSize*] [-a, --AtomTypesToUse *"AtomType1, AtomType2..."*]
11 [--AtomTypesWeight *"AtomType1, Weight1, AtomType2, Weight2..."*]
12 [--CompoundID *DataFieldName or LabelPrefixString*] [--CompoundIDLabel
13 *text*] [--CompoundIDMode] [--DataFields *"FieldLabel1,
14 FieldLabel2,..."*] [-d, --DataFieldsMode *All | Common | Specify |
15 CompoundID*] [--DistanceBinSize *number*] [-f, --Filter *Yes | No*]
16 [--FingerprintsLabelMode *FingerprintsLabelOnly |
17 FingerprintsLabelWithIDs*] [--FingerprintsLabel *text*] [-h, --help]
18 [-k, --KeepLargestComponent *Yes | No*] [--MinDistance *number*]
19 [--MaxDistance *number*] [--OutDelim *comma | tab | semicolon*]
20 [--output *SD | FP | text | all*] [-o, --overwrite] [-q, --quote *Yes |
21 No*] [-r, --root *RootName*] [-u, --UseTriangleInequality *Yes | No*]
22 [-v, --VectorStringFormat *ValuesString | IDsAndValuesString |
23 IDsAndValuesPairsString | ValuesAndIDsString | ValuesAndIDsPairsString*]
24 [-w, --WorkingDir *DirName*] SDFile(s)...
25
26 DESCRIPTION
27 Generate topological pharmacophore atom triplets fingerprints [ Ref 66,
28 Ref 68-71 ] for *SDFile(s)* and create appropriate SD, FP or CSV/TSV
29 text file(s) containing fingerprints vector strings corresponding to
30 molecular fingerprints.
31
32 Multiple SDFile names are separated by spaces. The valid file extensions
33 are *.sdf* and *.sd*. All other file names are ignored. All the SD files
34 in the current directory can be specified either by **.sdf* or the current
35 directory name.
36
37 Based on the values specified for --AtomTypesToUse, pharmacophore atom
38 types are assigned to all non-hydrogen atoms in a molecule and a
39 distance matrix is generated. Using --MinDistance, --MaxDistance, and
40 --DistanceBinSize values, a binned distance matrix is generated in which
41 each distance is replaced by the lower bound of its distance bin; the
42 lower bound of the distance bin is also used as the distance between
43 atom pairs for generation of atom triplet identifiers.
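
The distance matrix described above holds, for each pair of atoms, the number of bonds on the shortest path between them. A minimal sketch of that step, assuming atoms are numbered from 0 and bonds are given as index pairs (the function name and input layout are illustrative, not taken from MayaChemTools):

```python
from collections import deque

def topological_distance_matrix(num_atoms, bonds):
    """Return d where d[i][j] is the number of bonds on the shortest path
    between atoms i and j, or None when the atoms are not connected."""
    neighbors = [[] for _ in range(num_atoms)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    matrix = []
    for start in range(num_atoms):
        # Breadth-first search from each atom gives shortest bond counts.
        dist = [None] * num_atoms
        dist[start] = 0
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for nbr in neighbors[atom]:
                if dist[nbr] is None:
                    dist[nbr] = dist[atom] + 1
                    queue.append(nbr)
        matrix.append(dist)
    return matrix

# Hypothetical molecule: chain 0-1-2-3 with a branch atom 4 on atom 1.
example = topological_distance_matrix(5, [(0, 1), (1, 2), (2, 3), (1, 4)])
```

Here example[0][3] is 3 and example[3][4] is 3, since both paths run through atom 1.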
44
45 A pharmacophore atom triplets basis set is generated for all unique atom
46 triplets whose atom pairs have binned distances between --MinDistance
47 and --MaxDistance. The value of --UseTriangleInequality determines
48 whether the triangle inequality test is applied during generation of
49 atom triplets basis set. The lower distance bound, along with specified
50 pharmacophore types, is used during generation of atom triplet IDs.
51
52 Let:
53
54 P = Valid pharmacophore atom type
55
56 Px = Pharmacophore atom x
57 Py = Pharmacophore atom y
58 Pz = Pharmacophore atom z
59
60 Dmin = Minimum distance corresponding to number of bonds between two atoms
61 Dmax = Maximum distance corresponding to number of bonds between two atoms
62 D = Distance corresponding to number of bonds between two atoms
63
64 Bsize = Distance bin size
65 Nbins = Number of distance bins
66
67 Dxy = Distance or lower bound of binned distance between Px and Py
68 Dxz = Distance or lower bound of binned distance between Px and Pz
69 Dyz = Distance or lower bound of binned distance between Py and Pz
70
71 Then:
72
73 PxDyz-PyDxz-PzDxy = Pharmacophore atom triplet IDs for atom types Px,
74 Py, and Pz
75
76 For example: H1-H1-H1, H2-HBA-H2 and so on
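
The PxDyz-PyDxz-PzDxy pattern pairs each atom type with the distance between the other two atoms. A small sketch of that construction follows; the function name is hypothetical, and sorting the three components alphabetically is an assumption inferred from the IDs shown in this document (for example Ar1-H1-HBA1), not a statement of how the script orders them:

```python
def atom_triplet_id(px, py, pz, dxy, dxz, dyz):
    """Build a PxDyz-PyDxz-PzDxy identifier from three pharmacophore types
    and the (binned) distances between the three atom pairs."""
    parts = [f"{px}{dyz}", f"{py}{dxz}", f"{pz}{dxy}"]
    # Assumption: components are sorted so the ID does not depend on the
    # order in which the three atoms were visited.
    return "-".join(sorted(parts))

same_distance_id = atom_triplet_id("HBA", "Ar", "H", 1, 1, 1)
mixed_distance_id = atom_triplet_id("H", "HBA", "Ar", dxy=3, dxz=5, dyz=3)
```

With these inputs same_distance_id is "Ar1-H1-HBA1" and mixed_distance_id is "Ar3-H3-HBA5".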
77
78 For default values of Dmin = 1, Dmax = 10 and Bsize = 2:
79
80 the Nbins = 5 distance bins are:
81
82 [1, 2] [3, 4] [5, 6] [7, 8] [9, 10]
83
84 and atom triplet basis set size is 2692.
85
86 Atom triplet basis set size for various values of Dmin, Dmax and Bsize in
87 conjunction with usage of triangle inequality is:
88
89 Dmin Dmax Bsize UseTriangleInequality TripletBasisSetSize
90 1 10 2 No 4960
91 1 10 2 Yes 2692 [ Default ]
92 2 12 2 No 8436
93 2 12 2 Yes 4494
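
The basis set sizes in the table can be reproduced with a short counting sketch. It assumes the six default atom types, treats each triplet ID as an unordered collection of three (atom type, bin lower bound) labels, and applies the strict triangle inequality to the three lower bounds. This is only a counting argument that matches the table; it is not claimed to be how the script builds its basis set.

```python
from itertools import combinations_with_replacement

def triplet_basis_set_size(min_distance, max_distance, bin_size,
                           use_triangle_inequality, num_atom_types=6):
    """Count unique atom triplet IDs for the given distance settings."""
    lower_bounds = range(min_distance, max_distance + 1, bin_size)
    labels = [(t, d) for t in range(num_atom_types) for d in lower_bounds]
    count = 0
    for triplet in combinations_with_replacement(labels, 3):
        a, b, c = sorted(d for _, d in triplet)
        # For sorted distances, c < a + b is equivalent to all three
        # triangle inequality conditions.
        if use_triangle_inequality and not c < a + b:
            continue
        count += 1
    return count

sizes = [
    triplet_basis_set_size(1, 10, 2, False),
    triplet_basis_set_size(1, 10, 2, True),
    triplet_basis_set_size(2, 12, 2, False),
    triplet_basis_set_size(2, 12, 2, True),
]
```

The four entries of sizes are 4960, 2692, 8436 and 4494, in the same order as the table rows.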
94
95 Using binned distance matrix and pharmacophore atom types, occurrence of
96 unique pharmacophore atom triplets is counted.
97
98 The final pharmacophore atom triplets count along with atom triplet
99 identifiers involving all non-hydrogen atoms constitute pharmacophore
100 topological atom triplets fingerprints of the molecule.
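
The counting step can be sketched end to end as follows. The input layout (a list of pharmacophore type lists per atom and a bond-count distance matrix), the handling of atoms with several types, and the skipping of distances outside --MinDistance and --MaxDistance are all assumptions made for illustration; the script's own implementation may differ in these details.

```python
from collections import Counter
from itertools import combinations, product

def count_atom_triplets(atom_types, distances, min_distance=1,
                        max_distance=10, bin_size=2,
                        use_triangle_inequality=True):
    """Count atom triplet IDs from per-atom type lists and a distance matrix."""
    def lower_bound(d):
        # Replace a distance by the lower bound of its distance bin.
        return min_distance + ((d - min_distance) // bin_size) * bin_size

    counts = Counter()
    typed = [i for i, types in enumerate(atom_types) if types]
    for x, y, z in combinations(typed, 3):
        raw = (distances[x][y], distances[x][z], distances[y][z])
        if any(d is None or d < min_distance or d > max_distance for d in raw):
            continue
        dxy, dxz, dyz = (lower_bound(d) for d in raw)
        if use_triangle_inequality:
            a, b, c = sorted((dxy, dxz, dyz))
            if not c < a + b:
                continue
        # Assumption: an atom with several types contributes one ID per type.
        for px, py, pz in product(atom_types[x], atom_types[y], atom_types[z]):
            parts = sorted([f"{px}{dyz}", f"{py}{dxz}", f"{pz}{dxy}"])
            counts["-".join(parts)] += 1
    return counts

# Hypothetical chain of three hydrophobic atoms: 0-1-2.
types = [["H"], ["H"], ["H"]]
dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
binned = count_atom_triplets(types, dist)
unbinned = count_atom_triplets(types, dist, bin_size=1)
```

With the default bin size of 2, distances 1 and 2 share a bin, so binned holds a single H1-H1-H1 entry. With a bin size of 1 the distances 1, 1 and 2 fail the strict triangle inequality and unbinned is empty.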
101
102 For *ArbitrarySize* value of --AtomTripletsSetSizeToUse option, the
103 fingerprint vector corresponds to only those topological pharmacophore
104 atom triplets which are present and have non-zero count. However, for
105 *FixedSize* value of --AtomTripletsSetSizeToUse option, the fingerprint
106 vector contains all possible valid topological pharmacophore atom
107 triplets with both zero and non-zero count values.
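
The difference between the two set sizes can be illustrated with a short sketch. The function name, the dictionary of counts and the three-entry basis set are hypothetical; a real basis set has 2692 entries for the default settings.

```python
def fingerprint_vector(counts, basis_ids, set_size="ArbitrarySize"):
    """Return (ids, values) for the requested atom triplets set size."""
    if set_size == "FixedSize":
        # Every basis set ID is reported, including those with zero count.
        ids = list(basis_ids)
    else:
        # Only IDs that occur in the molecule are reported.
        ids = [i for i in basis_ids if counts.get(i, 0) > 0]
    return ids, [counts.get(i, 0) for i in ids]

basis = ["Ar1-Ar1-Ar1", "Ar1-Ar1-H1", "Ar1-Ar1-HBA1"]
observed = {"Ar1-Ar1-Ar1": 46, "Ar1-Ar1-HBA1": 8}
arbitrary = fingerprint_vector(observed, basis)
fixed = fingerprint_vector(observed, basis, "FixedSize")
```

Here arbitrary carries two IDs with values 46 and 8, while fixed carries all three IDs with values 46, 0 and 8.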
108
109 Example of *SD* file containing topological pharmacophore atom triplets
110 fingerprints string data:
111
112 ... ...
113 ... ...
114 $$$$
115 ... ...
116 ... ...
117 ... ...
118 41 44 0 0 0 0 0 0 0 0999 V2000
119 -3.3652 1.4499 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
120 ... ...
121 2 3 1 0 0 0 0
122 ... ...
123 M END
124 > <CmpdID>
125 Cmpd1
126
127 > <TopologicalPharmacophoreAtomTripletsFingerprints>
128 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:ArbitrarySize:
129 MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesString;Ar1-
130 Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1
131 -H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA1 H1-
132 HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 Ar1-...;
133 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23
134 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1
135 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ...
136
137 $$$$
138 ... ...
139 ... ...
140
141 Example of *FP* file containing topological pharmacophore atom triplets
142 fingerprints string data:
143
144 #
145 # Package = MayaChemTools 7.4
146 # Release Date = Oct 21, 2010
147 #
148 # TimeStamp = Fri Mar 11 15:38:58 2011
149 #
150 # FingerprintsStringType = FingerprintsVector
151 #
152 # Description = TopologicalPharmacophoreAtomTriplets:ArbitrarySize:M...
153 # VectorStringFormat = IDsAndValuesString
154 # VectorValuesType = NumericalValues
155 #
156 Cmpd1 696;Ar1-Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1...;46 106...
157 Cmpd2 251;H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-H1-NI1...;4 1 3 1 1 2 2...
158 ... ...
159 ... ..
160
161 Example of CSV *Text* file containing topological pharmacophore atom
162 triplets fingerprints string data:
163
164 "CompoundID","TopologicalPharmacophoreAtomTripletsFingerprints"
165 "Cmpd1","FingerprintsVector;TopologicalPharmacophoreAtomTriplets:Arbitr
166 arySize:MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesStri
167 ng;Ar1-Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HB
168 A1 Ar1-H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA
169 1 H1-HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 A...;
170 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23
171 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1
172 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ...
173 ... ...
174 ... ...
175
176 The current release of MayaChemTools generates the following types of
177 topological pharmacophore atom triplets fingerprints vector strings:
178
179 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:ArbitrarySize:
180 MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesString;Ar1-
181 Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1
182 -H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA1 H1-
183 HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 Ar1-...;
184 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23
185 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1
186 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ...
187
188 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD
189 istance1:MaxDistance10;2692;OrderedNumericalValues;ValuesString;46 106
190 8 3 0 0 83 11 4 0 0 0 1 0 0 0 0 0 0 0 0 21 5 3 0 0 1 2 2 0 0 1 0 0 0
191 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100 101 18 11 0 0 145 132 26
192 14 0 0 23 28 3 3 0 0 5 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 45 10 4 0
193 0 16 20 7 5 1 0 3 4 5 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 5 ...
194
195 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD
196 istance1:MaxDistance10;2692;OrderedNumericalValues;IDsAndValuesString;
197 Ar1-Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-Ar1-NI1 Ar1-Ar1-P
198 I1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1-H1-HBD1 Ar1-H1-NI1 Ar1-H1-PI1 Ar1-HBA1-HB
199 A1 Ar1-HBA1-HBD1 Ar1-HBA1-NI1 Ar1-HBA1-PI1 Ar1-HBD1-HBD1 Ar1-HBD1-...;
200 46 106 8 3 0 0 83 11 4 0 0 0 1 0 0 0 0 0 0 0 0 21 5 3 0 0 1 2 2 0 0 1
201 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100 101 18 11 0 0 145
202 132 26 14 0 0 23 28 3 3 0 0 5 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 ...
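
The vector strings above are semicolon delimited, so they are straightforward to split apart downstream. The sketch below is one possible reader, not part of MayaChemTools; it handles only the three formats illustrated in this document and assumes the string arrives as a single unwrapped line. The sample string is a shortened, hypothetical vector with three entries.

```python
def parse_fingerprints_vector(text):
    """Split a fingerprints vector string into its labelled parts."""
    fields = text.split(";")
    vector_type, description, size, values_type, fmt = fields[:5]
    data = fields[5:]
    if fmt == "ValuesString":
        ids, values = None, data[0].split()
    elif fmt == "IDsAndValuesString":
        ids, values = data[0].split(), data[1].split()
    elif fmt == "ValuesAndIDsPairsString":
        tokens = data[0].split()
        values, ids = tokens[0::2], tokens[1::2]
    else:
        raise ValueError(f"format not handled in this sketch: {fmt}")
    return {
        "type": vector_type,
        "description": description,
        "size": int(size),
        "values_type": values_type,
        "format": fmt,
        "ids": ids,
        "values": [int(v) for v in values],
    }

sample = ("FingerprintsVector;TopologicalPharmacophoreAtomTriplets:"
          "ArbitrarySize:MinDistance1:MaxDistance10;3;NumericalValues;"
          "IDsAndValuesString;Ar1-Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1;46 106 8")
parsed = parse_fingerprints_vector(sample)
```

For this sample, parsed["ids"] holds the three triplet IDs and parsed["values"] holds 46, 106 and 8.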
203
204 OPTIONS
205 --AromaticityModel *MDLAromaticityModel | TriposAromaticityModel |
206 MMFFAromaticityModel | ChemAxonBasicAromaticityModel |
207 ChemAxonGeneralAromaticityModel | DaylightAromaticityModel |
208 MayaChemToolsAromaticityModel*
209 Specify aromaticity model to use during detection of aromaticity.
210 Possible values in the current release are: *MDLAromaticityModel,
211 TriposAromaticityModel, MMFFAromaticityModel,
212 ChemAxonBasicAromaticityModel, ChemAxonGeneralAromaticityModel,
213 DaylightAromaticityModel or MayaChemToolsAromaticityModel*. Default
214 value: *MayaChemToolsAromaticityModel*.
215
216 The supported aromaticity model names along with model specific
217 control parameters are defined in AromaticityModelsData.csv, which
218 is distributed with the current release and is available under
219 lib/data directory. Molecule.pm module retrieves data from this file
220 during class instantiation and makes it available to method
221 DetectAromaticity for detecting aromaticity corresponding to a
222 specific model.
223
224 --AtomTripletsSetSizeToUse *ArbitrarySize | FixedSize*
225 Atom triplets set size to use during generation of topological
226 pharmacophore atom triplets fingerprints.
227
228 Possible values: *ArbitrarySize | FixedSize*; Default value:
229 *ArbitrarySize*.
230
231 For *ArbitrarySize* value of --AtomTripletsSetSizeToUse option, the
232 fingerprint vector corresponds to only those topological
233 pharmacophore atom triplets which are present and have non-zero
234 count. However, for *FixedSize* value of --AtomTripletsSetSizeToUse
235 option, the fingerprint vector contains all possible valid
236 topological pharmacophore atom triplets with both zero and non-zero
237 count values.
238
239 -a, --AtomTypesToUse *"AtomType1,AtomType2,..."*
240 Pharmacophore atom types to use during generation of topological
241 pharmacophore atom triplets. It's a list of comma separated valid
242 pharmacophore atom types.
243
244 Possible values for pharmacophore atom types are: *Ar, CA, H, HBA,
245 HBD, Hal, NI, PI, RA*. Default value [ Ref 71 ] :
246 *HBD,HBA,PI,NI,H,Ar*.
247
248 The pharmacophore atom types abbreviations correspond to:
249
250 HBD: HydrogenBondDonor
251 HBA: HydrogenBondAcceptor
252 PI : PositivelyIonizable
253 NI : NegativelyIonizable
254 Ar : Aromatic
255 Hal : Halogen
256 H : Hydrophobic
257 RA : RingAtom
258 CA : ChainAtom
259
260 *AtomTypes::FunctionalClassAtomTypes* module is used to assign
261 pharmacophore atom types. It uses following definitions [ Ref 60-61,
262 Ref 65-66 ]:
263
264 HydrogenBondDonor: NH, NH2, OH
265 HydrogenBondAcceptor: N[!H], O
266 PositivelyIonizable: +, NH2
267 NegativelyIonizable: -, C(=O)OH, S(=O)OH, P(=O)OH
268
269 --CompoundID *DataFieldName or LabelPrefixString*
270 This value is --CompoundIDMode specific and indicates how compound
271 ID is generated.
272
273 For *DataField* value of --CompoundIDMode option, it corresponds to
274 datafield label name whose value is used as compound ID; otherwise,
275 it's a prefix string used for generating compound IDs like
276 LabelPrefixString<Number>. Default value, *Cmpd*, generates compound
277 IDs which look like Cmpd<Number>.
278
279 Examples for *DataField* value of --CompoundIDMode:
280
281 MolID
282 ExtReg
283
284 Examples for *LabelPrefix* or *MolNameOrLabelPrefix* value of
285 --CompoundIDMode:
286
287 Compound
288
289 The value specified above generates compound IDs which correspond to
290 Compound<Number> instead of default value of Cmpd<Number>.
291
292 --CompoundIDLabel *text*
293 Specify compound ID column label for CSV/TSV text file(s) used
294 during *CompoundID* value of --DataFieldsMode option. Default value:
295 *CompoundID*.
296
297 --CompoundIDMode *DataField | MolName | LabelPrefix |
298 MolNameOrLabelPrefix*
299 Specify how to generate compound IDs and write to FP or CSV/TSV text
300 file(s) along with generated fingerprints for *FP | text | all*
301 values of --output option: use a *SDFile(s)* datafield value; use
302 molname line from *SDFile(s)*; generate a sequential ID with
303 specific prefix; use combination of both MolName and LabelPrefix
304 with usage of LabelPrefix values for empty molname lines.
305
306 Possible values: *DataField | MolName | LabelPrefix |
307 MolNameOrLabelPrefix*. Default value: *LabelPrefix*.
308
309 For *MolNameOrLabelPrefix* value of --CompoundIDMode, molname line
310 in *SDFile(s)* takes precedence over sequential compound IDs
311 generated using *LabelPrefix* and only empty molname values are
312 replaced with sequential compound IDs.
313
314 This is only used for *CompoundID* value of --DataFieldsMode option.
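
The four --CompoundIDMode values can be summarized in a short sketch. The function and parameter names are hypothetical, and numbering records from 1 in file order is an assumption based on the Cmpd1, Cmpd2 examples shown earlier.

```python
def compound_id(mode, record_number, mol_name="", data_fields=None,
                compound_id_option="Cmpd"):
    """Pick a compound ID according to the --CompoundIDMode value.

    compound_id_option plays the role of --CompoundID: a data field name
    for DataField mode, otherwise a label prefix.
    """
    data_fields = data_fields or {}
    if mode == "DataField":
        return data_fields.get(compound_id_option, "")
    if mode == "MolName":
        return mol_name
    if mode == "LabelPrefix":
        return f"{compound_id_option}{record_number}"
    if mode == "MolNameOrLabelPrefix":
        # The molname line wins; the prefix is used only when it is empty.
        return mol_name.strip() or f"{compound_id_option}{record_number}"
    raise ValueError(f"unknown CompoundIDMode: {mode}")

default_id = compound_id("LabelPrefix", 1)
field_id = compound_id("DataField", 2, data_fields={"MolID": "M-17"},
                       compound_id_option="MolID")
fallback_id = compound_id("MolNameOrLabelPrefix", 3, mol_name="")
```

These calls give "Cmpd1", "M-17" and "Cmpd3" respectively.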
315
316 --DataFields *"FieldLabel1,FieldLabel2,..."*
317 Comma delimited list of *SDFile(s)* data fields to extract and
318 write to CSV/TSV text file(s) along with generated fingerprints for
319 *text | all* values of --output option.
320
321 This is only used for *Specify* value of --DataFieldsMode option.
322
323 Examples:
324
325 Extreg
326 MolID,CompoundName
327
328 -d, --DataFieldsMode *All | Common | Specify | CompoundID*
329 Specify how data fields in *SDFile(s)* are transferred to output
330 CSV/TSV text file(s) along with generated fingerprints for *text |
331 all* values of --output option: transfer all SD data fields; transfer
332 SD data fields common to all compounds; extract specified data
333 fields; generate a compound ID using molname line, a compound
334 prefix, or a combination of both. Possible values: *All | Common |
335 Specify | CompoundID*. Default value: *CompoundID*.
336
337 --DistanceBinSize *number*
338 Distance bin size used to bin distances between atom pairs in atom
339 triplets. Default value: *2*. Valid values: positive integers.
340
341 For default --MinDistance and --MaxDistance values of 1 and 10 with
342 --DistanceBinSize of 2 [ Ref 70 ], the following 5 distance bins are
343 generated:
344
345 [1, 2] [3, 4] [5, 6] [7, 8] [9, 10]
346
347 The lower distance bound on the distance bin is used to bin the
348 distance between atom pairs in atom triplets. So in the previous
349 example, atom pairs with distances 1 and 2 fall in first distance
350 bin, atom pairs with distances 3 and 4 fall in second distance bin
351 and so on.
352
353 In order to distribute distance bins of equal size, the last bin is
354 allowed to go past --MaxDistance by up to distance bin size. For
355 example, --MinDistance and --MaxDistance values of 2 and 10 with
356 --DistanceBinSize of 2 generate the following 5 distance bins:
357
358 [2, 3] [4, 5] [6, 7] [8, 9] [10, 11]
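
The binning rule can be written out as a short sketch. The function names are illustrative only; the arithmetic follows the bins listed above, including a last bin that may extend past --MaxDistance.

```python
def distance_bins(min_distance, max_distance, bin_size):
    """List (lower, upper) bounds of equal-sized distance bins."""
    return [(lower, lower + bin_size - 1)
            for lower in range(min_distance, max_distance + 1, bin_size)]

def bin_lower_bound(distance, min_distance, bin_size):
    """Return the lower bound of the bin that contains the distance."""
    return min_distance + ((distance - min_distance) // bin_size) * bin_size

default_bins = distance_bins(1, 10, 2)
shifted_bins = distance_bins(2, 10, 2)
```

default_bins is [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)] and shifted_bins ends with (10, 11). A distance of 4 with --MinDistance 1 falls in the bin whose lower bound is 3.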
359
360 -f, --Filter *Yes | No*
361 Specify whether to check and filter compound data in SDFile(s).
362 Possible values: *Yes or No*. Default value: *Yes*.
363
364 By default, compound data is checked before calculating fingerprints
365 and compounds containing atom data corresponding to non-element
366 symbols or no atom data are ignored.
367
368 --FingerprintsLabelMode *FingerprintsLabelOnly |
369 FingerprintsLabelWithIDs*
370 Specify how fingerprints label is generated in conjunction with
371 --FingerprintsLabel option value: use fingerprints label generated
372 only by --FingerprintsLabel option value or append topological atom
373 triplet count value IDs to --FingerprintsLabel option value.
374
375 Possible values: *FingerprintsLabelOnly | FingerprintsLabelWithIDs*.
376 Default value: *FingerprintsLabelOnly*.
377
378 Topological atom triplets IDs appended to --FingerprintsLabel value
379 during *FingerprintsLabelWithIDs* value of --FingerprintsLabelMode
380 correspond to atom triplet count values in fingerprint vector string.
381
382 *FingerprintsLabelWithIDs* value of --FingerprintsLabelMode is
383 ignored during *ArbitrarySize* value of --AtomTripletsSetSizeToUse
384 option and topological atom triplets IDs are not appended to the label.
385
386 --FingerprintsLabel *text*
387 SD data label or text file column label to use for fingerprints
388 string in output SD or CSV/TSV text file(s) specified by --output.
389 Default value: *TopologicalPharmacophoreAtomTripletsFingerprints*.
390
391 -h, --help
392 Print this help message.
393
394 -k, --KeepLargestComponent *Yes | No*
395 Generate fingerprints for only the largest component in molecule.
396 Possible values: *Yes or No*. Default value: *Yes*.
397
398 For molecules containing multiple connected components, fingerprints
399 can be generated in two different ways: use all connected components
400 or just the largest connected component. By default, all atoms
401 except for the largest connected component are deleted before
402 generation of fingerprints.
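
Selecting the largest connected component amounts to a graph traversal over the bond list. A minimal sketch under the assumption that atoms are numbered from 0 and that ties between equally large components go to the one found first (the script's tie-breaking rule is not stated here):

```python
def largest_component(num_atoms, bonds):
    """Return sorted atom indices of the largest connected component."""
    neighbors = [[] for _ in range(num_atoms)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    seen = set()
    best = []
    for start in range(num_atoms):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        component = []
        stack = [start]
        seen.add(start)
        while stack:
            atom = stack.pop()
            component.append(atom)
            for nbr in neighbors[atom]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        if len(component) > len(best):
            best = component
    return sorted(best)

# Hypothetical salt: a three-atom fragment (0-1-2) plus a two-atom fragment (3-4).
kept_atoms = largest_component(5, [(0, 1), (1, 2), (3, 4)])
```

kept_atoms is [0, 1, 2]; atoms 3 and 4 would be dropped before fingerprint generation.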
403
404 --MinDistance *number*
405 Minimum bond distance between atom pairs corresponding to atom
406 triplets for generating topological pharmacophore atom triplets.
407 Default value: *1*. Valid values: positive integers less than
408 --MaxDistance.
409
410 --MaxDistance *number*
411 Maximum bond distance between atom pairs corresponding to atom
412 triplets for generating topological pharmacophore atom triplets.
413 Default value: *10*. Valid values: positive integers greater
414 than --MinDistance.
415
416 --OutDelim *comma | tab | semicolon*
417 Delimiter for output CSV/TSV text file(s). Possible values: *comma,
418 tab, or semicolon*. Default value: *comma*.
419
420 --output *SD | FP | text | all*
421 Type of output files to generate. Possible values: *SD, FP, text, or
422 all*. Default value: *text*.
423
424 -o, --overwrite
425 Overwrite existing files.
426
427 -q, --quote *Yes | No*
428 Put quotes around column values in output CSV/TSV text file(s).
429 Possible values: *Yes or No*. Default value: *Yes*.
430
431 -r, --root *RootName*
432 New file name is generated using the root: <Root>.<Ext>. Default for
433 new file names:
434 <SDFileName><TopologicalPharmacophoreAtomTripletsFP>.<Ext>. The file
435 type determines <Ext> value. The sdf, fpf, csv, and tsv <Ext> values
436 are used for SD, FP, comma/semicolon, and tab delimited text files,
437 respectively. This option is ignored for multiple input files.
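
The naming rule can be sketched as below. The function is hypothetical, and it assumes that <SDFileName> means the input file name without its directory and extension, which the text does not state explicitly.

```python
import os

SUFFIX = "TopologicalPharmacophoreAtomTripletsFP"
EXTENSIONS = {"SD": "sdf", "FP": "fpf"}

def output_file_names(sd_file, output="text", out_delim="comma", root=None):
    """List output file names for the given --output and --root values."""
    if root is None:
        # Assumption: strip directory and extension from the SD file name.
        root = os.path.splitext(os.path.basename(sd_file))[0] + SUFFIX
    # Comma and semicolon delimited files use csv; tab delimited files use tsv.
    text_ext = "tsv" if out_delim == "tab" else "csv"
    kinds = ["SD", "FP", "text"] if output == "all" else [output]
    return [f"{root}.{text_ext if kind == 'text' else EXTENSIONS[kind]}"
            for kind in kinds]

all_outputs = output_file_names("Sample.sdf", output="all", root="SampleTPATFP")
default_output = output_file_names("Sample.sdf")
```

all_outputs lists SampleTPATFP.sdf, SampleTPATFP.fpf and SampleTPATFP.csv; default_output holds the single name SampleTopologicalPharmacophoreAtomTripletsFP.csv.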
438
439 -u, --UseTriangleInequality *Yes | No*
440 Specify whether to apply triangle distance inequality test to
441 distances between atom pairs in atom triplets during generation of
442 the atom triplets basis set. Possible values: *Yes or No*.
443 Default value: *Yes*.
444
445 Triangle distance inequality test implies that distance or binned
446 distance between any atom pair in an atom triplet must be less
447 than the sum of distances or binned distances between the other two
448 atom pairs and greater than the difference of their distances.
449
450 For atom triplet PxDyz-PyDxz-PzDxy to satisfy triangle inequality:
451
452 Dyz > |Dxz - Dxy| and Dyz < Dxz + Dxy
453 Dxz > |Dyz - Dxy| and Dxz < Dyz + Dxy
454 Dxy > |Dyz - Dxz| and Dxy < Dyz + Dxz
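
The three conditions translate directly into code. A minimal sketch, with a hypothetical function name, that applies the strict inequalities to the three distances or bin lower bounds:

```python
def satisfies_triangle_inequality(dxy, dxz, dyz):
    """Check that each distance lies strictly between the difference and
    the sum of the other two distances."""
    return (abs(dxz - dxy) < dyz < dxz + dxy
            and abs(dyz - dxy) < dxz < dyz + dxy
            and abs(dyz - dxz) < dxy < dyz + dxz)

equal_sides = satisfies_triangle_inequality(1, 1, 1)
too_long = satisfies_triangle_inequality(1, 1, 3)
```

equal_sides is True, while too_long is False because 3 is not less than 1 + 1.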
455
456 -v, --VectorStringFormat *ValuesString | IDsAndValuesString |
457 IDsAndValuesPairsString | ValuesAndIDsString | ValuesAndIDsPairsString*
458 Format of fingerprints vector string data in output SD, FP or
459 CSV/TSV text file(s) specified by --output option. Possible values:
460 *ValuesString | IDsAndValuesString | IDsAndValuesPairsString |
461 ValuesAndIDsString | ValuesAndIDsPairsString*. Default value:
462 *ValuesString*.
463
464 Default value during *FixedSize* value of --AtomTripletsSetSizeToUse
465 option: *ValuesString*. Default value during *ArbitrarySize* value
466 of --AtomTripletsSetSizeToUse option: *IDsAndValuesString*.
467
468 *ValuesString* option value is not allowed for *ArbitrarySize* value
469 of --AtomTripletsSetSizeToUse option.
470
471 Examples:
472
473 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:ArbitrarySize:
474 MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesString;Ar1-
475 Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1
476 -H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA1 H1-
477 HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 Ar1-...;
478 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23
479 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1
480 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ...
481
482 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD
483 istance1:MaxDistance10;2692;OrderedNumericalValues;ValuesString;46 106
484 8 3 0 0 83 11 4 0 0 0 1 0 0 0 0 0 0 0 0 21 5 3 0 0 1 2 2 0 0 1 0 0 0
485 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100 101 18 11 0 0 145 132 26
486 14 0 0 23 28 3 3 0 0 5 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 45 10 4 0
487 0 16 20 7 5 1 0 3 4 5 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 5 ...
488
489 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD
490 istance1:MaxDistance10;2692;OrderedNumericalValues;ValuesAndIDsPairsSt
491 ring;46 Ar1-Ar1-Ar1 106 Ar1-Ar1-H1 8 Ar1-Ar1-HBA1 3 Ar1-Ar1-HBD1 0 Ar1
492 -Ar1-NI1 0 Ar1-Ar1-PI1 83 Ar1-H1-H1 11 Ar1-H1-HBA1 4 Ar1-H1-HBD1 0 Ar1
493 -H1-NI1 0 Ar1-H1-PI1 0 Ar1-HBA1-HBA1 1 Ar1-HBA1-HBD1 0 Ar1-HBA1-NI1 0
494 Ar1-HBA1-PI1 0 Ar1-HBD1-HBD1 0 Ar1-HBD1-NI1 0 Ar1-HBD1-PI1 0 Ar1-NI...
495
496 -w, --WorkingDir *DirName*
497 Location of working directory. Default value: current directory.
498
499 EXAMPLES
500 To generate topological pharmacophore atom triplets fingerprints of
501 arbitrary size corresponding to 5 distance bins spanning distances from
502 1 through 10 using default atoms with distances satisfying triangle
503 inequality and create a SampleTPATFP.csv file containing sequential
504 compound IDs along with fingerprints vector strings data in IDsAndValuesString
505 format, type:
506
507 % TopologicalPharmacophoreAtomTripletsFingerprints.pl -r SampleTPATFP
508 -o Sample.sdf
509
510 To generate topological pharmacophore atom triplets fingerprints of
511 fixed size corresponding to 5 distance bins spanning distances from 1
512 through 10 using default atoms with distances satisfying triangle
513 inequality and create a SampleTPATFP.csv file containing sequential
514 compound IDs along with fingerprints vector strings data in ValuesString
515 format, type:
516
517 % TopologicalPharmacophoreAtomTripletsFingerprints.pl
518 --AtomTripletsSetSizeToUse FixedSize -r SampleTPATFP -o Sample.sdf
519
520 To generate topological pharmacophore atom triplets fingerprints of
521 arbitrary size corresponding to 5 distance bins spanning distances from
522 1 through 10 using default atoms with distances satisfying triangle
523 inequality and create SampleTPATFP.sdf, SampleTPATFP.fpf and
524 SampleTPATFP.csv files with CSV file containing sequential compound IDs
525 along with fingerprints vector strings data in IDsAndValuesString format,
526 type:
527
528 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --output all
529 -r SampleTPATFP -o Sample.sdf
530
531 To generate topological pharmacophore atom triplets fingerprints of
532 arbitrary size corresponding to 5 distance bins spanning distances from
533 1 through 10 using default atoms with distances satisfying triangle
534 inequality and create a SampleTPATFP.csv file containing sequential
535 compound IDs along with fingerprints vector strings data in IDsAndValuesString
536 format and atom triplets IDs in the fingerprint data column label
537 starting with Fingerprints, type:
538
539 % TopologicalPharmacophoreAtomTripletsFingerprints.pl
540 --FingerprintsLabelMode FingerprintsLabelWithIDs --FingerprintsLabel
541 Fingerprints -r SampleTPATFP -o Sample.sdf
542
543 To generate topological pharmacophore atom triplets fingerprints of
544 arbitrary size corresponding to 5 distance bins spanning distances from
545 1 through 10 using default atoms with distances not satisfying triangle
546 inequality and create a SampleTPATFP.csv file containing sequential
547 compound IDs along with fingerprints vector strings data in IDsAndValuesString
548 format, type:
549
550 % TopologicalPharmacophoreAtomTripletsFingerprints.pl
551 --UseTriangleInequality No -r SampleTPATFP -o Sample.sdf
552
553 To generate topological pharmacophore atom triplets fingerprints of
554 arbitrary size corresponding to 6 distance bins spanning distances from
555 1 through 12 using default atoms with distances satisfying triangle
556 inequality and create a SampleTPATFP.csv file containing sequential
557 compound IDs along with fingerprints vector strings data in IDsAndValuesString
558 format, type:
559
560 % TopologicalPharmacophoreAtomTripletsFingerprints.pl
561 --UseTriangleInequality Yes --MinDistance 1 --MaxDistance 12
562 --DistanceBinSize 2 -r SampleTPATFP -o Sample.sdf
563
564 To generate topological pharmacophore atom triplets fingerprints of
565 arbitrary size corresponding to 6 distance bins spanning distances from
566 1 through 12 using "HBD,HBA,PI,NI,H,Ar" atoms with distances
567 satisfying triangle inequality and create a SampleTPATFP.csv file
568 containing sequential compound IDs along with fingerprints vector
569 strings data in IDsAndValuesString format, type:
570
571 % TopologicalPharmacophoreAtomTripletsFingerprints.pl
572 --AtomTypesToUse "HBD,HBA,PI,NI,H,Ar" --UseTriangleInequality Yes
573 --MinDistance 1 --MaxDistance 12 --DistanceBinSize 2
574 --VectorStringFormat IDsAndValuesString -r SampleTPATFP -o Sample.sdf
575
576 To generate topological pharmacophore atom triplets fingerprints of
577 arbitrary size corresponding to 5 distance bins spanning distances from
578 1 through 10 using default atoms with distances satisfying triangle
579 inequality and create a SampleTPATFP.csv file containing
580 compound IDs from molecule name line along with fingerprints vector
581 strings data in IDsAndValuesString format, type:
582
583 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode
584 CompoundID --CompoundIDMode MolName -r SampleTPATFP -o Sample.sdf
585
586 To generate topological pharmacophore atom triplets fingerprints of
587 arbitrary size corresponding to 5 distance bins spanning distances from
588 1 through 10 using default atoms with distances satisfying triangle
589 inequality and create a SampleTPATFP.csv file containing
590 compound IDs using specified data field along with fingerprints vector
591 strings data in IDsAndValuesString format, type:
592
593 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode
594 CompoundID --CompoundIDMode DataField --CompoundID Mol_ID
595 -r SampleTPATFP -o Sample.sdf
596
597 To generate topological pharmacophore atom triplets fingerprints of
598 arbitrary size corresponding to 5 distance bins spanning distances from
599 1 through 10 using default atoms with distances satisfying triangle
600 inequality and create a SampleTPATFP.csv file containing
601 compound IDs using combination of molecule name line and an explicit
602 compound prefix along with fingerprints vector strings data, type:
603
604 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode
605 CompoundID --CompoundIDMode MolNameOrLabelPrefix
606 --CompoundID Cmpd --CompoundIDLabel MolID -r SampleTPATFP
607 -o Sample.sdf
608
609 To generate topological pharmacophore atom triplets fingerprints of
610 arbitrary size corresponding to 5 distance bins spanning distances from
611 1 through 10 using default atoms with distances satisfying triangle
612 inequality and create a SampleTPATFP.csv file containing specific data
613 fields columns along with fingerprints vector strings data, type:
614
615 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode
616 Specify --DataFields Mol_ID -r SampleTPATFP -o Sample.sdf
617
618 To generate topological pharmacophore atom triplets fingerprints of
619 arbitrary size corresponding to 5 distance bins spanning distances from
620 1 through 10 using default atoms with distances satisfying triangle
621 inequality and create a SampleTPATFP.csv file containing common data
622 fields columns along with fingerprints vector strings data, type:
623
624 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode
625 Common -r SampleTPATFP -o Sample.sdf
626
627 To generate topological pharmacophore atom triplets fingerprints of
628 arbitrary size corresponding to 5 distance bins spanning distances from
629 1 through 10 using default atoms with distances satisfying triangle
630 inequality and create SampleTPATFP.sdf, SampleTPATFP.fpf and
631 SampleTPATFP.csv files containing all data fields columns in CSV file
632 along with fingerprints data, type:
633
634 % TopologicalPharmacophoreAtomTripletsFingerprints.pl --DataFieldsMode
635 All --output all -r SampleTPATFP -o Sample.sdf
636
637 AUTHOR
638 Manish Sud <msud@san.rr.com>
639
640 SEE ALSO
641 InfoFingerprintsFiles.pl, SimilarityMatricesFingerprints.pl,
642 AtomNeighborhoodsFingerprints.pl, ExtendedConnectivityFingerprints.pl,
643 MACCSKeysFingerprints.pl, PathLengthFingerprints.pl,
644 TopologicalAtomPairsFingerprints.pl,
645 TopologicalAtomTorsionsFingerprints.pl,
646 TopologicalPharmacophoreAtomPairsFingerprints.pl
647
648 COPYRIGHT
649 Copyright (C) 2015 Manish Sud. All rights reserved.
650
651 This file is part of MayaChemTools.
652
653 MayaChemTools is free software; you can redistribute it and/or modify it
654 under the terms of the GNU Lesser General Public License as published by
655 the Free Software Foundation; either version 3 of the License, or (at
656 your option) any later version.
657