NAME
    ExtendedConnectivityFingerprints.pl - Generate extended connectivity
    fingerprints for SD files

SYNOPSIS
    ExtendedConnectivityFingerprints.pl SDFile(s)...

    ExtendedConnectivityFingerprints.pl [--AromaticityModel
    *AromaticityModelType*] [-a, --AtomIdentifierType
    *AtomicInvariantsAtomTypes*] [--AtomicInvariantsToUse
    *"AtomicInvariant,AtomicInvariant..."*] [--FunctionalClassesToUse
    *"FunctionalClass1,FunctionalClass2..."*] [--BitsOrder *Ascending |
    Descending*] [-b, --BitStringFormat *BinaryString | HexadecimalString*]
    [--CompoundID *DataFieldName or LabelPrefixString*] [--CompoundIDLabel
    *text*] [--CompoundIDMode *DataField | MolName | LabelPrefix |
    MolNameOrLabelPrefix*] [--DataFields
    *"FieldLabel1,FieldLabel2,..."*] [-d, --DataFieldsMode *All | Common |
    Specify | CompoundID*] [-f, --Filter *Yes | No*] [--FingerprintsLabel
    *text*] [-h, --help] [-k, --KeepLargestComponent *Yes | No*] [-m, --mode
    *ExtendedConnectivity | ExtendedConnectivityCount |
    ExtendedConnectivityBits*] [-n, --NeighborhoodRadius *number*]
    [--OutDelim *comma | tab | semicolon*] [--output *SD | FP | text | all*]
    [-o, --overwrite] [-q, --quote *Yes | No*] [-r, --root *RootName*] [-s,
    --size *number*] [--UsePerlCoreRandom *Yes | No*] [-v,
    --VectorStringFormat *ValuesString | IDsAndValuesString |
    IDsAndValuesPairsString | ValuesAndIDsString |
    ValuesAndIDsPairsString*] [-w, --WorkingDir *DirName*] SDFile(s)...

DESCRIPTION
    Generate extended connectivity fingerprints [ Ref 48, Ref 52 ] for
    *SDFile(s)* and create appropriate SD, FP or CSV/TSV text file(s)
    containing fingerprints vector strings corresponding to molecular
    fingerprints.

    Multiple SDFile names are separated by spaces. The valid file extensions
    are *.sdf* and *.sd*. All other file names are ignored. All the SD files
    in a current directory can be specified either by **.sdf* or the current
    directory name.

    The current release of MayaChemTools supports generation of extended
    connectivity fingerprints corresponding to the following -a,
    --AtomIdentifierType values:

        AtomicInvariantsAtomTypes, DREIDINGAtomTypes, EStateAtomTypes,
        FunctionalClassAtomTypes, MMFF94AtomTypes, SLogPAtomTypes,
        SYBYLAtomTypes, TPSAAtomTypes, UFFAtomTypes

    Based on values specified for -a, --AtomIdentifierType,
    --AtomicInvariantsToUse and --FunctionalClassesToUse, initial atom types
    are assigned to all non-hydrogen atoms in a molecule and these atom
    type strings are converted into initial atom identifier integers using
    the TextUtil::HashCode function. The duplicate atom identifiers are
    removed.

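The initial assignment step described above can be sketched in a few lines. This is an illustrative Python sketch, not the MayaChemTools Perl code: `zlib.crc32` only stands in for TextUtil::HashCode, whose algorithm is not described here, and the atom type strings are made-up inputs.

```python
import zlib

def initial_atom_identifiers(atom_types):
    """Map atom type strings to integer identifiers and drop duplicates."""
    # zlib.crc32 is only a stand-in for TextUtil::HashCode (assumption).
    identifiers = [zlib.crc32(t.encode("ascii")) for t in atom_types]
    # dict.fromkeys keeps first-seen order while removing duplicates.
    return list(dict.fromkeys(identifiers))

# Hypothetical atom types: two hydroxyl oxygens and one thiol sulfur.
atom_types = ["O.X1.BO1.H1", "O.X1.BO1.H1", "S.X1.BO1.H1"]
unique_ids = initial_atom_identifiers(atom_types)
print(len(unique_ids))  # 2
```

Two atoms with the same atom type string always receive the same integer, which is why the duplicate collapses.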
    For -n, --NeighborhoodRadius value of *0*, the initial set of unique
    atom identifiers comprises the molecule fingerprints. Otherwise, atom
    neighborhoods are generated for each non-hydrogen atom up to the
    specified -n, --NeighborhoodRadius value. For each non-hydrogen central
    atom at a specific radius, its neighbors at the next radius level along
    with their bond orders and previously calculated atom identifiers are
    collected, which in turn are used to generate a new integer atom
    identifier; the list of bond order and atom identifier pairs is first
    sorted by bond order followed by atom identifiers to make these values
    graph invariant.

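One expansion step can be sketched as follows. This is a hedged Python illustration rather than the script's implementation: the exact text that gets hashed (radius, own identifier, then the sorted pairs) is an assumption, `zlib.crc32` again stands in for TextUtil::HashCode, and the tiny three-atom graph is hypothetical.

```python
import zlib

def next_identifiers(identifiers, bonds, radius):
    """Compute new identifiers for all atoms at the next radius level.

    identifiers: {atom: current integer identifier}
    bonds: {atom: [(neighbor, bond_order), ...]}
    """
    updated = {}
    for atom, current in identifiers.items():
        # Sort (bond order, neighbor identifier) pairs so the result does
        # not depend on the order in which neighbors were listed.
        pairs = sorted((order, identifiers[nbr]) for nbr, order in bonds[atom])
        # Assumed layout: radius, own identifier, then the sorted pairs.
        fields = [radius, current] + [v for pair in pairs for v in pair]
        text = " ".join(str(v) for v in fields)
        updated[atom] = zlib.crc32(text.encode("ascii"))  # HashCode stand-in
    return updated

ids = {"a": 11, "b": 22, "c": 33}
# The same graph with the neighbors of atom "a" listed in two different orders.
bonds_1 = {"a": [("b", 1), ("c", 2)], "b": [("a", 1)], "c": [("a", 2)]}
bonds_2 = {"a": [("c", 2), ("b", 1)], "b": [("a", 1)], "c": [("a", 2)]}
print(next_identifiers(ids, bonds_1, 1) == next_identifiers(ids, bonds_2, 1))  # True
```

Sorting the pairs before hashing is what makes the new identifier independent of atom numbering in the input file.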
    After integer atom identifiers have been generated for all non-hydrogen
    atoms at all specified neighborhood radii, the duplicate integer atom
    identifiers corresponding to the same hash code value generated using
    TextUtil::HashCode are tracked by keeping the atom identifiers at the
    lower radius. Additionally, all structurally duplicate integer atom
    identifiers at each specified radius are also tracked by identifying
    equivalent atoms and bonds corresponding to the substructures used for
    generating the atom identifier and keeping the integer atom identifier
    with the lowest value.

    For *ExtendedConnectivity* value of fingerprints -m, --mode, the
    duplicate identifiers are removed from the list and the unique atom
    identifiers constitute the extended connectivity fingerprints of a
    molecule.

    For *ExtendedConnectivityCount* value of fingerprints -m, --mode, the
    occurrences of each unique atom identifier are counted and the
    unique atom identifiers along with their counts constitute the extended
    connectivity fingerprints of a molecule.

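The difference between the two vector modes can be illustrated with a short Python sketch. The identifier list is made up, the first-seen ordering is an assumption (the script's own ordering may differ), and the output layout simply mirrors the IDsAndValuesString examples shown in this document.

```python
from collections import Counter

def unique_identifiers(identifiers):
    """ExtendedConnectivity: keep each identifier once, in first-seen order."""
    return list(dict.fromkeys(identifiers))

def identifier_counts(identifiers):
    """ExtendedConnectivityCount: pair each unique identifier with its count."""
    counts = Counter(identifiers)
    return [(i, counts[i]) for i in unique_identifiers(identifiers)]

def ids_and_values_string(pairs):
    """Join IDs and counts as 'id id ...;count count ...'."""
    ids = " ".join(str(i) for i, _ in pairs)
    values = " ".join(str(c) for _, c in pairs)
    return f"{ids};{values}"

all_ids = [101, 202, 101, 303, 101]  # hypothetical identifiers
pairs = identifier_counts(all_ids)
print(unique_identifiers(all_ids))   # [101, 202, 303]
print(ids_and_values_string(pairs))  # 101 202 303;3 1 1
```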
    For *ExtendedConnectivityBits* value of fingerprints -m, --mode, the
    unique atom identifiers are used as a random number seed to generate a
    random integer value between 0 and --size, which in turn is used to set
    the corresponding bits in the fingerprint bit-vector string.

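The folding of identifiers into a fixed-size bit vector can be sketched as below. This is only an illustration of the idea: Python's `random.Random` is used in place of Perl's rand or MathUtil::random, so the bit positions will not match those produced by the script, and the identifiers are made up.

```python
import random

def bit_vector(identifiers, size=1024):
    """Set one bit per identifier, using the identifier as the random seed."""
    bits = [0] * size
    for identifier in identifiers:
        # Seeding with the identifier makes the chosen position reproducible.
        position = random.Random(identifier).randrange(size)
        bits[position] = 1
    return bits

bits = bit_vector([101, 202, 303], size=64)
print("".join(str(b) for b in bits))
```

Because each position depends only on the identifier and the size, the same set of identifiers always yields the same bit vector regardless of the order in which they are processed. Different identifiers may land on the same bit, so the number of set bits can be smaller than the number of identifiers.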
    Example of *SD* file containing extended connectivity fingerprints
    string data:

        ... ...
        ... ...
        $$$$
        ... ...
        ... ...
        ... ...
         41 44  0  0  0  0  0  0  0  0999 V2000
           -3.3652    1.4499    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        ... ...
          2  3  1  0  0  0  0
        ... ...
        M  END
        > <CmpdID>
        Cmpd1

        > <ExtendedConnectivityFingerprints>
        FingerprintsVector;ExtendedConnectivity:AtomicInvariantsAtomTypes:Radiu
        s2;60;AlphaNumericalValues;ValuesString;73555770 333564680 352413391 66
        6191900 1001270906 1371674323 1481469939 1977749791 2006158649 21414087
        99 49532520 64643108 79385615 96062769 273726379 564565671 855141035 90
        6706094 988546669 1018231313 1032696425 1197507444 1331250018 133853...

        $$$$
        ... ...
        ... ...

    Example of *FP* file containing extended connectivity fingerprints
    string data:

        #
        # Package = MayaChemTools 7.4
        # Release Date = Oct 21, 2010
        #
        # TimeStamp = Fri Mar 11 14:43:57 2011
        #
        # FingerprintsStringType = FingerprintsVector
        #
        # Description = ExtendedConnectivity:AtomicInvariantsAtomTypes:Radius2
        # VectorStringFormat = ValuesString
        # VectorValuesType = AlphaNumericalValues
        #
        Cmpd1 60;73555770 333564680 352413391 666191900 1001270906 137167432...
        Cmpd2 41;73555770 333564680 666191900 1142173602 1363635752 14814699...
        ... ...
        ... ...

    Example of CSV *Text* file containing extended connectivity fingerprints
    string data:

        "CompoundID","ExtendedConnectivityFingerprints"
        "Cmpd1","FingerprintsVector;ExtendedConnectivity:AtomicInvariantsAtomTy
        pes:Radius2;60;AlphaNumericalValues;ValuesString;73555770 333564680 352
        413391 666191900 1001270906 1371674323 1481469939 1977749791 2006158649
        2141408799 49532520 64643108 79385615 96062769 273726379 564565671 8551
        41035 906706094 988546669 1018231313 1032696425 1197507444 13312500..."
        ... ...
        ... ...

    The current release of MayaChemTools generates the following types of
    extended connectivity fingerprints vector strings:

        FingerprintsVector;ExtendedConnectivity:AtomicInvariantsAtomTypes:Radi
        us2;60;AlphaNumericalValues;ValuesString;73555770 333564680 352413391
        666191900 1001270906 1371674323 1481469939 1977749791 2006158649 21414
        08799 49532520 64643108 79385615 96062769 273726379 564565671 85514103
        5 906706094 988546669 1018231313 1032696425 1197507444 1331250018 1338
        532734 1455473691 1607485225 1609687129 1631614296 1670251330 17303...

        FingerprintsVector;ExtendedConnectivityCount:AtomicInvariantsAtomTypes
        :Radius2;60;NumericalValues;IDsAndValuesString;73555770 333564680 3524
        13391 666191900 1001270906 1371674323 1481469939 1977749791 2006158649
        2141408799 49532520 64643108 79385615 96062769 273726379 564565671...;
        3 2 1 1 14 1 2 10 4 3 1 1 1 1 2 1 2 1 1 1 2 3 1 1 2 1 3 3 8 2 2 2 6 2
        1 2 1 1 2 1 1 1 2 1 1 2 1 2 1 1 1 1 1 1 1 1 1 2 1 1

        FingerprintsBitVector;ExtendedConnectivityBits:AtomicInvariantsAtomTyp
        es:Radius2;1024;BinaryString;Ascending;0000000000000000000000000000100
        0000000001010000000110000011000000000000100000000000000000000000100001
        1000000110000000000000000000000000010011000000000000000000000000010000
        0000000000000000000000000010000000000000000001000000000000000000000000
        0000000000010000100001000000000000101000000000000000100000000000000...

        FingerprintsBitVector;ExtendedConnectivityBits:AtomicInvariantsAtomTyp
        es:Radius2;1024;HexadecimalString;Ascending;000000010050c0600800000803
        0300000091000004000000020000100000000124008200020000000040020000000000
        2080000000820040010020000000008040000000000080001000000000400000000000
        4040000090000061010000000800200000000000001400000000020080000000000020
        00008020200000408000

        FingerprintsVector;ExtendedConnectivity:FunctionalClassAtomTypes:Radiu
        s2;57;AlphaNumericalValues;ValuesString;24769214 508787397 850393286 8
        62102353 981185303 1231636850 1649386610 1941540674 263599683 32920567
        1 571109041 639579325 683993318 723853089 810600886 885767127 90326012
        7 958841485 981022393 1126908698 1152248391 1317567065 1421489994 1455
        632544 1557272891 1826413669 1983319256 2015750777 2029559552 20404...

        FingerprintsVector;ExtendedConnectivityCount:FunctionalClassAtomTypes:
        Radius2;57;NumericalValues;IDsAndValuesString;24769214 508787397 85039
        3286 862102353 981185303 1231636850 1649386610 1941540674 263599683 32
        9205671 571109041 639579325 683993318 723853089 810600886 885767127...;
        1 1 1 10 2 22 3 1 3 3 1 1 1 3 2 2 1 2 2 2 3 1 1 1 1 1 14 1 1 1 1 1 1 2
        1 2 1 1 2 2 1 1 2 1 1 1 2 1 1 2 1 1 1 1 1 1 1

        FingerprintsBitVector;ExtendedConnectivityBits:FunctionalClassAtomType
        s:Radius2;1024;BinaryString;Ascending;00000000000000000000100000000000
        0000000001000100000000001000000000000000000000000000000000101000000010
        0000001000000000010000000000000000000000000000000000000000000000000100
        0000000000001000000000000001000000000001001000000000000000000000000000
        0000000000000000100000000000001000000000000000000000000000000000000...

        FingerprintsVector;ExtendedConnectivity:DREIDINGAtomTypes:Radius2;56;A
        lphaNumericalValues;ValuesString;280305427 357928343 721790579 1151822
        898 1207111054 1380963747 1568213839 1603445250 4559268 55012922 18094
        0813 335715751 534801009 684609658 829361048 972945982 999881534 10076
        55741 1213692591 1222032501 1224517934 1235687794 1244268533 152812070
        0 1629595024 1856308891 1978806036 2001865095 2096549435 172675415 ...

        FingerprintsVector;ExtendedConnectivity:EStateAtomTypes:Radius2;62;Alp
        haNumericalValues;ValuesString;25189973 528584866 662581668 671034184
        926543080 1347067490 1738510057 1759600920 2034425745 2097234755 21450
        44754 96779665 180364292 341712110 345278822 386540408 387387308 50430
        1706 617094135 771528807 957666640 997798220 1158349170 1291258082 134
        1138533 1395329837 1420277211 1479584608 1486476397 1487556246 1566...

        FingerprintsVector;ExtendedConnectivity:MMFF94AtomTypes:Radius2;64;Alp
        haNumericalValues;ValuesString;224051550 746527773 998750766 103704190
        2 1239701709 1248384926 1259447756 1521678386 1631549126 1909437580 20
        37095052 2104274756 2117729376 8770364 31445800 81450228 314289324 344
        041929 581773587 638555787 692022098 811840536 929651561 936421792 988
        636432 1048624296 1054288509 1369487579 1454058929 1519352190 17271...

        FingerprintsVector;ExtendedConnectivity:SLogPAtomTypes:Radius2;71;Alph
        aNumericalValues;ValuesString;78989290 116507218 489454042 888737940 1
        162561799 1241797255 1251494264 1263717127 1471206899 1538061784 17654
        07295 1795036542 1809833874 2020454493 2055310842 2117729376 11868981
        56731842 149505242 184525155 196984339 288181334 481409282 556716568 6
        41915747 679881756 721736571 794256218 908276640 992898760 10987549...

        FingerprintsVector;ExtendedConnectivity:SYBYLAtomTypes:Radius2;58;Alph
        aNumericalValues;ValuesString;199957044 313356892 455463968 465982819
        1225318176 1678585943 1883366064 1963811677 2117729376 113784599 19153
        8837 196629033 263865277 416380653 477036669 681527491 730724924 90906
        5537 1021959189 1133014972 1174311016 1359441203 1573452838 1661585138
        1668649038 1684198062 1812312554 1859266290 1891651106 2072549404 ...

        FingerprintsVector;ExtendedConnectivity:TPSAAtomTypes:Radius2;47;Alpha
        NumericalValues;ValuesString;20818206 259344053 862102353 1331904542 1
        700688206 265614156 363161397 681332588 810600886 885767127 950172500
        951454814 1059668746 1247054493 1382302230 1399502637 1805025917 19189
        39561 2114677228 2126402271 8130483 17645742 32278373 149975755 160327
        654 256360355 279492740 291251259 317592700 333763396 972105960 101...

        FingerprintsVector;ExtendedConnectivity:UFFAtomTypes:Radius2;56;AlphaN
        umericalValues;ValuesString;280305427 357928343 721790579 1151822898 1
        207111054 1380963747 1568213839 1603445250 4559268 55012922 180940813
        335715751 534801009 684609658 829361048 972945982 999881534 1007655741
        1213692591 1222032501 1224517934 1235687794 1244268533 1528120700 162
        9595024 1856308891 1978806036 2001865095 2096549435 172675415 18344...

OPTIONS
    --AromaticityModel *MDLAromaticityModel | TriposAromaticityModel |
    MMFFAromaticityModel | ChemAxonBasicAromaticityModel |
    ChemAxonGeneralAromaticityModel | DaylightAromaticityModel |
    MayaChemToolsAromaticityModel*
        Specify aromaticity model to use during detection of aromaticity.
        Possible values in the current release are: *MDLAromaticityModel,
        TriposAromaticityModel, MMFFAromaticityModel,
        ChemAxonBasicAromaticityModel, ChemAxonGeneralAromaticityModel,
        DaylightAromaticityModel or MayaChemToolsAromaticityModel*. Default
        value: *MayaChemToolsAromaticityModel*.

        The supported aromaticity model names along with model specific
        control parameters are defined in AromaticityModelsData.csv, which
        is distributed with the current release and is available under
        lib/data directory. Molecule.pm module retrieves data from this file
        during class instantiation and makes it available to method
        DetectAromaticity for detecting aromaticity corresponding to a
        specific model.

    -a, --AtomIdentifierType *AtomicInvariantsAtomTypes |
    FunctionalClassAtomTypes | DREIDINGAtomTypes | EStateAtomTypes |
    MMFF94AtomTypes | SLogPAtomTypes | SYBYLAtomTypes | TPSAAtomTypes |
    UFFAtomTypes*
        Specify atom identifier type to use for assignment of initial atom
        identifier to non-hydrogen atoms during calculation of extended
        connectivity fingerprints [ Ref 48, Ref 52 ]. Possible values in the
        current release are: *AtomicInvariantsAtomTypes,
        FunctionalClassAtomTypes, DREIDINGAtomTypes, EStateAtomTypes,
        MMFF94AtomTypes, SLogPAtomTypes, SYBYLAtomTypes, TPSAAtomTypes,
        UFFAtomTypes*. Default value: *AtomicInvariantsAtomTypes*.

    --AtomicInvariantsToUse *"AtomicInvariant,AtomicInvariant..."*
        This value is used during *AtomicInvariantsAtomTypes* value of -a,
        --AtomIdentifierType option. It's a list of comma separated valid
        atomic invariant atom types.

        Possible values for atomic invariants are: *AS, X, BO, LBO, SB, DB,
        TB, H, Ar, RA, FC, MN, SM*. Default value [ Ref 24 ]:
        *AS,X,BO,H,FC,MN*.

        The atomic invariants abbreviations correspond to:

            AS        = Atom symbol corresponding to element symbol

            X<n>      = Number of non-hydrogen atom neighbors or heavy atoms
            BO<n>     = Sum of bond orders to non-hydrogen atom neighbors or heavy atoms
            LBO<n>    = Largest bond order of non-hydrogen atom neighbors or heavy atoms
            SB<n>     = Number of single bonds to non-hydrogen atom neighbors or heavy atoms
            DB<n>     = Number of double bonds to non-hydrogen atom neighbors or heavy atoms
            TB<n>     = Number of triple bonds to non-hydrogen atom neighbors or heavy atoms
            H<n>      = Number of implicit and explicit hydrogens for atom
            Ar        = Aromatic annotation indicating whether atom is aromatic
            RA        = Ring atom annotation indicating whether atom is in a ring
            FC<+n/-n> = Formal charge assigned to atom
            MN<n>     = Mass number indicating isotope other than most abundant isotope
            SM<n>     = Spin multiplicity of atom. Possible values: 1 (singlet), 2 (doublet) or
                        3 (triplet)

        Atom type generated by AtomTypes::AtomicInvariantsAtomTypes class
        corresponds to:

            AS.X<n>.BO<n>.LBO<n>.SB<n>.DB<n>.TB<n>.H<n>.Ar.RA.FC<+n/-n>.MN<n>.SM<n>

        Except for AS which is a required atomic invariant in atom types,
        all other atomic invariants are optional. Atom type specification
        doesn't include atomic invariants with zero or undefined values.

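The assembly of an atom type string from its invariants can be sketched as follows. This Python snippet is a minimal illustration of the layout described above, not the AtomTypes::AtomicInvariantsAtomTypes code; the function name, the dictionary input, and the example invariant values are all assumptions made for the sketch.

```python
# Order of the optional invariants in the atom type string.
ORDER = ["X", "BO", "LBO", "SB", "DB", "TB", "H", "Ar", "RA", "FC", "MN", "SM"]

def atom_type(symbol, invariants, use=("X", "BO", "H", "FC", "MN")):
    """Build an AS.X<n>.BO<n>... string; AS (the symbol) is always present."""
    parts = [symbol]
    for name in ORDER:
        value = invariants.get(name)
        if name not in use or not value:
            continue  # zero or undefined invariants are left out
        if name in ("Ar", "RA"):
            parts.append(name)             # flags carry no number
        elif name == "FC":
            parts.append(f"FC{value:+d}")  # formal charge keeps its sign
        else:
            parts.append(f"{name}{value}")
    return ".".join(parts)

# Hypothetical invariants for a carbonyl carbon and a protonated amine nitrogen.
print(atom_type("C", {"X": 3, "BO": 4, "H": 0, "FC": 0}))  # C.X3.BO4
print(atom_type("N", {"X": 1, "BO": 1, "H": 3, "FC": 1}))  # N.X1.BO1.H3.FC+1
```

The default `use` tuple mirrors the documented default of *AS,X,BO,H,FC,MN*, with AS handled separately because it is required.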
        In addition to usage of abbreviations for specifying atomic
        invariants, the following descriptive words are also allowed:

            X   : NumOfNonHydrogenAtomNeighbors or NumOfHeavyAtomNeighbors
            BO  : SumOfBondOrdersToNonHydrogenAtoms or SumOfBondOrdersToHeavyAtoms
            LBO : LargestBondOrderToNonHydrogenAtoms or LargestBondOrderToHeavyAtoms
            SB  : NumOfSingleBondsToNonHydrogenAtoms or NumOfSingleBondsToHeavyAtoms
            DB  : NumOfDoubleBondsToNonHydrogenAtoms or NumOfDoubleBondsToHeavyAtoms
            TB  : NumOfTripleBondsToNonHydrogenAtoms or NumOfTripleBondsToHeavyAtoms
            H   : NumOfImplicitAndExplicitHydrogens
            Ar  : Aromatic
            RA  : RingAtom
            FC  : FormalCharge
            MN  : MassNumber
            SM  : SpinMultiplicity

        *AtomTypes::AtomicInvariantsAtomTypes* module is used to assign
        atomic invariant atom types.

    --BitsOrder *Ascending | Descending*
        Bits order to use during generation of fingerprints bit-vector
        string for *ExtendedConnectivityBits* value of -m, --mode option.
        Possible values: *Ascending, Descending*. Default: *Ascending*.

        *Ascending* bit order corresponds to the first bit in each byte
        being the lowest bit, as opposed to the highest bit.

        Internally, bits are stored in *Ascending* order using Perl vec
        function. Regardless of machine order, big-endian or little-endian,
        vec function always considers first string byte as the lowest byte
        and first bit within each byte as the lowest bit.

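The effect of the two bit orders on a binary string can be shown with a small Python sketch. It is an illustration of the ordering rule described above under the assumption that only the bit order within each byte changes; the function name and the sample bytes are hypothetical.

```python
def bits_to_string(data: bytes, order: str = "Ascending") -> str:
    """Render bytes as a string of 0s and 1s in the requested bit order."""
    chunks = []
    for byte in data:
        msb_first = format(byte, "08b")  # highest bit first
        # Ascending puts the lowest bit of each byte first.
        chunks.append(msb_first[::-1] if order == "Ascending" else msb_first)
    return "".join(chunks)

print(bits_to_string(b"\x01\x05", "Ascending"))   # 1000000010100000
print(bits_to_string(b"\x01\x05", "Descending"))  # 0000000100000101
```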
    -b, --BitStringFormat *BinaryString | HexadecimalString*
        Format of fingerprints bit-vector string data in output SD, FP or
        CSV/TSV text file(s) specified by --output used during
        *ExtendedConnectivityBits* value of -m, --mode option. Possible
        values: *BinaryString, HexadecimalString*. Default value:
        *BinaryString*.

        *BinaryString* corresponds to an ASCII string containing 1s and 0s.
        *HexadecimalString* contains bit values in ASCII hexadecimal format.

        Examples:

            FingerprintsBitVector;ExtendedConnectivityBits:AtomicInvariantsAtomTyp
            es:Radius2;1024;BinaryString;Ascending;0000000000000000000000000000100
            0000000001010000000110000011000000000000100000000000000000000000100001
            1000000110000000000000000000000000010011000000000000000000000000010000
            0000000000000000000000000010000000000000000001000000000000000000000000
            0000000000010000100001000000000000101000000000000000100000000000000...

            FingerprintsBitVector;ExtendedConnectivityBits:FunctionalClassAtomType
            s:Radius2;1024;BinaryString;Ascending;00000000000000000000100000000000
            0000000001000100000000001000000000000000000000000000000000101000000010
            0000001000000000010000000000000000000000000000000000000000000000000100
            0000000000001000000000000001000000000001001000000000000000000000000000
            0000000000000000100000000000001000000000000000000000000000000000000...

    --FunctionalClassesToUse *"FunctionalClass1,FunctionalClass2..."*
        This value is used during *FunctionalClassAtomTypes* value of -a,
        --AtomIdentifierType option. It's a list of comma separated valid
        functional classes.

        Possible values for atom functional classes are: *Ar, CA, H, HBA,
        HBD, Hal, NI, PI, RA*. Default value [ Ref 24 ]:
        *HBD,HBA,PI,NI,Ar,Hal*.

        The functional class abbreviations correspond to:

            HBD : HydrogenBondDonor
            HBA : HydrogenBondAcceptor
            PI  : PositivelyIonizable
            NI  : NegativelyIonizable
            Ar  : Aromatic
            Hal : Halogen
            H   : Hydrophobic
            RA  : RingAtom
            CA  : ChainAtom

        Functional class atom type specification for an atom corresponds to:

            Ar.CA.H.HBA.HBD.Hal.NI.PI.RA

        *AtomTypes::FunctionalClassAtomTypes* module is used to assign
        functional class atom types. It uses following definitions [ Ref
        60-61, Ref 65-66 ]:

            HydrogenBondDonor: NH, NH2, OH
            HydrogenBondAcceptor: N[!H], O
            PositivelyIonizable: +, NH2
            NegativelyIonizable: -, C(=O)OH, S(=O)OH, P(=O)OH

    --CompoundID *DataFieldName or LabelPrefixString*
        This value is --CompoundIDMode specific and indicates how compound
        ID is generated.

        For *DataField* value of --CompoundIDMode option, it corresponds to
        datafield label name whose value is used as compound ID; otherwise,
        it's a prefix string used for generating compound IDs like
        LabelPrefixString<Number>. Default value, *Cmpd*, generates compound
        IDs which look like Cmpd<Number>.

        Examples for *DataField* value of --CompoundIDMode:

            MolID
            ExtReg

        Examples for *LabelPrefix* or *MolNameOrLabelPrefix* value of
        --CompoundIDMode:

            Compound

        The value specified above generates compound IDs which correspond to
        Compound<Number> instead of default value of Cmpd<Number>.

    --CompoundIDLabel *text*
        Specify compound ID column label for FP or CSV/TSV text file(s) used
        during *CompoundID* value of --DataFieldsMode option. Default:
        *CompoundID*.

    --CompoundIDMode *DataField | MolName | LabelPrefix |
    MolNameOrLabelPrefix*
        Specify how to generate compound IDs and write to FP or CSV/TSV text
        file(s) along with generated fingerprints for *FP | text | all*
        values of --output option: use a *SDFile(s)* datafield value; use
        molname line from *SDFile(s)*; generate a sequential ID with
        specific prefix; use combination of both MolName and LabelPrefix
        with usage of LabelPrefix values for empty molname lines.

        Possible values: *DataField | MolName | LabelPrefix |
        MolNameOrLabelPrefix*. Default: *LabelPrefix*.

        For *MolNameOrLabelPrefix* value of --CompoundIDMode, molname line
        in *SDFile(s)* takes precedence over sequential compound IDs
        generated using *LabelPrefix* and only empty molname values are
        replaced with sequential compound IDs.

        This is only used for *CompoundID* value of --DataFieldsMode option.

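The four --CompoundIDMode behaviors, combined with the --CompoundID value, can be summarized in a short Python sketch. The function and its parameters are hypothetical and only restate the rules described above; how the script handles a missing data field is not covered here.

```python
def compound_id(mode, number, value="Cmpd", data_fields=None, mol_name=""):
    """Return a compound ID for the given --CompoundIDMode.

    value is the --CompoundID setting: a data field label in DataField
    mode, otherwise the prefix used for sequential IDs.
    """
    if mode == "DataField":
        return data_fields[value]
    if mode == "MolName":
        return mol_name
    if mode == "LabelPrefix":
        return f"{value}{number}"
    if mode == "MolNameOrLabelPrefix":
        # The molname line wins; only empty names fall back to the prefix.
        return mol_name if mol_name.strip() else f"{value}{number}"
    raise ValueError(f"unknown CompoundIDMode: {mode}")

print(compound_id("LabelPrefix", 1))                              # Cmpd1
print(compound_id("LabelPrefix", 2, value="Compound"))            # Compound2
print(compound_id("MolNameOrLabelPrefix", 3, mol_name="Aspirin")) # Aspirin
print(compound_id("MolNameOrLabelPrefix", 4, mol_name=""))        # Cmpd4
```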
    --DataFields *"FieldLabel1,FieldLabel2,..."*
        Comma delimited list of *SDFile(s)* data fields to extract and
        write to CSV/TSV text file(s) along with generated fingerprints for
        *text | all* values of --output option.

        This is only used for *Specify* value of --DataFieldsMode option.

        Examples:

            Extreg
            MolID,CompoundName

    -d, --DataFieldsMode *All | Common | Specify | CompoundID*
        Specify how data fields in *SDFile(s)* are transferred to output
        CSV/TSV text file(s) along with generated fingerprints for *text |
        all* values of --output option: transfer all SD data fields; transfer
        SD data fields common to all compounds; extract specified data
        fields; generate a compound ID using molname line, a compound
        prefix, or a combination of both. Possible values: *All | Common |
        Specify | CompoundID*. Default value: *CompoundID*.

    -f, --Filter *Yes | No*
        Specify whether to check and filter compound data in SDFile(s).
        Possible values: *Yes or No*. Default value: *Yes*.

        By default, compound data is checked before calculating fingerprints
        and compounds containing atom data corresponding to non-element
        symbols or no atom data are ignored.

    --FingerprintsLabel *text*
        SD data label or text file column label to use for fingerprints
        string in output SD or CSV/TSV text file(s) specified by --output.
        Default value: *ExtendedConnectivityFingerprints*.

    -h, --help
        Print this help message.

    -k, --KeepLargestComponent *Yes | No*
        Generate fingerprints for only the largest component in molecule.
        Possible values: *Yes or No*. Default value: *Yes*.

        For molecules containing multiple connected components, fingerprints
        can be generated in two different ways: use all connected components
        or just the largest connected component. By default, all atoms
        except for the largest connected component are deleted before
        generation of fingerprints.

    -m, --mode *ExtendedConnectivity | ExtendedConnectivityCount |
    ExtendedConnectivityBits*
        Specify type of extended connectivity fingerprints to generate for
        molecules in *SDFile(s)*. Possible values: *ExtendedConnectivity,
        ExtendedConnectivityCount or ExtendedConnectivityBits*. Default
        value: *ExtendedConnectivity*.

        For *ExtendedConnectivity* value of fingerprints -m, --mode, a
        fingerprint vector containing unique atom identifiers constitutes
        the extended connectivity fingerprints of a molecule.

        For *ExtendedConnectivityCount* value of fingerprints -m, --mode, a
        fingerprint vector containing unique atom identifiers along with
        their counts constitutes the extended connectivity fingerprints of a
        molecule.

        For *ExtendedConnectivityBits* value of fingerprints -m, --mode, a
        fingerprint bit vector indicating presence/absence of structurally
        unique atom identifiers constitutes the extended connectivity
        fingerprints of a molecule.

    -n, --NeighborhoodRadius *number*
        Atomic neighborhood radius for generating extended connectivity
        neighborhoods. Default value: *2*. Valid values: >= 0. Neighborhood
        radius of zero corresponds to just the list of non-hydrogen atoms.

        Default value of *2* for atomic neighborhood radius generates
        extended connectivity fingerprints corresponding to path length or
        diameter value of *4* [ Ref 52b ].

    --OutDelim *comma | tab | semicolon*
        Delimiter for output CSV/TSV text file(s). Possible values: *comma,
        tab, or semicolon*. Default value: *comma*.

    --output *SD | FP | text | all*
        Type of output files to generate. Possible values: *SD, FP, text, or
        all*. Default value: *text*.

    -o, --overwrite
        Overwrite existing files.

    -q, --quote *Yes | No*
        Put quotes around column values in output CSV/TSV text file(s).
        Possible values: *Yes or No*. Default value: *Yes*.

    -r, --root *RootName*
        New file name is generated using the root: <Root>.<Ext>. Default for
        new file names: <SDFileName><ExtendedConnectivityFP>.<Ext>. The file
        type determines <Ext> value. The sdf, fpf, csv, and tsv <Ext> values
        are used for SD, FP, comma/semicolon, and tab delimited text files,
        respectively. This option is ignored for multiple input files.

    -s, --size *number*
        Size of bit-vector to use during generation of fingerprints
        bit-vector string for *ExtendedConnectivityBits* value of -m,
        --mode. Default value: *1024*. Valid values correspond to any
        positive integer which satisfies the following criteria: power of 2,
        >= 32 and <= 2 ** 32.

        Examples:

            512
            1024
            2048

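The three criteria for a valid --size value can be checked with a one-line test. The following Python sketch only restates the documented rule; the function name is hypothetical.

```python
def is_valid_size(size):
    """True for integers that are a power of 2 within [32, 2 ** 32]."""
    if not isinstance(size, int) or not 32 <= size <= 2 ** 32:
        return False
    # A power of two has exactly one bit set, so size & (size - 1) is zero.
    return (size & (size - 1)) == 0

for candidate in (512, 1024, 2048, 1000, 16):
    print(candidate, is_valid_size(candidate))
```

The first three candidates are the documented examples and pass; 1000 is not a power of 2 and 16 is below the lower bound, so both are rejected.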
    --UsePerlCoreRandom *Yes | No*
        Specify whether to use Perl CORE::rand or MayaChemTools
        MathUtil::random function during random number generation for
        setting bits in fingerprints bit-vector strings. Possible values:
        *Yes or No*. Default value: *Yes*.

        *No* value option for --UsePerlCoreRandom allows the generation of
        fingerprints bit-vector strings which are same across different
        platforms.

        The random number generator implemented in MayaChemTools is a
        variant of linear congruential generator (LCG) as described by
        Miller et al. [ Ref 120 ]. It is also referred to as Lehmer random
        number generator or Park-Miller random number generator.

        Unlike Perl's core random number generator function rand, the random
        number generator implemented in MayaChemTools, MathUtil::random,
        generates consistent random values across different platforms for a
        specific random seed and leads to generation of portable
        fingerprints bit-vector strings.

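To show why a Lehmer generator is portable, here is a minimal Python sketch of the classic Park-Miller "minimal standard" form. The multiplier 16807 and modulus 2**31 - 1 are the textbook constants and are an assumption here; MathUtil::random is described only as a variant, so its constants and seeding may differ.

```python
class LehmerRandom:
    """Park-Miller minimal standard generator (textbook constants)."""

    MODULUS = 2 ** 31 - 1   # a Mersenne prime
    MULTIPLIER = 16807      # 7 ** 5

    def __init__(self, seed):
        # The state must stay in 1 .. MODULUS - 1; zero would get stuck.
        self.state = seed % self.MODULUS or 1

    def next_int(self):
        self.state = (self.MULTIPLIER * self.state) % self.MODULUS
        return self.state

    def random(self):
        """Return a float in (0, 1)."""
        return self.next_int() / self.MODULUS

generator = LehmerRandom(seed=1)
print(generator.next_int())  # 16807
print(generator.next_int())  # 282475249
```

Only integer multiplication and a modulus are involved, so the same seed gives the same sequence on any platform, which is the property the *No* setting relies on.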
590 -v, --VectorStringFormat *ValuesString | IDsAndValuesString |
|
|
591 IDsAndValuesPairsString | ValuesAndIDsString | ValuesAndIDsPairsString*
|
|
592 Format of fingerprints vector string data in output SD, FP or
|
|
593 CSV/TSV text file(s) specified by --output used during
|
|
594 <ExtendedConnectivityCount> value of -m, --mode option. Possible
|
|
595 values: *ValuesString, IDsAndValuesString | IDsAndValuesPairsString
|
|
596 | ValuesAndIDsString | ValuesAndIDsPairsString*.
|
|
597
|
|
598 Default value during <ExtendedConnectivityCount> value of -m, --mode
|
|
599 option: *IDsAndValuesString*.
|
|
600
|
|
601 Default value during <ExtendedConnectivity> value of -m, --mode
|
|
602 option: *ValuesString*.
|
|
603
|
|
604 Examples:
|
|
605
|
|
            FingerprintsVector;ExtendedConnectivity:AtomicInvariantsAtomTypes:Radi
            us2;60;AlphaNumericalValues;ValuesString;73555770 333564680 352413391
            666191900 1001270906 1371674323 1481469939 1977749791 2006158649 21414
            08799 49532520 64643108 79385615 96062769 273726379 564565671 85514103
            5 906706094 988546669 1018231313 1032696425 1197507444 1331250018 1338
            532734 1455473691 1607485225 1609687129 1631614296 1670251330 17303...

            FingerprintsVector;ExtendedConnectivityCount:AtomicInvariantsAtomTypes
            :Radius2;60;NumericalValues;IDsAndValuesString;73555770 333564680 3524
            13391 666191900 1001270906 1371674323 1481469939 1977749791 2006158649
            2141408799 49532520 64643108 79385615 96062769 273726379 564565671...;
            3 2 1 1 14 1 2 10 4 3 1 1 1 1 2 1 2 1 1 1 2 3 1 1 2 1 3 3 8 2 2 2 6 2
            1 2 1 1 2 1 1 1 2 1 1 2 1 2 1 1 1 1 1 1 1 1 1 2 1 1

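        The two example strings above suggest a semicolon-delimited layout:
        a FingerprintsVector tag, a description, a value count, a value
        type, the vector string format, and then the data. The following
        Python sketch splits such a string under that assumption. The layout
        is inferred only from the examples in this section, the function
        names parse_fingerprints_vector and id_counts are hypothetical, and
        only ValuesString and IDsAndValuesString are handled because the
        other formats are not shown here.

```python
# Sketch: split a fingerprints vector string into header fields and data.
# ASSUMPTION: layout inferred from the examples in this section only:
#   FingerprintsVector;<description>;<size>;<value type>;<format>;<data...>
# Formats not illustrated in this section are rejected rather than guessed.


def parse_fingerprints_vector(text):
    """Return a dict with the header fields, identifiers and values."""
    fields = text.strip().split(";")
    if len(fields) < 6 or fields[0] != "FingerprintsVector":
        raise ValueError("not a FingerprintsVector string")
    description, size, value_type, vector_format = fields[1:5]
    # Each remaining field is a space-delimited list.
    data = [chunk.split() for chunk in fields[5:]]
    if vector_format == "ValuesString" and len(data) == 1:
        ids, values = None, data[0]
    elif vector_format == "IDsAndValuesString" and len(data) == 2:
        ids, values = data
    else:
        raise ValueError("unsupported or malformed format: " + vector_format)
    return {
        "description": description,
        "size": int(size),
        "value_type": value_type,
        "format": vector_format,
        "ids": ids,
        "values": values,
    }


def id_counts(parsed):
    """Map each identifier to its integer count (IDsAndValuesString only)."""
    if parsed["ids"] is None:
        raise ValueError("vector has no separate identifiers")
    return {i: int(v) for i, v in zip(parsed["ids"], parsed["values"])}


if __name__ == "__main__":
    # Shortened, illustrative string built from the first three IDs and
    # counts of the ExtendedConnectivityCount example; not real tool output.
    example = (
        "FingerprintsVector;"
        "ExtendedConnectivityCount:AtomicInvariantsAtomTypes:Radius2;"
        "3;NumericalValues;IDsAndValuesString;"
        "73555770 333564680 352413391;3 2 1"
    )
    print(id_counts(parse_fingerprints_vector(example)))
```

        Identifiers are kept as strings so that the same code also accepts
        alphanumerical values without loss.
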
    -w, --WorkingDir *DirName*
        Location of working directory. Default: current directory.

EXAMPLES
    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using atomic invariants atom types in vector
    string format and create a SampleECAIFP.csv file containing sequential
    compound IDs along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -r SampleECAIFP -o Sample.sdf

    To generate extended connectivity count fingerprints corresponding to
    neighborhood radius up to 2 using atomic invariants atom types in vector
    string format and create a SampleECAIFP.csv file containing sequential
    compound IDs along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -m ExtendedConnectivityCount
          -r SampleECAIFP -o Sample.sdf

    To generate extended connectivity bits fingerprints as hexadecimal
    bit-strings corresponding to neighborhood radius up to 2 using atomic
    invariants atom types in bit-vector string format and create a
    SampleECAIFP.csv file containing sequential compound IDs along with
    fingerprints bit-vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -m ExtendedConnectivityBits
          -r SampleECAIFP -o Sample.sdf

    To generate extended connectivity bits fingerprints as binary
    bit-strings corresponding to neighborhood radius up to 2 using atomic
    invariants atom types in bit-vector string format and create a
    SampleECAIFP.csv file containing sequential compound IDs along with
    fingerprints bit-vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -m ExtendedConnectivityBits
          --BitStringFormat BinaryString -r SampleECAIFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using atomic invariants atom types in vector
    string format and create SampleECAIFP.sdf, SampleECAIFP.fpf and
    SampleECAIFP.csv files, with sequential compound IDs in the CSV file
    along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl --output all -r SampleECAIFP
          -o Sample.sdf

    To generate extended connectivity count fingerprints corresponding to
    neighborhood radius up to 2 using atomic invariants atom types in vector
    string format and create SampleECAIFP.sdf, SampleECAIFP.fpf and
    SampleECAIFP.csv files, with sequential compound IDs in the CSV file
    along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -m ExtendedConnectivityCount
          --output all -r SampleECAIFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using functional class atom types in vector
    string format and create a SampleECFCFP.csv file containing sequential
    compound IDs along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a FunctionalClassAtomTypes
          -r SampleECFCFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using DREIDING atom types in vector string
    format and create a SampleECFP.csv file containing sequential compound
    IDs along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a DREIDINGAtomTypes
          -r SampleECFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using E-state atom types in vector string
    format and create a SampleECFP.csv file containing sequential compound
    IDs along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a EStateAtomTypes
          -r SampleECFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using MMFF94 atom types in vector string
    format and create a SampleECFP.csv file containing sequential compound
    IDs along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a MMFF94AtomTypes
          -r SampleECFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using SLogP atom types in vector string
    format and create a SampleECFP.csv file containing sequential compound
    IDs along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a SLogPAtomTypes
          -r SampleECFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using SYBYL atom types in vector string
    format and create a SampleECFP.csv file containing sequential compound
    IDs along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a SYBYLAtomTypes
          -r SampleECFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using TPSA atom types in vector string
    format and create a SampleECFP.csv file containing sequential compound
    IDs along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a TPSAAtomTypes
          -r SampleECFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using UFF atom types in vector string format
    and create a SampleECFP.csv file containing sequential compound IDs
    along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a UFFAtomTypes
          -r SampleECFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 3 using atomic invariants atom types in vector
    string format and create a SampleECAIFP.csv file containing sequential
    compound IDs along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a AtomicInvariantsAtomTypes -n 3
          -r SampleECAIFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 3 using functional class atom types in vector
    string format and create a SampleECFCFP.csv file containing sequential
    compound IDs along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a FunctionalClassAtomTypes -n 3
          -r SampleECFCFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using only AS,X atomic invariants atom types
    in vector string format and create a SampleECAIFP.csv file containing
    sequential compound IDs along with fingerprints vector strings data,
    type:

        % ExtendedConnectivityFingerprints.pl -a AtomicInvariantsAtomTypes
          --AtomicInvariantsToUse "AS,X" -r SampleECAIFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using only HBD,HBA functional class atom
    types in vector string format and create a SampleECFCFP.csv file
    containing sequential compound IDs along with fingerprints vector
    strings data, type:

        % ExtendedConnectivityFingerprints.pl -a FunctionalClassAtomTypes
          --FunctionalClassesToUse "HBD,HBA" -r SampleECFCFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using atomic invariants atom types in vector
    string format and create a SampleECAIFP.csv file containing compound IDs
    from the molecule name line along with fingerprints vector strings data,
    type:

        % ExtendedConnectivityFingerprints.pl -a AtomicInvariantsAtomTypes
          --DataFieldsMode CompoundID --CompoundIDMode MolName
          -r SampleECAIFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using functional class atom types in vector
    string format and create a SampleECFCFP.csv file containing compound IDs
    from a specified data field along with fingerprints vector strings data,
    type:

        % ExtendedConnectivityFingerprints.pl -a FunctionalClassAtomTypes
          --DataFieldsMode CompoundID --CompoundIDMode DataField
          --CompoundID Mol_ID -r SampleECFCFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using atomic invariants atom types in vector
    string format and create a SampleECAIFP.tsv file containing compound IDs
    built from the molecule name line or an explicit compound prefix along
    with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a AtomicInvariantsAtomTypes
          --DataFieldsMode CompoundID --CompoundIDMode MolnameOrLabelPrefix
          --CompoundID Cmpd --CompoundIDLabel MolID --OutDelim tab
          -r SampleECAIFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using functional class atom types in vector
    string format and create a SampleECFCFP.csv file containing specific
    data field columns along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a FunctionalClassAtomTypes
          --DataFieldsMode Specify --DataFields Mol_ID -r SampleECFCFP
          -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using atomic invariants atom types in vector
    string format and create a SampleECAIFP.tsv file containing common data
    field columns along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a AtomicInvariantsAtomTypes
          --DataFieldsMode Common --OutDelim tab -r SampleECAIFP -o Sample.sdf

    To generate extended connectivity fingerprints corresponding to
    neighborhood radius up to 2 using functional class atom types in vector
    string format and create SampleECFCFP.sdf, SampleECFCFP.fpf and
    SampleECFCFP.csv files, with all data field columns in the CSV file
    along with fingerprints vector strings data, type:

        % ExtendedConnectivityFingerprints.pl -a FunctionalClassAtomTypes
          --DataFieldsMode All --output all -r SampleECFCFP
          -o Sample.sdf

AUTHOR
    Manish Sud <msud@san.rr.com>

SEE ALSO
    InfoFingerprintsFiles.pl, SimilarityMatricesFingerprints.pl,
    AtomNeighborhoodsFingerprints.pl, MACCSKeysFingerprints.pl,
    PathLengthFingerprints.pl, TopologicalAtomPairsFingerprints.pl,
    TopologicalAtomTorsionsFingerprints.pl,
    TopologicalPharmacophoreAtomPairsFingerprints.pl,
    TopologicalPharmacophoreAtomTripletsFingerprints.pl

COPYRIGHT
    Copyright (C) 2015 Manish Sud. All rights reserved.

    This file is part of MayaChemTools.

    MayaChemTools is free software; you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation; either version 3 of the License, or (at
    your option) any later version.
