0
|
1 NAME
|
|
2 FingerprintsVector
|
|
3
|
|
4 SYNOPSIS
|
|
5 use Fingerprints::FingerprintsVector;
|
|
6
|
|
7 use Fingerprints::FingerprintsVector qw(:all);
|
|
8
|
|
9 DESCRIPTION
|
|
10 FingerprintsVector class provides the following methods:
|
|
11
|
|
12 new, AddValueIDs, AddValues, CityBlockDistanceCoefficient,
|
|
13 CosineSimilarityCoefficient, CzekanowskiSimilarityCoefficient,
|
|
14 DiceSimilarityCoefficient, EuclideanDistanceCoefficient, GetDescription,
|
|
15 GetFingerprintsVectorString, GetID, GetIDsAndValuesPairsString,
|
|
16 GetIDsAndValuesString, GetNumOfNonZeroValues, GetNumOfValueIDs,
|
|
17 GetNumOfValues, GetSupportedDistanceAndSimilarityCoefficients,
|
|
18 GetSupportedDistanceCoefficients, GetSupportedSimilarityCoefficients,
|
|
19 GetType, GetValue, GetValueID, GetValueIDs, GetValueIDsString,
|
|
20 GetValues, GetValuesAndIDsPairsString, GetValuesAndIDsString,
|
|
21 GetValuesString, GetVectorType, HammingDistanceCoefficient,
|
|
22 IsFingerprintsVector, JaccardSimilarityCoefficient,
|
|
23 ManhattanDistanceCoefficient, NewFromIDsAndValuesPairsString,
|
|
24 NewFromIDsAndValuesString, NewFromValuesAndIDsPairsString,
|
|
25 NewFromValuesAndIDsString, NewFromValuesString,
|
|
26 OchiaiSimilarityCoefficient, SetDescription, SetID, SetType, SetValue,
|
|
27 SetValueID, SetValueIDs, SetValues, SetVectorType,
|
|
28 SoergelDistanceCoefficient, SorensonSimilarityCoefficient,
|
|
29 StringifyFingerprintsVector, TanimotoSimilarityCoefficient
|
|
30
|
|
31 The methods available to create a fingerprints vector from strings and to
|
|
32 calculate similarity and distance coefficients between two vectors can
|
|
33 also be invoked as class functions.
|
|
34
|
|
35 FingerprintsVector class provides support to perform comparison between
|
|
36 vectors containing three different types of values:
|
|
37
|
|
38 Type I: OrderedNumericalValues
|
|
39
|
|
40 o Sizes of the two vectors are the same
|
|
41 o Vectors contain real values in a specific order. For example: MACCS keys
|
|
42 count, Topological pharmacophore atom pairs and so on.
|
|
43
|
|
44 Type II: UnorderedNumericalValues
|
|
45
|
|
46 o Sizes of the two vectors might not be the same
|
|
47 o Vectors contain unordered real values identified by value IDs. For example:
|
|
48 Topological atom pairs, Topological atom torsions and so on
|
|
49
|
|
50 Type III: AlphaNumericalValues
|
|
51
|
|
52 o Sizes of the two vectors might not be the same
|
|
53 o Vectors contain unordered alphanumerical values. For example: Extended
|
|
54 connectivity fingerprints, atom neighborhood fingerprints.
|
|
55
|
|
56 Before performing similarity or distance calculations between vectors
|
|
57 containing UnorderedNumericalValues or AlphaNumericalValues, the vectors
|
|
58 are transformed into vectors containing unique OrderedNumericalValues
|
|
59 using value IDs for UnorderedNumericalValues and the values themselves for
|
|
60 AlphaNumericalValues.
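The transformation described above can be sketched in plain Perl. This is a minimal illustration, not the module's internal code: the helper name AlignByValueIDs and the sample IDs and values are assumptions made for this example. It maps two vectors of unordered numerical values onto the sorted union of their value IDs and fills in 0 where an ID is absent. For alphanumerical values the same idea would apply, with the values themselves acting as IDs and their counts as the numerical values.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of FingerprintsVector): align two vectors,
# each given as parallel lists of value IDs and values, onto the sorted
# union of their IDs so they can be compared as ordered numerical values.
sub AlignByValueIDs {
    my ($IDsA, $ValuesA, $IDsB, $ValuesB) = @_;

    # Map each value ID to its value
    my (%A, %B);
    @A{@$IDsA} = @$ValuesA;
    @B{@$IDsB} = @$ValuesB;

    # Unique IDs from both vectors, in sorted order
    my %Seen;
    my @UniqueIDs = grep { !$Seen{$_}++ } (@$IDsA, @$IDsB);
    my @UnionIDs  = sort @UniqueIDs;

    # An ID missing from a vector contributes 0
    my @OrderedA = map { exists $A{$_} ? $A{$_} : 0 } @UnionIDs;
    my @OrderedB = map { exists $B{$_} ? $B{$_} : 0 } @UnionIDs;

    return (\@UnionIDs, \@OrderedA, \@OrderedB);
}

my ($IDs, $Xa, $Xb) = AlignByValueIDs(
    ['C-O', 'C-N', 'C-C'], [2, 1, 4],
    ['C-C', 'N-O'],        [3, 1]);

print "IDs: @$IDs\n";   # IDs: C-C C-N C-O N-O
print "Xa:  @$Xa\n";    # Xa:  4 1 2 0
print "Xb:  @$Xb\n";    # Xb:  3 0 0 1
```

After alignment both vectors have the same size and order, so the formulas for OrderedNumericalValues apply directly.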
|
|
61
|
|
62 Three forms of similarity and distance calculation between two vectors,
|
|
63 specified using CalculationMode option, are supported: *AlgebraicForm,
|
|
64 BinaryForm or SetTheoreticForm*.
|
|
65
|
|
66 For *BinaryForm*, the ordered list of processed final vector values
|
|
67 containing the value or count of each unique value type is simply
|
|
68 converted into a binary vector containing 1s and 0s corresponding to
|
|
69 presence or absence of values before calculating similarity or distance
|
|
70 between two vectors.
|
|
71
|
|
72 For two fingerprint vectors A and B of the same size containing
|
|
73 OrderedNumericalValues, let:
|
|
74
|
|
75 N = Number of values in A or B
|
|
76
|
|
77 Xa = Values of vector A
|
|
78 Xb = Values of vector B
|
|
79
|
|
80 Xai = Value of ith element in A
|
|
81 Xbi = Value of ith element in B
|
|
82
|
|
83 SUM = Sum of i over N values
|
|
84
|
|
85 For SetTheoreticForm of calculation between two vectors, let:
|
|
86
|
|
87 SetIntersectionXaXb = SUM ( MIN ( Xai, Xbi ) )
|
|
88 SetDifferenceXaXb = SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ( Xai, Xbi ) )
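As a worked example of the two definitions above, the following sketch evaluates them for a pair of small vectors. The vector contents are assumed sample data. Because a + b - MIN ( a, b ) equals MAX ( a, b ), the quantity named SetDifferenceXaXb is also the sum of element-wise maxima, which the last line checks.

```perl
use strict;
use warnings;
use List::Util qw(sum min max);

# Assumed sample vectors of equal size
my @Xa = (4, 1, 2, 0);
my @Xb = (3, 0, 0, 1);
my @Indices = (0 .. $#Xa);

# SetIntersectionXaXb = SUM ( MIN ( Xai, Xbi ) )
my $SetIntersectionXaXb = sum(map { min($Xa[$_], $Xb[$_]) } @Indices);

# SetDifferenceXaXb = SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ( Xai, Xbi ) )
my $SetDifferenceXaXb = sum(@Xa) + sum(@Xb) - $SetIntersectionXaXb;

# Equivalent form: SUM ( MAX ( Xai, Xbi ) )
my $SumOfMax = sum(map { max($Xa[$_], $Xb[$_]) } @Indices);

print "SetIntersectionXaXb = $SetIntersectionXaXb\n";   # 3
print "SetDifferenceXaXb   = $SetDifferenceXaXb\n";     # 8
print "SUM of MAX          = $SumOfMax\n";              # 8
```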
|
|
89
|
|
90 For BinaryForm of calculation between two vectors, let:
|
|
91
|
|
92 Na = Number of bits set to "1" in A = SUM ( Xai )
|
|
93 Nb = Number of bits set to "1" in B = SUM ( Xbi )
|
|
94 Nc = Number of bits set to "1" in both A and B = SUM ( Xai * Xbi )
|
|
95 Nd = Number of bits set to "0" in both A and B
|
|
96 = SUM ( 1 - Xai - Xbi + Xai * Xbi)
|
|
97
|
|
98 N = Number of bits set to "1" or "0" in A or B = Size of A or B = Na + Nb - Nc + Nd
|
|
99
|
|
100 Additionally, for BinaryForm various values also correspond to:
|
|
101
|
|
102 Na = | Xa |
|
|
103 Nb = | Xb |
|
|
104 Nc = | SetIntersectionXaXb |
|
|
105 Nd = N - | SetDifferenceXaXb |
|
|
106
|
|
107 | SetDifferenceXaXb | = N - Nd = Na + Nb - Nc + Nd - Nd = Na + Nb - Nc
|
|
108 = | Xa | + | Xb | - | SetIntersectionXaXb |
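The BinaryForm quantities can be checked numerically with a short sketch. The vectors below are assumed sample data; they are first reduced to presence/absence bits, then Na, Nb, Nc and Nd are accumulated exactly as defined above, and the two identities for N and | SetDifferenceXaXb | are verified.

```perl
use strict;
use warnings;

# Assumed sample vectors of equal size
my @Xa = (4, 1, 2, 0, 0);
my @Xb = (3, 0, 0, 1, 0);

# BinaryForm: 1 for presence of a value, 0 for absence
my @Ba = map { $_ ? 1 : 0 } @Xa;   # 1 1 1 0 0
my @Bb = map { $_ ? 1 : 0 } @Xb;   # 1 0 0 1 0

my ($Na, $Nb, $Nc, $Nd) = (0, 0, 0, 0);
for my $i (0 .. $#Ba) {
    $Na += $Ba[$i];                                       # bits set in A
    $Nb += $Bb[$i];                                       # bits set in B
    $Nc += $Ba[$i] * $Bb[$i];                             # bits set in both
    $Nd += 1 - $Ba[$i] - $Bb[$i] + $Ba[$i] * $Bb[$i];     # bits clear in both
}

my $N = scalar @Ba;
my $SetDifference = $N - $Nd;

print "Na=$Na Nb=$Nb Nc=$Nc Nd=$Nd N=$N\n";   # Na=3 Nb=2 Nc=1 Nd=1 N=5
print "N == Na + Nb - Nc + Nd\n" if $N == $Na + $Nb - $Nc + $Nd;
print "|SetDifferenceXaXb| = $SetDifference\n";   # 4
```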
|
|
109
|
|
110 Various similarity and distance coefficients [ Ref 40, Ref 62, Ref 64 ]
|
|
111 for a pair of vectors A and B in *AlgebraicForm, BinaryForm and
|
|
112 SetTheoreticForm* are defined as follows:
|
|
113
|
|
114 CityBlockDistance: ( same as HammingDistance and ManhattanDistance)
|
|
115
|
|
116 *AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) )
|
|
117
|
|
118 *BinaryForm*: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc
|
|
119
|
|
120 *SetTheoreticForm*: | SetDifferenceXaXb | - | SetIntersectionXaXb | =
|
|
121 SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )
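A minimal sketch of the three CityBlock forms, written in plain Perl rather than through the class methods. The sample vectors are assumptions. For non-negative values the AlgebraicForm and SetTheoreticForm results agree, because ABS ( a - b ) equals a + b - 2 * MIN ( a, b ).

```perl
use strict;
use warnings;
use List::Util qw(sum min);

# Assumed sample vectors of equal size
my @Xa = (4, 1, 2, 0);
my @Xb = (3, 0, 0, 1);
my @Indices = (0 .. $#Xa);

# AlgebraicForm: SUM ( ABS ( Xai - Xbi ) )
my $Algebraic = sum(map { abs($Xa[$_] - $Xb[$_]) } @Indices);

# BinaryForm: Na + Nb - 2 * Nc, counting non-zero values as set bits
my $Na = grep { $_ != 0 } @Xa;
my $Nb = grep { $_ != 0 } @Xb;
my $Nc = grep { $Xa[$_] != 0 && $Xb[$_] != 0 } @Indices;
my $Binary = $Na + $Nb - 2 * $Nc;

# SetTheoreticForm: SUM ( Xai ) + SUM ( Xbi ) - 2 * SUM ( MIN ( Xai, Xbi ) )
my $SumMin = sum(map { min($Xa[$_], $Xb[$_]) } @Indices);
my $SetTheoretic = sum(@Xa) + sum(@Xb) - 2 * $SumMin;

print "AlgebraicForm:    $Algebraic\n";      # 5
print "BinaryForm:       $Binary\n";         # 3
print "SetTheoreticForm: $SetTheoretic\n";   # 5
```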
|
|
122
|
|
123 CosineSimilarity: ( same as OchiaiSimilarityCoefficient)
|
|
124
|
|
125 *AlgebraicForm*: SUM ( Xai * Xbi ) / SQRT ( SUM ( Xai ** 2) * SUM ( Xbi
|
|
126 ** 2) )
|
|
127
|
|
128 *BinaryForm*: Nc / SQRT ( Na * Nb)
|
|
129
|
|
130 *SetTheoreticForm*: | SetIntersectionXaXb | / SQRT ( |Xa| * |Xb| ) = SUM
|
|
131 ( MIN ( Xai, Xbi ) ) / SQRT ( SUM ( Xai ) * SUM ( Xbi ) )
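The Cosine formulas can be evaluated by hand as follows. This is an illustrative sketch with assumed sample vectors; it does not guard against an all-zero vector, which would make the denominator zero.

```perl
use strict;
use warnings;
use List::Util qw(sum min);

# Assumed sample vectors of equal size
my @Xa = (4, 1, 2, 0);
my @Xb = (3, 0, 0, 1);
my @Indices = (0 .. $#Xa);

# AlgebraicForm: SUM ( Xai * Xbi ) / SQRT ( SUM ( Xai ** 2 ) * SUM ( Xbi ** 2 ) )
my $SumAB = sum(map { $Xa[$_] * $Xb[$_] } @Indices);
my $SumA2 = sum(map { $_ ** 2 } @Xa);
my $SumB2 = sum(map { $_ ** 2 } @Xb);
my $Algebraic = $SumAB / sqrt($SumA2 * $SumB2);

# BinaryForm: Nc / SQRT ( Na * Nb )
my $Na = grep { $_ != 0 } @Xa;
my $Nb = grep { $_ != 0 } @Xb;
my $Nc = grep { $Xa[$_] != 0 && $Xb[$_] != 0 } @Indices;
my $Binary = $Nc / sqrt($Na * $Nb);

# SetTheoreticForm: SUM ( MIN ( Xai, Xbi ) ) / SQRT ( SUM ( Xai ) * SUM ( Xbi ) )
my $SumMin = sum(map { min($Xa[$_], $Xb[$_]) } @Indices);
my $SetTheoretic = $SumMin / sqrt(sum(@Xa) * sum(@Xb));

printf "AlgebraicForm:    %.4f\n", $Algebraic;      # 0.8281
printf "BinaryForm:       %.4f\n", $Binary;         # 0.4082
printf "SetTheoreticForm: %.4f\n", $SetTheoretic;   # 0.5669
```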
|
|
132
|
|
133 CzekanowskiSimilarity: ( same as DiceSimilarity and SorensonSimilarity)
|
|
134
|
|
135 *AlgebraicForm*: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2) + SUM
|
|
136 ( Xbi **2 ) )
|
|
137
|
|
138 *BinaryForm*: 2 * Nc / ( Na + Nb )
|
|
139
|
|
140 *SetTheoreticForm*: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| ) = 2 *
|
|
141 ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )
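Since the Czekanowski, Dice and Sorenson coefficients share the same formulas, one worked example covers all three. The sketch below uses assumed sample vectors and plain Perl arithmetic instead of the class methods.

```perl
use strict;
use warnings;
use List::Util qw(sum min);

# Assumed sample vectors of equal size
my @Xa = (4, 1, 2, 0);
my @Xb = (3, 0, 0, 1);
my @Indices = (0 .. $#Xa);

# AlgebraicForm: 2 * SUM ( Xai * Xbi ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi ** 2 ) )
my $SumAB = sum(map { $Xa[$_] * $Xb[$_] } @Indices);
my $SumA2 = sum(map { $_ ** 2 } @Xa);
my $SumB2 = sum(map { $_ ** 2 } @Xb);
my $Algebraic = 2 * $SumAB / ($SumA2 + $SumB2);

# BinaryForm: 2 * Nc / ( Na + Nb )
my $Na = grep { $_ != 0 } @Xa;
my $Nb = grep { $_ != 0 } @Xb;
my $Nc = grep { $Xa[$_] != 0 && $Xb[$_] != 0 } @Indices;
my $Binary = 2 * $Nc / ($Na + $Nb);

# SetTheoreticForm: 2 * SUM ( MIN ( Xai, Xbi ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )
my $SumMin = sum(map { min($Xa[$_], $Xb[$_]) } @Indices);
my $SetTheoretic = 2 * $SumMin / (sum(@Xa) + sum(@Xb));

printf "AlgebraicForm:    %.4f\n", $Algebraic;      # 0.7742
printf "BinaryForm:       %.4f\n", $Binary;         # 0.4000
printf "SetTheoreticForm: %.4f\n", $SetTheoretic;   # 0.5455
```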
|
|
142
|
|
143 DiceSimilarity: ( same as CzekanowskiSimilarity and SorensonSimilarity)
|
|
144
|
|
145 *AlgebraicForm*: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2) + SUM
|
|
146 ( Xbi **2 ) )
|
|
147
|
|
148 *BinaryForm*: 2 * Nc / ( Na + Nb )
|
|
149
|
|
150 *SetTheoreticForm*: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| ) = 2 *
|
|
151 ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )
|
|
152
|
|
153 EuclideanDistance:
|
|
154
|
|
155 *AlgebraicForm*: SQRT ( SUM ( ( ( Xai - Xbi ) ** 2 ) ) )
|
|
156
|
|
157 *BinaryForm*: SQRT ( ( Na - Nc ) + ( Nb - Nc ) ) = SQRT ( Na + Nb - 2 *
|
|
158 Nc )
|
|
159
|
|
160 *SetTheoreticForm*: SQRT ( | SetDifferenceXaXb | - | SetIntersectionXaXb
|
|
161 | ) = SQRT ( SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) )
|
|
162 ) )
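A short numerical check of the Euclidean formulas, using assumed sample vectors. Note that the BinaryForm and SetTheoreticForm expressions are the square root of the corresponding CityBlock expressions.

```perl
use strict;
use warnings;
use List::Util qw(sum min);

# Assumed sample vectors of equal size
my @Xa = (4, 1, 2, 0);
my @Xb = (3, 0, 0, 1);
my @Indices = (0 .. $#Xa);

# AlgebraicForm: SQRT ( SUM ( ( Xai - Xbi ) ** 2 ) )
my $Algebraic = sqrt(sum(map { ($Xa[$_] - $Xb[$_]) ** 2 } @Indices));

# BinaryForm: SQRT ( Na + Nb - 2 * Nc )
my $Na = grep { $_ != 0 } @Xa;
my $Nb = grep { $_ != 0 } @Xb;
my $Nc = grep { $Xa[$_] != 0 && $Xb[$_] != 0 } @Indices;
my $Binary = sqrt($Na + $Nb - 2 * $Nc);

# SetTheoreticForm: SQRT ( SUM ( Xai ) + SUM ( Xbi ) - 2 * SUM ( MIN ( Xai, Xbi ) ) )
my $SumMin = sum(map { min($Xa[$_], $Xb[$_]) } @Indices);
my $SetTheoretic = sqrt(sum(@Xa) + sum(@Xb) - 2 * $SumMin);

printf "AlgebraicForm:    %.4f\n", $Algebraic;      # 2.6458
printf "BinaryForm:       %.4f\n", $Binary;         # 1.7321
printf "SetTheoreticForm: %.4f\n", $SetTheoretic;   # 2.2361
```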
|
|
163
|
|
164 HammingDistance: ( same as CityBlockDistance and ManhattanDistance)
|
|
165
|
|
166 *AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) )
|
|
167
|
|
168 *BinaryForm*: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc
|
|
169
|
|
170 *SetTheoreticForm*: | SetDifferenceXaXb | - | SetIntersectionXaXb | =
|
|
171 SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )
|
|
172
|
|
173 JaccardSimilarity: ( same as TanimotoSimilarity)
|
|
174
|
|
175 *AlgebraicForm*: SUM ( Xai * Xbi ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi ** 2
|
|
176 ) - SUM ( Xai * Xbi ) )
|
|
177
|
|
178 *BinaryForm*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc / ( Na + Nb -
|
|
179 Nc )
|
|
180
|
|
181 *SetTheoreticForm*: | SetIntersectionXaXb | / | SetDifferenceXaXb | =
|
|
182 SUM ( MIN ( Xai, Xbi ) ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN (
|
|
183 Xai, Xbi ) ) )
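The Jaccard formulas, which are shared with the Tanimoto coefficient, can be worked through as follows. The vectors are assumed sample data and the code is a plain Perl sketch rather than a call into the class.

```perl
use strict;
use warnings;
use List::Util qw(sum min);

# Assumed sample vectors of equal size
my @Xa = (4, 1, 2, 0);
my @Xb = (3, 0, 0, 1);
my @Indices = (0 .. $#Xa);

# AlgebraicForm: SUM ( Xai * Xbi ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi ** 2 ) - SUM ( Xai * Xbi ) )
my $SumAB = sum(map { $Xa[$_] * $Xb[$_] } @Indices);
my $SumA2 = sum(map { $_ ** 2 } @Xa);
my $SumB2 = sum(map { $_ ** 2 } @Xb);
my $Algebraic = $SumAB / ($SumA2 + $SumB2 - $SumAB);

# BinaryForm: Nc / ( Na + Nb - Nc )
my $Na = grep { $_ != 0 } @Xa;
my $Nb = grep { $_ != 0 } @Xb;
my $Nc = grep { $Xa[$_] != 0 && $Xb[$_] != 0 } @Indices;
my $Binary = $Nc / ($Na + $Nb - $Nc);

# SetTheoreticForm: SUM ( MIN ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ) )
my $SumMin = sum(map { min($Xa[$_], $Xb[$_]) } @Indices);
my $SetTheoretic = $SumMin / (sum(@Xa) + sum(@Xb) - $SumMin);

printf "AlgebraicForm:    %.4f\n", $Algebraic;      # 0.6316
printf "BinaryForm:       %.4f\n", $Binary;         # 0.2500
printf "SetTheoreticForm: %.4f\n", $SetTheoretic;   # 0.3750
```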
|
|
184
|
|
185 ManhattanDistance: ( same as CityBlockDistance and HammingDistance)
|
|
186
|
|
187 *AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) )
|
|
188
|
|
189 *BinaryForm*: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc
|
|
190
|
|
191 *SetTheoreticForm*: | SetDifferenceXaXb | - | SetIntersectionXaXb | =
|
|
192 SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )
|
|
193
|
|
194 OchiaiSimilarity: ( same as CosineSimilarity)
|
|
195
|
|
196 *AlgebraicForm*: SUM ( Xai * Xbi ) / SQRT ( SUM ( Xai ** 2) * SUM ( Xbi
|
|
197 ** 2) )
|
|
198
|
|
199 *BinaryForm*: Nc / SQRT ( Na * Nb)
|
|
200
|
|
201 *SetTheoreticForm*: | SetIntersectionXaXb | / SQRT ( |Xa| * |Xb| ) = SUM
|
|
202 ( MIN ( Xai, Xbi ) ) / SQRT ( SUM ( Xai ) * SUM ( Xbi ) )
|
|
203
|
|
204 SorensonSimilarity: ( same as CzekanowskiSimilarity and DiceSimilarity)
|
|
205
|
|
206 *AlgebraicForm*: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2) + SUM
|
|
207 ( Xbi **2 ) )
|
|
208
|
|
209 *BinaryForm*: 2 * Nc / ( Na + Nb )
|
|
210
|
|
211 *SetTheoreticForm*: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| ) = 2 *
|
|
212 ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )
|
|
213
|
|
214 SoergelDistance:
|
|
215
|
|
216 *AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) ) / SUM ( MAX ( Xai, Xbi ) )
|
|
217
|
|
218 *BinaryForm*: 1 - Nc / ( Na + Nb - Nc ) = ( Na + Nb - 2 * Nc ) / ( Na +
|
|
219 Nb - Nc )
|
|
220
|
|
221 *SetTheoreticForm*: ( | SetDifferenceXaXb | - | SetIntersectionXaXb | )
|
|
222 / | SetDifferenceXaXb | = ( SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN
|
|
223 ( Xai, Xbi ) ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ( Xai, Xbi
|
|
224 ) ) )
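The Soergel formulas can be verified with a small example. The vectors are assumed sample data. In BinaryForm and SetTheoreticForm the result equals one minus the corresponding Jaccard/Tanimoto value, which follows directly from the expressions above.

```perl
use strict;
use warnings;
use List::Util qw(sum min max);

# Assumed sample vectors of equal size
my @Xa = (4, 1, 2, 0);
my @Xb = (3, 0, 0, 1);
my @Indices = (0 .. $#Xa);

# AlgebraicForm: SUM ( ABS ( Xai - Xbi ) ) / SUM ( MAX ( Xai, Xbi ) )
my $SumAbsDiff = sum(map { abs($Xa[$_] - $Xb[$_]) } @Indices);
my $SumMax     = sum(map { max($Xa[$_], $Xb[$_]) } @Indices);
my $Algebraic  = $SumAbsDiff / $SumMax;

# BinaryForm: ( Na + Nb - 2 * Nc ) / ( Na + Nb - Nc )
my $Na = grep { $_ != 0 } @Xa;
my $Nb = grep { $_ != 0 } @Xb;
my $Nc = grep { $Xa[$_] != 0 && $Xb[$_] != 0 } @Indices;
my $Binary = ($Na + $Nb - 2 * $Nc) / ($Na + $Nb - $Nc);

# SetTheoreticForm:
# ( SUM ( Xai ) + SUM ( Xbi ) - 2 * SUM ( MIN ) ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ) )
my $SumMin = sum(map { min($Xa[$_], $Xb[$_]) } @Indices);
my $Total  = sum(@Xa) + sum(@Xb);
my $SetTheoretic = ($Total - 2 * $SumMin) / ($Total - $SumMin);

printf "AlgebraicForm:    %.4f\n", $Algebraic;      # 0.6250
printf "BinaryForm:       %.4f\n", $Binary;         # 0.7500
printf "SetTheoreticForm: %.4f\n", $SetTheoretic;   # 0.6250
```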
|
|
225
|
|
226 TanimotoSimilarity: ( same as JaccardSimilarity)
|
|
227
|
|
228 *AlgebraicForm*: SUM ( Xai * Xbi ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi ** 2
|
|
229 ) - SUM ( Xai * Xbi ) )
|
|
230
|
|
231 *BinaryForm*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc / ( Na + Nb -
|
|
232 Nc )
|
|
233
|
|
234 *SetTheoreticForm*: | SetIntersectionXaXb | / | SetDifferenceXaXb | =
|
|
235 SUM ( MIN ( Xai, Xbi ) ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN (
|
|
236 Xai, Xbi ) ) )
|
|
237
|
|
238 METHODS
|
|
239 new
|
|
240 $FPVector = new Fingerprints::FingerprintsVector(%NamesAndValues);
|
|
241
|
|
242 Using specified *FingerprintsVector* property names and values hash,
|
|
243 new method creates a new object and returns a reference to newly
|
|
244 created FingerprintsVector object. By default, the following
|
|
245 properties are initialized:
|
|
246
|
|
247 Type = ''
|
|
248 @{Values} = ()
|
|
249 @{ValueIDs} = ()
|
|
250
|
|
251 Examples:
|
|
252
|
|
253 $FPVector = new Fingerprints::FingerprintsVector('Type' => 'OrderedNumericalValues',
|
|
254 'Values' => [1, 2, 3, 4]);
|
|
255 $FPVector = new Fingerprints::FingerprintsVector('Type' => 'NumericalValues',
|
|
256 'Values' => [10, 22, 33, 44],
|
|
257 'ValueIDs' => ['ID1', 'ID2', 'ID3', 'ID4']);
|
|
258 $FPVector = new Fingerprints::FingerprintsVector('Type' => 'AlphaNumericalValues',
|
|
259 'Values' => ['a1', 2, 'a3', 4]);
|
|
260
|
|
261 AddValueIDs
|
|
262 $FingerprintsVector->AddValueIDs($ValueIDsRef);
|
|
263 $FingerprintsVector->AddValueIDs(@ValueIDs);
|
|
264
|
|
265 Adds specified *ValueIDs* to *FingerprintsVector* and returns
|
|
266 *FingerprintsVector*.
|
|
267
|
|
268 AddValues
|
|
269 $FingerprintsVector->AddValues($ValuesRef);
|
|
270 $FingerprintsVector->AddValues(@Values);
|
|
271 $FingerprintsVector->AddValues($Vector);
|
|
272
|
|
273 Adds specified *Values* to *FingerprintsVector* and returns
|
|
274 *FingerprintsVector*.
|
|
275
|
|
276 CityBlockDistanceCoefficient
|
|
277 $Value = $FingerprintsVector->CityBlockDistanceCoefficient(
|
|
278 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
279 $Value = Fingerprints::FingerprintsVector::CityBlockDistanceCoefficient(
|
|
280 $FingerprintsVectorA, $FingerprintVectorB,
|
|
281 [$CalculationMode, $SkipValuesCheck]);
|
|
282
|
|
283 Returns value of *CityBlock* distance coefficient between two
|
|
284 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
285 and optional checking of vector values.
|
|
286
|
|
287 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
288 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
289 Default *SkipValuesCheck* value: *0*.
|
|
290
|
|
291 CosineSimilarityCoefficient
|
|
292 $Value = $FingerprintsVector->CosineSimilarityCoefficient(
|
|
293 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
294 $Value = Fingerprints::FingerprintsVector::CosineSimilarityCoefficient(
|
|
295 $FingerprintsVectorA, $FingerprintVectorB,
|
|
296 [$CalculationMode, $SkipValuesCheck]);
|
|
297
|
|
298 Returns value of *Cosine* similarity coefficient between two
|
|
299 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
300 and optional checking of vector values.
|
|
301
|
|
302 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
303 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
304 Default *SkipValuesCheck* value: *0*.
|
|
305
|
|
306 CzekanowskiSimilarityCoefficient
|
|
307 $Value = $FingerprintsVector->CzekanowskiSimilarityCoefficient(
|
|
308 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
309 $Value = Fingerprints::FingerprintsVector::CzekanowskiSimilarityCoefficient(
|
|
310 $FingerprintsVectorA, $FingerprintVectorB,
|
|
311 [$CalculationMode, $SkipValuesCheck]);
|
|
312
|
|
313 Returns value of *Czekanowski* similarity coefficient between two
|
|
314 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
315 and optional checking of vector values.
|
|
316
|
|
317 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
318 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
319 Default *SkipValuesCheck* value: *0*.
|
|
320
|
|
321 DiceSimilarityCoefficient
|
|
322 $Value = $FingerprintsVector->DiceSimilarityCoefficient(
|
|
323 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
324 $Value = Fingerprints::FingerprintsVector::DiceSimilarityCoefficient(
|
|
325 $FingerprintsVectorA, $FingerprintVectorB,
|
|
326 [$CalculationMode, $SkipValuesCheck]);
|
|
327
|
|
328 Returns value of *Dice* similarity coefficient between two
|
|
329 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
330 and optional checking of vector values.
|
|
331
|
|
332 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
333 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
334 Default *SkipValuesCheck* value: *0*.
|
|
335
|
|
336 EuclideanDistanceCoefficient
|
|
337 $Value = $FingerprintsVector->EuclideanDistanceCoefficient(
|
|
338 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
339 $Value = Fingerprints::FingerprintsVector::EuclideanDistanceCoefficient(
|
|
340 $FingerprintsVectorA, $FingerprintVectorB,
|
|
341 [$CalculationMode, $SkipValuesCheck]);
|
|
342
|
|
343 Returns value of *Euclidean* distance coefficient between two
|
|
344 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
345 and optional checking of vector values.
|
|
346
|
|
347 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
348 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
349 Default *SkipValuesCheck* value: *0*.
|
|
350
|
|
351 GetDescription
|
|
352 $Description = $FingerprintsVector->GetDescription();
|
|
353
|
|
354 Returns a string containing description of fingerprints vector.
|
|
355
|
|
356 GetFingerprintsVectorString
|
|
357 $FPString = $FingerprintsVector->GetFingerprintsVectorString($Format);
|
|
358
|
|
359 Returns a FingerprintsString containing vector values and/or IDs in
|
|
360 *FingerprintsVector* corresponding to specified *Format*.
|
|
361
|
|
362 Possible *Format* values: *IDsAndValuesString, IDsAndValues,
|
|
363 IDsAndValuesPairsString, IDsAndValuesPairs, ValuesAndIDsString,
|
|
364 ValuesAndIDs, ValuesAndIDsPairsString, ValuesAndIDsPairs,
|
|
365 ValueIDsString, ValueIDs, ValuesString, or Values*.
|
|
366
|
|
367 GetID
|
|
368 $ID = $FingerprintsVector->GetID();
|
|
369
|
|
370 Returns *ID* of *FingerprintsVector*.
|
|
371
|
|
372 GetVectorType
|
|
373 $VectorType = $FingerprintsVector->GetVectorType();
|
|
374
|
|
375 Returns *VectorType* of *FingerprintsVector*.
|
|
376
|
|
377 GetIDsAndValuesPairsString
|
|
378 $IDsValuesPairsString = $FingerprintsVector->GetIDsAndValuesPairsString();
|
|
379
|
|
380 Returns *FingerprintsVector* value IDs and values as space delimited
|
|
381 ID/value pair string.
|
|
382
|
|
383 GetIDsAndValuesString
|
|
384 $IDsValuesString = $FingerprintsVector->GetIDsAndValuesString();
|
|
385
|
|
386 Returns *FingerprintsVector* value IDs and values as string
|
|
387 containing space delimited IDs followed by values with semicolon as
|
|
388 IDs and values delimiter.
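The two string layouts described above can be illustrated without the class. This sketch assumes the layouts are exactly as the descriptions read: alternating ID and value tokens for the pairs string, and a space delimited ID list, a semicolon, then a space delimited value list for the other. The sample IDs and values are assumptions; the strings actually produced by the module may carry additional detail.

```perl
use strict;
use warnings;

# Assumed sample data
my @ValueIDs = ('ID1', 'ID2', 'ID3');
my @Values   = (10, 22, 33);

# Space delimited ID/value pairs
my $IDsAndValuesPairsString =
    join(' ', map { ($ValueIDs[$_], $Values[$_]) } 0 .. $#Values);

# Space delimited IDs, a semicolon, then space delimited values
my $IDsAndValuesString = join(' ', @ValueIDs) . ';' . join(' ', @Values);

print "$IDsAndValuesPairsString\n";   # ID1 10 ID2 22 ID3 33
print "$IDsAndValuesString\n";        # ID1 ID2 ID3;10 22 33

# Parsing the semicolon form back into two lists
my ($IDsPart, $ValuesPart) = split /;/, $IDsAndValuesString;
my @ParsedIDs    = split ' ', $IDsPart;
my @ParsedValues = split ' ', $ValuesPart;
```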
|
|
389
|
|
390 GetNumOfNonZeroValues
|
|
391 $NumOfNonZeroValues = $FingerprintsVector->GetNumOfNonZeroValues();
|
|
392
|
|
393 Returns number of non-zero values in *FingerprintsVector*.
|
|
394
|
|
395 GetNumOfValueIDs
|
|
396 $NumOfValueIDs = $FingerprintsVector->GetNumOfValueIDs();
|
|
397
|
|
398 Returns number of value IDs in *FingerprintsVector*.
|
|
399
|
|
400 GetNumOfValues
|
|
401 $NumOfValues = $FingerprintsVector->GetNumOfValues();
|
|
402
|
|
403 Returns number of values in *FingerprintsVector*.
|
|
404
|
|
405 GetSupportedDistanceAndSimilarityCoefficients
|
|
406 @SupportedDistanceAndSimilarityCoefficientsReturn =
|
|
407 Fingerprints::FingerprintsVector::GetSupportedDistanceAndSimilarityCoefficients();
|
|
408
|
|
409 Returns an array containing names of supported distance and
|
|
410 similarity coefficients.
|
|
411
|
|
412 GetSupportedDistanceCoefficients
|
|
413 @SupportedDistanceCoefficientsReturn =
|
|
414 Fingerprints::FingerprintsVector::GetSupportedDistanceCoefficients();
|
|
415
|
|
416 Returns an array containing names of supported distance
|
|
417 coefficients.
|
|
418
|
|
419 GetSupportedSimilarityCoefficients
|
|
420 @SupportedSimilarityCoefficientsReturn =
|
|
421 Fingerprints::FingerprintsVector::GetSupportedSimilarityCoefficients();
|
|
422
|
|
423 Returns an array containing names of supported similarity
|
|
424 coefficients.
|
|
425
|
|
426 GetType
|
|
427 $VectorType = $FingerprintsVector->GetType();
|
|
428
|
|
429 Returns *FingerprintsVector* vector type.
|
|
430
|
|
431 GetValue
|
|
432 $Value = $FingerprintsVector->GetValue($Index);
|
|
433
|
|
434 Returns fingerprints vector Value specified using *Index* starting
|
|
435 at 0.
|
|
436
|
|
437 GetValueID
|
|
438 $ValueID = $FingerprintsVector->GetValueID($Index);
|
|
439
|
|
440 Returns fingerprints vector ValueID specified using *Index* starting
|
|
441 at 0.
|
|
442
|
|
443 GetValueIDs
|
|
444 $ValueIDsRef = $FingerprintsVector->GetValueIDs();
|
|
445 @ValueIDs = $FingerprintsVector->GetValueIDs();
|
|
446
|
|
447 Returns fingerprints vector ValueIDs as an array or reference to an
|
|
448 array.
|
|
449
|
|
450 GetValueIDsString
|
|
451 $ValueIDsString = $FingerprintsVector->GetValueIDsString();
|
|
452
|
|
453 Returns fingerprints vector ValueIDsString with value IDs delimited
|
|
454 by space.
|
|
455
|
|
456 GetValues
|
|
457 $ValuesRef = $FingerprintsVector->GetValues();
|
|
458 @Values = $FingerprintsVector->GetValues();
|
|
459
|
|
460 Returns fingerprints vector Values as an array or reference to an
|
|
461 array.
|
|
462
|
|
463 GetValuesAndIDsPairsString
|
|
464 $ValuesIDsPairsString = $FingerprintsVector->GetValuesAndIDsPairsString();
|
|
465
|
|
466 Returns *FingerprintsVector* values and value IDs as space delimited
|
|
467 value/ID pair string.
|
|
468
|
|
469 GetValuesAndIDsString
|
|
470 $ValuesIDsString = $FingerprintsVector->GetValuesAndIDsString();
|
|
471
|
|
472 Returns *FingerprintsVector* values and value IDs as string
|
|
473 containing space delimited values followed by IDs with semicolon as
|
|
474 values and IDs delimiter.
|
|
475
|
|
476 GetValuesString
|
|
477 $Return = $FingerprintsVector->GetValuesString();
|
|
478
|
|
479 Returns *FingerprintsVector* values as space delimited string.
|
|
480
|
|
481 HammingDistanceCoefficient
|
|
482 $Value = $FingerprintsVector->HammingDistanceCoefficient(
|
|
483 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
484 $Value = Fingerprints::FingerprintsVector::HammingDistanceCoefficient(
|
|
485 $FingerprintsVectorA, $FingerprintVectorB,
|
|
486 [$CalculationMode, $SkipValuesCheck]);
|
|
487
|
|
488 Returns value of *Hamming* distance coefficient between two
|
|
489 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
490 and optional checking of vector values.
|
|
491
|
|
492 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
493 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
494 Default *SkipValuesCheck* value: *0*.
|
|
495
|
|
496 IsFingerprintsVector
|
|
497 $Status = Fingerprints::FingerprintsVector::IsFingerprintsVector($Object);
|
|
498
|
|
499 Returns 1 or 0 based on whether *Object* is a *FingerprintsVector*.
|
|
500
|
|
501 JaccardSimilarityCoefficient
|
|
502 $Value = $FingerprintsVector->JaccardSimilarityCoefficient(
|
|
503 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
504 $Value = Fingerprints::FingerprintsVector::JaccardSimilarityCoefficient(
|
|
505 $FingerprintsVectorA, $FingerprintVectorB,
|
|
506 [$CalculationMode, $SkipValuesCheck]);
|
|
507
|
|
508 Returns value of *Jaccard* similarity coefficient between two
|
|
509 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
510 and optional checking of vector values.
|
|
511
|
|
512 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
513 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
514 Default *SkipValuesCheck* value: *0*.
|
|
515
|
|
516 ManhattanDistanceCoefficient
|
|
517 $Value = $FingerprintsVector->ManhattanDistanceCoefficient(
|
|
518 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
519 $Value = Fingerprints::FingerprintsVector::ManhattanDistanceCoefficient(
|
|
520 $FingerprintsVectorA, $FingerprintVectorB,
|
|
521 [$CalculationMode, $SkipValuesCheck]);
|
|
522
|
|
523 Returns value of *Manhattan* distance coefficient between two
|
|
524 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
525 and optional checking of vector values.
|
|
526
|
|
527 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
528 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
529 Default *SkipValuesCheck* value: *0*.
|
|
530
|
|
531 NewFromIDsAndValuesPairsString
|
|
532 $FingerprintsVector = $FingerprintsVector->NewFromIDsAndValuesPairsString(
|
|
533 $ValuesType, $IDsAndValuesPairsString);
|
|
534 $FingerprintsVector = Fingerprints::FingerprintsVector::NewFromIDsAndValuesPairsString(
|
|
535 $ValuesType, $IDsAndValuesPairsString);
|
|
536
|
|
537 Creates a new *FingerprintsVector* of *ValuesType* using
|
|
538 *IDsAndValuesPairsString* containing space delimited value IDs and
|
|
539 values pairs and returns new FingerprintsVector object. Possible
|
|
540 *ValuesType* values: *OrderedNumericalValues, NumericalValues, or
|
|
541 AlphaNumericalValues*.
|
|
542
|
|
543 NewFromIDsAndValuesString
|
|
544 $FingerprintsVector = $FingerprintsVector->NewFromIDsAndValuesString(
|
|
545 $ValuesType, $IDsAndValuesString);
|
|
546 $FingerprintsVector = Fingerprints::FingerprintsVector::NewFromIDsAndValuesString(
|
|
547 $ValuesType, $IDsAndValuesString);
|
|
548
|
|
549 Creates a new *FingerprintsVector* of *ValuesType* using
|
|
550 *IDsAndValuesString* containing semicolon delimited value IDs string
|
|
551 followed by values strings and returns new FingerprintsVector
|
|
552 object. The values within the values and value IDs strings are delimited by
|
|
553 spaces. Possible *ValuesType* values: *OrderedNumericalValues,
|
|
554 NumericalValues, or AlphaNumericalValues*.
|
|
555
|
|
556 NewFromValuesAndIDsPairsString
|
|
557 $FingerprintsVector = $FingerprintsVector->NewFromValuesAndIDsPairsString(
|
|
558 $ValuesType, $ValuesAndIDsPairsString);
|
|
559 $FingerprintsVector = Fingerprints::FingerprintsVector::NewFromValuesAndIDsPairsString(
|
|
560 $ValuesType, $ValuesAndIDsPairsString);
|
|
561
|
|
562 Creates a new *FingerprintsVector* of *ValuesType* using
|
|
563 *ValuesAndIDsPairsString* containing space delimited value and value
|
|
564 IDs pairs and returns new FingerprintsVector object. Possible
|
|
565 *ValuesType* values: *OrderedNumericalValues, NumericalValues, or
|
|
566 AlphaNumericalValues*.
|
|
567
|
|
568 NewFromValuesAndIDsString
|
|
569 $FingerprintsVector = $FingerprintsVector->NewFromValuesAndIDsString(
|
|
570 $ValuesType, $ValuesAndIDsString);
|
|
571 $FingerprintsVector = Fingerprints::FingerprintsVector::NewFromValuesAndIDsString(
|
|
572 $ValuesType, $ValuesAndIDsString);
|
|
573
|
|
574 Creates a new *FingerprintsVector* of *ValuesType* using
|
|
575 *ValuesAndIDsString* containing semicolon delimited values string
|
|
576 followed by value IDs strings and returns new FingerprintsVector
|
|
577 object. The values within the values and value IDs strings are delimited
|
|
578 by spaces. Possible *ValuesType* values: *OrderedNumericalValues,
|
|
579 NumericalValues, or AlphaNumericalValues*.
|
|
580
|
|
581 NewFromValuesString
|
|
582 $FingerprintsVector = $FingerprintsVector->NewFromValuesString(
|
|
583 $ValuesType, $ValuesString);
|
|
584 $FingerprintsVector = Fingerprints::FingerprintsVector::NewFromValuesString(
|
|
585 $ValuesType, $ValuesString);
|
|
586
|
|
587 Creates a new *FingerprintsVector* of *ValuesType* using
|
|
588 *ValuesString* containing space delimited values string and returns
|
|
589 new FingerprintsVector object. The values within the values string
|
|
590 are delimited by spaces. Possible *ValuesType* values:
|
|
591 *OrderedNumericalValues, NumericalValues, or AlphaNumericalValues*.
|
|
592
|
|
593 OchiaiSimilarityCoefficient
|
|
594 $Value = $FingerprintsVector->OchiaiSimilarityCoefficient(
|
|
595 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
596 $Value = Fingerprints::FingerprintsVector::OchiaiSimilarityCoefficient(
|
|
597 $FingerprintsVectorA, $FingerprintVectorB,
|
|
598 [$CalculationMode, $SkipValuesCheck]);
|
|
599
|
|
600 Returns value of *Ochiai* similarity coefficient between two
|
|
601 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
602 and optional checking of vector values.
|
|
603
|
|
604 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
605 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
606 Default *SkipValuesCheck* value: *0*.
|
|
607
|
|
608 SetDescription
|
|
609 $FingerprintsVector->SetDescription($Description);
|
|
610
|
|
611 Sets *Description* of fingerprints vector and returns
|
|
612 *FingerprintsVector*.
|
|
613
|
|
614 SetID
|
|
615 $FingerprintsVector->SetID($ID);
|
|
616
|
|
617 Sets *ID* of fingerprints vector and returns *FingerprintsVector*.
|
|
618
|
|
619 SetVectorType
|
|
620 $FingerprintsVector->SetVectorType($VectorType);
|
|
621
|
|
622 Sets *VectorType* of fingerprints vector and returns
|
|
623 *FingerprintsVector*.
|
|
624
|
|
625 SetType
|
|
626 $FingerprintsVector->SetType($Type);
|
|
627
|
|
628 Sets *FingerprintsVector* values *Type* and returns
|
|
629 *FingerprintsVector*. Possible *Type* values:
|
|
630 *OrderedNumericalValues, NumericalValues, or AlphaNumericalValues*.
|
|
631
|
|
632 During calculation of similarity and distance coefficients between
|
|
633 two *FingerprintsVectors*, the following conditions apply to vector
|
|
634 type, size, value and value IDs:
|
|
635
|
|
636 o For OrderedNumericalValues type, both vectors must be of the same size
|
|
637 and contain similar types of numerical values in the same order.
|
|
638
|
|
639 o For NumericalValues type, vector value IDs for both vectors must be
|
|
640 specified; however, their size and order of IDs and numerical values may
|
|
641 be different. For each vector, value IDs must correspond to vector values.
|
|
642
|
|
643 o For AlphaNumericalValues type, vectors may contain both numerical and
|
|
644 alphanumerical values and their sizes may be different.
|
|
645
|
|
646 SetValue
|
|
647 $FingerprintsVector->SetValue($Index, $Value, [$SkipIndexCheck]);
|
|
648
|
|
649 Sets a *FingerprintsVector* value specified by *Index* starting at 0
|
|
650 to *Value* along with optional index range check and returns
|
|
651 *FingerprintsVector*.
|
|
652
|
|
653 SetValueID
|
|
654 $FingerprintsVector->SetValueID($Index, $ValueID, [$SkipIndexCheck]);
|
|
655
|
|
656 Sets a *FingerprintsVector* value ID specified by *Index* starting
|
|
657 at 0 to *ValueID* along with optional index range check and returns
|
|
658 *FingerprintsVector*.
|
|
659
|
|
660 SetValueIDs
|
|
661 $FingerprintsVector->SetValueIDs($ValueIDsRef);
|
|
662 $FingerprintsVector->SetValueIDs(@ValueIDs);
|
|
663
|
|
664 Sets *FingerprintsVector* value IDs to specified *ValueIDs* and
|
|
665 returns *FingerprintsVector*.
|
|
666
|
|
667 SetValues
|
|
668 $FingerprintsVector->SetValues($ValuesRef);
|
|
669 $FingerprintsVector->SetValues(@Values);
|
|
670
|
|
671 Sets *FingerprintsVector* values to specified *Values* and returns
|
|
672 *FingerprintsVector*.
|
|
673
|
|
674 SoergelDistanceCoefficient
|
|
675 $Value = $FingerprintsVector->SoergelDistanceCoefficient(
|
|
676 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
677 $Value = Fingerprints::FingerprintsVector::SoergelDistanceCoefficient(
|
|
678 $FingerprintsVectorA, $FingerprintVectorB,
|
|
679 [$CalculationMode, $SkipValuesCheck]);
|
|
680
|
|
681 Returns value of *Soergel* distance coefficient between two
|
|
682 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
683 and optional checking of vector values.
|
|
684
|
|
685 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
686 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
687 Default *SkipValuesCheck* value: *0*.
|
|
688
|
|
689 SorensonSimilarityCoefficient
|
|
690 $Value = $FingerprintsVector->SorensonSimilarityCoefficient(
|
|
691 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
692 $Value = Fingerprints::FingerprintsVector::SorensonSimilarityCoefficient(
|
|
693 $FingerprintsVectorA, $FingerprintVectorB,
|
|
694 [$CalculationMode, $SkipValuesCheck]);
|
|
695
|
|
696 Returns value of *Sorenson* similarity coefficient between two
|
|
697 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
698 and optional checking of vector values.
|
|
699
|
|
700 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
701 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
702 Default *SkipValuesCheck* value: *0*.
|
|
703
|
|
704 TanimotoSimilarityCoefficient
|
|
705 $Value = $FingerprintsVector->TanimotoSimilarityCoefficient(
|
|
706 $OtherFingerprintVector, [$CalculationMode, $SkipValuesCheck]);
|
|
707 $Value = Fingerprints::FingerprintsVector::TanimotoSimilarityCoefficient(
|
|
708 $FingerprintsVectorA, $FingerprintVectorB,
|
|
709 [$CalculationMode, $SkipValuesCheck]);
|
|
710
|
|
711 Returns value of *Tanimoto* similarity coefficient between two
|
|
712 *FingerprintsVectors* using optionally specified *CalculationMode*
|
|
713 and optional checking of vector values.
|
|
714
|
|
715 Possible *CalculationMode* values: *AlgebraicForm, BinaryForm or
|
|
716 SetTheoreticForm*. Default *CalculationMode* value: *AlgebraicForm*.
|
|
717 Default *SkipValuesCheck* value: *0*.
|
|
718
|
|
719 StringifyFingerprintsVector
|
|
720 $String = $FingerprintsVector->StringifyFingerprintsVector();
|
|
721
|
|
722 Returns a string containing information about *FingerprintsVector*
|
|
723 object.
|
|
724
|
|
725 AUTHOR
|
|
726 Manish Sud <msud@san.rr.com>
|
|
727
|
|
728 SEE ALSO
|
|
729 BitVector.pm, FingerprintsStringUtil.pm, FingerprintsBitVector.pm,
|
|
730 Vector.pm
|
|
731
|
|
732 COPYRIGHT
|
|
733 Copyright (C) 2015 Manish Sud. All rights reserved.
|
|
734
|
|
735 This file is part of MayaChemTools.
|
|
736
|
|
737 MayaChemTools is free software; you can redistribute it and/or modify it
|
|
738 under the terms of the GNU Lesser General Public License as published by
|
|
739 the Free Software Foundation; either version 3 of the License, or (at
|
|
740 your option) any later version.
|
|
741
|