annotate docs/scripts/txt/SimilarityMatricesFingerprints.txt @ 0:4816e4a8ae95 draft default tip

Uploaded
author deepakjadmin
date Wed, 20 Jan 2016 09:23:18 -0500
NAME
SimilarityMatricesFingerprints.pl - Calculate similarity matrices using
fingerprints strings data in SD, FP and CSV/TSV text file(s)

SYNOPSIS
SimilarityMatricesFingerprints.pl SDFile(s) FPFile(s) TextFile(s)...

SimilarityMatricesFingerprints.pl [--alpha *number*] [--beta *number*]
[-b, --BitVectorComparisonMode *All | "TanimotoSimilarity,[
TverskySimilarity, ... ]"*] [-c, --ColMode *ColNum | ColLabel*]
[--CompoundIDCol *col number | col name*] [--CompoundIDPrefix *text*]
[--CompoundIDField *DataFieldName*] [--CompoundIDMode *DataField |
MolName | LabelPrefix | MolNameOrLabelPrefix*] [-d, --detail
*InfoLevel*] [-f, --fast] [--FingerprintsCol *col number | col name*]
[--FingerprintsField *FieldLabel*] [-h, --help] [--InDelim *comma |
semicolon*] [--InputDataMode *LoadInMemory | ScanFile*] [-m, --mode
*AutoDetect | FingerprintsBitVectorString | FingerprintsVectorString*]
[--OutDelim *comma | tab | semicolon*] [--OutMatrixFormat
*RowsAndColumns | IDPairsAndValue*] [--OutMatrixType *FullMatrix |
UpperTriangularMatrix | LowerTriangularMatrix*] [-o, --overwrite] [-p,
--precision *number*] [-q, --quote *Yes | No*] [-r, --root *RootName*]
[-v, --VectorComparisonMode *All | "TanimotoSimilarity, [
ManhattanDistance, ...]"*] [--VectorComparisonFormulism *All |
"AlgebraicForm, [BinaryForm, SetTheoreticForm]"*] [-w, --WorkingDir
dirname] SDFile(s) FPFile(s) TextFile(s)...

DESCRIPTION
Calculate similarity matrices using fingerprint bit-vector or vector
strings data in *SD, FP and CSV/TSV* text file(s) and generate CSV/TSV
text file(s) containing values for specified similarity and distance
coefficients.
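
As a rough illustration of what such a matrix contains, the sketch below computes pairwise Tanimoto coefficients for a few bit-vector fingerprints. It is not the script's implementation: the compound IDs and short hexadecimal strings are made up, the helper names `tanimoto` and `similarity_matrix` are hypothetical, and treating two all-zero vectors as identical is an assumption.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two bit vectors stored as Python ints."""
    common = bin(a & b).count("1")   # bits set in both vectors
    total = bin(a | b).count("1")    # bits set in either vector
    # Assumption: two empty vectors are treated as identical.
    return common / total if total else 1.0


def similarity_matrix(fingerprints):
    """Map every (row ID, column ID) pair to its Tanimoto coefficient."""
    bits = {cid: int(hexstr, 16) for cid, hexstr in fingerprints.items()}
    return {(i, j): tanimoto(bits[i], bits[j]) for i in bits for j in bits}


# Hypothetical 16-bit fingerprints, not taken from any real file.
fps = {"Cmpd1": "f0f0", "Cmpd2": "ff00", "Cmpd3": "0f0f"}
matrix = similarity_matrix(fps)

for (row_id, col_id), value in matrix.items():
    print(f"{row_id},{col_id},{value:.2f}")
```

Because the coefficient depends only on how many bits are shared, the order in which bits are packed into each hexadecimal digit does not affect the result, provided both strings use the same encoding.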

The scripts SimilarityMatrixSDFiles.pl and SimilarityMatrixTextFiles.pl
have been removed from the current release of MayaChemTools and their
functionality merged with this script.

The valid *SDFile* extensions are *.sdf* and *.sd*. All SD files in the
current directory can be specified either by **.sdf* or the current
directory name.

The valid *FPFile* extensions are *.fpf* and *.fp*. All FP files in the
current directory can be specified either by **.fpf* or the current
directory name.

The valid *TextFile* extensions are *.csv* and *.tsv* for
comma/semicolon and tab delimited text files respectively. All other
file names are ignored. All text files in the current directory can be
specified by **.csv*, **.tsv*, or the current directory name. The
--indelim option determines the format of *TextFile(s)*. Any file which
doesn't correspond to the format indicated by --indelim option is
ignored.
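
The extension rules above can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name `classify` and the sample file names are hypothetical, matching extensions case-insensitively is an assumption, and the --indelim check is not modeled.

```python
import os

# Extensions named in the text above, mapped to the kind of input file.
EXTENSION_KINDS = {
    ".sdf": "SD", ".sd": "SD",
    ".fpf": "FP", ".fp": "FP",
    ".csv": "Text", ".tsv": "Text",
}


def classify(file_names):
    """Return {file name: kind}, dropping names with other extensions."""
    kinds = {}
    for name in file_names:
        # Assumption: extension matching ignores case.
        ext = os.path.splitext(name)[1].lower()
        kind = EXTENSION_KINDS.get(ext)
        if kind is not None:
            kinds[name] = kind
    return kinds


# Hypothetical file names for illustration only.
result = classify(["Sample.sdf", "Sample.fpf", "Sample.csv", "Notes.txt"])
print(result)
```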

Example of *FP* file containing fingerprints bit-vector string data:

    #
    # Package = MayaChemTools 7.4
    # ReleaseDate = Oct 21, 2010
    #
    # TimeStamp = Mon Mar 7 15:14:01 2011
    #
    # FingerprintsStringType = FingerprintsBitVector
    #
    # Description = PathLengthBits:AtomicInvariantsAtomTypes:MinLength1:...
    # Size = 1024
    # BitStringFormat = HexadecimalString
    # BitsOrder = Ascending
    #
    Cmpd1 9c8460989ec8a49913991a6603130b0a19e8051c89184414953800cc21510...
    Cmpd2 000000249400840040100042011001001980410c000000001010088001120...
    ... ...
    ... ..
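
A minimal sketch of reading a file laid out like the example above, assuming header lines have the form `# Key = Value` and each data line is a compound ID followed by whitespace and the fingerprint data. The function name `parse_fp_text` and the shortened 16-bit sample are hypothetical, and this is not the parser MayaChemTools uses.

```python
def parse_fp_text(text):
    """Split FP-style text into a header dict and {compound ID: data}."""
    header, records = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header lines look like "# Key = Value"; a bare "#" is a spacer.
            key, sep, value = line[1:].partition("=")
            if sep:
                header[key.strip()] = value.strip()
            continue
        # Assumption: the ID is the first whitespace-delimited token.
        parts = line.split(None, 1)
        if len(parts) == 2:
            records[parts[0]] = parts[1]
    return header, records


# Hypothetical, shortened sample; real files carry longer bit strings.
sample = """#
# FingerprintsStringType = FingerprintsBitVector
#
# Size = 16
# BitStringFormat = HexadecimalString
# BitsOrder = Ascending
#
Cmpd1 f0f0
Cmpd2 ff00
"""

header, records = parse_fp_text(sample)
print(header["FingerprintsStringType"], records)
```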

Example of *FP* file containing fingerprints vector string data:

    #
    # Package = MayaChemTools 7.4
    # ReleaseDate = Oct 21, 2010
    #
    # TimeStamp = Mon Mar 7 15:14:01 2011
    #
    # FingerprintsStringType = FingerprintsVector
    #
    # Description = PathLengthBits:AtomicInvariantsAtomTypes:MinLength1:...
    # VectorStringFormat = IDsAndValuesString
    # VectorValuesType = NumericalValues
    #
    Cmpd1 338;C F N O C:C C:N C=O CC CF CN CO C:C:C C:C:N C:CC C:CF C:CN C:
    N:C C:NC CC:N CC=O CCC CCN CCO CNC NC=O O=CO C:C:C:C C:C:C:N C:C:CC...;
    33 1 2 5 21 2 2 12 1 3 3 20 2 10 2 2 1 2 2 2 8 2 5 1 1 1 19 2 8 2 2 2 2
    6 2 2 2 2 2 2 2 2 3 2 2 1 4 1 5 1 1 18 6 2 2 1 2 10 2 1 2 1 2 2 2 2 ...
    Cmpd2 103;C N O C=N C=O CC CN CO CC=O CCC CCN CCO CNC N=CN NC=O NCN O=C
    O C CC=O CCCC CCCN CCCO CCNC CNC=N CNC=O CNCN CCCC=O CCCCC CCCCN CC...;
    15 4 4 1 2 13 5 2 2 15 5 3 2 2 1 1 1 2 17 7 6 5 1 1 1 2 15 8 5 7 2 2 2 2
    1 2 1 1 3 15 7 6 8 3 4 4 3 2 2 1 2 3 14 2 4 7 4 4 4 4 1 1 1 2 1 1 1 ...
    ... ...
    ... ...
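
The vector data in the example above appears to consist of three semicolon-separated parts: a count, a space-separated list of IDs, and a space-separated list of values. The sketch below parses that layout into a dictionary. Reading the first field as the number of values is an assumption drawn from the examples, and the function name `parse_ids_and_values` and the three-element sample are hypothetical.

```python
def parse_ids_and_values(data):
    """Parse 'count;id id ...;value value ...' into {id: value}."""
    count_text, ids_text, values_text = data.split(";")
    ids = ids_text.split()
    values = [float(v) for v in values_text.split()]
    # Assumption: the leading field states how many ID/value pairs follow.
    if not (int(count_text) == len(ids) == len(values)):
        raise ValueError("count, IDs and values disagree")
    return dict(zip(ids, values))


# Hypothetical three-element vector, not taken from a real file.
vector = parse_ids_and_values("3;C N C=O;5 1 2")
print(vector)
```

The truncated strings shown in the example would fail the length check, since their IDs and values are elided with `...`.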

Example of *SD* file containing fingerprints bit-vector string data:

    ... ...
    ... ...
    $$$$
    ... ...
    ... ...
    ... ...
    41 44 0 0 0 0 0 0 0 0999 V2000
    -3.3652 1.4499 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
    ... ...
    2 3 1 0 0 0 0
    ... ...
    M END
    > <CmpdID>
    Cmpd1

    > <PathLengthFingerprints>
    FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLengt
    h1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a49913991a66
    03130b0a19e8051c89184414953800cc2151082844a201042800130860308e8204d4028
    00831048940e44281c00060449a5000ac80c894114e006321264401600846c050164462
    08190410805000304a10205b0100e04c0038ba0fad0209c0ca8b1200012268b61c0026a
    aa0660a11014a011d46

    $$$$
    ... ...
    ... ...

Example of CSV *Text* file containing fingerprints bit-vector string
data:

    "CompoundID","PathLengthFingerprints"
    "Cmpd1","FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes
    :MinLength1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a4
    9913991a6603130b0a19e8051c89184414953800cc2151082844a20104280013086030
    8e8204d402800831048940e44281c00060449a5000ac80c894114e006321264401..."
    ... ...
    ... ...
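
In the SD and CSV examples above, the whole fingerprint travels as one semicolon-delimited string whose parts line up with the FP header keys (type, Description, Size, BitStringFormat, BitsOrder) followed by the data. The sketch below reads one CSV row and splits that string. The field names in `FIELDS`, the function name `split_bit_vector_string`, and the shortened 16-bit sample row are hypothetical.

```python
import csv
import io

# Hypothetical names for the six semicolon-separated parts.
FIELDS = ("type", "description", "size",
          "bit_string_format", "bits_order", "data")


def split_bit_vector_string(fp_string):
    """Split a bit-vector fingerprints string into named parts."""
    parts = fp_string.split(";")
    if len(parts) != len(FIELDS) or parts[0] != "FingerprintsBitVector":
        raise ValueError("not a bit-vector fingerprints string")
    record = dict(zip(FIELDS, parts))
    record["size"] = int(record["size"])
    return record


# Shortened, made-up row in the same layout as the CSV example.
csv_text = (
    '"CompoundID","PathLengthFingerprints"\n'
    '"Cmpd1","FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes'
    ':MinLength1:MaxLength8;16;HexadecimalString;Ascending;f0f0"\n'
)

rows = list(csv.DictReader(io.StringIO(csv_text)))
record = split_bit_vector_string(rows[0]["PathLengthFingerprints"])
print(rows[0]["CompoundID"], record["size"], record["data"])
```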

The current release of MayaChemTools supports the following types of
fingerprint bit-vector and vector strings:

    FingerprintsVector;AtomNeighborhoods:AtomicInvariantsAtomTypes:MinRadi
    us0:MaxRadius2;41;AlphaNumericalValues;ValuesString;NR0-C.X1.BO1.H3-AT
    C1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-ATC1 NR0-C.X
    1.BO1.H3-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-A
    TC1 NR0-C.X2.BO2.H2-ATC1:NR1-C.X2.BO2.H2-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2
    -C.X2.BO2.H2-ATC1:NR2-N.X3.BO3-ATC1:NR2-O.X1.BO1.H1-ATC1 NR0-C.X2.B...

    FingerprintsVector;AtomTypesCount:AtomicInvariantsAtomTypes:ArbitraryS
    ize;10;NumericalValues;IDsAndValuesString;C.X1.BO1.H3 C.X2.BO2.H2 C.X2
    .BO3.H1 C.X3.BO3.H1 C.X3.BO4 F.X1.BO1 N.X2.BO2.H1 N.X3.BO3 O.X1.BO1.H1
    O.X1.BO2;2 4 14 3 10 1 1 1 3 2

    FingerprintsVector;AtomTypesCount:SLogPAtomTypes:ArbitrarySize;16;Nume
    ricalValues;IDsAndValuesString;C1 C10 C11 C14 C18 C20 C21 C22 C5 CS F
    N11 N4 O10 O2 O9;5 1 1 1 14 4 2 1 2 2 1 1 1 1 3 1

    FingerprintsVector;AtomTypesCount:SLogPAtomTypes:FixedSize;67;OrderedN
    umericalValues;IDsAndValuesString;C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C
    12 C13 C14 C15 C16 C17 C18 C19 C20 C21 C22 C23 C24 C25 C26 C27 CS N1 N
    2 N3 N4 N5 N6 N7 N8 N9 N10 N11 N12 N13 N14 NS O1 O2 O3 O4 O5 O6 O7 O8
    O9 O10 O11 O12 OS F Cl Br I Hal P S1 S2 S3 Me1 Me2;5 0 0 0 2 0 0 0 0 1
    1 0 0 1 0 0 0 14 0 4 2 1 0 0 0 0 0 2 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0...

    FingerprintsVector;EStateIndicies:ArbitrarySize;11;NumericalValues;IDs
    AndValuesString;SaaCH SaasC SaasN SdO SdssC SsCH3 SsF SsOH SssCH2 SssN
    H SsssCH;24.778 4.387 1.993 25.023 -1.435 3.975 14.006 29.759 -0.073 3
    .024 -2.270

    FingerprintsVector;EStateIndicies:FixedSize;87;OrderedNumericalValues;
    ValuesString;0 0 0 0 0 0 0 3.975 0 -0.073 0 0 24.778 -2.270 0 0 -1.435
    4.387 0 0 0 0 0 0 3.024 0 0 0 0 0 0 0 1.993 0 29.759 25.023 0 0 0 0 1
    4.006 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0

    FingerprintsVector;ExtendedConnectivity:AtomicInvariantsAtomTypes:Radi
    us2;60;AlphaNumericalValues;ValuesString;73555770 333564680 352413391
    666191900 1001270906 1371674323 1481469939 1977749791 2006158649 21414
    08799 49532520 64643108 79385615 96062769 273726379 564565671 85514103
    5 906706094 988546669 1018231313 1032696425 1197507444 1331250018 1338
    532734 1455473691 1607485225 1609687129 1631614296 1670251330 17303...

    FingerprintsVector;ExtendedConnectivityCount:AtomicInvariantsAtomTypes
    :Radius2;60;NumericalValues;IDsAndValuesString;73555770 333564680 3524
    13391 666191900 1001270906 1371674323 1481469939 1977749791 2006158649
    2141408799 49532520 64643108 79385615 96062769 273726379 564565671...;
    3 2 1 1 14 1 2 10 4 3 1 1 1 1 2 1 2 1 1 1 2 3 1 1 2 1 3 3 8 2 2 2 6 2
    1 2 1 1 2 1 1 1 2 1 1 2 1 2 1 1 1 1 1 1 1 1 1 2 1 1

    FingerprintsBitVector;ExtendedConnectivityBits:AtomicInvariantsAtomTyp
    es:Radius2;1024;BinaryString;Ascending;0000000000000000000000000000100
    0000000001010000000110000011000000000000100000000000000000000000100001
    1000000110000000000000000000000000010011000000000000000000000000010000
    0000000000000000000000000010000000000000000001000000000000000000000000
    0000000000010000100001000000000000101000000000000000100000000000000...

    FingerprintsVector;ExtendedConnectivity:FunctionalClassAtomTypes:Radiu
    s2;57;AlphaNumericalValues;ValuesString;24769214 508787397 850393286 8
    62102353 981185303 1231636850 1649386610 1941540674 263599683 32920567
    1 571109041 639579325 683993318 723853089 810600886 885767127 90326012
    7 958841485 981022393 1126908698 1152248391 1317567065 1421489994 1455
    632544 1557272891 1826413669 1983319256 2015750777 2029559552 20404...

    FingerprintsVector;ExtendedConnectivity:EStateAtomTypes:Radius2;62;Alp
    haNumericalValues;ValuesString;25189973 528584866 662581668 671034184
    926543080 1347067490 1738510057 1759600920 2034425745 2097234755 21450
    44754 96779665 180364292 341712110 345278822 386540408 387387308 50430
    1706 617094135 771528807 957666640 997798220 1158349170 1291258082 134
    1138533 1395329837 1420277211 1479584608 1486476397 1487556246 1566...

    FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;00000000
    0000000000000000000000000000000001001000010010000000010010000000011100
    0100101010111100011011000100110110000011011110100110111111111111011111
    11111111111110111000

    FingerprintsBitVector;MACCSKeyBits;322;BinaryString;Ascending;11101011
    1110011111100101111111000111101100110000000000000011100010000000000000
    0000000000000000000000000000000000000000000000101000000000000000000000
    0000000000000000000000000000000000000000000000000000000000000000000000
    0000000000000000000000000000000000000011000000000000000000000000000000
    0000000000000000000000000000000000000000

    FingerprintsVector;MACCSKeyCount;166;OrderedNumericalValues;ValuesStri
    ng;0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 1 0 0 3 0 0 0 0 4 0 0 2 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0
    0 0 0 0 1 1 8 0 0 0 1 0 0 1 0 1 0 1 0 3 1 3 1 0 0 0 1 2 0 11 1 0 0 0
    5 0 0 1 2 0 1 1 0 0 0 0 0 1 1 0 1 1 1 1 0 4 0 0 1 1 0 4 6 1 1 1 2 1 1
    3 5 2 2 0 5 3 5 1 1 2 5 1 2 1 2 4 8 3 5 5 2 2 0 3 5 4 1

    FingerprintsVector;MACCSKeyCount;322;OrderedNumericalValues;ValuesStri
    ng;14 8 2 0 2 0 4 4 2 1 4 0 0 2 5 10 5 2 1 0 0 2 0 5 13 3 28 5 5 3 0 0
    0 4 2 1 1 0 1 1 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 22 5 3 0 0 0 1 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11 0 2 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...

    FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng
    th1:MaxLength8;1024;BinaryString;Ascending;001000010011010101011000110
    0100010101011000101001011100110001000010001001101000001001001001001000
    0010110100000111001001000001001010100100100000000011000000101001011100
    0010000001000101010100000100111100110111011011011000000010110111001101
    0101100011000000010001000011000010100011101100001000001000100000000...

    FingerprintsVector;PathLengthCount:AtomicInvariantsAtomTypes:MinLength
    1:MaxLength8;432;NumericalValues;IDsAndValuesPairsString;C.X1.BO1.H3 2
    C.X2.BO2.H2 4 C.X2.BO3.H1 14 C.X3.BO3.H1 3 C.X3.BO4 10 F.X1.BO1 1 N.X
    2.BO2.H1 1 N.X3.BO3 1 O.X1.BO1.H1 3 O.X1.BO2 2 C.X1.BO1.H3C.X3.BO3.H1
    2 C.X2.BO2.H2C.X2.BO2.H2 1 C.X2.BO2.H2C.X3.BO3.H1 4 C.X2.BO2.H2C.X3.BO
    4 1 C.X2.BO2.H2N.X3.BO3 1 C.X2.BO3.H1:C.X2.BO3.H1 10 C.X2.BO3.H1:C....

    FingerprintsVector;PathLengthCount:MMFF94AtomTypes:MinLength1:MaxLengt
    h8;463;NumericalValues;IDsAndValuesPairsString;C5A 2 C5B 2 C=ON 1 CB 1
    8 COO 1 CR 9 F 1 N5 1 NC=O 1 O=CN 1 O=CO 1 OC=O 1 OR 2 C5A:C5B 2 C5A:N
    5 2 C5ACB 1 C5ACR 1 C5B:C5B 1 C5BC=ON 1 C5BCB 1 C=ON=O=CN 1 C=ONNC=O 1
    CB:CB 18 CBF 1 CBNC=O 1 COO=O=CO 1 COOCR 1 COOOC=O 1 CRCR 7 CRN5 1 CR
    OR 2 C5A:C5B:C5B 2 C5A:C5BC=ON 1 C5A:C5BCB 1 C5A:N5:C5A 1 C5A:N5CR ...

    FingerprintsVector;TopologicalAtomPairs:AtomicInvariantsAtomTypes:MinD
    istance1:MaxDistance10;223;NumericalValues;IDsAndValuesString;C.X1.BO1
    .H3-D1-C.X3.BO3.H1 C.X2.BO2.H2-D1-C.X2.BO2.H2 C.X2.BO2.H2-D1-C.X3.BO3.
    H1 C.X2.BO2.H2-D1-C.X3.BO4 C.X2.BO2.H2-D1-N.X3.BO3 C.X2.BO3.H1-D1-...;
    2 1 4 1 1 10 8 1 2 6 1 2 2 1 2 1 2 2 1 2 1 5 1 10 12 2 2 1 2 1 9 1 3 1
    1 1 2 2 1 3 6 1 6 14 2 2 2 3 1 3 1 8 2 2 1 3 2 6 1 2 2 5 1 3 1 23 1...

    FingerprintsVector;TopologicalAtomPairs:FunctionalClassAtomTypes:MinDi
    stance1:MaxDistance10;144;NumericalValues;IDsAndValuesString;Ar-D1-Ar
    Ar-D1-Ar.HBA Ar-D1-HBD Ar-D1-Hal Ar-D1-None Ar.HBA-D1-None HBA-D1-NI H
    BA-D1-None HBA.HBD-D1-NI HBA.HBD-D1-None HBD-D1-None NI-D1-None No...;
    23 2 1 1 2 1 1 1 1 2 1 1 7 28 3 1 3 2 8 2 1 1 1 5 1 5 24 3 3 4 2 13 4
    1 1 4 1 5 22 4 4 3 1 19 1 1 1 1 1 2 2 3 1 1 8 25 4 5 2 3 1 26 1 4 1 ...

    FingerprintsVector;TopologicalAtomTorsions:AtomicInvariantsAtomTypes;3
    3;NumericalValues;IDsAndValuesString;C.X1.BO1.H3-C.X3.BO3.H1-C.X3.BO4-
    C.X3.BO4 C.X1.BO1.H3-C.X3.BO3.H1-C.X3.BO4-N.X3.BO3 C.X2.BO2.H2-C.X2.BO
    2.H2-C.X3.BO3.H1-C.X2.BO2.H2 C.X2.BO2.H2-C.X2.BO2.H2-C.X3.BO3.H1-O...;
    2 2 1 1 2 2 1 1 3 4 4 8 4 2 2 6 2 2 1 2 1 1 2 1 1 2 6 2 4 2 1 3 1

    FingerprintsVector;TopologicalAtomTorsions:EStateAtomTypes;36;Numerica
    lValues;IDsAndValuesString;aaCH-aaCH-aaCH-aaCH aaCH-aaCH-aaCH-aasC aaC
    H-aaCH-aasC-aaCH aaCH-aaCH-aasC-aasC aaCH-aaCH-aasC-sF aaCH-aaCH-aasC-
    ssNH aaCH-aasC-aasC-aasC aaCH-aasC-aasC-aasN aaCH-aasC-ssNH-dssC a...;
    4 4 8 4 2 2 6 2 2 2 4 3 2 1 3 3 2 2 2 1 2 1 1 1 2 1 1 1 1 1 1 1 2 1 1 2

    FingerprintsVector;TopologicalAtomTriplets:AtomicInvariantsAtomTypes:M
    inDistance1:MaxDistance10;3096;NumericalValues;IDsAndValuesString;C.X1
    .BO1.H3-D1-C.X1.BO1.H3-D1-C.X3.BO3.H1-D2 C.X1.BO1.H3-D1-C.X2.BO2.H2-D1
    0-C.X3.BO4-D9 C.X1.BO1.H3-D1-C.X2.BO2.H2-D3-N.X3.BO3-D4 C.X1.BO1.H3-D1
    -C.X2.BO2.H2-D4-C.X2.BO2.H2-D5 C.X1.BO1.H3-D1-C.X2.BO2.H2-D6-C.X3....;
    1 2 2 2 2 2 2 2 8 8 4 8 4 4 2 2 2 2 4 2 2 2 4 2 2 2 2 1 2 2 4 4 4 2 2
    2 4 4 4 8 4 4 2 4 4 4 2 4 4 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 8...

    FingerprintsVector;TopologicalAtomTriplets:SYBYLAtomTypes:MinDistance1
    :MaxDistance10;2332;NumericalValues;IDsAndValuesString;C.2-D1-C.2-D9-C
    .3-D10 C.2-D1-C.2-D9-C.ar-D10 C.2-D1-C.3-D1-C.3-D2 C.2-D1-C.3-D10-C.3-
    D9 C.2-D1-C.3-D2-C.3-D3 C.2-D1-C.3-D2-C.ar-D3 C.2-D1-C.3-D3-C.3-D4 C.2
    -D1-C.3-D3-N.ar-D4 C.2-D1-C.3-D3-O.3-D2 C.2-D1-C.3-D4-C.3-D5 C.2-D1-C.
    3-D5-C.3-D6 C.2-D1-C.3-D5-O.3-D4 C.2-D1-C.3-D6-C.3-D7 C.2-D1-C.3-D7...

    FingerprintsVector;TopologicalPharmacophoreAtomPairs:ArbitrarySize:Min
    Distance1:MaxDistance10;54;NumericalValues;IDsAndValuesString;H-D1-H H
    -D1-NI HBA-D1-NI HBD-D1-NI H-D2-H H-D2-HBA H-D2-HBD HBA-D2-HBA HBA-D2-
    HBD H-D3-H H-D3-HBA H-D3-HBD H-D3-NI HBA-D3-NI HBD-D3-NI H-D4-H H-D4-H
    BA H-D4-HBD HBA-D4-HBA HBA-D4-HBD HBD-D4-HBD H-D5-H H-D5-HBA H-D5-...;
    18 1 2 1 22 12 8 1 2 18 6 3 1 1 1 22 13 6 5 7 2 28 9 5 1 1 1 36 16 10
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
304 3 4 1 37 10 8 1 35 10 9 3 3 1 28 7 7 4 18 16 12 5 1 2 1
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
305
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
306 FingerprintsVector;TopologicalPharmacophoreAtomPairs:FixedSize:MinDist
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
307 ance1:MaxDistance10;150;OrderedNumericalValues;ValuesString;18 0 0 1 0
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
308 0 0 2 0 0 1 0 0 0 0 22 12 8 0 0 1 2 0 0 0 0 0 0 0 0 18 6 3 1 0 0 0 1
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
309 0 0 1 0 0 0 0 22 13 6 0 0 5 7 0 0 2 0 0 0 0 0 28 9 5 1 0 0 0 1 0 0 1 0
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
310 0 0 0 36 16 10 0 0 3 4 0 0 1 0 0 0 0 0 37 10 8 0 0 0 0 1 0 0 0 0 0 0
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
311 0 35 10 9 0 0 3 3 0 0 1 0 0 0 0 0 28 7 7 4 0 0 0 0 0 0 0 0 0 0 0 18...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
312
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
313 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:ArbitrarySize:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
314 MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesString;Ar1-
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
315 Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
316 -H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA1 H1-
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
317 HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 Ar1-...;
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
318 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
319 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
320 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
321
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
322 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
323 istance1:MaxDistance10;2692;OrderedNumericalValues;ValuesString;46 106
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
324 8 3 0 0 83 11 4 0 0 0 1 0 0 0 0 0 0 0 0 21 5 3 0 0 1 2 2 0 0 1 0 0 0
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
325 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100 101 18 11 0 0 145 132 26
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
326 14 0 0 23 28 3 3 0 0 5 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 45 10 4 0
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
327 0 16 20 7 5 1 0 3 4 5 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 5 ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
328
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
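    The fingerprints vector strings listed above appear to share one
    semicolon-delimited layout: a FingerprintsVector tag, a colon-delimited
    description, the number of values, the values type, the string format,
    and then either IDs followed by values (IDsAndValuesString) or values
    only (ValuesString). The following is a minimal Python sketch of reading
    such a string. The layout is inferred from the examples only, and
    parse_fingerprints_vector is a hypothetical helper, not part of
    MayaChemTools.

```python
def parse_fingerprints_vector(text):
    """Parse a FingerprintsVector string into a dict.

    The field layout is an assumption inferred from the documented examples.
    """
    fields = text.strip().split(";")
    if len(fields) < 6 or fields[0] != "FingerprintsVector":
        raise ValueError("not a FingerprintsVector string")

    description, size, values_type, string_format = fields[1:5]

    if string_format == "IDsAndValuesString":
        # IDs and values are two space-delimited lists in separate fields.
        if len(fields) != 7:
            raise ValueError("expected an IDs field and a values field")
        ids, raw_values = fields[5].split(), fields[6].split()
    elif string_format == "ValuesString":
        # Fixed-size vectors carry values only; position implies the ID.
        ids, raw_values = None, fields[5].split()
    else:
        raise ValueError("unsupported string format: " + string_format)

    return {
        "description": description.split(":"),
        "size": int(size),
        "values_type": values_type,
        "ids": ids,
        "values": [float(v) for v in raw_values],
    }


example = (
    "FingerprintsVector;TopologicalPharmacophoreAtomPairs:ArbitrarySize:"
    "MinDistance1:MaxDistance10;3;NumericalValues;IDsAndValuesString;"
    "H-D1-H H-D1-NI HBA-D1-NI;18 1 2"
)
parsed = parse_fingerprints_vector(example)
```

    The example string is shortened to three IDs for illustration. The
    strings printed in this document are truncated with "...", so they would
    not parse completely with this sketch.
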
OPTIONS
    --alpha *number*
        Value of the alpha parameter used to calculate the *Tversky*
        similarity coefficient specified for -b, --BitVectorComparisonMode
        option. For a pair of fingerprint bit-vectors A and B, it sets the
        weight assigned to bits set to "1" only in A; bits set to "1" only
        in B are weighted by 1 - alpha. Possible values: *0 to 1*. Default
        value: <0.5>.

    --beta *number*
        Value of the beta parameter used to calculate the *WeightedTanimoto*
        and *WeightedTversky* similarity coefficients specified for -b,
        --BitVectorComparisonMode option. It weights the contribution of
        bits set to "1" (beta) against the contribution of bits set to "0"
        (1 - beta) during the calculation of similarity coefficients.
        Possible values: *0 to 1*. The default value of <1> makes
        *WeightedTanimoto* and *WeightedTversky* equivalent to *Tanimoto*
        and *Tversky*.

    -b, --BitVectorComparisonMode *All |
    "TanimotoSimilarity,[TverskySimilarity,...]"*
        Specify which similarity coefficients to use for calculating
        similarity matrices for fingerprints bit-vector strings data values
        in *TextFile(s)*: calculate similarity matrices for all supported
        similarity coefficients or specify a comma delimited list of
        similarity coefficients. Possible values: *All |
        "TanimotoSimilarity,[TverskySimilarity,...]"*. Default:
        *TanimotoSimilarity*.

        *All* uses the complete list of supported similarity coefficients:
        *BaroniUrbaniSimilarity, BuserSimilarity, CosineSimilarity,
        DiceSimilarity, DennisSimilarity, ForbesSimilarity,
        FossumSimilarity, HamannSimilarity, JacardSimilarity,
        Kulczynski1Similarity, Kulczynski2Similarity, MatchingSimilarity,
        McConnaugheySimilarity, OchiaiSimilarity, PearsonSimilarity,
        RogersTanimotoSimilarity, RussellRaoSimilarity, SimpsonSimilarity,
        SkoalSneath1Similarity, SkoalSneath2Similarity,
        SkoalSneath3Similarity, TanimotoSimilarity, TverskySimilarity,
        YuleSimilarity, WeightedTanimotoSimilarity,
        WeightedTverskySimilarity*. These similarity coefficients are
        described below.

        For two fingerprint bit-vectors A and B of the same size, let:

            Na = Number of bits set to "1" in A
            Nb = Number of bits set to "1" in B
            Nc = Number of bits set to "1" in both A and B
            Nd = Number of bits set to "0" in both A and B

            Nt = Number of bits set to "1" or "0" in A or B (Size of A or B)
            Nt = Na + Nb - Nc + Nd

            Na - Nc = Number of bits set to "1" in A but not in B
            Nb - Nc = Number of bits set to "1" in B but not in A

        Then, various similarity coefficients [ Ref. 40 - 42 ] for a pair of
        bit-vectors A and B are defined as follows:

        *BaroniUrbaniSimilarity*: ( SQRT ( Nc * Nd ) + Nc ) / ( SQRT ( Nc *
        Nd ) + Nc + ( Na - Nc ) + ( Nb - Nc ) ) ( same as Buser )

        *BuserSimilarity*: ( SQRT ( Nc * Nd ) + Nc ) / ( SQRT ( Nc * Nd ) +
        Nc + ( Na - Nc ) + ( Nb - Nc ) ) ( same as BaroniUrbani )

        *CosineSimilarity*: Nc / SQRT ( Na * Nb ) ( same as Ochiai )

        *DiceSimilarity*: ( 2 * Nc ) / ( Na + Nb )

        *DennisSimilarity*: ( Nc * Nd - ( ( Na - Nc ) * ( Nb - Nc ) ) ) /
        SQRT ( Nt * Na * Nb )

        *ForbesSimilarity*: ( Nt * Nc ) / ( Na * Nb )

        *FossumSimilarity*: ( Nt * ( ( Nc - 1/2 ) ** 2 ) ) / ( Na * Nb )

        *HamannSimilarity*: ( ( Nc + Nd ) - ( Na - Nc ) - ( Nb - Nc ) ) / Nt

        *JaccardSimilarity*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc / (
        Na + Nb - Nc ) ( same as Tanimoto )

        *Kulczynski1Similarity*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) ) = Nc / (
        Na + Nb - 2 * Nc )

        *Kulczynski2Similarity*: ( ( Nc / 2 ) * ( 2 * Nc + ( Na - Nc ) + (
        Nb - Nc ) ) ) / ( ( Nc + ( Na - Nc ) ) * ( Nc + ( Nb - Nc ) ) ) = 0.5
        * ( Nc / Na + Nc / Nb )

        *MatchingSimilarity*: ( Nc + Nd ) / Nt

        *McConnaugheySimilarity*: ( Nc ** 2 - ( Na - Nc ) * ( Nb - Nc ) ) / (
        Na * Nb )

        *OchiaiSimilarity*: Nc / SQRT ( Na * Nb ) ( same as Cosine )

        *PearsonSimilarity*: ( ( Nc * Nd ) - ( ( Na - Nc ) * ( Nb - Nc ) ) ) /
        SQRT ( Na * Nb * ( Na - Nc + Nd ) * ( Nb - Nc + Nd ) )

        *RogersTanimotoSimilarity*: ( Nc + Nd ) / ( ( Na - Nc ) + ( Nb - Nc )
        + Nt ) = ( Nc + Nd ) / ( Na + Nb - 2 * Nc + Nt )

        *RussellRaoSimilarity*: Nc / Nt

        *SimpsonSimilarity*: Nc / MIN ( Na, Nb )

        *SkoalSneath1Similarity*: Nc / ( Nc + 2 * ( Na - Nc ) + 2 * ( Nb -
        Nc ) ) = Nc / ( 2 * Na + 2 * Nb - 3 * Nc )

        *SkoalSneath2Similarity*: ( 2 * Nc + 2 * Nd ) / ( Nc + Nd + Nt )

        *SkoalSneath3Similarity*: ( Nc + Nd ) / ( ( Na - Nc ) + ( Nb - Nc )
        ) = ( Nc + Nd ) / ( Na + Nb - 2 * Nc )

        *TanimotoSimilarity*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc /
        ( Na + Nb - Nc ) ( same as Jaccard )

        *TverskySimilarity*: Nc / ( alpha * ( Na - Nc ) + ( 1 - alpha ) * (
        Nb - Nc ) + Nc ) = Nc / ( alpha * ( Na - Nb ) + Nb )

        *YuleSimilarity*: ( ( Nc * Nd ) - ( ( Na - Nc ) * ( Nb - Nc ) ) ) /
        ( ( Nc * Nd ) + ( ( Na - Nc ) * ( Nb - Nc ) ) )

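        The definitions above can be checked on a small example. The
        following Python sketch counts Na, Nb, Nc, Nd and Nt for two
        bit-vectors written as strings of "0" and "1" and evaluates a few of
        the coefficients exactly as written above. It is an illustration
        only; the function names are hypothetical and do not reflect how the
        script itself is implemented.

```python
import math


def bit_counts(a, b):
    """Return (Na, Nb, Nc, Nd, Nt) for two equal-length '0'/'1' strings."""
    if len(a) != len(b):
        raise ValueError("bit-vectors must have the same size")
    na = a.count("1")
    nb = b.count("1")
    nc = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    nd = sum(1 for x, y in zip(a, b) if x == "0" and y == "0")
    return na, nb, nc, nd, len(a)


# Note: none of these guard against a zero denominator (e.g. empty vectors).

def tanimoto(a, b):
    na, nb, nc, _, _ = bit_counts(a, b)
    return nc / (na + nb - nc)


def dice(a, b):
    na, nb, nc, _, _ = bit_counts(a, b)
    return (2 * nc) / (na + nb)


def cosine(a, b):
    na, nb, nc, _, _ = bit_counts(a, b)
    return nc / math.sqrt(na * nb)


def tversky(a, b, alpha=0.5):
    na, nb, nc, _, _ = bit_counts(a, b)
    return nc / (alpha * (na - nb) + nb)


A = "11010010"
B = "10011010"
na, nb, nc, nd, nt = bit_counts(A, B)   # Na=4, Nb=4, Nc=3, Nd=3, Nt=8
```

        For these two vectors Nt = Na + Nb - Nc + Nd = 4 + 4 - 3 + 3 = 8,
        which matches the vector size, and Tanimoto evaluates to 3 / 5 = 0.6.
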
        Values of Tanimoto/Jaccard and Tversky coefficients depend only on
        bits which are set to "1" in A or B; bits set to "0" in both A and B
        are ignored. In order to take into account all bit positions,
        modified versions of Tanimoto [ Ref. 42 ] and Tversky [ Ref. 43 ]
        have been developed.

        Let:

            Na' = Number of bits set to "0" in A
            Nb' = Number of bits set to "0" in B
            Nc' = Number of bits set to "0" in both A and B

        Tanimoto': Nc' / ( ( Na' - Nc' ) + ( Nb' - Nc' ) + Nc' ) = Nc' / (
        Na' + Nb' - Nc' )

        Tversky': Nc' / ( alpha * ( Na' - Nc' ) + ( 1 - alpha ) * ( Nb' - Nc'
        ) + Nc' ) = Nc' / ( alpha * ( Na' - Nb' ) + Nb' )

        Then:

        *WeightedTanimotoSimilarity* = beta * Tanimoto + ( 1 - beta ) *
        Tanimoto'

        *WeightedTverskySimilarity* = beta * Tversky + ( 1 - beta ) * Tversky'

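        The effect of beta can be seen by evaluating the weighted forms
        directly. The Python sketch below applies the Tanimoto and Tversky
        formulas once to the "1" bits and once to the "0" bits, then mixes
        the two results with beta. The helper names are hypothetical, and
        skipping the "0"-bit term when beta is 1 is a choice made for this
        sketch, not documented behavior of the script.

```python
def _counts(a, b, bit):
    """Count positions holding `bit` in a, in b, and in both."""
    if len(a) != len(b):
        raise ValueError("bit-vectors must have the same size")
    na = a.count(bit)
    nb = b.count(bit)
    nc = sum(1 for x, y in zip(a, b) if x == bit and y == bit)
    return na, nb, nc


def _tanimoto(na, nb, nc):
    return nc / (na + nb - nc)


def _tversky(na, nb, nc, alpha):
    return nc / (alpha * (na - nb) + nb)


def weighted_tanimoto(a, b, beta=1.0):
    value = beta * _tanimoto(*_counts(a, b, "1"))
    if beta < 1.0:
        # Tanimoto' uses Na', Nb', Nc': the same formula on "0" bits.
        value += (1.0 - beta) * _tanimoto(*_counts(a, b, "0"))
    return value


def weighted_tversky(a, b, alpha=0.5, beta=1.0):
    value = beta * _tversky(*_counts(a, b, "1"), alpha)
    if beta < 1.0:
        # Tversky' uses Na', Nb', Nc': the same formula on "0" bits.
        value += (1.0 - beta) * _tversky(*_counts(a, b, "0"), alpha)
    return value


A = "11100000"
B = "11000000"
plain = weighted_tanimoto(A, B)            # beta = 1: plain Tanimoto, 2/3
mixed = weighted_tanimoto(A, B, beta=0.5)  # 0.5 * 2/3 + 0.5 * 5/6 = 0.75
```

        With beta at its default of 1 the weighted coefficients reduce to
        plain Tanimoto and Tversky, as stated for the --beta option.
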
    -c, --ColMode *ColNum | ColLabel*
        Specify how columns are identified in *TextFile(s)*: using column
        number or column label. Possible values: *ColNum or ColLabel*.
        Default value: *ColNum*.

    --CompoundIDCol *col number | col name*
        This value is -c, --ColMode specific. It specifies the input
        *TextFile(s)* column to use for generating compound IDs for
        similarity matrices in output *TextFile(s)*. Possible values: *col
        number or col label*. Default value: *first column containing the
        word compoundID in its column label or sequentially generated IDs*.

    --CompoundIDPrefix *text*
        Specify compound ID prefix to use during sequential generation of
        compound IDs for input *SDFile(s)* and *TextFile(s)*. Default value:
        *Cmpd*. The default value generates compound IDs which look like
        Cmpd<Number>.

        For input *SDFile(s)*, this value is only used during *LabelPrefix |
        MolNameOrLabelPrefix* values of --CompoundIDMode option; otherwise,
        it's ignored.

        Examples for *LabelPrefix* or *MolNameOrLabelPrefix* value of
        --CompoundIDMode:

            Compound

        The value specified above generates compound IDs which correspond
        to Compound<Number> instead of the default value of Cmpd<Number>.

    --CompoundIDField *DataFieldName*
        Specify input *SDFile(s)* datafield label for generating compound
        IDs. This value is only used during *DataField* value of
        --CompoundIDMode option.

        Examples for *DataField* value of --CompoundIDMode:

            MolID
            ExtReg

    --CompoundIDMode *DataField | MolName | LabelPrefix |
    MolNameOrLabelPrefix*
        Specify how to generate compound IDs from input *SDFile(s)* for
        similarity matrix CSV/TSV text file(s): use a *SDFile(s)* datafield
        value; use molname line from *SDFile(s)*; generate a sequential ID
        with specific prefix; use combination of both MolName and
        LabelPrefix with usage of LabelPrefix values for empty molname
        lines.

        Possible values: *DataField | MolName | LabelPrefix |
        MolNameOrLabelPrefix*. Default: *LabelPrefix*.

        For *MolNameOrLabelPrefix* value of --CompoundIDMode, molname line
        in *SDFile(s)* takes precedence over sequential compound IDs
        generated using *LabelPrefix* and only empty molname values are
        replaced with sequential compound IDs.

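        The four modes can be summarized as a small decision function. The
        Python sketch below is an illustration of the rules described above,
        not the script's code: compound_id and its parameters are
        hypothetical, and numbering compounds from 1 is an assumption.

```python
def compound_id(mode, index, mol_name="", data_fields=None,
                field_name=None, prefix="Cmpd"):
    """Return a compound ID for the record at position `index` (assumed 1-based)."""
    if mode == "DataField":
        # Value of the SD data field named by --CompoundIDField.
        return data_fields[field_name]
    if mode == "MolName":
        return mol_name
    if mode == "LabelPrefix":
        # Sequential ID such as Cmpd1, Cmpd2, ...
        return f"{prefix}{index}"
    if mode == "MolNameOrLabelPrefix":
        # The molname line wins; only empty names get a sequential ID.
        return mol_name if mol_name.strip() else f"{prefix}{index}"
    raise ValueError("unknown CompoundIDMode: " + mode)


ids = [
    compound_id("LabelPrefix", 1),
    compound_id("LabelPrefix", 2, prefix="Compound"),
    compound_id("MolNameOrLabelPrefix", 3, mol_name="Aspirin"),
    compound_id("MolNameOrLabelPrefix", 4, mol_name=""),
    compound_id("DataField", 5, data_fields={"MolID": "M-17"}, field_name="MolID"),
]
```
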
    -d, --detail *InfoLevel*
        Level of information to print about lines being ignored. Default:
        *1*. Possible values: *1, 2 or 3*.

    -f, --fast
        In this mode, fingerprints columns specified using --FingerprintsCol
        for *TextFile(s)* and --FingerprintsField for *SDFile(s)* are
        assumed to contain valid fingerprints data and no checking is
        performed before calculating similarity matrices. By default,
        fingerprints data is validated before computing pairwise similarity
        and distance coefficients.

    --FingerprintsCol *col number | col name*
        This value is -c, --ColMode specific. It specifies the fingerprints
        column to use during calculation of similarity matrices for
        *TextFile(s)*. Possible values: *col number or col label*. Default
        value: *first column containing the word Fingerprints in its column
        label*.

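        Column selection for --FingerprintsCol and --CompoundIDCol follows
        the same pattern: an explicit column number or label depending on
        -c, --ColMode, or a default found by searching the column labels.
        The Python sketch below illustrates that rule. resolve_column is a
        hypothetical helper; one-based column numbers and a case-insensitive
        label search are assumptions made for this sketch.

```python
def resolve_column(header, col_mode="ColNum", spec=None, keyword="Fingerprints"):
    """Return the zero-based index of the selected column, or None.

    header   : list of column labels from the first line of the text file
    col_mode : "ColNum" or "ColLabel", as for -c, --ColMode
    spec     : column number or label; None selects the default column
    keyword  : word searched for in the labels when spec is None
    """
    if spec is None:
        # Default: first column whose label contains the keyword.
        for position, label in enumerate(header):
            if keyword.lower() in label.lower():
                return position
        return None  # caller decides, e.g. fall back to sequential IDs
    if col_mode == "ColNum":
        return int(spec) - 1  # assumes column numbers start at 1
    if col_mode == "ColLabel":
        return header.index(spec)
    raise ValueError("unknown ColMode: " + col_mode)


header = ["CompoundID", "Name", "PathLengthFingerprints"]
fingerprints_col = resolve_column(header)
compound_id_col = resolve_column(header, keyword="compoundID")
```
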
    --FingerprintsField *FieldLabel*
        Fingerprints field label to use during calculation of similarity
        matrices for *SDFile(s)*. Default value: *first data field label
        containing the word Fingerprints in its label*.

    -h, --help
        Print this help message.

    --InDelim *comma | semicolon*
        Input delimiter for CSV *TextFile(s)*. Possible values: *comma or
        semicolon*. Default value: *comma*. For TSV files, this option is
        ignored and *tab* is used as a delimiter.

    --InputDataMode *LoadInMemory | ScanFile*
        Specify how fingerprints bit-vector or vector strings data from *SD,
        FP and CSV/TSV* fingerprint file(s) is processed: retrieve, process
        and load all available fingerprints data in memory; retrieve and
        process data for fingerprints one at a time. Possible values:
        *LoadInMemory | ScanFile*. Default: *LoadInMemory*.

        During *LoadInMemory* value of --InputDataMode, fingerprints
        bit-vector or vector strings data from input file is retrieved,
        processed, and loaded into memory all at once as fingerprints
        objects for generation of similarity matrices.

        During *ScanFile* value of --InputDataMode, multiple passes over the
        input fingerprints file are performed to retrieve and process
        fingerprints bit-vector or vector strings data one at a time to
        generate fingerprints objects used during generation of similarity
        matrices. A temporary copy of the input fingerprints file is made at
        the start and deleted after generating the matrices.

        *ScanFile* value of --InputDataMode allows processing of arbitrarily
        large fingerprints files without any additional memory requirement.

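        The trade-off between the two modes can be sketched as follows. In
        this Python illustration, the load-in-memory variant reads every
        record once and keeps them all, while the scan-file variant re-reads
        the file for each row of the matrix and holds only two records at a
        time. The two-column CSV layout, the function names and the use of
        Tanimoto are assumptions for the sketch; it also collects results in
        a dictionary for easy comparison, whereas a real low-memory run
        would write each value out as it is computed.

```python
import os
import tempfile


def read_records(path):
    """Yield (compound_id, bit_string) pairs, one line at a time."""
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line:
                compound, bits = line.split(",")
                yield compound, bits


def tanimoto(a, b):
    nc = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    return nc / (a.count("1") + b.count("1") - nc)


def matrix_load_in_memory(path):
    records = list(read_records(path))  # whole file held in memory
    return {(i, j): tanimoto(x, y) for i, x in records for j, y in records}


def matrix_scan_file(path):
    result = {}
    for i, x in read_records(path):      # outer pass over the file
        for j, y in read_records(path):  # one fresh pass per row
            result[(i, j)] = tanimoto(x, y)
    return result


def demo():
    rows = ["Cmpd1,1101", "Cmpd2,1001", "Cmpd3,0111"]
    with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as handle:
        handle.write("\n".join(rows) + "\n")
        path = handle.name
    try:
        return matrix_load_in_memory(path), matrix_scan_file(path)
    finally:
        os.remove(path)  # clean up the temporary file
```

        Both variants produce the same values; the scan-file variant trades
        repeated file reads for a small, constant memory footprint.
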
    -m, --mode *AutoDetect | FingerprintsBitVectorString |
    FingerprintsVectorString*
        Format of fingerprint strings data in *TextFile(s)*: automatically
        detect format of fingerprints string created by MayaChemTools
        fingerprints generation scripts or explicitly specify its format.
        Possible values: *AutoDetect | FingerprintsBitVectorString |
        FingerprintsVectorString*. Default value: *AutoDetect*.

    --OutDelim *comma | tab | semicolon*
        Delimiter for output CSV/TSV text file(s). Possible values: *comma,
        tab, or semicolon*. Default value: *comma*.

595 --OutMatrixFormat *RowsAndColumns | IDPairsAndValue*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
596 Specify how similarity or distance values calculated for
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
597 fingerprints vector and bit-vector strings are written to the output
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
598 CSV/TSV text file(s): Generate text files containing rows and
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
599 columns with their labels corresponding to compound IDs and each
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
600 matrix element value corresponding to similarity or distance between
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
601 corresponding compounds; Generate text files containing rows
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
602 containing compoundIDs for two compounds followed by similarity or
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
603 distance value between these compounds.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
604
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
605 Possible values: *RowsAndColumns, or IDPairsAndValue*. Default
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
606 value: *RowsAndColumns*.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
607
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
608 The value of --OutMatrixFormat in conjunction with --OutMatrixType
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
609 determines type of data written to output files and allows
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
610 generation of up to 6 different output data formats:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
611
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
612 OutMatrixFormat OutMatrixType
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
613
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
614 RowsAndColumns FullMatrix [ DEFAULT ]
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
615 RowsAndColumns UpperTriangularMatrix
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
616 RowsAndColumns LowerTriangularMatrix
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
617
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
618 IDPairsAndValue FullMatrix
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
619 IDPairsAndValue UpperTriangularMatrix
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
620 IDPairsAndValue LowerTriangularMatrix
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
621
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
622 Example of data in output file for *RowsAndColumns*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
623 --OutMatrixFormat value for *FullMatrix* value of --OutMatrixType:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
624
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
625 "","Cmpd1","Cmpd2","Cmpd3","Cmpd4","Cmpd5","Cmpd6",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
626 "Cmpd1","1","0.04","0.25","0.13","0.11","0.2",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
627 "Cmpd2","0.04","1","0.06","0.05","0.19","0.07",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
628 "Cmpd3","0.25","0.06","1","0.12","0.22","0.25",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
629 "Cmpd4","0.13","0.05","0.12","1","0.11","0.13",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
630 "Cmpd5","0.11","0.19","0.22","0.11","1","0.17",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
631 "Cmpd6","0.2","0.07","0.25","0.13","0.17","1",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
632 ... ... ..
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
633 ... ... ..
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
634 ... ... ..
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
635
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
636 Example of data in output file for *RowsAndColumns*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
637 --OutMatrixFormat value for *UpperTriangularMatrix* value of
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
638 --OutMatrixType:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
639
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
640 "","Cmpd1","Cmpd2","Cmpd3","Cmpd4","Cmpd5","Cmpd6",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
641 "Cmpd1","1","0.04","0.25","0.13","0.11","0.2",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
642 "Cmpd2","1","0.06","0.05","0.19","0.07",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
643 "Cmpd3","1","0.12","0.22","0.25",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
644 "Cmpd4","1","0.11","0.13",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
645 "Cmpd5","1","0.17",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
646 "Cmpd6","1",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
647 ... ... ..
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
648 ... ... ..
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
649 ... ... ..
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
650
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
651 Example of data in output file for *RowsAndColumns*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
652 --OutMatrixFormat value for *LowerTriangularMatrix* value of
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
653 --OutMatrixType:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
654
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
655 "","Cmpd1","Cmpd2","Cmpd3","Cmpd4","Cmpd5","Cmpd6",... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
656 "Cmpd1","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
657 "Cmpd2","0.04","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
658 "Cmpd3","0.25","0.06","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
659 "Cmpd4","0.13","0.05","0.12","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
660 "Cmpd5","0.11","0.19","0.22","0.11","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
661 "Cmpd6","0.2","0.07","0.25","0.13","0.17","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
662 ... ... ..
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
663 ... ... ..
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
664 ... ... ..
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
665
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
666 Example of data in output file for *IDPairsAndValue*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
667 --OutMatrixFormat value for *FullMatrix* value of --OutMatrixType:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
668
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
669 "CmpdID1","CmpdID2","Coefficient Value"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
670 "Cmpd1","Cmpd1","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
671 "Cmpd1","Cmpd2","0.04"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
672 "Cmpd1","Cmpd3","0.25"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
673 "Cmpd1","Cmpd4","0.13"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
674 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
675 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
676 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
677 "Cmpd2","Cmpd1","0.04"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
678 "Cmpd2","Cmpd2","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
679 "Cmpd2","Cmpd3","0.06"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
680 "Cmpd2","Cmpd4","0.05"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
681 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
682 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
683 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
684 "Cmpd3","Cmpd1","0.25"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
685 "Cmpd3","Cmpd2","0.06"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
686 "Cmpd3","Cmpd3","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
687 "Cmpd3","Cmpd4","0.12"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
688 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
689 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
690 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
691
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
692 Example of data in output file for *IDPairsAndValue*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
693 --OutMatrixFormat value for *UpperTriangularMatrix* value of
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
694 --OutMatrixType:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
695
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
696 "CmpdID1","CmpdID2","Coefficient Value"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
697 "Cmpd1","Cmpd1","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
698 "Cmpd1","Cmpd2","0.04"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
699 "Cmpd1","Cmpd3","0.25"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
700 "Cmpd1","Cmpd4","0.13"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
701 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
702 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
703 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
704 "Cmpd2","Cmpd2","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
705 "Cmpd2","Cmpd3","0.06"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
706 "Cmpd2","Cmpd4","0.05"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
707 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
708 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
709 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
710 "Cmpd3","Cmpd3","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
711 "Cmpd3","Cmpd4","0.12"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
712 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
713 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
714 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
715
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
716 Example of data in output file for *IDPairsAndValue*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
717 --OutMatrixFormat value for *LowerTriangularMatrix* value of
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
718 --OutMatrixType:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
719
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
720 "CmpdID1","CmpdID2","Coefficient Value"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
721 "Cmpd1","Cmpd1","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
722 "Cmpd2","Cmpd1","0.04"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
723 "Cmpd2","Cmpd2","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
724 "Cmpd3","Cmpd1","0.25"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
725 "Cmpd3","Cmpd2","0.06"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
726 "Cmpd3","Cmpd3","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
727 "Cmpd4","Cmpd1","0.13"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
728 "Cmpd4","Cmpd2","0.05"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
729 "Cmpd4","Cmpd3","0.12"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
730 "Cmpd4","Cmpd4","1"
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
731 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
732 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
733 ... ... ...
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
734
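The six combinations of --OutMatrixFormat and --OutMatrixType shown above can be sketched in a few lines of Python. This is an illustrative sketch only, not the script's implementation: the function name write_matrix and its parameters are hypothetical, and the input is assumed to be a precomputed symmetric matrix with quoting enabled.

```python
import csv
import io

def write_matrix(ids, sim, matrix_format="RowsAndColumns", matrix_type="FullMatrix"):
    """Return CSV text for a precomputed symmetric matrix `sim` (list of lists).

    Hypothetical helper: it mimics the layouts in the examples above.
    """
    def keep(i, j):
        # Decide which matrix elements belong to the requested matrix type.
        if matrix_type == "UpperTriangularMatrix":
            return j >= i
        if matrix_type == "LowerTriangularMatrix":
            return j <= i
        return True  # FullMatrix

    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    n = len(ids)
    if matrix_format == "RowsAndColumns":
        # Header row of compound IDs, then one row of values per compound.
        writer.writerow([""] + list(ids))
        for i in range(n):
            writer.writerow([ids[i]] + [str(sim[i][j]) for j in range(n) if keep(i, j)])
    else:  # IDPairsAndValue
        # One row per retained pair of compounds.
        writer.writerow(["CmpdID1", "CmpdID2", "Coefficient Value"])
        for i in range(n):
            for j in range(n):
                if keep(i, j):
                    writer.writerow([ids[i], ids[j], str(sim[i][j])])
    return out.getvalue()
```

For a triangular type, the RowsAndColumns rows simply omit the excluded elements, so each row has a different number of columns, as in the examples above.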
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
735 --OutMatrixType *FullMatrix | UpperTriangularMatrix |
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
736 LowerTriangularMatrix*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
737 Type of similarity or distance matrix to calculate for fingerprints
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
738 vector and bit-vector strings: Calculate full matrix; Calculate
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
739 lower triangular matrix including diagonal; Calculate upper
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
740 triangular matrix including diagonal.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
741
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
742 Possible values: *FullMatrix, UpperTriangularMatrix, or
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
743 LowerTriangularMatrix*. Default value: *FullMatrix*.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
744
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
745 The value of --OutMatrixType in conjunction with --OutMatrixFormat
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
746 determines type of data written to output files.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
747
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
748 -o, --overwrite
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
749 Overwrite existing files
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
750
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
751 -p, --precision *number*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
752 Precision of calculated values in the output file. Default: up to
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
753 *2* decimal places. Valid values: positive integers.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
754
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
755 -q, --quote *Yes | No*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
756 Put quotes around column values in output CSV/TSV text file(s).
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
757 Possible values: *Yes or No*. Default value: *Yes*.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
758
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
759 -r, --root *RootName*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
760 New file name is generated using the root:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
761 <Root><BitVectorComparisonMode>.<Ext> or
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
762 <Root><VectorComparisonMode><VectorComparisonFormulism>.<Ext>. The
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
763 csv, and tsv <Ext> values are used for comma/semicolon, and tab
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
764 delimited text files respectively. This option is ignored for
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
765 multiple input files.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
766
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
767 -v, --VectorComparisonMode *All |
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
768 "TanimotoSimilarity,[ManhattanDistance,...]"*
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
769 Specify what similarity or distance coefficients to use for
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
770 calculating similarity matrices for fingerprint vector strings data
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
771 values in *TextFile(s)*: calculate similarity matrices for all
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
772 supported similarity and distance coefficients or specify a comma
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
773 delimited list of similarity and distance coefficients. Possible
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
774 values: *All | "TanimotoSimilarity,[ManhattanDistance,...]"*. Default:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
775 *TanimotoSimilarity*.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
776
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
777 The value of -v, --VectorComparisonMode, in conjunction with
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
778 --VectorComparisonFormulism, decides which type of similarity and
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
779 distance coefficient formulism gets used.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
780
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
781 *All* uses complete list of supported similarity and distance
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
782 coefficients: *CosineSimilarity, CzekanowskiSimilarity,
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
783 DiceSimilarity, OchiaiSimilarity, JaccardSimilarity,
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
784 SorensonSimilarity, TanimotoSimilarity, CityBlockDistance,
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
785 EuclideanDistance, HammingDistance, ManhattanDistance,
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
786 SoergelDistance*. These similarity and distance coefficients are
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
787 described below.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
788
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
789 FingerprintsVector.pm module, used to calculate similarity and
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
790 distance coefficients, provides support to perform comparison
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
791 between vectors containing three different types of values:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
792
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
793 Type I: OrderedNumericalValues
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
794
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
795 . Size of the two vectors is the same
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
796 . Vectors contain real values in a specific order. For example: MACCS keys
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
797 count, Topological pharmacophore atom pairs and so on.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
798
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
799 Type II: UnorderedNumericalValues
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
800
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
801 . Size of the two vectors might not be the same
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
802 . Vectors contain unordered real values identified by value IDs. For example:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
803 Topological atom pairs, Topological atom torsions and so on
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
804
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
805 Type III: AlphaNumericalValues
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
806
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
807 . Size of the two vectors might not be the same
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
808 . Vectors contain unordered alphanumerical values. For example: Extended
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
809 connectivity fingerprints, atom neighborhood fingerprints.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
810
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
811 Before performing similarity or distance calculations between
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
812 vectors containing UnorderedNumericalValues or AlphaNumericalValues,
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
813 the vectors are transformed into vectors containing unique
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
814 OrderedNumericalValues using value IDs for UnorderedNumericalValues
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
815 and the values themselves for AlphaNumericalValues.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
816
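The transformation described above can be pictured as aligning both vectors on the union of their value IDs. The following Python sketch shows one way to do that. It is not the FingerprintsVector.pm implementation: the function names are hypothetical, and sorting the IDs, filling missing entries with 0, and counting repeated alphanumerical values are assumptions made for illustration.

```python
from collections import Counter

def align_by_value_ids(a, b):
    """Align two UnorderedNumericalValues vectors given as {value ID: value} dicts.

    Returns two equal-length lists ordered by the sorted union of value IDs;
    an ID missing from one vector contributes 0 (an assumption of this sketch).
    """
    ids = sorted(set(a) | set(b))
    return [a.get(k, 0) for k in ids], [b.get(k, 0) for k in ids]

def align_alphanumerical(a_values, b_values):
    """Align two AlphaNumericalValues vectors given as lists of strings.

    The values themselves act as IDs; each entry becomes the number of times
    the value occurs (assumed here for illustration).
    """
    return align_by_value_ids(Counter(a_values), Counter(b_values))
```

After alignment both vectors have the same size and order, so the formulas for OrderedNumericalValues apply directly.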
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
817 Three forms of similarity and distance calculation between two
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
818 vectors, specified using --VectorComparisonFormulism option, are
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
819 supported: *AlgebraicForm, BinaryForm or SetTheoreticForm*.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
820
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
821 For *BinaryForm*, the ordered list of processed final vector values
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
822 containing the value or count of each unique value type is simply
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
823 converted into a binary vector containing 1s and 0s corresponding to
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
824 presence or absence of values before calculating similarity or
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
825 distance between two vectors.
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
826
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
827 For two fingerprint vectors A and B of same size containing
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
828 OrderedNumericalValues, let:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
829
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
830 N = Number of values in A or B
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
831
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
832 Xa = Values of vector A
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
833 Xb = Values of vector B
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
834
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
835 Xai = Value of ith element in A
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
836 Xbi = Value of ith element in B
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
837
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
838 SUM = Sum of i over N values
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
839
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
840 For SetTheoreticForm of calculation between two vectors, let:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
841
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
842 SetIntersectionXaXb = SUM ( MIN ( Xai, Xbi ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
843 SetDifferenceXaXb = SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ( Xai, Xbi ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
844
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
845 For BinaryForm of calculation between two vectors, let:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
846
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
847 Na = Number of bits set to "1" in A = SUM ( Xai )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
848 Nb = Number of bits set to "1" in B = SUM ( Xbi )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
849 Nc = Number of bits set to "1" in both A and B = SUM ( Xai * Xbi )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
850 Nd = Number of bits set to "0" in both A and B
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
851 = SUM ( 1 - Xai - Xbi + Xai * Xbi)
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
852
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
853 N = Number of bits set to "1" or "0" in A or B = Size of A or B = Na + Nb - Nc + Nd
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
854
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
855 Additionally, for BinaryForm various values also correspond to:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
856
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
857 Na = | Xa |
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
858 Nb = | Xb |
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
859 Nc = | SetIntersectionXaXb |
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
860 Nd = N - | SetDifferenceXaXb |
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
861
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
862 | SetDifferenceXaXb | = N - Nd = Na + Nb - Nc + Nd - Nd = Na + Nb - Nc
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
863 = | Xa | + | Xb | - | SetIntersectionXaXb |
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
864
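The definitions above can be checked numerically. The Python sketch below computes Na, Nb, Nc and Nd after converting two count vectors to binary form, and the two set-theoretic quantities. The function names are hypothetical and serve only to illustrate the formulas.

```python
def binary_counts(xa, xb):
    """Return (Na, Nb, Nc, Nd) for two equal-length vectors.

    Each value is first converted to 1 (present) or 0 (absent), as described
    for BinaryForm.
    """
    a = [1 if v else 0 for v in xa]
    b = [1 if v else 0 for v in xb]
    na = sum(a)
    nb = sum(b)
    nc = sum(x * y for x, y in zip(a, b))
    nd = sum(1 - x - y + x * y for x, y in zip(a, b))
    return na, nb, nc, nd

def set_theoretic_terms(xa, xb):
    """Return (SetIntersectionXaXb, SetDifferenceXaXb) as defined above."""
    intersection = sum(min(x, y) for x, y in zip(xa, xb))
    difference = sum(xa) + sum(xb) - intersection
    return intersection, difference
```

For xa = [2, 0, 1, 0] and xb = [1, 3, 0, 0], binary_counts gives Na = 2, Nb = 2, Nc = 1 and Nd = 1, so Na + Nb - Nc + Nd = 4 = N.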
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
865 Various similarity and distance coefficients [ Ref 40, Ref 62, Ref
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
866 64 ] for a pair of vectors A and B in *AlgebraicForm, BinaryForm and
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
867 SetTheoreticForm* are defined as follows:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
868
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
869 CityBlockDistance: ( same as HammingDistance and ManhattanDistance)
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
870
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
871 *AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
872
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
873 *BinaryForm*: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
874
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
875 *SetTheoreticForm*: | SetDifferenceXaXb | - | SetIntersectionXaXb |
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
876 = SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
877
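As an illustration of the three forms above, the following Python sketch evaluates CityBlockDistance for two vectors of the same size. The function name and the form argument are hypothetical; the sketch mirrors the formulas, not the module's code.

```python
def city_block_distance(xa, xb, form="AlgebraicForm"):
    """City block (Hamming, Manhattan) distance between two equal-length vectors."""
    if len(xa) != len(xb):
        raise ValueError("vectors must have the same size")
    if form == "AlgebraicForm":
        return sum(abs(a - b) for a, b in zip(xa, xb))
    if form == "BinaryForm":
        # Convert values to presence/absence bits before counting.
        a_bits = [1 if v else 0 for v in xa]
        b_bits = [1 if v else 0 for v in xb]
        na, nb = sum(a_bits), sum(b_bits)
        nc = sum(a * b for a, b in zip(a_bits, b_bits))
        return na + nb - 2 * nc
    if form == "SetTheoreticForm":
        intersection = sum(min(a, b) for a, b in zip(xa, xb))
        return sum(xa) + sum(xb) - 2 * intersection
    raise ValueError("unknown form: " + form)
```

Because x + y - 2 * MIN ( x, y ) equals ABS ( x - y ), the algebraic and set-theoretic forms agree for this coefficient. Only BinaryForm, which discards the counts, can differ.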
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
878 CosineSimilarity: ( same as OchiaiSimilarity)
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
879
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
880 *AlgebraicForm*: SUM ( Xai * Xbi ) / SQRT ( SUM ( Xai ** 2) * SUM (
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
881 Xbi ** 2) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
882
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
883 *BinaryForm*: Nc / SQRT ( Na * Nb)
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
884
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
885 *SetTheoreticForm*: | SetIntersectionXaXb | / SQRT ( |Xa| * |Xb| ) =
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
886 SUM ( MIN ( Xai, Xbi ) ) / SQRT ( SUM ( Xai ) * SUM ( Xbi ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
887
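The CosineSimilarity formulas above can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes both vectors contain at least one non-zero value so that no denominator is zero.

```python
import math

def cosine_similarity(xa, xb, form="AlgebraicForm"):
    """Cosine (Ochiai) similarity between two equal-length, non-zero vectors."""
    if form == "AlgebraicForm":
        dot = sum(a * b for a, b in zip(xa, xb))
        return dot / math.sqrt(sum(a * a for a in xa) * sum(b * b for b in xb))
    if form == "BinaryForm":
        a_bits = [1 if v else 0 for v in xa]
        b_bits = [1 if v else 0 for v in xb]
        nc = sum(a * b for a, b in zip(a_bits, b_bits))
        return nc / math.sqrt(sum(a_bits) * sum(b_bits))
    if form == "SetTheoreticForm":
        intersection = sum(min(a, b) for a, b in zip(xa, xb))
        return intersection / math.sqrt(sum(xa) * sum(xb))
    raise ValueError("unknown form: " + form)
```

Unlike CityBlockDistance, the three forms generally give three different values for count vectors, so the choice of --VectorComparisonFormulism changes the resulting matrix.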
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
888 CzekanowskiSimilarity: ( same as DiceSimilarity and
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
889 SorensonSimilarity)
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
890
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
891 *AlgebraicForm*: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2) +
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
892 SUM ( Xbi **2 ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
893
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
894 *BinaryForm*: 2 * Nc / ( Na + Nb )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
895
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
896 *SetTheoreticForm*: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| ) =
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
897 2 * ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
898
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
899 DiceSimilarity: ( same as CzekanowskiSimilarity and
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
900 SorensonSimilarity)
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
901
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
902 *AlgebraicForm*: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2) +
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
903 SUM ( Xbi **2 ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
904
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
905 *BinaryForm*: 2 * Nc / ( Na + Nb )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
906
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
907 *SetTheoreticForm*: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| ) =
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
908 2 * ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
909
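A minimal Python sketch of the DiceSimilarity formulas above, which also cover CzekanowskiSimilarity and SorensonSimilarity since the three are defined identically. The function name is hypothetical and non-zero vectors are assumed.

```python
def dice_similarity(xa, xb, form="AlgebraicForm"):
    """Dice (Czekanowski, Sorenson) similarity between two equal-length, non-zero vectors."""
    if form == "AlgebraicForm":
        dot = sum(a * b for a, b in zip(xa, xb))
        return 2 * dot / (sum(a * a for a in xa) + sum(b * b for b in xb))
    if form == "BinaryForm":
        a_bits = [1 if v else 0 for v in xa]
        b_bits = [1 if v else 0 for v in xb]
        nc = sum(a * b for a, b in zip(a_bits, b_bits))
        return 2 * nc / (sum(a_bits) + sum(b_bits))
    if form == "SetTheoreticForm":
        intersection = sum(min(a, b) for a, b in zip(xa, xb))
        return 2 * intersection / (sum(xa) + sum(xb))
    raise ValueError("unknown form: " + form)
```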
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
910 EuclideanDistance:
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
911
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
912 *AlgebraicForm*: SQRT ( SUM ( ( ( Xai - Xbi ) ** 2 ) ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
913
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
914 *BinaryForm*: SQRT ( ( Na - Nc ) + ( Nb - Nc ) ) = SQRT ( Na + Nb -
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
915 2 * Nc )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
916
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
917 *SetTheoreticForm*: SQRT ( | SetDifferenceXaXb | - |
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
918 SetIntersectionXaXb | ) = SQRT ( SUM ( Xai ) + SUM ( Xbi ) - 2 * (
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
919 SUM ( MIN ( Xai, Xbi ) ) ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
920
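The EuclideanDistance formulas above translate directly into Python. The sketch below uses a hypothetical function name and is meant only to make the three expressions concrete.

```python
import math

def euclidean_distance(xa, xb, form="AlgebraicForm"):
    """Euclidean distance between two equal-length vectors in the three forms above."""
    if form == "AlgebraicForm":
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(xa, xb)))
    if form == "BinaryForm":
        a_bits = [1 if v else 0 for v in xa]
        b_bits = [1 if v else 0 for v in xb]
        nc = sum(a * b for a, b in zip(a_bits, b_bits))
        return math.sqrt(sum(a_bits) + sum(b_bits) - 2 * nc)
    if form == "SetTheoreticForm":
        intersection = sum(min(a, b) for a, b in zip(xa, xb))
        return math.sqrt(sum(xa) + sum(xb) - 2 * intersection)
    raise ValueError("unknown form: " + form)
```

In BinaryForm and SetTheoreticForm the value is the square root of the corresponding CityBlockDistance, as the formulas show.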
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
921 HammingDistance: ( same as CityBlockDistance and ManhattanDistance)
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
922
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
923 *AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
924
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
925 *BinaryForm*: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
926
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
927 *SetTheoreticForm*: | SetDifferenceXaXb | - | SetIntersectionXaXb |
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
928 = SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
929
4816e4a8ae95 Uploaded
deepakjadmin
parents:
diff changeset
JaccardSimilarity: ( same as TanimotoSimilarity )

*AlgebraicForm*: SUM ( Xai * Xbi ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi
** 2 ) - SUM ( Xai * Xbi ) )

*BinaryForm*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc / ( Na +
Nb - Nc )

*SetTheoreticForm*: | SetIntersectionXaXb | / | SetDifferenceXaXb |
= SUM ( MIN ( Xai, Xbi ) ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN
( Xai, Xbi ) ) )

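The JaccardSimilarity (TanimotoSimilarity) forms above, sketched in Python for illustration. The function names and sample vectors are hypothetical; the binary form assumes 0/1 vectors, and none of the functions guards against a zero denominator (two all-zero vectors).

```python
def tanimoto_algebraic(xa, xb):
    # SUM ( Xai * Xbi ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi ** 2 ) - SUM ( Xai * Xbi ) )
    dot = sum(a * b for a, b in zip(xa, xb))
    return dot / (sum(a * a for a in xa) + sum(b * b for b in xb) - dot)


def tanimoto_binary(bits_a, bits_b):
    # Nc / ( Na + Nb - Nc ); assumes every value is 0 or 1.
    na = sum(bits_a)
    nb = sum(bits_b)
    nc = sum(a * b for a, b in zip(bits_a, bits_b))
    return nc / (na + nb - nc)


def tanimoto_set_theoretic(xa, xb):
    # SUM ( MIN ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ) )
    common = sum(min(a, b) for a, b in zip(xa, xb))
    return common / (sum(xa) + sum(xb) - common)


# Hypothetical sample data.
bits_a, bits_b = [1, 1, 0, 1, 0], [1, 0, 0, 1, 1]
counts_a, counts_b = [3, 0, 2], [1, 1, 2]

print(tanimoto_binary(bits_a, bits_b))
print(tanimoto_algebraic(counts_a, counts_b))
print(tanimoto_set_theoretic(counts_a, counts_b))
```

On the bit vectors all three forms coincide. On the count vectors the algebraic form weights large counts more heavily than the set theoretic form, so the two values differ.
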
ManhattanDistance: ( same as CityBlockDistance and HammingDistance )

*AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) )

*BinaryForm*: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc

*SetTheoreticForm*: | SetDifferenceXaXb | - | SetIntersectionXaXb |
= SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )

OchiaiSimilarity: ( same as CosineSimilarity )

*AlgebraicForm*: SUM ( Xai * Xbi ) / SQRT ( SUM ( Xai ** 2 ) * SUM (
Xbi ** 2 ) )

*BinaryForm*: Nc / SQRT ( Na * Nb )

*SetTheoreticForm*: | SetIntersectionXaXb | / SQRT ( |Xa| * |Xb| ) =
SUM ( MIN ( Xai, Xbi ) ) / SQRT ( SUM ( Xai ) * SUM ( Xbi ) )

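A Python sketch of the OchiaiSimilarity (CosineSimilarity) forms above, for illustration only. Function names and sample vectors are hypothetical; the binary form assumes 0/1 vectors, and no guard is included for all-zero input.

```python
import math


def ochiai_algebraic(xa, xb):
    # SUM ( Xai * Xbi ) / SQRT ( SUM ( Xai ** 2 ) * SUM ( Xbi ** 2 ) )
    dot = sum(a * b for a, b in zip(xa, xb))
    return dot / math.sqrt(sum(a * a for a in xa) * sum(b * b for b in xb))


def ochiai_binary(bits_a, bits_b):
    # Nc / SQRT ( Na * Nb ); assumes every value is 0 or 1.
    nc = sum(a * b for a, b in zip(bits_a, bits_b))
    return nc / math.sqrt(sum(bits_a) * sum(bits_b))


def ochiai_set_theoretic(xa, xb):
    # SUM ( MIN ( Xai, Xbi ) ) / SQRT ( SUM ( Xai ) * SUM ( Xbi ) )
    common = sum(min(a, b) for a, b in zip(xa, xb))
    return common / math.sqrt(sum(xa) * sum(xb))


# Hypothetical sample data.
bits_a, bits_b = [1, 1, 0, 1, 0], [1, 0, 0, 1, 1]
counts_a, counts_b = [3, 0, 2], [1, 1, 2]

print(ochiai_binary(bits_a, bits_b))
print(ochiai_algebraic(counts_a, counts_b))
print(ochiai_set_theoretic(counts_a, counts_b))
```
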
SorensonSimilarity: ( same as CzekanowskiSimilarity and
DiceSimilarity )

*AlgebraicForm*: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2 ) +
SUM ( Xbi ** 2 ) )

*BinaryForm*: 2 * Nc / ( Na + Nb )

*SetTheoreticForm*: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| ) =
2 * ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )

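The SorensonSimilarity (CzekanowskiSimilarity, DiceSimilarity) forms above can be sketched the same way. The function names and sample vectors below are hypothetical, the binary form assumes 0/1 vectors, and all-zero input is not guarded against.

```python
def dice_algebraic(xa, xb):
    # ( 2 * SUM ( Xai * Xbi ) ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi ** 2 ) )
    dot = sum(a * b for a, b in zip(xa, xb))
    return 2 * dot / (sum(a * a for a in xa) + sum(b * b for b in xb))


def dice_binary(bits_a, bits_b):
    # 2 * Nc / ( Na + Nb ); assumes every value is 0 or 1.
    nc = sum(a * b for a, b in zip(bits_a, bits_b))
    return 2 * nc / (sum(bits_a) + sum(bits_b))


def dice_set_theoretic(xa, xb):
    # 2 * SUM ( MIN ( Xai, Xbi ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )
    common = sum(min(a, b) for a, b in zip(xa, xb))
    return 2 * common / (sum(xa) + sum(xb))


# Hypothetical sample data.
bits_a, bits_b = [1, 1, 0, 1, 0], [1, 0, 0, 1, 1]
counts_a, counts_b = [3, 0, 2], [1, 1, 2]

print(dice_binary(bits_a, bits_b))
print(dice_algebraic(counts_a, counts_b))
print(dice_set_theoretic(counts_a, counts_b))
```
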
SoergelDistance:

*AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) ) / SUM ( MAX ( Xai, Xbi ) )

*BinaryForm*: 1 - Nc / ( Na + Nb - Nc ) = ( Na + Nb - 2 * Nc ) / (
Na + Nb - Nc )

*SetTheoreticForm*: ( | SetDifferenceXaXb | - | SetIntersectionXaXb
| ) / | SetDifferenceXaXb | = ( SUM ( Xai ) + SUM ( Xbi ) - 2 * (
SUM ( MIN ( Xai, Xbi ) ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM (
MIN ( Xai, Xbi ) ) )

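A Python sketch of the SoergelDistance forms above. As the binary form shows, this distance is one minus the Tanimoto coefficient. The function names and sample vectors are hypothetical, the binary form assumes 0/1 vectors, and all-zero input is not guarded against.

```python
def soergel_algebraic(xa, xb):
    # SUM ( ABS ( Xai - Xbi ) ) / SUM ( MAX ( Xai, Xbi ) )
    num = sum(abs(a - b) for a, b in zip(xa, xb))
    return num / sum(max(a, b) for a, b in zip(xa, xb))


def soergel_binary(bits_a, bits_b):
    # ( Na + Nb - 2 * Nc ) / ( Na + Nb - Nc ); assumes every value is 0 or 1.
    na = sum(bits_a)
    nb = sum(bits_b)
    nc = sum(a * b for a, b in zip(bits_a, bits_b))
    return (na + nb - 2 * nc) / (na + nb - nc)


def soergel_set_theoretic(xa, xb):
    # ( SUM ( Xai ) + SUM ( Xbi ) - 2 * SUM ( MIN ) ) /
    # ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ) )
    common = sum(min(a, b) for a, b in zip(xa, xb))
    total = sum(xa) + sum(xb)
    return (total - 2 * common) / (total - common)


# Hypothetical sample data.
bits_a, bits_b = [1, 1, 0, 1, 0], [1, 0, 0, 1, 1]
counts_a, counts_b = [3, 0, 2], [1, 1, 2]

print(soergel_binary(bits_a, bits_b))
print(soergel_algebraic(counts_a, counts_b))
print(soergel_set_theoretic(counts_a, counts_b))
```

For non-negative values MAX ( a, b ) equals a + b - MIN ( a, b ), so the algebraic and set theoretic forms of this distance agree on count vectors too.
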
TanimotoSimilarity: ( same as JaccardSimilarity )

*AlgebraicForm*: SUM ( Xai * Xbi ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi
** 2 ) - SUM ( Xai * Xbi ) )

*BinaryForm*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc / ( Na +
Nb - Nc )

*SetTheoreticForm*: | SetIntersectionXaXb | / | SetDifferenceXaXb |
= SUM ( MIN ( Xai, Xbi ) ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN
( Xai, Xbi ) ) )

--VectorComparisonFormulism *All |
"AlgebraicForm,[BinaryForm,SetTheoreticForm]"*
Specify fingerprints vector comparison formulism to use for
calculation of similarity and distance coefficients during -v,
--VectorComparisonMode: use all supported comparison formulisms or
specify a comma delimited list of formulisms. Possible values: *All |
"AlgebraicForm,[BinaryForm,SetTheoreticForm]"*. Default value:
*AlgebraicForm*.

*All* uses all three forms of supported vector comparison formulism
for values of -v, --VectorComparisonMode option.

For fingerprint vector strings containing AlphaNumericalValues data
values - ExtendedConnectivityFingerprints,
AtomNeighborhoodsFingerprints and so on - all three formulisms result
in the same value during similarity and distance calculations.

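To make the accepted values concrete, here is a hypothetical Python helper that expands a --VectorComparisonFormulism value into a list of formulism names. It is a sketch of the behavior described above, not the script's own parsing code; the name expand_formulisms, the exact-case matching and the error handling are all assumptions.

```python
SUPPORTED_FORMULISMS = ["AlgebraicForm", "BinaryForm", "SetTheoreticForm"]


def expand_formulisms(value="AlgebraicForm"):
    """Return the formulism names selected by an option value (hypothetical)."""
    if value.strip() == "All":
        # "All" selects every supported formulism.
        return list(SUPPORTED_FORMULISMS)
    # Otherwise treat the value as a comma delimited list, ignoring
    # whitespace around each name.
    names = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in names if name not in SUPPORTED_FORMULISMS]
    if unknown:
        raise ValueError("unsupported formulism(s): " + ", ".join(unknown))
    return names


print(expand_formulisms())
print(expand_formulisms("All"))
print(expand_formulisms("BinaryForm, SetTheoreticForm"))
```

Each selected formulism is then applied to every coefficient named by -v, --VectorComparisonMode, which is why the examples below describe one output file per coefficient and formulism combination.
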
-w, --WorkingDir *DirName*
Location of working directory. Default: current directory.

EXAMPLES
To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints in a text file present in a column name containing
Fingerprint substring by loading all fingerprints data into memory and
create a SampleFPHexTanimotoSimilarity.csv file containing compound IDs
retrieved from column name containing CompoundID substring, type:

% SimilarityMatricesFingerprints.pl -o SampleFPHex.csv

To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints in an SD file present in a data field with
Fingerprint substring in its label by loading all fingerprints data into
memory and create a SampleFPHexTanimotoSimilarity.csv file containing
sequentially generated compound IDs with Cmpd prefix, type:

% SimilarityMatricesFingerprints.pl -o SampleFPHex.sdf

To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints in an FP file by loading all fingerprints data into
memory and create a SampleFPHexTanimotoSimilarity.csv file along with
compound IDs retrieved from FP file, type:

% SimilarityMatricesFingerprints.pl -o SampleFPHex.fpf

To generate a lower triangular similarity matrix corresponding to the
Tanimoto similarity coefficient for fingerprints bit-vector strings data
corresponding to supported fingerprints in a text file present in a column
name containing Fingerprint substring by loading all fingerprints data
into memory and create a SampleFPHexTanimotoSimilarity.csv file
containing compound IDs retrieved from column name containing CompoundID
substring, type:

% SimilarityMatricesFingerprints.pl -o --InputDataMode LoadInMemory
--OutMatrixFormat RowsAndColumns --OutMatrixType LowerTriangularMatrix
SampleFPHex.csv

To generate an upper triangular similarity matrix corresponding to the
Tanimoto similarity coefficient for fingerprints bit-vector strings data
corresponding to supported fingerprints in a text file present in a column
name containing Fingerprint substring by loading all fingerprints data
into memory and create a SampleFPHexTanimotoSimilarity.csv file in
IDPairsAndValue format containing compound IDs retrieved from column
name containing CompoundID substring, type:

% SimilarityMatricesFingerprints.pl -o --InputDataMode LoadInMemory
--OutMatrixFormat IDPairsAndValue --OutMatrixType UpperTriangularMatrix
SampleFPHex.csv

To generate a full similarity matrix corresponding to the Tanimoto
similarity coefficient for fingerprints bit-vector strings data
corresponding to supported fingerprints in a text file present in a column
name containing Fingerprint substring by scanning file without loading
all fingerprints data into memory and create a
SampleFPHexTanimotoSimilarity.csv file containing compound IDs retrieved
from column name containing CompoundID substring, type:

% SimilarityMatricesFingerprints.pl -o --InputDataMode ScanFile
--OutMatrixFormat RowsAndColumns --OutMatrixType FullMatrix
SampleFPHex.csv

To generate a lower triangular similarity matrix corresponding to the
Tanimoto similarity coefficient for fingerprints bit-vector strings data
corresponding to supported fingerprints in a text file present in a column
name containing Fingerprint substring by scanning file without loading
all fingerprints data into memory and create a
SampleFPHexTanimotoSimilarity.csv file in IDPairsAndValue format
containing compound IDs retrieved from column name containing CompoundID
substring, type:

% SimilarityMatricesFingerprints.pl -o --InputDataMode ScanFile
--OutMatrixFormat IDPairsAndValue --OutMatrixType LowerTriangularMatrix
SampleFPHex.csv

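The difference between the full, lower triangular and upper triangular matrix types used in the examples above can be pictured with a short Python sketch. It only enumerates which compound ID pairs each type would cover; the helper name matrix_cells, the inclusion of the diagonal and the ordering of pairs are assumptions made for illustration and do not describe the script's actual output layout.

```python
def matrix_cells(ids, matrix_type="FullMatrix"):
    """List the (row ID, column ID) pairs covered by a matrix type (hypothetical)."""
    keep = {
        "FullMatrix": lambda i, j: True,
        # Assumption: triangular matrices include the diagonal.
        "LowerTriangularMatrix": lambda i, j: j <= i,
        "UpperTriangularMatrix": lambda i, j: j >= i,
    }[matrix_type]
    size = len(ids)
    return [(ids[i], ids[j]) for i in range(size) for j in range(size) if keep(i, j)]


# Hypothetical compound IDs using a Cmpd prefix.
ids = ["Cmpd1", "Cmpd2", "Cmpd3"]

for matrix_type in ("FullMatrix", "LowerTriangularMatrix", "UpperTriangularMatrix"):
    cells = matrix_cells(ids, matrix_type)
    print(matrix_type, len(cells), cells)
```

Because the Tanimoto coefficient is symmetric, a triangular matrix carries the same information as the full matrix with roughly half as many values.
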
To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient using algebraic formulism for fingerprints vector strings
data corresponding to supported fingerprints in a text file present in a
column name containing Fingerprint substring and create a
SampleFPCountTanimotoSimilarityAlgebraicForm.csv file containing
compound IDs retrieved from column name containing CompoundID substring,
type:

% SimilarityMatricesFingerprints.pl -o SampleFPCount.csv

To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient using algebraic formulism for fingerprints vector strings
data corresponding to supported fingerprints in an SD file present in a
data field with Fingerprint substring in its label and create a
SampleFPCountTanimotoSimilarityAlgebraicForm.csv file containing
sequentially generated compound IDs with Cmpd prefix, type:

% SimilarityMatricesFingerprints.pl -o SampleFPCount.sdf

To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient using algebraic formulism for fingerprints vector strings
data corresponding to supported fingerprints in an FP file and create a
SampleFPCountTanimotoSimilarityAlgebraicForm.csv file along with
compound IDs retrieved from FP file, type:

% SimilarityMatricesFingerprints.pl -o SampleFPCount.fpf

To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints in a text file present in a column name containing
Fingerprint substring and create a SampleFPHexTanimotoSimilarity.csv
file in IDPairsAndValue format containing compound IDs retrieved from
column name containing CompoundID substring, type:

% SimilarityMatricesFingerprints.pl --OutMatrixFormat IDPairsAndValue -o
SampleFPHex.csv

To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints in an SD file present in a data field with
Fingerprint substring in its label and create a
SampleFPHexTanimotoSimilarity.csv file in IDPairsAndValue format
containing sequentially generated compound IDs with Cmpd prefix, type:

% SimilarityMatricesFingerprints.pl --OutMatrixFormat IDPairsAndValue -o
SampleFPHex.sdf

To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints in an FP file and create a
SampleFPHexTanimotoSimilarity.csv file in IDPairsAndValue format along
with compound IDs retrieved from FP file, type:

% SimilarityMatricesFingerprints.pl --OutMatrixFormat IDPairsAndValue -o
SampleFPHex.fpf

To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints in an SD file present in a data field with
Fingerprint substring in its label and create a
SampleFPHexTanimotoSimilarity.csv file containing compound IDs from mol
name line, type:

% SimilarityMatricesFingerprints.pl --CompoundIDMode MolName -o
SampleFPHex.sdf

To generate a similarity matrix corresponding to Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints present in a data field with Fingerprint
substring in its label and create a SampleFPHexTanimotoSimilarity.csv
file containing compound IDs from data field name Mol_ID, type:

% SimilarityMatricesFingerprints.pl --CompoundIDMode DataField
--CompoundIDField Mol_ID -o SampleFPBin.sdf

To generate similarity matrices corresponding to Buser, Dice and
Tanimoto similarity coefficients for fingerprints bit-vector strings data
corresponding to supported fingerprints present in a column name
containing Fingerprint substring and create
SampleFPBin[CoefficientName]Similarity.csv files containing compound IDs
retrieved from column name containing CompoundID substring, type:

% SimilarityMatricesFingerprints.pl -b "BuserSimilarity,DiceSimilarity,
TanimotoSimilarity" -o SampleFPBin.csv

To generate similarity matrices corresponding to Buser, Dice and
Tanimoto similarity coefficients for fingerprints bit-vector strings data
corresponding to supported fingerprints present in a data field with
Fingerprint substring in its label and create
SampleFPBin[CoefficientName]Similarity.csv files containing sequentially
generated compound IDs with Cmpd prefix, type:

% SimilarityMatricesFingerprints.pl -b "BuserSimilarity,DiceSimilarity,
TanimotoSimilarity" -o SampleFPBin.sdf

To generate similarity matrices corresponding to CityBlock distance and
Tanimoto similarity coefficients using algebraic formulism for
fingerprints vector strings data corresponding to supported fingerprints
present in a column name containing Fingerprint substring and create
SampleFPCount[CoefficientName]AlgebraicForm.csv files containing
compound IDs retrieved from column name containing CompoundID substring,
type:

% SimilarityMatricesFingerprints.pl -v "CityBlockDistance,
TanimotoSimilarity" -o SampleFPCount.csv

To generate similarity matrices corresponding to CityBlock distance and
Tanimoto similarity coefficients using algebraic formulism for
fingerprints vector strings data corresponding to supported fingerprints
present in a data field with Fingerprint substring in its label and
create SampleFPCount[CoefficientName]AlgebraicForm.csv files containing
sequentially generated compound IDs with Cmpd prefix, type:

% SimilarityMatricesFingerprints.pl -v "CityBlockDistance,
TanimotoSimilarity" -o SampleFPCount.sdf

To generate similarity matrices corresponding to CityBlock distance and
Tanimoto similarity coefficients using binary formulism for fingerprints
vector strings data corresponding to supported fingerprints present in a
column name containing Fingerprint substring and create
SampleFPCount[CoefficientName]Binary.csv files containing compound IDs
retrieved from column name containing CompoundID substring, type:

% SimilarityMatricesFingerprints.pl -v "CityBlockDistance,
TanimotoSimilarity" --VectorComparisonFormulism BinaryForm -o
SampleFPCount.csv

1219
To generate similarity matrices corresponding to CityBlock distance and
Tanimoto similarity coefficients, using the binary formulism, for
fingerprints vector strings data corresponding to supported fingerprints
present in a data field with the Fingerprint substring in its label, and
create SampleFPCount[CoefficientName]Binary.csv files containing
sequentially generated compound IDs with the Cmpd prefix, type:

    % SimilarityMatricesFingerprints.pl -v "CityBlockDistance,TanimotoSimilarity"
      --VectorComparisonFormulism BinaryForm -o SampleFPCount.sdf

To generate similarity matrices corresponding to CityBlock distance and
Tanimoto similarity coefficients, using all supported comparison
formulisms, for fingerprints vector strings data corresponding to
supported fingerprints present in a column whose name contains the
Fingerprint substring, and create
SampleFPCount[CoefficientName][FormulismName].csv files containing
compound IDs retrieved from the column whose name contains the
CompoundID substring, type:

    % SimilarityMatricesFingerprints.pl -v "CityBlockDistance,TanimotoSimilarity"
      --VectorComparisonFormulism All -o SampleFPCount.csv

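To make the effect of running every formulism more concrete, the sketch
below computes the Tanimoto coefficient for one pair of count vectors in
three ways. Only BinaryForm is named in the examples here; the labels
"algebraic" and "set-theoretic", and the formulas attached to them, are
the commonly used definitions and are assumed rather than taken from the
script. The sample vectors are hypothetical.

```python
# Sketch only: one coefficient, three ways of treating count values.

def tanimoto_algebraic(xa, xb):
    """Dot product over (|xa|^2 + |xb|^2 - dot product)."""
    dot = sum(a * b for a, b in zip(xa, xb))
    denom = sum(a * a for a in xa) + sum(b * b for b in xb) - dot
    return dot / denom if denom else 0.0

def tanimoto_set_theoretic(xa, xb):
    """Sum of per-position minima over (sum xa + sum xb - sum of minima)."""
    mins = sum(min(a, b) for a, b in zip(xa, xb))
    denom = sum(xa) + sum(xb) - mins
    return mins / denom if denom else 0.0

def tanimoto_binary(xa, xb):
    """Reduce counts to 0/1 first, then count shared non-zero positions."""
    ba = [1 if a else 0 for a in xa]
    bb = [1 if b else 0 for b in xb]
    return tanimoto_set_theoretic(ba, bb)

# Hypothetical aligned count vectors for two compounds.
xa = [2, 0, 1, 3]
xb = [1, 1, 0, 3]

results = {
    "Algebraic": tanimoto_algebraic(xa, xb),
    "SetTheoretic": tanimoto_set_theoretic(xa, xb),
    "Binary": tanimoto_binary(xa, xb),
}
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

The three values differ for the same pair of vectors, which is why a
separate output file per coefficient and formulism is useful.
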
To generate similarity matrices corresponding to CityBlock distance and
Tanimoto similarity coefficients, using all supported comparison
formulisms, for fingerprints vector strings data corresponding to
supported fingerprints present in a data field with the Fingerprint
substring in its label, and create
SampleFPCount[CoefficientName][FormulismName].csv files containing
sequentially generated compound IDs with the Cmpd prefix, type:

    % SimilarityMatricesFingerprints.pl -v "CityBlockDistance,TanimotoSimilarity"
      --VectorComparisonFormulism All -o SampleFPCount.sdf

To generate similarity matrices corresponding to all available
similarity coefficients for fingerprints bit-vector strings data
corresponding to supported fingerprints present in a column whose name
contains the Fingerprint substring, and create
SampleFPHex[CoefficientName].csv files containing compound IDs retrieved
from the column whose name contains the CompoundID substring, type:

    % SimilarityMatricesFingerprints.pl -m AutoDetect --BitVectorComparisonMode
      All --alpha 0.5 --beta 0.5 -o SampleFPHex.csv

To generate similarity matrices corresponding to all available
similarity coefficients for fingerprints bit-vector strings data
corresponding to supported fingerprints present in a data field with the
Fingerprint substring in its label, and create
SampleFPHex[CoefficientName].csv files containing sequentially generated
compound IDs with the Cmpd prefix, type:

    % SimilarityMatricesFingerprints.pl -m AutoDetect --BitVectorComparisonMode
      All --alpha 0.5 --beta 0.5 -o SampleFPHex.sdf

To generate similarity matrices corresponding to all available
similarity and distance coefficients, using all comparison formulisms,
for fingerprints vector strings data corresponding to supported
fingerprints present in a column whose name contains the Fingerprint
substring, and create SampleFPCount[CoefficientName][FormulismName].csv
files containing compound IDs retrieved from the column whose name
contains the CompoundID substring, type:

    % SimilarityMatricesFingerprints.pl -m AutoDetect --VectorComparisonMode
      All --VectorComparisonFormulism All -o SampleFPCount.csv

To generate similarity matrices corresponding to all available
similarity and distance coefficients, using all comparison formulisms,
for fingerprints vector strings data corresponding to supported
fingerprints present in a data field with the Fingerprint substring in
its label, and create SampleFPCount[CoefficientName][FormulismName].csv
files containing sequentially generated compound IDs with the Cmpd
prefix, type:

    % SimilarityMatricesFingerprints.pl -m AutoDetect --VectorComparisonMode
      All --VectorComparisonFormulism All -o SampleFPCount.sdf

To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints present in column number 2, and create a
SampleFPHexTanimotoSimilarity.csv file containing compound IDs retrieved
from column number 1, type:

    % SimilarityMatricesFingerprints.pl --ColMode ColNum --CompoundIDCol 1
      --FingerprintsCol 2 -o SampleFPHex.csv

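For readers who want to see the arithmetic behind a Tanimoto matrix over
bit-vector fingerprints, here is a minimal Python sketch. It assumes
each fingerprint is available as a bare hexadecimal string of equal
length with a shared bit order; the fingerprint strings in real input
files may carry additional type and size information that this sketch
does not parse. The compound IDs and hex values are made up.

```python
# Sketch only: Tanimoto similarity matrix from bare hexadecimal bit strings.

def popcount(n):
    """Number of bits set to 1 in a non-negative integer."""
    return bin(n).count("1")

def tanimoto_bits(a, b):
    """Nc / (Na + Nb - Nc), where N counts bits set to 1."""
    nc = popcount(a & b)
    denom = popcount(a) + popcount(b) - nc
    # Two empty bit-vectors: return 0.0 (a convention chosen for this sketch).
    return nc / denom if denom else 0.0

# Hypothetical rows: (compound ID from column 1, fingerprint from column 2).
rows = [
    ("Cmpd1", "f0f0"),
    ("Cmpd2", "ff00"),
    ("Cmpd3", "0f0f"),
]

bit_vectors = [(cmpd_id, int(hex_string, 16)) for cmpd_id, hex_string in rows]
matrix = [
    [tanimoto_bits(a, b) for _, b in bit_vectors]
    for _, a in bit_vectors
]

# Print a simple labelled square matrix.
print(",".join([""] + [cmpd_id for cmpd_id, _ in bit_vectors]))
for (cmpd_id, _), row in zip(bit_vectors, matrix):
    print(",".join([cmpd_id] + [f"{value:.3f}" for value in row]))
```

The printed layout is only for inspection and is not meant to reproduce
the format of the files the script writes.
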
To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints present in a data field named Fingerprints, and
create a SampleFPHexTanimotoSimilarity.csv file containing compound IDs
present in a data field named Mol_ID, type:

    % SimilarityMatricesFingerprints.pl --FingerprintsField Fingerprints
      --CompoundIDMode DataField --CompoundIDField Mol_ID -o SampleFPHex.sdf

To generate a similarity matrix corresponding to the Tversky similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints present in a column named Fingerprints, and
create a SampleFPHexTverskySimilarity.tsv file containing compound IDs
retrieved from the column named CompoundID, type:

    % SimilarityMatricesFingerprints.pl --BitVectorComparisonMode
      TverskySimilarity --alpha 0.5 --ColMode ColLabel --CompoundIDCol
      CompoundID --FingerprintsCol Fingerprints --OutDelim Tab --quote No
      -o SampleFPHex.csv

To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints present in a data field with the Fingerprint
substring in its label, and create a SampleFPHexTanimotoSimilarity.csv
file containing compound IDs from the molname line or, failing that,
sequentially generated compound IDs with the Mol prefix, type:

    % SimilarityMatricesFingerprints.pl --CompoundIDMode MolNameOrLabelPrefix
      --CompoundIDPrefix Mol -o SampleFPHex.sdf

To generate a similarity matrix corresponding to the Tanimoto similarity
coefficient for fingerprints bit-vector strings data corresponding to
supported fingerprints present in a data field with the Fingerprint
substring in its label, and create a SampleFPHexTanimotoSimilarity.tsv
file containing sequentially generated compound IDs with the Cmpd
prefix, type:

    % SimilarityMatricesFingerprints.pl --OutDelim Tab --quote No -o SampleFPHex.sdf

AUTHOR
Manish Sud <msud@san.rr.com>

SEE ALSO
InfoFingerprintsFiles.pl, SimilaritySearchingFingerprints.pl,
AtomNeighborhoodsFingerprints.pl, ExtendedConnectivityFingerprints.pl,
MACCSKeysFingerprints.pl, PathLengthFingerprints.pl,
TopologicalAtomPairsFingerprints.pl,
TopologicalAtomTorsionsFingerprints.pl,
TopologicalPharmacophoreAtomPairsFingerprints.pl,
TopologicalPharmacophoreAtomTripletsFingerprints.pl

COPYRIGHT
Copyright (C) 2015 Manish Sud. All rights reserved.

This file is part of MayaChemTools.

MayaChemTools is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 3 of the License, or (at
your option) any later version.
