annotate docs/scripts/txt/SimilarityMatricesFingerprints.txt @ 3:90ea638ce878 draft default tip

Uploaded
author deepakjadmin
date Wed, 20 Jan 2016 09:11:59 -0500
parents 2abf0d43254d
children
NAME
SimilarityMatricesFingerprints.pl - Calculate similarity matrices using
fingerprints strings data in SD, FP and CSV/TSV text file(s)

SYNOPSIS
SimilarityMatricesFingerprints.pl SDFile(s) FPFile(s) TextFile(s)...

SimilarityMatricesFingerprints.pl [--alpha *number*] [--beta *number*]
[-b, --BitVectorComparisonMode *All | "TanimotoSimilarity,[
TverskySimilarity, ... ]"*] [-c, --ColMode *ColNum | ColLabel*]
[--CompoundIDCol *col number | col name*] [--CompoundIDPrefix *text*]
[--CompoundIDField *DataFieldName*] [--CompoundIDMode *DataField |
MolName | LabelPrefix | MolNameOrLabelPrefix*] [-d, --detail
*InfoLevel*] [-f, --fast] [--FingerprintsCol *col number | col name*]
[--FingerprintsField *FieldLabel*] [-h, --help] [--InDelim *comma |
semicolon*] [--InputDataMode *LoadInMemory | ScanFile*] [-m, --mode
*AutoDetect | FingerprintsBitVectorString | FingerprintsVectorString*]
[--OutDelim *comma | tab | semicolon*] [--OutMatrixFormat
*RowsAndColumns | IDPairsAndValue*] [--OutMatrixType *FullMatrix |
UpperTriangularMatrix | LowerTriangularMatrix*] [-o, --overwrite] [-p,
--precision *number*] [-q, --quote *Yes | No*] [-r, --root *RootName*]
[-v, --VectorComparisonMode *All | "TanimotoSimilarity, [
ManhattanDistance, ...]"*] [--VectorComparisonFormulism *All |
"AlgebraicForm, [BinaryForm, SetTheoreticForm]"*] [-w, --WorkingDir
dirname] SDFile(s) FPFile(s) TextFile(s)...

DESCRIPTION
Calculate similarity matrices using fingerprint bit-vector or vector
strings data in *SD, FP and CSV/TSV* text file(s) and generate CSV/TSV
text file(s) containing values for specified similarity and distance
coefficients.

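As a minimal sketch of the kind of calculation described above (not the
script's actual implementation), the following Python computes a full
Tanimoto similarity matrix from hexadecimal bit-vector strings. The
function names and the toy fingerprints are hypothetical. It assumes all
strings share the same size and bit order, in which case the count of
shared set bits does not depend on how hex digits map to bit positions.

```python
from itertools import combinations


def popcount_hex(hex_string):
    """Count the set bits in a hexadecimal bit-vector string."""
    return bin(int(hex_string, 16)).count("1")


def tanimoto(hex_a, hex_b):
    """Tanimoto similarity c / (a + b - c) for two hex bit-vector strings."""
    a = popcount_hex(hex_a)
    b = popcount_hex(hex_b)
    c = bin(int(hex_a, 16) & int(hex_b, 16)).count("1")
    union = a + b - c
    # Assumption: two all-zero fingerprints are treated as identical.
    return 1.0 if union == 0 else c / union


def similarity_matrix(fingerprints):
    """Return {id1: {id2: value}} holding the full symmetric matrix."""
    ids = list(fingerprints)
    matrix = {i: {i: 1.0} for i in ids}
    for i, j in combinations(ids, 2):
        value = tanimoto(fingerprints[i], fingerprints[j])
        matrix[i][j] = value
        matrix[j][i] = value
    return matrix


# Hypothetical 8-bit fingerprints, far shorter than real ones.
TOY_FINGERPRINTS = {"Cmpd1": "f0", "Cmpd2": "3c", "Cmpd3": "0f"}
TOY_MATRIX = similarity_matrix(TOY_FINGERPRINTS)
```

Because the matrix is symmetric, each pair is computed once and stored
twice. A triangular output would keep only one of the two entries.
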
The scripts SimilarityMatrixSDFiles.pl and SimilarityMatrixTextFiles.pl
have been removed from the current release of MayaChemTools and their
functionality merged with this script.

The valid *SDFile* extensions are *.sdf* and *.sd*. All SD files in the
current directory can be specified either by **.sdf* or by the current
directory name.

The valid *FPFile* extensions are *.fpf* and *.fp*. All FP files in the
current directory can be specified either by **.fpf* or by the current
directory name.

The valid *TextFile* extensions are *.csv* and *.tsv* for
comma/semicolon and tab delimited text files respectively. All other
file names are ignored. All text files in the current directory can be
specified by **.csv*, **.tsv*, or the current directory name. The
--indelim option determines the format of *TextFile(s)*. Any file that
doesn't correspond to the format indicated by the --indelim option is
ignored.

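The extension rules above can be sketched in Python as follows. This is
an illustration only: the names EXTENSION_KINDS and classify_input_files
are hypothetical, the case-insensitive match is an assumption, and the
sketch does not model directory expansion or the --indelim check.

```python
import os

# Extensions named in the text above, mapped to the kind of input file.
EXTENSION_KINDS = {
    ".sdf": "SD", ".sd": "SD",
    ".fpf": "FP", ".fp": "FP",
    ".csv": "Text", ".tsv": "Text",
}


def classify_input_files(file_names):
    """Split names into {name: kind} for known extensions and an ignored list."""
    recognized, ignored = {}, []
    for name in file_names:
        # Assumption: extensions are compared case-insensitively.
        extension = os.path.splitext(name)[1].lower()
        kind = EXTENSION_KINDS.get(extension)
        if kind is None:
            ignored.append(name)
        else:
            recognized[name] = kind
    return recognized, ignored
```

For example, classify_input_files(["Sample.sdf", "Notes.txt"]) places
Sample.sdf under "SD" and puts Notes.txt in the ignored list.
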
Example of *FP* file containing fingerprints bit-vector string data:

#
# Package = MayaChemTools 7.4
# ReleaseDate = Oct 21, 2010
#
# TimeStamp = Mon Mar 7 15:14:01 2011
#
# FingerprintsStringType = FingerprintsBitVector
#
# Description = PathLengthBits:AtomicInvariantsAtomTypes:MinLength1:...
# Size = 1024
# BitStringFormat = HexadecimalString
# BitsOrder = Ascending
#
Cmpd1 9c8460989ec8a49913991a6603130b0a19e8051c89184414953800cc21510...
Cmpd2 000000249400840040100042011001001980410c000000001010088001120...
... ...
... ..

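A file laid out like the example above could be read with a sketch such
as the one below. It assumes header lines have the form "# Key = Value"
and data lines have the form "ID fingerprint" separated by whitespace.
The function name parse_fp_text and the shortened 16-bit sample are
hypothetical and not taken from MayaChemTools.

```python
def parse_fp_text(text):
    """Return (header, records) parsed from FP-style text."""
    header, records = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line: keep only "# Key = Value" entries, skip bare "#".
            key, separator, value = line[1:].partition("=")
            if separator:
                header[key.strip()] = value.strip()
            continue
        # Data line: compound ID, whitespace, fingerprint string.
        parts = line.split(None, 1)
        if len(parts) == 2:
            records[parts[0]] = parts[1]
    return header, records


# Hypothetical sample with 16-bit fingerprints instead of 1024-bit ones.
SAMPLE_FP_TEXT = """\
#
# FingerprintsStringType = FingerprintsBitVector
#
# Size = 16
# BitStringFormat = HexadecimalString
# BitsOrder = Ascending
#
Cmpd1 9c84
Cmpd2 0024
"""
HEADER, RECORDS = parse_fp_text(SAMPLE_FP_TEXT)
```

Header values are kept as strings, so a caller that needs the bit count
would convert HEADER["Size"] with int().
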
Example of *FP* file containing fingerprints vector string data:

#
# Package = MayaChemTools 7.4
# ReleaseDate = Oct 21, 2010
#
# TimeStamp = Mon Mar 7 15:14:01 2011
#
# FingerprintsStringType = FingerprintsVector
#
# Description = PathLengthBits:AtomicInvariantsAtomTypes:MinLength1:...
# VectorStringFormat = IDsAndValuesString
# VectorValuesType = NumericalValues
#
Cmpd1 338;C F N O C:C C:N C=O CC CF CN CO C:C:C C:C:N C:CC C:CF C:CN C:
N:C C:NC CC:N CC=O CCC CCN CCO CNC NC=O O=CO C:C:C:C C:C:C:N C:C:CC...;
33 1 2 5 21 2 2 12 1 3 3 20 2 10 2 2 1 2 2 2 8 2 5 1 1 1 19 2 8 2 2 2 2
6 2 2 2 2 2 2 2 2 3 2 2 1 4 1 5 1 1 18 6 2 2 1 2 10 2 1 2 1 2 2 2 2 ...
Cmpd2 103;C N O C=N C=O CC CN CO CC=O CCC CCN CCO CNC N=CN NC=O NCN O=C
O C CC=O CCCC CCCN CCCO CCNC CNC=N CNC=O CNCN CCCC=O CCCCC CCCCN CC...;
15 4 4 1 2 13 5 2 2 15 5 3 2 2 1 1 1 2 17 7 6 5 1 1 1 2 15 8 5 7 2 2 2 2
1 2 1 1 3 15 7 6 8 3 4 4 3 2 2 1 2 3 14 2 4 7 4 4 4 4 1 1 1 2 1 1 1 ...
... ...
... ...

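The data lines above follow a "count;IDs;values" layout. The sketch
below parses that layout and computes a Manhattan distance between two
such vectors. It is an illustration under stated assumptions: the
leading number is taken to be the number of ID/value pairs, IDs contain
no spaces, and an ID missing from one vector is treated as 0. The
function names and toy vectors are hypothetical, and the script itself
may handle mismatched IDs differently.

```python
def parse_ids_and_values(vector_string):
    """Parse 'count;id1 id2 ...;v1 v2 ...' into {id: float value}."""
    count_text, ids_text, values_text = vector_string.split(";")
    ids = ids_text.split()
    values = [float(v) for v in values_text.split()]
    if not (int(count_text) == len(ids) == len(values)):
        raise ValueError("count, IDs and values do not agree")
    return dict(zip(ids, values))


def manhattan_distance(vector_a, vector_b):
    """Sum of |a_i - b_i| over all IDs; a missing ID counts as 0 (assumption)."""
    all_ids = set(vector_a) | set(vector_b)
    return sum(abs(vector_a.get(i, 0.0) - vector_b.get(i, 0.0)) for i in all_ids)


# Hypothetical three-element vectors, much shorter than the ones above.
VECTOR_1 = parse_ids_and_values("3;C N O;5 1 2")
VECTOR_2 = parse_ids_and_values("3;C O S;4 2 1")
DISTANCE = manhattan_distance(VECTOR_1, VECTOR_2)
```

The truncated strings shown in the example file would fail the length
check, since their ID and value lists are cut off with "...".
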
Example of *SD* file containing fingerprints bit-vector string data:

... ...
... ...
$$$$
... ...
... ...
... ...
41 44 0 0 0 0 0 0 0 0999 V2000
-3.3652 1.4499 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
... ...
2 3 1 0 0 0 0
... ...
M END
> <CmpdID>
Cmpd1

> <PathLengthFingerprints>
FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLengt
h1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a49913991a66
03130b0a19e8051c89184414953800cc2151082844a201042800130860308e8204d4028
00831048940e44281c00060449a5000ac80c894114e006321264401600846c050164462
08190410805000304a10205b0100e04c0038ba0fad0209c0ca8b1200012268b61c0026a
aa0660a11014a011d46

$$$$
... ...
... ...

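The data fields in a record like the one above ("> <FieldName>"
followed by value lines and a blank line) could be pulled out with a
sketch along these lines. The function name sd_data_fields and the
shortened sample record are hypothetical. Joining the value lines
without a separator is an assumption that suits a fingerprint string
wrapped across lines, as in the example.

```python
import re

# A data header line such as "> <CmpdID>".
FIELD_HEADER = re.compile(r"^>\s*<([^>]+)>")


def sd_data_fields(record_text):
    """Return {field name: value} for the data items of one SD record."""
    fields = {}
    current = None
    for line in record_text.splitlines():
        match = FIELD_HEADER.match(line)
        if match:
            current = match.group(1)
            fields[current] = ""
        elif current is not None:
            if line.strip() == "" or line.startswith("$$$$"):
                current = None  # a blank line or record end closes the field
            else:
                # Assumption: wrapped value lines are joined with no separator.
                fields[current] += line.strip()
    return fields


# Hypothetical record tail with a 16-bit fingerprint wrapped over two lines.
SAMPLE_SD_RECORD = """\
M  END
> <CmpdID>
Cmpd1

> <PathLengthFingerprints>
FingerprintsBitVector;PathLengthBits:MinLength1:MaxLength8;16;Hexadeci
malString;Ascending;9c84

$$$$
"""
FIELDS = sd_data_fields(SAMPLE_SD_RECORD)
```

Lines before the first data header, such as the atom and bond blocks,
are skipped because no field is open at that point.
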
Example of CSV *Text* file containing fingerprints bit-vector string
data:

"CompoundID","PathLengthFingerprints"
"Cmpd1","FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes
:MinLength1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a4
9913991a6603130b0a19e8051c89184414953800cc2151082844a20104280013086030
8e8204d402800831048940e44281c00060449a5000ac80c894114e006321264401..."
... ...
... ...

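A CSV file like the one above could be read with Python's csv module,
splitting each fingerprint string on its semicolons. This is a sketch
rather than the script's own parser: the function names and dictionary
keys are hypothetical, and the six-part layout (type, description, size,
format, bit order, data) is read off the example string shown above.

```python
import csv
import io


def split_bit_vector_string(value):
    """Split a semicolon-delimited bit-vector string into its six parts."""
    kind, description, size, bit_format, order, data = value.split(";")
    return {
        "type": kind,
        "description": description,
        "size": int(size),
        "format": bit_format,
        "order": order,
        "data": data,
    }


def read_fingerprints_csv(text, id_column, fingerprints_column):
    """Return {compound ID: parsed fingerprint} from CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row[id_column]: split_bit_vector_string(row[fingerprints_column])
        for row in reader
    }


# Hypothetical sample with a 16-bit fingerprint instead of a 1024-bit one.
SAMPLE_CSV = (
    '"CompoundID","PathLengthFingerprints"\n'
    '"Cmpd1","FingerprintsBitVector;PathLengthBits:MinLength1:MaxLength8;'
    '16;HexadecimalString;Ascending;9c84"\n'
)
PARSED = read_fingerprints_csv(SAMPLE_CSV, "CompoundID", "PathLengthFingerprints")
```

With a hexadecimal string each digit carries four bits, so a quick
sanity check is that four times the data length equals the stated size.
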
The current release of MayaChemTools supports the following types of
fingerprint bit-vector and vector strings:

FingerprintsVector;AtomNeighborhoods:AtomicInvariantsAtomTypes:MinRadi
us0:MaxRadius2;41;AlphaNumericalValues;ValuesString;NR0-C.X1.BO1.H3-AT
C1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-ATC1 NR0-C.X
1.BO1.H3-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-A
TC1 NR0-C.X2.BO2.H2-ATC1:NR1-C.X2.BO2.H2-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2
-C.X2.BO2.H2-ATC1:NR2-N.X3.BO3-ATC1:NR2-O.X1.BO1.H1-ATC1 NR0-C.X2.B...

FingerprintsVector;AtomTypesCount:AtomicInvariantsAtomTypes:ArbitraryS
ize;10;NumericalValues;IDsAndValuesString;C.X1.BO1.H3 C.X2.BO2.H2 C.X2
.BO3.H1 C.X3.BO3.H1 C.X3.BO4 F.X1.BO1 N.X2.BO2.H1 N.X3.BO3 O.X1.BO1.H1
O.X1.BO2;2 4 14 3 10 1 1 1 3 2

FingerprintsVector;AtomTypesCount:SLogPAtomTypes:ArbitrarySize;16;Nume
ricalValues;IDsAndValuesString;C1 C10 C11 C14 C18 C20 C21 C22 C5 CS F
N11 N4 O10 O2 O9;5 1 1 1 14 4 2 1 2 2 1 1 1 1 3 1

FingerprintsVector;AtomTypesCount:SLogPAtomTypes:FixedSize;67;OrderedN
umericalValues;IDsAndValuesString;C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C
12 C13 C14 C15 C16 C17 C18 C19 C20 C21 C22 C23 C24 C25 C26 C27 CS N1 N
2 N3 N4 N5 N6 N7 N8 N9 N10 N11 N12 N13 N14 NS O1 O2 O3 O4 O5 O6 O7 O8
O9 O10 O11 O12 OS F Cl Br I Hal P S1 S2 S3 Me1 Me2;5 0 0 0 2 0 0 0 0 1
1 0 0 1 0 0 0 14 0 4 2 1 0 0 0 0 0 2 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0...

FingerprintsVector;EStateIndicies:ArbitrarySize;11;NumericalValues;IDs
AndValuesString;SaaCH SaasC SaasN SdO SdssC SsCH3 SsF SsOH SssCH2 SssN
H SsssCH;24.778 4.387 1.993 25.023 -1.435 3.975 14.006 29.759 -0.073 3
.024 -2.270

FingerprintsVector;EStateIndicies:FixedSize;87;OrderedNumericalValues;
ValuesString;0 0 0 0 0 0 0 3.975 0 -0.073 0 0 24.778 -2.270 0 0 -1.435
4.387 0 0 0 0 0 0 3.024 0 0 0 0 0 0 0 1.993 0 29.759 25.023 0 0 0 0 1
4.006 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0

FingerprintsVector;ExtendedConnectivity:AtomicInvariantsAtomTypes:Radi
us2;60;AlphaNumericalValues;ValuesString;73555770 333564680 352413391
666191900 1001270906 1371674323 1481469939 1977749791 2006158649 21414
08799 49532520 64643108 79385615 96062769 273726379 564565671 85514103
5 906706094 988546669 1018231313 1032696425 1197507444 1331250018 1338
532734 1455473691 1607485225 1609687129 1631614296 1670251330 17303...

FingerprintsVector;ExtendedConnectivityCount:AtomicInvariantsAtomTypes
:Radius2;60;NumericalValues;IDsAndValuesString;73555770 333564680 3524
13391 666191900 1001270906 1371674323 1481469939 1977749791 2006158649
2141408799 49532520 64643108 79385615 96062769 273726379 564565671...;
3 2 1 1 14 1 2 10 4 3 1 1 1 1 2 1 2 1 1 1 2 3 1 1 2 1 3 3 8 2 2 2 6 2
1 2 1 1 2 1 1 1 2 1 1 2 1 2 1 1 1 1 1 1 1 1 1 2 1 1

FingerprintsBitVector;ExtendedConnectivityBits:AtomicInvariantsAtomTyp
es:Radius2;1024;BinaryString;Ascending;0000000000000000000000000000100
0000000001010000000110000011000000000000100000000000000000000000100001
1000000110000000000000000000000000010011000000000000000000000000010000
0000000000000000000000000010000000000000000001000000000000000000000000
0000000000010000100001000000000000101000000000000000100000000000000...

FingerprintsVector;ExtendedConnectivity:FunctionalClassAtomTypes:Radiu
s2;57;AlphaNumericalValues;ValuesString;24769214 508787397 850393286 8
62102353 981185303 1231636850 1649386610 1941540674 263599683 32920567
1 571109041 639579325 683993318 723853089 810600886 885767127 90326012
7 958841485 981022393 1126908698 1152248391 1317567065 1421489994 1455
632544 1557272891 1826413669 1983319256 2015750777 2029559552 20404...

FingerprintsVector;ExtendedConnectivity:EStateAtomTypes:Radius2;62;Alp
haNumericalValues;ValuesString;25189973 528584866 662581668 671034184
926543080 1347067490 1738510057 1759600920 2034425745 2097234755 21450
44754 96779665 180364292 341712110 345278822 386540408 387387308 50430
1706 617094135 771528807 957666640 997798220 1158349170 1291258082 134
1138533 1395329837 1420277211 1479584608 1486476397 1487556246 1566...

FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;00000000
0000000000000000000000000000000001001000010010000000010010000000011100
0100101010111100011011000100110110000011011110100110111111111111011111
11111111111110111000

FingerprintsBitVector;MACCSKeyBits;322;BinaryString;Ascending;11101011
1110011111100101111111000111101100110000000000000011100010000000000000
0000000000000000000000000000000000000000000000101000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000011000000000000000000000000000000
0000000000000000000000000000000000000000

FingerprintsVector;MACCSKeyCount;166;OrderedNumericalValues;ValuesStri
ng;0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 0 3 0 0 0 0 4 0 0 2 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0
0 0 0 0 1 1 8 0 0 0 1 0 0 1 0 1 0 1 0 3 1 3 1 0 0 0 1 2 0 11 1 0 0 0
5 0 0 1 2 0 1 1 0 0 0 0 0 1 1 0 1 1 1 1 0 4 0 0 1 1 0 4 6 1 1 1 2 1 1
3 5 2 2 0 5 3 5 1 1 2 5 1 2 1 2 4 8 3 5 5 2 2 0 3 5 4 1

FingerprintsVector;MACCSKeyCount;322;OrderedNumericalValues;ValuesStri
ng;14 8 2 0 2 0 4 4 2 1 4 0 0 2 5 10 5 2 1 0 0 2 0 5 13 3 28 5 5 3 0 0
0 4 2 1 1 0 1 1 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 22 5 3 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11 0 2 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...

FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng
th1:MaxLength8;1024;BinaryString;Ascending;001000010011010101011000110
0100010101011000101001011100110001000010001001101000001001001001001000
0010110100000111001001000001001010100100100000000011000000101001011100
0010000001000101010100000100111100110111011011011000000010110111001101
0101100011000000010001000011000010100011101100001000001000100000000...

FingerprintsVector;PathLengthCount:AtomicInvariantsAtomTypes:MinLength
1:MaxLength8;432;NumericalValues;IDsAndValuesPairsString;C.X1.BO1.H3 2
C.X2.BO2.H2 4 C.X2.BO3.H1 14 C.X3.BO3.H1 3 C.X3.BO4 10 F.X1.BO1 1 N.X
2.BO2.H1 1 N.X3.BO3 1 O.X1.BO1.H1 3 O.X1.BO2 2 C.X1.BO1.H3C.X3.BO3.H1
2 C.X2.BO2.H2C.X2.BO2.H2 1 C.X2.BO2.H2C.X3.BO3.H1 4 C.X2.BO2.H2C.X3.BO
4 1 C.X2.BO2.H2N.X3.BO3 1 C.X2.BO3.H1:C.X2.BO3.H1 10 C.X2.BO3.H1:C....

FingerprintsVector;PathLengthCount:MMFF94AtomTypes:MinLength1:MaxLengt
h8;463;NumericalValues;IDsAndValuesPairsString;C5A 2 C5B 2 C=ON 1 CB 1
8 COO 1 CR 9 F 1 N5 1 NC=O 1 O=CN 1 O=CO 1 OC=O 1 OR 2 C5A:C5B 2 C5A:N
5 2 C5ACB 1 C5ACR 1 C5B:C5B 1 C5BC=ON 1 C5BCB 1 C=ON=O=CN 1 C=ONNC=O 1
CB:CB 18 CBF 1 CBNC=O 1 COO=O=CO 1 COOCR 1 COOOC=O 1 CRCR 7 CRN5 1 CR
OR 2 C5A:C5B:C5B 2 C5A:C5BC=ON 1 C5A:C5BCB 1 C5A:N5:C5A 1 C5A:N5CR ...

FingerprintsVector;TopologicalAtomPairs:AtomicInvariantsAtomTypes:MinD
istance1:MaxDistance10;223;NumericalValues;IDsAndValuesString;C.X1.BO1
.H3-D1-C.X3.BO3.H1 C.X2.BO2.H2-D1-C.X2.BO2.H2 C.X2.BO2.H2-D1-C.X3.BO3.
H1 C.X2.BO2.H2-D1-C.X3.BO4 C.X2.BO2.H2-D1-N.X3.BO3 C.X2.BO3.H1-D1-...;
2 1 4 1 1 10 8 1 2 6 1 2 2 1 2 1 2 2 1 2 1 5 1 10 12 2 2 1 2 1 9 1 3 1
1 1 2 2 1 3 6 1 6 14 2 2 2 3 1 3 1 8 2 2 1 3 2 6 1 2 2 5 1 3 1 23 1...

FingerprintsVector;TopologicalAtomPairs:FunctionalClassAtomTypes:MinDi
stance1:MaxDistance10;144;NumericalValues;IDsAndValuesString;Ar-D1-Ar
Ar-D1-Ar.HBA Ar-D1-HBD Ar-D1-Hal Ar-D1-None Ar.HBA-D1-None HBA-D1-NI H
BA-D1-None HBA.HBD-D1-NI HBA.HBD-D1-None HBD-D1-None NI-D1-None No...;
23 2 1 1 2 1 1 1 1 2 1 1 7 28 3 1 3 2 8 2 1 1 1 5 1 5 24 3 3 4 2 13 4
1 1 4 1 5 22 4 4 3 1 19 1 1 1 1 1 2 2 3 1 1 8 25 4 5 2 3 1 26 1 4 1 ...

FingerprintsVector;TopologicalAtomTorsions:AtomicInvariantsAtomTypes;3
3;NumericalValues;IDsAndValuesString;C.X1.BO1.H3-C.X3.BO3.H1-C.X3.BO4-
C.X3.BO4 C.X1.BO1.H3-C.X3.BO3.H1-C.X3.BO4-N.X3.BO3 C.X2.BO2.H2-C.X2.BO
2.H2-C.X3.BO3.H1-C.X2.BO2.H2 C.X2.BO2.H2-C.X2.BO2.H2-C.X3.BO3.H1-O...;
2 2 1 1 2 2 1 1 3 4 4 8 4 2 2 6 2 2 1 2 1 1 2 1 1 2 6 2 4 2 1 3 1

FingerprintsVector;TopologicalAtomTorsions:EStateAtomTypes;36;Numerica
lValues;IDsAndValuesString;aaCH-aaCH-aaCH-aaCH aaCH-aaCH-aaCH-aasC aaC
H-aaCH-aasC-aaCH aaCH-aaCH-aasC-aasC aaCH-aaCH-aasC-sF aaCH-aaCH-aasC-
ssNH aaCH-aasC-aasC-aasC aaCH-aasC-aasC-aasN aaCH-aasC-ssNH-dssC a...;
4 4 8 4 2 2 6 2 2 2 4 3 2 1 3 3 2 2 2 1 2 1 1 1 2 1 1 1 1 1 1 1 2 1 1 2

FingerprintsVector;TopologicalAtomTriplets:AtomicInvariantsAtomTypes:M
inDistance1:MaxDistance10;3096;NumericalValues;IDsAndValuesString;C.X1
.BO1.H3-D1-C.X1.BO1.H3-D1-C.X3.BO3.H1-D2 C.X1.BO1.H3-D1-C.X2.BO2.H2-D1
0-C.X3.BO4-D9 C.X1.BO1.H3-D1-C.X2.BO2.H2-D3-N.X3.BO3-D4 C.X1.BO1.H3-D1
-C.X2.BO2.H2-D4-C.X2.BO2.H2-D5 C.X1.BO1.H3-D1-C.X2.BO2.H2-D6-C.X3....;
1 2 2 2 2 2 2 2 8 8 4 8 4 4 2 2 2 2 4 2 2 2 4 2 2 2 2 1 2 2 4 4 4 2 2
2 4 4 4 8 4 4 2 4 4 4 2 4 4 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 8...

FingerprintsVector;TopologicalAtomTriplets:SYBYLAtomTypes:MinDistance1
:MaxDistance10;2332;NumericalValues;IDsAndValuesString;C.2-D1-C.2-D9-C
.3-D10 C.2-D1-C.2-D9-C.ar-D10 C.2-D1-C.3-D1-C.3-D2 C.2-D1-C.3-D10-C.3-
D9 C.2-D1-C.3-D2-C.3-D3 C.2-D1-C.3-D2-C.ar-D3 C.2-D1-C.3-D3-C.3-D4 C.2
-D1-C.3-D3-N.ar-D4 C.2-D1-C.3-D3-O.3-D2 C.2-D1-C.3-D4-C.3-D5 C.2-D1-C.
3-D5-C.3-D6 C.2-D1-C.3-D5-O.3-D4 C.2-D1-C.3-D6-C.3-D7 C.2-D1-C.3-D7...

FingerprintsVector;TopologicalPharmacophoreAtomPairs:ArbitrarySize:Min
Distance1:MaxDistance10;54;NumericalValues;IDsAndValuesString;H-D1-H H
-D1-NI HBA-D1-NI HBD-D1-NI H-D2-H H-D2-HBA H-D2-HBD HBA-D2-HBA HBA-D2-
HBD H-D3-H H-D3-HBA H-D3-HBD H-D3-NI HBA-D3-NI HBD-D3-NI H-D4-H H-D4-H
BA H-D4-HBD HBA-D4-HBA HBA-D4-HBD HBD-D4-HBD H-D5-H H-D5-HBA H-D5-...;
18 1 2 1 22 12 8 1 2 18 6 3 1 1 1 22 13 6 5 7 2 28 9 5 1 1 1 36 16 10
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
304 3 4 1 37 10 8 1 35 10 9 3 3 1 28 7 7 4 18 16 12 5 1 2 1
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
305
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
306 FingerprintsVector;TopologicalPharmacophoreAtomPairs:FixedSize:MinDist
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
307 ance1:MaxDistance10;150;OrderedNumericalValues;ValuesString;18 0 0 1 0
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
308 0 0 2 0 0 1 0 0 0 0 22 12 8 0 0 1 2 0 0 0 0 0 0 0 0 18 6 3 1 0 0 0 1
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
309 0 0 1 0 0 0 0 22 13 6 0 0 5 7 0 0 2 0 0 0 0 0 28 9 5 1 0 0 0 1 0 0 1 0
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
310 0 0 0 36 16 10 0 0 3 4 0 0 1 0 0 0 0 0 37 10 8 0 0 0 0 1 0 0 0 0 0 0
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
311 0 35 10 9 0 0 3 3 0 0 1 0 0 0 0 0 28 7 7 4 0 0 0 0 0 0 0 0 0 0 0 18...
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
312
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
313 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:ArbitrarySize:
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
314 MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesString;Ar1-
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
315 Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
316 -H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA1 H1-
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
317 HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 Ar1-...;
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
318 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
319 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
320 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ...
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
321
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
322 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
323 istance1:MaxDistance10;2692;OrderedNumericalValues;ValuesString;46 106
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
324 8 3 0 0 83 11 4 0 0 0 1 0 0 0 0 0 0 0 0 21 5 3 0 0 1 2 2 0 0 1 0 0 0
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
325 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100 101 18 11 0 0 145 132 26
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
326 14 0 0 23 28 3 3 0 0 5 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 45 10 4 0
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
327 0 16 20 7 5 1 0 3 4 5 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 5 ...
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
328
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
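The vector strings shown above appear to share a semicolon-delimited layout: a fixed tag, a type description, a size, a values type, a format name, and then either IDs followed by values (IDsAndValuesString) or values alone (ValuesString). The Python sketch below splits a string under that assumed layout. The function name parse_fingerprints_vector and the "Demo" type description are hypothetical and are not part of MayaChemTools.

```python
def parse_fingerprints_vector(text):
    """Split a FingerprintsVector string into fields (assumed layout)."""
    fields = text.strip().split(";")
    if fields[0] != "FingerprintsVector":
        raise ValueError("not a FingerprintsVector string")
    description, size, values_type, fmt = fields[1:5]
    if fmt == "IDsAndValuesString":
        # IDs and values are two space-delimited lists of equal length.
        ids = fields[5].split()
        values = [float(v) for v in fields[6].split()]
    elif fmt == "ValuesString":
        # Fixed-size vectors carry values only.
        ids = None
        values = [float(v) for v in fields[5].split()]
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return {
        "description": description,
        "size": int(size),
        "values_type": values_type,
        "ids": ids,
        "values": values,
    }


# Shortened, made-up example string in the same shape as the ones above.
example = (
    "FingerprintsVector;Demo:ArbitrarySize;3;NumericalValues;"
    "IDsAndValuesString;H-D1-H H-D1-NI HBA-D1-NI;18 1 2"
)
parsed = parse_fingerprints_vector(example)
print(parsed["ids"], parsed["values"])
```

The truncated strings in this document (those ending in "...") would not parse cleanly with this sketch, since their ID and value lists are elided.
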
OPTIONS
--alpha *number*
Value of the alpha parameter for calculating the *Tversky* similarity
coefficient specified for the -b, --BitVectorComparisonMode option. It
corresponds to the weight assigned to bits set to "1" in a pair of
fingerprint bit-vectors during the calculation of the similarity
coefficient. Possible values: *0 to 1*. Default value: <0.5>.

--beta *number*
Value of the beta parameter for calculating the *WeightedTanimoto* and
*WeightedTversky* similarity coefficients specified for the -b,
--BitVectorComparisonMode option. It balances the contribution of
bits set to "1" (weight beta) against the contribution of bits set to
"0" (weight 1 - beta) during the calculation of similarity
coefficients. Possible values: *0 to 1*. The default value of
<1> makes *WeightedTanimoto* and *WeightedTversky* equivalent to
*Tanimoto* and *Tversky*.

-b, --BitVectorComparisonMode *All |
"TanimotoSimilarity,[TverskySimilarity,...]"*
Specify which similarity coefficients to use for calculating
similarity matrices for fingerprints bit-vector strings data values
in *TextFile(s)*: calculate similarity matrices for all supported
similarity coefficients or specify a comma-delimited list of
similarity coefficients. Possible values: *All |
"TanimotoSimilarity,[TverskySimilarity,...]"*. Default:
*TanimotoSimilarity*.

*All* uses the complete list of supported similarity coefficients:
*BaroniUrbaniSimilarity, BuserSimilarity, CosineSimilarity,
DiceSimilarity, DennisSimilarity, ForbesSimilarity,
FossumSimilarity, HamannSimilarity, JacardSimilarity,
Kulczynski1Similarity, Kulczynski2Similarity, MatchingSimilarity,
McConnaugheySimilarity, OchiaiSimilarity, PearsonSimilarity,
RogersTanimotoSimilarity, RussellRaoSimilarity, SimpsonSimilarity,
SkoalSneath1Similarity, SkoalSneath2Similarity,
SkoalSneath3Similarity, TanimotoSimilarity, TverskySimilarity,
YuleSimilarity, WeightedTanimotoSimilarity,
WeightedTverskySimilarity*. These similarity coefficients are
described below.

For two fingerprint bit-vectors A and B of the same size, let:

Na = Number of bits set to "1" in A
Nb = Number of bits set to "1" in B
Nc = Number of bits set to "1" in both A and B
Nd = Number of bits set to "0" in both A and B

Nt = Number of bits set to "1" or "0" in A or B (Size of A or B)
Nt = Na + Nb - Nc + Nd

Na - Nc = Number of bits set to "1" in A but not in B
Nb - Nc = Number of bits set to "1" in B but not in A

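As an illustration of these definitions, the Python sketch below derives the five counts from two bit-vectors written as strings of "0" and "1" characters. The function name bit_counts and the string representation are assumptions made for this example only; they do not describe how the script stores bit-vectors.

```python
def bit_counts(a, b):
    """Return (Na, Nb, Nc, Nd, Nt) for two equal-length '0'/'1' strings."""
    if len(a) != len(b):
        raise ValueError("bit-vectors must have the same size")
    na = a.count("1")                                   # bits set in A
    nb = b.count("1")                                   # bits set in B
    nc = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    nd = sum(1 for x, y in zip(a, b) if x == "0" and y == "0")
    nt = len(a)                                         # size of A or B
    # Identity from the definitions: Nt = Na + Nb - Nc + Nd
    assert nt == na + nb - nc + nd
    return na, nb, nc, nd, nt


counts = bit_counts("110100", "100110")
print(counts)  # (3, 3, 2, 2, 6)
```
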
Then, various similarity coefficients [ Ref. 40 - 42 ] for a pair of
bit-vectors A and B are defined as follows:

*BaroniUrbaniSimilarity*: ( SQRT ( Nc * Nd ) + Nc ) / ( SQRT ( Nc *
Nd ) + Nc + ( Na - Nc ) + ( Nb - Nc ) ) (same as Buser)

*BuserSimilarity*: ( SQRT ( Nc * Nd ) + Nc ) / ( SQRT ( Nc * Nd ) +
Nc + ( Na - Nc ) + ( Nb - Nc ) ) (same as BaroniUrbani)

*CosineSimilarity*: Nc / SQRT ( Na * Nb ) (same as Ochiai)

*DiceSimilarity*: ( 2 * Nc ) / ( Na + Nb )

*DennisSimilarity*: ( Nc * Nd - ( ( Na - Nc ) * ( Nb - Nc ) ) ) /
SQRT ( Nt * Na * Nb )

*ForbesSimilarity*: ( Nt * Nc ) / ( Na * Nb )

*FossumSimilarity*: ( Nt * ( ( Nc - 1/2 ) ** 2 ) ) / ( Na * Nb )

*HamannSimilarity*: ( ( Nc + Nd ) - ( Na - Nc ) - ( Nb - Nc ) ) / Nt

*JaccardSimilarity*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc / (
Na + Nb - Nc ) (same as Tanimoto)

*Kulczynski1Similarity*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) ) = Nc / (
Na + Nb - 2 * Nc )

*Kulczynski2Similarity*: ( ( Nc / 2 ) * ( 2 * Nc + ( Na - Nc ) + (
Nb - Nc ) ) ) / ( ( Nc + ( Na - Nc ) ) * ( Nc + ( Nb - Nc ) ) ) = 0.5
* ( Nc / Na + Nc / Nb )

*MatchingSimilarity*: ( Nc + Nd ) / Nt

*McConnaugheySimilarity*: ( Nc ** 2 - ( Na - Nc ) * ( Nb - Nc ) ) / (
Na * Nb )

*OchiaiSimilarity*: Nc / SQRT ( Na * Nb ) (same as Cosine)

*PearsonSimilarity*: ( ( Nc * Nd ) - ( ( Na - Nc ) * ( Nb - Nc ) ) ) /
SQRT ( Na * Nb * ( Na - Nc + Nd ) * ( Nb - Nc + Nd ) )

*RogersTanimotoSimilarity*: ( Nc + Nd ) / ( ( Na - Nc ) + ( Nb - Nc )
+ Nt ) = ( Nc + Nd ) / ( Na + Nb - 2 * Nc + Nt )

*RussellRaoSimilarity*: Nc / Nt

*SimpsonSimilarity*: Nc / MIN ( Na, Nb )

*SkoalSneath1Similarity*: Nc / ( Nc + 2 * ( Na - Nc ) + 2 * ( Nb -
Nc ) ) = Nc / ( 2 * Na + 2 * Nb - 3 * Nc )

*SkoalSneath2Similarity*: ( 2 * Nc + 2 * Nd ) / ( Nc + Nd + Nt )

*SkoalSneath3Similarity*: ( Nc + Nd ) / ( ( Na - Nc ) + ( Nb - Nc )
) = ( Nc + Nd ) / ( Na + Nb - 2 * Nc )

*TanimotoSimilarity*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc /
( Na + Nb - Nc ) (same as Jaccard)

*TverskySimilarity*: Nc / ( alpha * ( Na - Nc ) + ( 1 - alpha ) * (
Nb - Nc ) + Nc ) = Nc / ( alpha * ( Na - Nb ) + Nb )

*YuleSimilarity*: ( ( Nc * Nd ) - ( ( Na - Nc ) * ( Nb - Nc ) ) ) /
( ( Nc * Nd ) + ( ( Na - Nc ) * ( Nb - Nc ) ) )

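A few of the formulas above can be transcribed directly into Python. This is a minimal sketch with hypothetical function names, taking the counts Na, Nb, Nc, Nd and Nt as plain numbers. It does not guard against division by zero (for example, two all-zero vectors), and how the script handles such cases is not stated here.

```python
from math import sqrt


def tanimoto(na, nb, nc):
    # Nc / ( Na + Nb - Nc )
    return nc / (na + nb - nc)


def dice(na, nb, nc):
    # ( 2 * Nc ) / ( Na + Nb )
    return 2 * nc / (na + nb)


def cosine(na, nb, nc):
    # Nc / SQRT ( Na * Nb )
    return nc / sqrt(na * nb)


def tversky(na, nb, nc, alpha=0.5):
    # Nc / ( alpha * ( Na - Nc ) + ( 1 - alpha ) * ( Nb - Nc ) + Nc )
    return nc / (alpha * (na - nc) + (1 - alpha) * (nb - nc) + nc)


def matching(nc, nd, nt):
    # ( Nc + Nd ) / Nt
    return (nc + nd) / nt


na, nb, nc, nd, nt = 3, 3, 2, 2, 6
print(tanimoto(na, nb, nc))   # 0.5
print(dice(na, nb, nc))
print(tversky(na, nb, nc, alpha=0.5))
```

With the Tversky form given in this document, alpha = 0.5 reduces the denominator to ( Na + Nb ) / 2, so the result coincides with the Dice coefficient.
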
Values of the Tanimoto/Jaccard and Tversky coefficients depend only on
bits that are set to "1" in A or B; bits set to "0" in both vectors
(Nd) do not enter these formulas. To take all bit positions into
account, modified versions of Tanimoto [ Ref. 42 ] and Tversky
[ Ref. 43 ] have been developed.

Let:

Na' = Number of bits set to "0" in A
Nb' = Number of bits set to "0" in B
Nc' = Number of bits set to "0" in both A and B (same as Nd)

Tanimoto': Nc' / ( ( Na' - Nc' ) + ( Nb' - Nc' ) + Nc' ) = Nc' / (
Na' + Nb' - Nc' )

Tversky': Nc' / ( alpha * ( Na' - Nc' ) + ( 1 - alpha ) * ( Nb' - Nc'
) + Nc' ) = Nc' / ( alpha * ( Na' - Nb' ) + Nb' )

Then:

*WeightedTanimotoSimilarity* = beta * Tanimoto + ( 1 - beta ) *
Tanimoto'

*WeightedTverskySimilarity* = beta * Tversky + ( 1 - beta ) * Tversky'

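The weighted form can be sketched in Python as follows, using Na' = Nt - Na, Nb' = Nt - Nb and Nc' = Nd, which follow from the definitions above. The function names are hypothetical, and the sketch does not handle a zero denominator (for example, when every bit is set in both vectors).

```python
def tanimoto(n_a, n_b, n_c):
    # Nc / ( Na + Nb - Nc ); also used for the primed ("0" bit) counts
    return n_c / (n_a + n_b - n_c)


def weighted_tanimoto(na, nb, nc, nd, nt, beta=1.0):
    """beta * Tanimoto + (1 - beta) * Tanimoto'."""
    # Primed counts describe bits set to "0".
    na0 = nt - na   # Na'
    nb0 = nt - nb   # Nb'
    nc0 = nd        # Nc'
    return beta * tanimoto(na, nb, nc) + (1 - beta) * tanimoto(na0, nb0, nc0)


# Example counts: Nt = Na + Nb - Nc + Nd = 4 + 3 - 2 + 3 = 8
na, nb, nc, nd, nt = 4, 3, 2, 3, 8
plain = weighted_tanimoto(na, nb, nc, nd, nt, beta=1.0)     # same as Tanimoto
blended = weighted_tanimoto(na, nb, nc, nd, nt, beta=0.7)
print(plain, blended)
```

Here Tanimoto is 2 / 5 and Tanimoto' is 3 / 6, so beta = 0.7 gives roughly 0.43, while beta = 1 reproduces the plain Tanimoto value.
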
-c, --ColMode *ColNum | ColLabel*
Specify how columns are identified in *TextFile(s)*: using column
number or column label. Possible values: *ColNum or ColLabel*.
Default value: *ColNum*.

--CompoundIDCol *col number | col name*
This value is specific to the -c, --ColMode option. It specifies the
input *TextFile(s)* column to use for generating compound IDs for
similarity matrices in output *TextFile(s)*. Possible values: *col
number or col label*. Default value: *first column containing the
word compoundID in its column label or sequentially generated IDs*.

--CompoundIDPrefix *text*
Specify the compound ID prefix to use during sequential generation of
compound IDs for input *SDFile(s)* and *TextFile(s)*. Default value:
*Cmpd*. The default value generates compound IDs which look like
Cmpd<Number>.

For input *SDFile(s)*, this value is used only for the *LabelPrefix |
MolNameOrLabelPrefix* values of the --CompoundIDMode option; otherwise,
it's ignored.

Example for the *LabelPrefix* or *MolNameOrLabelPrefix* value of
--CompoundIDMode:

Compound

The value specified above generates compound IDs which correspond
to Compound<Number> instead of the default Cmpd<Number>.

--CompoundIDField *DataFieldName*
Specify the input *SDFile(s)* datafield label for generating compound
IDs. This value is used only for the *DataField* value of the
--CompoundIDMode option.

Examples for the *DataField* value of --CompoundIDMode:

MolID
ExtReg

--CompoundIDMode *DataField | MolName | LabelPrefix |
MolNameOrLabelPrefix*
Specify how to generate compound IDs from input *SDFile(s)* for
similarity matrix CSV/TSV text file(s): use a *SDFile(s)* datafield
value; use the molname line from *SDFile(s)*; generate a sequential ID
with a specific prefix; use a combination of both MolName and
LabelPrefix, with LabelPrefix values used for empty molname
lines.

Possible values: *DataField | MolName | LabelPrefix |
MolNameOrLabelPrefix*. Default: *LabelPrefix*.

For the *MolNameOrLabelPrefix* value of --CompoundIDMode, the molname
line in *SDFile(s)* takes precedence over sequential compound IDs
generated using *LabelPrefix*, and only empty molname values are
replaced with sequential compound IDs.

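The four modes can be summarized with a small Python sketch. The function compound_id and its parameters are hypothetical. Returning an empty string for a missing data field is an assumption made for this example, and the script's actual behavior in that case is not described here.

```python
def compound_id(mode, index, mol_name="", data_fields=None,
                field_name=None, prefix="Cmpd"):
    """Pick a compound ID for the index-th (1-based) molecule."""
    if mode == "DataField":
        # Assumption: a missing field yields an empty ID.
        return (data_fields or {}).get(field_name, "")
    if mode == "MolName":
        return mol_name
    if mode == "LabelPrefix":
        return f"{prefix}{index}"
    if mode == "MolNameOrLabelPrefix":
        # The molname line wins; only empty names get a sequential ID.
        return mol_name if mol_name.strip() else f"{prefix}{index}"
    raise ValueError(f"unknown CompoundIDMode: {mode}")


print(compound_id("LabelPrefix", 3))                                # Cmpd3
print(compound_id("MolNameOrLabelPrefix", 2, mol_name=""))          # Cmpd2
print(compound_id("MolNameOrLabelPrefix", 2, mol_name="aspirin"))   # aspirin
print(compound_id("LabelPrefix", 1, prefix="Compound"))             # Compound1
```
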
-d, --detail *InfoLevel*
Level of information to print about lines being ignored. Default:
*1*. Possible values: *1, 2 or 3*.

-f, --fast
In this mode, fingerprints columns specified using --FingerprintsCol
for *TextFile(s)* and --FingerprintsField for *SDFile(s)* are
assumed to contain valid fingerprints data and no checking is
performed before calculating similarity matrices. By default,
fingerprints data is validated before computing pairwise similarity
and distance coefficients.

--FingerprintsCol *col number | col name*
This value is specific to the -c, --ColMode option. It specifies the
fingerprints column to use during calculation of similarity matrices
for *TextFile(s)*. Possible values: *col number or col label*. Default
value: *first column containing the word Fingerprints in its column
label*.

--FingerprintsField *FieldLabel*
Fingerprints field label to use during calculation of similarity
matrices for *SDFile(s)*. Default value: *first data field label
containing the word Fingerprints in its label*.

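The column selection rules for text files can be sketched in Python. This is an illustration only: the function resolve_column is hypothetical, column numbers are assumed to start at 1, and the default keyword match is assumed to be case-insensitive. None of these details are confirmed by this document.

```python
def resolve_column(header, col_mode="ColNum", col_spec=None,
                   keyword="Fingerprints"):
    """Return the 0-based index of the column to use."""
    if col_spec is None:
        # Default: first column whose label contains the keyword.
        for i, label in enumerate(header):
            if keyword.lower() in label.lower():
                return i
        raise ValueError(f"no column label contains {keyword!r}")
    if col_mode == "ColNum":
        index = int(col_spec) - 1   # assumption: numbering starts at 1
        if not 0 <= index < len(header):
            raise ValueError(f"column number out of range: {col_spec}")
        return index
    if col_mode == "ColLabel":
        return header.index(col_spec)   # raises ValueError if absent
    raise ValueError(f"unknown ColMode: {col_mode}")


header = ["CompoundID", "PathLengthFingerprints"]
print(resolve_column(header))                                        # 1
print(resolve_column(header, "ColNum", "1"))                         # 0
print(resolve_column(header, "ColLabel", "PathLengthFingerprints"))  # 1
```

The same lookup with keyword="compoundID" would mirror the default described for --CompoundIDCol.
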
-h, --help
Print this help message.

--InDelim *comma | semicolon*
Input delimiter for CSV *TextFile(s)*. Possible values: *comma or
semicolon*. Default value: *comma*. For TSV files, this option is
ignored and *tab* is used as a delimiter.

--InputDataMode *LoadInMemory | ScanFile*
Specify how fingerprints bit-vector or vector strings data from *SD,
FP and CSV/TSV* fingerprint file(s) is processed: retrieve, process
and load all available fingerprints data in memory; or retrieve and
process data for fingerprints one at a time. Possible values:
*LoadInMemory | ScanFile*. Default: *LoadInMemory*.

For the *LoadInMemory* value of --InputDataMode, fingerprints
bit-vector or vector strings data from the input file is retrieved,
processed, and loaded into memory all at once as fingerprints
objects for generation of similarity matrices.

For the *ScanFile* value of --InputDataMode, multiple passes over the
input fingerprints file are performed to retrieve and process
fingerprints bit-vector or vector strings data one at a time to
generate fingerprints objects used during generation of similarity
matrices. A temporary copy of the input fingerprints file is made at
the start and deleted after generating the matrices.

The *ScanFile* value of --InputDataMode allows processing of
arbitrarily large fingerprints files without any additional memory
requirement.

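The trade-off between the two modes can be illustrated with a Python sketch of pairwise iteration. It is a simplified model rather than the script's implementation: the function names are hypothetical, each record stands in for one fingerprints string, and the temporary file copy is omitted.

```python
import io


def pairs_load_in_memory(lines):
    """Read every record once, then pair them from memory."""
    items = [line.strip() for line in lines]   # whole input held in memory
    for a in items:
        for b in items:
            yield a, b


def pairs_scan_file(open_source):
    """Pair records by re-reading the input; open_source() must return
    a fresh iterator over the input each time it is called."""
    for a in open_source():
        for b in open_source():   # one extra pass per outer record
            yield a.strip(), b.strip()


data = "fp1\nfp2\nfp3\n"
in_memory = list(pairs_load_in_memory(io.StringIO(data)))
scanned = list(pairs_scan_file(lambda: io.StringIO(data)))
print(len(in_memory), in_memory == scanned)
```

Both strategies visit the same pairs in the same order. The second keeps only two records in memory at a time, at the cost of re-reading the input for every outer record.
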
583 -m, --mode *AutoDetect | FingerprintsBitVectorString |
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
584 FingerprintsVectorString*
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
585 Format of fingerprint strings data in *TextFile(s)*: automatically
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
586 detect format of fingerprints string created by MayaChemTools
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
587 fingerprints generation scripts or explicitly specify its format.
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
588 Possible values: *AutoDetect | FingerprintsBitVectorString |
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
589 FingerprintsVectorString*. Default value: *AutoDetect*.
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset

--OutDelim *comma | tab | semicolon*
Delimiter for output CSV/TSV text file(s). Possible values: *comma,
tab, or semicolon*. Default value: *comma*.

--OutMatrixFormat *RowsAndColumns | IDPairsAndValue*
Specify how similarity or distance values calculated for
fingerprints vector and bit-vector strings are written to the output
CSV/TSV text file(s): generate text files containing rows and
columns, with labels corresponding to compound IDs and each
matrix element corresponding to the similarity or distance between
the two compounds; or generate text files in which each row
contains the compound IDs of two compounds followed by the
similarity or distance value between them.

Possible values: *RowsAndColumns, or IDPairsAndValue*. Default
value: *RowsAndColumns*.

The value of --OutMatrixFormat in conjunction with --OutMatrixType
determines the type of data written to output files and allows
generation of up to 6 different output data formats:

OutMatrixFormat    OutMatrixType

RowsAndColumns     FullMatrix [ DEFAULT ]
RowsAndColumns     UpperTriangularMatrix
RowsAndColumns     LowerTriangularMatrix

IDPairsAndValue    FullMatrix
IDPairsAndValue    UpperTriangularMatrix
IDPairsAndValue    LowerTriangularMatrix

Example of data in output file for *RowsAndColumns*
--OutMatrixFormat value for *FullMatrix* value of --OutMatrixType:

"","Cmpd1","Cmpd2","Cmpd3","Cmpd4","Cmpd5","Cmpd6",... ...
"Cmpd1","1","0.04","0.25","0.13","0.11","0.2",... ...
"Cmpd2","0.04","1","0.06","0.05","0.19","0.07",... ...
"Cmpd3","0.25","0.06","1","0.12","0.22","0.25",... ...
"Cmpd4","0.13","0.05","0.12","1","0.11","0.13",... ...
"Cmpd5","0.11","0.19","0.22","0.11","1","0.17",... ...
"Cmpd6","0.2","0.07","0.25","0.13","0.17","1",... ...
... ... ..
... ... ..
... ... ..

Example of data in output file for *RowsAndColumns*
--OutMatrixFormat value for *UpperTriangularMatrix* value of
--OutMatrixType:

"","Cmpd1","Cmpd2","Cmpd3","Cmpd4","Cmpd5","Cmpd6",... ...
"Cmpd1","1","0.04","0.25","0.13","0.11","0.2",... ...
"Cmpd2","1","0.06","0.05","0.19","0.07",... ...
"Cmpd3","1","0.12","0.22","0.25",... ...
"Cmpd4","1","0.11","0.13",... ...
"Cmpd5","1","0.17",... ...
"Cmpd6","1",... ...
... ... ..
... ... ..
... ... ..

Example of data in output file for *RowsAndColumns*
--OutMatrixFormat value for *LowerTriangularMatrix* value of
--OutMatrixType:

"","Cmpd1","Cmpd2","Cmpd3","Cmpd4","Cmpd5","Cmpd6",... ...
"Cmpd1","1"
"Cmpd2","0.04","1"
"Cmpd3","0.25","0.06","1"
"Cmpd4","0.13","0.05","0.12","1"
"Cmpd5","0.11","0.19","0.22","0.11","1"
"Cmpd6","0.2","0.07","0.25","0.13","0.17","1"
... ... ..
... ... ..
... ... ..

Example of data in output file for *IDPairsAndValue*
--OutMatrixFormat value for *FullMatrix* value of --OutMatrixType:

"CmpdID1","CmpdID2","Coefficient Value"
"Cmpd1","Cmpd1","1"
"Cmpd1","Cmpd2","0.04"
"Cmpd1","Cmpd3","0.25"
"Cmpd1","Cmpd4","0.13"
... ... ...
... ... ...
... ... ...
"Cmpd2","Cmpd1","0.04"
"Cmpd2","Cmpd2","1"
"Cmpd2","Cmpd3","0.06"
"Cmpd2","Cmpd4","0.05"
... ... ...
... ... ...
... ... ...
"Cmpd3","Cmpd1","0.25"
"Cmpd3","Cmpd2","0.06"
"Cmpd3","Cmpd3","1"
"Cmpd3","Cmpd4","0.12"
... ... ...
... ... ...
... ... ...

Example of data in output file for *IDPairsAndValue*
--OutMatrixFormat value for *UpperTriangularMatrix* value of
--OutMatrixType:

"CmpdID1","CmpdID2","Coefficient Value"
"Cmpd1","Cmpd1","1"
"Cmpd1","Cmpd2","0.04"
"Cmpd1","Cmpd3","0.25"
"Cmpd1","Cmpd4","0.13"
... ... ...
... ... ...
... ... ...
"Cmpd2","Cmpd2","1"
"Cmpd2","Cmpd3","0.06"
"Cmpd2","Cmpd4","0.05"
... ... ...
... ... ...
... ... ...
"Cmpd3","Cmpd3","1"
"Cmpd3","Cmpd4","0.12"
... ... ...
... ... ...
... ... ...

Example of data in output file for *IDPairsAndValue*
--OutMatrixFormat value for *LowerTriangularMatrix* value of
--OutMatrixType:

"CmpdID1","CmpdID2","Coefficient Value"
"Cmpd1","Cmpd1","1"
"Cmpd2","Cmpd1","0.04"
"Cmpd2","Cmpd2","1"
"Cmpd3","Cmpd1","0.25"
"Cmpd3","Cmpd2","0.06"
"Cmpd3","Cmpd3","1"
"Cmpd4","Cmpd1","0.13"
"Cmpd4","Cmpd2","0.05"
"Cmpd4","Cmpd3","0.12"
"Cmpd4","Cmpd4","1"
... ... ...
... ... ...
... ... ...
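
The six layouts above differ only in which (row, column) pairs are
written. The Python sketch below is an illustration only and is not part
of MayaChemTools; the function names id_pairs and read_rows_and_columns
are hypothetical. It parses a small full RowsAndColumns matrix, taken
from the first three compounds of the example above, and lists the rows
an IDPairsAndValue file would hold for each --OutMatrixType value.

```python
import csv
import io


def read_rows_and_columns(text):
    """Parse a full RowsAndColumns CSV into (ids, matrix of strings)."""
    rows = list(csv.reader(io.StringIO(text)))
    ids = rows[0][1:]                  # header row: "" followed by compound IDs
    matrix = [row[1:] for row in rows[1:]]
    return ids, matrix


def id_pairs(ids, matrix, matrix_type="FullMatrix"):
    """Yield (id1, id2, value) tuples in the IDPairsAndValue row order."""
    n = len(ids)
    for i in range(n):
        if matrix_type == "FullMatrix":
            cols = range(n)
        elif matrix_type == "UpperTriangularMatrix":
            cols = range(i, n)         # diagonal and everything to its right
        elif matrix_type == "LowerTriangularMatrix":
            cols = range(i + 1)        # everything up to and including the diagonal
        else:
            raise ValueError(f"unknown matrix type: {matrix_type}")
        for j in cols:
            yield ids[i], ids[j], matrix[i][j]


sample = '''"","Cmpd1","Cmpd2","Cmpd3"
"Cmpd1","1","0.04","0.25"
"Cmpd2","0.04","1","0.06"
"Cmpd3","0.25","0.06","1"
'''

ids, matrix = read_rows_and_columns(sample)
full = list(id_pairs(ids, matrix, "FullMatrix"))
upper = list(id_pairs(ids, matrix, "UpperTriangularMatrix"))
lower = list(id_pairs(ids, matrix, "LowerTriangularMatrix"))

for row in lower:
    print(",".join(f'"{field}"' for field in row))
```

For N compounds the full layout has N * N rows and each triangular
layout has N * ( N + 1 ) / 2 rows, because the diagonal is included.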

--OutMatrixType *FullMatrix | UpperTriangularMatrix |
LowerTriangularMatrix*
Type of similarity or distance matrix to calculate for fingerprints
vector and bit-vector strings: calculate the full matrix; calculate
the upper triangular matrix including the diagonal; or calculate the
lower triangular matrix including the diagonal.

Possible values: *FullMatrix, UpperTriangularMatrix, or
LowerTriangularMatrix*. Default value: *FullMatrix*.

The value of --OutMatrixType in conjunction with --OutMatrixFormat
determines the type of data written to output files.

-o, --overwrite
Overwrite existing files.

-p, --precision *number*
Precision of calculated values in the output file. Default: up to
*2* decimal places. Valid values: positive integers.

-q, --quote *Yes | No*
Put quotes around column values in output CSV/TSV text file(s).
Possible values: *Yes or No*. Default value: *Yes*.

-r, --root *RootName*
New file name is generated using the root:
<Root><BitVectorComparisonMode>.<Ext> or
<Root><VectorComparisonMode><VectorComparisonFormulism>.<Ext>. The
csv and tsv <Ext> values are used for comma/semicolon and tab
delimited text files, respectively. This option is ignored for
multiple input files.
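
As an illustration of the naming rule above, the Python sketch below
builds an output file name from a root. It is not the script's own
code: the function name output_file_name and its parameters are
hypothetical, and it only follows the documented template literally.

```python
def output_file_name(root, comparison_mode, out_delim="comma", formulism=""):
    """Build <Root><ComparisonMode>[<Formulism>].<Ext> per the template above."""
    # csv is used for comma and semicolon delimiters, tsv for tab.
    ext = "tsv" if out_delim == "tab" else "csv"
    return f"{root}{comparison_mode}{formulism}.{ext}"


bit_vector_name = output_file_name("Sample", "TanimotoSimilarity")
vector_name = output_file_name(
    "Sample", "ManhattanDistance", out_delim="tab", formulism="AlgebraicForm"
)
print(bit_vector_name)
print(vector_name)
```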

-v, --VectorComparisonMode *All |
"TanimotoSimilarity,[ManhattanDistance,...]"*
Specify what similarity or distance coefficients to use for
calculating similarity matrices for fingerprint vector strings data
values in *TextFile(s)*: calculate similarity matrices for all
supported similarity and distance coefficients or specify a comma
delimited list of similarity and distance coefficients. Possible
values: *All | "TanimotoSimilarity,[ManhattanDistance,...]"*. Default:
*TanimotoSimilarity*.

The value of -v, --VectorComparisonMode, in conjunction with
--VectorComparisonFormulism, decides which type of similarity and
distance coefficient formulism gets used.

*All* uses complete list of supported similarity and distance
coefficients: *CosineSimilarity, CzekanowskiSimilarity,
DiceSimilarity, OchiaiSimilarity, JaccardSimilarity,
SorensonSimilarity, TanimotoSimilarity, CityBlockDistance,
EuclideanDistance, HammingDistance, ManhattanDistance,
SoergelDistance*. These similarity and distance coefficients are
described below.

The FingerprintsVector.pm module, used to calculate similarity and
distance coefficients, supports comparison between vectors
containing three different types of values:

Type I: OrderedNumericalValues

. The two vectors have the same size
. Vectors contain real values in a specific order. For example: MACCS keys
count, Topological pharmacophore atom pairs and so on.

Type II: UnorderedNumericalValues

. The two vectors might not have the same size
. Vectors contain unordered real values identified by value IDs. For example:
Topological atom pairs, Topological atom torsions and so on.

Type III: AlphaNumericalValues

. The two vectors might not have the same size
. Vectors contain unordered alphanumerical values. For example: Extended
connectivity fingerprints, atom neighborhood fingerprints.

Before performing similarity or distance calculations between
vectors containing UnorderedNumericalValues or AlphaNumericalValues,
the vectors are transformed into vectors containing unique
OrderedNumericalValues using value IDs for UnorderedNumericalValues
and the values themselves for AlphaNumericalValues.
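
The transformation described above can be pictured with a short Python
sketch. It is an assumption-laden illustration, not the
FingerprintsVector.pm implementation: the function names and the value
IDs idA, idB and idC are hypothetical, an ID missing from one vector is
assumed to contribute 0, and alphanumerical values are assumed to be
counted per unique value.

```python
def align_vectors(a, b):
    """Map two {value_id: number} vectors onto one shared, sorted ID order."""
    ids = sorted(set(a) | set(b))
    xa = [a.get(value_id, 0) for value_id in ids]   # 0 when the ID is absent
    xb = [b.get(value_id, 0) for value_id in ids]
    return ids, xa, xb


def counts_from_values(values):
    """For alphanumerical values, use each value as its own ID and count it."""
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    return counts


a = {"idA": 2, "idB": 1}
b = {"idA": 1, "idC": 4}
ids, xa, xb = align_vectors(a, b)
print(ids, xa, xb)
```

After alignment both vectors have the same size and order, so the
definitions for OrderedNumericalValues apply to them directly.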

Three forms of similarity and distance calculation between two
vectors, specified using --VectorComparisonFormulism option, are
supported: *AlgebraicForm, BinaryForm or SetTheoreticForm*.

For *BinaryForm*, the ordered list of processed final vector values
containing the value or count of each unique value type is simply
converted into a binary vector containing 1s and 0s corresponding to
presence or absence of values before calculating similarity or
distance between two vectors.

For two fingerprint vectors A and B of the same size containing
OrderedNumericalValues, let:

N = Number of values in A or B

Xa = Values of vector A
Xb = Values of vector B

Xai = Value of ith element in A
Xbi = Value of ith element in B

SUM = Sum over i for N values

For SetTheoreticForm of calculation between two vectors, let:

SetIntersectionXaXb = SUM ( MIN ( Xai, Xbi ) )
SetDifferenceXaXb = SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ( Xai, Xbi ) )
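
A minimal Python sketch of the two SetTheoreticForm quantities defined
above follows. The function names are hypothetical and the sample
vectors are made up for illustration.

```python
def set_intersection(xa, xb):
    """SetIntersectionXaXb = SUM ( MIN ( Xai, Xbi ) )."""
    return sum(min(a, b) for a, b in zip(xa, xb))


def set_difference(xa, xb):
    """SetDifferenceXaXb = SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ( Xai, Xbi ) )."""
    return sum(xa) + sum(xb) - set_intersection(xa, xb)


xa = [2, 1, 0, 3]
xb = [1, 0, 4, 3]
intersection = set_intersection(xa, xb)   # 1 + 0 + 0 + 3 = 4
difference = set_difference(xa, xb)       # 6 + 8 - 4 = 10
print(intersection, difference)
```

Because a + b - MIN ( a, b ) equals MAX ( a, b ), SetDifferenceXaXb as
defined here is the same as SUM ( MAX ( Xai, Xbi ) ).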

For BinaryForm of calculation between two vectors, let:

Na = Number of bits set to "1" in A = SUM ( Xai )
Nb = Number of bits set to "1" in B = SUM ( Xbi )
Nc = Number of bits set to "1" in both A and B = SUM ( Xai * Xbi )
Nd = Number of bits set to "0" in both A and B
   = SUM ( 1 - Xai - Xbi + Xai * Xbi )

N = Number of bits set to "1" or "0" in A or B = Size of A or B = Na + Nb - Nc + Nd

Additionally, for BinaryForm various values also correspond to:

Na = | Xa |
Nb = | Xb |
Nc = | SetIntersectionXaXb |
Nd = N - | SetDifferenceXaXb |

| SetDifferenceXaXb | = N - Nd = Na + Nb - Nc + Nd - Nd = Na + Nb - Nc
                      = | Xa | + | Xb | - | SetIntersectionXaXb |
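
The BinaryForm counts above can be checked with a small Python sketch.
It is illustrative only: the function name binary_counts is
hypothetical, and treating every non-zero value as a set bit is an
assumption based on the "presence or absence of values" wording.

```python
def binary_counts(xa, xb):
    """Return (Na, Nb, Nc, Nd) after converting both vectors to 1s and 0s."""
    bits_a = [1 if value else 0 for value in xa]   # assumption: non-zero means present
    bits_b = [1 if value else 0 for value in xb]
    na = sum(bits_a)
    nb = sum(bits_b)
    nc = sum(a * b for a, b in zip(bits_a, bits_b))
    nd = sum(1 - a - b + a * b for a, b in zip(bits_a, bits_b))
    return na, nb, nc, nd


xa = [2, 1, 0, 3, 0]
xb = [1, 0, 4, 3, 0]
na, nb, nc, nd = binary_counts(xa, xb)
n = na + nb - nc + nd          # N = Na + Nb - Nc + Nd
print(na, nb, nc, nd, n)
```

For these vectors N equals the vector size, and N - Nd equals
Na + Nb - Nc, matching the identities listed above.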

Various similarity and distance coefficients [ Ref 40, Ref 62, Ref
64 ] for a pair of vectors A and B in *AlgebraicForm, BinaryForm and
SetTheoreticForm* are defined as follows:

CityBlockDistance: ( same as HammingDistance and ManhattanDistance )

*AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) )

*BinaryForm*: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc

*SetTheoreticForm*: | SetDifferenceXaXb | - | SetIntersectionXaXb |
= SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )
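
The three CityBlockDistance forms above can be written out as a short
Python sketch. It is not the FingerprintsVector.pm code; the function
name is hypothetical, the vectors are made-up sample data, and non-zero
values are assumed to count as set bits in BinaryForm.

```python
def city_block_distance(xa, xb, form="AlgebraicForm"):
    """City block (Hamming, Manhattan) distance of two equal-length vectors."""
    if form == "AlgebraicForm":
        return sum(abs(a - b) for a, b in zip(xa, xb))
    if form == "BinaryForm":
        na = sum(1 for a in xa if a)
        nb = sum(1 for b in xb if b)
        nc = sum(1 for a, b in zip(xa, xb) if a and b)
        return na + nb - 2 * nc
    if form == "SetTheoreticForm":
        common = sum(min(a, b) for a, b in zip(xa, xb))
        return sum(xa) + sum(xb) - 2 * common
    raise ValueError(f"unknown form: {form}")


xa = [2, 1, 0, 3]
xb = [1, 0, 4, 3]
algebraic = city_block_distance(xa, xb, "AlgebraicForm")        # 1 + 1 + 4 + 0 = 6
binary = city_block_distance(xa, xb, "BinaryForm")              # 3 + 3 - 2 * 2 = 2
set_theoretic = city_block_distance(xa, xb, "SetTheoreticForm") # 6 + 8 - 2 * 4 = 6
print(algebraic, binary, set_theoretic)
```

The algebraic and set theoretic results agree here because
ABS ( a - b ) equals a + b - 2 * MIN ( a, b ) for every pair of values.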

CosineSimilarity: ( same as OchiaiSimilarity )

*AlgebraicForm*: SUM ( Xai * Xbi ) / SQRT ( SUM ( Xai ** 2 ) * SUM (
Xbi ** 2 ) )

*BinaryForm*: Nc / SQRT ( Na * Nb )

*SetTheoreticForm*: | SetIntersectionXaXb | / SQRT ( |Xa| * |Xb| ) =
SUM ( MIN ( Xai, Xbi ) ) / SQRT ( SUM ( Xai ) * SUM ( Xbi ) )
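
A Python sketch of the three CosineSimilarity forms above is given
below as an illustration. The function name and sample vectors are
hypothetical, non-zero values are assumed to count as set bits in
BinaryForm, and all-zero vectors (which would divide by zero) are not
handled.

```python
import math


def cosine_similarity(xa, xb, form="AlgebraicForm"):
    """Cosine (Ochiai) similarity of two equal-length numeric vectors."""
    if form == "AlgebraicForm":
        dot = sum(a * b for a, b in zip(xa, xb))
        return dot / math.sqrt(sum(a * a for a in xa) * sum(b * b for b in xb))
    if form == "BinaryForm":
        na = sum(1 for a in xa if a)
        nb = sum(1 for b in xb if b)
        nc = sum(1 for a, b in zip(xa, xb) if a and b)
        return nc / math.sqrt(na * nb)
    if form == "SetTheoreticForm":
        common = sum(min(a, b) for a, b in zip(xa, xb))
        return common / math.sqrt(sum(xa) * sum(xb))
    raise ValueError(f"unknown form: {form}")


xa = [2, 1, 0, 3]
xb = [1, 0, 4, 3]
cos_algebraic = cosine_similarity(xa, xb, "AlgebraicForm")        # 11 / SQRT ( 14 * 26 )
cos_binary = cosine_similarity(xa, xb, "BinaryForm")              # 2 / SQRT ( 3 * 3 )
cos_set_theoretic = cosine_similarity(xa, xb, "SetTheoreticForm") # 4 / SQRT ( 6 * 8 )
print(cos_algebraic, cos_binary, cos_set_theoretic)
```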

CzekanowskiSimilarity: ( same as DiceSimilarity and
SorensonSimilarity )

*AlgebraicForm*: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2 ) +
SUM ( Xbi ** 2 ) )

*BinaryForm*: 2 * Nc / ( Na + Nb )

*SetTheoreticForm*: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| ) =
2 * ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )

DiceSimilarity: ( same as CzekanowskiSimilarity and
SorensonSimilarity )

*AlgebraicForm*: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2 ) +
SUM ( Xbi ** 2 ) )

*BinaryForm*: 2 * Nc / ( Na + Nb )

*SetTheoreticForm*: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| ) =
2 * ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )
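
Since CzekanowskiSimilarity and DiceSimilarity share one set of
formulas, a single Python sketch covers both. It is an illustration
with a hypothetical function name and made-up vectors, it assumes
non-zero values count as set bits in BinaryForm, and it does not handle
all-zero vectors.

```python
def dice_similarity(xa, xb, form="AlgebraicForm"):
    """Dice (Czekanowski, Sorenson) similarity of two equal-length vectors."""
    if form == "AlgebraicForm":
        dot = sum(a * b for a, b in zip(xa, xb))
        return 2 * dot / (sum(a * a for a in xa) + sum(b * b for b in xb))
    if form == "BinaryForm":
        na = sum(1 for a in xa if a)
        nb = sum(1 for b in xb if b)
        nc = sum(1 for a, b in zip(xa, xb) if a and b)
        return 2 * nc / (na + nb)
    if form == "SetTheoreticForm":
        common = sum(min(a, b) for a, b in zip(xa, xb))
        return 2 * common / (sum(xa) + sum(xb))
    raise ValueError(f"unknown form: {form}")


xa = [2, 1, 0, 3]
xb = [1, 0, 4, 3]
dice_algebraic = dice_similarity(xa, xb, "AlgebraicForm")        # 2 * 11 / ( 14 + 26 )
dice_binary = dice_similarity(xa, xb, "BinaryForm")              # 2 * 2 / ( 3 + 3 )
dice_set_theoretic = dice_similarity(xa, xb, "SetTheoreticForm") # 2 * 4 / ( 6 + 8 )
print(dice_algebraic, dice_binary, dice_set_theoretic)
```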

EuclideanDistance:

*AlgebraicForm*: SQRT ( SUM ( ( ( Xai - Xbi ) ** 2 ) ) )

*BinaryForm*: SQRT ( ( Na - Nc ) + ( Nb - Nc ) ) = SQRT ( Na + Nb -
2 * Nc )

*SetTheoreticForm*: SQRT ( | SetDifferenceXaXb | - |
SetIntersectionXaXb | ) = SQRT ( SUM ( Xai ) + SUM ( Xbi ) - 2 * (
SUM ( MIN ( Xai, Xbi ) ) ) )
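
The EuclideanDistance forms above can be sketched in Python as follows.
This is an illustration only, with a hypothetical function name and
made-up vectors; non-zero values are assumed to count as set bits in
BinaryForm.

```python
import math


def euclidean_distance(xa, xb, form="AlgebraicForm"):
    """Euclidean distance of two equal-length numeric vectors."""
    if form == "AlgebraicForm":
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(xa, xb)))
    if form == "BinaryForm":
        na = sum(1 for a in xa if a)
        nb = sum(1 for b in xb if b)
        nc = sum(1 for a, b in zip(xa, xb) if a and b)
        return math.sqrt(na + nb - 2 * nc)
    if form == "SetTheoreticForm":
        common = sum(min(a, b) for a, b in zip(xa, xb))
        return math.sqrt(sum(xa) + sum(xb) - 2 * common)
    raise ValueError(f"unknown form: {form}")


xa = [2, 1, 0, 3]
xb = [1, 0, 4, 3]
euc_algebraic = euclidean_distance(xa, xb, "AlgebraicForm")        # SQRT ( 18 )
euc_binary = euclidean_distance(xa, xb, "BinaryForm")              # SQRT ( 2 )
euc_set_theoretic = euclidean_distance(xa, xb, "SetTheoreticForm") # SQRT ( 6 )
print(euc_algebraic, euc_binary, euc_set_theoretic)
```

In the binary and set theoretic forms the result is the square root of
the corresponding CityBlockDistance value, so for non-binary values it
generally differs from the algebraic form.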

HammingDistance: ( same as CityBlockDistance and ManhattanDistance )

*AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) )

*BinaryForm*: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc

*SetTheoreticForm*: | SetDifferenceXaXb | - | SetIntersectionXaXb |
= SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )

JaccardSimilarity: ( same as TanimotoSimilarity )

*AlgebraicForm*: SUM ( Xai * Xbi ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi
** 2 ) - SUM ( Xai * Xbi ) )

*BinaryForm*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc / ( Na +
Nb - Nc )

*SetTheoreticForm*: | SetIntersectionXaXb | / | SetDifferenceXaXb |
= SUM ( MIN ( Xai, Xbi ) ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN
( Xai, Xbi ) ) )
941
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
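A minimal Python sketch of the three JaccardSimilarity (Tanimoto) forms listed above, evaluated on count vectors. The function name jaccard_forms is hypothetical and the code is an illustration, not the tool's implementation. Na, Nb and Nc are assumed to count non-zero positions.

```python
def jaccard_forms(xa, xb):
    """Return (algebraic, binary, set-theoretic) Jaccard/Tanimoto values."""
    dot = sum(a * b for a, b in zip(xa, xb))
    algebraic = dot / (sum(a * a for a in xa) + sum(b * b for b in xb) - dot)

    # Assumption: the binary form only looks at which positions are non-zero.
    na = sum(1 for a in xa if a)
    nb = sum(1 for b in xb if b)
    nc = sum(1 for a, b in zip(xa, xb) if a and b)
    binary = nc / (na + nb - nc)

    common = sum(min(a, b) for a, b in zip(xa, xb))
    set_theoretic = common / (sum(xa) + sum(xb) - common)
    return algebraic, binary, set_theoretic

print(jaccard_forms([3, 0, 1, 2], [1, 1, 0, 2]))
```

For this pair the values are 7/13, 1/2 and 3/7 respectively.
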
ManhattanDistance: ( same as CityBlockDistance and HammingDistance )

*AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) )

*BinaryForm*: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc

*SetTheoreticForm*: | SetDifferenceXaXb | - | SetIntersectionXaXb |
= SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )

OchiaiSimilarity: ( same as CosineSimilarity )

*AlgebraicForm*: SUM ( Xai * Xbi ) / SQRT ( SUM ( Xai ** 2 )
* SUM ( Xbi ** 2 ) )

*BinaryForm*: Nc / SQRT ( Na * Nb )

*SetTheoreticForm*: | SetIntersectionXaXb | / SQRT ( |Xa| * |Xb| )
= SUM ( MIN ( Xai, Xbi ) ) / SQRT ( SUM ( Xai ) * SUM ( Xbi ) )

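The OchiaiSimilarity (cosine) forms above can be sketched in Python as follows. This is an illustration under the assumption that Na, Nb and Nc count non-zero positions. The function name ochiai_forms is hypothetical.

```python
import math

def ochiai_forms(xa, xb):
    """Return (algebraic, binary, set-theoretic) Ochiai/cosine values."""
    dot = sum(a * b for a, b in zip(xa, xb))
    algebraic = dot / math.sqrt(sum(a * a for a in xa) * sum(b * b for b in xb))

    # Assumption: the binary form counts non-zero positions only.
    na = sum(1 for a in xa if a)
    nb = sum(1 for b in xb if b)
    nc = sum(1 for a, b in zip(xa, xb) if a and b)
    binary = nc / math.sqrt(na * nb)

    common = sum(min(a, b) for a, b in zip(xa, xb))
    set_theoretic = common / math.sqrt(sum(xa) * sum(xb))
    return algebraic, binary, set_theoretic

print(ochiai_forms([3, 0, 1, 2], [1, 1, 0, 2]))
```
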
SorensonSimilarity: ( same as CzekanowskiSimilarity and DiceSimilarity )

*AlgebraicForm*: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2 )
+ SUM ( Xbi ** 2 ) )

*BinaryForm*: 2 * Nc / ( Na + Nb )

*SetTheoreticForm*: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| )
= 2 * ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )

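As a rough illustration of the SorensonSimilarity (Dice) forms above, the Python sketch below computes all three for a pair of count vectors. The function name sorenson_forms is hypothetical, and Na, Nb and Nc are assumed to count non-zero positions.

```python
def sorenson_forms(xa, xb):
    """Return (algebraic, binary, set-theoretic) Sorenson/Dice values."""
    dot = sum(a * b for a, b in zip(xa, xb))
    algebraic = 2 * dot / (sum(a * a for a in xa) + sum(b * b for b in xb))

    # Assumption: the binary form counts non-zero positions only.
    na = sum(1 for a in xa if a)
    nb = sum(1 for b in xb if b)
    nc = sum(1 for a, b in zip(xa, xb) if a and b)
    binary = 2 * nc / (na + nb)

    common = sum(min(a, b) for a, b in zip(xa, xb))
    set_theoretic = 2 * common / (sum(xa) + sum(xb))
    return algebraic, binary, set_theoretic

print(sorenson_forms([3, 0, 1, 2], [1, 1, 0, 2]))
```

For this pair the values are 0.7, 2/3 and 0.6 respectively.
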
SoergelDistance:

*AlgebraicForm*: SUM ( ABS ( Xai - Xbi ) ) / SUM ( MAX ( Xai, Xbi ) )

*BinaryForm*: 1 - Nc / ( Na + Nb - Nc )
= ( Na + Nb - 2 * Nc ) / ( Na + Nb - Nc )

*SetTheoreticForm*: ( | SetDifferenceXaXb | - | SetIntersectionXaXb | )
/ | SetDifferenceXaXb |
= ( SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) ) )
/ ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ( Xai, Xbi ) ) )

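The binary and set-theoretic SoergelDistance forms above are the complements ( 1 - value ) of the corresponding Jaccard/Tanimoto forms. The Python sketch below checks this numerically for one pair of count vectors. The function names are hypothetical, and Na, Nb and Nc are assumed to count non-zero positions.

```python
def soergel_forms(xa, xb):
    """Return (algebraic, binary, set-theoretic) Soergel distances."""
    algebraic = (sum(abs(a - b) for a, b in zip(xa, xb))
                 / sum(max(a, b) for a, b in zip(xa, xb)))

    # Assumption: the binary form counts non-zero positions only.
    na = sum(1 for a in xa if a)
    nb = sum(1 for b in xb if b)
    nc = sum(1 for a, b in zip(xa, xb) if a and b)
    binary = (na + nb - 2 * nc) / (na + nb - nc)

    common = sum(min(a, b) for a, b in zip(xa, xb))
    total = sum(xa) + sum(xb)
    set_theoretic = (total - 2 * common) / (total - common)
    return algebraic, binary, set_theoretic

def tanimoto_set_theoretic(xa, xb):
    common = sum(min(a, b) for a, b in zip(xa, xb))
    return common / (sum(xa) + sum(xb) - common)

xa = [3, 0, 1, 2]
xb = [1, 1, 0, 2]
print(soergel_forms(xa, xb), 1 - tanimoto_set_theoretic(xa, xb))
```

The algebraic and set-theoretic Soergel forms coincide (4/7 here), since ABS ( a - b ) = a + b - 2 * MIN ( a, b ) and MAX ( a, b ) = a + b - MIN ( a, b ).
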
TanimotoSimilarity: ( same as JaccardSimilarity )

*AlgebraicForm*: SUM ( Xai * Xbi ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi ** 2 )
- SUM ( Xai * Xbi ) )

*BinaryForm*: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc / ( Na + Nb - Nc )

*SetTheoreticForm*: | SetIntersectionXaXb | / | SetDifferenceXaXb |
= SUM ( MIN ( Xai, Xbi ) ) / ( SUM ( Xai ) + SUM ( Xbi )
- SUM ( MIN ( Xai, Xbi ) ) )

--VectorComparisonFormulism *All | "AlgebraicForm,[BinaryForm,SetTheoreticForm]"*
Specify the fingerprints vector comparison formulism to use for calculating
similarity and distance coefficients during -v, --VectorComparisonMode: use
all supported comparison formulisms or specify a comma delimited list of
formulisms. Possible values: *All |
"AlgebraicForm,[BinaryForm,SetTheoreticForm]"*. Default value:
*AlgebraicForm*.

*All* uses all three supported forms of vector comparison formulism for the
values of the -v, --VectorComparisonMode option.

For fingerprint vector strings containing AlphaNumericalValues data values
(ExtendedConnectivityFingerprints, AtomNeighborhoodsFingerprints and so on),
all three formulisms result in the same value during similarity and distance
calculations.

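One way to see why the three formulisms can agree for AlphaNumericalValues fingerprints: if such fingerprints are compared as presence/absence (0/1) vectors over the union of their values, then Xai ** 2 equals Xai and Xai * Xbi equals MIN ( Xai, Xbi ), so the algebraic, binary and set-theoretic expressions reduce to the same thing. That mapping is an assumption about the tool's internals, not something stated here. The Python sketch below uses hypothetical value IDs and a hypothetical function name.

```python
def tanimoto_forms(xa, xb):
    """Return (algebraic, binary, set-theoretic) Tanimoto values."""
    dot = sum(a * b for a, b in zip(xa, xb))
    algebraic = dot / (sum(a * a for a in xa) + sum(b * b for b in xb) - dot)
    na = sum(1 for a in xa if a)
    nb = sum(1 for b in xb if b)
    nc = sum(1 for a, b in zip(xa, xb) if a and b)
    binary = nc / (na + nb - nc)
    common = sum(min(a, b) for a, b in zip(xa, xb))
    set_theoretic = common / (sum(xa) + sum(xb) - common)
    return algebraic, binary, set_theoretic

# Hypothetical alphanumerical fingerprint values for two compounds.
ids_a = {"C1", "N2", "O3"}
ids_b = {"C1", "O3", "S4", "P5"}

# Assumption: compare as 0/1 vectors over the union of observed values.
universe = sorted(ids_a | ids_b)
xa = [1 if v in ids_a else 0 for v in universe]
xb = [1 if v in ids_b else 0 for v in universe]
print(tanimoto_forms(xa, xb))
```

With two shared values out of five distinct ones, every form evaluates to 2/5.
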
-w, --WorkingDir *DirName*
Location of working directory. Default: current directory.

EXAMPLES
To generate a Tanimoto similarity matrix for bit-vector strings data of
supported fingerprints in a text file, read from the column whose name
contains the Fingerprint substring, by loading all fingerprints data into
memory, and create a SampleFPHexTanimotoSimilarity.csv file containing
compound IDs retrieved from the column whose name contains the CompoundID
substring, type:

    % SimilarityMatricesFingerprints.pl -o SampleFPHex.csv

To generate a Tanimoto similarity matrix for bit-vector strings data of
supported fingerprints in an SD file, read from the data field whose label
contains the Fingerprint substring, by loading all fingerprints data into
memory, and create a SampleFPHexTanimotoSimilarity.csv file containing
sequentially generated compound IDs with the Cmpd prefix, type:

    % SimilarityMatricesFingerprints.pl -o SampleFPHex.sdf

To generate a Tanimoto similarity matrix for bit-vector strings data of
supported fingerprints in an FP file by loading all fingerprints data into
memory, and create a SampleFPHexTanimotoSimilarity.csv file along with
compound IDs retrieved from the FP file, type:

    % SimilarityMatricesFingerprints.pl -o SampleFPHex.fpf

To generate a lower triangular Tanimoto similarity matrix for bit-vector
strings data of supported fingerprints in a text file, read from the column
whose name contains the Fingerprint substring, by loading all fingerprints
data into memory, and create a SampleFPHexTanimotoSimilarity.csv file
containing compound IDs retrieved from the column whose name contains the
CompoundID substring, type:

    % SimilarityMatricesFingerprints.pl -o --InputDataMode LoadInMemory \
        --OutMatrixFormat RowsAndColumns --OutMatrixType LowerTriangularMatrix \
        SampleFPHex.csv

To generate an upper triangular Tanimoto similarity matrix for bit-vector
strings data of supported fingerprints in a text file, read from the column
whose name contains the Fingerprint substring, by loading all fingerprints
data into memory, and create a SampleFPHexTanimotoSimilarity.csv file in
IDPairsAndValue format containing compound IDs retrieved from the column
whose name contains the CompoundID substring, type:

    % SimilarityMatricesFingerprints.pl -o --InputDataMode LoadInMemory \
        --OutMatrixFormat IDPairsAndValue --OutMatrixType UpperTriangularMatrix \
        SampleFPHex.csv

To generate a full Tanimoto similarity matrix for bit-vector strings data of
supported fingerprints in a text file, read from the column whose name
contains the Fingerprint substring, by scanning the file without loading all
fingerprints data into memory, and create a
SampleFPHexTanimotoSimilarity.csv file containing compound IDs retrieved
from the column whose name contains the CompoundID substring, type:

    % SimilarityMatricesFingerprints.pl -o --InputDataMode ScanFile \
        --OutMatrixFormat RowsAndColumns --OutMatrixType FullMatrix \
        SampleFPHex.csv

To generate a lower triangular Tanimoto similarity matrix for bit-vector
strings data of supported fingerprints in a text file, read from the column
whose name contains the Fingerprint substring, by scanning the file without
loading all fingerprints data into memory, and create a
SampleFPHexTanimotoSimilarity.csv file in IDPairsAndValue format containing
compound IDs retrieved from the column whose name contains the CompoundID
substring, type:

    % SimilarityMatricesFingerprints.pl -o --InputDataMode ScanFile \
        --OutMatrixFormat IDPairsAndValue --OutMatrixType LowerTriangularMatrix \
        SampleFPHex.csv

To generate a Tanimoto similarity matrix using algebraic formulism for
vector strings data of supported fingerprints in a text file, read from the
column whose name contains the Fingerprint substring, and create a
SampleFPCountTanimotoSimilarityAlgebraicForm.csv file containing compound
IDs retrieved from the column whose name contains the CompoundID substring,
type:

    % SimilarityMatricesFingerprints.pl -o SampleFPCount.csv

To generate a Tanimoto similarity matrix using algebraic formulism for
vector strings data of supported fingerprints in an SD file, read from the
data field whose label contains the Fingerprint substring, and create a
SampleFPCountTanimotoSimilarityAlgebraicForm.csv file containing
sequentially generated compound IDs with the Cmpd prefix, type:

    % SimilarityMatricesFingerprints.pl -o SampleFPCount.sdf

To generate a Tanimoto similarity matrix using algebraic formulism for
vector strings data of supported fingerprints in an FP file, and create a
SampleFPCountTanimotoSimilarityAlgebraicForm.csv file along with compound
IDs retrieved from the FP file, type:

    % SimilarityMatricesFingerprints.pl -o SampleFPCount.fpf

To generate a Tanimoto similarity matrix for bit-vector strings data of
supported fingerprints in a text file, read from the column whose name
contains the Fingerprint substring, and create a
SampleFPHexTanimotoSimilarity.csv file in IDPairsAndValue format containing
compound IDs retrieved from the column whose name contains the CompoundID
substring, type:

    % SimilarityMatricesFingerprints.pl --OutMatrixFormat IDPairsAndValue -o SampleFPHex.csv

To generate a Tanimoto similarity matrix for bit-vector strings data of
supported fingerprints in an SD file, read from the data field whose label
contains the Fingerprint substring, and create a
SampleFPHexTanimotoSimilarity.csv file in IDPairsAndValue format containing
sequentially generated compound IDs with the Cmpd prefix, type:

    % SimilarityMatricesFingerprints.pl --OutMatrixFormat IDPairsAndValue -o SampleFPHex.sdf

To generate a Tanimoto similarity matrix for bit-vector strings data of
supported fingerprints in an FP file, and create a
SampleFPHexTanimotoSimilarity.csv file in IDPairsAndValue format along with
compound IDs retrieved from the FP file, type:

    % SimilarityMatricesFingerprints.pl --OutMatrixFormat IDPairsAndValue -o SampleFPHex.fpf

To generate a Tanimoto similarity matrix for bit-vector strings data of
supported fingerprints in an SD file, read from the data field whose label
contains the Fingerprint substring, and create a
SampleFPHexTanimotoSimilarity.csv file containing compound IDs taken from
the mol name line, type:

    % SimilarityMatricesFingerprints.pl --CompoundIDMode MolName -o SampleFPHex.sdf

To generate a Tanimoto similarity matrix for bit-vector strings data of
supported fingerprints in an SD file, read from the data field whose label
contains the Fingerprint substring, and create a
SampleFPBinTanimotoSimilarity.csv file containing compound IDs taken from
the data field named Mol_ID, type:

    % SimilarityMatricesFingerprints.pl --CompoundIDMode DataField \
        --CompoundIDField Mol_ID -o SampleFPBin.sdf

To generate similarity matrices corresponding to Buser, Dice and Tanimoto
similarity coefficients for bit-vector strings data of supported
fingerprints in a text file, read from the column whose name contains the
Fingerprint substring, and create SampleFPBin[CoefficientName]Similarity.csv
files containing compound IDs retrieved from the column whose name contains
the CompoundID substring, type:

    % SimilarityMatricesFingerprints.pl \
        -b "BuserSimilarity,DiceSimilarity,TanimotoSimilarity" -o SampleFPBin.csv

To generate similarity matrices corresponding to Buser, Dice and Tanimoto
similarity coefficients for bit-vector strings data of supported
fingerprints in an SD file, read from the data field whose label contains
the Fingerprint substring, and create
SampleFPBin[CoefficientName]Similarity.csv files containing sequentially
generated compound IDs with the Cmpd prefix, type:

    % SimilarityMatricesFingerprints.pl \
        -b "BuserSimilarity,DiceSimilarity,TanimotoSimilarity" -o SampleFPBin.sdf

To generate similarity matrices corresponding to CityBlock distance and
Tanimoto similarity coefficients using algebraic formulism for vector
strings data of supported fingerprints in a text file, read from the column
whose name contains the Fingerprint substring, and create
SampleFPCount[CoefficientName]AlgebraicForm.csv files containing compound
IDs retrieved from the column whose name contains the CompoundID substring,
type:

    % SimilarityMatricesFingerprints.pl -v "CityBlockDistance,TanimotoSimilarity" \
        -o SampleFPCount.csv

To generate similarity matrices corresponding to CityBlock distance and
Tanimoto similarity coefficients using algebraic formulism for vector
strings data of supported fingerprints in an SD file, read from the data
field whose label contains the Fingerprint substring, and create
SampleFPCount[CoefficientName]AlgebraicForm.csv files containing
sequentially generated compound IDs with the Cmpd prefix, type:

    % SimilarityMatricesFingerprints.pl -v "CityBlockDistance,TanimotoSimilarity" \
        -o SampleFPCount.sdf

To generate similarity matrices corresponding to CityBlock distance and
Tanimoto similarity coefficients using binary formulism for vector strings
data of supported fingerprints in a text file, read from the column whose
name contains the Fingerprint substring, and create
SampleFPCount[CoefficientName]Binary.csv files containing compound IDs
retrieved from the column whose name contains the CompoundID substring,
type:

    % SimilarityMatricesFingerprints.pl -v "CityBlockDistance,TanimotoSimilarity" \
        --VectorComparisonFormulism BinaryForm -o SampleFPCount.csv

1220 To generate similarity matrices corresponding to CityBlock distance
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
1221 Tanimoto similarity coefficients using binary formulism for fingerprints
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
1222 vector strings data corresponding to supported fingerprints present in a
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
1223 data field with Fingerprint substring in its label and create
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
1224 SampleFPCount[CoefficientName]Binary.csv files containing sequentially
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
1225 generated compound IDs with Cmpd prefix, type:
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
1226
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
1227 % SimilarityMatricesFingerprints.pl -v "CityBlockDistance,
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
1228 TanimotoSimilarity" --VectorComparisonFormulism BinaryForm -o
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
1229 SampleFPCount.sdf
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
1230
    To generate similarity matrices corresponding to CityBlock distance and
    Tanimoto similarity coefficients using all supported comparison
    formulisms for fingerprints vector strings data corresponding to
    supported fingerprints present in a column whose name contains the
    Fingerprint substring, and create
    SampleFPCount[CoefficientName][FormulismName].csv files containing
    compound IDs retrieved from a column whose name contains the
    CompoundID substring, type:

        % SimilarityMatricesFingerprints.pl -v "CityBlockDistance, TanimotoSimilarity"
          --VectorComparisonFormulism All -o SampleFPCount.csv

    To generate similarity matrices corresponding to CityBlock distance and
    Tanimoto similarity coefficients using all supported comparison
    formulisms for fingerprints vector strings data corresponding to
    supported fingerprints present in a data field with Fingerprint
    substring in its label, and create
    SampleFPCount[CoefficientName][FormulismName].csv files containing
    sequentially generated compound IDs with Cmpd prefix, type:

        % SimilarityMatricesFingerprints.pl -v "CityBlockDistance,TanimotoSimilarity"
          --VectorComparisonFormulism All -o SampleFPCount.sdf

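    To see why one coefficient can yield several matrices, one per
    formulism, the sketch below computes Tanimoto similarity for the same
    pair of count vectors in two non-binary ways. It assumes the common
    algebraic definition (dot products) and the common set-theoretic
    definition (sums of minima); whether these match the script's
    formulisms exactly is an assumption, and all names are hypothetical.

```python
def tanimoto_algebraic(a, b):
    """Algebraic form: sum(a*b) / (sum(a^2) + sum(b^2) - sum(a*b))."""
    ab = sum(x * y for x, y in zip(a, b))
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    denom = aa + bb - ab
    return ab / denom if denom else 0.0


def tanimoto_set_theoretic(a, b):
    """Set-theoretic form: sum(min) / (sum(a) + sum(b) - sum(min))."""
    common = sum(min(x, y) for x, y in zip(a, b))
    denom = sum(a) + sum(b) - common
    return common / denom if denom else 0.0


# Hypothetical count fingerprints of equal length.
fp1 = [2, 0, 1, 3, 0]
fp2 = [1, 1, 0, 4, 0]

print(tanimoto_algebraic(fp1, fp2))      # 14 / 18
print(tanimoto_set_theoretic(fp1, fp2))  # 4 / 8
```

    The two values differ for the same input, which is why the output
    file names carry both a coefficient name and a formulism name.
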
    To generate similarity matrices corresponding to all available
    similarity coefficients for fingerprints bit-vector strings data
    corresponding to supported fingerprints present in a column whose name
    contains the Fingerprint substring, and create
    SampleFPHex[CoefficientName].csv files containing compound IDs retrieved
    from a column whose name contains the CompoundID substring, type:

        % SimilarityMatricesFingerprints.pl -m AutoDetect --BitVectorComparisonMode
          All --alpha 0.5 --beta 0.5 -o SampleFPHex.csv

    To generate similarity matrices corresponding to all available
    similarity coefficients for fingerprints bit-vector strings data
    corresponding to supported fingerprints present in a data field with
    Fingerprint substring in its label, and create
    SampleFPHex[CoefficientName].csv files containing sequentially generated
    compound IDs with Cmpd prefix, type:

        % SimilarityMatricesFingerprints.pl -m AutoDetect --BitVectorComparisonMode
          All --alpha 0.5 --beta 0.5 -o SampleFPHex.sdf

    To generate similarity matrices corresponding to all available
    similarity and distance coefficients using all comparison formulisms
    for fingerprints vector strings data corresponding to supported
    fingerprints present in a column whose name contains the Fingerprint
    substring, and create
    SampleFPCount[CoefficientName][FormulismName].csv files containing
    compound IDs retrieved from a column whose name contains the
    CompoundID substring, type:

        % SimilarityMatricesFingerprints.pl -m AutoDetect --VectorComparisonMode
          All --VectorComparisonFormulism All -o SampleFPCount.csv

    To generate similarity matrices corresponding to all available
    similarity and distance coefficients using all comparison formulisms
    for fingerprints vector strings data corresponding to supported
    fingerprints present in a data field with Fingerprint substring in its
    label, and create SampleFPCount[CoefficientName][FormulismName].csv
    files containing sequentially generated compound IDs with Cmpd prefix,
    type:

        % SimilarityMatricesFingerprints.pl -m AutoDetect --VectorComparisonMode
          All --VectorComparisonFormulism All -o SampleFPCount.sdf

    To generate a similarity matrix corresponding to the Tanimoto similarity
    coefficient for fingerprints bit-vector strings data corresponding to
    supported fingerprints present in column number 2, and create a
    SampleFPHexTanimotoSimilarity.csv file containing compound IDs retrieved
    from column number 1, type:

        % SimilarityMatricesFingerprints.pl --ColMode ColNum --CompoundIDCol 1
          --FingerprintsCol 2 -o SampleFPHex.csv

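    For orientation, the kind of pairwise matrix described above can be
    sketched in a few lines. This sketch assumes each fingerprint has
    already been reduced to a bare hexadecimal string of equal length (any
    header fields a real fingerprints string carries are assumed stripped)
    and uses the standard bit-vector Tanimoto definition
    Nc / (Na + Nb - Nc). The compound IDs, hex values, and function names
    are hypothetical.

```python
def tanimoto_hex(fp_a, fp_b):
    """Tanimoto similarity of two equal-length hexadecimal bit strings."""
    a = int(fp_a, 16)
    b = int(fp_b, 16)
    na = bin(a).count("1")      # bits set in A
    nb = bin(b).count("1")      # bits set in B
    nc = bin(a & b).count("1")  # bits set in both
    denom = na + nb - nc
    return nc / denom if denom else 0.0


def similarity_matrix(fingerprints):
    """Map each ordered (id_i, id_j) pair to its Tanimoto similarity."""
    return {
        (id_i, id_j): tanimoto_hex(fp_i, fp_j)
        for id_i, fp_i in fingerprints.items()
        for id_j, fp_j in fingerprints.items()
    }


# Hypothetical compound IDs and 8-bit fingerprints.
fps = {"Cmpd1": "f0", "Cmpd2": "3c", "Cmpd3": "ff"}
matrix = similarity_matrix(fps)

for (row, col), value in sorted(matrix.items()):
    print(row, col, round(value, 3))
```

    The matrix is symmetric with ones on the diagonal, so a real
    implementation could compute only one triangle.
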
    To generate a similarity matrix corresponding to the Tanimoto similarity
    coefficient for fingerprints bit-vector strings data corresponding to
    supported fingerprints present in a data field named Fingerprints, and
    create a SampleFPHexTanimotoSimilarity.csv file containing compound IDs
    present in a data field named Mol_ID, type:

        % SimilarityMatricesFingerprints.pl --FingerprintsField Fingerprints
          --CompoundIDMode DataField --CompoundIDField Mol_ID -o SampleFPHex.sdf

    To generate a similarity matrix corresponding to the Tversky similarity
    coefficient for fingerprints bit-vector strings data corresponding to
    supported fingerprints present in a column named Fingerprints, and
    create a SampleFPHexTverskySimilarity.tsv file containing compound IDs
    retrieved from a column named CompoundID, type:

        % SimilarityMatricesFingerprints.pl --BitVectorComparisonMode
          TverskySimilarity --alpha 0.5 --ColMode ColLabel --CompoundIDCol
          CompoundID --FingerprintsCol Fingerprints --OutDelim Tab --quote No
          -o SampleFPHex.csv

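    The effect of the --alpha value in the command above can be
    illustrated with a small sketch. It assumes a single-parameter Tversky
    form, Nc / (alpha * (Na - Nb) + Nb), which is not stated in this
    section and should be checked against the toolkit's bit-vector
    documentation before relying on it. Under that assumption, alpha = 0.5
    reduces to the Dice coefficient. Function names and bit patterns are
    hypothetical.

```python
def bit_counts(a, b):
    """Return (Na, Nb, Nc) for two integers treated as bit vectors."""
    return bin(a).count("1"), bin(b).count("1"), bin(a & b).count("1")


def tversky_similarity(a, b, alpha):
    """Assumed single-parameter form: Nc / (alpha * (Na - Nb) + Nb)."""
    na, nb, nc = bit_counts(a, b)
    denom = alpha * (na - nb) + nb
    return nc / denom if denom else 0.0


def dice_similarity(a, b):
    """Dice coefficient: 2 * Nc / (Na + Nb)."""
    na, nb, nc = bit_counts(a, b)
    return 2 * nc / (na + nb) if (na + nb) else 0.0


# Hypothetical 8-bit fingerprints: Na = 4, Nb = 3, Nc = 2.
fp_a = 0b11110000
fp_b = 0b00111000

print(tversky_similarity(fp_a, fp_b, alpha=0.5))  # same value as Dice
print(dice_similarity(fp_a, fp_b))
print(tversky_similarity(fp_a, fp_b, alpha=1.0))  # Nc / Na
```

    Raising alpha toward 1 weights the result toward the first
    fingerprint, so the resulting matrix is generally not symmetric
    unless alpha is 0.5.
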
    To generate a similarity matrix corresponding to the Tanimoto similarity
    coefficient for fingerprints bit-vector strings data corresponding to
    supported fingerprints present in a data field with Fingerprint
    substring in its label, and create a SampleFPHexTanimotoSimilarity.csv
    file containing compound IDs from the molname line or sequentially
    generated compound IDs with Mol prefix, type:

        % SimilarityMatricesFingerprints.pl --CompoundIDMode MolNameOrLabelPrefix
          --CompoundIDPrefix Mol -o SampleFPHex.sdf

    To generate a similarity matrix corresponding to the Tanimoto similarity
    coefficient for fingerprints bit-vector strings data corresponding to
    supported fingerprints present in a data field with Fingerprint
    substring in its label, and create a SampleFPHexTanimotoSimilarity.tsv
    file containing sequentially generated compound IDs with Cmpd prefix,
    type:

        % SimilarityMatricesFingerprints.pl --OutDelim Tab --quote No -o SampleFPHex.sdf

AUTHOR
    Manish Sud <msud@san.rr.com>

SEE ALSO
    InfoFingerprintsFiles.pl, SimilaritySearchingFingerprints.pl,
    AtomNeighborhoodsFingerprints.pl, ExtendedConnectivityFingerprints.pl,
    MACCSKeysFingerprints.pl, PathLengthFingerprints.pl,
    TopologicalAtomPairsFingerprints.pl,
    TopologicalAtomTorsionsFingerprints.pl,
    TopologicalPharmacophoreAtomPairsFingerprints.pl,
    TopologicalPharmacophoreAtomTripletsFingerprints.pl

COPYRIGHT
    Copyright (C) 2015 Manish Sud. All rights reserved.

    This file is part of MayaChemTools.

    MayaChemTools is free software; you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation; either version 3 of the License, or (at
    your option) any later version.
