Mercurial > repos > deepakjadmin > mayatool3_test3
comparison mayachemtools/docs/scripts/html/SimilarityMatricesFingerprints.html @ 0:73ae111cf86f draft
Uploaded
| author | deepakjadmin |
|---|---|
| date | Wed, 20 Jan 2016 11:55:01 -0500 |
| parents | |
| children |
comparison
| -1:000000000000 | 0:73ae111cf86f |
|---|---|
| 1 <html> | |
| 2 <head> | |
| 3 <title>MayaChemTools:Documentation:SimilarityMatricesFingerprints.pl</title> | |
| 4 <meta http-equiv="content-type" content="text/html;charset=utf-8"> | |
| 5 <link rel="stylesheet" type="text/css" href="../../css/MayaChemTools.css"> | |
| 6 </head> | |
| 7 <body leftmargin="20" rightmargin="20" topmargin="10" bottommargin="10"> | |
| 8 <br/> | |
| 9 <center> | |
| 10 <a href="http://www.mayachemtools.org" title="MayaChemTools Home"><img src="../../images/MayaChemToolsLogo.gif" border="0" alt="MayaChemTools"></a> | |
| 11 </center> | |
| 12 <br/> | |
| 13 <div class="DocNav"> | |
| 14 <table width="100%" border=0 cellpadding=0 cellspacing=2> | |
| 15 <tr align="left" valign="top"><td width="33%" align="left"><a href="./SDToMolFiles.html" title="SDToMolFiles.html">Previous</a> <a href="./index.html" title="Table of Contents">TOC</a> <a href="./SimilaritySearchingFingerprints.html" title="SimilaritySearchingFingerprints.html">Next</a></td><td width="34%" align="middle"><strong>SimilarityMatricesFingerprints.pl</strong></td><td width="33%" align="right"><a href="././code/SimilarityMatricesFingerprints.html" title="View source code">Code</a> | <a href="./../pdf/SimilarityMatricesFingerprints.pdf" title="PDF US Letter Size">PDF</a> | <a href="./../pdfgreen/SimilarityMatricesFingerprints.pdf" title="PDF US Letter Size with narrow margins: www.changethemargins.com">PDFGreen</a> | <a href="./../pdfa4/SimilarityMatricesFingerprints.pdf" title="PDF A4 Size">PDFA4</a> | <a href="./../pdfa4green/SimilarityMatricesFingerprints.pdf" title="PDF A4 Size with narrow margins: www.changethemargins.com">PDFA4Green</a></td></tr> | |
| 16 </table> | |
| 17 </div> | |
| 18 <p> | |
| 19 </p> | |
| 20 <h2>NAME</h2> | |
| 21 <p>SimilarityMatricesFingerprints.pl - Calculate similarity matrices using fingerprints strings data in SD, FP and CSV/TSV text file(s)</p> | |
| 22 <p> | |
| 23 </p> | |
| 24 <h2>SYNOPSIS</h2> | |
| 25 <p>SimilarityMatricesFingerprints.pl SDFile(s) FPFile(s) TextFile(s)...</p> | |
| 26 <p>SimilarityMatricesFingerprints.pl [<strong>--alpha</strong> <em>number</em>] [<strong>--beta</strong> <em>number</em>] | |
| 27 [<strong>-b, --BitVectorComparisonMode</strong> <em>All | "TanimotoSimilarity,[ TverskySimilarity, ... ]"</em>] | |
| 28 [<strong>-c, --ColMode</strong> <em>ColNum | ColLabel</em>] [<strong>--CompoundIDCol</strong> <em>col number | col name</em>] | |
| 29 [<strong>--CompoundIDPrefix</strong> <em>text</em>] [<strong>--CompoundIDField</strong> <em>DataFieldName</em>] | |
| 30 [<strong>--CompoundIDMode</strong> <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em>] | |
| 31 [<strong>-d, --detail</strong> <em>InfoLevel</em>] [<strong>-f, --fast</strong>] [<strong>--FingerprintsCol</strong> <em>col number | col name</em>] | |
| 32 [<strong>--FingerprintsField</strong> <em>FieldLabel</em>] [<strong>-h, --help</strong>] [<strong>--InDelim</strong> <em>comma | semicolon</em>] | |
| 33 [<strong>--InputDataMode</strong> <em>LoadInMemory | ScanFile</em>] | |
| 34 [<strong>-m, --mode</strong> <em>AutoDetect | FingerprintsBitVectorString | FingerprintsVectorString</em>] | |
| 35 [<strong>--OutDelim</strong> <em>comma | tab | semicolon</em>] [<strong>--OutMatrixFormat</strong> <em>RowsAndColumns | IDPairsAndValue</em>] | |
| 36 [<strong>--OutMatrixType</strong> <em>FullMatrix | UpperTriangularMatrix | LowerTriangularMatrix</em>] | |
| 37 [<strong>-o, --overwrite</strong>] [<strong>-p, --precision</strong> <em>number</em>] | |
| 38 [<strong>-q, --quote</strong> <em>Yes | No</em>] [<strong>-r, --root</strong> <em>RootName</em>] | |
| 39 [<strong>-v, --VectorComparisonMode</strong> <em>All | "TanimotoSimilarity, [ ManhattanDistance, ...]"</em>] | |
| 40 [<strong>--VectorComparisonFormulism</strong> <em>All | "AlgebraicForm, [BinaryForm, SetTheoreticForm]"</em>] | |
| 41 [<strong>-w, --WorkingDir</strong> dirname] SDFile(s) FPFile(s) TextFile(s)...</p> | |
| 42 <p> | |
| 43 </p> | |
| 44 <h2>DESCRIPTION</h2> | |
| 45 <p>Calculate similarity matrices using fingerprint bit-vector or vector strings data in <em>SD, FP | |
| 46 and CSV/TSV</em> text file(s) and generate CSV/TSV text file(s) containing values for specified | |
| 47 similarity and distance coefficients.</p> | |
| 48 <p>The scripts SimilarityMatrixSDFiles.pl and SimilarityMatrixTextFiles.pl have been removed from the | |
| 49 current release of MayaChemTools and their functionality merged with this script.</p> | |
| 50 <p>The valid <em>SDFile</em> extensions are <em>.sdf</em> and <em>.sd</em>. All SD files in the current directory | |
| 51 can be specified either by <em>*.sdf</em> or the current directory name.</p> | |
| 52 <p>The valid <em>FPFile</em> extensions are <em>.fpf</em> and <em>.fp</em>. All FP files in the current directory | |
| 53 can be specified either by <em>*.fpf</em> or the current directory name.</p> | |
| 54 <p>The valid <em>TextFile</em> extensions are <em>.csv</em> and <em>.tsv</em> for comma/semicolon and tab | |
| 55 delimited text files respectively. All other file names are ignored. All text files in the | |
| 56 current directory can be specified by <em>*.csv</em>, <em>*.tsv</em>, or the current directory | |
| 57 name. The <strong>--indelim</strong> option determines the format of <em>TextFile(s)</em>. Any file | |
| 58 which doesn't correspond to the format indicated by <strong>--indelim</strong> option is ignored.</p> | |
| 59 <p>Example of <em>FP</em> file containing fingerprints bit-vector string data:</p> | |
| 60 <div class="OptionsBox"> | |
| 61 # | |
| 62 <br/> # Package = MayaChemTools 7.4 | |
| 63 <br/> # ReleaseDate = Oct 21, 2010 | |
| 64 <br/> # | |
| 65 <br/> # TimeStamp = Mon Mar 7 15:14:01 2011 | |
| 66 <br/> # | |
| 67 <br/> # FingerprintsStringType = FingerprintsBitVector | |
| 68 <br/> # | |
| 69 <br/> # Description = PathLengthBits:AtomicInvariantsAtomTypes:MinLength1:... | |
| 70 <br/> # Size = 1024 | |
| 71 <br/> # BitStringFormat = HexadecimalString | |
| 72 <br/> # BitsOrder = Ascending | |
| 73 <br/> # | |
| 74 <br/> Cmpd1 9c8460989ec8a49913991a6603130b0a19e8051c89184414953800cc21510... | |
| 75 <br/> Cmpd2 000000249400840040100042011001001980410c000000001010088001120... | |
| 76 <br/> ... ... | |
| 77 <br/> ... ...</div> | |
| 78 <p>Example of <em>FP</em> file containing fingerprints vector string data:</p> | |
| 79 <div class="OptionsBox"> | |
| 80 # | |
| 81 <br/> # Package = MayaChemTools 7.4 | |
| 82 <br/> # ReleaseDate = Oct 21, 2010 | |
| 83 <br/> # | |
| 84 <br/> # TimeStamp = Mon Mar 7 15:14:01 2011 | |
| 85 <br/> # | |
| 86 <br/> # FingerprintsStringType = FingerprintsVector | |
| 87 <br/> # | |
| 88 <br/> # Description = PathLengthBits:AtomicInvariantsAtomTypes:MinLength1:... | |
| 89 <br/> # VectorStringFormat = IDsAndValuesString | |
| 90 <br/> # VectorValuesType = NumericalValues | |
| 91 <br/> # | |
| 92 <br/> Cmpd1 338;C F N O C:C C:N C=O CC CF CN CO C:C:C C:C:N C:CC C:CF C:CN C: | |
| 93 <br/> N:C C:NC CC:N CC=O CCC CCN CCO CNC NC=O O=CO C:C:C:C C:C:C:N C:C:CC...; | |
| 94 <br/> 33 1 2 5 21 2 2 12 1 3 3 20 2 10 2 2 1 2 2 2 8 2 5 1 1 1 19 2 8 2 2 2 2 | |
| 95 <br/> 6 2 2 2 2 2 2 2 2 3 2 2 1 4 1 5 1 1 18 6 2 2 1 2 10 2 1 2 1 2 2 2 2 ... | |
| 96 <br/> Cmpd2 103;C N O C=N C=O CC CN CO CC=O CCC CCN CCO CNC N=CN NC=O NCN O=C | |
| 97 <br/> O C CC=O CCCC CCCN CCCO CCNC CNC=N CNC=O CNCN CCCC=O CCCCC CCCCN CC...; | |
| 98 <br/> 15 4 4 1 2 13 5 2 2 15 5 3 2 2 1 1 1 2 17 7 6 5 1 1 1 2 15 8 5 7 2 2 2 2 | |
| 99 <br/> 1 2 1 1 3 15 7 6 8 3 4 4 3 2 2 1 2 3 14 2 4 7 4 4 4 4 1 1 1 2 1 1 1 ... | |
| 100 <br/> ... ... | |
| 101 <br/> ... ...</div> | |
| 102 <p>Example of <em>SD</em> file containing fingerprints bit-vector string data:</p> | |
| 103 <div class="OptionsBox"> | |
| 104 ... ... | |
| 105 <br/> ... ... | |
| 106 <br/> $$$$ | |
| 107 <br/> ... ... | |
| 108 <br/> ... ... | |
| 109 <br/> ... ... | |
| 110 <br/> 41 44 0 0 0 0 0 0 0 0999 V2000 | |
| 111 -3.3652 1.4499 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 | |
| 112 <br/> ... ... | |
| 113 <br/> 2 3 1 0 0 0 0 | |
| 114 <br/> ... ... | |
| 115 <br/> M END | |
| 116 <br/> > <CmpdID> | |
| 117 <br/> Cmpd1</div> | |
| 118 <div class="OptionsBox"> | |
| 119 > <PathLengthFingerprints> | |
| 120 <br/> FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLengt | |
| 121 <br/> h1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a49913991a66 | |
| 122 <br/> 03130b0a19e8051c89184414953800cc2151082844a201042800130860308e8204d4028 | |
| 123 <br/> 00831048940e44281c00060449a5000ac80c894114e006321264401600846c050164462 | |
| 124 <br/> 08190410805000304a10205b0100e04c0038ba0fad0209c0ca8b1200012268b61c0026a | |
| 125 <br/> aa0660a11014a011d46</div> | |
| 126 <div class="OptionsBox"> | |
| 127 $$$$ | |
| 128 <br/> ... ... | |
| 129 <br/> ... ...</div> | |
| 130 <p>Example of CSV <em>Text</em> file containing fingerprints bit-vector string data:</p> | |
| 131 <div class="OptionsBox"> | |
| 132 "CompoundID","PathLengthFingerprints" | |
| 133 <br/> "Cmpd1","FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes | |
| 134 <br/> :MinLength1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a4 | |
| 135 <br/> 9913991a6603130b0a19e8051c89184414953800cc2151082844a20104280013086030 | |
| 136 <br/> 8e8204d402800831048940e44281c00060449a5000ac80c894114e006321264401..." | |
| 137 <br/> ... ... | |
| 138 <br/> ... ...</div> | |
| 139 <p>The current release of MayaChemTools supports the following types of fingerprint | |
| 140 bit-vector and vector strings:</p> | |
| 141 <div class="OptionsBox"> | |
| 142 FingerprintsVector;AtomNeighborhoods:AtomicInvariantsAtomTypes:MinRadi | |
| 143 <br/> us0:MaxRadius2;41;AlphaNumericalValues;ValuesString;NR0-C.X1.BO1.H3-AT | |
| 144 <br/> C1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-ATC1 NR0-C.X | |
| 145 <br/> 1.BO1.H3-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-A | |
| 146 <br/> TC1 NR0-C.X2.BO2.H2-ATC1:NR1-C.X2.BO2.H2-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2 | |
| 147 <br/> -C.X2.BO2.H2-ATC1:NR2-N.X3.BO3-ATC1:NR2-O.X1.BO1.H1-ATC1 NR0-C.X2.B...</div> | |
| 148 <div class="OptionsBox"> | |
| 149 FingerprintsVector;AtomTypesCount:AtomicInvariantsAtomTypes:ArbitraryS | |
| 150 <br/> ize;10;NumericalValues;IDsAndValuesString;C.X1.BO1.H3 C.X2.BO2.H2 C.X2 | |
| 151 <br/> .BO3.H1 C.X3.BO3.H1 C.X3.BO4 F.X1.BO1 N.X2.BO2.H1 N.X3.BO3 O.X1.BO1.H1 | |
| 152 <br/> O.X1.BO2;2 4 14 3 10 1 1 1 3 2</div> | |
| 153 <div class="OptionsBox"> | |
| 154 FingerprintsVector;AtomTypesCount:SLogPAtomTypes:ArbitrarySize;16;Nume | |
| 155 <br/> ricalValues;IDsAndValuesString;C1 C10 C11 C14 C18 C20 C21 C22 C5 CS F | |
| 156 <br/> N11 N4 O10 O2 O9;5 1 1 1 14 4 2 1 2 2 1 1 1 1 3 1</div> | |
| 157 <div class="OptionsBox"> | |
| 158 FingerprintsVector;AtomTypesCount:SLogPAtomTypes:FixedSize;67;OrderedN | |
| 159 <br/> umericalValues;IDsAndValuesString;C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C | |
| 160 <br/> 12 C13 C14 C15 C16 C17 C18 C19 C20 C21 C22 C23 C24 C25 C26 C27 CS N1 N | |
| 161 <br/> 2 N3 N4 N5 N6 N7 N8 N9 N10 N11 N12 N13 N14 NS O1 O2 O3 O4 O5 O6 O7 O8 | |
| 162 <br/> O9 O10 O11 O12 OS F Cl Br I Hal P S1 S2 S3 Me1 Me2;5 0 0 0 2 0 0 0 0 1 | |
| 163 <br/> 1 0 0 1 0 0 0 14 0 4 2 1 0 0 0 0 0 2 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0...</div> | |
| 164 <div class="OptionsBox"> | |
| 165 FingerprintsVector;EStateIndicies:ArbitrarySize;11;NumericalValues;IDs | |
| 166 <br/> AndValuesString;SaaCH SaasC SaasN SdO SdssC SsCH3 SsF SsOH SssCH2 SssN | |
| 167 <br/> H SsssCH;24.778 4.387 1.993 25.023 -1.435 3.975 14.006 29.759 -0.073 3 | |
| 168 <br/> .024 -2.270</div> | |
| 169 <div class="OptionsBox"> | |
| 170 FingerprintsVector;EStateIndicies:FixedSize;87;OrderedNumericalValues; | |
| 171 <br/> ValuesString;0 0 0 0 0 0 0 3.975 0 -0.073 0 0 24.778 -2.270 0 0 -1.435 | |
| 172 <br/> 4.387 0 0 0 0 0 0 3.024 0 0 0 0 0 0 0 1.993 0 29.759 25.023 0 0 0 0 1 | |
| 173 <br/> 4.006 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | |
| 174 <br/> 0 0 0 0 0 0 0 0 0 0 0 0 0 0</div> | |
| 175 <div class="OptionsBox"> | |
| 176 FingerprintsVector;ExtendedConnectivity:AtomicInvariantsAtomTypes:Radi | |
| 177 <br/> us2;60;AlphaNumericalValues;ValuesString;73555770 333564680 352413391 | |
| 178 <br/> 666191900 1001270906 1371674323 1481469939 1977749791 2006158649 21414 | |
| 179 <br/> 08799 49532520 64643108 79385615 96062769 273726379 564565671 85514103 | |
| 180 <br/> 5 906706094 988546669 1018231313 1032696425 1197507444 1331250018 1338 | |
| 181 <br/> 532734 1455473691 1607485225 1609687129 1631614296 1670251330 17303...</div> | |
| 182 <div class="OptionsBox"> | |
| 183 FingerprintsVector;ExtendedConnectivityCount:AtomicInvariantsAtomTypes | |
| 184 <br/> :Radius2;60;NumericalValues;IDsAndValuesString;73555770 333564680 3524 | |
| 185 <br/> 13391 666191900 1001270906 1371674323 1481469939 1977749791 2006158649 | |
| 186 <br/> 2141408799 49532520 64643108 79385615 96062769 273726379 564565671...; | |
| 187 <br/> 3 2 1 1 14 1 2 10 4 3 1 1 1 1 2 1 2 1 1 1 2 3 1 1 2 1 3 3 8 2 2 2 6 2 | |
| 188 <br/> 1 2 1 1 2 1 1 1 2 1 1 2 1 2 1 1 1 1 1 1 1 1 1 2 1 1</div> | |
| 189 <div class="OptionsBox"> | |
| 190 FingerprintsBitVector;ExtendedConnectivityBits:AtomicInvariantsAtomTyp | |
| 191 <br/> es:Radius2;1024;BinaryString;Ascending;0000000000000000000000000000100 | |
| 192 <br/> 0000000001010000000110000011000000000000100000000000000000000000100001 | |
| 193 <br/> 1000000110000000000000000000000000010011000000000000000000000000010000 | |
| 194 <br/> 0000000000000000000000000010000000000000000001000000000000000000000000 | |
| 195 <br/> 0000000000010000100001000000000000101000000000000000100000000000000...</div> | |
| 196 <div class="OptionsBox"> | |
| 197 FingerprintsVector;ExtendedConnectivity:FunctionalClassAtomTypes:Radiu | |
| 198 <br/> s2;57;AlphaNumericalValues;ValuesString;24769214 508787397 850393286 8 | |
| 199 <br/> 62102353 981185303 1231636850 1649386610 1941540674 263599683 32920567 | |
| 200 <br/> 1 571109041 639579325 683993318 723853089 810600886 885767127 90326012 | |
| 201 <br/> 7 958841485 981022393 1126908698 1152248391 1317567065 1421489994 1455 | |
| 202 <br/> 632544 1557272891 1826413669 1983319256 2015750777 2029559552 20404...</div> | |
| 203 <div class="OptionsBox"> | |
| 204 FingerprintsVector;ExtendedConnectivity:EStateAtomTypes:Radius2;62;Alp | |
| 205 <br/> haNumericalValues;ValuesString;25189973 528584866 662581668 671034184 | |
| 206 <br/> 926543080 1347067490 1738510057 1759600920 2034425745 2097234755 21450 | |
| 207 <br/> 44754 96779665 180364292 341712110 345278822 386540408 387387308 50430 | |
| 208 <br/> 1706 617094135 771528807 957666640 997798220 1158349170 1291258082 134 | |
| 209 <br/> 1138533 1395329837 1420277211 1479584608 1486476397 1487556246 1566...</div> | |
| 210 <div class="OptionsBox"> | |
| 211 FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;00000000 | |
| 212 <br/> 0000000000000000000000000000000001001000010010000000010010000000011100 | |
| 213 <br/> 0100101010111100011011000100110110000011011110100110111111111111011111 | |
| 214 <br/> 11111111111110111000</div> | |
| 215 <div class="OptionsBox"> | |
| 216 FingerprintsBitVector;MACCSKeyBits;322;BinaryString;Ascending;11101011 | |
| 217 <br/> 1110011111100101111111000111101100110000000000000011100010000000000000 | |
| 218 <br/> 0000000000000000000000000000000000000000000000101000000000000000000000 | |
| 219 <br/> 0000000000000000000000000000000000000000000000000000000000000000000000 | |
| 220 <br/> 0000000000000000000000000000000000000011000000000000000000000000000000 | |
| 221 <br/> 0000000000000000000000000000000000000000</div> | |
| 222 <div class="OptionsBox"> | |
| 223 FingerprintsVector;MACCSKeyCount;166;OrderedNumericalValues;ValuesStri | |
| 224 <br/> ng;0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | |
| 225 <br/> 0 0 0 0 0 0 0 1 0 0 3 0 0 0 0 4 0 0 2 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0 | |
| 226 <br/> 0 0 0 0 1 1 8 0 0 0 1 0 0 1 0 1 0 1 0 3 1 3 1 0 0 0 1 2 0 11 1 0 0 0 | |
| 227 <br/> 5 0 0 1 2 0 1 1 0 0 0 0 0 1 1 0 1 1 1 1 0 4 0 0 1 1 0 4 6 1 1 1 2 1 1 | |
| 228 <br/> 3 5 2 2 0 5 3 5 1 1 2 5 1 2 1 2 4 8 3 5 5 2 2 0 3 5 4 1</div> | |
| 229 <div class="OptionsBox"> | |
| 230 FingerprintsVector;MACCSKeyCount;322;OrderedNumericalValues;ValuesStri | |
| 231 <br/> ng;14 8 2 0 2 0 4 4 2 1 4 0 0 2 5 10 5 2 1 0 0 2 0 5 13 3 28 5 5 3 0 0 | |
| 232 <br/> 0 4 2 1 1 0 1 1 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 22 5 3 0 0 0 1 0 | |
| 233 <br/> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | |
| 234 <br/> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11 0 2 0 0 0 0 0 0 0 0 0 | |
| 235 <br/> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...</div> | |
| 236 <div class="OptionsBox"> | |
| 237 FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng | |
| 238 <br/> th1:MaxLength8;1024;BinaryString;Ascending;001000010011010101011000110 | |
| 239 <br/> 0100010101011000101001011100110001000010001001101000001001001001001000 | |
| 240 <br/> 0010110100000111001001000001001010100100100000000011000000101001011100 | |
| 241 <br/> 0010000001000101010100000100111100110111011011011000000010110111001101 | |
| 242 <br/> 0101100011000000010001000011000010100011101100001000001000100000000...</div> | |
| 243 <div class="OptionsBox"> | |
| 244 FingerprintsVector;PathLengthCount:AtomicInvariantsAtomTypes:MinLength | |
| 245 <br/> 1:MaxLength8;432;NumericalValues;IDsAndValuesPairsString;C.X1.BO1.H3 2 | |
| 246 <br/> C.X2.BO2.H2 4 C.X2.BO3.H1 14 C.X3.BO3.H1 3 C.X3.BO4 10 F.X1.BO1 1 N.X | |
| 247 <br/> 2.BO2.H1 1 N.X3.BO3 1 O.X1.BO1.H1 3 O.X1.BO2 2 C.X1.BO1.H3C.X3.BO3.H1 | |
| 248 <br/> 2 C.X2.BO2.H2C.X2.BO2.H2 1 C.X2.BO2.H2C.X3.BO3.H1 4 C.X2.BO2.H2C.X3.BO | |
| 249 <br/> 4 1 C.X2.BO2.H2N.X3.BO3 1 C.X2.BO3.H1:C.X2.BO3.H1 10 C.X2.BO3.H1:C....</div> | |
| 250 <div class="OptionsBox"> | |
| 251 FingerprintsVector;PathLengthCount:MMFF94AtomTypes:MinLength1:MaxLengt | |
| 252 <br/> h8;463;NumericalValues;IDsAndValuesPairsString;C5A 2 C5B 2 C=ON 1 CB 1 | |
| 253 <br/> 8 COO 1 CR 9 F 1 N5 1 NC=O 1 O=CN 1 O=CO 1 OC=O 1 OR 2 C5A:C5B 2 C5A:N | |
| 254 <br/> 5 2 C5ACB 1 C5ACR 1 C5B:C5B 1 C5BC=ON 1 C5BCB 1 C=ON=O=CN 1 C=ONNC=O 1 | |
| 255 <br/> CB:CB 18 CBF 1 CBNC=O 1 COO=O=CO 1 COOCR 1 COOOC=O 1 CRCR 7 CRN5 1 CR | |
| 256 <br/> OR 2 C5A:C5B:C5B 2 C5A:C5BC=ON 1 C5A:C5BCB 1 C5A:N5:C5A 1 C5A:N5CR ...</div> | |
| 257 <div class="OptionsBox"> | |
| 258 FingerprintsVector;TopologicalAtomPairs:AtomicInvariantsAtomTypes:MinD | |
| 259 <br/> istance1:MaxDistance10;223;NumericalValues;IDsAndValuesString;C.X1.BO1 | |
| 260 <br/> .H3-D1-C.X3.BO3.H1 C.X2.BO2.H2-D1-C.X2.BO2.H2 C.X2.BO2.H2-D1-C.X3.BO3. | |
| 261 <br/> H1 C.X2.BO2.H2-D1-C.X3.BO4 C.X2.BO2.H2-D1-N.X3.BO3 C.X2.BO3.H1-D1-...; | |
| 262 <br/> 2 1 4 1 1 10 8 1 2 6 1 2 2 1 2 1 2 2 1 2 1 5 1 10 12 2 2 1 2 1 9 1 3 1 | |
| 263 <br/> 1 1 2 2 1 3 6 1 6 14 2 2 2 3 1 3 1 8 2 2 1 3 2 6 1 2 2 5 1 3 1 23 1...</div> | |
| 264 <div class="OptionsBox"> | |
| 265 FingerprintsVector;TopologicalAtomPairs:FunctionalClassAtomTypes:MinDi | |
| 266 <br/> stance1:MaxDistance10;144;NumericalValues;IDsAndValuesString;Ar-D1-Ar | |
| 267 <br/> Ar-D1-Ar.HBA Ar-D1-HBD Ar-D1-Hal Ar-D1-None Ar.HBA-D1-None HBA-D1-NI H | |
| 268 <br/> BA-D1-None HBA.HBD-D1-NI HBA.HBD-D1-None HBD-D1-None NI-D1-None No...; | |
| 269 <br/> 23 2 1 1 2 1 1 1 1 2 1 1 7 28 3 1 3 2 8 2 1 1 1 5 1 5 24 3 3 4 2 13 4 | |
| 270 <br/> 1 1 4 1 5 22 4 4 3 1 19 1 1 1 1 1 2 2 3 1 1 8 25 4 5 2 3 1 26 1 4 1 ...</div> | |
| 271 <div class="OptionsBox"> | |
| 272 FingerprintsVector;TopologicalAtomTorsions:AtomicInvariantsAtomTypes;3 | |
| 273 <br/> 3;NumericalValues;IDsAndValuesString;C.X1.BO1.H3-C.X3.BO3.H1-C.X3.BO4- | |
| 274 <br/> C.X3.BO4 C.X1.BO1.H3-C.X3.BO3.H1-C.X3.BO4-N.X3.BO3 C.X2.BO2.H2-C.X2.BO | |
| 275 <br/> 2.H2-C.X3.BO3.H1-C.X2.BO2.H2 C.X2.BO2.H2-C.X2.BO2.H2-C.X3.BO3.H1-O...; | |
| 276 <br/> 2 2 1 1 2 2 1 1 3 4 4 8 4 2 2 6 2 2 1 2 1 1 2 1 1 2 6 2 4 2 1 3 1</div> | |
| 277 <div class="OptionsBox"> | |
| 278 FingerprintsVector;TopologicalAtomTorsions:EStateAtomTypes;36;Numerica | |
| 279 <br/> lValues;IDsAndValuesString;aaCH-aaCH-aaCH-aaCH aaCH-aaCH-aaCH-aasC aaC | |
| 280 <br/> H-aaCH-aasC-aaCH aaCH-aaCH-aasC-aasC aaCH-aaCH-aasC-sF aaCH-aaCH-aasC- | |
| 281 <br/> ssNH aaCH-aasC-aasC-aasC aaCH-aasC-aasC-aasN aaCH-aasC-ssNH-dssC a...; | |
| 282 <br/> 4 4 8 4 2 2 6 2 2 2 4 3 2 1 3 3 2 2 2 1 2 1 1 1 2 1 1 1 1 1 1 1 2 1 1 2</div> | |
| 283 <div class="OptionsBox"> | |
| 284 FingerprintsVector;TopologicalAtomTriplets:AtomicInvariantsAtomTypes:M | |
| 285 <br/> inDistance1:MaxDistance10;3096;NumericalValues;IDsAndValuesString;C.X1 | |
| 286 <br/> .BO1.H3-D1-C.X1.BO1.H3-D1-C.X3.BO3.H1-D2 C.X1.BO1.H3-D1-C.X2.BO2.H2-D1 | |
| 287 <br/> 0-C.X3.BO4-D9 C.X1.BO1.H3-D1-C.X2.BO2.H2-D3-N.X3.BO3-D4 C.X1.BO1.H3-D1 | |
| 288 <br/> -C.X2.BO2.H2-D4-C.X2.BO2.H2-D5 C.X1.BO1.H3-D1-C.X2.BO2.H2-D6-C.X3....; | |
| 289 <br/> 1 2 2 2 2 2 2 2 8 8 4 8 4 4 2 2 2 2 4 2 2 2 4 2 2 2 2 1 2 2 4 4 4 2 2 | |
| 290 <br/> 2 4 4 4 8 4 4 2 4 4 4 2 4 4 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 8...</div> | |
| 291 <div class="OptionsBox"> | |
| 292 FingerprintsVector;TopologicalAtomTriplets:SYBYLAtomTypes:MinDistance1 | |
| 293 <br/> :MaxDistance10;2332;NumericalValues;IDsAndValuesString;C.2-D1-C.2-D9-C | |
| 294 <br/> .3-D10 C.2-D1-C.2-D9-C.ar-D10 C.2-D1-C.3-D1-C.3-D2 C.2-D1-C.3-D10-C.3- | |
| 295 <br/> D9 C.2-D1-C.3-D2-C.3-D3 C.2-D1-C.3-D2-C.ar-D3 C.2-D1-C.3-D3-C.3-D4 C.2 | |
| 296 <br/> -D1-C.3-D3-N.ar-D4 C.2-D1-C.3-D3-O.3-D2 C.2-D1-C.3-D4-C.3-D5 C.2-D1-C. | |
| 297 <br/> 3-D5-C.3-D6 C.2-D1-C.3-D5-O.3-D4 C.2-D1-C.3-D6-C.3-D7 C.2-D1-C.3-D7...</div> | |
| 298 <div class="OptionsBox"> | |
| 299 FingerprintsVector;TopologicalPharmacophoreAtomPairs:ArbitrarySize:Min | |
| 300 <br/> Distance1:MaxDistance10;54;NumericalValues;IDsAndValuesString;H-D1-H H | |
| 301 <br/> -D1-NI HBA-D1-NI HBD-D1-NI H-D2-H H-D2-HBA H-D2-HBD HBA-D2-HBA HBA-D2- | |
| 302 <br/> HBD H-D3-H H-D3-HBA H-D3-HBD H-D3-NI HBA-D3-NI HBD-D3-NI H-D4-H H-D4-H | |
| 303 <br/> BA H-D4-HBD HBA-D4-HBA HBA-D4-HBD HBD-D4-HBD H-D5-H H-D5-HBA H-D5-...; | |
| 304 <br/> 18 1 2 1 22 12 8 1 2 18 6 3 1 1 1 22 13 6 5 7 2 28 9 5 1 1 1 36 16 10 | |
| 305 <br/> 3 4 1 37 10 8 1 35 10 9 3 3 1 28 7 7 4 18 16 12 5 1 2 1</div> | |
| 306 <div class="OptionsBox"> | |
| 307 FingerprintsVector;TopologicalPharmacophoreAtomPairs:FixedSize:MinDist | |
| 308 <br/> ance1:MaxDistance10;150;OrderedNumericalValues;ValuesString;18 0 0 1 0 | |
| 309 <br/> 0 0 2 0 0 1 0 0 0 0 22 12 8 0 0 1 2 0 0 0 0 0 0 0 0 18 6 3 1 0 0 0 1 | |
| 310 <br/> 0 0 1 0 0 0 0 22 13 6 0 0 5 7 0 0 2 0 0 0 0 0 28 9 5 1 0 0 0 1 0 0 1 0 | |
| 311 <br/> 0 0 0 36 16 10 0 0 3 4 0 0 1 0 0 0 0 0 37 10 8 0 0 0 0 1 0 0 0 0 0 0 | |
| 312 <br/> 0 35 10 9 0 0 3 3 0 0 1 0 0 0 0 0 28 7 7 4 0 0 0 0 0 0 0 0 0 0 0 18...</div> | |
| 313 <div class="OptionsBox"> | |
| 314 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:ArbitrarySize: | |
| 315 <br/> MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesString;Ar1- | |
| 316 <br/> Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1 | |
| 317 <br/> -H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA1 H1- | |
| 318 <br/> HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 Ar1-...; | |
| 319 <br/> 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23 | |
| 320 <br/> 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1 | |
| 321 <br/> 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ...</div> | |
| 322 <div class="OptionsBox"> | |
| 323 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD | |
| 324 <br/> istance1:MaxDistance10;2692;OrderedNumericalValues;ValuesString;46 106 | |
| 325 <br/> 8 3 0 0 83 11 4 0 0 0 1 0 0 0 0 0 0 0 0 21 5 3 0 0 1 2 2 0 0 1 0 0 0 | |
| 326 <br/> 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100 101 18 11 0 0 145 132 26 | |
| 327 <br/> 14 0 0 23 28 3 3 0 0 5 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 45 10 4 0 | |
| 328 <br/> 0 16 20 7 5 1 0 3 4 5 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 5 ...</div> | |
| 329 <p> | |
| 330 </p> | |
| 331 <h2>OPTIONS</h2> | |
| 332 <dl> | |
| 333 <dt><strong><strong>--alpha</strong> <em>number</em></strong></dt> | |
| 334 <dd> | |
| 335 <p>Value of alpha parameter for calculating <em>Tversky</em> similarity coefficient specified for | |
| 336 <strong>-b, --BitVectorComparisonMode</strong> option. It corresponds to weights assigned for bits set | |
| 337 to "1" in a pair of fingerprint bit-vectors during the calculation of similarity coefficient. Possible | |
| 338 values: <em>0 to 1</em>. Default value: <0.5>.</p> | |
| 339 </dd> | |
| 340 <dt><strong><strong>--beta</strong> <em>number</em></strong></dt> | |
| 341 <dd> | |
| 342 <p>Value of beta parameter for calculating <em>WeightedTanimoto</em> and <em>WeightedTversky</em> | |
| 343 similarity coefficients specified for <strong>-b, --BitVectorComparisonMode</strong> option. It is used to | |
| 344 weight the contributions of bits set to "0" during the calculation of similarity coefficients. Possible | |
| 345 values: <em>0 to 1</em>. Default value of <1> makes <em>WeightedTanimoto</em> and <em>WeightedTversky</em> | |
| 346 equivalent to <em>Tanimoto</em> and <em>Tversky</em>.</p> | |
| 347 </dd> | |
| 348 <dt><strong><strong>-b, --BitVectorComparisonMode</strong> <em>All | "TanimotoSimilarity,[TverskySimilarity,...]"</em></strong></dt> | |
| 349 <dd> | |
| 350 <p>Specify what similarity coefficients to use for calculating similarity matrices for fingerprints bit-vector | |
| 351 strings data values in <em>TextFile(s)</em>: calculate similarity matrices for all supported similarity | |
| 352 coefficients or specify a comma delimited list of similarity coefficients. Possible values: | |
| 353 <em>All | "TanimotoSimilarity,[TverskySimilarity,...]"</em>. Default: <em>TanimotoSimilarity</em>.</p> | |
| 354 <p><em>All</em> uses complete list of supported similarity coefficients: <em>BaroniUrbaniSimilarity, BuserSimilarity, | |
| 355 CosineSimilarity, DiceSimilarity, DennisSimilarity, ForbesSimilarity, FossumSimilarity, HamannSimilarity, JacardSimilarity, | |
| 356 Kulczynski1Similarity, Kulczynski2Similarity, MatchingSimilarity, McConnaugheySimilarity, OchiaiSimilarity, | |
| 357 PearsonSimilarity, RogersTanimotoSimilarity, RussellRaoSimilarity, SimpsonSimilarity, SkoalSneath1Similarity, | |
| 358 SkoalSneath2Similarity, SkoalSneath3Similarity, TanimotoSimilarity, TverskySimilarity, YuleSimilarity, | |
| 359 WeightedTanimotoSimilarity, WeightedTverskySimilarity</em>. These similarity coefficients are described below.</p> | |
| 360 <p>For two fingerprint bit-vectors A and B of same size, let:</p> | |
| 361 <div class="OptionsBox"> | |
| 362 Na = Number of bits set to "1" in A | |
| 363 <br/> Nb = Number of bits set to "1" in B | |
| 364 <br/> Nc = Number of bits set to "1" in both A and B | |
| 365 <br/> Nd = Number of bits set to "0" in both A and B</div> | |
| 366 <div class="OptionsBox"> | |
| 367 Nt = Number of bits set to "1" or "0" in A or B (Size of A or B) | |
| 368 <br/> Nt = Na + Nb - Nc + Nd</div> | |
| 369 <div class="OptionsBox"> | |
| 370 Na - Nc = Number of bits set to "1" in A but not in B | |
| 371 <br/> Nb - Nc = Number of bits set to "1" in B but not in A</div> | |
| 372 <p>Then, various similarity coefficients [ Ref. 40 - 42 ] for a pair of bit-vectors A and B are | |
| 373 defined as follows:</p> | |
| 374 <p><em>BaroniUrbaniSimilarity</em>: ( SQRT( Nc * Nd ) + Nc ) / ( SQRT ( Nc * Nd ) + Nc + ( Na - Nc ) + ( Nb - Nc ) ) ( same as Buser )</p> | |
| 375 <p><em>BuserSimilarity</em>: ( SQRT ( Nc * Nd ) + Nc ) / ( SQRT ( Nc * Nd ) + Nc + ( Na - Nc ) + ( Nb - Nc ) ) ( same as BaroniUrbani )</p> | |
| 376 <p><em>CosineSimilarity</em>: Nc / SQRT ( Na * Nb ) (same as Ochiai)</p> | |
| 377 <p><em>DiceSimilarity</em>: (2 * Nc) / ( Na + Nb )</p> | |
| 378 <p><em>DennisSimilarity</em>: ( Nc * Nd - ( ( Na - Nc ) * ( Nb - Nc ) ) ) / SQRT ( Nt * Na * Nb)</p> | |
| 379 <p><em>ForbesSimilarity</em>: ( Nt * Nc ) / ( Na * Nb )</p> | |
| 380 <p><em>FossumSimilarity</em>: ( Nt * ( ( Nc - 1/2 ) ** 2 ) ) / ( Na * Nb )</p> | |
| 381 <p><em>HamannSimilarity</em>: ( ( Nc + Nd ) - ( Na - Nc ) - ( Nb - Nc ) ) / Nt</p> | |
| 382 <p><em>JaccardSimilarity</em>: Nc / ( ( Na - Nc) + ( Nb - Nc ) + Nc ) = Nc / ( Na + Nb - Nc ) (same as Tanimoto)</p> | |
| 383 <p><em>Kulczynski1Similarity</em>: Nc / ( ( Na - Nc ) + ( Nb - Nc) ) = Nc / ( Na + Nb - 2Nc )</p> | |
| 384 <p><em>Kulczynski2Similarity</em>: ( ( Nc / 2 ) * ( 2 * Nc + ( Na - Nc ) + ( Nb - Nc) ) ) / ( ( Nc + ( Na - Nc ) ) * ( Nc + ( Nb - Nc ) ) ) = 0.5 * ( Nc / Na + Nc / Nb )</p> | |
| 385 <p><em>MatchingSimilarity</em>: ( Nc + Nd ) / Nt</p> | |
| 386 <p><em>McConnaugheySimilarity</em>: ( Nc ** 2 - ( Na - Nc ) * ( Nb - Nc) ) / ( Na * Nb )</p> | |
| 387 <p><em>OchiaiSimilarity</em>: Nc / SQRT ( Na * Nb ) (same as Cosine)</p> | |
| 388 <p><em>PearsonSimilarity</em>: ( ( Nc * Nd ) - ( ( Na - Nc ) * ( Nb - Nc ) ) ) / SQRT ( Na * Nb * ( Na - Nc + Nd ) * ( Nb - Nc + Nd ) )</p> | |
| 389 <p><em>RogersTanimotoSimilarity</em>: ( Nc + Nd ) / ( ( Na - Nc) + ( Nb - Nc) + Nt) = ( Nc + Nd ) / ( Na + Nb - 2Nc + Nt)</p> | |
| 390 <p><em>RussellRaoSimilarity</em>: Nc / Nt</p> | |
| 391 <p><em>SimpsonSimilarity</em>: Nc / MIN ( Na, Nb)</p> | |
| 392 <p><em>SkoalSneath1Similarity</em>: Nc / ( Nc + 2 * ( Na - Nc) + 2 * ( Nb - Nc) ) = Nc / ( 2 * Na + 2 * Nb - 3 * Nc )</p> | |
| 393 <p><em>SkoalSneath2Similarity</em>: ( 2 * Nc + 2 * Nd ) / ( Nc + Nd + Nt )</p> | |
| 394 <p><em>SkoalSneath3Similarity</em>: ( Nc + Nd ) / ( ( Na - Nc ) + ( Nb - Nc ) ) = ( Nc + Nd ) / ( Na + Nb - 2 * Nc )</p> | |
| 395 <p><em>TanimotoSimilarity</em>: Nc / ( ( Na - Nc) + ( Nb - Nc ) + Nc ) = Nc / ( Na + Nb - Nc ) (same as Jaccard)</p> | |
| 396 <p><em>TverskySimilarity</em>: Nc / ( alpha * ( Na - Nc ) + ( 1 - alpha) * ( Nb - Nc) + Nc ) = Nc / ( alpha * ( Na - Nb ) + Nb)</p> | |
| 397 <p><em>YuleSimilarity</em>: ( ( Nc * Nd ) - ( ( Na - Nc ) * ( Nb - Nc ) ) ) / ( ( Nc * Nd ) + ( ( Na - Nc ) * ( Nb - Nc ) ) )</p> | |
| 398 <p>Values of Tanimoto/Jaccard and Tversky coefficients depend only on those bits which | |
| 399 are set to "1" in A or B; bits set to "0" in both are ignored. In order to take into account all bit positions, modified versions | |
| 400 of Tanimoto [ Ref. 42 ] and Tversky [ Ref. 43 ] have been developed.</p> | |
| 401 <p>Let:</p> | |
| 402 <div class="OptionsBox"> | |
| 403 Na' = Number of bits set to "0" in A | |
| 404 <br/> Nb' = Number of bits set to "0" in B | |
| 405 <br/> Nc' = Number of bits set to "0" in both A and B</div> | |
| 406 <p>Tanimoto': Nc' / ( ( Na' - Nc') + ( Nb' - Nc' ) + Nc' ) = Nc' / ( Na' + Nb' - Nc' )</p> | |
| 407 <p>Tversky': Nc' / ( alpha * ( Na' - Nc' ) + ( 1 - alpha) * ( Nb' - Nc' ) + Nc' ) = Nc' / ( alpha * ( Na' - Nb' ) + Nb')</p> | |
| 408 <p>Then:</p> | |
| 409 <p><em>WeightedTanimotoSimilarity</em> = beta * Tanimoto + (1 - beta) * Tanimoto'</p> | |
| 410 <p><em>WeightedTverskySimilarity</em> = beta * Tversky + (1 - beta) * Tversky'</p> | |
| 411 </dd> | |
| 412 <dt><strong><strong>-c, --ColMode</strong> <em>ColNum | ColLabel</em></strong></dt> | |
| 413 <dd> | |
| 414 <p>Specify how columns are identified in <em>TextFile(s)</em>: using column number or column | |
| 415 label. Possible values: <em>ColNum or ColLabel</em>. Default value: <em>ColNum</em>.</p> | |
| 416 </dd> | |
| 417 <dt><strong><strong>--CompoundIDCol</strong> <em>col number | col name</em></strong></dt> | |
| 418 <dd> | |
| 419 <p>This value is <strong>-c, --ColMode</strong> mode specific. It specifies input <em>TextFile(s)</em> column to use for | |
| 420 generating compound ID for similarity matrices in output <em>TextFile(s)</em>. Possible values: <em>col number | |
| 421 or col label</em>. Default value: <em>first column containing the word compoundID in its column label or sequentially | |
| 422 generated IDs</em>.</p> | |
| 423 </dd> | |
| 424 <dt><strong><strong>--CompoundIDPrefix</strong> <em>text</em></strong></dt> | |
| 425 <dd> | |
| 426 <p>Specify compound ID prefix to use during sequential generation of compound IDs for input <em>SDFile(s)</em> | |
| 427 and <em>TextFile(s)</em>. Default value: <em>Cmpd</em>. The default value generates compound IDs which look | |
| 428 like Cmpd&lt;Number&gt;.</p> | |
| 429 <p>For input <em>SDFile(s)</em>, this value is only used during <em>LabelPrefix | MolNameOrLabelPrefix</em> values | |
| 430 of <strong>--CompoundIDMode</strong> option; otherwise, it's ignored.</p> | |
| 431 <p>Examples for <em>LabelPrefix</em> or <em>MolNameOrLabelPrefix</em> value of <strong>--CompoundIDMode</strong>:</p> | |
| 432 <div class="OptionsBox"> | |
| 433 Compound</div> | |
| 434 <p>The value specified above generates compound IDs which correspond to Compound&lt;Number&gt; | |
| 435 instead of the default value of Cmpd&lt;Number&gt;.</p> | |
| 436 </dd> | |
| 437 <dt><strong><strong>--CompoundIDField</strong> <em>DataFieldName</em></strong></dt> | |
| 438 <dd> | |
| 439 <p>Specify input <em>SDFile(s)</em> datafield label for generating compound IDs. This value is only used | |
| 440 during <em>DataField</em> value of <strong>--CompoundIDMode</strong> option.</p> | |
| 441 <p>Examples for <em>DataField</em> value of <strong>--CompoundIDMode</strong>:</p> | |
| 442 <div class="OptionsBox"> | |
| 443 MolID | |
| 444 <br/> ExtReg</div> | |
| 445 </dd> | |
| 446 <dt><strong><strong>--CompoundIDMode</strong> <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em></strong></dt> | |
| 447 <dd> | |
| 448 <p>Specify how to generate compound IDs from input <em>SDFile(s)</em> for similarity matrix CSV/TSV text | |
| 449 file(s): use a <em>SDFile(s)</em> datafield value; use molname line from <em>SDFile(s)</em>; generate a sequential ID | |
| 450 with specific prefix; use combination of both MolName and LabelPrefix with usage of LabelPrefix values | |
| 451 for empty molname lines.</p> | |
| 452 <p>Possible values: <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em>. | |
| 453 Default: <em>LabelPrefix</em>.</p> | |
| 454 <p>For <em>MolNameOrLabelPrefix</em> value of <strong>--CompoundIDMode</strong>, molname line in <em>SDFile(s)</em> takes | |
| 455 precedence over sequential compound IDs generated using <em>LabelPrefix</em> and only empty molname | |
| 456 values are replaced with sequential compound IDs.</p> | |
| 457 </dd> | |
| 458 <dt><strong><strong>-d, --detail</strong> <em>InfoLevel</em></strong></dt> | |
| 459 <dd> | |
| 460 <p>Level of information to print about lines being ignored. Default: <em>1</em>. Possible values: | |
| 461 <em>1, 2 or 3</em>.</p> | |
| 462 </dd> | |
| 463 <dt><strong><strong>-f, --fast</strong></strong></dt> | |
| 464 <dd> | |
| 465 <p>In this mode, fingerprints columns specified using <strong>--FingerprintsCol</strong> for <em>TextFile(s)</em> and | |
| 466 <strong>--FingerprintsField</strong> for <em>SDFile(s)</em> are assumed to contain valid fingerprints data and no | |
| 467 checking is performed before calculating similarity matrices. By default, fingerprints data is | |
| 468 validated before computing pairwise similarity and distance coefficients.</p> | |
| 469 </dd> | |
| 470 <dt><strong><strong>--FingerprintsCol</strong> <em>col number | col name</em></strong></dt> | |
| 471 <dd> | |
| 472 <p>This value is <strong>-c, --ColMode</strong> specific. It specifies fingerprints column to use during | |
| 473 calculation of similarity matrices for <em>TextFile(s)</em>. Possible values: <em>col number or col label</em>. | |
| 474 Default value: <em>first column containing the word Fingerprints in its column label</em>.</p> | |
| 475 </dd> | |
| 476 <dt><strong><strong>--FingerprintsField</strong> <em>FieldLabel</em></strong></dt> | |
| 477 <dd> | |
| 478 <p>Fingerprints field label to use during calculation of similarity matrices for <em>SDFile(s)</em>. | |
| 479 Default value: <em>first data field label containing the word Fingerprints in its label</em>.</p> | |
| 480 </dd> | |
| 481 <dt><strong><strong>-h, --help</strong></strong></dt> | |
| 482 <dd> | |
| 483 <p>Print this help message.</p> | |
| 484 </dd> | |
| 485 <dt><strong><strong>--InDelim</strong> <em>comma | semicolon</em></strong></dt> | |
| 486 <dd> | |
| 487 <p>Input delimiter for CSV <em>TextFile(s)</em>. Possible values: <em>comma or semicolon</em>. | |
| 488 Default value: <em>comma</em>. For TSV files, this option is ignored and <em>tab</em> is used as a | |
| 489 delimiter.</p> | |
| 490 </dd> | |
| 491 <dt><strong><strong>--InputDataMode</strong> <em>LoadInMemory | ScanFile</em></strong></dt> | |
| 492 <dd> | |
| 493 <p>Specify how fingerprints bit-vector or vector strings data from <em>SD, FP and CSV/TSV</em> | |
| 494 fingerprint file(s) is processed: Retrieve, process and load all available fingerprints | |
| 495 data in memory; Retrieve and process data for fingerprints one at a time. Possible values: | |
| 496 <em>LoadInMemory | ScanFile</em>. Default: <em>LoadInMemory</em>.</p> | |
| 497 <p>During <em>LoadInMemory</em> value of <strong>--InputDataMode</strong>, fingerprints bit-vector or vector | |
| 498 strings data from input file is retrieved, processed, and loaded into memory all at once | |
| 499 as fingerprints objects for generation of similarity matrices.</p> | |
| 500 <p>During <em>ScanFile</em> value of <strong>--InputDataMode</strong>, multiple passes over the input fingerprints | |
| 501 file are performed to retrieve and process fingerprints bit-vector or vector strings data one at | |
| 502 a time to generate fingerprints objects used during generation of similarity matrices. A temporary | |
| 503 copy of the input fingerprints file is made at the start and deleted after generating the matrices.</p> | |
| 504 <p><em>ScanFile</em> value of <strong>--InputDataMode</strong> allows processing of arbitrarily large fingerprints files | |
| 505 without any additional memory requirement.</p> | |
| 506 </dd> | |
| 507 <dt><strong><strong>-m, --mode</strong> <em>AutoDetect | FingerprintsBitVectorString | FingerprintsVectorString</em></strong></dt> | |
| 508 <dd> | |
| 509 <p>Format of fingerprint strings data in <em>TextFile(s)</em>: automatically detect format of fingerprints | |
| 510 string created by MayaChemTools fingerprints generation scripts or explicitly specify its format. | |
| 511 Possible values: <em>AutoDetect | FingerprintsBitVectorString | FingerprintsVectorString</em>. Default | |
| 512 value: <em>AutoDetect</em>.</p> | |
| 513 </dd> | |
| 514 <dt><strong><strong>--OutDelim</strong> <em>comma | tab | semicolon</em></strong></dt> | |
| 515 <dd> | |
| 516 <p>Delimiter for output CSV/TSV text file(s). Possible values: <em>comma, tab, or semicolon</em>. | |
| 517 Default value: <em>comma</em>.</p> | |
| 518 </dd> | |
| 519 <dt><strong><strong>--OutMatrixFormat</strong> <em>RowsAndColumns | IDPairsAndValue</em></strong></dt> | |
| 520 <dd> | |
| 521 <p>Specify how similarity or distance values calculated for fingerprints vector and bit-vector strings | |
| 522 are written to the output CSV/TSV text file(s): Generate text files containing rows and columns | |
| 523 with their labels corresponding to compound IDs and each matrix element value corresponding to | |
| 524 similarity or distance between corresponding compounds; Generate text files containing rows containing | |
| 525 compoundIDs for two compounds followed by similarity or distance value between these compounds.</p> | |
| 526 <p>Possible values: <em>RowsAndColumns, or IDPairsAndValue</em>. Default value: <em>RowsAndColumns</em>.</p> | |
| 527 <p>The value of <strong>--OutMatrixFormat</strong> in conjunction with <strong>--OutMatrixType</strong> determines type | |
| 528 of data written to output files and allows generation of up to 6 different output data formats:</p> | |
| 529 <div class="OptionsBox"> | |
| 530 OutMatrixFormat OutMatrixType</div> | |
| 531 <div class="OptionsBox"> | |
| 532 RowsAndColumns FullMatrix [ DEFAULT ] | |
| 533 <br/> RowsAndColumns UpperTriangularMatrix | |
| 534 <br/> RowsAndColumns LowerTriangularMatrix</div> | |
| 535 <div class="OptionsBox"> | |
| 536 IDPairsAndValue FullMatrix | |
| 537 <br/> IDPairsAndValue UpperTriangularMatrix | |
| 538 <br/> IDPairsAndValue LowerTriangularMatrix</div> | |
| 539 <p>Example of data in output file for <em>RowsAndColumns</em> <strong>--OutMatrixFormat</strong> value for | |
| 540 <em>FullMatrix</em> value of <strong>--OutMatrixType</strong>:</p> | |
| 541 <div class="OptionsBox"> | |
| 542 "","Cmpd1","Cmpd2","Cmpd3","Cmpd4","Cmpd5","Cmpd6",... ... | |
| 543 <br/> "Cmpd1","1","0.04","0.25","0.13","0.11","0.2",... ... | |
| 544 <br/> "Cmpd2","0.04","1","0.06","0.05","0.19","0.07",... ... | |
| 545 <br/> "Cmpd3","0.25","0.06","1","0.12","0.22","0.25",... ... | |
| 546 <br/> "Cmpd4","0.13","0.05","0.12","1","0.11","0.13",... ... | |
| 547 <br/> "Cmpd5","0.11","0.19","0.22","0.11","1","0.17",... ... | |
| 548 <br/> "Cmpd6","0.2","0.07","0.25","0.13","0.17","1",... ... | |
| 549 <br/> ... ... .. | |
| 550 <br/> ... ... .. | |
| 551 <br/> ... ... ..</div> | |
| 552 <p>Example of data in output file for <em>RowsAndColumns</em> <strong>--OutMatrixFormat</strong> value for | |
| 553 <em>UpperTriangularMatrix</em> value of <strong>--OutMatrixType</strong>:</p> | |
| 554 <div class="OptionsBox"> | |
| 555 "","Cmpd1","Cmpd2","Cmpd3","Cmpd4","Cmpd5","Cmpd6",... ... | |
| 556 <br/> "Cmpd1","1","0.04","0.25","0.13","0.11","0.2",... ... | |
| 557 <br/> "Cmpd2","1","0.06","0.05","0.19","0.07",... ... | |
| 558 <br/> "Cmpd3","1","0.12","0.22","0.25",... ... | |
| 559 <br/> "Cmpd4","1","0.11","0.13",... ... | |
| 560 <br/> "Cmpd5","1","0.17",... ... | |
| 561 <br/> "Cmpd6","1",... ... | |
| 562 <br/> ... ... .. | |
| 563 <br/> ... ... .. | |
| 564 <br/> ... ... ..</div> | |
| 565 <p>Example of data in output file for <em>RowsAndColumns</em> <strong>--OutMatrixFormat</strong> value for | |
| 566 <em>LowerTriangularMatrix</em> value of <strong>--OutMatrixType</strong>:</p> | |
| 567 <div class="OptionsBox"> | |
| 568 "","Cmpd1","Cmpd2","Cmpd3","Cmpd4","Cmpd5","Cmpd6",... ... | |
| 569 <br/> "Cmpd1","1" | |
| 570 <br/> "Cmpd2","0.04","1" | |
| 571 <br/> "Cmpd3","0.25","0.06","1" | |
| 572 <br/> "Cmpd4","0.13","0.05","0.12","1" | |
| 573 <br/> "Cmpd5","0.11","0.19","0.22","0.11","1" | |
| 574 <br/> "Cmpd6","0.2","0.07","0.25","0.13","0.17","1" | |
| 575 <br/> ... ... .. | |
| 576 <br/> ... ... .. | |
| 577 <br/> ... ... ..</div> | |
| 578 <p>Example of data in output file for <em>IDPairsAndValue</em> <strong>--OutMatrixFormat</strong> value for | |
| 579 <em>FullMatrix</em> value of <strong>--OutMatrixType</strong>:</p> | |
| 580 <div class="OptionsBox"> | |
| 581 "CmpdID1","CmpdID2","Coefficient Value" | |
| 582 <br/> "Cmpd1","Cmpd1","1" | |
| 583 <br/> "Cmpd1","Cmpd2","0.04" | |
| 584 <br/> "Cmpd1","Cmpd3","0.25" | |
| 585 <br/> "Cmpd1","Cmpd4","0.13" | |
| 586 <br/> ... ... ... | |
| 587 <br/> ... ... ... | |
| 588 <br/> ... ... ... | |
| 589 <br/> "Cmpd2","Cmpd1","0.04" | |
| 590 <br/> "Cmpd2","Cmpd2","1" | |
| 591 <br/> "Cmpd2","Cmpd3","0.06" | |
| 592 <br/> "Cmpd2","Cmpd4","0.05" | |
| 593 <br/> ... ... ... | |
| 594 <br/> ... ... ... | |
| 595 <br/> ... ... ... | |
| 596 <br/> "Cmpd3","Cmpd1","0.25" | |
| 597 <br/> "Cmpd3","Cmpd2","0.06" | |
| 598 <br/> "Cmpd3","Cmpd3","1" | |
| 599 <br/> "Cmpd3","Cmpd4","0.12" | |
| 600 <br/> ... ... ... | |
| 601 <br/> ... ... ... | |
| 602 <br/> ... ... ...</div> | |
| 603 <p>Example of data in output file for <em>IDPairsAndValue</em> <strong>--OutMatrixFormat</strong> value for | |
| 604 <em>UpperTriangularMatrix</em> value of <strong>--OutMatrixType</strong>:</p> | |
| 605 <div class="OptionsBox"> | |
| 606 "CmpdID1","CmpdID2","Coefficient Value" | |
| 607 <br/> "Cmpd1","Cmpd1","1" | |
| 608 <br/> "Cmpd1","Cmpd2","0.04" | |
| 609 <br/> "Cmpd1","Cmpd3","0.25" | |
| 610 <br/> "Cmpd1","Cmpd4","0.13" | |
| 611 <br/> ... ... ... | |
| 612 <br/> ... ... ... | |
| 613 <br/> ... ... ... | |
| 614 <br/> "Cmpd2","Cmpd2","1" | |
| 615 <br/> "Cmpd2","Cmpd3","0.06" | |
| 616 <br/> "Cmpd2","Cmpd4","0.05" | |
| 617 <br/> ... ... ... | |
| 618 <br/> ... ... ... | |
| 619 <br/> ... ... ... | |
| 620 <br/> "Cmpd3","Cmpd3","1" | |
| 621 <br/> "Cmpd3","Cmpd4","0.12" | |
| 622 <br/> ... ... ... | |
| 623 <br/> ... ... ... | |
| 624 <br/> ... ... ...</div> | |
| 625 <p>Example of data in output file for <em>IDPairsAndValue</em> <strong>--OutMatrixFormat</strong> value for | |
| 626 <em>LowerTriangularMatrix</em> value of <strong>--OutMatrixType</strong>:</p> | |
| 627 <div class="OptionsBox"> | |
| 628 "CmpdID1","CmpdID2","Coefficient Value" | |
| 629 <br/> "Cmpd1","Cmpd1","1" | |
| 630 <br/> "Cmpd2","Cmpd1","0.04" | |
| 631 <br/> "Cmpd2","Cmpd2","1" | |
| 632 <br/> "Cmpd3","Cmpd1","0.25" | |
| 633 <br/> "Cmpd3","Cmpd2","0.06" | |
| 634 <br/> "Cmpd3","Cmpd3","1" | |
| 635 <br/> "Cmpd4","Cmpd1","0.13" | |
| 636 <br/> "Cmpd4","Cmpd2","0.05" | |
| 637 <br/> "Cmpd4","Cmpd3","0.12" | |
| 638 <br/> "Cmpd4","Cmpd4","1" | |
| 639 <br/> ... ... ... | |
| 640 <br/> ... ... ... | |
| 641 <br/> ... ... ...</div> | |
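<p>As an illustration of how the two output formats relate, the following Python sketch reads a small <em>RowsAndColumns</em> <em>FullMatrix</em> file and lists its contents as ID pairs with values. The function name is hypothetical and the sample data is the first 3 x 3 corner of the example shown above; the script itself writes either format directly, so this is only a reader-side sketch.</p>

```python
import csv
import io


def rows_and_columns_to_id_pairs(text):
    """Convert RowsAndColumns FullMatrix text into (id1, id2, value) tuples."""
    rows = list(csv.reader(io.StringIO(text)))
    col_ids = rows[0][1:]  # the first header cell is the empty corner label
    pairs = []
    for row in rows[1:]:
        row_id, values = row[0], row[1:]
        for col_id, value in zip(col_ids, values):
            pairs.append((row_id, col_id, value))
    return pairs


sample = '''"","Cmpd1","Cmpd2","Cmpd3"
"Cmpd1","1","0.04","0.25"
"Cmpd2","0.04","1","0.06"
"Cmpd3","0.25","0.06","1"
'''

pairs = rows_and_columns_to_id_pairs(sample)
for id1, id2, value in pairs:
    print(f'"{id1}","{id2}","{value}"')
```

<p>The sketch assumes a full matrix. Rows in the triangular examples above do not carry a value for every column, so they cannot be paired with the header positionally in the same way.</p>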
| 642 </dd> | |
| 643 <dt><strong><strong>--OutMatrixType</strong> <em>FullMatrix | UpperTriangularMatrix | LowerTriangularMatrix</em></strong></dt> | |
| 644 <dd> | |
| 645 <p>Type of similarity or distance matrix to calculate for fingerprints vector and bit-vector strings: | |
| 646 Calculate full matrix; Calculate lower triangular matrix including diagonal; Calculate upper triangular | |
| 647 matrix including diagonal.</p> | |
| 648 <p>Possible values: <em>FullMatrix, UpperTriangularMatrix, or LowerTriangularMatrix</em>. Default value: | |
| 649 <em>FullMatrix</em>.</p> | |
| 650 <p>The value of <strong>--OutMatrixType</strong> in conjunction with <strong>--OutMatrixFormat</strong> determines type | |
| 651 of data written to output files.</p> | |
| 652 </dd> | |
| 653 <dt><strong><strong>-o, --overwrite</strong></strong></dt> | |
| 654 <dd> | |
| 655 <p>Overwrite existing files.</p> | |
| 656 </dd> | |
| 657 <dt><strong><strong>-p, --precision</strong> <em>number</em></strong></dt> | |
| 658 <dd> | |
| 659 <p>Precision of calculated values in the output file. Default: up to <em>2</em> decimal places. | |
| 660 Valid values: positive integers.</p> | |
| 661 </dd> | |
| 662 <dt><strong><strong>-q, --quote</strong> <em>Yes | No</em></strong></dt> | |
| 663 <dd> | |
| 664 <p>Put quotes around column values in output CSV/TSV text file(s). Possible values: | |
| 665 <em>Yes or No</em>. Default value: <em>Yes</em>.</p> | |
| 666 </dd> | |
| 667 <dt><strong><strong>-r, --root</strong> <em>RootName</em></strong></dt> | |
| 668 <dd> | |
| 669 <p>New file name is generated using the root: &lt;Root&gt;&lt;BitVectorComparisonMode&gt;.&lt;Ext&gt; or | |
| 670 &lt;Root&gt;&lt;VectorComparisonMode&gt;&lt;VectorComparisonFormulism&gt;.&lt;Ext&gt;. | |
| 671 The csv and tsv &lt;Ext&gt; values are used for comma/semicolon and tab delimited text files | |
| 672 respectively. This option is ignored for multiple input files.</p> | |
| 673 </dd> | |
| 674 <dt><strong><strong>-v, --VectorComparisonMode</strong> <em>All | "TanimotoSimilarity,[ManhattanDistance,...]"</em></strong></dt> | |
| 675 <dd> | |
| 676 <p>Specify what similarity or distance coefficients to use for calculating similarity matrices for | |
| 677 fingerprint vector strings data values in <em>TextFile(s)</em>: calculate similarity matrices for all | |
| 678 supported similarity and distance coefficients or specify a comma delimited list of similarity | |
| 679 and distance coefficients. Possible values: <em>All | "TanimotoSimilarity,[ManhattanDistance,...]"</em>. | |
| 680 Default: <em>TanimotoSimilarity</em>.</p> | |
| 681 <p>The value of <strong>-v, --VectorComparisonMode</strong>, in conjunction with <strong>--VectorComparisonFormulism</strong>, | |
| 682 decides which type of similarity and distance coefficient formulism gets used.</p> | |
| 683 <p><em>All</em> uses complete list of supported similarity and distance coefficients: <em>CosineSimilarity, | |
| 684 CzekanowskiSimilarity, DiceSimilarity, OchiaiSimilarity, JaccardSimilarity, SorensonSimilarity, TanimotoSimilarity, | |
| 685 CityBlockDistance, EuclideanDistance, HammingDistance, ManhattanDistance, SoergelDistance</em>. These | |
| 686 similarity and distance coefficients are described below.</p> | |
| 687 <p><strong>FingerprintsVector.pm</strong> module, used to calculate similarity and distance coefficients, | |
| 688 provides support to perform comparison between vectors containing three different types of | |
| 689 values:</p> | |
| 690 <p>Type I: OrderedNumericalValues</p> | |
| 691 <div class="OptionsBox"> | |
| 692 . Sizes of the two vectors are the same | |
| 693 <br/> . Vectors contain real values in a specific order. For example: MACCS keys | |
| 694 count, Topological pharmacophore atom pairs and so on.</div> | |
| 695 <p>Type II: UnorderedNumericalValues</p> | |
| 696 <div class="OptionsBox"> | |
| 697 . Size of two vectors might not be same | |
| 698 <br/> . Vectors contain unordered real values identified by value IDs. For example: | |
| 699 Topological atom pairs, Topological atom torsions and so on.</div> | |
| 700 <p>Type III: AlphaNumericalValues</p> | |
| 701 <div class="OptionsBox"> | |
| 702 . Size of two vectors might not be same | |
| 703 <br/> . Vectors contain unordered alphanumerical values. For example: Extended | |
| 704 connectivity fingerprints, atom neighborhood fingerprints.</div> | |
| 705 <p>Before performing similarity or distance calculations between vectors containing UnorderedNumericalValues | |
| 706 or AlphaNumericalValues, the vectors are transformed into vectors containing unique OrderedNumericalValues | |
| 707 using value IDs for UnorderedNumericalValues and the values themselves for AlphaNumericalValues.</p> | |
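<p>The transformation described above can be pictured with a short Python sketch. It is a minimal illustration under stated assumptions, not the logic of <strong>FingerprintsVector.pm</strong>: the function name, the dictionary input of value IDs mapped to counts, the sorted ID order, and the zero fill for missing IDs are all choices made for this example, and the value IDs in the usage lines are made up.</p>

```python
def align_vectors(a, b):
    """Map two {value_id: count} dicts onto one shared, ordered list of IDs.

    IDs missing from a vector are filled with 0 so that both result
    vectors have the same size and the same element order.
    """
    ids = sorted(set(a) | set(b))
    xa = [a.get(value_id, 0) for value_id in ids]
    xb = [b.get(value_id, 0) for value_id in ids]
    return ids, xa, xb


# Hypothetical value IDs and counts, used only to show the alignment.
ids, xa, xb = align_vectors({"C-C": 2, "C-N": 1}, {"C-N": 3, "C-O": 1})
print(ids, xa, xb)
```

<p>Once both vectors share the same order and size, they can be compared element by element like Type I vectors.</p>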
| 708 <p>Three forms of similarity and distance calculation between two vectors, specified using <strong>--VectorComparisonFormulism</strong> | |
| 709 option, are supported: <em>AlgebraicForm, BinaryForm or SetTheoreticForm</em>.</p> | |
| 710 <p>For <em>BinaryForm</em>, the ordered list of processed final vector values containing the value or | |
| 711 count of each unique value type is simply converted into a binary vector containing 1s and 0s | |
| 712 corresponding to presence or absence of values before calculating similarity or distance between | |
| 713 two vectors.</p> | |
| 714 <p>For two fingerprint vectors A and B of same size containing OrderedNumericalValues, let:</p> | |
| 715 <div class="OptionsBox"> | |
| 716 N = Number of values in A or B</div> | |
| 717 <div class="OptionsBox"> | |
| 718 Xa = Values of vector A | |
| 719 <br/> Xb = Values of vector B</div> | |
| 720 <div class="OptionsBox"> | |
| 721 Xai = Value of ith element in A | |
| 722 <br/> Xbi = Value of ith element in B</div> | |
| 723 <div class="OptionsBox"> | |
| 724 SUM = Sum of i over N values</div> | |
| 725 <p>For SetTheoreticForm of calculation between two vectors, let:</p> | |
| 726 <div class="OptionsBox"> | |
| 727 SetIntersectionXaXb = SUM ( MIN ( Xai, Xbi ) ) | |
| 728 <br/> SetDifferenceXaXb = SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ( Xai, Xbi ) )</div> | |
| 729 <p>For BinaryForm of calculation between two vectors, let:</p> | |
| 730 <div class="OptionsBox"> | |
| 731 Na = Number of bits set to "1" in A = SUM ( Xai ) | |
| 732 <br/> Nb = Number of bits set to "1" in B = SUM ( Xbi ) | |
| 733 <br/> Nc = Number of bits set to "1" in both A and B = SUM ( Xai * Xbi ) | |
| 734 <br/> Nd = Number of bits set to "0" in both A and B | |
| 735 = SUM ( 1 - Xai - Xbi + Xai * Xbi)</div> | |
| 736 <div class="OptionsBox"> | |
| 737 N = Number of bits set to "1" or "0" in A or B = Size of A or B = Na + Nb - Nc + Nd</div> | |
| 738 <p>Additionally, for BinaryForm various values also correspond to:</p> | |
| 739 <div class="OptionsBox"> | |
| 740 Na = | Xa | | |
| 741 <br/> Nb = | Xb | | |
| 742 <br/> Nc = | SetIntersectionXaXb | | |
| 743 <br/> Nd = N - | SetDifferenceXaXb |</div> | |
| 744 <div class="OptionsBox"> | |
| 745 | SetDifferenceXaXb | = N - Nd = Na + Nb - Nc + Nd - Nd = Na + Nb - Nc | |
| 746 = | Xa | + | Xb | - | SetIntersectionXaXb |</div> | |
| 747 <p>Various similarity and distance coefficients [ Ref 40, Ref 62, Ref 64 ] for a pair of vectors A and B | |
| 748 in <em>AlgebraicForm, BinaryForm and SetTheoreticForm</em> are defined as follows:</p> | |
| 749 <p><strong>CityBlockDistance</strong>: ( same as HammingDistance and ManhattanDistance)</p> | |
| 750 <p><em>AlgebraicForm</em>: SUM ( ABS ( Xai - Xbi ) )</p> | |
| 751 <p><em>BinaryForm</em>: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc</p> | |
| 752 <p><em>SetTheoreticForm</em>: | SetDifferenceXaXb | - | SetIntersectionXaXb | = SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )</p> | |
| 753 <p><strong>CosineSimilarity</strong>: ( same as OchiaiSimilarity)</p> | |
| 754 <p><em>AlgebraicForm</em>: SUM ( Xai * Xbi ) / SQRT ( SUM ( Xai ** 2) * SUM ( Xbi ** 2) )</p> | |
| 755 <p><em>BinaryForm</em>: Nc / SQRT ( Na * Nb)</p> | |
| 756 <p><em>SetTheoreticForm</em>: | SetIntersectionXaXb | / SQRT ( |Xa| * |Xb| ) = SUM ( MIN ( Xai, Xbi ) ) / SQRT ( SUM ( Xai ) * SUM ( Xbi ) )</p> | |
| 757 <p><strong>CzekanowskiSimilarity</strong>: ( same as DiceSimilarity and SorensonSimilarity)</p> | |
| 758 <p><em>AlgebraicForm</em>: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2) + SUM ( Xbi **2 ) )</p> | |
| 759 <p><em>BinaryForm</em>: 2 * Nc / ( Na + Nb )</p> | |
| 760 <p><em>SetTheoreticForm</em>: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| ) = 2 * ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )</p> | |
| 761 <p><strong>DiceSimilarity</strong>: ( same as CzekanowskiSimilarity and SorensonSimilarity)</p> | |
| 762 <p><em>AlgebraicForm</em>: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2) + SUM ( Xbi **2 ) )</p> | |
| 763 <p><em>BinaryForm</em>: 2 * Nc / ( Na + Nb )</p> | |
| 764 <p><em>SetTheoreticForm</em>: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| ) = 2 * ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )</p> | |
| 765 <p><strong>EuclideanDistance</strong>:</p> | |
| 766 <p><em>AlgebraicForm</em>: SQRT ( SUM ( ( ( Xai - Xbi ) ** 2 ) ) )</p> | |
| 767 <p><em>BinaryForm</em>: SQRT ( ( Na - Nc ) + ( Nb - Nc ) ) = SQRT ( Na + Nb - 2 * Nc )</p> | |
| 768 <p><em>SetTheoreticForm</em>: SQRT ( | SetDifferenceXaXb | - | SetIntersectionXaXb | ) = SQRT ( SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) ) )</p> | |
| 769 <p><strong>HammingDistance</strong>: ( same as CityBlockDistance and ManhattanDistance)</p> | |
| 770 <p><em>AlgebraicForm</em>: SUM ( ABS ( Xai - Xbi ) )</p> | |
| 771 <p><em>BinaryForm</em>: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc</p> | |
| 772 <p><em>SetTheoreticForm</em>: | SetDifferenceXaXb | - | SetIntersectionXaXb | = SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )</p> | |
| 773 <p><strong>JaccardSimilarity</strong>: ( same as TanimotoSimilarity)</p> | |
| 774 <p><em>AlgebraicForm</em>: SUM ( Xai * Xbi ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi ** 2 ) - SUM ( Xai * Xbi ) )</p> | |
| 775 <p><em>BinaryForm</em>: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc / ( Na + Nb - Nc )</p> | |
| 776 <p><em>SetTheoreticForm</em>: | SetIntersectionXaXb | / | SetDifferenceXaXb | = SUM ( MIN ( Xai, Xbi ) ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ( Xai, Xbi ) ) )</p> | |
| 777 <p><strong>ManhattanDistance</strong>: ( same as CityBlockDistance and HammingDistance)</p> | |
| 778 <p><em>AlgebraicForm</em>: SUM ( ABS ( Xai - Xbi ) )</p> | |
| 779 <p><em>BinaryForm</em>: ( Na - Nc ) + ( Nb - Nc ) = Na + Nb - 2 * Nc</p> | |
| 780 <p><em>SetTheoreticForm</em>: | SetDifferenceXaXb | - | SetIntersectionXaXb | = SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) )</p> | |
| 781 <p><strong>OchiaiSimilarity</strong>: ( same as CosineSimilarity)</p> | |
| 782 <p><em>AlgebraicForm</em>: SUM ( Xai * Xbi ) / SQRT ( SUM ( Xai ** 2) * SUM ( Xbi ** 2) )</p> | |
| 783 <p><em>BinaryForm</em>: Nc / SQRT ( Na * Nb)</p> | |
| 784 <p><em>SetTheoreticForm</em>: | SetIntersectionXaXb | / SQRT ( |Xa| * |Xb| ) = SUM ( MIN ( Xai, Xbi ) ) / SQRT ( SUM ( Xai ) * SUM ( Xbi ) )</p> | |
| 785 <p><strong>SorensonSimilarity</strong>: ( same as CzekanowskiSimilarity and DiceSimilarity)</p> | |
| 786 <p><em>AlgebraicForm</em>: ( 2 * ( SUM ( Xai * Xbi ) ) ) / ( SUM ( Xai ** 2) + SUM ( Xbi **2 ) )</p> | |
| 787 <p><em>BinaryForm</em>: 2 * Nc / ( Na + Nb )</p> | |
| 788 <p><em>SetTheoreticForm</em>: 2 * | SetIntersectionXaXb | / ( |Xa| + |Xb| ) = 2 * ( SUM ( MIN ( Xai, Xbi ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) )</p> | |
| 789 <p><strong>SoergelDistance</strong>:</p> | |
| 790 <p><em>AlgebraicForm</em>: SUM ( ABS ( Xai - Xbi ) ) / SUM ( MAX ( Xai, Xbi ) )</p> | |
| 791 <p><em>BinaryForm</em>: 1 - Nc / ( Na + Nb - Nc ) = ( Na + Nb - 2 * Nc ) / ( Na + Nb - Nc )</p> | |
| 792 <p><em>SetTheoreticForm</em>: ( | SetDifferenceXaXb | - | SetIntersectionXaXb | ) / | SetDifferenceXaXb | = ( SUM ( Xai ) + SUM ( Xbi ) - 2 * ( SUM ( MIN ( Xai, Xbi ) ) ) ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ( Xai, Xbi ) ) )</p> | |
| 793 <p><strong>TanimotoSimilarity</strong>: ( same as JaccardSimilarity)</p> | |
| 794 <p><em>AlgebraicForm</em>: SUM ( Xai * Xbi ) / ( SUM ( Xai ** 2 ) + SUM ( Xbi ** 2 ) - SUM ( Xai * Xbi ) )</p> | |
| 795 <p><em>BinaryForm</em>: Nc / ( ( Na - Nc ) + ( Nb - Nc ) + Nc ) = Nc / ( Na + Nb - Nc )</p> | |
| 796 <p><em>SetTheoreticForm</em>: | SetIntersectionXaXb | / | SetDifferenceXaXb | = SUM ( MIN ( Xai, Xbi ) ) / ( SUM ( Xai ) + SUM ( Xbi ) - SUM ( MIN ( Xai, Xbi ) ) )</p> | |
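<p>To make the difference between the three formulisms concrete, the sketch below evaluates <em>TanimotoSimilarity</em> for one pair of ordered numerical vectors in each form, following the definitions listed above. The function names and the convention of returning 0 for a zero denominator are assumptions for this example; it does not reproduce the module's implementation.</p>

```python
def tanimoto_algebraic(xa, xb):
    """SUM(Xai*Xbi) / ( SUM(Xai**2) + SUM(Xbi**2) - SUM(Xai*Xbi) )."""
    dot = sum(x * y for x, y in zip(xa, xb))
    denom = sum(x * x for x in xa) + sum(y * y for y in xb) - dot
    return dot / denom if denom else 0.0


def tanimoto_binary(xa, xb):
    """Nc / ( Na + Nb - Nc ) after reducing values to presence or absence."""
    ba = [1 if x else 0 for x in xa]
    bb = [1 if y else 0 for y in xb]
    na, nb = sum(ba), sum(bb)
    nc = sum(x * y for x, y in zip(ba, bb))
    denom = na + nb - nc
    return nc / denom if denom else 0.0


def tanimoto_set_theoretic(xa, xb):
    """SUM(MIN(Xai,Xbi)) / ( SUM(Xai) + SUM(Xbi) - SUM(MIN(Xai,Xbi)) )."""
    intersection = sum(min(x, y) for x, y in zip(xa, xb))
    denom = sum(xa) + sum(xb) - intersection
    return intersection / denom if denom else 0.0


xa = [2, 1, 0]
xb = [0, 3, 1]
print(tanimoto_algebraic(xa, xb))      # 3 / (5 + 10 - 3)
print(tanimoto_binary(xa, xb))         # 1 / (2 + 2 - 1)
print(tanimoto_set_theoretic(xa, xb))  # 1 / (3 + 4 - 1)
```

<p>For count vectors such as these, the three forms generally give different values, which is why the formulism has to be chosen explicitly.</p>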
| 797 </dd> | |
| 798 <dt><strong><strong>--VectorComparisonFormulism</strong> <em>All | "AlgebraicForm,[BinaryForm,SetTheoreticForm]"</em></strong></dt> | |
| 799 <dd> | |
| 800 <p>Specify fingerprints vector comparison formulism to use for calculation of similarity and distance | |
| 801 coefficients during <strong>-v, --VectorComparisonMode</strong>: use all supported comparison formulisms | |
| 802 or specify a comma delimited list of formulisms. Possible values: <em>All | "AlgebraicForm,[BinaryForm,SetTheoreticForm]"</em>. | |
| 803 Default value: <em>AlgebraicForm</em>.</p> | |
| 804 <p><em>All</em> uses all three forms of supported vector comparison formulism for values of <strong>-v, --VectorComparisonMode</strong> | |
| 805 option.</p> | |
| 806 <p>For fingerprint vector strings containing <strong>AlphaNumericalValues</strong> data values - <strong>ExtendedConnectivityFingerprints</strong>, | |
| 807 <strong>AtomNeighborhoodsFingerprints</strong> and so on - all three formulisms result in the same value during similarity and distance | |
| 808 calculations.</p> | |
| 809 </dd> | |
| 810 <dt><strong><strong>-w, --WorkingDir</strong> <em>DirName</em></strong></dt> | |
| 811 <dd> | |
| 812 <p>Location of working directory. Default: current directory.</p> | |
| 813 </dd> | |
| 814 </dl> | |
| 815 <p> | |
| 816 </p> | |
| 817 <h2>EXAMPLES</h2> | |
| 818 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 819 bit-vector strings data corresponding to supported fingerprints in text file present in a column | |
| 820 name containing Fingerprint substring by loading all fingerprints data into memory and create a | |
| 821 SampleFPHexTanimotoSimilarity.csv file containing compound IDs retrieved from column name | |
| 822 containing CompoundID substring, type:</p> | |
| 823 <div class="ExampleBox"> | |
| 824 % SimilarityMatricesFingerprints.pl -o SampleFPHex.csv</div> | |
| 825 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 826 bit-vector strings data corresponding to supported fingerprints in SD File present in a data field | |
| 827 with Fingerprint substring in its label by loading all fingerprints data into memory and create a | |
| 828 SampleFPHexTanimotoSimilarity.csv file containing sequentially generated compound IDs with | |
| 829 Cmpd prefix, type:</p> | |
| 830 <div class="ExampleBox"> | |
| 831 % SimilarityMatricesFingerprints.pl -o SampleFPHex.sdf</div> | |
| 832 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 833 bit-vector strings data corresponding to supported fingerprints in FP file by loading all fingerprints | |
| 834 data into memory and create a SampleFPHexTanimotoSimilarity.csv file along with compound IDs | |
| 835 retrieved from FP file, type:</p> | |
| 836 <div class="ExampleBox"> | |
| 837 % SimilarityMatricesFingerprints.pl -o SampleFPHex.fpf</div> | |
| 838 <p>To generate a lower triangular similarity matrix corresponding to Tanimoto similarity coefficient for | |
| 839 fingerprints bit-vector strings data corresponding to supported fingerprints in text file present in a | |
| 840 column name containing Fingerprint substring by loading all fingerprints data into memory and create | |
| 841 a SampleFPHexTanimotoSimilarity.csv file containing compound IDs retrieved from column name | |
| 842 containing CompoundID substring, type:</p> | |
| 843 <div class="ExampleBox"> | |
| 844 % SimilarityMatricesFingerprints.pl -o --InputDataMode LoadInMemory | |
| 845 --OutMatrixFormat RowsAndColumns --OutMatrixType LowerTriangularMatrix | |
| 846 SampleFPHex.csv</div> | |
| 847 <p>To generate an upper triangular similarity matrix corresponding to Tanimoto similarity coefficient for | |
| 848 fingerprints bit-vector strings data corresponding to supported fingerprints in text file present in a | |
| 849 column name containing Fingerprint substring by loading all fingerprints data into memory and create | |
| 850 a SampleFPHexTanimotoSimilarity.csv file in IDPairsAndValue format containing compound IDs retrieved | |
| 851 from column name containing CompoundID substring, type:</p> | |
| 852 <div class="ExampleBox"> | |
| 853 % SimilarityMatricesFingerprints.pl -o --InputDataMode LoadInMemory | |
| 854 --OutMatrixFormat IDPairsAndValue --OutMatrixType UpperTriangularMatrix | |
| 855 SampleFPHex.csv</div> | |
| 856 <p>To generate a full similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 857 bit-vector strings data corresponding to supported fingerprints in text file present in a column | |
| 858 name containing Fingerprint substring by scanning file without loading all fingerprints data into memory | |
| 859 and create a SampleFPHexTanimotoSimilarity.csv file containing compound IDs retrieved from | |
| 860 column name containing CompoundID substring, type:</p> | |
| 861 <div class="ExampleBox"> | |
| 862 % SimilarityMatricesFingerprints.pl -o --InputDataMode ScanFile | |
| 863 --OutMatrixFormat RowsAndColumns --OutMatrixType FullMatrix | |
| 864 SampleFPHex.csv</div> | |
| 865 <p>To generate a lower triangular similarity matrix corresponding to Tanimoto similarity coefficient for | |
| 866 fingerprints bit-vector strings data corresponding to supported fingerprints in text file present in a | |
| 867 column name containing Fingerprint substring by scanning file without loading all fingerprints data into | |
| 868 memory and create a SampleFPHexTanimotoSimilarity.csv file in IDPairsAndValue format containing | |
| 869 compound IDs retrieved from column name containing CompoundID substring, type:</p> | |
| 870 <div class="ExampleBox"> | |
| 871 % SimilarityMatricesFingerprints.pl -o --InputDataMode ScanFile | |
| 872 --OutMatrixFormat IDPairsAndValue --OutMatrixType LowerTriangularMatrix | |
| 873 SampleFPHex.csv</div> | |
| 874 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient using algebraic formulism | |
| 875 for fingerprints vector strings data corresponding to supported fingerprints in text file present in a column name | |
| 876 containing Fingerprint substring and create a SampleFPCountTanimotoSimilarityAlgebraicForm.csv file | |
| 877 containing compound IDs retrieved from column name containing CompoundID substring, type:</p> | |
| 878 <div class="ExampleBox"> | |
| 879 % SimilarityMatricesFingerprints.pl -o SampleFPCount.csv</div> | |
| 880 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient using algebraic formulism | |
| 881 for fingerprints vector strings data corresponding to supported fingerprints in SD file present in a data field with | |
| 882 Fingerprint substring in its label and create a SampleFPCountTanimotoSimilarityAlgebraicForm.csv file | |
| 883 containing sequentially generated compound IDs with Cmpd prefix, type:</p> | |
| 884 <div class="ExampleBox"> | |
| 885 % SimilarityMatricesFingerprints.pl -o SampleFPCount.sdf</div> | |
| 886 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient using algebraic formulism | |
| 887 for fingerprints vector strings data corresponding to supported fingerprints in FP file and create a | |
| 888 SampleFPCountTanimotoSimilarityAlgebraicForm.csv file along with compound IDs retrieved from FP file, type:</p> | |
| 889 <div class="ExampleBox"> | |
| 890 % SimilarityMatricesFingerprints.pl -o SampleFPCount.fpf</div> | |
| 891 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 892 bit-vector strings data corresponding to supported fingerprints in text file present in a column name | |
| 893 containing Fingerprint substring and create a SampleFPHexTanimotoSimilarity.csv file in | |
| 894 IDPairsAndValue format containing compound IDs retrieved from column name containing | |
| 895 CompoundID substring, type:</p> | |
| 896 <div class="ExampleBox"> | |
| 897 % SimilarityMatricesFingerprints.pl --OutMatrixFormat IDPairsAndValue -o | |
| 898 SampleFPHex.csv</div> | |
| 899 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 900 bit-vector strings data corresponding to supported fingerprints in SD file present in a data field with | |
| 901 Fingerprint substring in its label and create a SampleFPHexTanimotoSimilarity.csv file in | |
| 902 IDPairsAndValue format containing sequentially generated compound IDs with Cmpd prefix, | |
| 903 type:</p> | |
| 904 <div class="ExampleBox"> | |
| 905 % SimilarityMatricesFingerprints.pl --OutMatrixFormat IDPairsAndValue -o | |
| 906 SampleFPHex.sdf</div> | |
| 907 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 908 bit-vector strings data corresponding to supported fingerprints in FP file and create a | |
| 909 SampleFPHexTanimotoSimilarity.csv file in IDPairsAndValue format along with compound IDs retrieved | |
| 910 from FP file, type:</p> | |
| 911 <div class="ExampleBox"> | |
| 912 % SimilarityMatricesFingerprints.pl --OutMatrixFormat IDPairsAndValue -o | |
| 913 SampleFPHex.fpf</div> | |
| 914 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 915 bit-vector strings data corresponding to supported fingerprints in SD file present in a data field with | |
| 916 Fingerprint substring in its label and create a SampleFPHexTanimotoSimilarity.csv file | |
| 917 containing compound IDs from mol name line, type:</p> | |
| 918 <div class="ExampleBox"> | |
| 919 % SimilarityMatricesFingerprints.pl --CompoundIDMode MolName -o | |
| 920 SampleFPHex.sdf</div> | |
| 921 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 922 bit-vector strings data corresponding to supported fingerprints present in a data field with | |
| 923 Fingerprint substring in its label and create a SampleFPBinTanimotoSimilarity.csv file | |
| 924 containing compound IDs from data field name Mol_ID, type:</p> | |
| 925 <div class="ExampleBox"> | |
| 926 % SimilarityMatricesFingerprints.pl --CompoundIDMode DataField | |
| 927 --CompoundIDField Mol_ID -o SampleFPBin.sdf</div> | |
| 928 <p>To generate similarity matrices corresponding to Buser, Dice and Tanimoto similarity coefficients | |
| 929 for fingerprints bit-vector strings data corresponding to supported fingerprints present in a column | |
| 930 name containing Fingerprint substring and create SampleFPBin[CoefficientName]Similarity.csv files | |
| 931 containing compound IDs retrieved from column name containing CompoundID substring, type:</p> | |
| 932 <div class="ExampleBox"> | |
| 933 % SimilarityMatricesFingerprints.pl -b "BuserSimilarity,DiceSimilarity, | |
| 934 TanimotoSimilarity" -o SampleFPBin.csv</div> | |
| 935 <p>To generate similarity matrices corresponding to Buser, Dice and Tanimoto similarity coefficients | |
| 936 for fingerprints bit-vector strings data corresponding to supported fingerprints present in a data field with | |
| 937 Fingerprint substring in its label and create SampleFPBin[CoefficientName]Similarity.csv files | |
| 938 containing sequentially generated compound IDs with Cmpd prefix, type:</p> | |
| 939 <div class="ExampleBox"> | |
| 940 % SimilarityMatricesFingerprints.pl -b "BuserSimilarity,DiceSimilarity, | |
| 941 TanimotoSimilarity" -o SampleFPBin.sdf</div> | |
| 942 <p>To generate similarity matrices corresponding to CityBlock distance and Tanimoto similarity coefficients using | |
| 943 algebraic formulism for fingerprints vector strings data corresponding to supported fingerprints present in | |
| 944 a column name containing Fingerprint substring and create SampleFPCount[CoefficientName]AlgebraicForm.csv | |
| 945 files containing compound IDs retrieved from column name containing CompoundID substring, type:</p> | |
| 946 <div class="ExampleBox"> | |
| 947 % SimilarityMatricesFingerprints.pl -v "CityBlockDistance, | |
| 948 TanimotoSimilarity" -o SampleFPCount.csv</div> | |
| 949 <p>To generate similarity matrices corresponding to CityBlock distance and Tanimoto similarity coefficients using | |
| 950 algebraic formulism for fingerprints vector strings data corresponding to supported fingerprints present in | |
| 951 a data field with Fingerprint substring in its label and create SampleFPCount[CoefficientName]AlgebraicForm.csv | |
| 952 files containing sequentially generated compound IDs with Cmpd prefix, type:</p> | |
| 953 <div class="ExampleBox"> | |
| 954 % SimilarityMatricesFingerprints.pl -v "CityBlockDistance, | |
| 955 TanimotoSimilarity" -o SampleFPCount.sdf</div> | |
| 956 <p>To generate similarity matrices corresponding to CityBlock distance and Tanimoto similarity coefficients using | |
| 957 binary formulism for fingerprints vector strings data corresponding to supported fingerprints present in | |
| 958 a column name containing Fingerprint substring and create SampleFPCount[CoefficientName]Binary.csv | |
| 959 files containing compound IDs retrieved from column name containing CompoundID substring, type:</p> | |
| 960 <div class="ExampleBox"> | |
| 961 % SimilarityMatricesFingerprints.pl -v "CityBlockDistance, | |
| 962 TanimotoSimilarity" --VectorComparisonFormulism BinaryForm -o | |
| 963 SampleFPCount.csv</div> | |
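<p>The difference between the algebraic and binary formulisms named in the examples above can be sketched in Python as follows. This is an assumption-laden illustration using the commonly cited definitions (algebraic Tanimoto as the dot product divided by the sum of squared norms minus the dot product; binary form obtained by first reducing every non-zero count to 1); the function names are hypothetical and the script's own formulas should be checked in its documentation.</p>

```python
def tanimoto_algebraic(a, b):
    """Algebraic Tanimoto: sum(a*b) / (sum(a^2) + sum(b^2) - sum(a*b))."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 0.0  # assumption: two zero vectors -> 0


def city_block_distance(a, b):
    """CityBlock (Manhattan) distance: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def to_binary(vector):
    """Binary formulism sketch: keep only presence/absence of each value."""
    return [1 if x else 0 for x in vector]


# Hypothetical count fingerprints of equal length
fp1 = [2, 0, 1, 3]
fp2 = [1, 1, 0, 3]

print("algebraic:", tanimoto_algebraic(fp1, fp2), city_block_distance(fp1, fp2))
print("binary:   ", tanimoto_algebraic(to_binary(fp1), to_binary(fp2)),
      city_block_distance(to_binary(fp1), to_binary(fp2)))
```

<p>On 0/1 vectors the algebraic Tanimoto expression reduces to Nc / (Na + Nb - Nc), so the same function serves both cases in this sketch.</p>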
| 964 <p>To generate similarity matrices corresponding to CityBlock distance and Tanimoto similarity coefficients using | |
| 965 binary formulism for fingerprints vector strings data corresponding to supported fingerprints present in | |
| 966 a data field with Fingerprint substring in its label and create SampleFPCount[CoefficientName]Binary.csv | |
| 967 files containing sequentially generated compound IDs with Cmpd prefix, type:</p> | |
| 968 <div class="ExampleBox"> | |
| 969 % SimilarityMatricesFingerprints.pl -v "CityBlockDistance, | |
| 970 TanimotoSimilarity" --VectorComparisonFormulism BinaryForm -o | |
| 971 SampleFPCount.sdf</div> | |
| 972 <p>To generate similarity matrices corresponding to CityBlock distance and Tanimoto similarity coefficients using | |
| 973 all supported comparison formulisms for fingerprints vector strings data corresponding to supported | |
| 974 fingerprints present in a column name containing Fingerprint substring and create | |
| 975 SampleFPCount[CoefficientName][FormulismName].csv files containing compound IDs retrieved from column | |
| 976 name containing CompoundID substring, type:</p> | |
| 977 <div class="ExampleBox"> | |
| 978 % SimilarityMatricesFingerprints.pl -v "CityBlockDistance, | |
| 979 TanimotoSimilarity" --VectorComparisonFormulism All -o SampleFPCount.csv</div> | |
| 980 <p>To generate similarity matrices corresponding to CityBlock distance and Tanimoto similarity coefficients using | |
| 981 all supported comparison formulisms for fingerprints vector strings data corresponding to supported | |
| 982 fingerprints present in a data field with Fingerprint substring in its label and create | |
| 983 SampleFPCount[CoefficientName][FormulismName].csv files containing sequentially generated | |
| 984 compound IDs with Cmpd prefix, type:</p> | |
| 985 <div class="ExampleBox"> | |
| 986 % SimilarityMatricesFingerprints.pl -v "CityBlockDistance,TanimotoSimilarity" | |
| 987 --VectorComparisonFormulism All -o SampleFPCount.sdf</div> | |
| 988 <p>To generate similarity matrices corresponding to all available similarity coefficients for fingerprints | |
| 989 bit-vector strings data corresponding to supported fingerprints present in a column name | |
| 990 containing Fingerprint substring and create SampleFPHex[CoefficientName].csv files | |
| 991 containing compound IDs retrieved from column name containing CompoundID substring, type:</p> | |
| 992 <div class="ExampleBox"> | |
| 993 % SimilarityMatricesFingerprints.pl -m AutoDetect --BitVectorComparisonMode | |
| 994 All --alpha 0.5 --beta 0.5 -o SampleFPHex.csv</div> | |
| 995 <p>To generate similarity matrices corresponding to all available similarity coefficients for fingerprints | |
| 996 bit-vector strings data corresponding to supported fingerprints present in a data field with Fingerprint | |
| 997 substring in its label and create SampleFPHex[CoefficientName].csv files containing sequentially | |
| 998 generated compound IDs with Cmpd prefix, type:</p> | |
| 999 <div class="ExampleBox"> | |
| 1000 % SimilarityMatricesFingerprints.pl -m AutoDetect --BitVectorComparisonMode | |
| 1001 All --alpha 0.5 --beta 0.5 -o SampleFPHex.sdf</div> | |
| 1002 <p>To generate similarity matrices corresponding to all available similarity and distance coefficients using | |
| 1003 all comparison formulisms for fingerprints vector strings data corresponding to supported fingerprints | |
| 1004 present in a column name containing Fingerprint substring and create | |
| 1005 SampleFPCount[CoefficientName][FormulismName].csv files containing compound IDs | |
| 1006 retrieved from column name containing CompoundID substring, type:</p> | |
| 1007 <div class="ExampleBox"> | |
| 1008 % SimilarityMatricesFingerprints.pl -m AutoDetect --VectorComparisonMode | |
| 1009 All --VectorComparisonFormulism All -o SampleFPCount.csv</div> | |
| 1010 <p>To generate similarity matrices corresponding to all available similarity and distance coefficients using | |
| 1011 all comparison formulisms for fingerprints vector strings data corresponding to supported fingerprints | |
| 1012 present in a data field with Fingerprint substring in its label and create | |
| 1013 SampleFPCount[CoefficientName][FormulismName].csv files containing sequentially generated | |
| 1014 compound IDs with Cmpd prefix, type:</p> | |
| 1015 <div class="ExampleBox"> | |
| 1016 % SimilarityMatricesFingerprints.pl -m AutoDetect --VectorComparisonMode | |
| 1017 All --VectorComparisonFormulism All -o SampleFPCount.sdf</div> | |
| 1018 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 1019 bit-vector strings data corresponding to supported fingerprints present in column number 2 | |
| 1020 and create a SampleFPHexTanimotoSimilarity.csv file containing compound IDs retrieved from column | |
| 1021 number 1, type:</p> | |
| 1022 <div class="ExampleBox"> | |
| 1023 % SimilarityMatricesFingerprints.pl --ColMode ColNum --CompoundIDCol 1 | |
| 1024 --FingerprintsCol 2 -o SampleFPHex.csv</div> | |
| 1025 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 1026 bit-vector strings data corresponding to supported fingerprints present in a data field name | |
| 1027 Fingerprints and create a SampleFPHexTanimotoSimilarity.csv file containing compound IDs | |
| 1028 present in data field name Mol_ID, type:</p> | |
| 1029 <div class="ExampleBox"> | |
| 1030 % SimilarityMatricesFingerprints.pl --FingerprintsField Fingerprints | |
| 1031 --CompoundIDMode DataField --CompoundIDField Mol_ID -o SampleFPHex.sdf</div> | |
| 1032 <p>To generate a similarity matrix corresponding to Tversky similarity coefficient for fingerprints | |
| 1033 bit-vector strings data corresponding to supported fingerprints present in a column named Fingerprints | |
| 1034 and create a SampleFPHexTverskySimilarity.tsv file containing compound IDs retrieved from column named | |
| 1035 CompoundID, type:</p> | |
| 1036 <div class="ExampleBox"> | |
| 1037 % SimilarityMatricesFingerprints.pl --BitVectorComparisonMode | |
| 1038 TverskySimilarity --alpha 0.5 --ColMode ColLabel --CompoundIDCol | |
| 1039 CompoundID --FingerprintsCol Fingerprints --OutDelim Tab --quote No | |
| 1040 -o SampleFPHex.csv</div> | |
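<p>To show how an alpha weight influences a Tversky-style coefficient, here is a minimal Python sketch for hexadecimal bit-vector strings. It assumes one common parameterization, Nc / (alpha*(Na - Nc) + beta*(Nb - Nc) + Nc), with beta defaulting to 1 - alpha; the function name and defaults are hypothetical, and the exact formula used by the script may differ.</p>

```python
def tversky(hex_a, hex_b, alpha=0.5, beta=None):
    """Tversky-style similarity for two hex-encoded bit vectors.

    Assumed form: Nc / (alpha*(Na - Nc) + beta*(Nb - Nc) + Nc),
    where beta falls back to 1 - alpha when it is not given.
    """
    if beta is None:
        beta = 1.0 - alpha
    a = int(hex_a, 16)
    b = int(hex_b, 16)
    n_a = bin(a).count("1")       # Na: bits set in the first vector
    n_b = bin(b).count("1")       # Nb: bits set in the second vector
    n_c = bin(a & b).count("1")   # Nc: bits set in both vectors
    denom = alpha * (n_a - n_c) + beta * (n_b - n_c) + n_c
    return n_c / denom if denom else 0.0  # assumption: empty vs. empty -> 0


# With alpha = beta = 0.5 the expression equals the Dice coefficient 2*Nc / (Na + Nb);
# with alpha = beta = 1 it equals the Tanimoto coefficient.
print(tversky("f0", "3c", alpha=0.5))
print(tversky("f0", "3c", alpha=1.0, beta=1.0))
```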
| 1041 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 1042 bit-vector strings data corresponding to supported fingerprints present in a data field | |
| 1043 with Fingerprint substring in its label and create a SampleFPHexTanimotoSimilarity.csv file | |
| 1044 containing compound IDs from molname line or sequentially generated compound IDs | |
| 1045 with Mol prefix, type:</p> | |
| 1046 <div class="ExampleBox"> | |
| 1047 % SimilarityMatricesFingerprints.pl --CompoundIDMode MolnameOrLabelPrefix | |
| 1048 --CompoundIDPrefix Mol -o SampleFPHex.sdf</div> | |
| 1049 <p>To generate a similarity matrix corresponding to Tanimoto similarity coefficient for fingerprints | |
| 1050 bit-vector strings data corresponding to supported fingerprints present in a data field with | |
| 1051 Fingerprint substring in its label and create a SampleFPHexTanimotoSimilarity.tsv file | |
| 1052 containing sequentially generated compound IDs with Cmpd prefix, type:</p> | |
| 1053 <div class="ExampleBox"> | |
| 1054 % SimilarityMatricesFingerprints.pl --OutDelim Tab --quote No -o SampleFPHex.sdf</div> | |
| 1055 <p> | |
| 1056 </p> | |
| 1057 <h2>AUTHOR</h2> | |
| 1058 <p><a href="mailto:msud@san.rr.com">Manish Sud</a></p> | |
| 1059 <p> | |
| 1060 </p> | |
| 1061 <h2>SEE ALSO</h2> | |
| 1062 <p><a href="./InfoFingerprintsFiles.html">InfoFingerprintsFiles.pl</a>, <a href="./SimilaritySearchingFingerprints.html">SimilaritySearchingFingerprints.pl</a>, <a href="./AtomNeighborhoodsFingerprints.html">AtomNeighborhoodsFingerprints.pl</a>,  | |
| 1063 <a href="./ExtendedConnectivityFingerprints.html">ExtendedConnectivityFingerprints.pl</a>, <a href="./MACCSKeysFingerprints.html">MACCSKeysFingerprints.pl</a>, <a href="./PathLengthFingerprints.html">PathLengthFingerprints.pl</a>,  | |
| 1064 <a href="./TopologicalAtomPairsFingerprints.html">TopologicalAtomPairsFingerprints.pl</a>, <a href="./TopologicalAtomTorsionsFingerprints.html">TopologicalAtomTorsionsFingerprints.pl</a>,  | |
| 1065 <a href="./TopologicalPharmacophoreAtomPairsFingerprints.html">TopologicalPharmacophoreAtomPairsFingerprints.pl</a>, <a href="./TopologicalPharmacophoreAtomTripletsFingerprints.html">TopologicalPharmacophoreAtomTripletsFingerprints.pl</a> | |
| 1066 </p> | |
| 1067 <p> | |
| 1068 </p> | |
| 1069 <h2>COPYRIGHT</h2> | |
| 1070 <p>Copyright (C) 2015 Manish Sud. All rights reserved.</p> | |
| 1071 <p>This file is part of MayaChemTools.</p> | |
| 1072 <p>MayaChemTools is free software; you can redistribute it and/or modify it under | |
| 1073 the terms of the GNU Lesser General Public License as published by the Free | |
| 1074 Software Foundation; either version 3 of the License, or (at your option) | |
| 1075 any later version.</p> | |
| 1076 <p> </p><p> </p><div class="DocNav"> | |
| 1077 <table width="100%" border=0 cellpadding=0 cellspacing=2> | |
| 1078 <tr align="left" valign="top"><td width="33%" align="left"><a href="./SDToMolFiles.html" title="SDToMolFiles.html">Previous</a> <a href="./index.html" title="Table of Contents">TOC</a> <a href="./SimilaritySearchingFingerprints.html" title="SimilaritySearchingFingerprints.html">Next</a></td><td width="34%" align="middle"><strong>March 29, 2015</strong></td><td width="33%" align="right"><strong>SimilarityMatricesFingerprints.pl</strong></td></tr> | |
| 1079 </table> | |
| 1080 </div> | |
| 1081 <br /> | |
| 1082 <center> | |
| 1083 <img src="../../images/h2o2.png"> | |
| 1084 </center> | |
| 1085 </body> | |
| 1086 </html> |
