| 0 | 1 <html> | 
|  | 2 <head> | 
|  | 3 <title>MayaChemTools:Documentation:MACCSKeysFingerprints.pl</title> | 
|  | 4 <meta http-equiv="content-type" content="text/html;charset=utf-8"> | 
|  | 5 <link rel="stylesheet" type="text/css" href="../../css/MayaChemTools.css"> | 
|  | 6 </head> | 
|  | 7 <body leftmargin="20" rightmargin="20" topmargin="10" bottommargin="10"> | 
|  | 8 <br/> | 
|  | 9 <center> | 
|  | 10 <a href="http://www.mayachemtools.org" title="MayaChemTools Home"><img src="../../images/MayaChemToolsLogo.gif" border="0" alt="MayaChemTools"></a> | 
|  | 11 </center> | 
|  | 12 <br/> | 
|  | 13 <div class="DocNav"> | 
|  | 14 <table width="100%" border=0 cellpadding=0 cellspacing=2> | 
|  | 15 <tr align="left" valign="top"><td width="33%" align="left"><a href="./JoinTextFiles.html" title="JoinTextFiles.html">Previous</a>  <a href="./index.html" title="Table of Contents">TOC</a>  <a href="./MergeTextFiles.html" title="MergeTextFiles.html">Next</a></td><td width="34%" align="middle"><strong>MACCSKeysFingerprints.pl</strong></td><td width="33%" align="right"><a href="././code/MACCSKeysFingerprints.html" title="View source code">Code</a> | <a href="./../pdf/MACCSKeysFingerprints.pdf" title="PDF US Letter Size">PDF</a> | <a href="./../pdfgreen/MACCSKeysFingerprints.pdf" title="PDF US Letter Size with narrow margins: www.changethemargins.com">PDFGreen</a> | <a href="./../pdfa4/MACCSKeysFingerprints.pdf" title="PDF A4 Size">PDFA4</a> | <a href="./../pdfa4green/MACCSKeysFingerprints.pdf" title="PDF A4 Size with narrow margins: www.changethemargins.com">PDFA4Green</a></td></tr> | 
|  | 16 </table> | 
|  | 17 </div> | 
|  | 18 <p> | 
|  | 19 </p> | 
|  | 20 <h2>NAME</h2> | 
|  | 21 <p>MACCSKeysFingerprints.pl - Generate MACCS key fingerprints for SD files</p> | 
|  | 22 <p> | 
|  | 23 </p> | 
|  | 24 <h2>SYNOPSIS</h2> | 
|  | 25 <p>MACCSKeysFingerprints.pl SDFile(s)...</p> | 
|  | 26 <p>MACCSKeysFingerprints.pl [<strong>--AromaticityModel</strong> <em>AromaticityModelType</em>] | 
|  | 27 [<strong>--BitsOrder</strong> <em>Ascending | Descending</em>] | 
|  | 28 [<strong>-b, --BitStringFormat</strong> <em>BinaryString | HexadecimalString</em>] | 
|  | 29 [<strong>--CompoundID</strong> <em>DataFieldName or LabelPrefixString</em>] [<strong>--CompoundIDLabel</strong> <em>text</em>] | 
|  | 30 [<strong>--CompoundIDMode</strong> <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em>] | 
|  | 31 [<strong>--DataFields</strong> <em>"FieldLabel1,FieldLabel2,..."</em>] [<strong>-d, --DataFieldsMode</strong> <em>All | Common | Specify | CompoundID</em>] | 
|  | 32 [<strong>-f, --Filter</strong> <em>Yes | No</em>] [<strong>--FingerprintsLabel</strong> <em>text</em>] [<strong>-h, --help</strong>] [<strong>-k, --KeepLargestComponent</strong> <em>Yes | No</em>] | 
|  | 33 [<strong>-m, --mode</strong> <em>MACCSKeyBits | MACCSKeyCount</em>] [<strong>--OutDelim</strong> <em>comma | tab | semicolon</em>] | 
|  | 34 [<strong>--output</strong> <em>SD | FP | text | all</em>] [<strong>-o, --overwrite</strong>] | 
|  | 35 [<strong>-q, --quote</strong> <em>Yes | No</em>] [<strong>-r, --root</strong> <em>RootName</em>] [<strong>-s, --size</strong> <em>number</em>] | 
|  | 36 [<strong>-v, --VectorStringFormat</strong> <em>IDsAndValuesString | IDsAndValuesPairsString | ValuesAndIDsString | ValuesAndIDsPairsString</em>] | 
|  | 37 [<strong>-w, --WorkingDir</strong> <em>DirName</em>]</p> | 
|  | 38 <p> | 
|  | 39 </p> | 
|  | 40 <h2>DESCRIPTION</h2> | 
|  | 41 <p>Generate MACCS (Molecular ACCess System) keys fingerprints [ Ref 45-47 ] for <em>SDFile(s)</em> | 
|  | 42 and create appropriate SD, FP or CSV/TSV text file(s) containing fingerprints bit-vector or | 
|  | 43 vector strings corresponding to molecular fingerprints.</p> | 
|  | 44 <p>Multiple SDFile names are separated by spaces. The valid file extensions are <em>.sdf</em> | 
|  | 45 and <em>.sd</em>. All other file names are ignored. All the SD files in the current directory | 
|  | 46 can be specified either by <em>*.sdf</em> or the current directory name.</p> | 
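<p>The file selection rule above can be sketched in Python as follows. This is an illustration only, not the script's actual code: the function name <em>select_sd_files</em> is hypothetical, and treating extensions case-insensitively and expanding a directory to the SD files directly inside it are assumptions.</p>

```python
from pathlib import Path

# Extensions named in the text above.
SD_EXTENSIONS = {".sdf", ".sd"}


def select_sd_files(names):
    """Keep names ending in .sdf or .sd; expand a directory to the SD files in it."""
    selected = []
    for name in names:
        path = Path(name)
        if path.is_dir():
            # Assumption: a directory name stands for every SD file directly inside it.
            selected.extend(
                str(p) for p in sorted(path.iterdir())
                if p.is_file() and p.suffix.lower() in SD_EXTENSIONS
            )
        elif path.suffix.lower() in SD_EXTENSIONS:
            selected.append(name)
        # Any other name is ignored, as described above.
    return selected


print(select_sd_files(["a.sdf", "b.sd", "notes.txt", "c.mol"]))
```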
|  | 47 <p>For each MACCS keys definition, atoms are processed to determine their membership in the key | 
|  | 48 and the appropriate molecular fingerprints strings are generated. An atom can belong to multiple | 
|  | 49 MACCS keys.</p> | 
|  | 50 <p>For the <em>MACCSKeyBits</em> value of the <strong>-m, --mode</strong> option, a fingerprint bit-vector string containing | 
|  | 51 zeros and ones is generated; for the <em>MACCSKeyCount</em> value, a fingerprint vector string | 
|  | 52 containing the count of each MACCS key [ Ref 45-47 ] is generated.</p> | 
|  | 53 <p><em>MACCSKeyBits | MACCSKeyCount</em> values for the <strong>-m, --mode</strong> option along with the two possible | 
|  | 54 <em>166 | 322</em> values of <strong>-s, --size</strong> support generation of four different types of MACCS | 
|  | 55 keys fingerprints: <em>MACCS166KeyBits, MACCS166KeyCount, MACCS322KeyBits, MACCS322KeyCount</em>.</p> | 
|  | 56 <p>Example of <em>SD</em> file containing MACCS keys fingerprints string data:</p> | 
|  | 57 <div class="OptionsBox"> | 
|  | 58     ... ... | 
|  | 59 <br/>    ... ... | 
|  | 60 <br/>    $$$$ | 
|  | 61 <br/>    ... ... | 
|  | 62 <br/>    ... ... | 
|  | 63 <br/>    ... ... | 
|  | 64 <br/>    41 44  0  0  0  0  0  0  0  0999 V2000 | 
|  | 65      -3.3652    1.4499    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0 | 
|  | 66 <br/>    ... ... | 
|  | 67 <br/>    2  3  1  0  0  0  0 | 
|  | 68 <br/>    ... ... | 
|  | 69 <br/>    M  END | 
|  | 70 <br/>    &gt;  &lt;CmpdID&gt; | 
|  | 71 <br/>    Cmpd1</div> | 
|  | 72 <div class="OptionsBox"> | 
|  | 73     &gt;  &lt;MACCSKeysFingerprints&gt; | 
|  | 74 <br/>    FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;000000000 | 
|  | 75 <br/>    00000000000000000000000000000000100100001001000000001001000000001110001 | 
|  | 76 <br/>    00101010111100011011000100110110000011011110100110111111111111011111111 | 
|  | 77 <br/>    11111111110111000</div> | 
|  | 78 <div class="OptionsBox"> | 
|  | 79     $$$$ | 
|  | 80 <br/>    ... ... | 
|  | 81 <br/>    ... ...</div> | 
|  | 82 <p>Example of <em>FP</em> file containing MACCS keys fingerprints string data:</p> | 
|  | 83 <div class="OptionsBox"> | 
|  | 84     # | 
|  | 85 <br/>    # Package = MayaChemTools 7.4 | 
|  | 86 <br/>    # Release Date = Oct 21, 2010 | 
|  | 87 <br/>    # | 
|  | 88 <br/>    # TimeStamp = Fri Mar 11 14:57:24 2011 | 
|  | 89 <br/>    # | 
|  | 90 <br/>    # FingerprintsStringType = FingerprintsBitVector | 
|  | 91 <br/>    # | 
|  | 92 <br/>    # Description = MACCSKeyBits | 
|  | 93 <br/>    # Size = 166 | 
|  | 94 <br/>    # BitStringFormat = BinaryString | 
|  | 95 <br/>    # BitsOrder = Ascending | 
|  | 96 <br/>    # | 
|  | 97 <br/>    Cmpd1 00000000000000000000000000000000000000000100100001001000000001... | 
|  | 98 <br/>    Cmpd2 00000000000000000000000010000000001000000010000000001000000000... | 
|  | 99 <br/>    ... ... | 
|  | 100 <br/>    ... ...</div> | 
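<p>The FP layout shown above (header lines of the form <em># Key = Value</em> followed by one compound per line) can be read with a few lines of Python. This is a minimal sketch based only on the example: the function name <em>parse_fp_text</em> is hypothetical, and it assumes compound IDs contain no spaces and that each record sits on a single line. The sample bit strings are shortened for illustration.</p>

```python
def parse_fp_text(text):
    """Split FP file text into a header dict and a list of (compound_id, data) pairs."""
    header = {}
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header lines look like "# Key = Value"; a bare "#" is a spacer.
            key, sep, value = line[1:].partition("=")
            if sep:
                header[key.strip()] = value.strip()
            continue
        # Data lines: compound ID, a space, then the fingerprint data.
        compound_id, _, data = line.partition(" ")
        records.append((compound_id, data.strip()))
    return header, records


sample = """#
# Package = MayaChemTools 7.4
# FingerprintsStringType = FingerprintsBitVector
# Description = MACCSKeyBits
# Size = 166
# BitStringFormat = BinaryString
# BitsOrder = Ascending
#
Cmpd1 0000000001001000
Cmpd2 0000100000100000
"""
header, records = parse_fp_text(sample)
print(header["Size"], records[0])
```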
|  | 101 <p>Example of CSV <em>Text</em> file containing MACCS keys fingerprints string data:</p> | 
|  | 102 <div class="OptionsBox"> | 
|  | 103     "CompoundID","MACCSKeysFingerprints" | 
|  | 104 <br/>    "Cmpd1","FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending; | 
|  | 105 <br/>    00000000000000000000000000000000000000000100100001001000000001001000000 | 
|  | 106 <br/>    00111000100101010111100011011000100110110000011011110100110111111111111 | 
|  | 107 <br/>    01111111111111111110111000" | 
|  | 108 <br/>    ... ... | 
|  | 109 <br/>    ... ...</div> | 
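<p>A CSV file like the one above can be loaded with Python's standard <em>csv</em> module. The sketch below assumes the column labels shown in the example (<em>CompoundID</em> and <em>MACCSKeysFingerprints</em>) and that each record occupies one line in the real file, with the example above being wrapped only for display. The function name <em>read_fingerprints_csv</em> is hypothetical and the bit string is shortened.</p>

```python
import csv
import io


def read_fingerprints_csv(handle, id_column="CompoundID",
                          fp_column="MACCSKeysFingerprints"):
    """Return a dict mapping compound ID to its fingerprints string."""
    reader = csv.DictReader(handle)
    return {row[id_column]: row[fp_column] for row in reader}


sample = io.StringIO(
    '"CompoundID","MACCSKeysFingerprints"\n'
    '"Cmpd1","FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;0100"\n'
)
fingerprints = read_fingerprints_csv(sample)
print(fingerprints["Cmpd1"])
```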
|  | 110 <p>The current release of MayaChemTools generates the following types of MACCS keys | 
|  | 111 fingerprints bit-vector and vector strings:</p> | 
|  | 112 <div class="OptionsBox"> | 
|  | 113     FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;00000000 | 
|  | 114 <br/>    0000000000000000000000000000000001001000010010000000010010000000011100 | 
|  | 115 <br/>    0100101010111100011011000100110110000011011110100110111111111111011111 | 
|  | 116 <br/>    11111111111110111000</div> | 
|  | 117 <div class="OptionsBox"> | 
|  | 118     FingerprintsBitVector;MACCSKeyBits;166;HexadecimalString;Ascending;000 | 
|  | 119 <br/>    000000021210210e845f8d8c60b79dffbffffd1</div> | 
|  | 120 <div class="OptionsBox"> | 
|  | 121     FingerprintsBitVector;MACCSKeyBits;322;BinaryString;Ascending;11101011 | 
|  | 122 <br/>    1110011111100101111111000111101100110000000000000011100010000000000000 | 
|  | 123 <br/>    0000000000000000000000000000000000000000000000101000000000000000000000 | 
|  | 124 <br/>    0000000000000000000000000000000000000000000000000000000000000000000000 | 
|  | 125 <br/>    0000000000000000000000000000000000000011000000000000000000000000000000 | 
|  | 126 <br/>    0000000000000000000000000000000000000000</div> | 
|  | 127 <div class="OptionsBox"> | 
|  | 128     FingerprintsBitVector;MACCSKeyBits;322;HexadecimalString;Ascending;7d7 | 
|  | 129 <br/>    e7af3edc000c1100000000000000500000000000000000000000000000000300000000 | 
|  | 130 <br/>    000000000</div> | 
|  | 131 <div class="OptionsBox"> | 
|  | 132     FingerprintsVector;MACCSKeyCount;166;OrderedNumericalValues;ValuesStri | 
|  | 133 <br/>    ng;0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 
|  | 134 <br/>    0 0 0 0 0 0 0 1 0 0 3 0 0 0 0 4 0 0 2 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0 | 
|  | 135 <br/>    0 0 0 0 1 1 8 0 0 0 1 0 0 1 0 1 0 1 0 3 1 3 1 0 0 0 1 2 0 11 1 0 0 0 | 
|  | 136 <br/>    5 0 0 1 2 0 1 1 0 0 0 0 0 1 1 0 1 1 1 1 0 4 0 0 1 1 0 4 6 1 1 1 2 1 1 | 
|  | 137 <br/>    3 5 2 2 0 5 3 5 1 1 2 5 1 2 1 2 4 8 3 5 5 2 2 0 3 5 4 1</div> | 
|  | 138 <div class="OptionsBox"> | 
|  | 139     FingerprintsVector;MACCSKeyCount;322;OrderedNumericalValues;ValuesStri | 
|  | 140 <br/>    ng;14 8 2 0 2 0 4 4 2 1 4 0 0 2 5 10 5 2 1 0 0 2 0 5 13 3 28 5 5 3 0 0 | 
|  | 141 <br/>    0 4 2 1 1 0 1 1 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 22 5 3 0 0 0 1 0 | 
|  | 142 <br/>    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 
|  | 143 <br/>    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11 0 2 0 0 0 0 0 0 0 0 0 | 
|  | 144 <br/>    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...</div> | 
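<p>Each string above consists of semicolon-delimited fields followed by the data. A small Python sketch for splitting them is shown here. The dictionary key names (<em>bit_string_format</em>, <em>values_type</em> and so on) are labels chosen for this illustration, not names used by MayaChemTools, and the field meanings are inferred from the examples.</p>

```python
def parse_fingerprint_string(fp_string):
    """Split a fingerprints string into named fields, following the examples above."""
    kind, description, size, field4, field5, data = fp_string.split(";", 5)
    parsed = {"type": kind, "description": description, "size": int(size)}
    if kind == "FingerprintsBitVector":
        parsed["bit_string_format"] = field4   # BinaryString or HexadecimalString
        parsed["bits_order"] = field5          # Ascending or Descending
        parsed["data"] = data
    elif kind == "FingerprintsVector":
        parsed["values_type"] = field4         # e.g. OrderedNumericalValues
        parsed["values_format"] = field5       # e.g. ValuesString
        parsed["values"] = [int(v) for v in data.split()]
    else:
        raise ValueError(f"unrecognized fingerprints string type: {kind}")
    return parsed


bits = parse_fingerprint_string(
    "FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;00100100")
counts = parse_fingerprint_string(
    "FingerprintsVector;MACCSKeyCount;166;OrderedNumericalValues;ValuesString;0 1 0 3")
print(bits["bits_order"], counts["values"])
```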
|  | 145 <p> | 
|  | 146 </p> | 
|  | 147 <h2>OPTIONS</h2> | 
|  | 148 <dl> | 
|  | 149 <dt><strong><strong>--AromaticityModel</strong> <em>MDLAromaticityModel | TriposAromaticityModel | MMFFAromaticityModel | ChemAxonBasicAromaticityModel | ChemAxonGeneralAromaticityModel | DaylightAromaticityModel | MayaChemToolsAromaticityModel</em></strong></dt> | 
|  | 150 <dd> | 
|  | 151 <p>Specify aromaticity model to use during detection of aromaticity. Possible values in the current | 
|  | 152 release are: <em>MDLAromaticityModel, TriposAromaticityModel, MMFFAromaticityModel, | 
|  | 153 ChemAxonBasicAromaticityModel, ChemAxonGeneralAromaticityModel, DaylightAromaticityModel | 
|  | 154 or MayaChemToolsAromaticityModel</em>. Default value: <em>MayaChemToolsAromaticityModel</em>.</p> | 
|  | 155 <p>The supported aromaticity model names along with model specific control parameters | 
|  | 156 are defined in <strong>AromaticityModelsData.csv</strong>, which is distributed with the current release | 
|  | 157 and is available under <strong>lib/data</strong> directory. <strong>Molecule.pm</strong> module retrieves data from | 
|  | 158 this file during class instantiation and makes it available to method <strong>DetectAromaticity</strong> | 
|  | 159 for detecting aromaticity corresponding to a specific model.</p> | 
|  | 160 </dd> | 
|  | 161 <dt><strong><strong>--BitsOrder</strong> <em>Ascending | Descending</em></strong></dt> | 
|  | 162 <dd> | 
|  | 163 <p>Bits order to use during generation of fingerprints bit-vector string for <em>MACCSKeyBits</em> value of | 
|  | 164 <strong>-m, --mode</strong> option. Possible values: <em>Ascending, Descending</em>. Default: <em>Ascending</em>.</p> | 
|  | 165 <p><em>Ascending</em> bit order corresponds to the first bit in each byte being the lowest bit, as | 
|  | 166 opposed to the highest bit.</p> | 
|  | 167 <p>Internally, bits are stored in <em>Ascending</em> order using the Perl vec function. Regardless | 
|  | 168 of machine order, big-endian or little-endian, vec function always considers first | 
|  | 169 string byte as the lowest byte and first bit within each byte as the lowest bit.</p> | 
|  | 170 </dd> | 
|  | 171 <dt><strong><strong>-b, --BitStringFormat</strong> <em>BinaryString | HexadecimalString</em></strong></dt> | 
|  | 172 <dd> | 
|  | 173 <p>Format of fingerprints bit-vector string data in output SD, FP or CSV/TSV text file(s) specified by | 
|  | 174 <strong>--output</strong> used during <em>MACCSKeyBits</em> value of <strong>-m, --mode</strong> option. Possible | 
|  | 175 values: <em>BinaryString, HexadecimalString</em>. Default value: <em>BinaryString</em>.</p> | 
|  | 176 <p><em>BinaryString</em> corresponds to an ASCII string containing 1s and 0s. <em>HexadecimalString</em> | 
|  | 177 contains bit values in ASCII hexadecimal format.</p> | 
|  | 178 <p>Examples:</p> | 
|  | 179 <div class="OptionsBox"> | 
|  | 180     FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;00000000 | 
|  | 181 <br/>    0000000000000000000000000000000001001000010010000000010010000000011100 | 
|  | 182 <br/>    0100101010111100011011000100110110000011011110100110111111111111011111 | 
|  | 183 <br/>    11111111111110111000</div> | 
|  | 184 <div class="OptionsBox"> | 
|  | 185     FingerprintsBitVector;MACCSKeyBits;166;HexadecimalString;Ascending;000 | 
|  | 186 <br/>    000000021210210e845f8d8c60b79dffbffffd1</div> | 
|  | 187 <div class="OptionsBox"> | 
|  | 188     FingerprintsBitVector;MACCSKeyBits;322;BinaryString;Ascending;11101011 | 
|  | 189 <br/>    1110011111100101111111000111101100110000000000000011100010000000000000 | 
|  | 190 <br/>    0000000000000000000000000000000000000000000000101000000000000000000000 | 
|  | 191 <br/>    0000000000000000000000000000000000000000000000000000000000000000000000 | 
|  | 192 <br/>    0000000000000000000000000000000000000011000000000000000000000000000000 | 
|  | 193 <br/>    0000000000000000000000000000000000000000</div> | 
|  | 194 <div class="OptionsBox"> | 
|  | 195     FingerprintsBitVector;MACCSKeyBits;322;HexadecimalString;Ascending;7d7 | 
|  | 196 <br/>    e7af3edc000c1100000000000000500000000000000000000000000000000300000000 | 
|  | 197 <br/>    000000000</div> | 
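<p>The relationship between the <em>BinaryString</em> and <em>HexadecimalString</em> examples above, both in <em>Ascending</em> order, can be reproduced with the following Python sketch. The convention used here (each hexadecimal digit covers four consecutive bits, with the first bit as the least significant) is inferred from the examples in this document rather than taken from the source code, and the function name is hypothetical.</p>

```python
def binary_to_hex_ascending(bits):
    """Convert an ascending-order binary string to the hexadecimal form seen above."""
    if len(bits) % 4:
        raise ValueError("bit string length must be a multiple of 4")
    digits = []
    for start in range(0, len(bits), 4):
        nibble = bits[start:start + 4]
        # The first bit of each group of four is the least significant one.
        value = sum(1 << i for i, bit in enumerate(nibble) if bit == "1")
        digits.append(format(value, "x"))
    return "".join(digits)


# First 56 bits of the 166-key BinaryString example above.
prefix_166 = "0" * 41 + "100100001001000"
print(binary_to_hex_ascending(prefix_166))

# First 12 bits of the 322-key BinaryString example above.
print(binary_to_hex_ascending("111010111110"))
```

<p>The two results match the leading digits of the corresponding <em>HexadecimalString</em> examples.</p>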
|  | 198 </dd> | 
|  | 199 <dt><strong><strong>--CompoundID</strong> <em>DataFieldName or LabelPrefixString</em></strong></dt> | 
|  | 200 <dd> | 
|  | 201 <p>This value is <strong>--CompoundIDMode</strong> specific and indicates how compound ID is generated.</p> | 
|  | 202 <p>For <em>DataField</em> value of <strong>--CompoundIDMode</strong> option, it corresponds to datafield label name | 
|  | 203 whose value is used as compound ID; otherwise, it's a prefix string used for generating compound | 
|  | 204 IDs like LabelPrefixString&lt;Number&gt;. Default value, <em>Cmpd</em>, generates compound IDs which | 
|  | 205 look like Cmpd&lt;Number&gt;.</p> | 
|  | 206 <p>Examples for <em>DataField</em> value of <strong>--CompoundIDMode</strong>:</p> | 
|  | 207 <div class="OptionsBox"> | 
|  | 208     MolID | 
|  | 209 <br/>    ExtReg</div> | 
|  | 210 <p>Examples for <em>LabelPrefix</em> or <em>MolNameOrLabelPrefix</em> value of <strong>--CompoundIDMode</strong>:</p> | 
|  | 211 <div class="OptionsBox"> | 
|  | 212     Compound</div> | 
|  | 213 <p>The value specified above generates compound IDs which correspond to Compound&lt;Number&gt; | 
|  | 214 instead of default value of Cmpd&lt;Number&gt;.</p> | 
|  | 215 </dd> | 
|  | 216 <dt><strong><strong>--CompoundIDLabel</strong> <em>text</em></strong></dt> | 
|  | 217 <dd> | 
|  | 218 <p>Specify compound ID column label for FP or CSV/TSV text file(s) used during <em>CompoundID</em> value | 
|  | 219 of <strong>--DataFieldsMode</strong> option. Default: <em>CompoundID</em>.</p> | 
|  | 220 </dd> | 
|  | 221 <dt><strong><strong>--CompoundIDMode</strong> <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em></strong></dt> | 
|  | 222 <dd> | 
|  | 223 <p>Specify how to generate compound IDs and write to FP or CSV/TSV text file(s) along with generated | 
|  | 224 fingerprints for <em>FP | text | all</em> values of <strong>--output</strong> option: use an <em>SDFile(s)</em> data field value; | 
|  | 225 use the molname line from <em>SDFile(s)</em>; generate a sequential ID with a specific prefix; or use a combination | 
|  | 226 of MolName and LabelPrefix, with LabelPrefix values used for empty molname lines.</p> | 
|  | 227 <p>Possible values: <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em>. | 
|  | 228 Default: <em>LabelPrefix</em>.</p> | 
|  | 229 <p>For <em>MolNameOrLabelPrefix</em> value of <strong>--CompoundIDMode</strong>, molname line in <em>SDFile(s)</em> takes | 
|  | 230 precedence over sequential compound IDs generated using <em>LabelPrefix</em> and only empty molname | 
|  | 231 values are replaced with sequential compound IDs.</p> | 
|  | 232 <p>This is only used for <em>CompoundID</em> value of <strong>--DataFieldsMode</strong> option.</p> | 
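<p>The ID generation rules for the <em>MolName</em>, <em>LabelPrefix</em> and <em>MolNameOrLabelPrefix</em> values can be sketched in Python as below. This is an illustration, not the script's implementation: the function name is hypothetical, the <em>DataField</em> value is left out, and numbering sequential IDs by the 1-based position of the compound in the file is an assumption.</p>

```python
def assign_compound_ids(molnames, mode="LabelPrefix", prefix="Cmpd"):
    """Return one compound ID per molname line according to the chosen mode."""
    ids = []
    for number, molname in enumerate(molnames, start=1):
        name = (molname or "").strip()
        # Assumption: the sequential number is the compound's position in the file.
        sequential = f"{prefix}{number}"
        if mode == "MolName":
            ids.append(name)
        elif mode == "LabelPrefix":
            ids.append(sequential)
        elif mode == "MolNameOrLabelPrefix":
            # The molname line takes precedence; only empty names are replaced.
            ids.append(name if name else sequential)
        else:
            raise ValueError(f"unsupported mode in this sketch: {mode}")
    return ids


print(assign_compound_ids(["Aspirin", "", "Caffeine"], mode="MolNameOrLabelPrefix"))
print(assign_compound_ids(["Aspirin", "", "Caffeine"], prefix="Compound"))
```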
|  | 233 </dd> | 
|  | 234 <dt><strong><strong>--DataFields</strong> <em>"FieldLabel1,FieldLabel2,..."</em></strong></dt> | 
|  | 235 <dd> | 
|  | 236 <p>Comma delimited list of <em>SDFile(s)</em> data fields to extract and write to CSV/TSV text file(s) along | 
|  | 237 with generated fingerprints for <em>text | all</em> values of <strong>--output</strong> option.</p> | 
|  | 238 <p>This is only used for <em>Specify</em> value of <strong>--DataFieldsMode</strong> option.</p> | 
|  | 239 <p>Examples:</p> | 
|  | 240 <div class="OptionsBox"> | 
|  | 241     Extreg | 
|  | 242 <br/>    MolID,CompoundName</div> | 
|  | 243 </dd> | 
|  | 244 <dt><strong><strong>-d, --DataFieldsMode</strong> <em>All | Common | Specify | CompoundID</em></strong></dt> | 
|  | 245 <dd> | 
|  | 246 <p>Specify how data fields in <em>SDFile(s)</em> are transferred to output CSV/TSV text file(s) along | 
|  | 247 with generated fingerprints for <em>text | all</em> values of <strong>--output</strong> option: transfer all SD | 
|  | 248 data fields; transfer SD data fields common to all compounds; extract specified data fields; | 
|  | 249 generate a compound ID using molname line, a compound prefix, or a combination of both. | 
|  | 250 Possible values: <em>All | Common | Specify | CompoundID</em>. Default value: <em>CompoundID</em>.</p> | 
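<p>The difference between the <em>All</em>, <em>Common</em> and <em>Specify</em> values can be illustrated with a short Python sketch in which each compound's data fields are held in a dictionary. The function name and the choice to keep labels in first-seen order are assumptions made for this example.</p>

```python
def select_data_fields(records, mode="All", specified=None):
    """Pick the data field labels to transfer from a list of per-compound dicts."""
    if mode == "All":
        labels = []
        for record in records:
            for label in record:
                if label not in labels:
                    labels.append(label)  # keep first-seen order
        return labels
    if mode == "Common":
        if not records:
            return []
        common = set(records[0])
        for record in records[1:]:
            common &= set(record)  # keep only labels present in every compound
        return [label for label in records[0] if label in common]
    if mode == "Specify":
        return list(specified or [])
    raise ValueError(f"unsupported mode in this sketch: {mode}")


compounds = [
    {"MolID": "1", "CompoundName": "Aspirin", "Extreg": "E1"},
    {"MolID": "2", "CompoundName": "Caffeine"},
]
print(select_data_fields(compounds, "All"))
print(select_data_fields(compounds, "Common"))
```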
|  | 251 </dd> | 
|  | 252 <dt><strong><strong>-f, --Filter</strong> <em>Yes | No</em></strong></dt> | 
|  | 253 <dd> | 
|  | 254 <p>Specify whether to check and filter compound data in SDFile(s). Possible values: <em>Yes or No</em>. | 
|  | 255 Default value: <em>Yes</em>.</p> | 
|  | 256 <p>By default, compound data is checked before calculating fingerprints and compounds containing | 
|  | 257 atom data corresponding to non-element symbols or no atom data are ignored.</p> | 
|  | 258 </dd> | 
|  | 259 <dt><strong><strong>--FingerprintsLabel</strong> <em>text</em></strong></dt> | 
|  | 260 <dd> | 
|  | 261 <p>SD data label or text file column label to use for fingerprints string in output SD or | 
|  | 262 CSV/TSV text file(s) specified by <strong>--output</strong>. Default value: <em>MACCSKeyFingerprints</em>.</p> | 
|  | 263 </dd> | 
|  | 264 <dt><strong><strong>-h, --help</strong></strong></dt> | 
|  | 265 <dd> | 
|  | 266 <p>Print this help message.</p> | 
|  | 267 </dd> | 
|  | 268 <dt><strong><strong>-k, --KeepLargestComponent</strong> <em>Yes | No</em></strong></dt> | 
|  | 269 <dd> | 
|  | 270 <p>Generate fingerprints for only the largest component in a molecule. Possible values: | 
|  | 271 <em>Yes or No</em>. Default value: <em>Yes</em>.</p> | 
|  | 272 <p>For molecules containing multiple connected components, fingerprints can be generated | 
|  | 273 in two different ways: use all connected components or just the largest connected | 
|  | 274 component. By default, all atoms except those in the largest connected component are | 
|  | 275 deleted before generation of fingerprints.</p> | 
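<p>Finding the largest connected component is a standard graph traversal. The Python sketch below shows one way to do it on a list of atom IDs and bonds. It is not the MayaChemTools implementation: the function name is hypothetical, and measuring size by atom count and keeping the first component found in case of a tie are assumptions.</p>

```python
from collections import defaultdict


def largest_component(atom_ids, bonds):
    """Return the sorted atom IDs of the largest connected component."""
    neighbors = defaultdict(set)
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    best = []
    for start in atom_ids:
        if start in seen:
            continue
        # Depth-first walk over everything reachable from this atom.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            atom = stack.pop()
            component.append(atom)
            for nxt in neighbors[atom]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        # Assumption: size is the atom count; ties keep the first component found.
        if len(component) > len(best):
            best = component
    return sorted(best)


# A three-atom fragment plus a two-atom counter-ion-like fragment.
print(largest_component([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)]))
```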
|  | 276 </dd> | 
|  | 277 <dt><strong><strong>-m, --mode</strong> <em>MACCSKeyBits | MACCSKeyCount</em></strong></dt> | 
|  | 278 <dd> | 
|  | 279 <p>Specify type of MACCS keys [ Ref 45-47 ] fingerprints to generate for molecules in <em>SDFile(s)</em>. | 
|  | 280 Possible values: <em>MACCSKeyBits, MACCSKeyCount</em>. Default value: <em>MACCSKeyBits</em>.</p> | 
|  | 281 <p>For the <em>MACCSKeyBits</em> value of the <strong>-m, --mode</strong> option, a fingerprint bit-vector string containing | 
|  | 282 zeros and ones is generated; for the <em>MACCSKeyCount</em> value, a fingerprint vector string | 
|  | 283 containing the count of each MACCS key is generated.</p> | 
|  | 284 <p><em>MACCSKeyBits | MACCSKeyCount</em> values for the <strong>-m, --mode</strong> option along with the two possible | 
|  | 285 <em>166 | 322</em> values of <strong>-s, --size</strong> support generation of four different types of MACCS | 
|  | 286 keys fingerprints: <em>MACCS166KeyBits, MACCS166KeyCount, MACCS322KeyBits, MACCS322KeyCount</em>.</p> | 
|  | 287 <p>Definition of MACCS keys uses the following atom and bond symbols to define atom and | 
|  | 288 bond environments:</p> | 
|  | 289 <div class="OptionsBox"> | 
|  | 290     Atom symbols for 166 keys [ Ref 47 ]:</div> | 
|  | 291 <div class="OptionsBox"> | 
|  | 292     A  : Any valid periodic table element symbol | 
|  | 293 <br/>    Q  : Hetero atoms; any non-C or non-H atom | 
|  | 294 <br/>    X  : Halogens; F, Cl, Br, I | 
|  | 295 <br/>    Z  : Others; other than H, C, N, O, Si, P, S, F, Cl, Br, I</div> | 
|  | 296 <div class="OptionsBox"> | 
|  | 297     Atom symbols for 322 keys [ Ref 46 ]:</div> | 
|  | 298 <div class="OptionsBox"> | 
|  | 299     A  : Any valid periodic table element symbol | 
|  | 300 <br/>    Q  : Hetero atoms; any non-C or non-H atom | 
|  | 301 <br/>    X  : Others; other than H, C, N, O, Si, P, S, F, Cl, Br, I | 
|  | 302 <br/>    Z is neither defined nor used</div> | 
|  | 303 <div class="OptionsBox"> | 
|  | 304     Bond types:</div> | 
|  | 305 <div class="OptionsBox"> | 
|  | 306     -  : Single | 
|  | 307 <br/>    =  : Double | 
|  | 308 <br/>    T  : Triple | 
|  | 309 <br/>    #  : Triple | 
|  | 310 <br/>    ~  : Single or double query bond | 
|  | 311 <br/>    %  : An aromatic query bond</div> | 
|  | 312 <div class="OptionsBox"> | 
|  | 313     None : Any bond type; no explicit bond specified</div> | 
|  | 314 <div class="OptionsBox"> | 
|  | 315     $  : Ring bond; $ before a bond type specifies ring bond | 
|  | 316 <br/>    !  : Chain or non-ring bond; ! before a bond type specifies chain bond</div> | 
|  | 317 <div class="OptionsBox"> | 
|  | 318     @  : A ring linkage and the number following it specifies the | 
|  | 319          atoms position in the line, thus @1 means linked back to the first | 
|  | 320          atom in the list.</div> | 
|  | 321 <div class="OptionsBox"> | 
|  | 322     Aromatic: Kekule or Arom5</div> | 
|  | 323 <div class="OptionsBox"> | 
|  | 324     Kekule: Bonds in 6-membered rings with alternate single/double bonds | 
|  | 325             or perimeter bonds | 
|  | 326 <br/>    Arom5:  Bonds in 5-membered rings with two double bonds and a hetero | 
|  | 327             atom at the apex of the ring.</div> | 
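<p>The atom symbols listed above for the 166 keys can be expressed as simple set membership tests. The Python sketch below returns every class symbol that applies to an element. The function name is hypothetical, and the input is assumed to be a valid element symbol, since no periodic table lookup is performed.</p>

```python
HALOGENS = {"F", "Cl", "Br", "I"}
# Elements excluded from the "Z : Others" class in the 166-key definitions.
NOT_OTHER = {"H", "C", "N", "O", "Si", "P", "S", "F", "Cl", "Br", "I"}


def atom_classes_166(symbol):
    """Return the set of 166-key atom class symbols matching an element symbol."""
    classes = {"A"}  # A: any valid element symbol (validity is assumed here)
    if symbol not in {"C", "H"}:
        classes.add("Q")  # Q: hetero atom, any non-C or non-H atom
    if symbol in HALOGENS:
        classes.add("X")  # X: halogen
    if symbol not in NOT_OTHER:
        classes.add("Z")  # Z: others
    return classes


for element in ("C", "N", "Cl", "Fe"):
    print(element, sorted(atom_classes_166(element)))
```

<p>For the 322 keys, the same test used here for Z would apply to the symbol X, following the definitions listed above.</p>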
|  | 328 <p>MACCS 166 keys [ Ref 45-47 ] are defined as follows:</p> | 
|  | 329 <div class="OptionsBox"> | 
|  | 330     Key Description</div> | 
|  | 331 <div class="OptionsBox"> | 
|  | 332     1   ISOTOPE | 
|  | 333 <br/>    2   103 < ATOMIC NO. < 256 | 
|  | 334 <br/>    3   GROUP IVA,VA,VIA PERIODS 4-6 (Ge...) | 
|  | 335 <br/>    4   ACTINIDE | 
|  | 336 <br/>    5   GROUP IIIB,IVB (Sc...) | 
|  | 337 <br/>    6   LANTHANIDE | 
|  | 338 <br/>    7   GROUP VB,VIB,VIIB (V...) | 
|  | 339 <br/>    8   QAAA@1 | 
|  | 340 <br/>    9   GROUP VIII (Fe...) | 
|  | 341 <br/>    10  GROUP IIA (ALKALINE EARTH) | 
|  | 342 <br/>    11  4M RING | 
|  | 343 <br/>    12  GROUP IB,IIB (Cu...) | 
|  | 344 <br/>    13  ON(C)C | 
|  | 345 <br/>    14  S-S | 
|  | 346 <br/>    15  OC(O)O | 
|  | 347 <br/>    16  QAA@1 | 
|  | 348 <br/>    17  CTC | 
|  | 349 <br/>    18  GROUP IIIA (B...) | 
|  | 350 <br/>    19  7M RING | 
|  | 351 <br/>    20  SI | 
|  | 352 <br/>    21  C=C(Q)Q | 
|  | 353 <br/>    22  3M RING | 
|  | 354 <br/>    23  NC(O)O | 
|  | 355 <br/>    24  N-O | 
|  | 356 <br/>    25  NC(N)N | 
|  | 357 <br/>    26  C$=C($A)$A | 
|  | 358 <br/>    27  I | 
|  | 359 <br/>    28  QCH2Q | 
|  | 360 <br/>    29  P | 
|  | 361 <br/>    30  CQ(C)(C)A | 
|  | 362 <br/>    31  QX | 
|  | 363 <br/>    32  CSN | 
|  | 364 <br/>    33  NS | 
|  | 365 <br/>    34  CH2=A | 
|  | 366 <br/>    35  GROUP IA (ALKALI METAL) | 
|  | 367 <br/>    36  S HETEROCYCLE | 
|  | 368 <br/>    37  NC(O)N | 
|  | 369 <br/>    38  NC(C)N | 
|  | 370 <br/>    39  OS(O)O | 
|  | 371 <br/>    40  S-O | 
|  | 372 <br/>    41  CTN | 
|  | 373 <br/>    42  F | 
|  | 374 <br/>    43  QHAQH | 
|  | 375 <br/>    44  OTHER | 
|  | 376 <br/>    45  C=CN | 
|  | 377 <br/>    46  BR | 
|  | 378 <br/>    47  SAN | 
|  | 379 <br/>    48  OQ(O)O | 
|  | 380 <br/>    49  CHARGE | 
|  | 381 <br/>    50  C=C(C)C | 
|  | 382 <br/>    51  CSO | 
|  | 383 <br/>    52  NN | 
|  | 384 <br/>    53  QHAAAQH | 
|  | 385 <br/>    54  QHAAQH | 
|  | 386 <br/>    55  OSO | 
|  | 387 <br/>    56  ON(O)C | 
|  | 388 <br/>    57  O HETEROCYCLE | 
|  | 389 <br/>    58  QSQ | 
|  | 390 <br/>    59  Snot%A%A | 
|  | 391 <br/>    60  S=O | 
|  | 392 <br/>    61  AS(A)A | 
|  | 393 <br/>    62  A$A!A$A | 
|  | 394 <br/>    63  N=O | 
|  | 395 <br/>    64  A$A!S | 
|  | 396 <br/>    65  C%N | 
|  | 397 <br/>    66  CC(C)(C)A | 
|  | 398 <br/>    67  QS | 
|  | 399 <br/>    68  QHQH (&...) | 
|  | 400 <br/>    69  QQH | 
|  | 401 <br/>    70  QNQ | 
|  | 402 <br/>    71  NO | 
|  | 403 <br/>    72  OAAO | 
|  | 404 <br/>    73  S=A | 
|  | 405 <br/>    74  CH3ACH3 | 
|  | 406 <br/>    75  A!N$A | 
|  | 407 <br/>    76  C=C(A)A | 
|  | 408 <br/>    77  NAN | 
|  | 409 <br/>    78  C=N | 
|  | 410 <br/>    79  NAAN | 
|  | 411 <br/>    80  NAAAN | 
|  | 412 <br/>    81  SA(A)A | 
|  | 413 <br/>    82  ACH2QH | 
|  | 414 <br/>    83  QAAAA@1 | 
|  | 415 <br/>    84  NH2 | 
|  | 416 <br/>    85  CN(C)C | 
|  | 417 <br/>    86  CH2QCH2 | 
|  | 418 <br/>    87  X!A$A | 
|  | 419 <br/>    88  S | 
|  | 420 <br/>    89  OAAAO | 
|  | 421 <br/>    90  QHAACH2A | 
|  | 422 <br/>    91  QHAAACH2A | 
|  | 423 <br/>    92  OC(N)C | 
|  | 424 <br/>    93  QCH3 | 
|  | 425 <br/>    94  QN | 
|  | 426 <br/>    95  NAAO | 
|  | 427 <br/>    96  5M RING | 
|  | 428 <br/>    97  NAAAO | 
|  | 429 <br/>    98  QAAAAA@1 | 
|  | 430 <br/>    99  C=C | 
|  | 431 <br/>    100 ACH2N | 
|  | 432 <br/>    101 8M RING | 
|  | 433 <br/>    102 QO | 
|  | 434 <br/>    103 CL | 
|  | 435 <br/>    104 QHACH2A | 
|  | 436 <br/>    105 A$A($A)$A | 
|  | 437 <br/>    106 QA(Q)Q | 
|  | 438 <br/>    107 XA(A)A | 
|  | 439 <br/>    108 CH3AAACH2A | 
|  | 440 <br/>    109 ACH2O | 
|  | 441 <br/>    110 NCO | 
|  | 442 <br/>    111 NACH2A | 
|  | 443 <br/>    112 AA(A)(A)A | 
|  | 444 <br/>    113 Onot%A%A | 
|  | 445 <br/>    114 CH3CH2A | 
|  | 446 <br/>    115 CH3ACH2A | 
|  | 447 <br/>    116 CH3AACH2A | 
|  | 448 <br/>    117 NAO | 
|  | 449 <br/>    118 ACH2CH2A > 1 | 
|  | 450 <br/>    119 N=A | 
|  | 451 <br/>    120 HETEROCYCLIC ATOM > 1 (&...) | 
|  | 452 <br/>    121 N HETEROCYCLE | 
|  | 453 <br/>    122 AN(A)A | 
|  | 454 <br/>    123 OCO | 
|  | 455 <br/>    124 QQ | 
|  | 456 <br/>    125 AROMATIC RING > 1 | 
|  | 457 <br/>    126 A!O!A | 
|  | 458 <br/>    127 A$A!O > 1 (&...) | 
|  | 459 <br/>    128 ACH2AAACH2A | 
|  | 460 <br/>    129 ACH2AACH2A | 
|  | 461 <br/>    130 QQ > 1 (&...) | 
|  | 462 <br/>    131 QH > 1 | 
|  | 463 <br/>    132 OACH2A | 
|  | 464 <br/>    133 A$A!N | 
|  | 465 <br/>    134 X (HALOGEN) | 
|  | 466 <br/>    135 Nnot%A%A | 
|  | 467 <br/>    136 O=A > 1 | 
|  | 468 <br/>    137 HETEROCYCLE | 
|  | 469 <br/>    138 QCH2A > 1 (&...) | 
|  | 470 <br/>    139 OH | 
|  | 471 <br/>    140 O > 3 (&...) | 
|  | 472 <br/>    141 CH3 > 2 (&...) | 
|  | 473 <br/>    142 N > 1 | 
|  | 474 <br/>    143 A$A!O | 
|  | 475 <br/>    144 Anot%A%Anot%A | 
|  | 476 <br/>    145 6M RING > 1 | 
|  | 477 <br/>    146 O > 2 | 
|  | 478 <br/>    147 ACH2CH2A | 
|  | 479 <br/>    148 AQ(A)A | 
|  | 480 <br/>    149 CH3 > 1 | 
|  | 481 <br/>    150 A!A$A!A | 
|  | 482 <br/>    151 NH | 
|  | 483 <br/>    152 OC(C)C | 
|  | 484 <br/>    153 QCH2A | 
|  | 485 <br/>    154 C=O | 
|  | 486 <br/>    155 A!CH2!A | 
|  | 487 <br/>    156 NA(A)A | 
|  | 488 <br/>    157 C-O | 
|  | 489 <br/>    158 C-N | 
|  | 490 <br/>    159 O > 1 | 
|  | 491 <br/>    160 CH3 | 
|  | 492 <br/>    161 N | 
|  | 493 <br/>    162 AROMATIC | 
|  | 494 <br/>    163 6M RING | 
|  | 495 <br/>    164 O | 
|  | 496 <br/>    165 RING | 
|  | 497 <br/>    166 FRAGMENTS</div> | 
|  | 498 <p>MACCS 322 keys set as defined in tables 1, 2 and 3 [ Ref 46 ] include:</p> | 
|  | 499 <div class="OptionsBox"> | 
|  | 500     . 26 atom properties of type P, as listed in Table 1 | 
|  | 501 <br/>    . 32 one-atom environments, as listed in Table 3 | 
|  | 502 <br/>    . 264 atom-bond-atom combinations listed in Table 4</div> | 
|  | 503 <p>Total number of keys in the three tables: 322</p> | 
|  | 504 <p>Atom symbol, X, used for 322 keys [ Ref 46 ] doesn't refer to Halogens as it does for 166 keys. In | 
|  | 505 order to keep the definition of 322 keys consistent with the published definitions, the symbol X is | 
|  | 506 used to imply "other" atoms, but it's internally mapped to symbol Z as defined for 166 keys | 
|  | 507 during the generation of key values.</p> | 
|  | 508 <p>Atom properties-based keys (26):</p> | 
|  | 509 <div class="OptionsBox"> | 
|  | 510     Key   Description | 
|  | 511 <br/>    1     A(AAA) or AA(A)A - atom with at least three neighbors | 
|  | 512 <br/>    2     Q - heteroatom | 
|  | 513 <br/>    3     Anot%not-A - atom involved in one or more multiple bonds, not aromatic | 
|  | 514 <br/>    4     A(AAAA) or AA(A)(A)A - atom with at least four neighbors | 
|  | 515 <br/>    5     A(QQ) or QA(Q) - atom with at least two heteroatom neighbors | 
|  | 516 <br/>    6     A(QQQ) or QA(Q)Q - atom with at least three heteroatom neighbors | 
|  | 517 <br/>    7     QH - heteroatom with at least one hydrogen attached | 
|  | 518 <br/>    8     CH2(AA) or ACH2A - carbon with at least two single bonds and at least | 
|  | 519           two hydrogens attached | 
|  | 520 <br/>    9     CH3(A) or ACH3 - carbon with at least one single bond and at least three | 
|  | 521           hydrogens attached | 
|  | 522 <br/>    10    Halogen | 
|  | 523 <br/>    11    A(-A-A-A) or A-A(-A)-A - atom has at least three single bonds | 
|  | 524 <br/>    12    AAAAAA@1 > 2 - atom is in at least two different six-membered rings | 
|  | 525 <br/>    13    A($A$A$A) or A$A($A)$A - atom has more than two ring bonds | 
|  | 526 <br/>    14    A$A!A$A - atom is at a ring/chain boundary. When a comparison is done | 
|  | 527           with another atom the path passes through the chain bond. | 
|  | 528 <br/>    15    Anot%A%Anot%A - atom is at an aromatic/nonaromatic boundary. When a | 
|  | 529           comparison is done with another atom the path | 
|  | 530           passes through the aromatic bond. | 
|  | 531 <br/>    16    A!A!A  - atom with more than one chain bond | 
|  | 532 <br/>    17    A!A$A!A - atom is at a ring/chain boundary. When a comparison is done | 
|  | 533           with another atom the path passes through the ring bond. | 
|  | 534 <br/>    18    A%Anot%A%A - atom is at an aromatic/nonaromatic boundary. When a | 
|  | 535           comparison is done with another atom the | 
|  | 536           path passes through the nonaromatic bond. | 
|  | 537 <br/>    19    HETEROCYCLE - atom is a heteroatom in a ring. | 
|  | 538 <br/>    20    rare properties: atom with five or more neighbors, atom in | 
|  | 539           four or more rings, or atom types other than | 
|  | 540           H, C, N, O, S, F, Cl, Br, or I | 
|  | 541 <br/>    21    rare properties: atom has a charge, is an isotope, has two or | 
|  | 542           more multiple bonds, or has a triple bond. | 
|  | 543 <br/>    22    N - nitrogen | 
|  | 544 <br/>    23    S - sulfur | 
|  | 545 <br/>    24    O - oxygen | 
|  | 546 <br/>    25    A(AA)A(A)A(AA) - atom has two neighbors, each with three or | 
|  | 547           more neighbors (including the central atom). | 
|  | 548 <br/>    26    CHACH2 - atom has two hydrocarbon (CH2) neighbors</div> | 
|  | 549 <p>Atomic environments properties-based keys (32):</p> | 
|  | 550 <div class="OptionsBox"> | 
|  | 551     Key   Description | 
|  | 552 <br/>    27    C(CC) | 
|  | 553 <br/>    28    C(CCC) | 
|  | 554 <br/>    29    C(CN) | 
|  | 555 <br/>    30    C(CCN) | 
|  | 556 <br/>    31    C(NN) | 
|  | 557 <br/>    32    C(NNC) | 
|  | 558 <br/>    33    C(NNN) | 
|  | 559 <br/>    34    C(CO) | 
|  | 560 <br/>    35    C(CCO) | 
|  | 561 <br/>    36    C(NO) | 
|  | 562 <br/>    37    C(NCO) | 
|  | 563 <br/>    38    C(NNO) | 
|  | 564 <br/>    39    C(OO) | 
|  | 565 <br/>    40    C(COO) | 
|  | 566 <br/>    41    C(NOO) | 
|  | 567 <br/>    42    C(OOO) | 
|  | 568 <br/>    43    Q(CC) | 
|  | 569 <br/>    44    Q(CCC) | 
|  | 570 <br/>    45    Q(CN) | 
|  | 571 <br/>    46    Q(CCN) | 
|  | 572 <br/>    47    Q(NN) | 
|  | 573 <br/>    48    Q(CNN) | 
|  | 574 <br/>    49    Q(NNN) | 
|  | 575 <br/>    50    Q(CO) | 
|  | 576 <br/>    51    Q(CCO) | 
|  | 577 <br/>    52    Q(NO) | 
|  | 578 <br/>    53    Q(CNO) | 
|  | 579 <br/>    54    Q(NNO) | 
|  | 580 <br/>    55    Q(OO) | 
|  | 581 <br/>    56    Q(COO) | 
|  | 582 <br/>    57    Q(NOO) | 
|  | 583 <br/>    58    Q(OOO)</div> | 
|  | 584 <p>Note: The first symbol is the central atom, with atoms bonded to the central atom listed in | 
|  | 585 parentheses. Q is any non-C, non-H atom. If only two atoms are in parentheses, there is | 
|  | 586 no implication concerning the other atoms bonded to the central atom.</p> | 
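<p>The notation in the note above can be read as a small matching rule. The following Python sketch is only an illustration of that reading, not the script's implementation: the function names are hypothetical, atoms are represented as plain element symbols, and extra neighbors are allowed for every key, which the note states explicitly only for the two-atom case.</p>

```python
from collections import Counter


def parse_environment_key(key):
    """Split a key such as 'C(CCN)' into ('C', ['C', 'C', 'N'])."""
    central, _, rest = key.partition("(")
    # Neighbor symbols in keys 27-58 are single letters (C, N, O)
    return central, list(rest.rstrip(")"))


def symbol_matches(pattern, element):
    """'Q' matches any element other than C or H; other symbols match literally."""
    if pattern == "Q":
        return element not in ("C", "H")
    return pattern == element


def atom_matches_key(element, neighbor_elements, key):
    """Return True if an atom and its bonded neighbors satisfy the key."""
    central, required = parse_environment_key(key)
    if not symbol_matches(central, element):
        return False
    # Every listed neighbor must be present; additional neighbors are
    # ignored (an assumption of this sketch for the three-atom keys).
    available = Counter(neighbor_elements)
    needed = Counter(required)
    return all(available[symbol] >= count for symbol, count in needed.items())
```

<p>For instance, a carbon bonded to C, N, and H satisfies C(CN), while a nitrogen bonded to C, C, and O satisfies Q(CCO).</p>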
|  | 587 <p>Atom-Bond-Atom properties-based keys (264):</p> | 
|  | 588 <div class="OptionsBox"> | 
|  | 589     Key   Description | 
|  | 590 <br/>    59    C-C | 
|  | 591 <br/>    60    C-N | 
|  | 592 <br/>    61    C-O | 
|  | 593 <br/>    62    C-S | 
|  | 594 <br/>    63    C-Cl | 
|  | 595 <br/>    64    C-P | 
|  | 596 <br/>    65    C-F | 
|  | 597 <br/>    66    C-Br | 
|  | 598 <br/>    67    C-Si | 
|  | 599 <br/>    68    C-I | 
|  | 600 <br/>    69    C-X | 
|  | 601 <br/>    70    N-N | 
|  | 602 <br/>    71    N-O | 
|  | 603 <br/>    72    N-S | 
|  | 604 <br/>    73    N-Cl | 
|  | 605 <br/>    74    N-P | 
|  | 606 <br/>    75    N-F | 
|  | 607 <br/>    76    N-Br | 
|  | 608 <br/>    77    N-Si | 
|  | 609 <br/>    78    N-I | 
|  | 610 <br/>    79    N-X | 
|  | 611 <br/>    80    O-O | 
|  | 612 <br/>    81    O-S | 
|  | 613 <br/>    82    O-Cl | 
|  | 614 <br/>    83    O-P | 
|  | 615 <br/>    84    O-F | 
|  | 616 <br/>    85    O-Br | 
|  | 617 <br/>    86    O-Si | 
|  | 618 <br/>    87    O-I | 
|  | 619 <br/>    88    O-X | 
|  | 620 <br/>    89    S-S | 
|  | 621 <br/>    90    S-Cl | 
|  | 622 <br/>    91    S-P | 
|  | 623 <br/>    92    S-F | 
|  | 624 <br/>    93    S-Br | 
|  | 625 <br/>    94    S-Si | 
|  | 626 <br/>    95    S-I | 
|  | 627 <br/>    96    S-X | 
|  | 628 <br/>    97    Cl-Cl | 
|  | 629 <br/>    98    Cl-P | 
|  | 630 <br/>    99    Cl-F | 
|  | 631 <br/>    100   Cl-Br | 
|  | 632 <br/>    101   Cl-Si | 
|  | 633 <br/>    102   Cl-I | 
|  | 634 <br/>    103   Cl-X | 
|  | 635 <br/>    104   P-P | 
|  | 636 <br/>    105   P-F | 
|  | 637 <br/>    106   P-Br | 
|  | 638 <br/>    107   P-Si | 
|  | 639 <br/>    108   P-I | 
|  | 640 <br/>    109   P-X | 
|  | 641 <br/>    110   F-F | 
|  | 642 <br/>    111   F-Br | 
|  | 643 <br/>    112   F-Si | 
|  | 644 <br/>    113   F-I | 
|  | 645 <br/>    114   F-X | 
|  | 646 <br/>    115   Br-Br | 
|  | 647 <br/>    116   Br-Si | 
|  | 648 <br/>    117   Br-I | 
|  | 649 <br/>    118   Br-X | 
|  | 650 <br/>    119   Si-Si | 
|  | 651 <br/>    120   Si-I | 
|  | 652 <br/>    121   Si-X | 
|  | 653 <br/>    122   I-I | 
|  | 654 <br/>    123   I-X | 
|  | 655 <br/>    124   X-X | 
|  | 656 <br/>    125   C=C | 
|  | 657 <br/>    126   C=N | 
|  | 658 <br/>    127   C=O | 
|  | 659 <br/>    128   C=S | 
|  | 660 <br/>    129   C=Cl | 
|  | 661 <br/>    130   C=P | 
|  | 662 <br/>    131   C=F | 
|  | 663 <br/>    132   C=Br | 
|  | 664 <br/>    133   C=Si | 
|  | 665 <br/>    134   C=I | 
|  | 666 <br/>    135   C=X | 
|  | 667 <br/>    136   N=N | 
|  | 668 <br/>    137   N=O | 
|  | 669 <br/>    138   N=S | 
|  | 670 <br/>    139   N=Cl | 
|  | 671 <br/>    140   N=P | 
|  | 672 <br/>    141   N=F | 
|  | 673 <br/>    142   N=Br | 
|  | 674 <br/>    143   N=Si | 
|  | 675 <br/>    144   N=I | 
|  | 676 <br/>    145   N=X | 
|  | 677 <br/>    146   O=O | 
|  | 678 <br/>    147   O=S | 
|  | 679 <br/>    148   O=Cl | 
|  | 680 <br/>    149   O=P | 
|  | 681 <br/>    150   O=F | 
|  | 682 <br/>    151   O=Br | 
|  | 683 <br/>    152   O=Si | 
|  | 684 <br/>    153   O=I | 
|  | 685 <br/>    154   O=X | 
|  | 686 <br/>    155   S=S | 
|  | 687 <br/>    156   S=Cl | 
|  | 688 <br/>    157   S=P | 
|  | 689 <br/>    158   S=F | 
|  | 690 <br/>    159   S=Br | 
|  | 691 <br/>    160   S=Si | 
|  | 692 <br/>    161   S=I | 
|  | 693 <br/>    162   S=X | 
|  | 694 <br/>    163   Cl=Cl | 
|  | 695 <br/>    164   Cl=P | 
|  | 696 <br/>    165   Cl=F | 
|  | 697 <br/>    166   Cl=Br | 
|  | 698 <br/>    167   Cl=Si | 
|  | 699 <br/>    168   Cl=I | 
|  | 700 <br/>    169   Cl=X | 
|  | 701 <br/>    170   P=P | 
|  | 702 <br/>    171   P=F | 
|  | 703 <br/>    172   P=Br | 
|  | 704 <br/>    173   P=Si | 
|  | 705 <br/>    174   P=I | 
|  | 706 <br/>    175   P=X | 
|  | 707 <br/>    176   F=F | 
|  | 708 <br/>    177   F=Br | 
|  | 709 <br/>    178   F=Si | 
|  | 710 <br/>    179   F=I | 
|  | 711 <br/>    180   F=X | 
|  | 712 <br/>    181   Br=Br | 
|  | 713 <br/>    182   Br=Si | 
|  | 714 <br/>    183   Br=I | 
|  | 715 <br/>    184   Br=X | 
|  | 716 <br/>    185   Si=Si | 
|  | 717 <br/>    186   Si=I | 
|  | 718 <br/>    187   Si=X | 
|  | 719 <br/>    188   I=I | 
|  | 720 <br/>    189   I=X | 
|  | 721 <br/>    190   X=X | 
|  | 722 <br/>    191   C#C | 
|  | 723 <br/>    192   C#N | 
|  | 724 <br/>    193   C#O | 
|  | 725 <br/>    194   C#S | 
|  | 726 <br/>    195   C#Cl | 
|  | 727 <br/>    196   C#P | 
|  | 728 <br/>    197   C#F | 
|  | 729 <br/>    198   C#Br | 
|  | 730 <br/>    199   C#Si | 
|  | 731 <br/>    200   C#I | 
|  | 732 <br/>    201   C#X | 
|  | 733 <br/>    202   N#N | 
|  | 734 <br/>    203   N#O | 
|  | 735 <br/>    204   N#S | 
|  | 736 <br/>    205   N#Cl | 
|  | 737 <br/>    206   N#P | 
|  | 738 <br/>    207   N#F | 
|  | 739 <br/>    208   N#Br | 
|  | 740 <br/>    209   N#Si | 
|  | 741 <br/>    210   N#I | 
|  | 742 <br/>    211   N#X | 
|  | 743 <br/>    212   O#O | 
|  | 744 <br/>    213   O#S | 
|  | 745 <br/>    214   O#Cl | 
|  | 746 <br/>    215   O#P | 
|  | 747 <br/>    216   O#F | 
|  | 748 <br/>    217   O#Br | 
|  | 749 <br/>    218   O#Si | 
|  | 750 <br/>    219   O#I | 
|  | 751 <br/>    220   O#X | 
|  | 752 <br/>    221   S#S | 
|  | 753 <br/>    222   S#Cl | 
|  | 754 <br/>    223   S#P | 
|  | 755 <br/>    224   S#F | 
|  | 756 <br/>    225   S#Br | 
|  | 757 <br/>    226   S#Si | 
|  | 758 <br/>    227   S#I | 
|  | 759 <br/>    228   S#X | 
|  | 760 <br/>    229   Cl#Cl | 
|  | 761 <br/>    230   Cl#P | 
|  | 762 <br/>    231   Cl#F | 
|  | 763 <br/>    232   Cl#Br | 
|  | 764 <br/>    233   Cl#Si | 
|  | 765 <br/>    234   Cl#I | 
|  | 766 <br/>    235   Cl#X | 
|  | 767 <br/>    236   P#P | 
|  | 768 <br/>    237   P#F | 
|  | 769 <br/>    238   P#Br | 
|  | 770 <br/>    239   P#Si | 
|  | 771 <br/>    240   P#I | 
|  | 772 <br/>    241   P#X | 
|  | 773 <br/>    242   F#F | 
|  | 774 <br/>    243   F#Br | 
|  | 775 <br/>    244   F#Si | 
|  | 776 <br/>    245   F#I | 
|  | 777 <br/>    246   F#X | 
|  | 778 <br/>    247   Br#Br | 
|  | 779 <br/>    248   Br#Si | 
|  | 780 <br/>    249   Br#I | 
|  | 781 <br/>    250   Br#X | 
|  | 782 <br/>    251   Si#Si | 
|  | 783 <br/>    252   Si#I | 
|  | 784 <br/>    253   Si#X | 
|  | 785 <br/>    254   I#I | 
|  | 786 <br/>    255   I#X | 
|  | 787 <br/>    256   X#X | 
|  | 788 <br/>    257   C$C | 
|  | 789 <br/>    258   C$N | 
|  | 790 <br/>    259   C$O | 
|  | 791 <br/>    260   C$S | 
|  | 792 <br/>    261   C$Cl | 
|  | 793 <br/>    262   C$P | 
|  | 794 <br/>    263   C$F | 
|  | 795 <br/>    264   C$Br | 
|  | 796 <br/>    265   C$Si | 
|  | 797 <br/>    266   C$I | 
|  | 798 <br/>    267   C$X | 
|  | 799 <br/>    268   N$N | 
|  | 800 <br/>    269   N$O | 
|  | 801 <br/>    270   N$S | 
|  | 802 <br/>    271   N$Cl | 
|  | 803 <br/>    272   N$P | 
|  | 804 <br/>    273   N$F | 
|  | 805 <br/>    274   N$Br | 
|  | 806 <br/>    275   N$Si | 
|  | 807 <br/>    276   N$I | 
|  | 808 <br/>    277   N$X | 
|  | 809 <br/>    278   O$O | 
|  | 810 <br/>    279   O$S | 
|  | 811 <br/>    280   O$Cl | 
|  | 812 <br/>    281   O$P | 
|  | 813 <br/>    282   O$F | 
|  | 814 <br/>    283   O$Br | 
|  | 815 <br/>    284   O$Si | 
|  | 816 <br/>    285   O$I | 
|  | 817 <br/>    286   O$X | 
|  | 818 <br/>    287   S$S | 
|  | 819 <br/>    288   S$Cl | 
|  | 820 <br/>    289   S$P | 
|  | 821 <br/>    290   S$F | 
|  | 822 <br/>    291   S$Br | 
|  | 823 <br/>    292   S$Si | 
|  | 824 <br/>    293   S$I | 
|  | 825 <br/>    294   S$X | 
|  | 826 <br/>    295   Cl$Cl | 
|  | 827 <br/>    296   Cl$P | 
|  | 828 <br/>    297   Cl$F | 
|  | 829 <br/>    298   Cl$Br | 
|  | 830 <br/>    299   Cl$Si | 
|  | 831 <br/>    300   Cl$I | 
|  | 832 <br/>    301   Cl$X | 
|  | 833 <br/>    302   P$P | 
|  | 834 <br/>    303   P$F | 
|  | 835 <br/>    304   P$Br | 
|  | 836 <br/>    305   P$Si | 
|  | 837 <br/>    306   P$I | 
|  | 838 <br/>    307   P$X | 
|  | 839 <br/>    308   F$F | 
|  | 840 <br/>    309   F$Br | 
|  | 841 <br/>    310   F$Si | 
|  | 842 <br/>    311   F$I | 
|  | 843 <br/>    312   F$X | 
|  | 844 <br/>    313   Br$Br | 
|  | 845 <br/>    314   Br$Si | 
|  | 846 <br/>    315   Br$I | 
|  | 847 <br/>    316   Br$X | 
|  | 848 <br/>    317   Si$Si | 
|  | 849 <br/>    318   Si$I | 
|  | 850 <br/>    319   Si$X | 
|  | 851 <br/>    320   I$I | 
|  | 852 <br/>    321   I$X | 
|  | 853 <br/>    322   X$X</div> | 
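<p>The numbering of keys 59 to 322 follows a regular pattern: four bond symbols, each paired with the 66 unordered combinations of eleven atom symbols. The Python sketch below regenerates the table from that pattern. It treats the symbols purely as strings taken from the listing above; the constant and function names are hypothetical and not part of MayaChemTools.</p>

```python
from itertools import combinations_with_replacement

# Atom and bond symbols in the order they appear in the table above
ATOM_SYMBOLS = ["C", "N", "O", "S", "Cl", "P", "F", "Br", "Si", "I", "X"]
BOND_SYMBOLS = ["-", "=", "#", "$"]
FIRST_KEY = 59


def atom_bond_atom_keys():
    """Map key numbers 59..322 to their atom-bond-atom descriptions."""
    keys = {}
    key_number = FIRST_KEY
    for bond in BOND_SYMBOLS:
        # 11 symbols give 66 unordered pairs, including pairs like C-C
        for first, second in combinations_with_replacement(ATOM_SYMBOLS, 2):
            keys[key_number] = f"{first}{bond}{second}"
            key_number += 1
    return keys


KEYS = atom_bond_atom_keys()
print(len(KEYS), KEYS[59], KEYS[322])
```

<p>Each bond symbol therefore occupies a block of 66 consecutive key numbers, starting at 59, 125, 191, and 257.</p>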
|  | 854 </dd> | 
|  | 855 <dt><strong><strong>--OutDelim</strong> <em>comma | tab | semicolon</em></strong></dt> | 
|  | 856 <dd> | 
|  | 857 <p>Delimiter for output CSV/TSV text file(s). Possible values: <em>comma, tab, or semicolon</em>. | 
|  | 858 Default value: <em>comma</em>.</p> | 
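<p>When reading the generated text files in another program, the delimiter has to match the one chosen here. A minimal Python sketch using the standard <code>csv</code> module follows. The column labels, the compound ID, and the shortened fingerprint value in the sample are hypothetical; quoting is assumed to be on, which matters because fingerprints vector strings themselves contain semicolons.</p>

```python
import csv
import io

# Delimiter names accepted by --OutDelim mapped to the actual characters
DELIMITERS = {"comma": ",", "tab": "\t", "semicolon": ";"}


def read_fingerprint_rows(handle, out_delim="comma"):
    """Read a delimited fingerprints text file into a list of dicts."""
    reader = csv.DictReader(
        handle, delimiter=DELIMITERS[out_delim.lower()], quotechar='"'
    )
    return list(reader)


# Hypothetical sample content (values shortened for illustration)
sample = io.StringIO(
    '"CompoundID";"Fingerprints"\n'
    '"Cmpd1";"FingerprintsVector;MACCSKeyCount;166;'
    'OrderedNumericalValues;ValuesString;0 1 2"\n'
)
rows = read_fingerprint_rows(sample, "semicolon")
print(rows[0]["CompoundID"])
```

<p>Without quotes around column values, a semicolon-delimited file of this kind could not be split unambiguously.</p>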
|  | 859 </dd> | 
|  | 860 <dt><strong><strong>--output</strong> <em>SD | FP | text | all</em></strong></dt> | 
|  | 861 <dd> | 
|  | 862 <p>Type of output files to generate. Possible values: <em>SD, FP, text, or all</em>. Default value: <em>text</em>.</p> | 
|  | 863 </dd> | 
|  | 864 <dt><strong><strong>-o, --overwrite</strong></strong></dt> | 
|  | 865 <dd> | 
|  | 866 <p>Overwrite existing files.</p> | 
|  | 867 </dd> | 
|  | 868 <dt><strong><strong>-q, --quote</strong> <em>Yes | No</em></strong></dt> | 
|  | 869 <dd> | 
|  | 870 <p>Put quotes around column values in output CSV/TSV text file(s). Possible values: | 
|  | 871 <em>Yes or No</em>. Default value: <em>Yes</em>.</p> | 
|  | 872 </dd> | 
|  | 873 <dt><strong><strong>-r, --root</strong> <em>RootName</em></strong></dt> | 
|  | 874 <dd> | 
|  | 875 <p>New file name is generated using the root: &lt;Root&gt;.&lt;Ext&gt;. Default for new file | 
|  | 876 names: &lt;SDFileName&gt;&lt;MACCSKeysFP&gt;.&lt;Ext&gt;. The file type determines the &lt;Ext&gt; value. | 
|  | 877 The sdf, fpf, csv, and tsv &lt;Ext&gt; values are used for SD, FP, comma/semicolon, and tab | 
|  | 878 delimited text files, respectively. This option is ignored for multiple input files.</p> | 
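<p>The naming rule for an explicit root can be sketched in Python as below. This covers only the case where <strong>-r, --root</strong> is given for a single input file; the function name is hypothetical, and the extension mapping is taken from the description above.</p>

```python
# Extensions for the SD and FP output types
FILE_EXTENSIONS = {"SD": "sdf", "FP": "fpf"}
# Text file extension depends on the --OutDelim value
TEXT_EXTENSIONS = {"comma": "csv", "semicolon": "csv", "tab": "tsv"}


def output_file_names(root, output="text", out_delim="comma"):
    """List the file names expected for a given root and --output value."""
    kinds = ["SD", "FP", "text"] if output.lower() == "all" else [output]
    names = []
    for kind in kinds:
        if kind.lower() == "text":
            ext = TEXT_EXTENSIONS[out_delim.lower()]
        else:
            ext = FILE_EXTENSIONS[kind.upper()]
        names.append(f"{root}.{ext}")
    return names


print(output_file_names("SampleMACCS166FPCount", output="all"))
```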
|  | 879 </dd> | 
|  | 880 <dt><strong><strong>-s, --size</strong> <em>number</em></strong></dt> | 
|  | 881 <dd> | 
|  | 882 <p>Size of MACCS keys [ Ref 45-47 ] set to use during fingerprints generation. Possible values: <em>166 or 322</em>. | 
|  | 883 Default value: <em>166</em>.</p> | 
|  | 884 </dd> | 
|  | 885 <dt><strong><strong>-v, --VectorStringFormat</strong> <em>ValuesString | IDsAndValuesString | IDsAndValuesPairsString | ValuesAndIDsString | ValuesAndIDsPairsString</em></strong></dt> | 
|  | 886 <dd> | 
|  | 887 <p>Format of fingerprints vector string data in output SD, FP or CSV/TSV text file(s) specified by | 
|  | 888 <strong>--output</strong>, used when the <strong>-m, --mode</strong> option is <em>MACCSKeyCount</em>. Possible | 
|  | 889 values: <em>ValuesString, IDsAndValuesString, IDsAndValuesPairsString, ValuesAndIDsString, or | 
|  | 890 ValuesAndIDsPairsString</em>. Default value: <em>ValuesString</em>.</p> | 
|  | 891 <p>Examples:</p> | 
|  | 892 <div class="OptionsBox"> | 
|  | 893     FingerprintsVector;MACCSKeyCount;166;OrderedNumericalValues;ValuesStri | 
|  | 894 <br/>    ng;0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 
|  | 895 <br/>    0 0 0 0 0 0 0 1 0 0 3 0 0 0 0 4 0 0 2 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0 | 
|  | 896 <br/>    0 0 0 0 1 1 8 0 0 0 1 0 0 1 0 1 0 1 0 3 1 3 1 0 0 0 1 2 0 11 1 0 0 0 | 
|  | 897 <br/>    5 0 0 1 2 0 1 1 0 0 0 0 0 1 1 0 1 1 1 1 0 4 0 0 1 1 0 4 6 1 1 1 2 1 1 | 
|  | 898 <br/>    3 5 2 2 0 5 3 5 1 1 2 5 1 2 1 2 4 8 3 5 5 2 2 0 3 5 4 1</div> | 
|  | 899 <div class="OptionsBox"> | 
|  | 900     FingerprintsVector;MACCSKeyCount;322;OrderedNumericalValues;ValuesStri | 
|  | 901 <br/>    ng;14 8 2 0 2 0 4 4 2 1 4 0 0 2 5 10 5 2 1 0 0 2 0 5 13 3 28 5 5 3 0 0 | 
|  | 902 <br/>    0 4 2 1 1 0 1 1 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 22 5 3 0 0 0 1 0 | 
|  | 903 <br/>    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 
|  | 904 <br/>    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11 0 2 0 0 0 0 0 0 0 0 0 | 
|  | 905 <br/>    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...</div> | 
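<p>The <em>ValuesString</em> examples above share a simple layout: five semicolon-separated header fields followed by space-separated counts. The Python sketch below parses that layout. It is an illustration only: the function names are hypothetical, and it assumes that the n-th value corresponds to key number n, as the <em>OrderedNumericalValues</em> label suggests.</p>

```python
def parse_values_string(fingerprint):
    """Parse a FingerprintsVector string in ValuesString format."""
    kind, description, size, value_type, fmt, values = fingerprint.split(";", 5)
    if kind != "FingerprintsVector" or fmt != "ValuesString":
        raise ValueError("not a FingerprintsVector in ValuesString format")
    counts = [int(value) for value in values.split()]
    if len(counts) != int(size):
        raise ValueError("number of values does not match the declared size")
    return {
        "description": description,
        "size": int(size),
        "value_type": value_type,
        "counts": counts,
    }


def nonzero_keys(counts):
    """Map 1-based key numbers to counts, skipping zero entries."""
    return {key: count for key, count in enumerate(counts, start=1) if count}


# Hypothetical five-value vector, far shorter than a real 166 or 322 key set
demo = parse_values_string(
    "FingerprintsVector;MACCSKeyCount;5;OrderedNumericalValues;ValuesString;0 2 0 0 1"
)
print(nonzero_keys(demo["counts"]))
```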
|  | 906 </dd> | 
|  | 907 <dt><strong><strong>-w, --WorkingDir</strong> <em>DirName</em></strong></dt> | 
|  | 908 <dd> | 
|  | 909 <p>Location of working directory. Default: current directory.</p> | 
|  | 910 </dd> | 
|  | 911 </dl> | 
|  | 912 <p> | 
|  | 913 </p> | 
|  | 914 <h2>EXAMPLES</h2> | 
|  | 915 <p>To generate MACCS keys fingerprints of size 166 in binary bit-vector string format | 
|  | 916 and create a SampleMACCS166FPBin.csv file containing sequential compound IDs along with | 
|  | 917 fingerprints bit-vector strings data, type:</p> | 
|  | 918 <div class="ExampleBox"> | 
|  | 919     % MACCSKeysFingerprints.pl -r SampleMACCS166FPBin -o Sample.sdf</div> | 
|  | 920 <p>To generate MACCS keys fingerprints of size 166 in binary bit-vector string format | 
|  | 921 and create SampleMACCS166FPBin.sdf, SampleMACCS166FPBin.fpf and SampleMACCS166FPBin.csv | 
|  | 922 files containing sequential compound IDs in CSV file along with fingerprints bit-vector strings data, type:</p> | 
|  | 923 <div class="ExampleBox"> | 
|  | 924     % MACCSKeysFingerprints.pl --output all -r SampleMACCS166FPBin | 
|  | 925       -o Sample.sdf</div> | 
|  | 926 <p>To generate MACCS keys fingerprints of size 322 in binary bit-vector string format | 
|  | 927 and create a SampleMACCS322FPBin.csv file containing sequential compound IDs along with | 
|  | 928 fingerprints bit-vector strings data, type:</p> | 
|  | 929 <div class="ExampleBox"> | 
|  | 930     % MACCSKeysFingerprints.pl --size 322 -r SampleMACCS322FPBin -o Sample.sdf</div> | 
|  | 931 <p>To generate MACCS keys fingerprints of size 166 corresponding to count of keys in | 
|  | 932 ValuesString format and create a SampleMACCS166FPCount.csv file containing sequential | 
|  | 933 compound IDs along with fingerprints vector strings data, type:</p> | 
|  | 934 <div class="ExampleBox"> | 
|  | 935     % MACCSKeysFingerprints.pl -m MACCSKeyCount -r SampleMACCS166FPCount | 
|  | 936       -o Sample.sdf</div> | 
|  | 937 <p>To generate MACCS keys fingerprints of size 322 corresponding to count of keys in | 
|  | 938 ValuesString format and create a SampleMACCS322FPCount.csv file containing sequential | 
|  | 939 compound IDs along with fingerprints vector strings data, type:</p> | 
|  | 940 <div class="ExampleBox"> | 
|  | 941     % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 322 | 
|  | 942       -r SampleMACCS322FPCount -o Sample.sdf</div> | 
|  | 943 <p>To generate MACCS keys fingerprints of size 166 in hexadecimal bit-vector string format with | 
|  | 944 ascending bits order and create a SampleMACCS166FPHex.csv file containing compound IDs | 
|  | 945 from MolName along with fingerprints bit-vector strings data, type:</p> | 
|  | 946 <div class="ExampleBox"> | 
|  | 947     % MACCSKeysFingerprints.pl -m MACCSKeyBits --size 166 --BitStringFormat | 
|  | 948       HexadecimalString --BitsOrder Ascending --DataFieldsMode CompoundID | 
|  | 949       --CompoundIDMode MolName -r SampleMACCS166FPHex -o Sample.sdf</div> | 
|  | 950 <p>To generate MACCS keys fingerprints of size 166 corresponding to count of keys in | 
|  | 951 IDsAndValuesString format and create a SampleMACCS166FPCount.csv file containing | 
|  | 952 compound IDs from MolName line along with fingerprints vector strings data, type:</p> | 
|  | 953 <div class="ExampleBox"> | 
|  | 954     % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 166 | 
|  | 955       --VectorStringFormat IDsAndValuesString  --DataFieldsMode CompoundID | 
|  | 956       --CompoundIDMode MolName -r SampleMACCS166FPCount -o Sample.sdf</div> | 
|  | 957 <p>To generate MACCS keys fingerprints of size 166 corresponding to count of keys in | 
|  | 958 IDsAndValuesString format and create a SampleMACCS166FPCount.csv file containing | 
|  | 959 compound IDs using specified data field along with fingerprints vector strings data, type:</p> | 
|  | 960 <div class="ExampleBox"> | 
|  | 961     % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 166 | 
|  | 962       --VectorStringFormat IDsAndValuesString  --DataFieldsMode CompoundID | 
|  | 963       --CompoundIDMode DataField --CompoundID Mol_ID -r | 
|  | 964       SampleMACCS166FPCount -o Sample.sdf</div> | 
|  | 965 <p>To generate MACCS keys fingerprints of size 322 corresponding to count of keys in | 
|  | 966 ValuesString format and create a SampleMACCS322FPCount.tsv file containing compound | 
|  | 967 IDs derived from combination of molecule name line and an explicit compound prefix | 
|  | 968 along with fingerprints vector strings data in a column labeled MACCSKeyCountFP, type:</p> | 
|  | 969 <div class="ExampleBox"> | 
|  | 970     % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 322 --DataFieldsMode | 
|  | 971       CompoundID --CompoundIDMode MolnameOrLabelPrefix --CompoundID Cmpd | 
|  | 972       --CompoundIDLabel MolID --FingerprintsLabel MACCSKeyCountFP --OutDelim | 
|  | 973       Tab -r SampleMACCS322FPCount -o Sample.sdf</div> | 
|  | 974 <p>To generate MACCS keys fingerprints of size 166 corresponding to count of keys in | 
|  | 975 ValuesString format and create a SampleMACCS166FPCount.csv file containing | 
|  | 976 specific data fields columns along with fingerprints vector strings data, type:</p> | 
|  | 977 <div class="ExampleBox"> | 
|  | 978     % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 166 | 
|  | 979       --VectorStringFormat ValuesString --DataFieldsMode Specify --DataFields | 
|  | 980       Mol_ID  -r SampleMACCS166FPCount -o Sample.sdf</div> | 
|  | 981 <p>To generate MACCS keys fingerprints of size 322 corresponding to count of keys in | 
|  | 982 ValuesString format and create a SampleMACCS322FPCount.csv file containing | 
|  | 983 common data fields columns along with fingerprints vector strings data, type:</p> | 
|  | 984 <div class="ExampleBox"> | 
|  | 985     % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 322 | 
|  | 986       --VectorStringFormat ValuesString --DataFieldsMode Common -r | 
|  | 987       SampleMACCS322FPCount -o Sample.sdf</div> | 
|  | 988 <p>To generate MACCS keys fingerprints of size 166 corresponding to count of keys in | 
|  | 989 ValuesString format and create SampleMACCS166FPCount.sdf, SampleMACCS166FPCount.fpf and | 
|  | 990 SampleMACCS166FPCount.csv files containing all data fields columns in CSV file | 
|  | 991 along with fingerprints vector strings data, type:</p> | 
|  | 992 <div class="ExampleBox"> | 
|  | 993     % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 166 --output all | 
|  | 994       --VectorStringFormat ValuesString --DataFieldsMode All -r | 
|  | 995       SampleMACCS166FPCount -o Sample.sdf</div> | 
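<p>When the script is driven from another program, the command lines above can be assembled programmatically. The Python sketch below builds an argument list using only options documented on this page; the helper name is hypothetical, and running the result (for example with <code>subprocess.run</code>) assumes MACCSKeysFingerprints.pl is on the search path.</p>

```python
import shlex


def build_command(sd_file, root, mode=None, size=None, output=None, overwrite=True):
    """Assemble an argument list for MACCSKeysFingerprints.pl."""
    args = ["MACCSKeysFingerprints.pl"]
    if mode:
        args += ["-m", mode]
    if size:
        args += ["--size", str(size)]
    if output:
        args += ["--output", output]
    args += ["-r", root]
    if overwrite:
        args.append("-o")  # overwrite existing files
    args.append(sd_file)
    return args


command = build_command(
    "Sample.sdf", "SampleMACCS166FPCount",
    mode="MACCSKeyCount", size=166, output="all",
)
print(shlex.join(command))
```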
|  | 996 <p> | 
|  | 997 </p> | 
|  | 998 <h2>AUTHOR</h2> | 
|  | 999 <p><a href="mailto:msud@san.rr.com">Manish Sud</a></p> | 
|  | 1000 <p> | 
|  | 1001 </p> | 
|  | 1002 <h2>SEE ALSO</h2> | 
|  | 1003 <p><a href="./InfoFingerprintsFiles.html">InfoFingerprintsFiles.pl</a>, <a href="./SimilarityMatricesFingerprints.html">SimilarityMatricesFingerprints.pl</a>, <a href="./AtomNeighborhoodsFingerprints.html">AtomNeighborhoodsFingerprints.pl</a>,  | 
|  | 1004 <a href="./ExtendedConnectivityFingerprints.html">ExtendedConnectivityFingerprints.pl</a>, <a href="./PathLengthFingerprints.html">PathLengthFingerprints.pl</a>,  | 
|  | 1005 <a href="./TopologicalAtomPairsFingerprints.html">TopologicalAtomPairsFingerprints.pl</a>, <a href="./TopologicalAtomTorsionsFingerprints.html">TopologicalAtomTorsionsFingerprints.pl</a>,  | 
|  | 1006 <a href="./TopologicalPharmacophoreAtomPairsFingerprints.html">TopologicalPharmacophoreAtomPairsFingerprints.pl</a>, <a href="./TopologicalPharmacophoreAtomTripletsFingerprints.html">TopologicalPharmacophoreAtomTripletsFingerprints.pl</a> | 
|  | 1007 </p> | 
|  | 1008 <p> | 
|  | 1009 </p> | 
|  | 1010 <h2>COPYRIGHT</h2> | 
|  | 1011 <p>Copyright (C) 2015 Manish Sud. All rights reserved.</p> | 
|  | 1012 <p>This file is part of MayaChemTools.</p> | 
|  | 1013 <p>MayaChemTools is free software; you can redistribute it and/or modify it under | 
|  | 1014 the terms of the GNU Lesser General Public License as published by the Free | 
|  | 1015 Software Foundation; either version 3 of the License, or (at your option) | 
|  | 1016 any later version.</p> | 
|  | 1017 <p> </p><p> </p><div class="DocNav"> | 
|  | 1018 <table width="100%" border=0 cellpadding=0 cellspacing=2> | 
|  | 1019 <tr align="left" valign="top"><td width="33%" align="left"><a href="./JoinTextFiles.html" title="JoinTextFiles.html">Previous</a>  <a href="./index.html" title="Table of Contents">TOC</a>  <a href="./MergeTextFiles.html" title="MergeTextFiles.html">Next</a></td><td width="34%" align="middle"><strong>March 29, 2015</strong></td><td width="33%" align="right"><strong>MACCSKeysFingerprints.pl</strong></td></tr> | 
|  | 1020 </table> | 
|  | 1021 </div> | 
|  | 1022 <br /> | 
|  | 1023 <center> | 
|  | 1024 <img src="../../images/h2o2.png"> | 
|  | 1025 </center> | 
|  | 1026 </body> | 
|  | 1027 </html> |