0
|
1 <html>
|
|
2 <head>
|
|
3 <title>MayaChemTools:Documentation:MACCSKeysFingerprints.pl</title>
|
|
4 <meta http-equiv="content-type" content="text/html;charset=utf-8">
|
|
5 <link rel="stylesheet" type="text/css" href="../../css/MayaChemTools.css">
|
|
6 </head>
|
|
7 <body leftmargin="20" rightmargin="20" topmargin="10" bottommargin="10">
|
|
8 <br/>
|
|
9 <center>
|
|
10 <a href="http://www.mayachemtools.org" title="MayaChemTools Home"><img src="../../images/MayaChemToolsLogo.gif" border="0" alt="MayaChemTools"></a>
|
|
11 </center>
|
|
12 <br/>
|
|
|
|
18 <p>
|
|
19 </p>
|
|
20 <h2>NAME</h2>
|
|
21 <p>MACCSKeysFingerprints.pl - Generate MACCS key fingerprints for SD files</p>
|
|
22 <p>
|
|
23 </p>
|
|
24 <h2>SYNOPSIS</h2>
|
|
25 <p>MACCSKeysFingerprints.pl SDFile(s)...</p>
|
|
26 <p>MACCSKeysFingerprints.pl [<strong>--AromaticityModel</strong> <em>AromaticityModelType</em>]
|
|
27 [<strong>--BitsOrder</strong> <em>Ascending | Descending</em>]
|
|
28 [<strong>-b, --BitStringFormat</strong> <em>BinaryString | HexadecimalString</em>]
|
|
29 [<strong>--CompoundID</strong> <em>DataFieldName or LabelPrefixString</em>] [<strong>--CompoundIDLabel</strong> <em>text</em>]
|
|
30 [<strong>--CompoundIDMode</strong> <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em>]
|
|
31 [<strong>--DataFields</strong> <em>"FieldLabel1,FieldLabel2,..."</em>] [<strong>-d, --DataFieldsMode</strong> <em>All | Common | Specify | CompoundID</em>]
|
|
32 [<strong>-f, --Filter</strong> <em>Yes | No</em>] [<strong>--FingerprintsLabel</strong> <em>text</em>] [<strong>-h, --help</strong>] [<strong>-k, --KeepLargestComponent</strong> <em>Yes | No</em>]
|
|
33 [<strong>-m, --mode</strong> <em>MACCSKeyBits | MACCSKeyCount</em>] [<strong>--OutDelim</strong> <em>comma | tab | semicolon</em>]
|
|
34 [<strong>--output</strong> <em>SD | FP | text | all</em>] [<strong>-o, --overwrite</strong>]
|
|
35 [<strong>-q, --quote</strong> <em>Yes | No</em>] [<strong>-r, --root</strong> <em>RootName</em>] [<strong>-s, --size</strong> <em>number</em>]
|
|
36 [<strong>-v, --VectorStringFormat</strong> <em>IDsAndValuesString | IDsAndValuesPairsString | ValuesAndIDsString | ValuesAndIDsPairsString</em>]
|
|
37 [<strong>-w, --WorkingDir</strong> <em>DirName</em>]</p>
|
|
38 <p>
|
|
39 </p>
|
|
40 <h2>DESCRIPTION</h2>
|
|
41 <p>Generate MACCS (Molecular ACCess System) keys fingerprints [ Ref 45-47 ] for <em>SDFile(s)</em>
|
|
42 and create appropriate SD, FP or CSV/TSV text file(s) containing fingerprints bit-vector or
|
|
43 vector strings corresponding to molecular fingerprints.</p>
|
|
44 <p>Multiple SDFile names are separated by spaces. The valid file extensions are <em>.sdf</em>
|
|
45 and <em>.sd</em>. All other file names are ignored. All the SD files in the current directory
|
|
46 can be specified either by <em>*.sdf</em> or the current directory name.</p>
|
|
47 <p>For each MACCS key definition, atoms are processed to determine their membership in the key
|
|
48 and the appropriate molecular fingerprints strings are generated. An atom can belong to multiple
|
|
49 MACCS keys.</p>
|
|
50 <p>For <em>MACCSKeyBits</em> value of <strong>-m, --mode</strong> option, a fingerprint bit-vector string containing
|
|
51 zeros and ones is generated and for <em>MACCSKeyCount</em> value, a fingerprint vector string
|
|
52 corresponding to the count of each MACCS key [ Ref 45-47 ] is generated.</p>
|
|
53 <p><em>MACCSKeyBits | MACCSKeyCount</em> values for <strong>-m, --mode</strong> option along with two possible
|
|
54 <em>166 | 322</em> values of <strong>-s, --size</strong> support generation of four different types of MACCS
|
|
55 keys fingerprint: <em>MACCS166KeyBits, MACCS166KeyCount, MACCS322KeyBits, MACCS322KeyCount</em>.</p>
|
|
56 <p>Example of <em>SD</em> file containing MACCS keys fingerprints string data:</p>
|
|
57 <div class="OptionsBox">
|
|
58 ... ...
|
|
59 <br/> ... ...
|
|
60 <br/> $$$$
|
|
61 <br/> ... ...
|
|
62 <br/> ... ...
|
|
63 <br/> ... ...
|
|
64 <br/> 41 44 0 0 0 0 0 0 0 0999 V2000
|
|
65 -3.3652 1.4499 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
|
|
66 <br/> ... ...
|
|
67 <br/> 2 3 1 0 0 0 0
|
|
68 <br/> ... ...
|
|
69 <br/> M END
|
|
70 <br/> > <CmpdID>
|
|
71 <br/> Cmpd1</div>
|
|
72 <div class="OptionsBox">
|
|
73 > <MACCSKeysFingerprints>
|
|
74 <br/> FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;000000000
|
|
75 <br/> 00000000000000000000000000000000100100001001000000001001000000001110001
|
|
76 <br/> 00101010111100011011000100110110000011011110100110111111111111011111111
|
|
77 <br/> 11111111110111000</div>
|
|
78 <div class="OptionsBox">
|
|
79 $$$$
|
|
80 <br/> ... ...
|
|
81 <br/> ... ...</div>
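<p>As a minimal sketch (not part of MayaChemTools), the data items of one SD record like the example above could be read in Python as shown here. The helper name <em>read_sd_data_fields</em> and the shortened sample record are assumptions made for illustration; continuation lines are joined in case a long fingerprints string spans several lines.</p>

```python
def read_sd_data_fields(record_text):
    """Return {field_label: value} for the data items of one SD record."""
    fields = {}
    label = None
    for line in record_text.splitlines():
        stripped = line.strip()
        if stripped == "$$$$":  # record terminator
            break
        if stripped.startswith(">") and "<" in stripped and stripped.endswith(">"):
            # Data header line such as:  >  <CmpdID>
            label = stripped[stripped.index("<") + 1:stripped.rindex(">")]
            fields[label] = ""
        elif label is not None:
            if stripped == "":
                label = None  # a blank line ends the current data item
            else:
                fields[label] += stripped  # join continuation lines


    return fields


# Hypothetical, shortened record: the bit string is NOT a real 166-bit fingerprint.
sample_record = """\
M  END
>  <CmpdID>
Cmpd1

>  <MACCSKeysFingerprints>
FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;0000
0101

$$$$
"""
sd_fields = read_sd_data_fields(sample_record)
```

<p>The sketch ignores everything before the first data header, so the connection table is skipped rather than parsed.</p>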
|
|
82 <p>Example of <em>FP</em> file containing MACCS keys fingerprints string data:</p>
|
|
83 <div class="OptionsBox">
|
|
84 #
|
|
85 <br/> # Package = MayaChemTools 7.4
|
|
86 <br/> # Release Date = Oct 21, 2010
|
|
87 <br/> #
|
|
88 <br/> # TimeStamp = Fri Mar 11 14:57:24 2011
|
|
89 <br/> #
|
|
90 <br/> # FingerprintsStringType = FingerprintsBitVector
|
|
91 <br/> #
|
|
92 <br/> # Description = MACCSKeyBits
|
|
93 <br/> # Size = 166
|
|
94 <br/> # BitStringFormat = BinaryString
|
|
95 <br/> # BitsOrder = Ascending
|
|
96 <br/> #
|
|
97 <br/> Cmpd1 00000000000000000000000000000000000000000100100001001000000001...
|
|
98 <br/> Cmpd2 00000000000000000000000010000000001000000010000000001000000000...
|
|
99 <br/> ... ...
|
|
100 <br/> ... ..</div>
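<p>The FP layout shown above, with <em># Key = Value</em> header lines followed by one compound per line, could be read with a short Python sketch such as the following. The function name <em>read_fp_file</em> is hypothetical, the bit strings in the sample are shortened placeholders, and the sketch assumes the compound ID and the fingerprint are separated by a single space as in the example.</p>

```python
def read_fp_file(lines):
    """Split FP file lines into header metadata and (compound_id, fingerprint) rows."""
    header = {}
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header lines look like "# Key = Value"; a bare "#" is a spacer.
            key, sep, value = line[1:].partition("=")
            if sep:
                header[key.strip()] = value.strip()
        else:
            # Assumption: first token is the compound ID, the rest is the fingerprint.
            compound_id, _, fingerprint = line.partition(" ")
            rows.append((compound_id, fingerprint.strip()))
    return header, rows


fp_lines = [
    "#",
    "# Package = MayaChemTools 7.4",
    "# FingerprintsStringType = FingerprintsBitVector",
    "# Description = MACCSKeyBits",
    "# Size = 166",
    "# BitStringFormat = BinaryString",
    "# BitsOrder = Ascending",
    "#",
    "Cmpd1 00000100",  # shortened placeholder bit strings
    "Cmpd2 00100000",
]
fp_header, fp_rows = read_fp_file(fp_lines)
```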
|
|
101 <p>Example of CSV <em>Text</em> file containing MACCS keys fingerprints string data:</p>
|
|
102 <div class="OptionsBox">
|
|
103 "CompoundID","MACCSKeysFingerprints"
|
|
104 <br/> "Cmpd1","FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;
|
|
105 <br/> 00000000000000000000000000000000000000000100100001001000000001001000000
|
|
106 <br/> 00111000100101010111100011011000100110110000011011110100110111111111111
|
|
107 <br/> 01111111111111111110111000"
|
|
108 <br/> ... ...
|
|
109 <br/> ... ...</div>
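<p>A quoted CSV file like the one above can be loaded with the Python standard library <em>csv</em> module. This is only an illustrative sketch: the sample text is a shortened placeholder, and the column labels are the ones shown in the example rather than values guaranteed for every run.</p>

```python
import csv
import io

# Shortened placeholder data using the column labels from the example above.
sample_csv = (
    '"CompoundID","MACCSKeysFingerprints"\n'
    '"Cmpd1","FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;0100"\n'
    '"Cmpd2","FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;0010"\n'
)

reader = csv.DictReader(io.StringIO(sample_csv))
# Map each compound ID to its fingerprints string.
csv_fingerprints = {row["CompoundID"]: row["MACCSKeysFingerprints"] for row in reader}
```

<p>For tab or semicolon delimited output, the same reader can be given <em>delimiter="\t"</em> or <em>delimiter=";"</em>. A semicolon delimiter only works if the fingerprints column is quoted, because the fingerprints string itself contains semicolons.</p>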
|
|
110 <p>The current release of MayaChemTools generates the following types of MACCS keys
|
|
111 fingerprints bit-vector and vector strings:</p>
|
|
112 <div class="OptionsBox">
|
|
113 FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;00000000
|
|
114 <br/> 0000000000000000000000000000000001001000010010000000010010000000011100
|
|
115 <br/> 0100101010111100011011000100110110000011011110100110111111111111011111
|
|
116 <br/> 11111111111110111000</div>
|
|
117 <div class="OptionsBox">
|
|
118 FingerprintsBitVector;MACCSKeyBits;166;HexadecimalString;Ascending;000
|
|
119 <br/> 000000021210210e845f8d8c60b79dffbffffd1</div>
|
|
120 <div class="OptionsBox">
|
|
121 FingerprintsBitVector;MACCSKeyBits;322;BinaryString;Ascending;11101011
|
|
122 <br/> 1110011111100101111111000111101100110000000000000011100010000000000000
|
|
123 <br/> 0000000000000000000000000000000000000000000000101000000000000000000000
|
|
124 <br/> 0000000000000000000000000000000000000000000000000000000000000000000000
|
|
125 <br/> 0000000000000000000000000000000000000011000000000000000000000000000000
|
|
126 <br/> 0000000000000000000000000000000000000000</div>
|
|
127 <div class="OptionsBox">
|
|
128 FingerprintsBitVector;MACCSKeyBits;322;HexadecimalString;Ascending;7d7
|
|
129 <br/> e7af3edc000c1100000000000000500000000000000000000000000000000300000000
|
|
130 <br/> 000000000</div>
|
|
131 <div class="OptionsBox">
|
|
132 FingerprintsVector;MACCSKeyCount;166;OrderedNumericalValues;ValuesStri
|
|
133 <br/> ng;0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
|
|
134 <br/> 0 0 0 0 0 0 0 1 0 0 3 0 0 0 0 4 0 0 2 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0
|
|
135 <br/> 0 0 0 0 1 1 8 0 0 0 1 0 0 1 0 1 0 1 0 3 1 3 1 0 0 0 1 2 0 11 1 0 0 0
|
|
136 <br/> 5 0 0 1 2 0 1 1 0 0 0 0 0 1 1 0 1 1 1 1 0 4 0 0 1 1 0 4 6 1 1 1 2 1 1
|
|
137 <br/> 3 5 2 2 0 5 3 5 1 1 2 5 1 2 1 2 4 8 3 5 5 2 2 0 3 5 4 1</div>
|
|
138 <div class="OptionsBox">
|
|
139 FingerprintsVector;MACCSKeyCount;322;OrderedNumericalValues;ValuesStri
|
|
140 <br/> ng;14 8 2 0 2 0 4 4 2 1 4 0 0 2 5 10 5 2 1 0 0 2 0 5 13 3 28 5 5 3 0 0
|
|
141 <br/> 0 4 2 1 1 0 1 1 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 22 5 3 0 0 0 1 0
|
|
142 <br/> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
|
|
143 <br/> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11 0 2 0 0 0 0 0 0 0 0 0
|
|
144 <br/> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...</div>
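<p>The fingerprints strings above are semicolon-delimited, so they can be split into their descriptive fields and their values. The sketch below is one possible Python reading of that layout; the function name and the dictionary keys are assumptions chosen for illustration, and the sample strings are shortened placeholders.</p>

```python
def parse_fingerprints_string(text):
    """Split a fingerprints string into its descriptive fields and its values."""
    parts = text.split(";")
    kind, description, size = parts[0], parts[1], int(parts[2])
    if kind == "FingerprintsBitVector":
        # Layout: type;description;size;bit string format;bits order;bits
        return {"type": kind, "description": description, "size": size,
                "format": parts[3], "order": parts[4], "values": parts[5]}
    if kind == "FingerprintsVector":
        # Layout: type;description;size;values type;vector string format;values
        return {"type": kind, "description": description, "size": size,
                "values_type": parts[3], "format": parts[4],
                "values": [int(v) for v in parts[5].split()]}
    raise ValueError("unrecognized fingerprints string type: " + kind)


# Shortened placeholder strings; real ones carry 166 or 322 keys.
bits_fp = parse_fingerprints_string(
    "FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;00010110")
count_fp = parse_fingerprints_string(
    "FingerprintsVector;MACCSKeyCount;166;OrderedNumericalValues;ValuesString;0 0 3 11 1")
```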
|
|
145 <p>
|
|
146 </p>
|
|
147 <h2>OPTIONS</h2>
|
|
148 <dl>
|
|
149 <dt><strong><strong>--AromaticityModel</strong> <em>MDLAromaticityModel | TriposAromaticityModel | MMFFAromaticityModel | ChemAxonBasicAromaticityModel | ChemAxonGeneralAromaticityModel | DaylightAromaticityModel | MayaChemToolsAromaticityModel</em></strong></dt>
|
|
150 <dd>
|
|
151 <p>Specify aromaticity model to use during detection of aromaticity. Possible values in the current
|
|
152 release are: <em>MDLAromaticityModel, TriposAromaticityModel, MMFFAromaticityModel,
|
|
153 ChemAxonBasicAromaticityModel, ChemAxonGeneralAromaticityModel, DaylightAromaticityModel
|
|
154 or MayaChemToolsAromaticityModel</em>. Default value: <em>MayaChemToolsAromaticityModel</em>.</p>
|
|
155 <p>The supported aromaticity model names along with model specific control parameters
|
|
156 are defined in <strong>AromaticityModelsData.csv</strong>, which is distributed with the current release
|
|
157 and is available under <strong>lib/data</strong> directory. <strong>Molecule.pm</strong> module retrieves data from
|
|
158 this file during class instantiation and makes it available to method <strong>DetectAromaticity</strong>
|
|
159 for detecting aromaticity corresponding to a specific model.</p>
|
|
160 </dd>
|
|
161 <dt><strong><strong>--BitsOrder</strong> <em>Ascending | Descending</em></strong></dt>
|
|
162 <dd>
|
|
163 <p>Bits order to use during generation of fingerprints bit-vector string for <em>MACCSKeyBits</em> value of
|
|
164 <strong>-m, --mode</strong> option. Possible values: <em>Ascending, Descending</em>. Default: <em>Ascending</em>.</p>
|
|
165 <p><em>Ascending</em> bit order corresponds to the first bit in each byte being the lowest bit as
|
|
166 opposed to the highest bit.</p>
|
|
167 <p>Internally, bits are stored in <em>Ascending</em> order using the Perl vec function. Regardless
|
|
168 of machine byte order, big-endian or little-endian, the vec function always treats the first
|
|
169 string byte as the lowest byte and first bit within each byte as the lowest bit.</p>
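<p>The difference between the two orders can be illustrated with a small Python sketch. It assumes, based on the wording "first bit in each byte", that <em>Descending</em> reverses the bits within each byte rather than across the whole string, and that the string is padded to a whole number of bytes (the 166-key examples are 168 characters long). The function name is hypothetical and this is not the MayaChemTools implementation.</p>

```python
def bits_to_string(set_bits, size, order="Ascending"):
    """Render a set of bit positions as a 0/1 string, one byte (8 bits) at a time."""
    nbytes = (size + 7) // 8  # pad up to a whole number of bytes
    out = []
    for byte_index in range(nbytes):
        # Ascending: the lowest bit of each byte is written first.
        byte = ["1" if byte_index * 8 + i in set_bits else "0" for i in range(8)]
        if order == "Descending":
            byte.reverse()  # assumption: highest bit of each byte is written first
        out.append("".join(byte))
    return "".join(out)


ascending = bits_to_string({0, 9}, 16, "Ascending")
descending = bits_to_string({0, 9}, 16, "Descending")
```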
|
|
170 </dd>
|
|
171 <dt><strong><strong>-b, --BitStringFormat</strong> <em>BinaryString | HexadecimalString</em></strong></dt>
|
|
172 <dd>
|
|
173 <p>Format of fingerprints bit-vector string data in output SD, FP or CSV/TSV text file(s) specified by
|
|
174 <strong>--output</strong> used during <em>MACCSKeyBits</em> value of <strong>-m, --mode</strong> option. Possible
|
|
175 values: <em>BinaryString, HexadecimalString</em>. Default value: <em>BinaryString</em>.</p>
|
|
176 <p><em>BinaryString</em> corresponds to an ASCII string containing 1s and 0s. <em>HexadecimalString</em>
|
|
177 contains bit values in ASCII hexadecimal format.</p>
|
|
178 <p>Examples:</p>
|
|
179 <div class="OptionsBox">
|
|
180 FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;00000000
|
|
181 <br/> 0000000000000000000000000000000001001000010010000000010010000000011100
|
|
182 <br/> 0100101010111100011011000100110110000011011110100110111111111111011111
|
|
183 <br/> 11111111111110111000</div>
|
|
184 <div class="OptionsBox">
|
|
185 FingerprintsBitVector;MACCSKeyBits;166;HexadecimalString;Ascending;000
|
|
186 <br/> 000000021210210e845f8d8c60b79dffbffffd1</div>
|
|
187 <div class="OptionsBox">
|
|
188 FingerprintsBitVector;MACCSKeyBits;322;BinaryString;Ascending;11101011
|
|
189 <br/> 1110011111100101111111000111101100110000000000000011100010000000000000
|
|
190 <br/> 0000000000000000000000000000000000000000000000101000000000000000000000
|
|
191 <br/> 0000000000000000000000000000000000000000000000000000000000000000000000
|
|
192 <br/> 0000000000000000000000000000000000000011000000000000000000000000000000
|
|
193 <br/> 0000000000000000000000000000000000000000</div>
|
|
194 <div class="OptionsBox">
|
|
195 FingerprintsBitVector;MACCSKeyBits;322;HexadecimalString;Ascending;7d7
|
|
196 <br/> e7af3edc000c1100000000000000500000000000000000000000000000000300000000
|
|
197 <br/> 000000000</div>
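<p>The 166-key <em>BinaryString</em> and <em>HexadecimalString</em> examples above appear to be related by encoding each group of four bits as one hexadecimal digit with the lowest bit first. The Python sketch below checks that reading against the published example; the function name is hypothetical, and the rule is inferred from the examples rather than taken from the MayaChemTools source.</p>

```python
def ascending_bits_to_hex(bits):
    """Encode each group of four bits as one hex digit, lowest bit first."""
    digits = []
    for i in range(0, len(bits), 4):
        group = bits[i:i + 4]
        # Reverse so the first (lowest) bit becomes the least significant one.
        digits.append(format(int(group[::-1], 2), "x"))
    return "".join(digits)


# The 166-key BinaryString example above, rebuilt piece by piece (168 characters).
example_bits = (
    "0" * 41
    + "1001000010010000000010010000000011100"
    + "0100101010" "1111000110" "1100010011" "0110000011"
    + "0111101001" "1011111111" "1111011111"
    + "1" * 13 + "0111000"
)
example_hex = ascending_bits_to_hex(example_bits)
```

<p>With this input, <em>example_hex</em> is ten zeros followed by <em>21210210e845f8d8c60b79dffbffffd1</em>, which is the 166-key <em>HexadecimalString</em> example shown above.</p>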
|
|
198 </dd>
|
|
199 <dt><strong><strong>--CompoundID</strong> <em>DataFieldName or LabelPrefixString</em></strong></dt>
|
|
200 <dd>
|
|
201 <p>This value is <strong>--CompoundIDMode</strong> specific and indicates how compound ID is generated.</p>
|
|
202 <p>For <em>DataField</em> value of <strong>--CompoundIDMode</strong> option, it corresponds to datafield label name
|
|
203 whose value is used as compound ID; otherwise, it's a prefix string used for generating compound
|
|
204 IDs like LabelPrefixString<Number>. Default value, <em>Cmpd</em>, generates compound IDs which
|
|
205 look like Cmpd<Number>.</p>
|
|
206 <p>Examples for <em>DataField</em> value of <strong>--CompoundIDMode</strong>:</p>
|
|
207 <div class="OptionsBox">
|
|
208 MolID
|
|
209 <br/> ExtReg</div>
|
|
210 <p>Examples for <em>LabelPrefix</em> or <em>MolNameOrLabelPrefix</em> value of <strong>--CompoundIDMode</strong>:</p>
|
|
211 <div class="OptionsBox">
|
|
212 Compound</div>
|
|
213 <p>The value specified above generates compound IDs which correspond to Compound<Number>
|
|
214 instead of default value of Cmpd<Number>.</p>
|
|
215 </dd>
|
|
216 <dt><strong><strong>--CompoundIDLabel</strong> <em>text</em></strong></dt>
|
|
217 <dd>
|
|
218 <p>Specify compound ID column label for FP or CSV/TSV text file(s) used during <em>CompoundID</em> value
|
|
219 of <strong>--DataFieldsMode</strong> option. Default: <em>CompoundID</em>.</p>
|
|
220 </dd>
|
|
221 <dt><strong><strong>--CompoundIDMode</strong> <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em></strong></dt>
|
|
222 <dd>
|
|
223 <p>Specify how to generate compound IDs and write to FP or CSV/TSV text file(s) along with generated
|
|
224 fingerprints for <em>FP | text | all</em> values of <strong>--output</strong> option: use a <em>SDFile(s)</em> datafield value;
|
|
225 use molname line from <em>SDFile(s)</em>; generate a sequential ID with specific prefix; use combination
|
|
226 of both MolName and LabelPrefix with usage of LabelPrefix values for empty molname lines.</p>
|
|
227 <p>Possible values: <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em>.
|
|
228 Default: <em>LabelPrefix</em>.</p>
|
|
229 <p>For <em>MolNameOrLabelPrefix</em> value of <strong>--CompoundIDMode</strong>, molname line in <em>SDFile(s)</em> takes
|
|
230 precedence over sequential compound IDs generated using <em>LabelPrefix</em> and only empty molname
|
|
231 values are replaced with sequential compound IDs.</p>
|
|
232 <p>This is only used for <em>CompoundID</em> value of <strong>--DataFieldsMode</strong> option.</p>
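<p>The four modes can be summarized with a small Python sketch. The function name, its parameters, the 1-based numbering and the empty-string fallback for a missing data field are all assumptions for illustration; only the mode names and the default prefix <em>Cmpd</em> come from this document.</p>

```python
def make_compound_id(mode, count, mol_name="", data_fields=None, compound_id="Cmpd"):
    """Pick a compound ID for the count-th molecule (1-based) under a given mode."""
    data_fields = data_fields or {}
    if mode == "DataField":
        # Here compound_id names the SD data field that holds the ID.
        return data_fields.get(compound_id, "")
    if mode == "MolName":
        return mol_name
    if mode == "LabelPrefix":
        return f"{compound_id}{count}"
    if mode == "MolNameOrLabelPrefix":
        # The molname line wins; only empty names fall back to prefix + number.
        return mol_name if mol_name.strip() else f"{compound_id}{count}"
    raise ValueError(f"unknown CompoundIDMode: {mode}")


id_default = make_compound_id("LabelPrefix", 3)
id_named = make_compound_id("MolNameOrLabelPrefix", 7, mol_name="Aspirin", compound_id="Compound")
id_fallback = make_compound_id("MolNameOrLabelPrefix", 7, mol_name="", compound_id="Compound")
id_field = make_compound_id("DataField", 1, data_fields={"MolID": "M-42"}, compound_id="MolID")
```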
|
|
233 </dd>
|
|
234 <dt><strong><strong>--DataFields</strong> <em>"FieldLabel1,FieldLabel2,..."</em></strong></dt>
|
|
235 <dd>
|
|
236 <p>Comma delimited list of <em>SDFile(s)</em> data fields to extract and write to CSV/TSV text file(s) along
|
|
237 with generated fingerprints for <em>text | all</em> values of <strong>--output</strong> option.</p>
|
|
238 <p>This is only used for <em>Specify</em> value of <strong>--DataFieldsMode</strong> option.</p>
|
|
239 <p>Examples:</p>
|
|
240 <div class="OptionsBox">
|
|
241 Extreg
|
|
242 <br/> MolID,CompoundName</div>
|
|
243 </dd>
|
|
244 <dt><strong><strong>-d, --DataFieldsMode</strong> <em>All | Common | Specify | CompoundID</em></strong></dt>
|
|
245 <dd>
|
|
246 <p>Specify how data fields in <em>SDFile(s)</em> are transferred to output CSV/TSV text file(s) along
|
|
247 with generated fingerprints for <em>text | all</em> values of <strong>--output</strong> option: transfer all SD
|
|
248 data fields; transfer SD data fields common to all compounds; extract specified data fields;
|
|
249 generate a compound ID using molname line, a compound prefix, or a combination of both.
|
|
250 Possible values: <em>All | Common | Specify | CompoundID</em>. Default value: <em>CompoundID</em>.</p>
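<p>One way to picture the four modes is as a choice of which data field labels become extra columns. The Python sketch below is a hypothetical illustration, not the script's code: it treats <em>All</em> as the union of labels in first-seen order and <em>Common</em> as the labels present in every compound.</p>

```python
def select_data_fields(mode, records, specified=()):
    """Return the data field labels to write for a list of per-compound dicts."""
    if mode == "All":
        labels = []
        for record in records:
            for label in record:
                if label not in labels:
                    labels.append(label)  # keep first-seen order
        return labels
    if mode == "Common":
        common = set(records[0]) if records else set()
        for record in records[1:]:
            common &= set(record)  # keep labels present in every compound
        return sorted(common)
    if mode == "Specify":
        return list(specified)
    if mode == "CompoundID":
        return []  # only the generated compound ID column is written
    raise ValueError(f"unknown DataFieldsMode: {mode}")


sd_records = [{"MolID": "1", "Extreg": "A"}, {"MolID": "2"}]
all_labels = select_data_fields("All", sd_records)
common_labels = select_data_fields("Common", sd_records)
```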
|
|
251 </dd>
|
|
252 <dt><strong><strong>-f, --Filter</strong> <em>Yes | No</em></strong></dt>
|
|
253 <dd>
|
|
254 <p>Specify whether to check and filter compound data in SDFile(s). Possible values: <em>Yes or No</em>.
|
|
255 Default value: <em>Yes</em>.</p>
|
|
256 <p>By default, compound data is checked before calculating fingerprints and compounds containing
|
|
257 atom data corresponding to non-element symbols or no atom data are ignored.</p>
|
|
258 </dd>
|
|
259 <dt><strong><strong>--FingerprintsLabel</strong> <em>text</em></strong></dt>
|
|
260 <dd>
|
|
261 <p>SD data label or text file column label to use for fingerprints string in output SD or
|
|
262 CSV/TSV text file(s) specified by <strong>--output</strong>. Default value: <em>MACCSKeyFingerprints</em>.</p>
|
|
263 </dd>
|
|
264 <dt><strong><strong>-h, --help</strong></strong></dt>
|
|
265 <dd>
|
|
266 <p>Print this help message.</p>
|
|
267 </dd>
|
|
268 <dt><strong><strong>-k, --KeepLargestComponent</strong> <em>Yes | No</em></strong></dt>
|
|
269 <dd>
|
|
270 <p>Generate fingerprints for only the largest component in molecule. Possible values:
|
|
271 <em>Yes or No</em>. Default value: <em>Yes</em>.</p>
|
|
272 <p>For molecules containing multiple connected components, fingerprints can be generated
|
|
273 in two different ways: use all connected components or just the largest connected
|
|
274 component. By default, all atoms outside the largest connected component are
|
|
275 deleted before generation of fingerprints.</p>
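<p>Finding the largest connected component is a standard graph traversal. The following Python sketch works on 1-based atom numbers and a bond list such as the one in an SD connection table. It assumes "largest" means the most atoms and keeps the first component found on a tie; both choices, and the function name, are illustrative assumptions.</p>

```python
from collections import defaultdict


def largest_component(num_atoms, bonds):
    """Return the sorted atom numbers (1-based) of the largest connected component."""
    neighbors = defaultdict(set)
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = set()
    best = []
    for start in range(1, num_atoms + 1):
        if start in seen:
            continue
        # Depth-first walk collecting every atom reachable from start.
        stack, component = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            component.append(atom)
            for nxt in neighbors[atom]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if len(component) > len(best):
            best = component
    return sorted(best)


# Atoms 1-3 form one fragment, atoms 4-5 another (for example a counter-ion pair).
kept_atoms = largest_component(5, [(1, 2), (2, 3), (4, 5)])
```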
|
|
276 </dd>
|
|
277 <dt><strong><strong>-m, --mode</strong> <em>MACCSKeyBits | MACCSKeyCount</em></strong></dt>
|
|
278 <dd>
|
|
279 <p>Specify type of MACCS keys [ Ref 45-47 ] fingerprints to generate for molecules in <em>SDFile(s)</em>.
|
|
280 Possible values: <em>MACCSKeyBits, MACCSKeyCount</em>. Default value: <em>MACCSKeyBits</em>.</p>
|
|
281 <p>For <em>MACCSKeyBits</em> value of <strong>-m, --mode</strong> option, a fingerprint bit-vector string containing
|
|
282 zeros and ones is generated and for <em>MACCSKeyCount</em> value, a fingerprint vector string
|
|
283 corresponding to number of MACCS keys is generated.</p>
|
|
284 <p><em>MACCSKeyBits | MACCSKeyCount</em> values for <strong>-m, --mode</strong> option along with two possible
|
|
285 <em>166 | 322</em> values of <strong>-s, --size</strong> supports generation of four different types of MACCS
|
|
286 keys fingerprint: <em>MACCS166KeyBits, MACCS166KeyCount, MACCS322KeyBits, MACCS322KeyCount</em>.</p>
|
|
287 <p>Definition of MACCS keys uses the following atom and bond symbols to define atom and
|
|
288 bond environments:</p>
|
|
289 <div class="OptionsBox">
|
|
290 Atom symbols for 166 keys [ Ref 47 ]:</div>
|
|
291 <div class="OptionsBox">
|
|
292 A : Any valid periodic table element symbol
|
|
293 <br/> Q : Hetero atoms; any non-C or non-H atom
|
|
294 <br/> X : Halogens; F, Cl, Br, I
|
|
295 <br/> Z : Others; other than H, C, N, O, Si, P, S, F, Cl, Br, I</div>
|
|
296 <div class="OptionsBox">
|
|
297 Atom symbols for 322 keys [ Ref 46 ]:</div>
|
|
298 <div class="OptionsBox">
|
|
299 A : Any valid periodic table element symbol
|
|
300 <br/> Q : Hetero atoms; any non-C or non-H atom
|
|
301 <br/> X : Others; other than H, C, N, O, Si, P, S, F, Cl, Br, I
|
|
302 <br/> Z is neither defined nor used</div>
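<p>The generic atom symbols above can be expressed as simple membership tests. The Python sketch below takes the definitions literally; the function names are hypothetical, and the sketch does not check that the input is a valid periodic table element symbol.</p>

```python
HALOGENS = {"F", "Cl", "Br", "I"}
# Elements excluded from the "others" class in both key sets.
COMMON = {"H", "C", "N", "O", "Si", "P", "S", "F", "Cl", "Br", "I"}


def atom_classes_166(symbol):
    """Return the generic 166-key atom symbols that an element symbol matches."""
    classes = {"A"}                  # A: any element symbol
    if symbol not in ("C", "H"):
        classes.add("Q")             # Q: hetero atom
    if symbol in HALOGENS:
        classes.add("X")             # X: halogen
    if symbol not in COMMON:
        classes.add("Z")             # Z: others
    return classes


def atom_classes_322(symbol):
    """Return the generic 322-key atom symbols that an element symbol matches."""
    classes = {"A"}
    if symbol not in ("C", "H"):
        classes.add("Q")
    if symbol not in COMMON:
        classes.add("X")             # X means "others" here; Z is not used
    return classes
```

<p>For example, chlorine matches A, Q and X in the 166-key scheme but only A and Q in the 322-key scheme, while iron matches Z in the former and X in the latter.</p>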
|
|
303 <div class="OptionsBox">
|
|
304 Bond types:</div>
|
|
305 <div class="OptionsBox">
|
|
306 - : Single
|
|
307 <br/> = : Double
|
|
308 <br/> T : Triple
|
|
309 <br/> # : Triple
|
|
310 <br/> ~ : Single or double query bond
|
|
311 <br/> % : An aromatic query bond</div>
|
|
312 <div class="OptionsBox">
|
|
313 None : Any bond type; no explicit bond specified</div>
|
|
314 <div class="OptionsBox">
|
|
315 $ : Ring bond; $ before a bond type specifies ring bond
|
|
316 <br/> ! : Chain or non-ring bond; ! before a bond type specifies chain bond</div>
|
|
317 <div class="OptionsBox">
|
|
318 @ : A ring linkage and the number following it specifies the
|
|
319 atoms position in the line, thus @1 means linked back to the first
|
|
320 atom in the list.</div>
|
|
321 <div class="OptionsBox">
|
|
322 Aromatic: Kekule or Arom5</div>
|
|
323 <div class="OptionsBox">
|
|
324 Kekule: Bonds in 6-membered rings with alternate single/double bonds
|
|
325 or perimeter bonds
|
|
326 <br/> Arom5: Bonds in 5-membered rings with two double bonds and a hetero
|
|
327 atom at the apex of the ring.</div>
|
|
328 <p>MACCS 166 keys [ Ref 45-47 ] are defined as follows:</p>
|
|
329 <div class="OptionsBox">
|
|
330 Key Description</div>
|
|
331 <div class="OptionsBox">
|
|
332 1 ISOTOPE
|
|
333 <br/> 2 103 < ATOMIC NO. < 256
|
|
334 <br/> 3 GROUP IVA,VA,VIA PERIODS 4-6 (Ge...)
|
|
335 <br/> 4 ACTINIDE
|
|
336 <br/> 5 GROUP IIIB,IVB (Sc...)
|
|
337 <br/> 6 LANTHANIDE
|
|
338 <br/> 7 GROUP VB,VIB,VIIB (V...)
|
|
339 <br/> 8 QAAA@1
|
|
340 <br/> 9 GROUP VIII (Fe...)
|
|
341 <br/> 10 GROUP IIA (ALKALINE EARTH)
|
|
342 <br/> 11 4M RING
|
|
343 <br/> 12 GROUP IB,IIB (Cu...)
|
|
344 <br/> 13 ON(C)C
|
|
345 <br/> 14 S-S
|
|
346 <br/> 15 OC(O)O
|
|
347 <br/> 16 QAA@1
|
|
348 <br/> 17 CTC
|
|
349 <br/> 18 GROUP IIIA (B...)
|
|
350 <br/> 19 7M RING
|
|
351 <br/> 20 SI
|
|
352 <br/> 21 C=C(Q)Q
|
|
353 <br/> 22 3M RING
|
|
354 <br/> 23 NC(O)O
|
|
355 <br/> 24 N-O
|
|
356 <br/> 25 NC(N)N
|
|
357 <br/> 26 C$=C($A)$A
|
|
358 <br/> 27 I
|
|
359 <br/> 28 QCH2Q
|
|
360 <br/> 29 P
|
|
361 <br/> 30 CQ(C)(C)A
|
|
362 <br/> 31 QX
|
|
363 <br/> 32 CSN
|
|
364 <br/> 33 NS
|
|
365 <br/> 34 CH2=A
|
|
366 <br/> 35 GROUP IA (ALKALI METAL)
|
|
367 <br/> 36 S HETEROCYCLE
|
|
368 <br/> 37 NC(O)N
|
|
369 <br/> 38 NC(C)N
|
|
370 <br/> 39 OS(O)O
|
|
371 <br/> 40 S-O
|
|
372 <br/> 41 CTN
|
|
373 <br/> 42 F
|
|
374 <br/> 43 QHAQH
|
|
375 <br/> 44 OTHER
|
|
376 <br/> 45 C=CN
|
|
377 <br/> 46 BR
|
|
378 <br/> 47 SAN
|
|
379 <br/> 48 OQ(O)O
|
|
380 <br/> 49 CHARGE
|
|
381 <br/> 50 C=C(C)C
|
|
382 <br/> 51 CSO
|
|
383 <br/> 52 NN
|
|
384 <br/> 53 QHAAAQH
|
|
385 <br/> 54 QHAAQH
|
|
386 <br/> 55 OSO
|
|
387 <br/> 56 ON(O)C
|
|
388 <br/> 57 O HETEROCYCLE
|
|
389 <br/> 58 QSQ
|
|
390 <br/> 59 Snot%A%A
|
|
391 <br/> 60 S=O
|
|
392 <br/> 61 AS(A)A
|
|
393 <br/> 62 A$A!A$A
|
|
394 <br/> 63 N=O
|
|
395 <br/> 64 A$A!S
|
|
396 <br/> 65 C%N
|
|
397 <br/> 66 CC(C)(C)A
|
|
398 <br/> 67 QS
|
|
399 <br/> 68 QHQH (&...)
|
|
400 <br/> 69 QQH
|
|
401 <br/> 70 QNQ
|
|
402 <br/> 71 NO
|
|
403 <br/> 72 OAAO
|
|
404 <br/> 73 S=A
|
|
405 <br/> 74 CH3ACH3
|
|
406 <br/> 75 A!N$A
|
|
407 <br/> 76 C=C(A)A
|
|
408 <br/> 77 NAN
|
|
409 <br/> 78 C=N
|
|
410 <br/> 79 NAAN
|
|
411 <br/> 80 NAAAN
|
|
412 <br/> 81 SA(A)A
|
|
413 <br/> 82 ACH2QH
|
|
414 <br/> 83 QAAAA@1
|
|
415 <br/> 84 NH2
|
|
416 <br/> 85 CN(C)C
|
|
417 <br/> 86 CH2QCH2
|
|
418 <br/> 87 X!A$A
|
|
419 <br/> 88 S
|
|
420 <br/> 89 OAAAO
|
|
421 <br/> 90 QHAACH2A
|
|
422 <br/> 91 QHAAACH2A
|
|
423 <br/> 92 OC(N)C
|
|
424 <br/> 93 QCH3
|
|
425 <br/> 94 QN
|
|
426 <br/> 95 NAAO
|
|
427 <br/> 96 5M RING
|
|
428 <br/> 97 NAAAO
|
|
429 <br/> 98 QAAAAA@1
|
|
430 <br/> 99 C=C
|
|
431 <br/> 100 ACH2N
|
|
432 <br/> 101 8M RING
|
|
433 <br/> 102 QO
|
|
434 <br/> 103 CL
|
|
435 <br/> 104 QHACH2A
|
|
436 <br/> 105 A$A($A)$A
|
|
437 <br/> 106 QA(Q)Q
|
|
438 <br/> 107 XA(A)A
|
|
439 <br/> 108 CH3AAACH2A
|
|
440 <br/> 109 ACH2O
|
|
441 <br/> 110 NCO
|
|
442 <br/> 111 NACH2A
|
|
443 <br/> 112 AA(A)(A)A
|
|
444 <br/> 113 Onot%A%A
|
|
445 <br/> 114 CH3CH2A
|
|
446 <br/> 115 CH3ACH2A
|
|
447 <br/> 116 CH3AACH2A
|
|
448 <br/> 117 NAO
|
|
449 <br/> 118 ACH2CH2A > 1
|
|
450 <br/> 119 N=A
|
|
451 <br/> 120 HETEROCYCLIC ATOM > 1 (&...)
|
|
452 <br/> 121 N HETEROCYCLE
|
|
453 <br/> 122 AN(A)A
|
|
454 <br/> 123 OCO
|
|
455 <br/> 124 QQ
|
|
456 <br/> 125 AROMATIC RING > 1
|
|
457 <br/> 126 A!O!A
|
|
458 <br/> 127 A$A!O > 1 (&...)
|
|
459 <br/> 128 ACH2AAACH2A
|
|
460 <br/> 129 ACH2AACH2A
|
|
461 <br/> 130 QQ > 1 (&...)
|
|
462 <br/> 131 QH > 1
|
|
463 <br/> 132 OACH2A
|
|
464 <br/> 133 A$A!N
|
|
465 <br/> 134 X (HALOGEN)
|
|
466 <br/> 135 Nnot%A%A
|
|
467 <br/> 136 O=A > 1
|
|
468 <br/> 137 HETEROCYCLE
|
|
469 <br/> 138 QCH2A > 1 (&...)
|
|
470 <br/> 139 OH
|
|
471 <br/> 140 O > 3 (&...)
|
|
472 <br/> 141 CH3 > 2 (&...)
|
|
473 <br/> 142 N > 1
|
|
474 <br/> 143 A$A!O
|
|
475 <br/> 144 Anot%A%Anot%A
|
|
476 <br/> 145 6M RING > 1
|
|
477 <br/> 146 O > 2
|
|
478 <br/> 147 ACH2CH2A
|
|
479 <br/> 148 AQ(A)A
|
|
480 <br/> 149 CH3 > 1
|
|
481 <br/> 150 A!A$A!A
|
|
482 <br/> 151 NH
|
|
483 <br/> 152 OC(C)C
|
|
484 <br/> 153 QCH2A
|
|
485 <br/> 154 C=O
|
|
486 <br/> 155 A!CH2!A
|
|
487 <br/> 156 NA(A)A
|
|
488 <br/> 157 C-O
|
|
489 <br/> 158 C-N
|
|
490 <br/> 159 O > 1
|
|
491 <br/> 160 CH3
|
|
492 <br/> 161 N
|
|
493 <br/> 162 AROMATIC
|
|
494 <br/> 163 6M RING
|
|
495 <br/> 164 O
|
|
496 <br/> 165 RING
|
|
497 <br/> 166 FRAGMENTS</div>
|
|
498 <p>The MACCS 322 keys set, as defined in tables 1, 2 and 3 [ Ref 46 ], includes:</p>
|
|
499 <div class="OptionsBox">
|
|
500 . 26 atom properties of type P, as listed in Table 1
|
|
501 <br/> . 32 one-atom environments, as listed in Table 3
|
|
502 <br/> . 264 atom-bond-atom combinations listed in Table 4</div>
|
|
503 <p>Total number of keys in the three tables is 322.</p>
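<p>The atom-bond-atom keys listed further below follow a regular pattern: for each bond symbol, every unordered pair drawn from eleven atom symbols is listed in a fixed order, giving 66 keys per bond symbol. The Python sketch below reproduces that numbering for the bond symbols <em>-</em>, <em>=</em> and <em>#</em>. The atom order and the starting key number are read off the listing, and the function name is hypothetical.</p>

```python
# Atom order as it appears in the atom-bond-atom listing (keys 59 onward).
ATOMS = ["C", "N", "O", "S", "Cl", "P", "F", "Br", "Si", "I", "X"]
BONDS = ["-", "=", "#"]  # single, double, triple


def atom_bond_atom_keys(first_key=59):
    """Enumerate key number -> description in the order used by the listing."""
    keys = {}
    number = first_key
    for bond in BONDS:
        for i, first in enumerate(ATOMS):
            for second in ATOMS[i:]:  # unordered pairs, same atom allowed
                keys[number] = f"{first}{bond}{second}"
                number += 1
    return keys


aba_keys = atom_bond_atom_keys()
pairs_per_bond = len(ATOMS) * (len(ATOMS) + 1) // 2
```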
|
|
504 <p>Atom symbol, X, used for 322 keys [ Ref 46 ] doesn't refer to Halogens as it does for 166 keys. In
|
|
505 order to keep the definition of 322 keys consistent with the published definitions, the symbol X is
|
|
506 used to imply "other" atoms, but it's internally mapped to symbol Z as defined for 166 keys
|
|
507 during the generation of key values.</p>
|
|
508 <p>Atom properties-based keys (26):</p>
|
|
509 <div class="OptionsBox">
|
|
510 Key Description
|
|
511 <br/> 1 A(AAA) or AA(A)A - atom with at least three neighbors
|
|
512 <br/> 2 Q - heteroatom
|
|
513 <br/> 3 Anot%not-A - atom involved in one or more multiple bonds, not aromatic
|
|
514 <br/> 4 A(AAAA) or AA(A)(A)A - atom with at least four neighbors
|
|
515 <br/> 5 A(QQ) or QA(Q) - atom with at least two heteroatom neighbors
|
|
516 <br/> 6 A(QQQ) or QA(Q)Q - atom with at least three heteroatom neighbors
|
|
517 <br/> 7 QH - heteroatom with at least one hydrogen attached
|
|
518 <br/> 8 CH2(AA) or ACH2A - carbon with at least two single bonds and at least
|
|
519 two hydrogens attached
|
|
520 <br/> 9 CH3(A) or ACH3 - carbon with at least one single bond and at least three
|
|
521 hydrogens attached
|
|
522 <br/> 10 Halogen
|
|
523 <br/> 11 A(-A-A-A) or A-A(-A)-A - atom has at least three single bonds
|
|
524 <br/> 12 AAAAAA@1 > 2 - atom is in at least two different six-membered rings
|
|
525 <br/> 13 A($A$A$A) or A$A($A)$A - atom has more than two ring bonds
|
|
526 <br/> 14 A$A!A$A - atom is at a ring/chain boundary. When a comparison is done
|
|
527 with another atom the path passes through the chain bond.
|
|
528 <br/> 15 Anot%A%Anot%A - atom is at an aromatic/nonaromatic boundary. When a
|
|
529 comparison is done with another atom the path
|
|
530 passes through the aromatic bond.
|
|
531 <br/> 16 A!A!A - atom with more than one chain bond
|
|
532 <br/> 17 A!A$A!A - atom is at a ring/chain boundary. When a comparison is done
|
|
533 with another atom the path passes through the ring bond.
|
|
534 <br/> 18 A%Anot%A%A - atom is at an aromatic/nonaromatic boundary. When a
|
|
535 comparison is done with another atom the
|
|
536 path passes through the nonaromatic bond.
|
|
537 <br/> 19 HETEROCYCLE - atom is a heteroatom in a ring.
|
|
538 <br/> 20 rare properties: atom with five or more neighbors, atom in
|
|
539 four or more rings, or atom types other than
|
|
540 H, C, N, O, S, F, Cl, Br, or I
|
|
541 <br/> 21 rare properties: atom has a charge, is an isotope, has two or
|
|
542 more multiple bonds, or has a triple bond.
|
|
543 <br/> 22 N - nitrogen
|
|
544 <br/> 23 S - sulfur
|
|
545 <br/> 24 O - oxygen
|
|
546 <br/> 25 A(AA)A(A)A(AA) - atom has two neighbors, each with three or
|
|
547 more neighbors (including the central atom).
|
|
548 <br/> 26 CHACH2 - atom has two hydrocarbon (CH2) neighbors</div>
|
|
549 <p>Atomic environments properties-based keys (32):</p>
|
|
550 <div class="OptionsBox">
|
|
551 Key Description
|
|
552 <br/> 27 C(CC)
|
|
553 <br/> 28 C(CCC)
|
|
554 <br/> 29 C(CN)
|
|
555 <br/> 30 C(CCN)
|
|
556 <br/> 31 C(NN)
|
|
557 <br/> 32 C(NNC)
|
|
558 <br/> 33 C(NNN)
|
|
559 <br/> 34 C(CO)
|
|
560 <br/> 35 C(CCO)
|
|
561 <br/> 36 C(NO)
|
|
562 <br/> 37 C(NCO)
|
|
563 <br/> 38 C(NNO)
|
|
564 <br/> 39 C(OO)
|
|
565 <br/> 40 C(COO)
|
|
566 <br/> 41 C(NOO)
|
|
567 <br/> 42 C(OOO)
|
|
568 <br/> 43 Q(CC)
|
|
569 <br/> 44 Q(CCC)
|
|
570 <br/> 45 Q(CN)
|
|
571 <br/> 46 Q(CCN)
|
|
572 <br/> 47 Q(NN)
|
|
573 <br/> 48 Q(CNN)
|
|
574 <br/> 49 Q(NNN)
|
|
575 <br/> 50 Q(CO)
|
|
576 <br/> 51 Q(CCO)
|
|
577 <br/> 52 Q(NO)
|
|
578 <br/> 53 Q(CNO)
|
|
579 <br/> 54 Q(NNO)
|
|
580 <br/> 55 Q(OO)
|
|
581 <br/> 56 Q(COO)
|
|
582 <br/> 57 Q(NOO)
|
|
583 <br/> 58 Q(OOO)</div>
|
|
584 <p>Note: The first symbol is the central atom, with atoms bonded to the central atom listed in
|
|
585 parentheses. Q is any non-C, non-H atom. If only two atoms are in parentheses, there is
|
|
586 no implication concerning the other atoms bonded to the central atom.</p>
|
|
587 <p>Atom-Bond-Atom properties-based keys: (264)</p>
|
|
588 <div class="OptionsBox">
|
|
589 Key Description
|
|
590 <br/> 59 C-C
|
|
591 <br/> 60 C-N
|
|
592 <br/> 61 C-O
|
|
593 <br/> 62 C-S
|
|
594 <br/> 63 C-Cl
|
|
595 <br/> 64 C-P
|
|
596 <br/> 65 C-F
|
|
597 <br/> 66 C-Br
|
|
598 <br/> 67 C-Si
|
|
599 <br/> 68 C-I
|
|
600 <br/> 69 C-X
|
|
601 <br/> 70 N-N
|
|
602 <br/> 71 N-O
|
|
603 <br/> 72 N-S
|
|
604 <br/> 73 N-Cl
|
|
605 <br/> 74 N-P
|
|
606 <br/> 75 N-F
|
|
607 <br/> 76 N-Br
|
|
608 <br/> 77 N-Si
|
|
609 <br/> 78 N-I
|
|
610 <br/> 79 N-X
|
|
611 <br/> 80 O-O
|
|
612 <br/> 81 O-S
|
|
613 <br/> 82 O-Cl
|
|
614 <br/> 83 O-P
|
|
615 <br/> 84 O-F
|
|
616 <br/> 85 O-Br
|
|
617 <br/> 86 O-Si
|
|
618 <br/> 87 O-I
|
|
619 <br/> 88 O-X
|
|
620 <br/> 89 S-S
|
|
621 <br/> 90 S-Cl
|
|
622 <br/> 91 S-P
|
|
623 <br/> 92 S-F
|
|
624 <br/> 93 S-Br
|
|
625 <br/> 94 S-Si
|
|
626 <br/> 95 S-I
|
|
627 <br/> 96 S-X
|
|
628 <br/> 97 Cl-Cl
|
|
629 <br/> 98 Cl-P
|
|
630 <br/> 99 Cl-F
|
|
631 <br/> 100 Cl-Br
|
|
632 <br/> 101 Cl-Si
|
|
633 <br/> 102 Cl-I
|
|
634 <br/> 103 Cl-X
|
|
635 <br/> 104 P-P
|
|
636 <br/> 105 P-F
|
|
637 <br/> 106 P-Br
|
|
638 <br/> 107 P-Si
|
|
639 <br/> 108 P-I
|
|
640 <br/> 109 P-X
|
|
641 <br/> 110 F-F
|
|
642 <br/> 111 F-Br
|
|
643 <br/> 112 F-Si
|
|
644 <br/> 113 F-I
|
|
645 <br/> 114 F-X
|
|
646 <br/> 115 Br-Br
|
|
647 <br/> 116 Br-Si
|
|
648 <br/> 117 Br-I
|
|
649 <br/> 118 Br-X
|
|
650 <br/> 119 Si-Si
|
|
651 <br/> 120 Si-I
|
|
652 <br/> 121 Si-X
|
|
653 <br/> 122 I-I
|
|
654 <br/> 123 I-X
|
|
655 <br/> 124 X-X
|
|
656 <br/> 125 C=C
|
|
657 <br/> 126 C=N
|
|
658 <br/> 127 C=O
|
|
659 <br/> 128 C=S
|
|
660 <br/> 129 C=Cl
|
|
661 <br/> 130 C=P
|
|
662 <br/> 131 C=F
|
|
663 <br/> 132 C=Br
|
|
664 <br/> 133 C=Si
|
|
665 <br/> 134 C=I
|
|
666 <br/> 135 C=X
|
|
667 <br/> 136 N=N
|
|
668 <br/> 137 N=O
|
|
669 <br/> 138 N=S
|
|
670 <br/> 139 N=Cl
|
|
671 <br/> 140 N=P
|
|
672 <br/> 141 N=F
|
|
673 <br/> 142 N=Br
|
|
674 <br/> 143 N=Si
|
|
675 <br/> 144 N=I
|
|
676 <br/> 145 N=X
|
|
677 <br/> 146 O=O
|
|
678 <br/> 147 O=S
|
|
679 <br/> 148 O=Cl
|
|
680 <br/> 149 O=P
|
|
681 <br/> 150 O=F
|
|
682 <br/> 151 O=Br
|
|
683 <br/> 152 O=Si
|
|
684 <br/> 153 O=I
|
|
685 <br/> 154 O=X
|
|
686 <br/> 155 S=S
|
|
687 <br/> 156 S=Cl
|
|
688 <br/> 157 S=P
|
|
689 <br/> 158 S=F
|
|
690 <br/> 159 S=Br
|
|
691 <br/> 160 S=Si
|
|
692 <br/> 161 S=I
|
|
693 <br/> 162 S=X
|
|
694 <br/> 163 Cl=Cl
|
|
695 <br/> 164 Cl=P
|
|
696 <br/> 165 Cl=F
|
|
697 <br/> 166 Cl=Br
|
|
698 <br/> 167 Cl=Si
|
|
699 <br/> 168 Cl=I
|
|
700 <br/> 169 Cl=X
|
|
701 <br/> 170 P=P
|
|
702 <br/> 171 P=F
|
|
703 <br/> 172 P=Br
|
|
704 <br/> 173 P=Si
|
|
705 <br/> 174 P=I
|
|
706 <br/> 175 P=X
|
|
707 <br/> 176 F=F
|
|
708 <br/> 177 F=Br
|
|
709 <br/> 178 F=Si
|
|
710 <br/> 179 F=I
|
|
711 <br/> 180 F=X
|
|
712 <br/> 181 Br=Br
|
|
713 <br/> 182 Br=Si
|
|
714 <br/> 183 Br=I
|
|
715 <br/> 184 Br=X
|
|
716 <br/> 185 Si=Si
|
|
717 <br/> 186 Si=I
|
|
718 <br/> 187 Si=X
|
|
719 <br/> 188 I=I
|
|
720 <br/> 189 I=X
|
|
721 <br/> 190 X=X
|
|
722 <br/> 191 C#C
|
|
723 <br/> 192 C#N
|
|
724 <br/> 193 C#O
|
|
725 <br/> 194 C#S
|
|
726 <br/> 195 C#Cl
|
|
727 <br/> 196 C#P
|
|
728 <br/> 197 C#F
|
|
729 <br/> 198 C#Br
|
|
730 <br/> 199 C#Si
|
|
731 <br/> 200 C#I
|
|
732 <br/> 201 C#X
|
|
733 <br/> 202 N#N
|
|
734 <br/> 203 N#O
|
|
735 <br/> 204 N#S
|
|
736 <br/> 205 N#Cl
|
|
737 <br/> 206 N#P
|
|
738 <br/> 207 N#F
|
|
739 <br/> 208 N#Br
|
|
740 <br/> 209 N#Si
|
|
741 <br/> 210 N#I
|
|
742 <br/> 211 N#X
|
|
743 <br/> 212 O#O
|
|
744 <br/> 213 O#S
|
|
745 <br/> 214 O#Cl
|
|
746 <br/> 215 O#P
|
|
747 <br/> 216 O#F
|
|
748 <br/> 217 O#Br
|
|
749 <br/> 218 O#Si
|
|
750 <br/> 219 O#I
|
|
751 <br/> 220 O#X
|
|
752 <br/> 221 S#S
|
|
753 <br/> 222 S#Cl
|
|
754 <br/> 223 S#P
|
|
755 <br/> 224 S#F
|
|
756 <br/> 225 S#Br
|
|
757 <br/> 226 S#Si
|
|
758 <br/> 227 S#I
|
|
759 <br/> 228 S#X
|
|
760 <br/> 229 Cl#Cl
|
|
761 <br/> 230 Cl#P
|
|
762 <br/> 231 Cl#F
|
|
763 <br/> 232 Cl#Br
|
|
764 <br/> 233 Cl#Si
|
|
765 <br/> 234 Cl#I
|
|
766 <br/> 235 Cl#X
|
|
767 <br/> 236 P#P
|
|
768 <br/> 237 P#F
|
|
769 <br/> 238 P#Br
|
|
770 <br/> 239 P#Si
|
|
771 <br/> 240 P#I
|
|
772 <br/> 241 P#X
|
|
773 <br/> 242 F#F
|
|
774 <br/> 243 F#Br
|
|
775 <br/> 244 F#Si
|
|
776 <br/> 245 F#I
|
|
777 <br/> 246 F#X
|
|
778 <br/> 247 Br#Br
|
|
779 <br/> 248 Br#Si
|
|
780 <br/> 249 Br#I
|
|
781 <br/> 250 Br#X
|
|
782 <br/> 251 Si#Si
|
|
783 <br/> 252 Si#I
|
|
784 <br/> 253 Si#X
|
|
785 <br/> 254 I#I
|
|
786 <br/> 255 I#X
|
|
787 <br/> 256 X#X
|
|
788 <br/> 257 C$C
|
|
789 <br/> 258 C$N
|
|
790 <br/> 259 C$O
|
|
791 <br/> 260 C$S
|
|
792 <br/> 261 C$Cl
|
|
793 <br/> 262 C$P
|
|
794 <br/> 263 C$F
|
|
795 <br/> 264 C$Br
|
|
796 <br/> 265 C$Si
|
|
797 <br/> 266 C$I
|
|
798 <br/> 267 C$X
|
|
799 <br/> 268 N$N
|
|
800 <br/> 269 N$O
|
|
801 <br/> 270 N$S
|
|
802 <br/> 271 N$Cl
|
|
803 <br/> 272 N$P
|
|
804 <br/> 273 N$F
|
|
805 <br/> 274 N$Br
|
|
806 <br/> 275 N$Si
|
|
807 <br/> 276 N$I
|
|
808 <br/> 277 N$X
|
|
809 <br/> 278 O$O
|
|
810 <br/> 279 O$S
|
|
811 <br/> 280 O$Cl
|
|
812 <br/> 281 O$P
|
|
813 <br/> 282 O$F
|
|
814 <br/> 283 O$Br
|
|
815 <br/> 284 O$Si
|
|
816 <br/> 285 O$I
|
|
817 <br/> 286 O$X
|
|
818 <br/> 287 S$S
|
|
819 <br/> 288 S$Cl
|
|
820 <br/> 289 S$P
|
|
821 <br/> 290 S$F
|
|
822 <br/> 291 S$Br
|
|
823 <br/> 292 S$Si
|
|
824 <br/> 293 S$I
|
|
825 <br/> 294 S$X
|
|
826 <br/> 295 Cl$Cl
|
|
827 <br/> 296 Cl$P
|
|
828 <br/> 297 Cl$F
|
|
829 <br/> 298 Cl$Br
|
|
830 <br/> 299 Cl$Si
|
|
831 <br/> 300 Cl$I
|
|
832 <br/> 301 Cl$X
|
|
833 <br/> 302 P$P
|
|
834 <br/> 303 P$F
|
|
835 <br/> 304 P$Br
|
|
836 <br/> 305 P$Si
|
|
837 <br/> 306 P$I
|
|
838 <br/> 307 P$X
|
|
839 <br/> 308 F$F
|
|
840 <br/> 309 F$Br
|
|
841 <br/> 310 F$Si
|
|
842 <br/> 311 F$I
|
|
843 <br/> 312 F$X
|
|
844 <br/> 313 Br$Br
|
|
845 <br/> 314 Br$Si
|
|
846 <br/> 315 Br$I
|
|
847 <br/> 316 Br$X
|
|
848 <br/> 317 Si$Si
|
|
849 <br/> 318 Si$I
|
|
850 <br/> 319 Si$X
|
|
851 <br/> 320 I$I
|
|
852 <br/> 321 I$X
|
|
853 <br/> 322 X$X</div>
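<p>The numbering of the atom-bond-atom keys in the table above follows a regular pattern: for each bond symbol in turn, every unordered pair of atom symbols is listed in a fixed order. The following is a minimal Python sketch that reproduces the numbering. The names <em>ATOM_SYMBOLS</em>, <em>BOND_SYMBOLS</em>, and <em>atom_bond_atom_keys</em> are illustrative and are not part of MayaChemTools.</p>

```python
from itertools import combinations_with_replacement

# Symbol order as it appears in the key table (illustrative constants,
# not taken from the MayaChemTools source code).
ATOM_SYMBOLS = ["C", "N", "O", "S", "Cl", "P", "F", "Br", "Si", "I", "X"]
BOND_SYMBOLS = ["-", "=", "#", "$"]
FIRST_KEY = 59  # first atom-bond-atom key number in the table


def atom_bond_atom_keys():
    """Return a dict mapping key number to its description, e.g. 59 -> 'C-C'."""
    keys = {}
    key_num = FIRST_KEY
    for bond in BOND_SYMBOLS:
        # Unordered atom pairs, including identical atoms, in table order.
        for first, second in combinations_with_replacement(ATOM_SYMBOLS, 2):
            keys[key_num] = f"{first}{bond}{second}"
            key_num += 1
    return keys


keys = atom_bond_atom_keys()
print(len(keys), keys[59], keys[124], keys[125], keys[322])
```

<p>Each bond symbol contributes 66 pairs (11 atom symbols taken two at a time with repetition), which gives the 264 keys numbered 59 through 322.</p>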
|
|
854 </dd>
|
|
855 <dt><strong><strong>--OutDelim</strong> <em>comma | tab | semicolon</em></strong></dt>
|
|
856 <dd>
|
|
857 <p>Delimiter for output CSV/TSV text file(s). Possible values: <em>comma, tab, or semicolon</em>.
|
|
858 Default value: <em>comma</em>.</p>
|
|
859 </dd>
|
|
860 <dt><strong><strong>--output</strong> <em>SD | FP | text | all</em></strong></dt>
|
|
861 <dd>
|
|
862 <p>Type of output files to generate. Possible values: <em>SD, FP, text, or all</em>. Default value: <em>text</em>.</p>
|
|
863 </dd>
|
|
864 <dt><strong><strong>-o, --overwrite</strong></strong></dt>
|
|
865 <dd>
|
|
866 <p>Overwrite existing files.</p>
|
|
867 </dd>
|
|
868 <dt><strong><strong>-q, --quote</strong> <em>Yes | No</em></strong></dt>
|
|
869 <dd>
|
|
870 <p>Put quotes around column values in output CSV/TSV text file(s). Possible values:
|
|
871 <em>Yes or No</em>. Default value: <em>Yes</em>.</p>
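<p>Quoting matters most with the <em>semicolon</em> delimiter, because the fingerprints vector strings shown in this document themselves contain semicolons. The following Python sketch reads a quoted text file with the standard csv module. The column names <em>CompoundID</em> and <em>MACCSKeysFingerprints</em> and the sample row are hypothetical and are used only for illustration.</p>

```python
import csv
import io

# Map the --OutDelim values to the actual delimiter characters.
DELIMITERS = {"comma": ",", "tab": "\t", "semicolon": ";"}


def read_fingerprints_text(handle, out_delim="comma"):
    """Read delimited text with double-quoted column values into a list of dicts."""
    reader = csv.reader(handle, delimiter=DELIMITERS[out_delim.lower()], quotechar='"')
    header = next(reader)
    return [dict(zip(header, row)) for row in reader]


# Hypothetical file content; real column labels may differ.
sample = io.StringIO(
    '"CompoundID";"MACCSKeysFingerprints"\n'
    '"Cmpd1";"FingerprintsVector;MACCSKeyCount;3;OrderedNumericalValues;ValuesString;0 1 2"\n'
)
rows = read_fingerprints_text(sample, out_delim="semicolon")
print(rows[0]["CompoundID"])
print(rows[0]["MACCSKeysFingerprints"])
```

<p>Without quotes, the semicolons inside the vector string would be indistinguishable from column separators.</p>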
|
|
872 </dd>
|
|
873 <dt><strong><strong>-r, --root</strong> <em>RootName</em></strong></dt>
|
|
874 <dd>
|
|
875 <p>New file name is generated using the root: &lt;Root&gt;.&lt;Ext&gt;. Default for new file
|
|
876 names: &lt;SDFileName&gt;&lt;MACCSKeysFP&gt;.&lt;Ext&gt;. The file type determines the &lt;Ext&gt; value.
|
|
877 The sdf, fpf, csv, and tsv &lt;Ext&gt; values are used for SD, FP, comma/semicolon, and tab
|
|
878 delimited text files, respectively. This option is ignored for multiple input files.</p>
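<p>The naming rule can be sketched in Python as follows. This is an illustration of the rule as described here, not the script's own code. The function name <em>output_file_names</em> is hypothetical, and the sketch assumes that the default name appends the literal text MACCSKeysFP to the SD file name without its extension.</p>

```python
from pathlib import Path

# Extensions for the non-text output types, as listed in the --root description.
EXTENSIONS = {"sd": "sdf", "fp": "fpf"}


def text_extension(out_delim):
    """Tab-delimited text uses tsv; comma and semicolon both use csv."""
    return "tsv" if out_delim.lower() == "tab" else "csv"


def output_file_names(sd_file, output="text", out_delim="comma", root=None):
    """Return the output file names for one input SD file."""
    # Assumption: the default base name is <SDFileName> followed by MACCSKeysFP.
    base = root if root else Path(sd_file).stem + "MACCSKeysFP"
    kinds = ["sd", "fp", "text"] if output.lower() == "all" else [output.lower()]
    names = []
    for kind in kinds:
        ext = EXTENSIONS.get(kind, text_extension(out_delim))
        names.append(f"{base}.{ext}")
    return names


print(output_file_names("Sample.sdf"))
print(output_file_names("Sample.sdf", output="all", root="SampleMACCS166FPBin"))
print(output_file_names("Sample.sdf", out_delim="tab", root="SampleMACCS322FPCount"))
```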
|
|
879 </dd>
|
|
880 <dt><strong><strong>-s, --size</strong> <em>number</em></strong></dt>
|
|
881 <dd>
|
|
882 <p>Size of MACCS keys [ Ref 45-47 ] set to use during fingerprints generation. Possible values: <em>166 or 322</em>.
|
|
883 Default value: <em>166</em>.</p>
|
|
884 </dd>
|
|
885 <dt><strong><strong>-v, --VectorStringFormat</strong> <em>ValuesString | IDsAndValuesString | IDsAndValuesPairsString | ValuesAndIDsString | ValuesAndIDsPairsString</em></strong></dt>
|
|
886 <dd>
|
|
887 <p>Format of fingerprints vector string data in output SD, FP or CSV/TSV text file(s) specified by
|
|
888 <strong>--output</strong>; used when the <strong>-m, --mode</strong> option is <em>MACCSKeyCount</em>. Possible
|
|
889 values: <em>ValuesString, IDsAndValuesString, IDsAndValuesPairsString, ValuesAndIDsString, or
|
|
890 ValuesAndIDsPairsString</em>. Default value: <em>ValuesString</em>.</p>
|
|
891 <p>Examples:</p>
|
|
892 <div class="OptionsBox">
|
|
893 FingerprintsVector;MACCSKeyCount;166;OrderedNumericalValues;ValuesStri
|
|
894 <br/> ng;0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
|
|
895 <br/> 0 0 0 0 0 0 0 1 0 0 3 0 0 0 0 4 0 0 2 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0
|
|
896 <br/> 0 0 0 0 1 1 8 0 0 0 1 0 0 1 0 1 0 1 0 3 1 3 1 0 0 0 1 2 0 11 1 0 0 0
|
|
897 <br/> 5 0 0 1 2 0 1 1 0 0 0 0 0 1 1 0 1 1 1 1 0 4 0 0 1 1 0 4 6 1 1 1 2 1 1
|
|
898 <br/> 3 5 2 2 0 5 3 5 1 1 2 5 1 2 1 2 4 8 3 5 5 2 2 0 3 5 4 1</div>
|
|
899 <div class="OptionsBox">
|
|
900 FingerprintsVector;MACCSKeyCount;322;OrderedNumericalValues;ValuesStri
|
|
901 <br/> ng;14 8 2 0 2 0 4 4 2 1 4 0 0 2 5 10 5 2 1 0 0 2 0 5 13 3 28 5 5 3 0 0
|
|
902 <br/> 0 4 2 1 1 0 1 1 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 22 5 3 0 0 0 1 0
|
|
903 <br/> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
|
|
904 <br/> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11 0 2 0 0 0 0 0 0 0 0 0
|
|
905 <br/> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...</div>
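<p>Judging from the examples above, a vector string in <em>ValuesString</em> format consists of six semicolon-separated fields: a type label, the mode, the number of keys, the value type, the string format, and the space-separated counts. The following Python sketch parses such a string under that assumption. It expects the string on a single line, and the function name <em>parse_fingerprints_vector</em> and the shortened sample string are hypothetical.</p>

```python
def parse_fingerprints_vector(text):
    """Parse a single-line FingerprintsVector string in ValuesString format."""
    fields = text.strip().split(";", 5)
    if len(fields) != 6 or fields[0] != "FingerprintsVector":
        raise ValueError("not a FingerprintsVector string")
    _, description, size, value_type, string_format, raw_values = fields
    if string_format != "ValuesString":
        raise ValueError(f"unsupported vector string format: {string_format}")
    values = [int(value) for value in raw_values.split()]
    if len(values) != int(size):
        raise ValueError(f"expected {size} values, found {len(values)}")
    return {
        "description": description,
        "size": int(size),
        "value_type": value_type,
        "values": values,
    }


# Shortened, made-up example; real strings carry 166 or 322 values.
sample = "FingerprintsVector;MACCSKeyCount;5;OrderedNumericalValues;ValuesString;0 1 0 3 2"
parsed = parse_fingerprints_vector(sample)
print(parsed["description"], parsed["size"], parsed["values"])
```

<p>Checking the declared size against the number of parsed values catches truncated strings, such as the second example above, which ends in an ellipsis.</p>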
|
|
906 </dd>
|
|
907 <dt><strong><strong>-w, --WorkingDir</strong> <em>DirName</em></strong></dt>
|
|
908 <dd>
|
|
909 <p>Location of working directory. Default: current directory.</p>
|
|
910 </dd>
|
|
911 </dl>
|
|
912 <p>
|
|
913 </p>
|
|
914 <h2>EXAMPLES</h2>
|
|
915 <p>To generate MACCS keys fingerprints of size 166 in binary bit-vector string format
|
|
916 and create a SampleMACCS166FPBin.csv file containing sequential compound IDs along with
|
|
917 fingerprints bit-vector strings data, type:</p>
|
|
918 <div class="ExampleBox">
|
|
919 % MACCSKeysFingerprints.pl -r SampleMACCS166FPBin -o Sample.sdf</div>
|
|
920 <p>To generate MACCS keys fingerprints of size 166 in binary bit-vector string format
|
|
921 and create SampleMACCS166FPBin.sdf, SampleMACCS166FPBin.fpf and SampleMACCS166FPBin.csv
|
|
922 files containing sequential compound IDs in CSV file along with fingerprints bit-vector strings data, type:</p>
|
|
923 <div class="ExampleBox">
|
|
924 % MACCSKeysFingerprints.pl --output all -r SampleMACCS166FPBin
|
|
925 -o Sample.sdf</div>
|
|
926 <p>To generate MACCS keys fingerprints of size 322 in binary bit-vector string format
|
|
927 and create a SampleMACCS322FPBin.csv file containing sequential compound IDs along with
|
|
928 fingerprints bit-vector strings data, type:</p>
|
|
929 <div class="ExampleBox">
|
|
930 % MACCSKeysFingerprints.pl --size 322 -r SampleMACCS322FPBin -o Sample.sdf</div>
|
|
931 <p>To generate MACCS keys fingerprints of size 166 corresponding to count of keys in
|
|
932 ValuesString format and create a SampleMACCS166FPCount.csv file containing sequential
|
|
933 compound IDs along with fingerprints vector strings data, type:</p>
|
|
934 <div class="ExampleBox">
|
|
935 % MACCSKeysFingerprints.pl -m MACCSKeyCount -r SampleMACCS166FPCount
|
|
936 -o Sample.sdf</div>
|
|
937 <p>To generate MACCS keys fingerprints of size 322 corresponding to count of keys in
|
|
938 ValuesString format and create a SampleMACCS322FPCount.csv file containing sequential
|
|
939 compound IDs along with fingerprints vector strings data, type:</p>
|
|
940 <div class="ExampleBox">
|
|
941 % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 322
|
|
942 -r SampleMACCS322FPCount -o Sample.sdf</div>
|
|
943 <p>To generate MACCS keys fingerprints of size 166 in hexadecimal bit-vector string format with
|
|
944 ascending bits order and create a SampleMACCS166FPHex.csv file containing compound IDs
|
|
945 from MolName along with fingerprints bit-vector strings data, type:</p>
|
|
946 <div class="ExampleBox">
|
|
947 % MACCSKeysFingerprints.pl -m MACCSKeyBits --size 166 --BitStringFormat
|
|
948 HexadecimalString --BitsOrder Ascending --DataFieldsMode CompoundID
|
|
949 --CompoundIDMode MolName -r SampleMACCS166FPHex -o Sample.sdf</div>
|
|
950 <p>To generate MACCS keys fingerprints of size 166 corresponding to count of keys in
|
|
951 IDsAndValuesString format and create a SampleMACCS166FPCount.csv file containing
|
|
952 compound IDs from MolName line along with fingerprints vector strings data, type:</p>
|
|
953 <div class="ExampleBox">
|
|
954 % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 166
|
|
955 --VectorStringFormat IDsAndValuesString --DataFieldsMode CompoundID
|
|
956 --CompoundIDMode MolName -r SampleMACCS166FPCount -o Sample.sdf</div>
|
|
957 <p>To generate MACCS keys fingerprints of size 166 corresponding to count of keys in
|
|
958 IDsAndValuesString format and create a SampleMACCS166FPCount.csv file containing
|
|
959 compound IDs using specified data field along with fingerprints vector strings data, type:</p>
|
|
960 <div class="ExampleBox">
|
|
961 % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 166
|
|
962 --VectorStringFormat IDsAndValuesString --DataFieldsMode CompoundID
|
|
963 --CompoundIDMode DataField --CompoundID Mol_ID -r
|
|
964 SampleMACCS166FPCount -o Sample.sdf</div>
|
|
965 <p>To generate MACCS keys fingerprints of size 322 corresponding to count of keys in
|
|
966 ValuesString format and create a SampleMACCS322FPCount.tsv file containing compound
|
|
967 IDs derived from combination of molecule name line and an explicit compound prefix
|
|
968 along with fingerprints vector strings data in a column labeled MACCSKeyCountFP, type:</p>
|
|
969 <div class="ExampleBox">
|
|
970 % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 322 --DataFieldsMode
|
|
971 CompoundID --CompoundIDMode MolnameOrLabelPrefix --CompoundID Cmpd
|
|
972 --CompoundIDLabel MolID --FingerprintsLabel MACCSKeyCountFP --OutDelim
|
|
973 Tab -r SampleMACCS322FPCount -o Sample.sdf</div>
|
|
974 <p>To generate MACCS keys fingerprints of size 166 corresponding to count of keys in
|
|
975 ValuesString format and create a SampleMACCS166FPCount.csv file containing
|
|
976 specific data fields columns along with fingerprints vector strings data, type:</p>
|
|
977 <div class="ExampleBox">
|
|
978 % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 166
|
|
979 --VectorStringFormat ValuesString --DataFieldsMode Specify --DataFields
|
|
980 Mol_ID -r SampleMACCS166FPCount -o Sample.sdf</div>
|
|
981 <p>To generate MACCS keys fingerprints of size 322 corresponding to count of keys in
|
|
982 ValuesString format and create a SampleMACCS322FPCount.csv file containing
|
|
983 common data fields columns along with fingerprints vector strings data, type:</p>
|
|
984 <div class="ExampleBox">
|
|
985 % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 322
|
|
986 --VectorStringFormat ValuesString --DataFieldsMode Common -r
|
|
987 SampleMACCS322FPCount -o Sample.sdf</div>
|
|
988 <p>To generate MACCS keys fingerprints of size 166 corresponding to count of keys in
|
|
989 ValuesString format and create SampleMACCS166FPCount.sdf, SampleMACCS166FPCount.fpf and
|
|
990 SampleMACCS166FPCount.csv files containing all data fields columns in CSV file
|
|
991 along with fingerprints vector strings data, type:</p>
|
|
992 <div class="ExampleBox">
|
|
993 % MACCSKeysFingerprints.pl -m MACCSKeyCount --size 166 --output all
|
|
994 --VectorStringFormat ValuesString --DataFieldsMode All -r
|
|
995 SampleMACCS166FPCount -o Sample.sdf</div>
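<p>When the script is driven from another program, it can help to assemble the command line as a list of arguments instead of one long string. The following Python sketch builds such a list using only options described in this document. The helper name <em>build_command</em> is hypothetical, and the sketch only builds and prints the command; it does not run it.</p>

```python
import shlex


def build_command(sd_files, mode=None, size=None, output=None,
                  out_delim=None, root=None, overwrite=False):
    """Assemble a MACCSKeysFingerprints.pl argument list from keyword options."""
    command = ["MACCSKeysFingerprints.pl"]
    if mode is not None:
        command += ["--mode", mode]
    if size is not None:
        command += ["--size", str(size)]
    if output is not None:
        command += ["--output", output]
    if out_delim is not None:
        command += ["--OutDelim", out_delim]
    if root is not None:
        command += ["--root", root]
    if overwrite:
        command.append("--overwrite")
    command += list(sd_files)
    return command


command = build_command(["Sample.sdf"], mode="MACCSKeyCount", size=322,
                        root="SampleMACCS322FPCount", overwrite=True)
print(shlex.join(command))
```

<p>A list of arguments can be passed to a process launcher without shell quoting concerns, which matters for file names that contain spaces.</p>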
|
|
996 <p>
|
|
997 </p>
|
|
998 <h2>AUTHOR</h2>
|
|
999 <p><a href="mailto:msud@san.rr.com">Manish Sud</a></p>
|
|
1000 <p>
|
|
1001 </p>
|
|
1002 <h2>SEE ALSO</h2>
|
|
1003 <p><a href="./InfoFingerprintsFiles.html">InfoFingerprintsFiles.pl</a>, <a href="./SimilarityMatricesFingerprints.html">SimilarityMatricesFingerprints.pl</a>, <a href="./AtomNeighborhoodsFingerprints.html">AtomNeighborhoodsFingerprints.pl</a>, 
|
|
1004 <a href="./ExtendedConnectivityFingerprints.html">ExtendedConnectivityFingerprints.pl</a>, <a href="./PathLengthFingerprints.html">PathLengthFingerprints.pl</a>, 
|
|
1005 <a href="./TopologicalAtomPairsFingerprints.html">TopologicalAtomPairsFingerprints.pl</a>, <a href="./TopologicalAtomTorsionsFingerprints.html">TopologicalAtomTorsionsFingerprints.pl</a>, 
|
|
1006 <a href="./TopologicalPharmacophoreAtomPairsFingerprints.html">TopologicalPharmacophoreAtomPairsFingerprints.pl</a>, <a href="./TopologicalPharmacophoreAtomTripletsFingerprints.html">TopologicalPharmacophoreAtomTripletsFingerprints.pl</a>
|
|
1007 </p>
|
|
1008 <p>
|
|
1009 </p>
|
|
1010 <h2>COPYRIGHT</h2>
|
|
1011 <p>Copyright (C) 2015 Manish Sud. All rights reserved.</p>
|
|
1012 <p>This file is part of MayaChemTools.</p>
|
|
1013 <p>MayaChemTools is free software; you can redistribute it and/or modify it under
|
|
1014 the terms of the GNU Lesser General Public License as published by the Free
|
|
1015 Software Foundation; either version 3 of the License, or (at your option)
|
|
1016 any later version.</p>
|
|
1017 <p> </p><p> </p><div class="DocNav">
|
|
1018 <table width="100%" border=0 cellpadding=0 cellspacing=2>
|
|
1019 <tr align="left" valign="top"><td width="33%" align="left"><a href="./JoinTextFiles.html" title="JoinTextFiles.html">Previous</a> <a href="./index.html" title="Table of Contents">TOC</a> <a href="./MergeTextFiles.html" title="MergeTextFiles.html">Next</a></td><td width="34%" align="middle"><strong>March 29, 2015</strong></td><td width="33%" align="right"><strong>MACCSKeysFingerprints.pl</strong></td></tr>
|
|
1020 </table>
|
|
1021 </div>
|
|
1022 <br />
|
|
1023 <center>
|
|
1024 <img src="../../images/h2o2.png">
|
|
1025 </center>
|
|
1026 </body>
|
|
1027 </html>
|