comparison mayachemtools/docs/scripts/html/AtomNeighborhoodsFingerprints.html @ 0:73ae111cf86f draft

Uploaded
author deepakjadmin
date Wed, 20 Jan 2016 11:55:01 -0500
-1:000000000000 0:73ae111cf86f
1 <html>
2 <head>
3 <title>MayaChemTools:Documentation:AtomNeighborhoodsFingerprints.pl</title>
4 <meta http-equiv="content-type" content="text/html;charset=utf-8">
5 <link rel="stylesheet" type="text/css" href="../../css/MayaChemTools.css">
6 </head>
7 <body leftmargin="20" rightmargin="20" topmargin="10" bottommargin="10">
8 <br/>
9 <center>
10 <a href="http://www.mayachemtools.org" title="MayaChemTools Home"><img src="../../images/MayaChemToolsLogo.gif" border="0" alt="MayaChemTools"></a>
11 </center>
12 <br/>
13 <div class="DocNav">
14 <table width="100%" border=0 cellpadding=0 cellspacing=2>
15 <tr align="left" valign="top"><td width="33%" align="left"><a href="./AnalyzeTextFilesData.html" title="AnalyzeTextFilesData.html">Previous</a>&nbsp;&nbsp;<a href="./index.html" title="Table of Contents">TOC</a>&nbsp;&nbsp;<a href="./AtomTypesFingerprints.html" title="AtomTypesFingerprints.html">Next</a></td><td width="34%" align="middle"><strong>AtomNeighborhoodsFingerprints.pl</strong></td><td width="33%" align="right"><a href="././code/AtomNeighborhoodsFingerprints.html" title="View source code">Code</a>&nbsp;|&nbsp;<a href="./../pdf/AtomNeighborhoodsFingerprints.pdf" title="PDF US Letter Size">PDF</a>&nbsp;|&nbsp;<a href="./../pdfgreen/AtomNeighborhoodsFingerprints.pdf" title="PDF US Letter Size with narrow margins: www.changethemargins.com">PDFGreen</a>&nbsp;|&nbsp;<a href="./../pdfa4/AtomNeighborhoodsFingerprints.pdf" title="PDF A4 Size">PDFA4</a>&nbsp;|&nbsp;<a href="./../pdfa4green/AtomNeighborhoodsFingerprints.pdf" title="PDF A4 Size with narrow margins: www.changethemargins.com">PDFA4Green</a></td></tr>
16 </table>
17 </div>
18 <p>
19 </p>
20 <h2>NAME</h2>
21 <p>AtomNeighborhoodsFingerprints.pl - Generate atom neighborhoods fingerprints for SD files</p>
22 <p>
23 </p>
24 <h2>SYNOPSIS</h2>
25 <p>AtomNeighborhoodsFingerprints.pl SDFile(s)...</p>
26 <p>AtomNeighborhoodsFingerprints.pl [<strong>--AromaticityModel</strong> <em>AromaticityModelType</em>]
27 [<strong>-a, --AtomIdentifierType</strong> <em>AtomicInvariantsAtomTypes |
28 DREIDINGAtomTypes | EStateAtomTypes | FunctionalClassAtomTypes | MMFF94AtomTypes | SLogPAtomTypes | SYBYLAtomTypes | TPSAAtomTypes | UFFAtomTypes</em>]
29 [<strong>--AtomicInvariantsToUse</strong> <em>&quot;AtomicInvariant,AtomicInvariant...&quot;</em>]
30 [<strong>--FunctionalClassesToUse</strong> <em>&quot;FunctionalClass1,FunctionalClass2...&quot;</em>]
31 [<strong>--CompoundID</strong> <em>DataFieldName or LabelPrefixString</em>] [<strong>--CompoundIDLabel</strong> <em>text</em>]
32 [<strong>--CompoundIDMode</strong> <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em>] [<strong>--DataFields</strong> <em>&quot;FieldLabel1,FieldLabel2,...&quot;</em>]
33 [<strong>-d, --DataFieldsMode</strong> <em>All | Common | Specify | CompoundID</em>] [<strong>-f, --Filter</strong> <em>Yes | No</em>]
34 [<strong>--FingerprintsLabel</strong> <em>text</em>] [<strong>-h, --help</strong>] [<strong>-k, --KeepLargestComponent</strong> <em>Yes | No</em>]
35 [<strong>--MinNeighborhoodRadius</strong> <em>number</em>] [<strong>--MaxNeighborhoodRadius</strong> <em>number</em>]
36 [<strong>--OutDelim</strong> <em>comma | tab | semicolon</em>] [<strong>--output</strong> <em>SD | FP | text | all</em>] [<strong>-o, --overwrite</strong>]
37 [<strong>-q, --quote</strong> <em>Yes | No</em>] [<strong>-r, --root</strong> <em>RootName</em>]
38 [<strong>-w, --WorkingDir</strong> <em>DirName</em>] SDFile(s)...</p>
39 <p>
40 </p>
41 <h2>DESCRIPTION</h2>
42 <p>Generate atom neighborhoods fingerprints [ Ref 53-56, Ref 73 ] for <em>SDFile(s)</em> and create appropriate
43 SD, FP or CSV/TSV text file(s) containing fingerprints vector strings corresponding to molecular fingerprints.</p>
44 <p>Multiple SDFile names are separated by spaces. The valid file extensions are <em>.sdf</em>
45 and <em>.sd</em>. All other file names are ignored. All the SD files in the current directory
46 can be specified either by <em>*.sdf</em> or by the current directory name.</p>
47 <p>The current release of MayaChemTools supports generation of atom neighborhoods fingerprints
48 corresponding to the following <strong>-a, --AtomIdentifierType</strong> values:</p>
49 <div class="OptionsBox">
50 AtomicInvariantsAtomTypes, DREIDINGAtomTypes, EStateAtomTypes,
51 <br/> FunctionalClassAtomTypes, MMFF94AtomTypes, SLogPAtomTypes,
52 <br/> SYBYLAtomTypes, TPSAAtomTypes, UFFAtomTypes</div>
53 <p>Based on the values specified for <strong>-a, --AtomIdentifierType</strong> and <strong>--AtomicInvariantsToUse</strong>,
54 initial atom types are assigned to all non-hydrogen atoms in a molecule. Using atom neighborhoods
55 around each non-hydrogen central atom corresponding to radii between the specified values of
56 <strong>--MinNeighborhoodRadius</strong> and <strong>--MaxNeighborhoodRadius</strong>, unique atom types at
57 each radius level are counted and an atom neighborhood identifier is generated.</p>
58 <p>The format of an atom neighborhood identifier around a central non-hydrogen atom at a
59 specific radius is:</p>
60 <div class="OptionsBox">
61 NR&lt;n&gt;-&lt;AtomType&gt;-ATC&lt;n&gt;</div>
62 <div class="OptionsBox">
63 NR: Neighborhood radius
64 <br/> AtomType: Assigned atom type
65 <br/> ATC: Atom type count</div>
66 <p>The atom neighborhood identifier for a non-hydrogen central atom corresponding to all specified radii
67 is generated by concatenating the neighborhood identifiers at each radius, using a colon as the delimiter:</p>
68 <div class="OptionsBox">
69 NR&lt;n&gt;-&lt;AtomType&gt;-ATC&lt;n&gt;:NR&lt;n&gt;-&lt;AtomType&gt;-ATC&lt;n&gt;:...</div>
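<p>The identifier construction described above can be sketched in Python as follows. This is a minimal
illustration rather than the MayaChemTools implementation: the function name <em>neighborhood_identifiers</em>,
the toy graph representation (a list of atom type strings plus index pairs for bonds), the reading of radius as
bond distance from the central atom, and the alphabetical ordering of atom types within a radius level are all
assumptions made for the example.</p>

```python
from collections import Counter, deque


def neighborhood_identifiers(atom_types, bonds, min_radius=0, max_radius=2):
    """Return one atom neighborhood identifier per atom of a toy heavy-atom graph.

    atom_types: list of atom type strings, one per non-hydrogen atom (hypothetical input).
    bonds: list of (i, j) index pairs between those atoms.
    """
    neighbors = {i: set() for i in range(len(atom_types))}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    identifiers = []
    for center in neighbors:
        # Breadth-first search records the bond distance of each atom from the center,
        # stopping once max_radius has been reached.
        distance = {center: 0}
        queue = deque([center])
        while queue:
            atom = queue.popleft()
            if distance[atom] == max_radius:
                continue
            for nbr in neighbors[atom]:
                if nbr not in distance:
                    distance[nbr] = distance[atom] + 1
                    queue.append(nbr)

        parts = []
        for radius in range(min_radius, max_radius + 1):
            counts = Counter(atom_types[a] for a, d in distance.items() if d == radius)
            # Sorting atom types within a radius level is an assumption of this sketch.
            for atom_type in sorted(counts):
                parts.append(f"NR{radius}-{atom_type}-ATC{counts[atom_type]}")
        identifiers.append(":".join(parts))
    return identifiers
```

<p>Joining the returned identifiers with a single space gives a string laid out like the fingerprint values
shown in the examples in this document.</p>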
70 <p>The atom neighborhood identifiers for all non-hydrogen central atoms at all specified radii are
71 concatenated using a space as the delimiter and constitute the atom neighborhood fingerprint of the molecule.</p>
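<p>Going the other way, a fingerprint values string can be split back into its parts. The Python sketch below
is one possible reading of the format described above; <em>parse_fingerprint_values</em> is a hypothetical
helper, and it assumes that atom types contain neither spaces nor colons. The greedy match on the atom type
allows for hyphens inside it, such as a negative formal charge.</p>

```python
import re

# NR<n>-<AtomType>-ATC<n>; the atom type itself may contain '-' (e.g. FC-1).
IDENTIFIER = re.compile(r"NR(\d+)-(.+)-ATC(\d+)")


def parse_fingerprint_values(values):
    """Split a fingerprint values string into per-atom lists of (radius, atom_type, count)."""
    atoms = []
    for atom_identifier in values.split():
        entries = []
        for token in atom_identifier.split(":"):
            match = IDENTIFIER.fullmatch(token)
            if match is None:
                raise ValueError(f"unrecognized neighborhood identifier: {token!r}")
            radius, atom_type, count = match.groups()
            entries.append((int(radius), atom_type, int(count)))
        atoms.append(entries)
    return atoms
```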
72 <p>Example of <em>SD</em> file containing atom neighborhood fingerprints string data:</p>
73 <div class="OptionsBox">
74 ... ...
75 <br/> ... ...
76 <br/> $$$$
77 <br/> ... ...
78 <br/> ... ...
79 <br/> ... ...
80 <br/> 41 44 0 0 0 0 0 0 0 0999 V2000
81 <br/> -3.3652 1.4499 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
82 <br/> ... ...
83 <br/> 2 3 1 0 0 0 0
84 <br/> ... ...
85 <br/> M END
86 <br/> &gt; &lt;CmpdID&gt;
87 <br/> Cmpd1</div>
88 <div class="OptionsBox">
89 &gt; &lt;AtomNeighborhoodsFingerprints&gt;
90 <br/> FingerprintsVector;AtomNeighborhoods:AtomicInvariantsAtomTypes:MinRadiu
91 <br/> s0:MaxRadius2;41;AlphaNumericalValues;ValuesString;NR0-C.X1.BO1.H3-ATC1
92 <br/> :NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-ATC1 NR0-C.X1.B
93 <br/> O1.H3-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-ATC1
94 <br/> NR0-C.X2.BO2.H2-ATC1:NR1-C.X2.BO2.H2-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2-C...</div>
95 <div class="OptionsBox">
96 $$$$
97 <br/> ... ...
98 <br/> ... ...</div>
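<p>A fingerprints string stored this way can be read back with a few lines of Python. The sketch below relies
only on the general SD file layout (records terminated by <em>$$$$</em>, data headers of the form
<em>&gt; &lt;Label&gt;</em>, values ending at a blank line); <em>read_sd_data_field</em> is a hypothetical
helper, not part of MayaChemTools, and the line wrapping in the example above is assumed to be a display
artifact rather than part of the stored value.</p>

```python
def read_sd_data_field(sd_text, label="AtomNeighborhoodsFingerprints"):
    """Return the value of one data field for each record in SD file text.

    Records that lack the field contribute None.
    """
    values = []
    for record in sd_text.split("$$$$"):
        if not record.strip():
            continue  # skip the empty piece after the final $$$$
        lines = record.splitlines()
        value = None
        for index, line in enumerate(lines):
            # A data header looks like "> <Label>"; its value follows until a blank line.
            if line.startswith(">") and f"<{label}>" in line:
                collected = []
                for value_line in lines[index + 1:]:
                    if not value_line.strip():
                        break
                    collected.append(value_line)
                value = "\n".join(collected)
                break
        values.append(value)
    return values
```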
99 <p>Example of <em>FP</em> file containing atom neighborhood fingerprints string data:</p>
100 <div class="OptionsBox">
101 #
102 <br/> # Package = MayaChemTools 7.4
103 <br/> # Release Date = Oct 21, 2010
104 <br/> #
105 <br/> # TimeStamp = Fri Mar 11 14:15:27 2011
106 <br/> #
107 <br/> # FingerprintsStringType = FingerprintsVector
108 <br/> #
109 <br/> # Description = AtomNeighborhoods:AtomicInvariantsAtomTypes:MinRadiu...
110 <br/> # VectorStringFormat = ValuesString
111 <br/> # VectorValuesType = AlphaNumericalValues
112 <br/> #
113 <br/> Cmpd1 41;NR0-C.X1.BO1.H3-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-A...
114 <br/> Cmpd2 23;NR0-C.X1.BO1.H3-ATC1:NR1-C.X2.BO2.H2-ATC1:NR2-C.X3.BO3.H1-A...
115 <br/> ... ...
116 <br/> ... ...</div>
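<p>The FP layout shown above is simple enough to read with a short Python function. This is a sketch based
only on that example: <em>parse_fp_text</em> is a hypothetical name, header lines are assumed to be
<em># Key = Value</em> comments, and each data line is assumed to hold a compound ID without embedded
whitespace, followed by whitespace and then the value count and values separated by a semicolon.</p>

```python
def parse_fp_text(fp_text):
    """Parse FP file text into (header, rows), following the layout of the example."""
    header = {}
    rows = []
    for line in fp_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header comment: "# Key = Value"; bare "#" lines carry no data.
            key, sep, value = line[1:].partition("=")
            if sep:
                header[key.strip()] = value.strip()
            continue
        # Data line: compound ID, whitespace, then "<count>;<values>".
        compound_id, rest = line.split(None, 1)
        count, _, values = rest.partition(";")
        rows.append((compound_id, int(count), values))
    return header, rows
```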
117 <p>Example of CSV <em>Text</em> file containing atom neighborhood fingerprints string data:</p>
118 <div class="OptionsBox">
119 &quot;CompoundID&quot;,&quot;AtomNeighborhoodsFingerprints&quot;
120 <br/> &quot;Cmpd1&quot;,&quot;FingerprintsVector;AtomNeighborhoods:AtomicInvariantsAtomTypes
121 <br/> :MinRadius0:MaxRadius2;41;AlphaNumericalValues;ValuesString;NR0-C.X1.B
122 <br/> O1.H3-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-ATC1
123 <br/> NR0-C.X1.BO1.H3-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3
124 <br/> .BO4-ATC1 NR0-C.X2.BO2.H2-ATC1:NR1-C.X2.BO2.H2-ATC1:NR1-C.X3.BO3.H1...&quot;
125 <br/> ... ...
126 <br/> ... ...</div>
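<p>Because the text output is ordinary quoted CSV, Python's standard <em>csv</em> module can load it. The
sketch below assumes the default column labels shown in the example above; <em>read_fingerprints_csv</em> is a
hypothetical helper name.</p>

```python
import csv
import io


def read_fingerprints_csv(csv_text, id_label="CompoundID",
                          fp_label="AtomNeighborhoodsFingerprints"):
    """Map compound IDs to fingerprints vector strings from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[id_label]: row[fp_label] for row in reader}
```

<p>For output written with a different <strong>--OutDelim</strong> value, a matching <em>delimiter</em>
argument would have to be passed to <em>csv.DictReader</em>.</p>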
127 <p>The current release of MayaChemTools generates the following types of atom neighborhoods
128 fingerprints vector strings:</p>
129 <div class="OptionsBox">
130 FingerprintsVector;AtomNeighborhoods:AtomicInvariantsAtomTypes:MinRadi
131 <br/> us0:MaxRadius2;41;AlphaNumericalValues;ValuesString;NR0-C.X1.BO1.H3-AT
132 <br/> C1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-ATC1 NR0-C.X
133 <br/> 1.BO1.H3-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-A
134 <br/> TC1 NR0-C.X2.BO2.H2-ATC1:NR1-C.X2.BO2.H2-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2
135 <br/> -C.X2.BO2.H2-ATC1:NR2-N.X3.BO3-ATC1:NR2-O.X1.BO1.H1-ATC1 NR0-C.X2.B...</div>
136 <div class="OptionsBox">
137 FingerprintsVector;AtomNeighborhoods:DREIDINGAtomTypes:MinRadius0:MaxR
138 <br/> adius2;41;AlphaNumericalValues;ValuesString;NR0-C_2-ATC1:NR1-C_3-ATC1:
139 <br/> NR1-O_2-ATC1:NR1-O_3-ATC1:NR2-C_3-ATC1 NR0-C_2-ATC1:NR1-C_R-ATC1:NR1-N
140 <br/> _3-ATC1:NR1-O_2-ATC1:NR2-C_R-ATC3 NR0-C_3-ATC1:NR1-C_2-ATC1:NR1-C_3-AT
141 <br/> C1:NR2-C_3-ATC1:NR2-O_2-ATC1:NR2-O_3-ATC2 NR0-C_3-ATC1:NR1-C_3-ATC1:NR
142 <br/> 1-N_R-ATC1:NR2-C_3-ATC1:NR2-C_R-ATC2 NR0-C_3-ATC1:NR1-C_3-ATC1:NR2-...</div>
143 <div class="OptionsBox">
144 FingerprintsVector;AtomNeighborhoods:EStateAtomTypes:MinRadius0:MaxRad
145 <br/> ius2;41;AlphaNumericalValues;ValuesString;NR0-aaCH-ATC1:NR1-aaCH-ATC1:
146 <br/> NR1-aasC-ATC1:NR2-aaCH-ATC1:NR2-aasC-ATC1:NR2-sF-ATC1 NR0-aaCH-ATC1:NR
147 <br/> 1-aaCH-ATC1:NR1-aasC-ATC1:NR2-aaCH-ATC1:NR2-aasC-ATC1:NR2-sF-ATC1 NR0-
148 <br/> aaCH-ATC1:NR1-aaCH-ATC1:NR1-aasC-ATC1:NR2-aaCH-ATC1:NR2-aasC-ATC2 NR0-
149 <br/> aaCH-ATC1:NR1-aaCH-ATC1:NR1-aasC-ATC1:NR2-aaCH-ATC1:NR2-aasC-ATC2 N...</div>
150 <div class="OptionsBox">
151 FingerprintsVector;AtomNeighborhoods:FunctionalClassAtomTypes:MinRadiu
152 <br/> s0:MaxRadius2;41;AlphaNumericalValues;ValuesString;NR0-Ar-ATC1:NR1-Ar-
153 <br/> ATC1:NR1-Ar.HBA-ATC1:NR1-None-ATC1:NR2-Ar-ATC2:NR2-None-ATC4 NR0-Ar-AT
154 <br/> C1:NR1-Ar-ATC2:NR1-Ar.HBA-ATC1:NR2-Ar-ATC5:NR2-None-ATC1 NR0-Ar-ATC1:N
155 <br/> R1-Ar-ATC2:NR1-HBD-ATC1:NR2-Ar-ATC2:NR2-None-ATC1 NR0-Ar-ATC1:NR1-Ar-A
156 <br/> TC2:NR1-Hal-ATC1:NR2-Ar-ATC2 NR0-Ar-ATC1:NR1-Ar-ATC2:NR1-None-ATC1:...</div>
157 <div class="OptionsBox">
158 FingerprintsVector;AtomNeighborhoods:MMFF94AtomTypes:MinRadius0:MaxRad
159 <br/> ius2;41;AlphaNumericalValues;ValuesString;NR0-C5A-ATC1:NR1-C5B-ATC1:NR
160 <br/> 1-CB-ATC1:NR1-N5-ATC1:NR2-C5A-ATC1:NR2-C5B-ATC1:NR2-CB-ATC3:NR2-CR-ATC
161 <br/> 1 NR0-C5A-ATC1:NR1-C5B-ATC1:NR1-CR-ATC1:NR1-N5-ATC1:NR2-C5A-ATC1:NR2-C
162 <br/> 5B-ATC1:NR2-C=ON-ATC1:NR2-CR-ATC3 NR0-C5B-ATC1:NR1-C5A-ATC1:NR1-C5B-AT
163 <br/> C1:NR1-C=ON-ATC1:NR2-C5A-ATC1:NR2-CB-ATC1:NR2-CR-ATC1:NR2-N5-ATC1:N...</div>
164 <div class="OptionsBox">
165 FingerprintsVector;AtomNeighborhoods:SLogPAtomTypes:MinRadius0:MaxRadi
166 <br/> us2;41;AlphaNumericalValues;ValuesString;NR0-C1-ATC1:NR1-C10-ATC1:NR1-
167 <br/> CS-ATC1:NR2-C1-ATC1:NR2-N11-ATC1:NR2-O2-ATC1 NR0-C1-ATC1:NR1-C11-ATC1:
168 <br/> NR2-C1-ATC1:NR2-C21-ATC1 NR0-C1-ATC1:NR1-C11-ATC1:NR2-C1-ATC1:NR2-C21-
169 <br/> ATC1 NR0-C1-ATC1:NR1-C5-ATC1:NR1-CS-ATC1:NR2-C1-ATC1:NR2-O2-ATC2:NR2-O
170 <br/> 9-ATC1 NR0-C1-ATC1:NR1-CS-ATC2:NR2-C1-ATC2:NR2-O2-ATC2 NR0-C10-ATC1...</div>
171 <div class="OptionsBox">
172 FingerprintsVector;AtomNeighborhoods:SYBYLAtomTypes:MinRadius0:MaxRadi
173 <br/> us2;41;AlphaNumericalValues;ValuesString;NR0-C.2-ATC1:NR1-C.3-ATC1:NR1
174 <br/> -O.co2-ATC2:NR2-C.3-ATC1 NR0-C.2-ATC1:NR1-C.ar-ATC1:NR1-N.am-ATC1:NR1-
175 <br/> O.2-ATC1:NR2-C.ar-ATC3 NR0-C.3-ATC1:NR1-C.2-ATC1:NR1-C.3-ATC1:NR2-C.3-
176 <br/> ATC1:NR2-O.3-ATC1:NR2-O.co2-ATC2 NR0-C.3-ATC1:NR1-C.3-ATC1:NR1-N.ar-AT
177 <br/> C1:NR2-C.3-ATC1:NR2-C.ar-ATC2 NR0-C.3-ATC1:NR1-C.3-ATC1:NR2-C.3-ATC...</div>
178 <div class="OptionsBox">
179 FingerprintsVector;AtomNeighborhoods:TPSAAtomTypes:MinRadius0:MaxRadiu
180 <br/> s2;41;AlphaNumericalValues;ValuesString;NR0-N21-ATC1:NR1-None-ATC3:NR2
181 <br/> -None-ATC5 NR0-N7-ATC1:NR1-None-ATC2:NR2-None-ATC3:NR2-O3-ATC1 NR0-Non
182 <br/> e-ATC1:NR1-N21-ATC1:NR1-None-ATC1:NR2-None-ATC3 NR0-None-ATC1:NR1-N21-
183 <br/> ATC1:NR1-None-ATC2:NR2-None-ATC6 NR0-None-ATC1:NR1-N21-ATC1:NR1-None-A
184 <br/> TC2:NR2-None-ATC6 NR0-None-ATC1:NR1-N7-ATC1:NR1-None-ATC1:NR1-O3-AT...</div>
185 <div class="OptionsBox">
186 FingerprintsVector;AtomNeighborhoods:UFFAtomTypes:MinRadius0:MaxRadius
187 <br/> 2;41;AlphaNumericalValues;ValuesString;NR0-C_2-ATC1:NR1-C_3-ATC1:NR1-O
188 <br/> _2-ATC1:NR1-O_3-ATC1:NR2-C_3-ATC1 NR0-C_2-ATC1:NR1-C_R-ATC1:NR1-N_3-AT
189 <br/> C1:NR1-O_2-ATC1:NR2-C_R-ATC3 NR0-C_3-ATC1:NR1-C_2-ATC1:NR1-C_3-ATC1:NR
190 <br/> 2-C_3-ATC1:NR2-O_2-ATC1:NR2-O_3-ATC2 NR0-C_3-ATC1:NR1-C_3-ATC1:NR1-N_R
191 <br/> -ATC1:NR2-C_3-ATC1:NR2-C_R-ATC2 NR0-C_3-ATC1:NR1-C_3-ATC1:NR2-C_3-A...</div>
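<p>All of the vector strings above share the same semicolon-delimited layout, which the following Python
sketch takes apart. The field meanings are inferred from the examples in this document rather than from a
formal specification, and <em>parse_fingerprints_vector</em> and its dictionary keys are hypothetical
names.</p>

```python
def parse_fingerprints_vector(vector_string):
    """Split a fingerprints vector string into its semicolon-delimited fields."""
    kind, description, count, values_type, string_format, values = vector_string.split(";", 5)
    if kind != "FingerprintsVector":
        raise ValueError(f"not a fingerprints vector string: {kind!r}")
    # Description looks like AtomNeighborhoods:<AtomTypes>:MinRadius<n>:MaxRadius<n>.
    name, atom_types, min_part, max_part = description.split(":")
    return {
        "name": name,
        "atom_identifier_type": atom_types,
        "min_radius": int(min_part[len("MinRadius"):]),
        "max_radius": int(max_part[len("MaxRadius"):]),
        "count": int(count),
        "values_type": values_type,
        "format": string_format,
        "values": values,
    }
```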
192 <p>
193 </p>
194 <h2>OPTIONS</h2>
195 <dl>
196 <dt><strong><strong>--AromaticityModel</strong> <em>MDLAromaticityModel | TriposAromaticityModel | MMFFAromaticityModel | ChemAxonBasicAromaticityModel | ChemAxonGeneralAromaticityModel | DaylightAromaticityModel | MayaChemToolsAromaticityModel</em></strong></dt>
197 <dd>
198 <p>Specify aromaticity model to use during detection of aromaticity. Possible values in the current
199 release are: <em>MDLAromaticityModel, TriposAromaticityModel, MMFFAromaticityModel,
200 ChemAxonBasicAromaticityModel, ChemAxonGeneralAromaticityModel, DaylightAromaticityModel
201 or MayaChemToolsAromaticityModel</em>. Default value: <em>MayaChemToolsAromaticityModel</em>.</p>
202 <p>The supported aromaticity model names along with model specific control parameters
203 are defined in <strong>AromaticityModelsData.csv</strong>, which is distributed with the current release
204 and is available under <strong>lib/data</strong> directory. <strong>Molecule.pm</strong> module retrieves data from
205 this file during class instantiation and makes it available to method <strong>DetectAromaticity</strong>
206 for detecting aromaticity corresponding to a specific model.</p>
207 </dd>
208 <dt><strong><strong>-a, --AtomIdentifierType</strong> <em>AtomicInvariantsAtomTypes | DREIDINGAtomTypes | EStateAtomTypes | FunctionalClassAtomTypes | MMFF94AtomTypes | SLogPAtomTypes | SYBYLAtomTypes | TPSAAtomTypes | UFFAtomTypes</em></strong></dt>
209 <dd>
210 <p>Specify atom identifier type to use for assignment of initial atom identifier to non-hydrogen
211 atoms during calculation of atom neighborhoods fingerprints. Possible values in the current
212 release are: <em>AtomicInvariantsAtomTypes, DREIDINGAtomTypes, EStateAtomTypes,
213 FunctionalClassAtomTypes, MMFF94AtomTypes, SLogPAtomTypes, SYBYLAtomTypes,
214 TPSAAtomTypes, UFFAtomTypes</em>. Default value: <em>AtomicInvariantsAtomTypes</em>.</p>
215 </dd>
216 <dt><strong><strong>--AtomicInvariantsToUse</strong> <em>&quot;AtomicInvariant,AtomicInvariant...&quot;</em></strong></dt>
217 <dd>
218 <p>This value is used for the <em>AtomicInvariantsAtomTypes</em> value of the <strong>-a, --AtomIdentifierType</strong>
219 option. It's a comma-separated list of valid atomic invariants.</p>
220 <p>Possible values for atomic invariants are: <em>AS, X, BO, LBO, SB, DB, TB,
221 H, Ar, RA, FC, MN, SM</em>. Default value: <em>AS,X,BO,H,FC</em>.</p>
222 <p>The atomic invariants abbreviations correspond to:</p>
223 <div class="OptionsBox">
224 AS = Atom symbol corresponding to element symbol</div>
225 <div class="OptionsBox">
226 X&lt;n&gt; = Number of non-hydrogen atom neighbors or heavy atoms
227 <br/> BO&lt;n&gt; = Sum of bond orders to non-hydrogen atom neighbors or heavy atoms
228 <br/> LBO&lt;n&gt; = Largest bond order of non-hydrogen atom neighbors or heavy atoms
229 <br/> SB&lt;n&gt; = Number of single bonds to non-hydrogen atom neighbors or heavy atoms
230 <br/> DB&lt;n&gt; = Number of double bonds to non-hydrogen atom neighbors or heavy atoms
231 <br/> TB&lt;n&gt; = Number of triple bonds to non-hydrogen atom neighbors or heavy atoms
232 <br/> H&lt;n&gt; = Number of implicit and explicit hydrogens for atom
233 <br/> Ar = Aromatic annotation indicating whether atom is aromatic
234 <br/> RA = Ring atom annotation indicating whether atom is in a ring
235 <br/> FC&lt;+n/-n&gt; = Formal charge assigned to atom
236 <br/> MN&lt;n&gt; = Mass number indicating isotope other than most abundant isotope
237 <br/> SM&lt;n&gt; = Spin multiplicity of atom. Possible values: 1 (singlet), 2 (doublet) or
238 3 (triplet)</div>
239 <p>Atom type generated by AtomTypes::AtomicInvariantsAtomTypes class corresponds to:</p>
240 <div class="OptionsBox">
241 AS.X&lt;n&gt;.BO&lt;n&gt;.LBO&lt;n&gt;.SB&lt;n&gt;.DB&lt;n&gt;.TB&lt;n&gt;.H&lt;n&gt;.Ar.RA.FC&lt;+n/-n&gt;.MN&lt;n&gt;.SM&lt;n&gt;</div>
242 <p>Except for AS which is a required atomic invariant in atom types, all other atomic invariants are
243 optional. Atom type specification doesn't include atomic invariants with zero or undefined values.</p>
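<p>An atom type string in this format can be decoded back into individual invariants. The Python sketch below
follows the abbreviations listed above; <em>decode_atomic_invariants</em> is a hypothetical helper, and it
assumes that numeric invariants always carry an integer suffix while <em>Ar</em> and <em>RA</em> appear as
bare flags.</p>

```python
import re

# Abbreviation followed by an integer; FC may carry an explicit sign.
NUMERIC = re.compile(r"(LBO|BO|SB|DB|TB|X|H|FC|MN|SM)([+-]?\d+)")
FLAGS = {"Ar", "RA"}


def decode_atomic_invariants(atom_type):
    """Decode an atom type such as 'C.X3.BO4' into a dict of atomic invariants."""
    symbol, *parts = atom_type.split(".")
    invariants = {"AS": symbol}
    for part in parts:
        if part in FLAGS:
            invariants[part] = True
            continue
        match = NUMERIC.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognized atomic invariant: {part!r}")
        invariants[match.group(1)] = int(match.group(2))
    return invariants
```

<p>Invariants missing from the returned dictionary were zero, undefined, or not requested, in line with the
rule that such values are left out of the atom type.</p>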
244 <p>In addition to usage of abbreviations for specifying atomic invariants, the following descriptive words
245 are also allowed:</p>
246 <div class="OptionsBox">
247 X : NumOfNonHydrogenAtomNeighbors or NumOfHeavyAtomNeighbors
248 <br/> BO : SumOfBondOrdersToNonHydrogenAtoms or SumOfBondOrdersToHeavyAtoms
249 <br/> LBO : LargestBondOrderToNonHydrogenAtoms or LargestBondOrderToHeavyAtoms
250 <br/> SB : NumOfSingleBondsToNonHydrogenAtoms or NumOfSingleBondsToHeavyAtoms
251 <br/> DB : NumOfDoubleBondsToNonHydrogenAtoms or NumOfDoubleBondsToHeavyAtoms
252 <br/> TB : NumOfTripleBondsToNonHydrogenAtoms or NumOfTripleBondsToHeavyAtoms
253 <br/> H : NumOfImplicitAndExplicitHydrogens
254 <br/> Ar : Aromatic
255 <br/> RA : RingAtom
256 <br/> FC : FormalCharge
257 <br/> MN : MassNumber
258 <br/> SM : SpinMultiplicity</div>
259 <p><em>AtomTypes::AtomicInvariantsAtomTypes</em> module is used to assign atomic invariant
260 atom types.</p>
261 </dd>
262 <dt><strong><strong>--FunctionalClassesToUse</strong> <em>&quot;FunctionalClass1,FunctionalClass2...&quot;</em></strong></dt>
263 <dd>
264 <p>This value is used for the <em>FunctionalClassAtomTypes</em> value of the <strong>-a, --AtomIdentifierType</strong>
265 option. It's a comma-separated list of valid functional classes.</p>
266 <p>Possible values for atom functional classes are: <em>Ar, CA, H, HBA, HBD, Hal, NI, PI, RA</em>.
267 Default value [ Ref 24 ]: <em>HBD,HBA,PI,NI,Ar,Hal</em>.</p>
268 <p>The functional class abbreviations correspond to:</p>
269 <div class="OptionsBox">
270 HBD: HydrogenBondDonor
271 <br/> HBA: HydrogenBondAcceptor
272 <br/> PI : PositivelyIonizable
273 <br/> NI : NegativelyIonizable
274 <br/> Ar : Aromatic
275 <br/> Hal : Halogen
276 <br/> H : Hydrophobic
277 <br/> RA : RingAtom
278 <br/> CA : ChainAtom</div>
279 <p>Functional class atom type specification for an atom corresponds to:</p>
281 <div class="OptionsBox">
282 Ar.CA.H.HBA.HBD.Hal.NI.PI.RA</div>
283 <p><em>AtomTypes::FunctionalClassAtomTypes</em> module is used to assign functional class atom
284 types. It uses following definitions [ Ref 60-61, Ref 65-66 ]:</p>
285 <div class="OptionsBox">
286 HydrogenBondDonor: NH, NH2, OH
287 <br/> HydrogenBondAcceptor: N[!H], O
288 <br/> PositivelyIonizable: +, NH2
289 <br/> NegativelyIonizable: -, C(=O)OH, S(=O)OH, P(=O)OH</div>
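<p>For reading fingerprints generated with functional class atom types, the abbreviations can be expanded
with a small lookup table. The Python sketch below uses the abbreviation list given above;
<em>expand_functional_classes</em> is a hypothetical helper, and treating the atom type <em>None</em> (which
appears in the sample vector strings) as "no functional class assigned" is an assumption.</p>

```python
FUNCTIONAL_CLASSES = {
    "HBD": "HydrogenBondDonor",
    "HBA": "HydrogenBondAcceptor",
    "PI": "PositivelyIonizable",
    "NI": "NegativelyIonizable",
    "Ar": "Aromatic",
    "Hal": "Halogen",
    "H": "Hydrophobic",
    "RA": "RingAtom",
    "CA": "ChainAtom",
}


def expand_functional_classes(atom_type):
    """Expand a functional class atom type such as 'Ar.HBA' into descriptive names."""
    if atom_type == "None":
        # Assumed to mean that no functional class was assigned to the atom.
        return []
    return [FUNCTIONAL_CLASSES[abbreviation] for abbreviation in atom_type.split(".")]
```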
290 </dd>
291 <dt><strong><strong>--CompoundID</strong> <em>DataFieldName or LabelPrefixString</em></strong></dt>
292 <dd>
293 <p>This value is <strong>--CompoundIDMode</strong> specific and indicates how compound ID is generated.</p>
294 <p>For <em>DataField</em> value of <strong>--CompoundIDMode</strong> option, it corresponds to datafield label name
295 whose value is used as compound ID; otherwise, it's a prefix string used for generating compound
296 IDs like LabelPrefixString&lt;Number&gt;. Default value, <em>Cmpd</em>, generates compound IDs which
297 look like Cmpd&lt;Number&gt;.</p>
298 <p>Examples for <em>DataField</em> value of <strong>--CompoundIDMode</strong>:</p>
299 <div class="OptionsBox">
300 MolID
301 <br/> ExtReg</div>
302 <p>Examples for <em>LabelPrefix</em> or <em>MolNameOrLabelPrefix</em> value of <strong>--CompoundIDMode</strong>:</p>
303 <div class="OptionsBox">
304 Compound</div>
305 <p>The value specified above generates compound IDs which correspond to Compound&lt;Number&gt;
306 instead of default value of Cmpd&lt;Number&gt;.</p>
307 </dd>
308 <dt><strong><strong>--CompoundIDLabel</strong> <em>text</em></strong></dt>
309 <dd>
310 <p>Specify compound ID column label for FP or CSV/TSV text file(s) used during <em>CompoundID</em> value
311 of <strong>--DataFieldsMode</strong> option. Default: <em>CompoundID</em>.</p>
312 </dd>
313 <dt><strong><strong>--CompoundIDMode</strong> <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em></strong></dt>
314 <dd>
315 <p>Specify how to generate compound IDs and write to FP or CSV/TSV text file(s) along with generated
316 fingerprints for <em>FP | text | all</em> values of <strong>--output</strong> option: use a <em>SDFile(s)</em> datafield value;
317 use molname line from <em>SDFile(s)</em>; generate a sequential ID with specific prefix; use combination
318 of both MolName and LabelPrefix with usage of LabelPrefix values for empty molname lines.</p>
319 <p>Possible values: <em>DataField | MolName | LabelPrefix | MolNameOrLabelPrefix</em>.
320 Default: <em>LabelPrefix</em>.</p>
321 <p>For <em>MolNameOrLabelPrefix</em> value of <strong>--CompoundIDMode</strong>, molname line in <em>SDFile(s)</em> takes
322 precedence over sequential compound IDs generated using <em>LabelPrefix</em> and only empty molname
323 values are replaced with sequential compound IDs.</p>
324 <p>This is only used for <em>CompoundID</em> value of <strong>--DataFieldsMode</strong> option.</p>
325 </dd>
326 <dt><strong><strong>--DataFields</strong> <em>&quot;FieldLabel1,FieldLabel2,...&quot;</em></strong></dt>
327 <dd>
328 <p>Comma delimited list of <em>SDFile(s)</em> data fields to extract and write to CSV/TSV text file(s) along
329 with generated fingerprints for <em>text | all</em> values of <strong>--output</strong> option.</p>
330 <p>This is only used for <em>Specify</em> value of <strong>--DataFieldsMode</strong> option.</p>
331 <p>Examples:</p>
332 <div class="OptionsBox">
333 Extreg
334 <br/> MolID,CompoundName</div>
335 </dd>
336 <dt><strong><strong>-d, --DataFieldsMode</strong> <em>All | Common | Specify | CompoundID</em></strong></dt>
337 <dd>
338 <p>Specify how data fields in <em>SDFile(s)</em> are transferred to output CSV/TSV text file(s) along
339 with generated fingerprints for <em>text | all</em> values of <strong>--output</strong> option: transfer all SD
340 data fields; transfer SD data fields common to all compounds; extract specified data fields;
341 generate a compound ID using molname line, a compound prefix, or a combination of both.
342 Possible values: <em>All | Common | Specify | CompoundID</em>. Default value: <em>CompoundID</em>.</p>
343 </dd>
344 <dt><strong><strong>-f, --Filter</strong> <em>Yes | No</em></strong></dt>
345 <dd>
346 <p>Specify whether to check and filter compound data in SDFile(s). Possible values: <em>Yes or No</em>.
347 Default value: <em>Yes</em>.</p>
348 <p>By default, compound data is checked before calculating fingerprints and compounds containing
349 atom data corresponding to non-element symbols or no atom data are ignored.</p>
350 </dd>
351 <dt><strong><strong>--FingerprintsLabel</strong> <em>text</em></strong></dt>
352 <dd>
353 <p>SD data label or text file column label to use for fingerprints string in output SD or
354 CSV/TSV text file(s) specified by <strong>--output</strong>. Default value: <em>AtomNeighborhoodsFingerprints</em>.</p>
355 </dd>
356 <dt><strong><strong>-h, --help</strong></strong></dt>
357 <dd>
358 <p>Print this help message.</p>
359 </dd>
360 <dt><strong><strong>-k, --KeepLargestComponent</strong> <em>Yes | No</em></strong></dt>
361 <dd>
362 <p>Generate fingerprints for only the largest component in molecule. Possible values:
363 <em>Yes or No</em>. Default value: <em>Yes</em>.</p>
364 <p>For molecules containing multiple connected components, fingerprints can be generated
365 in two different ways: use all connected components or just the largest connected
366 component. By default, all atoms except for the largest connected component are
367 deleted before generation of fingerprints.</p>
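<p>The idea of keeping only the largest connected component can be illustrated with a plain graph traversal.
The Python sketch below is not the MayaChemTools implementation: <em>largest_component</em> is a hypothetical
function, the molecule is reduced to atom indices and bond pairs, and "largest" is taken to mean the component
with the most atoms, with ties going to the component found first.</p>

```python
def largest_component(num_atoms, bonds):
    """Return the sorted atom indices of the largest connected component."""
    neighbors = {i: set() for i in range(num_atoms)}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    seen = set()
    best = []
    for start in range(num_atoms):
        if start in seen:
            continue
        # Depth-first traversal collects every atom reachable from start.
        component = []
        stack = [start]
        seen.add(start)
        while stack:
            atom = stack.pop()
            component.append(atom)
            for nbr in neighbors[atom]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        if len(component) > len(best):
            best = component
    return sorted(best)
```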
368 </dd>
369 <dt><strong><strong>--MinNeighborhoodRadius</strong> <em>number</em></strong></dt>
370 <dd>
371 <p>Minimum atom neighborhood radius for generating atom neighborhoods. Default value: <em>0</em>.
372 Valid values: non-negative integers less than <strong>--MaxNeighborhoodRadius</strong>. A neighborhood
373 radius of zero corresponds to the list of non-hydrogen atoms.</p>
374 </dd>
375 <dt><strong><strong>--MaxNeighborhoodRadius</strong> <em>number</em></strong></dt>
376 <dd>
377 <p>Maximum atom neighborhood radius for generating atom neighborhoods. Default value: <em>2</em>.
378 Valid values: positive integers greater than <strong>--MinNeighborhoodRadius</strong>.</p>
379 </dd>
380 <dt><strong><strong>--OutDelim</strong> <em>comma | tab | semicolon</em></strong></dt>
381 <dd>
382 <p>Delimiter for output CSV/TSV text file(s). Possible values: <em>comma, tab, or semicolon</em>.
383 Default value: <em>comma</em>.</p>
384 </dd>
385 <dt><strong><strong>--output</strong> <em>SD | FP | text | all</em></strong></dt>
386 <dd>
387 <p>Type of output files to generate. Possible values: <em>SD, FP, text, or all</em>. Default value: <em>text</em>.</p>
388 </dd>
389 <dt><strong><strong>-o, --overwrite</strong></strong></dt>
390 <dd>
391 <p>Overwrite existing files.</p>
392 </dd>
393 <dt><strong><strong>-q, --quote</strong> <em>Yes | No</em></strong></dt>
394 <dd>
395 <p>Put quotes around column values in output CSV/TSV text file(s). Possible values:
396 <em>Yes or No</em>. Default value: <em>Yes</em>.</p>
397 </dd>
398 <dt><strong><strong>-r, --root</strong> <em>RootName</em></strong></dt>
399 <dd>
400 <p>New file name is generated using the root: &lt;Root&gt;.&lt;Ext&gt;. Default for new file names:
401 &lt;SDFileName&gt;&lt;AtomNeighborhoodsFP&gt;.&lt;Ext&gt;. The file type determines &lt;Ext&gt;
402 value. The sdf, fpf, csv, and tsv &lt;Ext&gt; values are used for SD, FP, comma/semicolon, and tab
403 delimited text files, respectively. This option is ignored for multiple input files.</p>
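<p>The naming rule can be restated as a short Python function. This is only an interpretation of the
description above: <em>output_file_names</em> is a hypothetical helper, and the default name is assumed to be
the SD file name without its extension followed by the literal text <em>AtomNeighborhoodsFP</em>.</p>

```python
import os


def output_file_names(sd_file, root=None, output="text", out_delim="comma"):
    """Return the output file names implied by --root, --output and --OutDelim."""
    if root is None:
        # Assumed default: <SDFileName> without extension plus "AtomNeighborhoodsFP".
        base = os.path.splitext(os.path.basename(sd_file))[0] + "AtomNeighborhoodsFP"
    else:
        base = root
    # Comma and semicolon delimited text files use csv; tab delimited files use tsv.
    text_ext = "tsv" if out_delim == "tab" else "csv"
    extensions = {
        "SD": ["sdf"],
        "FP": ["fpf"],
        "text": [text_ext],
        "all": ["sdf", "fpf", text_ext],
    }
    return [f"{base}.{ext}" for ext in extensions[output]]
```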
404 </dd>
405 <dt><strong><strong>-w, --WorkingDir</strong> <em>DirName</em></strong></dt>
406 <dd>
407 <p>Location of working directory. Default: current directory.</p>
408 </dd>
409 </dl>
410 <p>
411 </p>
412 <h2>EXAMPLES</h2>
413 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
414 2 using atomic invariants atom types in vector string format and create a SampleANFP.csv
415 file containing sequential compound IDs along with fingerprints vector strings data, type:</p>
416 <div class="ExampleBox">
417 % AtomNeighborhoodsFingerprints.pl -r SampleANFP -o Sample.sdf</div>
418 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
419 2 using DREIDING atom types in vector string format and create a SampleANFP.csv
420 file containing sequential compound IDs along with fingerprints vector strings data, type:</p>
421 <div class="ExampleBox">
422 % AtomNeighborhoodsFingerprints.pl -a DREIDINGAtomTypes -r SampleANFP
423 -o Sample.sdf</div>
424 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
425 2 using EState atom types in vector string format and create a SampleANFP.csv
426 file containing sequential compound IDs along with fingerprints vector strings data, type:</p>
427 <div class="ExampleBox">
428 % AtomNeighborhoodsFingerprints.pl -a EStateAtomTypes -r SampleANFP
429 -o Sample.sdf</div>
430 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
431 2 using SYBYL atom types in vector string format and create a SampleANFP.csv
432 file containing sequential compound IDs along with fingerprints vector strings data, type:</p>
433 <div class="ExampleBox">
434 % AtomNeighborhoodsFingerprints.pl -a SYBYLAtomTypes -r SampleANFP
435 -o Sample.sdf</div>
436 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
437 2 using FunctionalClass atom types in vector string format and create a SampleANFP.csv
438 file containing sequential compound IDs along with fingerprints vector strings data, type:</p>
439 <div class="ExampleBox">
440 % AtomNeighborhoodsFingerprints.pl -a FunctionalClassAtomTypes
441 -r SampleANFP -o Sample.sdf</div>
442 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
443 2 using MMFF94 atom types in vector string format and create a SampleANFP.csv
444 file containing sequential compound IDs along with fingerprints vector strings data, type:</p>
445 <div class="ExampleBox">
446 % AtomNeighborhoodsFingerprints.pl -a MMFF94AtomTypes -r SampleANFP
447 -o Sample.sdf</div>
448 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
449 2 using SLogP atom types in vector string format and create a SampleANFP.csv
450 file containing sequential compound IDs along with fingerprints vector strings data, type:</p>
451 <div class="ExampleBox">
452 % AtomNeighborhoodsFingerprints.pl -a SLogPAtomTypes -r SampleANFP
453 -o Sample.sdf</div>
460 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
461 2 using TPSA atom types in vector string format and create a SampleANFP.csv
462 file containing sequential compound IDs along with fingerprints vector strings data, type:</p>
463 <div class="ExampleBox">
464 % AtomNeighborhoodsFingerprints.pl -a TPSAAtomTypes -r SampleANFP
465 -o Sample.sdf</div>
466 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
467 2 using UFF atom types in vector string format and create a SampleANFP.csv
468 file containing sequential compound IDs along with fingerprints vector strings data, type:</p>
469 <div class="ExampleBox">
470 % AtomNeighborhoodsFingerprints.pl -a UFFAtomTypes -r SampleANFP
471 -o Sample.sdf</div>
472 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
473 2 using atomic invariants atom types in vector string format and create SampleANFP.sdf,
474 SampleANFP.fpf and SampleANFP.csv files containing sequential compound IDs in CSV file along
475 with fingerprints vector strings data, type:</p>
476 <div class="ExampleBox">
477 % AtomNeighborhoodsFingerprints.pl --output all -r SampleANFP
478 -o Sample.sdf</div>
479 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 1 to
480 3 using atomic invariants atom types in vector string format and create a SampleANFP.csv
481 file containing sequential compound IDs along with fingerprints vector strings data, type:</p>
482 <div class="ExampleBox">
483 % AtomNeighborhoodsFingerprints.pl -a AtomicInvariantsAtomTypes
484 --MinNeighborhoodRadius 1 --MaxNeighborhoodRadius 3 -r SampleANFP
485 -o Sample.sdf</div>
486 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
487 3 using only AS,X atomic invariants atom types in vector string format and create a SampleANFP.csv
488 file containing sequential compound IDs along with fingerprints vector strings data, type:</p>
489 <div class="ExampleBox">
490 % AtomNeighborhoodsFingerprints.pl -a AtomicInvariantsAtomTypes
491 --AtomicInvariantsToUse &quot;AS,X&quot; --MinNeighborhoodRadius 0
492 --MaxNeighborhoodRadius 3 -r SampleANFP -o Sample.sdf</div>
493 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
494 2 using atomic invariants atom types in vector string format and create a SampleANFP.csv
495 file containing compound IDs from the molecule name line along with fingerprints vector strings data, type:</p>
496 <div class="ExampleBox">
497 % AtomNeighborhoodsFingerprints.pl -a AtomicInvariantsAtomTypes
498 --DataFieldsMode CompoundID --CompoundIDMode MolName
499 -r SampleANFP -o Sample.sdf</div>
500 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
501 2 using atomic invariants atom types in vector string format and create a SampleANFP.csv
502 file containing compound IDs from a specified data field along with fingerprints vector strings
503 data, type:</p>
504 <div class="ExampleBox">
505 % AtomNeighborhoodsFingerprints.pl -a AtomicInvariantsAtomTypes
506 --DataFieldsMode CompoundID --CompoundIDMode DataField --CompoundID
507 Mol_ID -r SampleANFP -o Sample.sdf</div>
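<p>The DataField example above assumes that every record in Sample.sdf carries a Mol_ID data field. As a hedged sketch, the check below looks for an SD data header line naming the field before the script is run. The sdf_has_field helper is hypothetical and not part of MayaChemTools, and it only confirms that the field appears at least once in the file, not in every record.</p>

```shell
#!/bin/sh
# Sketch: test whether an SD file contains a data header line such as
# ">  <Mol_ID>". The helper name is hypothetical. The field name is used
# as part of a grep pattern, so it should not contain regex metacharacters.
sdf_has_field() {
    file="$1"
    field="$2"
    # SD data headers start with ">" and give the field name in angle brackets.
    grep -q "^>.*<${field}>" "$file"
}

# Usage with the file and field names from the example above.
if [ -f Sample.sdf ] && sdf_has_field Sample.sdf Mol_ID; then
    echo "Mol_ID found in Sample.sdf"
fi
```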
508 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
509 2 using atomic invariants atom types in vector string format and create a SampleANFP.csv
510 file containing compound IDs using a combination of the molecule name line and an explicit compound
511 prefix along with fingerprints vector strings data, type:</p>
512 <div class="ExampleBox">
513 % AtomNeighborhoodsFingerprints.pl -a AtomicInvariantsAtomTypes
514 --DataFieldsMode CompoundID --CompoundIDMode MolNameOrLabelPrefix
515 --CompoundID Cmpd --CompoundIDLabel MolID -r SampleANFP -o Sample.sdf</div>
516 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
517 2 using atomic invariants atom types in vector string format and create a SampleANFP.csv
518 file containing specific data field columns along with fingerprints vector strings
519 data, type:</p>
520 <div class="ExampleBox">
521 % AtomNeighborhoodsFingerprints.pl -a AtomicInvariantsAtomTypes
522 --DataFieldsMode Specify --DataFields Mol_ID -r SampleANFP
523 -o Sample.sdf</div>
524 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
525 2 using atomic invariants atom types in vector string format and create a SampleANFP.csv
526 file containing common data field columns along with fingerprints vector strings
527 data, type:</p>
528 <div class="ExampleBox">
529 % AtomNeighborhoodsFingerprints.pl -a AtomicInvariantsAtomTypes
530 --DataFieldsMode Common -r SampleANFP -o Sample.sdf</div>
531 <p>To generate atom neighborhoods fingerprints corresponding to atom neighborhood radii from 0 to
532 2 using atomic invariants atom types in vector string format and create SampleANFP.sdf,
533 SampleANFP.fpf and SampleANFP.csv files containing all data field columns in the CSV file along with
534 fingerprints data, type:</p>
535 <div class="ExampleBox">
536 % AtomNeighborhoodsFingerprints.pl -a AtomicInvariantsAtomTypes
537 --DataFieldsMode All --output all -r SampleANFP
538 -o Sample.sdf</div>
539 <p>
540 </p>
541 <h2>AUTHOR</h2>
542 <p><a href="mailto:msud@san.rr.com">Manish Sud</a></p>
543 <p>
544 </p>
545 <h2>SEE ALSO</h2>
546 <p><a href="./InfoFingerprintsFiles.html">InfoFingerprintsFiles.pl</a>,&nbsp;<a href="./SimilarityMatricesFingerprints.html">SimilarityMatricesFingerprints.pl</a>,&nbsp;<a href="./SimilaritySearchingFingerprints.html">SimilaritySearchingFingerprints.pl</a>,&nbsp;
547 <a href="./ExtendedConnectivityFingerprints.html">ExtendedConnectivityFingerprints.pl</a>,&nbsp;<a href="./MACCSKeysFingerprints.html">MACCSKeysFingerprints.pl</a>,&nbsp;<a href="./PathLengthFingerprints.html">PathLengthFingerprints.pl</a>,&nbsp;
548 <a href="./TopologicalAtomPairsFingerprints.html">TopologicalAtomPairsFingerprints.pl</a>,&nbsp;<a href="./TopologicalAtomTorsionsFingerprints.html">TopologicalAtomTorsionsFingerprints.pl</a>,&nbsp;
549 <a href="./TopologicalPharmacophoreAtomPairsFingerprints.html">TopologicalPharmacophoreAtomPairsFingerprints.pl</a>,&nbsp;<a href="./TopologicalPharmacophoreAtomTripletsFingerprints.html">TopologicalPharmacophoreAtomTripletsFingerprints.pl</a>
550 </p>
551 <p>
552 </p>
553 <h2>COPYRIGHT</h2>
554 <p>Copyright (C) 2015 Manish Sud. All rights reserved.</p>
555 <p>This file is part of MayaChemTools.</p>
556 <p>MayaChemTools is free software; you can redistribute it and/or modify it under
557 the terms of the GNU Lesser General Public License as published by the Free
558 Software Foundation; either version 3 of the License, or (at your option)
559 any later version.</p>
560 <p>&nbsp;</p><p>&nbsp;</p><div class="DocNav">
561 <table width="100%" border=0 cellpadding=0 cellspacing=2>
562 <tr align="left" valign="top"><td width="33%" align="left"><a href="./AnalyzeTextFilesData.html" title="AnalyzeTextFilesData.html">Previous</a>&nbsp;&nbsp;<a href="./index.html" title="Table of Contents">TOC</a>&nbsp;&nbsp;<a href="./AtomTypesFingerprints.html" title="AtomTypesFingerprints.html">Next</a></td><td width="34%" align="middle"><strong>March 29, 2015</strong></td><td width="33%" align="right"><strong>AtomNeighborhoodsFingerprints.pl</strong></td></tr>
563 </table>
564 </div>
565 <br />
566 <center>
567 <img src="../../images/h2o2.png">
568 </center>
569 </body>
570 </html>