annotate mayachemtools/docs/scripts/txt/PathLengthFingerprints.txt @ 9:ab29fa5c8c1f draft default tip

Uploaded
author deepakjadmin
date Thu, 15 Dec 2016 14:18:03 -0500
parents 73ae111cf86f
NAME
    PathLengthFingerprints.pl - Generate atom path length based fingerprints
    for SD files

SYNOPSIS
    PathLengthFingerprints.pl SDFile(s)...

    PathLengthFingerprints.pl [--AromaticityModel *AromaticityModelType*]
    [-a, --AtomIdentifierType *AtomicInvariantsAtomTypes*]
    [--AtomicInvariantsToUse *"AtomicInvariant1,AtomicInvariant2..."*]
    [--FunctionalClassesToUse *"FunctionalClass1,FunctionalClass2..."*]
    [--BitsOrder *Ascending | Descending*] [-b, --BitStringFormat
    *BinaryString | HexadecimalString*] [--CompoundID *DataFieldName or
    LabelPrefixString*] [--CompoundIDLabel *text*] [--CompoundIDMode
    *DataField | MolName | LabelPrefix | MolNameOrLabelPrefix*]
    [--DataFields *"FieldLabel1,FieldLabel2,..."*] [-d, --DataFieldsMode
    *All | Common | Specify | CompoundID*] [--DetectAromaticity *Yes | No*]
    [-f, --Filter *Yes | No*] [--FingerprintsLabel *text*] [--fold *Yes |
    No*] [--FoldedSize *number*] [-h, --help] [-i, --IgnoreHydrogens *Yes |
    No*] [-k, --KeepLargestComponent *Yes | No*] [-m, --mode *PathLengthBits
    | PathLengthCount*] [--MinPathLength *number*] [--MaxPathLength
    *number*] [-n, --NumOfBitsToSetPerPath *number*] [--OutDelim *comma |
    tab | semicolon*] [--output *SD | FP | text | all*] [-q, --quote *Yes |
    No*] [-r, --root *RootName*] [-p, --PathMode *AtomPathsWithoutRings |
    AtomPathsWithRings | AllAtomPathsWithoutRings | AllAtomPathsWithRings*]
    [-s, --size *number*] [-u, --UseBondSymbols *Yes | No*]
    [--UsePerlCoreRandom *Yes | No*] [--UseUniquePaths *Yes | No*] [-v,
    --VectorStringFormat *IDsAndValuesString | IDsAndValuesPairsString |
    ValuesAndIDsString | ValuesAndIDsPairsString*] [-w, --WorkingDir
    *dirname*] SDFile(s)...

DESCRIPTION
    Generate atom path length fingerprints for *SDFile(s)* and create
    appropriate SD, FP or CSV/TSV text file(s) containing fingerprints
    bit-vector or vector strings corresponding to molecular fingerprints.

    Multiple SDFile names are separated by spaces. The valid file extensions
    are *.sdf* and *.sd*. All other file names are ignored. All the SD files
    in the current directory can be specified either by **.sdf* or the
    current directory name.

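The file selection rule above can be sketched in Python. This is only an illustration, not the script's own code: the function name `select_sd_files` is hypothetical, and treating extensions case-insensitively is an assumption the text does not state.

```python
from pathlib import Path

# Extensions named in the description; case-insensitive matching is an assumption.
VALID_EXTENSIONS = {".sdf", ".sd"}


def select_sd_files(names):
    """Return SD file names from a list of file or directory names.

    Hypothetical helper: files with other extensions are ignored, and a
    directory name stands for all SD files directly inside it.
    """
    selected = []
    for name in names:
        path = Path(name)
        if path.is_dir():
            selected.extend(
                sorted(
                    str(entry)
                    for entry in path.iterdir()
                    if entry.is_file() and entry.suffix.lower() in VALID_EXTENSIONS
                )
            )
        elif path.suffix.lower() in VALID_EXTENSIONS:
            selected.append(name)
    return selected
```

For example, `select_sd_files(["Sample.sdf", "Notes.txt"])` keeps only `Sample.sdf`.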
    The current release of MayaChemTools supports generation of path length
    fingerprints corresponding to the following -a, --AtomIdentifierType
    values:

        AtomicInvariantsAtomTypes, DREIDINGAtomTypes, EStateAtomTypes,
        FunctionalClassAtomTypes, MMFF94AtomTypes, SLogPAtomTypes,
        SYBYLAtomTypes, TPSAAtomTypes, UFFAtomTypes

    Based on the values specified for -p, --PathMode, --MinPathLength and
    --MaxPathLength, all appropriate atom paths are generated for each atom
    in the molecule and collected in a list. The list is then filtered to
    remove any structurally duplicate paths, as indicated by the value of
    the --UseUniquePaths option.

    For each atom path in the filtered atom paths list, an atom path string
    is created using the value of -a, --AtomIdentifierType and the values
    specified for that atom identifier type. The value of -u,
    --UseBondSymbols controls whether bond order symbols are used during
    generation of the atom path string. For each atom path, only the
    lexicographically smaller atom path string is kept.

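One way to read the last rule is that a path can be written starting from either end, and the smaller of the two spellings is the one retained. The Python sketch below follows that reading; it is an assumption, not the MayaChemTools implementation. The function names are hypothetical, and the bond symbols (empty for single, `=` for double, `:` for aromatic) are inferred from the example strings in this document.

```python
def atom_path_string(atom_types, bond_symbols=None):
    """Join atom types, optionally interleaved with bond symbols.

    bond_symbols[i] is the symbol for the bond between atom i and atom i + 1.
    """
    if not bond_symbols:
        return "".join(atom_types)
    parts = [atom_types[0]]
    for bond, atom in zip(bond_symbols, atom_types[1:]):
        parts.append(bond + atom)
    return "".join(parts)


def canonical_path_string(atom_types, bond_symbols=None):
    """Return the lexicographically smaller of the forward and reverse strings."""
    forward = atom_path_string(atom_types, bond_symbols)
    reverse = atom_path_string(
        atom_types[::-1], bond_symbols[::-1] if bond_symbols else None
    )
    return min(forward, reverse)
```

With DREIDING atom types, `canonical_path_string(["O_2", "C_2"], ["="])` gives `C_2=O_2`, which is how that path is spelled in the example vector strings in this document.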
    For the *PathLengthBits* value of the -m, --mode option, each atom path
    is hashed to a 32 bit unsigned integer key using the TextUtil::HashCode
    function. Using the hash key as a seed for a random number generator, a
    random integer value between 0 and -s, --size is used to set the
    corresponding bit in the fingerprint bit-vector string. The value of the
    -n, --NumOfBitsToSetPerPath option controls the number of times a random
    number is generated to set corresponding bits.

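The hash-then-seed scheme can be sketched in Python as follows. This is a sketch under stated assumptions and will not reproduce the bits the script sets: `zlib.crc32` stands in for TextUtil::HashCode, Python's `random.Random` stands in for the Perl random number generator, and the bit index is assumed to run from 0 to size - 1. The function name and the default arguments are hypothetical; 1024 is simply the size shown in the examples.

```python
import random
import zlib


def set_path_bits(path_strings, size=1024, bits_per_path=1):
    """Set pseudo-random bits for each atom path string.

    Stand-ins (assumptions): crc32 replaces TextUtil::HashCode and
    random.Random replaces the generator used by the script.
    """
    bits = [0] * size
    for path in path_strings:
        # 32 bit unsigned key derived from the path string.
        key = zlib.crc32(path.encode("utf-8")) & 0xFFFFFFFF
        # Seeding with the key makes the chosen bits reproducible per path.
        generator = random.Random(key)
        for _ in range(bits_per_path):
            bits[generator.randrange(size)] = 1
    return bits
```

Because the generator is seeded by the path's own hash, the same path string always maps to the same bit positions, whichever molecule it occurs in.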
    For the *PathLengthCount* value of the -m, --mode option, the number of
    times an atom path appears is tracked and a fingerprints count-string
    corresponding to the count of atom paths is generated.

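A minimal Python sketch of the counting step is shown below. The function name `path_count_string` is hypothetical, and the alphabetical ordering of the pairs is an assumption made for reproducibility; the ordering the script uses is not described here.

```python
from collections import Counter


def path_count_string(path_strings):
    """Count identical atom path strings and emit "ID count" pairs.

    Pairs are sorted alphabetically by path string (an assumption).
    """
    counts = Counter(path_strings)
    return " ".join(f"{path} {counts[path]}" for path in sorted(counts))
```

For instance, `path_count_string(["C_3", "C_2", "C_3"])` yields `C_2 1 C_3 2`.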
    Example of *SD* file containing path length fingerprints string data:

        ... ...
        ... ...
        $$$$
        ... ...
        ... ...
        ... ...
         41 44  0  0  0  0  0  0  0  0999 V2000
           -3.3652    1.4499    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        ... ...
          2  3  1  0  0  0  0
        ... ...
        M  END
        > <CmpdID>
        Cmpd1

        > <PathLengthFingerprints>
        FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLengt
        h1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a49913991a66
        03130b0a19e8051c89184414953800cc2151082844a201042800130860308e8204d4028
        00831048940e44281c00060449a5000ac80c894114e006321264401600846c050164462
        08190410805000304a10205b0100e04c0038ba0fad0209c0ca8b1200012268b61c0026a
        aa0660a11014a011d46

        $$$$
        ... ...
        ... ...

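The fingerprints value in the SD example is a semicolon-delimited string. The Python sketch below splits it into its fields and counts the set bits. The field names are taken from the FP file header labels (Description, Size, BitStringFormat, BitsOrder); the function names are hypothetical, and joining wrapped lines by dropping whitespace is an assumption that only suits bit-vector strings.

```python
def parse_bit_vector_string(text):
    """Split a FingerprintsBitVector string into named fields."""
    # The examples wrap the string across lines, so drop all whitespace first.
    joined = "".join(text.split())
    kind, description, size, bit_format, bits_order, data = joined.split(";", 5)
    if kind != "FingerprintsBitVector":
        raise ValueError(f"unexpected fingerprints string type: {kind}")
    return {
        "description": description,
        "size": int(size),
        "format": bit_format,
        "order": bits_order,
        "data": data,
    }


def count_set_bits(parsed):
    """Count the bits set to 1, for either bit string format."""
    if parsed["format"] == "HexadecimalString":
        return bin(int(parsed["data"], 16)).count("1")
    return parsed["data"].count("1")
```

The set-bit count does not depend on the bits order, so it is a safe quantity to compute without knowing how Ascending and Descending map bits to characters.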
    Example of *FP* file containing path length fingerprints string data:

        #
        # Package = MayaChemTools 7.4
        # ReleaseDate = Oct 21, 2010
        #
        # TimeStamp = Mon Mar 7 15:14:01 2011
        #
        # FingerprintsStringType = FingerprintsBitVector
        #
        # Description = PathLengthBits:AtomicInvariantsAtomTypes:MinLength1:...
        # Size = 1024
        # BitStringFormat = HexadecimalString
        # BitsOrder = Ascending
        #
        Cmpd1 9c8460989ec8a49913991a6603130b0a19e8051c89184414953800cc21510...
        Cmpd2 000000249400840040100042011001001980410c000000001010088001120...
        ... ...
        ... ...

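A file laid out like the FP example can be read with a few lines of Python. This sketch assumes what the example shows and nothing more: header lines start with `#` and hold `Key = Value` pairs, and each data line is a compound ID followed by whitespace and the fingerprints data. The function name `parse_fp_lines` is hypothetical.

```python
def parse_fp_lines(lines):
    """Parse FP-style lines into (header, fingerprints) dictionaries."""
    header = {}
    fingerprints = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header entries look like "# Key = Value"; bare "#" lines are spacers.
            key, separator, value = line[1:].partition("=")
            if separator:
                header[key.strip()] = value.strip()
            continue
        # First whitespace-delimited token is the compound ID.
        compound_id, data = line.split(None, 1)
        fingerprints[compound_id] = data
    return header, fingerprints
```

Splitting on the first run of whitespace only keeps vector data, which itself contains spaces, in one piece.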
    Example of CSV *Text* file containing path length fingerprints string
    data:

        "CompoundID","PathLengthFingerprints"
        "Cmpd1","FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes
        :MinLength1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a4
        9913991a6603130b0a19e8051c89184414953800cc2151082844a20104280013086030
        8e8204d402800831048940e44281c00060449a5000ac80c894114e006321264401..."
        ... ...
        ... ...

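The quoted CSV layout above can be read with Python's standard `csv` module. In this sketch the column names come from the example header row, while the function name `read_fingerprints_csv` is hypothetical and each record is assumed to occupy one physical line (the example is wrapped only for display).

```python
import csv
import io


def read_fingerprints_csv(handle, delimiter=","):
    """Map each CompoundID to its PathLengthFingerprints string."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return {row["CompoundID"]: row["PathLengthFingerprints"] for row in reader}


# Usage with an in-memory file; the data value is shortened for illustration.
example = io.StringIO(
    '"CompoundID","PathLengthFingerprints"\n'
    '"Cmpd1","FingerprintsBitVector;PathLengthBits;16;HexadecimalString;Ascending;9c84"\n'
)
fingerprints_by_id = read_fingerprints_csv(example)
```

Passing `delimiter="\t"` or `delimiter=";"` covers the other --OutDelim choices, although a semicolon delimiter relies on the quoting shown above because the fingerprints string contains semicolons.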
    The current release of MayaChemTools generates the following types of
    path length fingerprints bit-vector and vector strings:

        FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng
        th1:MaxLength8;1024;BinaryString;Ascending;001000010011010101011000110
        0100010101011000101001011100110001000010001001101000001001001001001000
        0010110100000111001001000001001010100100100000000011000000101001011100
        0010000001000101010100000100111100110111011011011000000010110111001101
        0101100011000000010001000011000010100011101100001000001000100000000...

        FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng
        th1:MaxLength8;1024;HexadecimalString;Ascending;48caa1315d82d91122b029
        42861c9409a4208182d12015509767bd0867653604481a8b1288000056090583603078
        9cedae54e26596889ab121309800900490515224208421502120a0dd9200509723ae89
        00024181b86c0122821d4e4880c38620dab280824b455404009f082003d52c212b4e6d
        6ea05280140069c780290c43

        FingerprintsVector;PathLengthCount:AtomicInvariantsAtomTypes:MinLength
        1:MaxLength8;432;NumericalValues;IDsAndValuesPairsString;C.X1.BO1.H3 2
        C.X2.BO2.H2 4 C.X2.BO3.H1 14 C.X3.BO3.H1 3 C.X3.BO4 10 F.X1.BO1 1 N.X
        2.BO2.H1 1 N.X3.BO3 1 O.X1.BO1.H1 3 O.X1.BO2 2 C.X1.BO1.H3C.X3.BO3.H1
        2 C.X2.BO2.H2C.X2.BO2.H2 1 C.X2.BO2.H2C.X3.BO3.H1 4 C.X2.BO2.H2C.X3.BO
        4 1 C.X2.BO2.H2N.X3.BO3 1 C.X2.BO3.H1:C.X2.BO3.H1 10 C.X2.BO3.H1:C....

        FingerprintsVector;PathLengthCount:DREIDINGAtomTypes:MinLength1:MaxLen
        gth8;410;NumericalValues;IDsAndValuesPairsString;C_2 2 C_3 9 C_R 22 F_
        1 N_3 1 N_R 1 O_2 2 O_3 3 C_2=O_2 2 C_2C_3 1 C_2C_R 1 C_2N_3 1 C_2O_3
        1 C_3C_3 7 C_3C_R 1 C_3N_R 1 C_3O_3 2 C_R:C_R 21 C_R:N_R 2 C_RC_R 2 C
        _RF_ 1 C_RN_3 1 C_2C_3C_3 1 C_2C_R:C_R 2 C_2N_3C_R 1 C_3C_2=O_2 1 C_3C
        _2O_3 1 C_3C_3C_3 5 C_3C_3C_R 2 C_3C_3N_R 1 C_3C_3O_3 4 C_3C_R:C_R ...

        FingerprintsVector;PathLengthCount:EStateAtomTypes:MinLength1:MaxLengt
        h8;454;NumericalValues;IDsAndValuesPairsString;aaCH 14 aasC 8 aasN 1 d
        O 2 dssC 2 sCH3 2 sF 1 sOH 3 ssCH2 4 ssNH 1 sssCH 3 aaCH:aaCH 10 aaCH:
        aasC 8 aasC:aasC 3 aasC:aasN 2 aasCaasC 2 aasCdssC 1 aasCsF 1 aasCssNH
        1 aasCsssCH 1 aasNssCH2 1 dO=dssC 2 dssCsOH 1 dssCssCH2 1 dssCssNH 1
        sCH3sssCH 2 sOHsssCH 2 ssCH2ssCH2 1 ssCH2sssCH 4 aaCH:aaCH:aaCH 6 a...

        FingerprintsVector;PathLengthCount:FunctionalClassAtomTypes:MinLength1
        :MaxLength8;404;NumericalValues;IDsAndValuesPairsString;Ar 22 Ar.HBA 1
        HBA 2 HBA.HBD 3 HBD 1 Hal 1 NI 1 None 10 Ar.HBA:Ar 2 Ar.HBANone 1 Ar:
        Ar 21 ArAr 2 ArHBD 1 ArHal 1 ArNone 2 HBA.HBDNI 1 HBA.HBDNone 2 HBA=NI
        1 HBA=None 1 HBDNone 1 NINone 1 NoneNone 7 Ar.HBA:Ar:Ar 2 Ar.HBA:ArAr
        1 Ar.HBA:ArNone 1 Ar.HBANoneNone 1 Ar:Ar.HBA:Ar 1 Ar:Ar.HBANone 2 ...

        FingerprintsVector;PathLengthCount:MMFF94AtomTypes:MinLength1:MaxLengt
        h8;463;NumericalValues;IDsAndValuesPairsString;C5A 2 C5B 2 C=ON 1 CB 1
        8 COO 1 CR 9 F 1 N5 1 NC=O 1 O=CN 1 O=CO 1 OC=O 1 OR 2 C5A:C5B 2 C5A:N
        5 2 C5ACB 1 C5ACR 1 C5B:C5B 1 C5BC=ON 1 C5BCB 1 C=ON=O=CN 1 C=ONNC=O 1
        CB:CB 18 CBF 1 CBNC=O 1 COO=O=CO 1 COOCR 1 COOOC=O 1 CRCR 7 CRN5 1 CR
        OR 2 C5A:C5B:C5B 2 C5A:C5BC=ON 1 C5A:C5BCB 1 C5A:N5:C5A 1 C5A:N5CR ...

        FingerprintsVector;PathLengthCount:SLogPAtomTypes:MinLength1:MaxLength
        8;518;NumericalValues;IDsAndValuesPairsString;C1 5 C10 1 C11 1 C14 1 C
        18 14 C20 4 C21 2 C22 1 C5 2 CS 2 F 1 N11 1 N4 1 O10 1 O2 3 O9 1 C10C1
        1 C10N11 1 C11C1 2 C11C21 1 C14:C18 2 C14F 1 C18:C18 10 C18:C20 4 C18
        :C22 2 C1C5 1 C1CS 4 C20:C20 1 C20:C21 1 C20:N11 1 C20C20 2 C21:C21 1
        C21:N11 1 C21C5 1 C22N4 1 C5=O10 1 C5=O9 1 C5N4 1 C5O2 1 CSO2 2 C10...

        FingerprintsVector;PathLengthCount:SYBYLAtomTypes:MinLength1:MaxLength
        8;412;NumericalValues;IDsAndValuesPairsString;C.2 2 C.3 9 C.ar 22 F 1
        N.am 1 N.ar 1 O.2 1 O.3 2 O.co2 2 C.2=O.2 1 C.2=O.co2 1 C.2C.3 1 C.2C.
        ar 1 C.2N.am 1 C.2O.co2 1 C.3C.3 7 C.3C.ar 1 C.3N.ar 1 C.3O.3 2 C.ar:C
        .ar 21 C.ar:N.ar 2 C.arC.ar 2 C.arF 1 C.arN.am 1 C.2C.3C.3 1 C.2C.ar:C
        .ar 2 C.2N.amC.ar 1 C.3C.2=O.co2 1 C.3C.2O.co2 1 C.3C.3C.3 5 C.3C.3...

        FingerprintsVector;PathLengthCount:TPSAAtomTypes:MinLength1:MaxLength8
        ;331;NumericalValues;IDsAndValuesPairsString;N21 1 N7 1 None 34 O3 2 O
        4 3 N21:None 2 N21None 1 N7None 2 None:None 21 None=O3 2 NoneNone 13 N
        oneO4 3 N21:None:None 2 N21:NoneNone 2 N21NoneNone 1 N7None:None 2 N7N
        one=O3 1 N7NoneNone 1 None:N21:None 1 None:N21None 2 None:None:None 20
        None:NoneNone 12 NoneN7None 1 NoneNone=O3 2 NoneNoneNone 8 NoneNon...

        FingerprintsVector;PathLengthCount:UFFAtomTypes:MinLength1:MaxLength8;
        410;NumericalValues;IDsAndValuesPairsString;C_2 2 C_3 9 C_R 22 F_ 1 N_
        3 1 N_R 1 O_2 2 O_3 3 C_2=O_2 2 C_2C_3 1 C_2C_R 1 C_2N_3 1 C_2O_3 1 C_
        3C_3 7 C_3C_R 1 C_3N_R 1 C_3O_3 2 C_R:C_R 21 C_R:N_R 2 C_RC_R 2 C_RF_
        1 C_RN_3 1 C_2C_3C_3 1 C_2C_R:C_R 2 C_2N_3C_R 1 C_3C_2=O_2 1 C_3C_2O_3
        1 C_3C_3C_3 5 C_3C_3C_R 2 C_3C_3N_R 1 C_3C_3O_3 4 C_3C_R:C_R 1 C_3...

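The count vectors above use the IDsAndValuesPairsString layout, in which the data field alternates path IDs and counts. A Python sketch of reading that layout follows. It assumes the IDs contain no whitespace, as in the examples, and that the third field is the declared number of values; that meaning is inferred, not stated in this document. The function name is hypothetical.

```python
def parse_pairs_vector_string(text):
    """Parse a FingerprintsVector string in IDsAndValuesPairsString layout.

    Returns (declared_size, {path_id: count}). Treating the third field as
    the number of values is an assumption.
    """
    kind, description, size, values_type, layout, data = text.strip().split(";", 5)
    if kind != "FingerprintsVector":
        raise ValueError(f"unexpected fingerprints string type: {kind}")
    if layout != "IDsAndValuesPairsString":
        raise ValueError(f"unsupported vector string format: {layout}")
    tokens = data.split()
    if len(tokens) % 2:
        raise ValueError("data field does not hold complete ID and value pairs")
    counts = {tokens[i]: int(tokens[i + 1]) for i in range(0, len(tokens), 2)}
    return int(size), counts
```

The examples in this document are wrapped and truncated for display, so they would need their line breaks rejoined, and their trailing `...` removed, before being passed to a parser like this.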
OPTIONS
    --AromaticityModel *MDLAromaticityModel | TriposAromaticityModel |
    MMFFAromaticityModel | ChemAxonBasicAromaticityModel |
    ChemAxonGeneralAromaticityModel | DaylightAromaticityModel |
    MayaChemToolsAromaticityModel*
        Specify aromaticity model to use during detection of aromaticity.
        Possible values in the current release are: *MDLAromaticityModel,
        TriposAromaticityModel, MMFFAromaticityModel,
        ChemAxonBasicAromaticityModel, ChemAxonGeneralAromaticityModel,
        DaylightAromaticityModel or MayaChemToolsAromaticityModel*. Default
        value: *MayaChemToolsAromaticityModel*.

        The supported aromaticity model names, along with model-specific
        control parameters, are defined in AromaticityModelsData.csv, which
        is distributed with the current release and is available under the
        lib/data directory. The Molecule.pm module retrieves data from this
        file during class instantiation and makes it available to the
        DetectAromaticity method for detecting aromaticity corresponding to
        a specific model.

        This option is ignored when --DetectAromaticity is set to *No*.

    -a, --AtomIdentifierType *AtomicInvariantsAtomTypes | DREIDINGAtomTypes
    | EStateAtomTypes | FunctionalClassAtomTypes | MMFF94AtomTypes |
    SLogPAtomTypes | SYBYLAtomTypes | TPSAAtomTypes | UFFAtomTypes*
        Specify atom identifier type to use during generation of atom path
        strings corresponding to path length fingerprints. Possible values
        in the current release are: *AtomicInvariantsAtomTypes,
        DREIDINGAtomTypes, EStateAtomTypes, FunctionalClassAtomTypes,
        MMFF94AtomTypes, SLogPAtomTypes, SYBYLAtomTypes, TPSAAtomTypes,
        UFFAtomTypes*. Default value: *AtomicInvariantsAtomTypes*.

258 --AtomicInvariantsToUse *"AtomicInvariant1,AtomicInvariant2..."*
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
259 This value is used during *AtomicInvariantsAtomTypes* value of a,
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
260 --AtomIdentifierType option. It's a list of comma separated valid
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
261 atomic invariant atom types.
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
262
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
263 Possible values for atomic invariants are: *AS, X, BO, LBO, SB, DB,
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
264 TB, H, Ar, RA, FC, MN, SM*. Default value: *AS*.
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
265
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
266 The atomic invariants abbreviations correspond to:
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
267
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
268 AS = Atom symbol corresponding to element symbol
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
269
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
270 X<n> = Number of non-hydrogen atom neighbors or heavy atoms
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
271 BO<n> = Sum of bond orders to non-hydrogen atom neighbors or heavy atoms
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
272 LBO<n> = Largest bond order of non-hydrogen atom neighbors or heavy atoms
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
273 SB<n> = Number of single bonds to non-hydrogen atom neighbors or heavy atoms
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
274 DB<n> = Number of double bonds to non-hydrogen atom neighbors or heavy atoms
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
275 TB<n> = Number of triple bonds to non-hydrogen atom neighbors or heavy atoms
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
276 H<n> = Number of implicit and explicit hydrogens for atom
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
277 Ar = Aromatic annotation indicating whether atom is aromatic
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
278 RA = Ring atom annotation indicating whether atom is a ring
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
279 FC<+n/-n> = Formal charge assigned to atom
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
280 MN<n> = Mass number indicating isotope other than most abundant isotope
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
281 SM<n> = Spin multiplicity of atom. Possible values: 1 (singlet), 2 (doublet) or
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
282 3 (triplet)
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
283
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
        Atom type generated by AtomTypes::AtomicInvariantsAtomTypes class
        corresponds to:

            AS.X<n>.BO<n>.LBO<n>.SB<n>.DB<n>.TB<n>.H<n>.Ar.RA.FC<+n/-n>.MN<n>.SM<n>

        Except for AS, which is a required atomic invariant in atom types,
        all other atomic invariants are optional. Atom type specification
        doesn't include atomic invariants with zero or undefined values.

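The composition rule above can be sketched in a few lines of Python. This is an illustration only, not code from AtomTypes::AtomicInvariantsAtomTypes: the function name atom_type_string, the dictionary input and the INVARIANT_ORDER list are hypothetical, and the invariant values are assumed to have been computed elsewhere.

```python
# Invariant order follows the atom type template shown in the text.
INVARIANT_ORDER = ["AS", "X", "BO", "LBO", "SB", "DB", "TB", "H",
                   "Ar", "RA", "FC", "MN", "SM"]
FLAG_INVARIANTS = {"Ar", "RA"}  # written without a numeric suffix


def atom_type_string(values, invariants_to_use):
    """Join the selected invariants of one atom into a dotted atom type."""
    if "AS" not in invariants_to_use or not values.get("AS"):
        raise ValueError("AS (atom symbol) is required")
    parts = []
    for name in INVARIANT_ORDER:
        if name not in invariants_to_use:
            continue
        value = values.get(name)
        if name == "AS":
            parts.append(str(value))
        elif not value:
            continue  # zero or undefined values are left out
        elif name in FLAG_INVARIANTS:
            parts.append(name)
        elif name == "FC":
            parts.append(f"FC{value:+d}")  # sign is always shown
        else:
            parts.append(f"{name}{value}")
    return ".".join(parts)
```

With the values assumed for a benzene carbon (two heavy atom neighbors, bond order sum of 3), selecting AS,X,BO yields C.X2.BO3, which matches the atom types used in the examples in this section.
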
        In addition to usage of abbreviations for specifying atomic
        invariants, the following descriptive words are also allowed:

            X   : NumOfNonHydrogenAtomNeighbors or NumOfHeavyAtomNeighbors
            BO  : SumOfBondOrdersToNonHydrogenAtoms or SumOfBondOrdersToHeavyAtoms
            LBO : LargestBondOrderToNonHydrogenAtoms or LargestBondOrderToHeavyAtoms
            SB  : NumOfSingleBondsToNonHydrogenAtoms or NumOfSingleBondsToHeavyAtoms
            DB  : NumOfDoubleBondsToNonHydrogenAtoms or NumOfDoubleBondsToHeavyAtoms
            TB  : NumOfTripleBondsToNonHydrogenAtoms or NumOfTripleBondsToHeavyAtoms
            H   : NumOfImplicitAndExplicitHydrogens
            Ar  : Aromatic
            RA  : RingAtom
            FC  : FormalCharge
            MN  : MassNumber
            SM  : SpinMultiplicity

        Examples:

        Benzene: Using value of *AS* for --AtomicInvariantsToUse, *Yes* for
        -u, --UseBondSymbols, and *AllAtomPathsWithRings* for -p, --PathMode,
        the atom path strings generated are:

            C C:C C:C:C C:C:C:C C:C:C:C:C C:C:C:C:C:C C:C:C:C:C:C:C

        Using *AS,X,BO* for --AtomicInvariantsToUse generates the following
        atom path strings:

            C.X2.BO3 C.X2.BO3:C.X2.BO3 C.X2.BO3:C.X2.BO3:C.X2.BO3
            C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3
            C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3
            C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3
            C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3

        Urea: Using value of *AS* for --AtomicInvariantsToUse, *Yes* for
        -u, --UseBondSymbols, and *AllAtomPathsWithRings* for -p, --PathMode,
        the atom path strings generated are:

            C N O C=O CN NC=O NCN

        Using *AS,X,BO* for --AtomicInvariantsToUse generates the following
        atom path strings:

            C.X3.BO4 N.X1.BO1 O.X1.BO2 C.X3.BO4=O.X1.BO2
            C.X3.BO4N.X1.BO1 N.X1.BO1C.X3.BO4=O.X1.BO2
            N.X1.BO1C.X3.BO4N.X1.BO1

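The path strings in these examples are atom types joined by bond symbols. A minimal Python sketch of that joining step follows, using the bond order symbols described under -p, --PathMode (none for single, '=' for double, '#' for triple, ':' for 1.5 or aromatic, the bond order value otherwise). The function names are hypothetical and this is not the MayaChemTools implementation.

```python
def bond_symbol(order):
    """Map a bond order to the symbol used in atom path strings."""
    if order == 1:
        return ""          # single bonds get no symbol
    if order == 2:
        return "="
    if order == 3:
        return "#"
    if order in (1.5, "aromatic"):
        return ":"
    return str(order)      # any other bond order is written as its value


def atom_path_string(atom_types, bond_orders, use_bond_symbols=True):
    """Interleave atom types with bond symbols along one atom path."""
    if len(bond_orders) != len(atom_types) - 1:
        raise ValueError("need exactly one bond order between adjacent atoms")
    pieces = [atom_types[0]]
    for atom_type, order in zip(atom_types[1:], bond_orders):
        if use_bond_symbols:
            pieces.append(bond_symbol(order))
        pieces.append(atom_type)
    return "".join(pieces)
```

For the urea path N, C, O with bond orders 1 and 2, this gives NC=O with *AS* atom types and N.X1.BO1C.X3.BO4=O.X1.BO2 with *AS,X,BO* atom types, as listed above.
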
    --FunctionalClassesToUse *"FunctionalClass1,FunctionalClass2..."*
        This value is used during *FunctionalClassAtomTypes* value of -a,
        --AtomIdentifierType option. It's a list of comma separated valid
        functional classes.

        Possible values for atom functional classes are: *Ar, CA, H, HBA,
        HBD, Hal, NI, PI, RA*. Default value [ Ref 24 ]:
        *HBD,HBA,PI,NI,Ar,Hal*.

        The functional class abbreviations correspond to:

            HBD: HydrogenBondDonor
            HBA: HydrogenBondAcceptor
            PI : PositivelyIonizable
            NI : NegativelyIonizable
            Ar : Aromatic
            Hal: Halogen
            H  : Hydrophobic
            RA : RingAtom
            CA : ChainAtom

        Functional class atom type specification for an atom corresponds to:

            Ar.CA.H.HBA.HBD.Hal.NI.PI.RA

        *AtomTypes::FunctionalClassAtomTypes* module is used to assign
        functional class atom types. It uses the following definitions [ Ref
        60-61, Ref 65-66 ]:

            HydrogenBondDonor: NH, NH2, OH
            HydrogenBondAcceptor: N[!H], O
            PositivelyIonizable: +, NH2
            NegativelyIonizable: -, C(=O)OH, S(=O)OH, P(=O)OH

    --BitsOrder *Ascending | Descending*
        Bits order to use during generation of fingerprints bit-vector
        string for *PathLengthBits* value of -m, --mode option. Possible
        values: *Ascending, Descending*. Default: *Ascending*.

        *Ascending* bit order corresponds to the first bit in each byte
        being the lowest bit, as opposed to the highest bit.

        Internally, bits are stored in *Ascending* order using the Perl vec
        function. Regardless of machine order, big-endian or little-endian,
        the vec function always considers the first string byte as the lowest
        byte and the first bit within each byte as the lowest bit.

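The two bit orders can be illustrated with a short Python sketch. It mimics the storage convention described above (bit 0 is the lowest bit of the first byte) and assumes that *Descending* reverses the bits within each byte while keeping the byte order; that reading and the function names are assumptions, not taken from the MayaChemTools source.

```python
def set_bit(buffer, index):
    """Set bit `index`, counting from the lowest bit of the first byte."""
    buffer[index // 8] |= 1 << (index % 8)


def bit_string(buffer, order="Ascending"):
    """Render a byte buffer as a string of 1s and 0s in the given bit order."""
    if order not in ("Ascending", "Descending"):
        raise ValueError("order must be 'Ascending' or 'Descending'")
    chunks = []
    for byte in buffer:
        bits = format(byte, "08b")  # highest bit first
        chunks.append(bits[::-1] if order == "Ascending" else bits)
    return "".join(chunks)
```

Setting bits 0 and 9 in a two-byte buffer gives 1000000001000000 in ascending order and 0000000100000010 in descending order.
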
    -b, --BitStringFormat *BinaryString | HexadecimalString*
        Format of fingerprints bit-vector string data in output SD, FP or
        CSV/TSV text file(s) specified by --output used during
        *PathLengthBits* value of -m, --mode option. Possible values:
        *BinaryString, HexadecimalString*. Default value:
        *HexadecimalString*.

        *BinaryString* corresponds to an ASCII string containing 1s and 0s.
        *HexadecimalString* contains bit values in ASCII hexadecimal format.

        Examples:

            FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng
            th1:MaxLength8;1024;BinaryString;Ascending;001000010011010101011000110
            0100010101011000101001011100110001000010001001101000001001001001001000
            0010110100000111001001000001001010100100100000000011000000101001011100
            0010000001000101010100000100111100110111011011011000000010110111001101
            0101100011000000010001000011000010100011101100001000001000100000000...

            FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng
            th1:MaxLength8;1024;HexadecimalString;Ascending;48caa1315d82d91122b029
            42861c9409a4208182d12015509767bd0867653604481a8b1288000056090583603078
            9cedae54e26596889ab121309800900490515224208421502120a0dd9200509723ae89
            00024181b86c0122821d4e4880c38620dab280824b455404009f082003d52c212b4e6d
            6ea05280140069c780290c43

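The two example strings above appear to encode the same bits: reading the binary string four bits at a time, with the first bit of each group as the lowest bit, reproduces the leading hexadecimal digits. The Python sketch below shows that conversion for the *Ascending* order only. The function name is hypothetical, and the sketch is inferred from the examples rather than taken from the MayaChemTools source.

```python
def ascending_bits_to_hex(bits):
    """Convert an ascending-order bit string to hex, one digit per 4 bits."""
    if len(bits) % 4 != 0:
        raise ValueError("bit string length must be a multiple of 4")
    if set(bits) - {"0", "1"}:
        raise ValueError("bit string may contain only 0s and 1s")
    digits = []
    for start in range(0, len(bits), 4):
        nibble = bits[start:start + 4]
        # In ascending order the first character is the lowest bit.
        value = sum(1 << pos for pos, bit in enumerate(nibble) if bit == "1")
        digits.append(format(value, "x"))
    return "".join(digits)
```

Applied to the first 24 bits of the binary example, 001000010011010101011000, this returns 48caa1, the first six digits of the hexadecimal example.
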
    --CompoundID *DataFieldName or LabelPrefixString*
        This value is --CompoundIDMode specific and indicates how compound
        ID is generated.

        For *DataField* value of --CompoundIDMode option, it corresponds to
        datafield label name whose value is used as compound ID; otherwise,
        it's a prefix string used for generating compound IDs like
        LabelPrefixString<Number>. Default value, *Cmpd*, generates compound
        IDs which look like Cmpd<Number>.

        Examples for *DataField* value of --CompoundIDMode:

            MolID
            ExtReg

        Examples for *LabelPrefix* or *MolNameOrLabelPrefix* value of
        --CompoundIDMode:

            Compound

        The value specified above generates compound IDs which correspond to
        Compound<Number> instead of default value of Cmpd<Number>.

    --CompoundIDLabel *text*
        Specify compound ID column label for FP or CSV/TSV text file(s) used
        during *CompoundID* value of --DataFieldsMode option. Default:
        *CompoundID*.

    --CompoundIDMode *DataField | MolName | LabelPrefix |
    MolNameOrLabelPrefix*
        Specify how to generate compound IDs and write to FP or CSV/TSV text
        file(s) along with generated fingerprints for *FP | text | all*
        values of --output option: use a *SDFile(s)* datafield value; use
        molname line from *SDFile(s)*; generate a sequential ID with
        specific prefix; use combination of both MolName and LabelPrefix
        with usage of LabelPrefix values for empty molname lines.

        Possible values: *DataField | MolName | LabelPrefix |
        MolNameOrLabelPrefix*. Default: *LabelPrefix*.

        For *MolNameOrLabelPrefix* value of --CompoundIDMode, molname line
        in *SDFile(s)* takes precedence over sequential compound IDs
        generated using *LabelPrefix* and only empty molname values are
        replaced with sequential compound IDs.

        This is only used for *CompoundID* value of --DataFieldsMode option.

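The four --CompoundIDMode values can be summarized as a small decision function. The Python sketch below is illustrative only: the function name, its arguments and the dictionary used for SD data fields are hypothetical, and the handling of a missing data field (an empty string) is an assumption.

```python
def compound_id(mode, compound_num, mol_name="", data_fields=None,
                compound_id_value="Cmpd"):
    """Return a compound ID for one molecule under a --CompoundIDMode value.

    compound_id_value plays the role of --CompoundID: a data field label
    for DataField mode, otherwise a label prefix.
    """
    data_fields = data_fields or {}
    sequential_id = f"{compound_id_value}{compound_num}"
    if mode == "DataField":
        return data_fields.get(compound_id_value, "")
    if mode == "MolName":
        return mol_name
    if mode == "LabelPrefix":
        return sequential_id
    if mode == "MolNameOrLabelPrefix":
        # The molname line wins; only empty names fall back to the prefix.
        return mol_name if mol_name.strip() else sequential_id
    raise ValueError(f"unknown CompoundIDMode: {mode}")
```

For example, the default *LabelPrefix* mode gives Cmpd3 for the third compound, and passing Compound as the prefix gives Compound3 instead.
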
    --DataFields *"FieldLabel1,FieldLabel2,... "*
        Comma delimited list of *SDFile(s)* data fields to extract and
        write to CSV/TSV text file(s) along with generated fingerprints for
        *text | all* values of --output option.

        This is only used for *Specify* value of --DataFieldsMode option.

        Examples:

            Extreg
            MolID,CompoundName

    -d, --DataFieldsMode *All | Common | Specify | CompoundID*
        Specify how data fields in *SDFile(s)* are transferred to output
        CSV/TSV text file(s) along with generated fingerprints for *text |
        all* values of --output option: transfer all SD data fields; transfer
        SD data fields common to all compounds; extract specified data
        fields; generate a compound ID using molname line, a compound
        prefix, or a combination of both. Possible values: *All | Common |
        Specify | CompoundID*. Default value: *CompoundID*.

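As a rough illustration of how the four --DataFieldsMode values differ, the Python sketch below picks the data field labels to carry over from a list of per-compound field dictionaries. The function name and data layout are hypothetical, and returning no labels for *CompoundID* (where a generated compound ID column is written instead) is an assumption.

```python
def select_field_labels(mode, records, specified=()):
    """Choose which SD data field labels go into the text output."""
    if mode == "All":
        labels = []
        for record in records:
            for label in record:
                if label not in labels:
                    labels.append(label)  # keep first-seen order
        return labels
    if mode == "Common":
        if not records:
            return []
        return [label for label in records[0]
                if all(label in record for record in records)]
    if mode == "Specify":
        return list(specified)  # labels given through --DataFields
    if mode == "CompoundID":
        return []  # only the generated compound ID is written
    raise ValueError(f"unknown DataFieldsMode: {mode}")
```
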
    --DetectAromaticity *Yes | No*
        Detect aromaticity before generating fingerprints. Possible values:
        *Yes or No*. Default value: *Yes*.

        *No* value of --DetectAromaticity forces usage of atom and bond
        aromaticity values from *SDFile(s)* and skips the step which detects
        and assigns aromaticity.

        *No* value of --DetectAromaticity is only allowed during
        *AtomicInvariantsAtomTypes* value of -a, --AtomIdentifierType
        option; for all other possible values of -a, --AtomIdentifierType,
        it must be *Yes*.

    -f, --Filter *Yes | No*
        Specify whether to check and filter compound data in SDFile(s).
        Possible values: *Yes or No*. Default value: *Yes*.

        By default, compound data is checked before calculating fingerprints
        and compounds containing atom data corresponding to non-element
        symbols or no atom data are ignored.

    --FingerprintsLabel *text*
        SD data label or text file column label to use for fingerprints
        string in output SD or CSV/TSV text file(s) specified by --output.
        Default value: *PathLengthFingerprints*.

    --fold *Yes | No*
        Fold fingerprints to increase bit density during *PathLengthBits*
        value of -m, --mode option. Possible values: *Yes or No*. Default
        value: *No*.

    --FoldedSize *number*
        Size of folded fingerprint during *PathLengthBits* value of -m,
        --mode option. Default value: *256*. Valid values correspond to any
        positive integer which is less than -s, --size and meets the
        criteria for its value.

        Examples:

            128
            512

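Folding is commonly done by OR-ing the upper half of a bit vector onto the lower half until the target size is reached. The Python sketch below shows that common technique; whether MayaChemTools folds in exactly this way is not stated in this section, so treat the function and its halving rule as assumptions.

```python
def fold_bits(bits, folded_size):
    """Fold a list of 0/1 bits down to folded_size by repeated halving."""
    size = len(bits)
    if folded_size <= 0 or folded_size >= size:
        raise ValueError("folded_size must be positive and less than size")
    while size > folded_size:
        if size % 2 != 0:
            raise ValueError("size cannot be halved evenly")
        half = size // 2
        # OR the upper half onto the lower half, position by position.
        bits = [low | high for low, high in zip(bits[:half], bits[half:])]
        size = half
    if size != folded_size:
        raise ValueError("folded_size is not reachable by halving")
    return bits
```

Folding the 8-bit vector 10000010 to 4 bits gives 1010: the number of set bits is unchanged here, but the density rises from 2/8 to 2/4.
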
    -h, --help
        Print this help message.

    -i, --IgnoreHydrogens *Yes | No*
        Ignore hydrogens during fingerprints generation. Possible values:
        *Yes or No*. Default value: *Yes*.

        For *No* value of -i, --IgnoreHydrogens, any explicit hydrogens are
        also used for generation of atom path lengths and fingerprints;
        implicit hydrogens are still ignored.

    -k, --KeepLargestComponent *Yes | No*
        Generate fingerprints for only the largest component in molecule.
        Possible values: *Yes or No*. Default value: *Yes*.

        For molecules containing multiple connected components, fingerprints
        can be generated in two different ways: use all connected components
        or just the largest connected component. By default, all atoms
        outside the largest connected component are deleted before
        generation of fingerprints.

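Selecting the largest connected component is a standard graph traversal. The Python sketch below uses a breadth-first search over a bond list; the function name and the atom-index representation are hypothetical, and measuring "largest" by atom count is an assumption.

```python
from collections import deque


def largest_component(num_atoms, bonds):
    """Return the sorted atom indices of the largest connected component."""
    neighbors = {atom: set() for atom in range(num_atoms)}
    for first, second in bonds:
        neighbors[first].add(second)
        neighbors[second].add(first)

    seen = set()
    best = set()
    for start in range(num_atoms):
        if start in seen:
            continue
        component = {start}
        queue = deque([start])
        while queue:  # breadth-first walk of one component
            atom = queue.popleft()
            for nxt in neighbors[atom]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return sorted(best)
```

For a salt such as sodium acetate, with the four acetate heavy atoms indexed 0 to 3 and the sodium ion as atom 4, only atoms 0 to 3 are kept.
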
    -m, --mode *PathLengthBits | PathLengthCount*
        Specify type of path length fingerprints to generate for molecules
        in *SDFile(s)*. Possible values: *PathLengthBits, PathLengthCount*.
        Default value: *PathLengthBits*.

        For *PathLengthBits* value of -m, --mode option, a fingerprint
        bit-vector string containing zeros and ones is generated and for
        *PathLengthCount* value, a fingerprint vector string corresponding
        to number of atom paths is generated.

    --MinPathLength *number*
        Minimum atom path length to include in fingerprints. Default value:
        *1*. Valid values: positive integers less than --MaxPathLength.
        A path length of 1 corresponds to a path containing only one atom.

    --MaxPathLength *number*
        Maximum atom path length to include in fingerprints. Default value:
        *8*. Valid values: positive integers greater than
        --MinPathLength.

    -n, --NumOfBitsToSetPerPath *number*
        Number of bits to set per path during generation of fingerprints
        bit-vector string for *PathLengthBits* value of -m, --mode option.
        Default value: *1*. Valid values: positive integers.

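One common way to set a fixed number of bits per path is to hash each atom path string, seed a pseudo-random generator with the hash, and draw that many bit positions. The Python sketch below illustrates the idea. The hashing scheme MayaChemTools actually uses is not described in this section, so the SHA-256 seed, the function name and the default size of 1024 (taken from the example fingerprint strings) are all assumptions.

```python
import hashlib
import random


def path_bits(path_strings, size=1024, bits_per_path=1):
    """Set bits_per_path pseudo-random bits for each atom path string."""
    if size <= 0 or bits_per_path <= 0:
        raise ValueError("size and bits_per_path must be positive")
    bits = [0] * size
    for path in path_strings:
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        # Seeding with the hash makes the bit positions repeatable per path.
        rng = random.Random(int.from_bytes(digest[:8], "big"))
        for _ in range(bits_per_path):
            bits[rng.randrange(size)] = 1
    return bits
```

The same path string always maps to the same positions, so two molecules sharing a path share the corresponding bits. Different paths may collide on a position, which is why the number of set bits can be lower than the number of paths times bits_per_path.
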
    --OutDelim *comma | tab | semicolon*
        Delimiter for output CSV/TSV text file(s). Possible values: *comma,
        tab, or semicolon*. Default value: *comma*.

    --output *SD | FP | text | all*
        Type of output files to generate. Possible values: *SD, FP, text, or
        all*. Default value: *text*.

    -o, --overwrite
        Overwrite existing files.

579 -p, --PathMode *AtomPathsWithoutRings | AtomPathsWithRings |
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
580 AllAtomPathsWithoutRings | AllAtomPathsWithRings*
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
581 Specify type of atom paths to use for generating pathlength
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
582 fingerprints for molecules in *SDFile(s)*. Possible
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
583 values:*AtomPathsWithoutRings, AtomPathsWithRings,
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
584 AllAtomPathsWithoutRings, AllAtomPathsWithRings*. Default value:
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
585 *AllAtomPathsWithRings*.
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
586
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
587 For molecules with no rings, first two and last two options are
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
588 equivalent and generate same set of atom paths starting from each
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
589 atom with length between --MinPathLength and --MaxPathLength.
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
590 However, all these four options can result in the same set of final
73ae111cf86f Uploaded
deepakjadmin
parents:
diff changeset
591 atom paths for molecules containing fused, bridged or spiro rings.

For molecules containing rings, atom paths starting from each atom
can be traversed in four different ways:

*AtomPathsWithoutRings* - Atom paths containing no rings and without
sharing of bonds in traversed paths.

*AtomPathsWithRings* - Atom paths containing rings and without any
sharing of bonds in traversed paths.

*AllAtomPathsWithoutRings* - All possible atom paths containing no
rings and without any sharing of bonds in traversed paths.

*AllAtomPathsWithRings* - All possible atom paths containing rings
and with sharing of bonds in traversed paths.

Atom path traversal is terminated at the ring atom.

Based on values specified for -p, --PathMode, --MinPathLength
and --MaxPathLength, all appropriate atom paths are generated for
each atom in the molecule and collected in a list.
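
As a rough illustration of the collection step for a molecule without
rings, the Python sketch below enumerates atom paths starting from each
atom, with path length counted in atoms. The adjacency dictionary, the
function name collect_atom_paths, and the ethanol-like heavy-atom
skeleton are assumptions made for this example; it is not the
MayaChemTools implementation.

```python
def collect_atom_paths(neighbors, min_length, max_length):
    """Return atom paths (tuples of atom indices) whose length in atoms
    lies in [min_length, max_length]; no atom is visited twice."""
    paths = []

    def extend(path):
        if len(path) >= min_length:
            paths.append(tuple(path))
        if len(path) == max_length:
            return
        for nbr in neighbors[path[-1]]:
            if nbr not in path:
                extend(path + [nbr])

    # Start one traversal from every atom in the molecule.
    for start in sorted(neighbors):
        extend([start])
    return paths


# Hypothetical heavy-atom skeleton of ethanol: C0 - C1 - O2
ethanol = {0: [1], 1: [0, 2], 2: [1]}
all_paths = collect_atom_paths(ethanol, min_length=1, max_length=8)
```

Each path appears once per traversal direction at this stage, so the
list still contains pairs such as (0, 1, 2) and (2, 1, 0).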

For each atom path in the filtered atom paths list, an atom path
string is created using the value of -a, --AtomIdentifierType and
the specified values to use for a particular atom identifier type. The
value of -u, --UseBondSymbols controls whether bond order symbols are
used during generation of the atom path string. Atom symbol corresponds
to element symbol and characters used to represent bond order are: *1 -
None; 2 - '='; 3 - '#'; 1.5 or aromatic - ':'; others: bond order
value*. By default, bond symbols are included in atom path strings.
Exclusion of bond symbols in atom path strings results in
fingerprints which correspond purely to atom paths without
considering bonds.
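
The bond symbol convention described above can be sketched in Python as
follows. The function names and the representation of aromatic bonds as
the number 1.5 are assumptions for illustration only.

```python
def bond_symbol(bond_order):
    """Map a bond order to the character used in an atom path string."""
    if bond_order == 1:
        return ""          # single bonds have no symbol
    if bond_order == 2:
        return "="
    if bond_order == 3:
        return "#"
    if bond_order == 1.5:  # aromatic bonds are passed as 1.5 in this sketch
        return ":"
    return str(bond_order)  # any other bond order is written as its value


def atom_path_string(atom_ids, bond_orders, use_bond_symbols=True):
    """Join atom identifiers, optionally interleaving bond symbols."""
    if len(bond_orders) != len(atom_ids) - 1:
        raise ValueError("expected one bond order per consecutive atom pair")
    parts = [atom_ids[0]]
    for atom_id, order in zip(atom_ids[1:], bond_orders):
        if use_bond_symbols:
            parts.append(bond_symbol(order))
        parts.append(atom_id)
    return "".join(parts)
```

For example, atom_path_string(["C", "C", "O"], [1, 2]) gives "CC=O",
while passing use_bond_symbols=False gives "CCO".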

--UseUniquePaths controls whether structurally duplicate atom path
strings are removed from the list.

For *PathLengthBits* value of -m, --mode option, each atom path is
hashed to a 32 bit unsigned integer key using TextUtil::HashCode
function. Using the hash key as a seed for a random number
generator, a random integer value between 0 and --Size is used to
set corresponding bits in the fingerprint bit-vector string. Value
of --NumOfBitsToSetPerPaths option controls the number of times a
random number is generated to set corresponding bits.
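
The hash-then-seed scheme can be sketched in Python as shown below. This
is only an illustration: zlib.crc32 stands in for TextUtil::HashCode and
random.Random stands in for the random number generator, so the bit
positions will not match those produced by the script. The function
names and the default of one bit per path are likewise assumptions.

```python
import random
import zlib


def set_bits_for_path(bits, path_string, bits_per_path=1):
    """Set bits_per_path pseudo-random positions in bits for one path."""
    # Stand-in 32 bit hash key; the script uses TextUtil::HashCode.
    key = zlib.crc32(path_string.encode("utf-8")) & 0xFFFFFFFF
    # Stand-in generator seeded with the hash key.
    rng = random.Random(key)
    for _ in range(bits_per_path):
        bits[rng.randrange(len(bits))] = 1


def path_length_bits(path_strings, size=1024, bits_per_path=1):
    """Build a bit vector (list of 0/1) from a list of atom path strings."""
    bits = [0] * size
    for path_string in path_strings:
        set_bits_for_path(bits, path_string, bits_per_path)
    return bits


fingerprint = path_length_bits(["C", "CC", "CC=O"], size=1024)
```

Because the generator is seeded from the path string alone, the same
path always sets the same bits, which is what makes fingerprints of
different molecules comparable.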

For *PathLengthCount* value of -m, --mode option, the number of
times an atom path appears is tracked and a fingerprints
count-string corresponding to count of atom paths is generated.

For molecules containing rings, combination of -p, --PathMode and
--UseBondSymbols allows generation of up to 8 different types of
atom path length strings:

    AllowSharedBonds  AllowRings  UseBondSymbols

    0                 0           1   - AtomPathsNoCyclesWithBondSymbols
    0                 1           1   - AtomPathsWithCyclesWithBondSymbols

    1                 0           1   - AllAtomPathsNoCyclesWithBondSymbols
    1                 1           1   - AllAtomPathsWithCyclesWithBondSymbols
                                        [ DEFAULT ]

    0                 0           0   - AtomPathsNoCyclesNoBondSymbols
    0                 1           0   - AtomPathsWithCyclesNoBondSymbols

    1                 0           0   - AllAtomPathsNoCyclesNoBondSymbols
    1                 1           0   - AllAtomPathsWithCyclesNoBondSymbols

Default atom path length fingerprints generation for molecules
containing rings, with *AllAtomPathsWithRings* value for -p,
--PathMode, *Yes* value for --UseBondSymbols, *1* value for
--MinPathLength and *8* value for --MaxPathLength, is the most time
consuming. Combinations of other options can substantially speed up
fingerprint generation for molecules containing complex ring
systems.

Additionally, value for option -a, --AtomIdentifierType in
conjunction with corresponding specified values for atom types
changes the nature of atom path length strings and the fingerprints.

-q, --quote *Yes | No*
Put quotes around column values in output CSV/TSV text file(s).
Possible values: *Yes or No*. Default value: *Yes*.

-r, --root *RootName*
New file name is generated using the root: <Root>.<Ext>. Default for
new file names: <SDFileName><PathLengthFP>.<Ext>. The file type
determines <Ext> value. The sdf, fpf, csv, and tsv <Ext> values are
used for SD, FP, comma/semicolon, and tab delimited text files,
respectively. This option is ignored for multiple input files.

-s, --size *number*
Size of fingerprints. Default value: *1024*. Valid values correspond
to any positive integer which satisfies the following criteria:
power of 2, >= 32 and <= 2 ** 32.

Examples:

    256
    512
    2048
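
The size criteria can be checked with a few lines of Python. The
function name is hypothetical, and the check simply restates the three
conditions listed for --size.

```python
def is_valid_fingerprint_size(size):
    """True if size is a power of 2 with 32 <= size <= 2 ** 32."""
    if not isinstance(size, int) or size < 32 or size > 2 ** 32:
        return False
    # A power of 2 has exactly one bit set, so size & (size - 1) is 0.
    return size & (size - 1) == 0
```

With this check, 256, 512 and 2048 are accepted, while values such as
16 (too small) or 1000 (not a power of 2) are rejected.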

-u, --UseBondSymbols *Yes | No*
Specify whether to use bond symbols for atom paths during generation
of atom path strings. Possible values: *Yes or No*. Default value:
*Yes*.

*No* value option for -u, --UseBondSymbols allows the generation of
fingerprints corresponding purely to atoms disregarding all bonds.

--UsePerlCoreRandom *Yes | No*
Specify whether to use Perl CORE::rand or MayaChemTools
MathUtil::random function during random number generation for
setting bits in fingerprints bit-vector strings. Possible values:
*Yes or No*. Default value: *Yes*.

*No* value option for --UsePerlCoreRandom allows the generation of
fingerprints bit-vector strings which are the same across different
platforms.

The random number generator implemented in MayaChemTools is a
variant of the linear congruential generator (LCG) as described by
Miller et al. [ Ref 120 ]. It is also referred to as the Lehmer random
number generator or the Park-Miller random number generator.

Unlike Perl's core random number generator function rand, the random
number generator implemented in MayaChemTools, MathUtil::random,
generates consistent random values across different platforms for a
specific random seed and leads to generation of portable
fingerprints bit-vector strings.
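
A Lehmer (Park-Miller) generator is small enough to sketch in full. The
Python class below uses the classic "minimal standard" constants
(multiplier 16807, modulus 2 ** 31 - 1); whether MathUtil::random uses
exactly these constants is an assumption, and the class and method names
are invented for this example.

```python
class LehmerRandom:
    """Park-Miller 'minimal standard' generator: x -> (A * x) mod M."""

    MODULUS = 2 ** 31 - 1   # Mersenne prime 2147483647
    MULTIPLIER = 16807      # 7 ** 5, assumed here for illustration

    def __init__(self, seed):
        # State must lie in 1 .. MODULUS - 1; a state of 0 would never change.
        self.state = seed % self.MODULUS or 1

    def next_int(self):
        """Advance the state and return it as an integer."""
        self.state = (self.MULTIPLIER * self.state) % self.MODULUS
        return self.state

    def random(self):
        """Return a float in the open interval (0, 1)."""
        return self.next_int() / self.MODULUS
```

Since the recurrence uses only integer arithmetic, the same seed yields
the same sequence on every platform, which is the portability property
described above. A bit index for a fingerprint of a given size could
then be taken as int(generator.random() * size).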

--UseUniquePaths *Yes | No*
Specify whether to use structurally unique atom paths during
generation of atom path strings. Possible values: *Yes or No*.
Default value: *Yes*.

*No* value option for --UseUniquePaths allows usage of all atom
paths generated by -p, --PathMode option value for generation of
atom path strings leading to duplicate path count during
*PathLengthCount* value of -m, --mode option. It doesn't change
fingerprint string generated during *PathLengthBits* value of -m,
--mode.

For example, during *AllAtomPathsWithRings* value of -p, --PathMode
option, benzene has 12 linear paths of length 2 and 12 cyclic paths
of length 7, but only 6 linear paths of length 2 and 1 cyclic path
of length 7 are structurally unique.
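
The benzene counts quoted above can be reproduced with a short Python
sketch. It models the ring as six identical atoms, counts path length in
atoms, and treats two paths as structurally identical when they traverse
the same set of bonds; that definition of uniqueness and all function
names are assumptions made for this illustration.

```python
def ring_graph(size):
    """Adjacency for a simple ring; atom i is bonded to i - 1 and i + 1."""
    return {i: [(i - 1) % size, (i + 1) % size] for i in range(size)}


def paths_of_length(neighbors, length):
    """Atom paths with exactly `length` atoms; only the last atom may
    revisit the first one, which closes a ring."""
    found = []

    def extend(path):
        if len(path) == length:
            found.append(tuple(path))
            return
        closing_step = len(path) + 1 == length and len(path) >= 3
        for nbr in neighbors[path[-1]]:
            if nbr not in path or (closing_step and nbr == path[0]):
                extend(path + [nbr])

    for start in sorted(neighbors):
        extend([start])
    return found


def unique_by_bonds(paths):
    """Collapse paths that traverse the same set of bonds."""
    return {frozenset(frozenset(pair) for pair in zip(p, p[1:])) for p in paths}


benzene = ring_graph(6)
linear = paths_of_length(benzene, 2)
cyclic = [p for p in paths_of_length(benzene, 7) if p[0] == p[-1]]
```

Here linear and cyclic each hold 12 paths (6 start atoms times 2
directions), and unique_by_bonds reduces them to 6 and 1 respectively.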

-v, --VectorStringFormat *IDsAndValuesString | IDsAndValuesPairsString |
ValuesAndIDsString | ValuesAndIDsPairsString*
Format of fingerprints vector string data in output SD, FP or
CSV/TSV text file(s) specified by --output used during
*PathLengthCount* value of -m, --mode option. Possible values:
*IDsAndValuesString | IDsAndValuesPairsString | ValuesAndIDsString |
ValuesAndIDsPairsString*. Default value: *IDsAndValuesString*.

Examples:

    FingerprintsVector;PathLengthCount:AtomicInvariantsAtomTypes:MinLength
    1:MaxLength8;432;NumericalValues;IDsAndValuesPairsString;C.X1.BO1.H3 2
    C.X2.BO2.H2 4 C.X2.BO3.H1 14 C.X3.BO3.H1 3 C.X3.BO4 10 F.X1.BO1 1 N.X
    2.BO2.H1 1 N.X3.BO3 1 O.X1.BO1.H1 3 O.X1.BO2 2 C.X1.BO1.H3C.X3.BO3.H1
    2 C.X2.BO2.H2C.X2.BO2.H2 1 C.X2.BO2.H2C.X3.BO3.H1 4 C.X2.BO2.H2C.X3.BO
    4 1 C.X2.BO2.H2N.X3.BO3 1 C.X2.BO3.H1:C.X2.BO3.H1 10 C.X2.BO3.H1:C....

    FingerprintsVector;PathLengthCount:EStateAtomTypes:MinLength1:MaxLengt
    h8;454;NumericalValues;IDsAndValuesPairsString;aaCH 14 aasC 8 aasN 1 d
    O 2 dssC 2 sCH3 2 sF 1 sOH 3 ssCH2 4 ssNH 1 sssCH 3 aaCH:aaCH 10 aaCH:
    aasC 8 aasC:aasC 3 aasC:aasN 2 aasCaasC 2 aasCdssC 1 aasCsF 1 aasCssNH
    1 aasCsssCH 1 aasNssCH2 1 dO=dssC 2 dssCsOH 1 dssCssCH2 1 dssCssNH 1
    sCH3sssCH 2 sOHsssCH 2 ssCH2ssCH2 1 ssCH2sssCH 4 aaCH:aaCH:aaCH 6 a...
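
Judging from the examples above, the data portion of an
IDsAndValuesPairsString alternates path IDs and counts separated by
spaces. Under that assumption, a minimal Python parser might look like
the following; the function name is hypothetical and the other three
formats are not covered.

```python
def parse_ids_and_values_pairs(data):
    """Parse 'ID1 value1 ID2 value2 ...' into a dict of integer counts."""
    tokens = data.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of whitespace-separated tokens")
    return {path_id: int(value) for path_id, value in zip(tokens[0::2], tokens[1::2])}


# Leading part of the E-state example, with the hard line wraps removed
sample = "aaCH 14 aasC 8 aasN 1 dO 2 dssC 2 sCH3 2 sF 1 sOH 3"
counts = parse_ids_and_values_pairs(sample)
```

Note that the hard line wraps in the printed examples fall in the middle
of tokens, so the lines need to be rejoined before parsing.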

-w, --WorkingDir *DirName*
Location of working directory. Default: current directory.

EXAMPLES
To generate path length fingerprints corresponding to all unique paths
from length 1 through 8 in hexadecimal bit-vector string format of size
1024 and create a SamplePLFPHex.csv file containing sequential compound
IDs along with fingerprints bit-vector strings data, type:

    % PathLengthFingerprints.pl -o -r SamplePLFPHex Sample.sdf

To generate path length fingerprints corresponding to all unique paths
from length 1 through 8 in hexadecimal bit-vector string format of size
1024 and create SamplePLFPHex.sdf, SamplePLFPHex.fpf, and
SamplePLFPHex.csv files containing sequential compound IDs in CSV file
along with fingerprints bit-vector strings data, type:

    % PathLengthFingerprints.pl --output all -o -r SamplePLFPHex Sample.sdf

To generate path length fingerprints corresponding to all unique paths
from length 1 through 8 in binary bit-vector string format of size 2048
and create a SamplePLFPBin.csv file containing sequential compound IDs
along with fingerprints bit-vector strings data, type:

    % PathLengthFingerprints.pl --BitStringFormat BinaryString --size 2048
      -o -r SamplePLFPBin Sample.sdf

To generate path length fingerprints corresponding to count of all
unique paths from length 1 through 8 in IDsAndValuesString format and
create a SamplePLFPCount.csv file containing sequential compound IDs
along with fingerprints vector strings data, type:

    % PathLengthFingerprints.pl -m PathLengthCount -o -r SamplePLFPCount
      Sample.sdf

To generate path length fingerprints corresponding to count of all
unique paths from length 1 through 8 in IDsAndValuesString format using
E-state atom types and create a SamplePLFPCount.csv file containing
sequential compound IDs along with fingerprints vector strings data,
type:

    % PathLengthFingerprints.pl -m PathLengthCount --AtomIdentifierType
      EStateAtomTypes -o -r SamplePLFPCount Sample.sdf

To generate path length fingerprints corresponding to count of all
unique paths from length 1 through 8 in IDsAndValuesString format using
SLogP atom types and create a SamplePLFPCount.csv file containing
sequential compound IDs along with fingerprints vector strings data,
type:

    % PathLengthFingerprints.pl -m PathLengthCount --AtomIdentifierType
      SLogPAtomTypes -o -r SamplePLFPCount Sample.sdf

To generate path length fingerprints corresponding to count of all
unique paths from length 1 through 8 in ValuesAndIDsPairsString format
and create a SamplePLFPCount.csv file containing sequential compound IDs
along with fingerprints vector strings data, type:

    % PathLengthFingerprints.pl -m PathLengthCount --VectorStringFormat
      ValuesAndIDsPairsString -o -r SamplePLFPCount Sample.sdf

To generate path length fingerprints corresponding to count of all
unique paths from length 1 through 8 in IDsAndValuesString format using
AS,X,BO as atomic invariants and create a SamplePLFPCount.csv file
containing sequential compound IDs along with fingerprints vector
strings data, type:

    % PathLengthFingerprints.pl -m PathLengthCount --AtomIdentifierType
      AtomicInvariantsAtomTypes --AtomicInvariantsToUse "AS,X,BO" -o
      -r SamplePLFPCount Sample.sdf

To generate path length fingerprints corresponding to count of all paths
from length 1 through 8 in IDsAndValuesString format and create a
SamplePLFPCount.csv file containing compound IDs from MolName line along
with fingerprints vector strings data, type:

    % PathLengthFingerprints.pl -m PathLengthCount --UseUniquePaths No
      -o --CompoundIDMode MolName -r SamplePLFPCount Sample.sdf

To generate path length fingerprints corresponding to all unique paths
from length 1 through 8 in hexadecimal bit-vector string format of size
512 after folding and create SamplePLFPHex.sdf, SamplePLFPHex.fpf, and
SamplePLFPHex.csv files containing sequential compound IDs along with
fingerprints bit-vector strings data, type:

    % PathLengthFingerprints.pl --output all --Fold Yes --FoldedSize 512
      -o -r SamplePLFPHex Sample.sdf

To generate path length fingerprints corresponding to all unique paths
from length 1 through 8 containing no rings and without sharing of bonds
in hexadecimal bit-vector string format of size 1024 and create a
SamplePLFPHex.csv file containing sequential compound IDs along with
fingerprints bit-vector strings data and all data fields, type:

    % PathLengthFingerprints.pl -p AtomPathsWithoutRings --DataFieldsMode All
      -o -r SamplePLFPHex Sample.sdf

To generate path length fingerprints corresponding to all unique paths
from length 1 through 8 containing rings and without sharing of bonds in
hexadecimal bit-vector string format of size 1024 and create a
SamplePLFPHex.tsv file containing compound IDs derived from combination
of molecule name line and an explicit compound prefix along with
fingerprints bit-vector strings data, type:

    % PathLengthFingerprints.pl -p AtomPathsWithRings --DataFieldsMode
      CompoundID --CompoundIDMode MolNameOrLabelPrefix --CompoundID Cmpd
      --CompoundIDLabel MolID --FingerprintsLabel PathLengthFP --OutDelim Tab
      -r SamplePLFPHex -o Sample.sdf

To generate path length fingerprints corresponding to count of all
unique paths from length 1 through 8 in IDsAndValuesString format and
create a SamplePLFPCount.csv file containing sequential compound IDs
along with fingerprints vector strings data using aromaticity specified
in SD file, type:

    % PathLengthFingerprints.pl -m PathLengthCount --DetectAromaticity No
      -o -r SamplePLFPCount Sample.sdf

To generate path length fingerprints corresponding to all unique paths
from length 2 through 6 in hexadecimal bit-vector string format of size
1024 and create a SamplePLFPHex.csv file containing sequential compound
IDs along with fingerprints bit-vector strings data, type:

    % PathLengthFingerprints.pl --MinPathLength 2 --MaxPathLength 6
      -o -r SamplePLFPHex Sample.sdf

AUTHOR
Manish Sud <msud@san.rr.com>

SEE ALSO
InfoFingerprintsFiles.pl, SimilarityMatricesFingerprints.pl,
AtomNeighborhoodsFingerprints.pl, ExtendedConnectivityFingerprints.pl,
MACCSKeysFingerprints.pl, TopologicalAtomPairsFingerprints.pl,
TopologicalAtomTorsionsFingerprints.pl,
TopologicalPharmacophoreAtomPairsFingerprints.pl,
TopologicalPharmacophoreAtomTripletsFingerprints.pl

COPYRIGHT
Copyright (C) 2015 Manish Sud. All rights reserved.

This file is part of MayaChemTools.

MayaChemTools is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 3 of the License, or (at
your option) any later version.