annotate docs/scripts/txt/PathLengthFingerprints.txt @ 3:90ea638ce878 draft default tip

Uploaded
author deepakjadmin
date Wed, 20 Jan 2016 09:11:59 -0500
parents 2abf0d43254d
children
NAME
    PathLengthFingerprints.pl - Generate atom path length based fingerprints
    for SD files

SYNOPSIS
    PathLengthFingerprints.pl SDFile(s)...

    PathLengthFingerprints.pl [--AromaticityModel *AromaticityModelType*]
    [-a, --AtomIdentifierType *AtomicInvariantsAtomTypes*]
    [--AtomicInvariantsToUse *"AtomicInvariant1,AtomicInvariant2..."*]
    [--FunctionalClassesToUse *"FunctionalClass1,FunctionalClass2..."*]
    [--BitsOrder *Ascending | Descending*] [-b, --BitStringFormat
    *BinaryString | HexadecimalString*] [--CompoundID *DataFieldName or
    LabelPrefixString*] [--CompoundIDLabel *text*] [--CompoundIDMode
    *DataField | MolName | LabelPrefix | MolNameOrLabelPrefix*]
    [--DataFields *"FieldLabel1,FieldLabel2,..."*] [-d, --DataFieldsMode
    *All | Common | Specify | CompoundID*] [--DetectAromaticity *Yes | No*]
    [-f, --Filter *Yes | No*] [--FingerprintsLabel *text*] [--fold *Yes |
    No*] [--FoldedSize *number*] [-h, --help] [-i, --IgnoreHydrogens *Yes |
    No*] [-k, --KeepLargestComponent *Yes | No*] [-m, --mode *PathLengthBits
    | PathLengthCount*] [--MinPathLength *number*] [--MaxPathLength
    *number*] [-n, --NumOfBitsToSetPerPath *number*] [--OutDelim *comma |
    tab | semicolon*] [--output *SD | FP | text | all*] [-q, --quote *Yes |
    No*] [-r, --root *RootName*] [-p, --PathMode *AtomPathsWithoutRings |
    AtomPathsWithRings | AllAtomPathsWithoutRings | AllAtomPathsWithRings*]
    [-s, --size *number*] [-u, --UseBondSymbols *Yes | No*]
    [--UsePerlCoreRandom *Yes | No*] [--UseUniquePaths *Yes | No*]
    [-v, --VectorStringFormat *IDsAndValuesString |
    IDsAndValuesPairsString | ValuesAndIDsString |
    ValuesAndIDsPairsString*] [-w, --WorkingDir *dirname*] SDFile(s)...

DESCRIPTION
    Generate atom path length fingerprints for *SDFile(s)* and create
    appropriate SD, FP or CSV/TSV text file(s) containing fingerprints
    bit-vector or vector strings corresponding to molecular fingerprints.

    Multiple SDFile names are separated by spaces. The valid file extensions
    are *.sdf* and *.sd*. All other file names are ignored. All the SD files
    in the current directory can be specified either by **.sdf* or the
    current directory name.

    The current release of MayaChemTools supports generation of path length
    fingerprints corresponding to the following -a, --AtomIdentifierType
    values:

        AtomicInvariantsAtomTypes, DREIDINGAtomTypes, EStateAtomTypes,
        FunctionalClassAtomTypes, MMFF94AtomTypes, SLogPAtomTypes,
        SYBYLAtomTypes, TPSAAtomTypes, UFFAtomTypes

    Based on the values specified for -p, --PathMode, --MinPathLength and
    --MaxPathLength, all appropriate atom paths are generated for each atom
    in the molecule and collected in a list. The list is then filtered to
    remove any structurally duplicate paths, as indicated by the value of
    the --UseUniquePaths option.

    For each atom path in the filtered atom paths list, an atom path string
    is created using the value of -a, --AtomIdentifierType and the values
    specified for that atom identifier type. The value of -u,
    --UseBondSymbols controls whether bond order symbols are used during
    generation of the atom path string. For each atom path, only the
    lexicographically smaller atom path string is kept.

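    As a rough illustration of the "lexicographically smaller" rule, the
    Python sketch below builds an atom path string in both directions and
    keeps the smaller one. It assumes that the two candidates are the forward
    and reverse readings of the same path. The function name, the input
    layout and the bond symbol table (including "#" for triple bonds) are
    assumptions made for this sketch; they are not taken from MayaChemTools.

```python
# Hypothetical mapping from bond order to symbol. Single bonds carry no
# symbol in the example strings of this document (e.g. "C_2C_3"), while
# double and aromatic bonds appear as "=" and ":". "#" is an assumption.
BOND_SYMBOLS = {1: "", 1.5: ":", 2: "=", 3: "#"}


def atom_path_string(atom_ids, bond_orders, use_bond_symbols=True):
    """Return the lexicographically smaller of the forward and reverse strings."""
    if len(bond_orders) != len(atom_ids) - 1:
        raise ValueError("need exactly one bond between each pair of path atoms")

    def build(ids, orders):
        # Interleave atom identifiers with (optional) bond symbols.
        parts = [ids[0]]
        for order, atom_id in zip(orders, ids[1:]):
            if use_bond_symbols:
                parts.append(BOND_SYMBOLS[order])
            parts.append(atom_id)
        return "".join(parts)

    forward = build(atom_ids, bond_orders)
    reverse = build(atom_ids[::-1], bond_orders[::-1])
    return min(forward, reverse)
```

    For instance, a carbonyl path given as O_2 then C_2 with a double bond
    yields "C_2=O_2" under this sketch, which matches the form seen in the
    DREIDING example strings of this document.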
    For the *PathLengthBits* value of the -m, --mode option, each atom path
    is hashed to a 32 bit unsigned integer key using the TextUtil::HashCode
    function. Using the hash key as a seed for a random number generator, a
    random integer value between 0 and --size is used to set the
    corresponding bit in the fingerprint bit-vector string. The value of
    the --NumOfBitsToSetPerPath option controls the number of times a random
    number is generated to set corresponding bits.

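    The hash-then-seed scheme described for *PathLengthBits* can be sketched
    in Python as follows. This is an illustration only: zlib.crc32 stands in
    for TextUtil::HashCode and Python's random module stands in for the
    random number generator used by the script, so the bits set here will
    not match the script's output. The function and parameter names are
    hypothetical.

```python
import random
import zlib


def path_length_bits(path_strings, size=1024, num_bits_per_path=1):
    """Set bits in a fixed-size bit list from hashed atom path strings."""
    bits = [0] * size
    for path in path_strings:
        # Stand-in hash: a 32 bit unsigned integer key, like the one the
        # text describes, but not the same values as TextUtil::HashCode.
        key = zlib.crc32(path.encode("utf-8")) & 0xFFFFFFFF
        # The hash key seeds the generator, so the same path string always
        # selects the same bit positions.
        rng = random.Random(key)
        for _ in range(num_bits_per_path):
            bits[rng.randrange(size)] = 1  # index in 0 .. size - 1
    return bits
```

    Because the generator is seeded per path, the result is deterministic:
    calling the function twice with the same path strings gives the same bit
    list, and different paths may still collide on the same bit.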
    For the *PathLengthCount* value of the -m, --mode option, the number of
    times an atom path appears is tracked and a fingerprints count-string
    corresponding to the count of atom paths is generated.

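    A minimal sketch of the counting step, assuming the atom path strings
    have already been generated. The function name is hypothetical, and the
    output is simply sorted by path string; the ordering used by the script
    itself may differ.

```python
from collections import Counter


def count_atom_paths(path_strings):
    """Return an 'ID count' pairs string for the given atom path strings."""
    counts = Counter(path_strings)
    # Emit "path count" pairs separated by spaces, sorted by path string.
    return " ".join(f"{path} {counts[path]}" for path in sorted(counts))
```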
    Example of *SD* file containing path length fingerprints string data:

        ... ...
        ... ...
        $$$$
        ... ...
        ... ...
        ... ...
        41 44 0 0 0 0 0 0 0 0999 V2000
        -3.3652 1.4499 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
        ... ...
        2 3 1 0 0 0 0
        ... ...
        M  END
        >  <CmpdID>
        Cmpd1

        >  <PathLengthFingerprints>
        FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLengt
        h1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a49913991a66
        03130b0a19e8051c89184414953800cc2151082844a201042800130860308e8204d4028
        00831048940e44281c00060449a5000ac80c894114e006321264401600846c050164462
        08190410805000304a10205b0100e04c0038ba0fad0209c0ca8b1200012268b61c0026a
        aa0660a11014a011d46

        $$$$
        ... ...
        ... ...

    Example of *FP* file containing path length fingerprints string data:

        #
        # Package = MayaChemTools 7.4
        # ReleaseDate = Oct 21, 2010
        #
        # TimeStamp = Mon Mar 7 15:14:01 2011
        #
        # FingerprintsStringType = FingerprintsBitVector
        #
        # Description = PathLengthBits:AtomicInvariantsAtomTypes:MinLength1:...
        # Size = 1024
        # BitStringFormat = HexadecimalString
        # BitsOrder = Ascending
        #
        Cmpd1 9c8460989ec8a49913991a6603130b0a19e8051c89184414953800cc21510...
        Cmpd2 000000249400840040100042011001001980410c000000001010088001120...
        ... ...
        ... ...

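    Going only by the FP example above, a file of this shape could be read
    with a few lines of Python. The sketch assumes that header lines start
    with "#" and hold "Key = Value" pairs, and that each data line is a
    compound ID followed by whitespace and the fingerprint string. The
    function name is hypothetical and this is not a full reader for the FP
    format.

```python
def parse_fp_text(text):
    """Split FP-style text into a header dict and an ID -> fingerprint dict."""
    header, fingerprints = {}, {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line == "#":
            continue  # blank line or empty comment line
        if line.startswith("#"):
            # Header line of the form "# Key = Value".
            key, sep, value = line[1:].partition("=")
            if sep:
                header[key.strip()] = value.strip()
            continue
        # Data line: compound ID, whitespace, fingerprint string.
        parts = line.split(None, 1)
        if len(parts) == 2:
            fingerprints[parts[0]] = parts[1]
    return header, fingerprints
```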
    Example of CSV *Text* file containing path length fingerprints string
    data:

        "CompoundID","PathLengthFingerprints"
        "Cmpd1","FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes
        :MinLength1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a4
        9913991a6603130b0a19e8051c89184414953800cc2151082844a20104280013086030
        8e8204d402800831048940e44281c00060449a5000ac80c894114e006321264401..."
        ... ...
        ... ...

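    A CSV file with the two quoted columns shown above can be loaded with
    Python's standard csv module. The column names are taken from the
    example header line; the function name is hypothetical, and the sketch
    assumes each fingerprint string sits on a single physical line (the
    example above is wrapped only for display).

```python
import csv
import io


def read_fingerprints_csv(handle):
    """Map CompoundID to its fingerprint string from an open CSV file handle."""
    reader = csv.DictReader(handle)
    return {row["CompoundID"]: row["PathLengthFingerprints"] for row in reader}


# Usage with an in-memory file; a real file would be opened with
# open(path, newline="") instead.
sample = io.StringIO(
    '"CompoundID","PathLengthFingerprints"\n'
    '"Cmpd1","FingerprintsBitVector;Demo;8;HexadecimalString;Ascending;9c"\n'
)
fingerprints = read_fingerprints_csv(sample)
```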
    The current release of MayaChemTools generates the following types of
    path length fingerprints bit-vector and vector strings:

        FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng
        th1:MaxLength8;1024;BinaryString;Ascending;001000010011010101011000110
        0100010101011000101001011100110001000010001001101000001001001001001000
        0010110100000111001001000001001010100100100000000011000000101001011100
        0010000001000101010100000100111100110111011011011000000010110111001101
        0101100011000000010001000011000010100011101100001000001000100000000...

        FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng
        th1:MaxLength8;1024;HexadecimalString;Ascending;48caa1315d82d91122b029
        42861c9409a4208182d12015509767bd0867653604481a8b1288000056090583603078
        9cedae54e26596889ab121309800900490515224208421502120a0dd9200509723ae89
        00024181b86c0122821d4e4880c38620dab280824b455404009f082003d52c212b4e6d
        6ea05280140069c780290c43

        FingerprintsVector;PathLengthCount:AtomicInvariantsAtomTypes:MinLength
        1:MaxLength8;432;NumericalValues;IDsAndValuesPairsString;C.X1.BO1.H3 2
        C.X2.BO2.H2 4 C.X2.BO3.H1 14 C.X3.BO3.H1 3 C.X3.BO4 10 F.X1.BO1 1 N.X
        2.BO2.H1 1 N.X3.BO3 1 O.X1.BO1.H1 3 O.X1.BO2 2 C.X1.BO1.H3C.X3.BO3.H1
        2 C.X2.BO2.H2C.X2.BO2.H2 1 C.X2.BO2.H2C.X3.BO3.H1 4 C.X2.BO2.H2C.X3.BO
        4 1 C.X2.BO2.H2N.X3.BO3 1 C.X2.BO3.H1:C.X2.BO3.H1 10 C.X2.BO3.H1:C....

        FingerprintsVector;PathLengthCount:DREIDINGAtomTypes:MinLength1:MaxLen
        gth8;410;NumericalValues;IDsAndValuesPairsString;C_2 2 C_3 9 C_R 22 F_
        1 N_3 1 N_R 1 O_2 2 O_3 3 C_2=O_2 2 C_2C_3 1 C_2C_R 1 C_2N_3 1 C_2O_3
        1 C_3C_3 7 C_3C_R 1 C_3N_R 1 C_3O_3 2 C_R:C_R 21 C_R:N_R 2 C_RC_R 2 C
        _RF_ 1 C_RN_3 1 C_2C_3C_3 1 C_2C_R:C_R 2 C_2N_3C_R 1 C_3C_2=O_2 1 C_3C
        _2O_3 1 C_3C_3C_3 5 C_3C_3C_R 2 C_3C_3N_R 1 C_3C_3O_3 4 C_3C_R:C_R ...

        FingerprintsVector;PathLengthCount:EStateAtomTypes:MinLength1:MaxLengt
        h8;454;NumericalValues;IDsAndValuesPairsString;aaCH 14 aasC 8 aasN 1 d
        O 2 dssC 2 sCH3 2 sF 1 sOH 3 ssCH2 4 ssNH 1 sssCH 3 aaCH:aaCH 10 aaCH:
        aasC 8 aasC:aasC 3 aasC:aasN 2 aasCaasC 2 aasCdssC 1 aasCsF 1 aasCssNH
        1 aasCsssCH 1 aasNssCH2 1 dO=dssC 2 dssCsOH 1 dssCssCH2 1 dssCssNH 1
        sCH3sssCH 2 sOHsssCH 2 ssCH2ssCH2 1 ssCH2sssCH 4 aaCH:aaCH:aaCH 6 a...

        FingerprintsVector;PathLengthCount:FunctionalClassAtomTypes:MinLength1
        :MaxLength8;404;NumericalValues;IDsAndValuesPairsString;Ar 22 Ar.HBA 1
        HBA 2 HBA.HBD 3 HBD 1 Hal 1 NI 1 None 10 Ar.HBA:Ar 2 Ar.HBANone 1 Ar:
        Ar 21 ArAr 2 ArHBD 1 ArHal 1 ArNone 2 HBA.HBDNI 1 HBA.HBDNone 2 HBA=NI
        1 HBA=None 1 HBDNone 1 NINone 1 NoneNone 7 Ar.HBA:Ar:Ar 2 Ar.HBA:ArAr
        1 Ar.HBA:ArNone 1 Ar.HBANoneNone 1 Ar:Ar.HBA:Ar 1 Ar:Ar.HBANone 2 ...

        FingerprintsVector;PathLengthCount:MMFF94AtomTypes:MinLength1:MaxLengt
        h8;463;NumericalValues;IDsAndValuesPairsString;C5A 2 C5B 2 C=ON 1 CB 1
        8 COO 1 CR 9 F 1 N5 1 NC=O 1 O=CN 1 O=CO 1 OC=O 1 OR 2 C5A:C5B 2 C5A:N
        5 2 C5ACB 1 C5ACR 1 C5B:C5B 1 C5BC=ON 1 C5BCB 1 C=ON=O=CN 1 C=ONNC=O 1
        CB:CB 18 CBF 1 CBNC=O 1 COO=O=CO 1 COOCR 1 COOOC=O 1 CRCR 7 CRN5 1 CR
        OR 2 C5A:C5B:C5B 2 C5A:C5BC=ON 1 C5A:C5BCB 1 C5A:N5:C5A 1 C5A:N5CR ...

        FingerprintsVector;PathLengthCount:SLogPAtomTypes:MinLength1:MaxLength
        8;518;NumericalValues;IDsAndValuesPairsString;C1 5 C10 1 C11 1 C14 1 C
        18 14 C20 4 C21 2 C22 1 C5 2 CS 2 F 1 N11 1 N4 1 O10 1 O2 3 O9 1 C10C1
        1 C10N11 1 C11C1 2 C11C21 1 C14:C18 2 C14F 1 C18:C18 10 C18:C20 4 C18
        :C22 2 C1C5 1 C1CS 4 C20:C20 1 C20:C21 1 C20:N11 1 C20C20 2 C21:C21 1
        C21:N11 1 C21C5 1 C22N4 1 C5=O10 1 C5=O9 1 C5N4 1 C5O2 1 CSO2 2 C10...

        FingerprintsVector;PathLengthCount:SYBYLAtomTypes:MinLength1:MaxLength
        8;412;NumericalValues;IDsAndValuesPairsString;C.2 2 C.3 9 C.ar 22 F 1
        N.am 1 N.ar 1 O.2 1 O.3 2 O.co2 2 C.2=O.2 1 C.2=O.co2 1 C.2C.3 1 C.2C.
        ar 1 C.2N.am 1 C.2O.co2 1 C.3C.3 7 C.3C.ar 1 C.3N.ar 1 C.3O.3 2 C.ar:C
        .ar 21 C.ar:N.ar 2 C.arC.ar 2 C.arF 1 C.arN.am 1 C.2C.3C.3 1 C.2C.ar:C
        .ar 2 C.2N.amC.ar 1 C.3C.2=O.co2 1 C.3C.2O.co2 1 C.3C.3C.3 5 C.3C.3...

        FingerprintsVector;PathLengthCount:TPSAAtomTypes:MinLength1:MaxLength8
        ;331;NumericalValues;IDsAndValuesPairsString;N21 1 N7 1 None 34 O3 2 O
        4 3 N21:None 2 N21None 1 N7None 2 None:None 21 None=O3 2 NoneNone 13 N
        oneO4 3 N21:None:None 2 N21:NoneNone 2 N21NoneNone 1 N7None:None 2 N7N
        one=O3 1 N7NoneNone 1 None:N21:None 1 None:N21None 2 None:None:None 20
        None:NoneNone 12 NoneN7None 1 NoneNone=O3 2 NoneNoneNone 8 NoneNon...

        FingerprintsVector;PathLengthCount:UFFAtomTypes:MinLength1:MaxLength8;
        410;NumericalValues;IDsAndValuesPairsString;C_2 2 C_3 9 C_R 22 F_ 1 N_
        3 1 N_R 1 O_2 2 O_3 3 C_2=O_2 2 C_2C_3 1 C_2C_R 1 C_2N_3 1 C_2O_3 1 C_
        3C_3 7 C_3C_R 1 C_3N_R 1 C_3O_3 2 C_R:C_R 21 C_R:N_R 2 C_RC_R 2 C_RF_
        1 C_RN_3 1 C_2C_3C_3 1 C_2C_R:C_R 2 C_2N_3C_R 1 C_3C_2=O_2 1 C_3C_2O_3
        1 C_3C_3C_3 5 C_3C_3C_R 2 C_3C_3N_R 1 C_3C_3O_3 4 C_3C_R:C_R 1 C_3...

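    The example strings above share a six-field, semicolon-separated layout.
    The Python sketch below splits such a string into its parts. The field
    meanings (type; description; size or number of values; format fields;
    data) are inferred from the examples rather than from a format
    specification, and the function name and dictionary keys are
    hypothetical. Only the *IDsAndValuesPairsString* vector format is
    handled. The examples in this document are wrapped and truncated for
    display, so they need to be joined and complete before parsing.

```python
def parse_fingerprints_string(fp_string):
    """Split a fingerprints string into a dict of its fields."""
    fields = fp_string.split(";", 5)
    if len(fields) != 6:
        raise ValueError("expected six semicolon-separated fields")
    kind, description, size, field4, field5, data = fields

    if kind == "FingerprintsBitVector":
        return {
            "type": kind,
            "description": description,
            "size": int(size),
            "bit_string_format": field4,  # BinaryString or HexadecimalString
            "bits_order": field5,         # Ascending or Descending
            "bit_string": data,
        }

    if kind == "FingerprintsVector":
        if field5 != "IDsAndValuesPairsString":
            raise ValueError("this sketch only handles IDsAndValuesPairsString")
        tokens = data.split()
        if len(tokens) % 2:
            raise ValueError("IDs and values must come in pairs")
        # Tokens alternate between an atom path ID and its count.
        values = {tokens[i]: int(tokens[i + 1]) for i in range(0, len(tokens), 2)}
        return {
            "type": kind,
            "description": description,
            "num_values": int(size),
            "values_type": field4,
            "vector_string_format": field5,
            "values": values,
        }

    raise ValueError(f"unknown fingerprints string type: {kind}")
```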
OPTIONS
    --AromaticityModel *MDLAromaticityModel | TriposAromaticityModel |
    MMFFAromaticityModel | ChemAxonBasicAromaticityModel |
    ChemAxonGeneralAromaticityModel | DaylightAromaticityModel |
    MayaChemToolsAromaticityModel*
        Specify aromaticity model to use during detection of aromaticity.
        Possible values in the current release are: *MDLAromaticityModel,
        TriposAromaticityModel, MMFFAromaticityModel,
        ChemAxonBasicAromaticityModel, ChemAxonGeneralAromaticityModel,
        DaylightAromaticityModel or MayaChemToolsAromaticityModel*. Default
        value: *MayaChemToolsAromaticityModel*.

        The supported aromaticity model names along with model specific
        control parameters are defined in AromaticityModelsData.csv, which
        is distributed with the current release and is available under the
        lib/data directory. The Molecule.pm module retrieves data from this
        file during class instantiation and makes it available to the method
        DetectAromaticity for detecting aromaticity corresponding to a
        specific model.

        This option is ignored when --DetectAromaticity is set to *No*.

    -a, --AtomIdentifierType *AtomicInvariantsAtomTypes | DREIDINGAtomTypes
    | EStateAtomTypes | FunctionalClassAtomTypes | MMFF94AtomTypes |
    SLogPAtomTypes | SYBYLAtomTypes | TPSAAtomTypes | UFFAtomTypes*
        Specify atom identifier type to use during generation of atom path
        strings corresponding to path length fingerprints. Possible values
        in the current release are: *AtomicInvariantsAtomTypes,
        DREIDINGAtomTypes, EStateAtomTypes, FunctionalClassAtomTypes,
        MMFF94AtomTypes, SLogPAtomTypes, SYBYLAtomTypes, TPSAAtomTypes,
        UFFAtomTypes*. Default value: *AtomicInvariantsAtomTypes*.

    --AtomicInvariantsToUse *"AtomicInvariant1,AtomicInvariant2..."*
        This value is used when -a, --AtomIdentifierType is set to
        *AtomicInvariantsAtomTypes*. It's a comma-separated list of valid
        atomic invariants.

        Possible values for atomic invariants are: *AS, X, BO, LBO, SB, DB,
        TB, H, Ar, RA, FC, MN, SM*. Default value: *AS*.

        The atomic invariants abbreviations correspond to:

            AS = Atom symbol corresponding to element symbol

            X<n>   = Number of non-hydrogen atom neighbors or heavy atoms
            BO<n>  = Sum of bond orders to non-hydrogen atom neighbors or heavy atoms
            LBO<n> = Largest bond order of non-hydrogen atom neighbors or heavy atoms
            SB<n>  = Number of single bonds to non-hydrogen atom neighbors or heavy atoms
            DB<n>  = Number of double bonds to non-hydrogen atom neighbors or heavy atoms
            TB<n>  = Number of triple bonds to non-hydrogen atom neighbors or heavy atoms
            H<n>   = Number of implicit and explicit hydrogens for atom
            Ar     = Aromatic annotation indicating whether atom is aromatic
            RA     = Ring atom annotation indicating whether atom is in a ring
            FC<+n/-n> = Formal charge assigned to atom
            MN<n>  = Mass number indicating isotope other than most abundant isotope
            SM<n>  = Spin multiplicity of atom. Possible values: 1 (singlet), 2 (doublet) or
                     3 (triplet)

        The atom type generated by the AtomTypes::AtomicInvariantsAtomTypes
        class corresponds to:

            AS.X<n>.BO<n>.LBO<n>.SB<n>.DB<n>.TB<n>.H<n>.Ar.RA.FC<+n/-n>.MN<n>.SM<n>

        Except for AS, which is a required atomic invariant in atom types,
        all other atomic invariants are optional. The atom type
        specification doesn't include atomic invariants with zero or
        undefined values.

2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
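        The composition rule above can be illustrated with a short Python
        sketch. It is not the AtomTypes::AtomicInvariantsAtomTypes
        implementation; the helper name atom_type and the dictionary layout
        are assumptions made for illustration only. It joins the selected
        invariants in the documented order and skips zero or undefined
        values.

```python
# Order of atomic invariants in the atom type string, as documented above.
INVARIANT_ORDER = ["AS", "X", "BO", "LBO", "SB", "DB", "TB",
                   "H", "Ar", "RA", "FC", "MN", "SM"]


def atom_type(invariants, to_use=("AS", "X", "BO")):
    """Build an atom type string such as 'C.X2.BO3' (hypothetical helper)."""
    parts = []
    for name in INVARIANT_ORDER:
        if name not in to_use:
            continue
        if name == "AS":
            # AS is required and written as the bare element symbol.
            parts.append(str(invariants["AS"]))
            continue
        value = invariants.get(name)
        if not value:
            # Zero or undefined invariants are left out of the atom type.
            continue
        if name in ("Ar", "RA"):
            parts.append(name)                 # annotation without a number
        elif name == "FC":
            parts.append(f"FC{value:+d}")      # signed value, e.g. FC+1
        else:
            parts.append(f"{name}{value}")
    return ".".join(parts)


# Aromatic carbon in benzene, using AS,X,BO.
print(atom_type({"AS": "C", "X": 2, "BO": 3}))
```

        With *AS,X,BO* selected, the benzene carbon above yields C.X2.BO3,
        which is the atom type used in the benzene example of this option.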
        In addition to usage of abbreviations for specifying atomic
        invariants, the following descriptive words are also allowed:

            X   : NumOfNonHydrogenAtomNeighbors or NumOfHeavyAtomNeighbors
            BO  : SumOfBondOrdersToNonHydrogenAtoms or SumOfBondOrdersToHeavyAtoms
            LBO : LargestBondOrderToNonHydrogenAtoms or LargestBondOrderToHeavyAtoms
            SB  : NumOfSingleBondsToNonHydrogenAtoms or NumOfSingleBondsToHeavyAtoms
            DB  : NumOfDoubleBondsToNonHydrogenAtoms or NumOfDoubleBondsToHeavyAtoms
            TB  : NumOfTripleBondsToNonHydrogenAtoms or NumOfTripleBondsToHeavyAtoms
            H   : NumOfImplicitAndExplicitHydrogens
            Ar  : Aromatic
            RA  : RingAtom
            FC  : FormalCharge
            MN  : MassNumber
            SM  : SpinMultiplicity

        Examples:

        Benzene: Using value of *AS* for --AtomicInvariantsToUse, *Yes* for
        -u, --UseBondSymbols, and *AllAtomPathsWithRings* for -p, --PathMode,
        atom path strings generated are:

            C C:C C:C:C C:C:C:C C:C:C:C:C C:C:C:C:C:C C:C:C:C:C:C:C

        And using *AS,X,BO* for --AtomicInvariantsToUse generates the
        following atom path strings:

            C.X2.BO3 C.X2.BO3:C.X2.BO3 C.X2.BO3:C.X2.BO3:C.X2.BO3
            C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3
            C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3
            C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3
            C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3:C.X2.BO3

        Urea: Using value of *AS* for --AtomicInvariantsToUse, *Yes* for
        -u, --UseBondSymbols, and *AllAtomPathsWithRings* for -p, --PathMode,
        atom path strings are:

            C N O C=O CN NC=O NCN

        And using *AS,X,BO* for --AtomicInvariantsToUse generates the
        following atom path strings:

            C.X3.BO4 N.X1.BO1 O.X1.BO2 C.X3.BO4=O.X1.BO2
            C.X3.BO4N.X1.BO1 N.X1.BO1C.X3.BO4=O.X1.BO2
            N.X1.BO1C.X3.BO4N.X1.BO1

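        The urea strings above can be reproduced with a small Python
        sketch. It is an illustration under stated assumptions, not the
        script's algorithm: the atom numbering, the helper names, and the
        rule of keeping the alphabetically smaller of the two reading
        directions of a path are all assumptions. It handles acyclic
        molecules only and does not attempt the ring closure seen in the
        benzene example.

```python
# Bond order symbols: single bonds have no symbol; other orders fall back
# to the bond order value itself.
BOND_SYMBOLS = {1: "", 2: "=", 3: "#", 1.5: ":"}


def render(labels, path, orders):
    """Join atom labels with bond symbols along one path."""
    text = labels[path[0]]
    for order, atom in zip(orders, path[1:]):
        text += BOND_SYMBOLS.get(order, str(order)) + labels[atom]
    return text


def path_strings(labels, bonds, min_len=1, max_len=8):
    """Collect unique path strings for an acyclic molecule (sketch only)."""
    neighbors = {}
    for (a, b), order in bonds.items():
        neighbors.setdefault(a, []).append((b, order))
        neighbors.setdefault(b, []).append((a, order))

    found = set()

    def walk(path, orders):
        if len(path) >= min_len:
            forward = render(labels, path, orders)
            backward = render(labels, path[::-1], orders[::-1])
            # Assumption: keep one direction so each path is counted once.
            found.add(min(forward, backward))
        if len(path) == max_len:
            return
        for nxt, order in neighbors.get(path[-1], []):
            if nxt not in path:
                walk(path + [nxt], orders + [order])

    for start in labels:
        walk([start], [])
    return found


# Urea heavy atoms (hypothetical numbering): 0 = C, 1 = O, 2 and 3 = N.
urea_bonds = {(0, 1): 2, (0, 2): 1, (0, 3): 1}
urea_as = {0: "C", 1: "O", 2: "N", 3: "N"}
print(sorted(path_strings(urea_as, urea_bonds)))
```

        Passing labels built from *AS,X,BO* (for example C.X3.BO4 for the
        carbon) in place of the element symbols gives the second set of
        urea strings.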
    --FunctionalClassesToUse *"FunctionalClass1,FunctionalClass2..."*
        This value is used during *FunctionalClassAtomTypes* value of -a,
        --AtomIdentifierType option. It's a list of comma separated valid
        functional classes.

        Possible values for atom functional classes are: *Ar, CA, H, HBA,
        HBD, Hal, NI, PI, RA*. Default value [ Ref 24 ]:
        *HBD,HBA,PI,NI,Ar,Hal*.

        The functional class abbreviations correspond to:

            HBD : HydrogenBondDonor
            HBA : HydrogenBondAcceptor
            PI  : PositivelyIonizable
            NI  : NegativelyIonizable
            Ar  : Aromatic
            Hal : Halogen
            H   : Hydrophobic
            RA  : RingAtom
            CA  : ChainAtom

        Functional class atom type specification for an atom corresponds to:

            Ar.CA.H.HBA.HBD.Hal.NI.PI.RA

        *AtomTypes::FunctionalClassAtomTypes* module is used to assign
        functional class atom types. It uses the following definitions
        [ Ref 60-61, Ref 65-66 ]:

            HydrogenBondDonor: NH, NH2, OH
            HydrogenBondAcceptor: N[!H], O
            PositivelyIonizable: +, NH2
            NegativelyIonizable: -, C(=O)OH, S(=O)OH, P(=O)OH

    --BitsOrder *Ascending | Descending*
        Bits order to use during generation of fingerprints bit-vector
        string for *PathLengthBits* value of -m, --mode option. Possible
        values: *Ascending, Descending*. Default: *Ascending*.

        *Ascending* bit order corresponds to the first bit in each byte
        being the lowest bit, as opposed to the highest bit.

        Internally, bits are stored in *Ascending* order using the Perl vec
        function. Regardless of machine byte order, big-endian or
        little-endian, the vec function always considers the first string
        byte as the lowest byte and the first bit within each byte as the
        lowest bit.

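        The difference between the two bit orders can be shown with a
        short Python sketch. The function name bits_to_string is
        hypothetical, and the sketch only mirrors the description above:
        bytes keep their position, and the order of the eight bits inside
        each byte is what changes.

```python
def bits_to_string(data: bytes, order: str = "Ascending") -> str:
    """Render bytes as a string of 0s and 1s in the requested bit order."""
    if order not in ("Ascending", "Descending"):
        raise ValueError("order must be Ascending or Descending")
    pieces = []
    for byte in data:
        highest_first = format(byte, "08b")   # highest bit written first
        if order == "Ascending":
            # Lowest bit of each byte comes first.
            pieces.append(highest_first[::-1])
        else:
            pieces.append(highest_first)
    return "".join(pieces)


# A single byte with only its lowest bit set.
print(bits_to_string(b"\x01", "Ascending"))   # 10000000
print(bits_to_string(b"\x01", "Descending"))  # 00000001
```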
    -b, --BitStringFormat *BinaryString | HexadecimalString*
        Format of fingerprints bit-vector string data in output SD, FP or
        CSV/TSV text file(s) specified by --output used during
        *PathLengthBits* value of -m, --mode option. Possible values:
        *BinaryString, HexadecimalString*. Default value:
        *HexadecimalString*.

        *BinaryString* corresponds to an ASCII string containing 1s and 0s.
        *HexadecimalString* contains bit values in ASCII hexadecimal format.

        Examples:

            FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng
            th1:MaxLength8;1024;BinaryString;Ascending;001000010011010101011000110
            0100010101011000101001011100110001000010001001101000001001001001001000
            0010110100000111001001000001001010100100100000000011000000101001011100
            0010000001000101010100000100111100110111011011011000000010110111001101
            0101100011000000010001000011000010100011101100001000001000100000000...

            FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng
            th1:MaxLength8;1024;HexadecimalString;Ascending;48caa1315d82d91122b029
            42861c9409a4208182d12015509767bd0867653604481a8b1288000056090583603078
            9cedae54e26596889ab121309800900490515224208421502120a0dd9200509723ae89
            00024181b86c0122821d4e4880c38620dab280824b455404009f082003d52c212b4e6d
            6ea05280140069c780290c43

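        The relation between the two formats in *Ascending* order can be
        sketched in Python. This is an inference from the example strings
        above rather than a statement about the script's code: reading each
        group of four bits with its first bit as the lowest bit turns the
        first 32 bits of the *BinaryString* example into the first eight
        digits of the *HexadecimalString* example. The function name is
        hypothetical.

```python
def ascending_binary_to_hex(bits: str) -> str:
    """Convert an Ascending-order binary string to hexadecimal digits."""
    digits = []
    for start in range(0, len(bits), 4):
        # Pad a short final group with zeros (an assumption of this sketch).
        group = bits[start:start + 4].ljust(4, "0")
        # The first character of the group is the lowest bit.
        value = sum(1 << pos for pos, bit in enumerate(group) if bit == "1")
        digits.append(format(value, "x"))
    return "".join(digits)


# First 32 bits of the BinaryString example above.
print(ascending_binary_to_hex("00100001001101010101100011001000"))
```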
    --CompoundID *DataFieldName or LabelPrefixString*
        This value is --CompoundIDMode specific and indicates how compound
        ID is generated.

        For *DataField* value of --CompoundIDMode option, it corresponds to
        data field label name whose value is used as compound ID; otherwise,
        it's a prefix string used for generating compound IDs like
        LabelPrefixString<Number>. Default value, *Cmpd*, generates compound
        IDs which look like Cmpd<Number>.

        Examples for *DataField* value of --CompoundIDMode:

            MolID
            ExtReg

        Examples for *LabelPrefix* or *MolNameOrLabelPrefix* value of
        --CompoundIDMode:

            Compound

        The value specified above generates compound IDs which correspond to
        Compound<Number> instead of default value of Cmpd<Number>.

    --CompoundIDLabel *text*
        Specify compound ID column label for FP or CSV/TSV text file(s) used
        during *CompoundID* value of --DataFieldsMode option. Default:
        *CompoundID*.

    --CompoundIDMode *DataField | MolName | LabelPrefix |
    MolNameOrLabelPrefix*
        Specify how to generate compound IDs and write to FP or CSV/TSV text
        file(s) along with generated fingerprints for *FP | text | all*
        values of --output option: use a *SDFile(s)* data field value; use
        molname line from *SDFile(s)*; generate a sequential ID with
        specific prefix; use combination of both MolName and LabelPrefix
        with usage of LabelPrefix values for empty molname lines.

        Possible values: *DataField | MolName | LabelPrefix |
        MolNameOrLabelPrefix*. Default: *LabelPrefix*.

        For *MolNameOrLabelPrefix* value of --CompoundIDMode, molname line
        in *SDFile(s)* takes precedence over sequential compound IDs
        generated using *LabelPrefix* and only empty molname values are
        replaced with sequential compound IDs.

        This is only used for *CompoundID* value of --DataFieldsMode option.

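        The four modes can be summarized with a small Python sketch. The
        function compound_id and its parameters are hypothetical and only
        restate the rules described above; what happens when the requested
        data field is missing is an assumption of this sketch.

```python
def compound_id(mode, number, mol_name="", data_fields=None, value="Cmpd"):
    """Pick a compound ID following the documented --CompoundIDMode rules.

    value plays the role of --CompoundID: a data field label for DataField
    mode, otherwise the label prefix.
    """
    data_fields = data_fields or {}
    if mode == "DataField":
        # Assumption: a missing data field yields an empty ID.
        return data_fields.get(value, "")
    if mode == "MolName":
        return mol_name
    if mode == "LabelPrefix":
        return f"{value}{number}"
    if mode == "MolNameOrLabelPrefix":
        # Molname wins; only empty molname lines get a sequential ID.
        return mol_name if mol_name.strip() else f"{value}{number}"
    raise ValueError(f"unknown CompoundIDMode: {mode}")


print(compound_id("LabelPrefix", 1))                               # Cmpd1
print(compound_id("MolNameOrLabelPrefix", 2, mol_name="Aspirin"))  # Aspirin
print(compound_id("MolNameOrLabelPrefix", 3, value="Compound"))    # Compound3
```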
    --DataFields *"FieldLabel1,FieldLabel2,... "*
        Comma delimited list of *SDFile(s)* data fields to extract and
        write to CSV/TSV text file(s) along with generated fingerprints for
        *text | all* values of --output option.

        This is only used for *Specify* value of --DataFieldsMode option.

        Examples:

            Extreg
            MolID,CompoundName

    -d, --DataFieldsMode *All | Common | Specify | CompoundID*
        Specify how data fields in *SDFile(s)* are transferred to output
        CSV/TSV text file(s) along with generated fingerprints for *text |
        all* values of --output option: transfer all SD data fields;
        transfer SD data fields common to all compounds; extract specified
        data fields; generate a compound ID using molname line, a compound
        prefix, or a combination of both. Possible values: *All | Common |
        Specify | CompoundID*. Default value: *CompoundID*.

    --DetectAromaticity *Yes | No*
        Detect aromaticity before generating fingerprints. Possible values:
        *Yes or No*. Default value: *Yes*.

        *No* value of --DetectAromaticity forces usage of atom and bond
        aromaticity values from *SDFile(s)* and skips the step which detects
        and assigns aromaticity.

        *No* value of --DetectAromaticity is only allowed during
        *AtomicInvariantsAtomTypes* value of -a, --AtomIdentifierType
        option; for all other possible values of -a, --AtomIdentifierType,
        it must be *Yes*.

    -f, --Filter *Yes | No*
        Specify whether to check and filter compound data in SDFile(s).
        Possible values: *Yes or No*. Default value: *Yes*.

        By default, compound data is checked before calculating fingerprints
        and compounds containing atom data corresponding to non-element
        symbols or no atom data are ignored.

    --FingerprintsLabel *text*
        SD data label or text file column label to use for fingerprints
        string in output SD or CSV/TSV text file(s) specified by --output.
        Default value: *PathLenghFingerprints*.

    --fold *Yes | No*
        Fold fingerprints to increase bit density during *PathLengthBits*
        value of -m, --mode option. Possible values: *Yes or No*. Default
        value: *No*.

    --FoldedSize *number*
        Size of folded fingerprint during *PathLengthBits* value of -m,
        --mode option. Default value: *256*. Valid values correspond to any
        positive integer which is less than -s, --size and meets the
        criteria for its value.

        Examples:

            128
            512

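        Why folding increases bit density can be seen in a short Python
        sketch. It shows one common folding scheme, in which every bit at
        position i is OR-ed into position i modulo the folded size; whether
        the script folds in exactly this way is not stated here, so treat
        the function fold and its size check as assumptions.

```python
def fold(bits: str, folded_size: int) -> str:
    """Fold a binary string to folded_size bits by OR-ing wrapped positions."""
    if folded_size <= 0 or folded_size >= len(bits):
        raise ValueError("folded size must be positive and less than size")
    if len(bits) % folded_size:
        # Assumption of this sketch: the size is a multiple of folded size.
        raise ValueError("size must be a multiple of folded size")
    folded = ["0"] * folded_size
    for position, bit in enumerate(bits):
        if bit == "1":
            folded[position % folded_size] = "1"
    return "".join(folded)


original = "1000000100000000"        # 2 of 16 bits set
print(fold(original, 8))             # 10000001, now 2 of 8 bits set
```

        The number of set bits can only stay the same or drop when two set
        bits collide, while the length shrinks, so the fraction of set bits
        generally goes up.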
    -h, --help
        Print this help message.

    -i, --IgnoreHydrogens *Yes | No*
        Ignore hydrogens during fingerprints generation. Possible values:
        *Yes or No*. Default value: *Yes*.

        For *No* value of -i, --IgnoreHydrogens, any explicit hydrogens are
        also used for generation of atom path lengths and fingerprints;
        implicit hydrogens are still ignored.

    -k, --KeepLargestComponent *Yes | No*
        Generate fingerprints for only the largest component in molecule.
        Possible values: *Yes or No*. Default value: *Yes*.

        For molecules containing multiple connected components, fingerprints
        can be generated in two different ways: use all connected components
        or just the largest connected component. By default, all atoms
        except for the largest connected component are deleted before
        generation of fingerprints.

    -m, --mode *PathLengthBits | PathLengthCount*
        Specify type of path length fingerprints to generate for molecules
        in *SDFile(s)*. Possible values: *PathLengthBits, PathLengthCount*.
        Default value: *PathLengthBits*.

        For *PathLengthBits* value of -m, --mode option, a fingerprint
        bit-vector string containing zeros and ones is generated; for
        *PathLengthCount* value, a fingerprint vector string corresponding
        to number of atom paths is generated.

    --MinPathLength *number*
        Minimum atom path length to include in fingerprints. Default value:
        *1*. Valid values: positive integers less than --MaxPathLength.
        A path length of 1 corresponds to a path containing only one atom.

    --MaxPathLength *number*
        Maximum atom path length to include in fingerprints. Default value:
        *8*. Valid values: positive integers greater than --MinPathLength.

    -n, --NumOfBitsToSetPerPath *number*
        Number of bits to set per path during generation of fingerprints
        bit-vector string for *PathLengthBits* value of -m, --mode option.
        Default value: *1*. Valid values: positive integers.

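        A rough idea of how several bits per path might be set is given by
        the Python sketch below. The hashing and random number steps are
        assumptions chosen for illustration (zlib.crc32 as the hash and
        Python's random module as the generator); they are not taken from
        the script, and the bit positions will not match its output. The
        size of 1024 matches the size shown in the fingerprint examples
        for -b, --BitStringFormat.

```python
import random
import zlib


def bits_for_path(path_string, size=1024, num_bits=1):
    """Return the bit positions set for one atom path string (sketch)."""
    # Assumption: seed a generator from a hash of the path string so the
    # same path always maps to the same positions.
    seed = zlib.crc32(path_string.encode("utf-8"))
    generator = random.Random(seed)
    return {generator.randrange(size) for _ in range(num_bits)}


def fingerprint_bits(path_strings, size=1024, num_bits=1):
    """Set num_bits positions per path in a bit list of the given size."""
    bits = [0] * size
    for path_string in path_strings:
        for position in bits_for_path(path_string, size, num_bits):
            bits[position] = 1
    return bits


urea_paths = ["C", "N", "O", "C=O", "CN", "NC=O", "NCN"]
fingerprint = fingerprint_bits(urea_paths, size=1024, num_bits=2)
print(sum(fingerprint), "bits set out of", len(fingerprint))
```

        Setting more bits per path raises the bit density of the
        fingerprint, at the cost of more collisions between different
        paths.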
    --OutDelim *comma | tab | semicolon*
        Delimiter for output CSV/TSV text file(s). Possible values: *comma,
        tab, or semicolon*. Default value: *comma*.

    --output *SD | FP | text | all*
        Type of output files to generate. Possible values: *SD, FP, text, or
        all*. Default value: *text*.

    -o, --overwrite
        Overwrite existing files.

579 -p, --PathMode *AtomPathsWithoutRings | AtomPathsWithRings |
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
580 AllAtomPathsWithoutRings | AllAtomPathsWithRings*
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
581 Specify type of atom paths to use for generating pathlength
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
582 fingerprints for molecules in *SDFile(s)*. Possible
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
583 values:*AtomPathsWithoutRings, AtomPathsWithRings,
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
584 AllAtomPathsWithoutRings, AllAtomPathsWithRings*. Default value:
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
585 *AllAtomPathsWithRings*.
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
586
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
587 For molecules with no rings, first two and last two options are
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
588 equivalent and generate same set of atom paths starting from each
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
589 atom with length between --MinPathLength and --MaxPathLength.
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
590 However, all these four options can result in the same set of final
2abf0d43254d Uploaded
deepakjadmin
parents:
diff changeset
591 atom paths for molecules containing fused, bridged or spiro rings.

        For molecules containing rings, atom paths starting from each atom
        can be traversed in four different ways:

        *AtomPathsWithoutRings* - Atom paths containing no rings and without
        sharing of bonds in traversed paths.

        *AtomPathsWithRings* - Atom paths containing rings and without any
        sharing of bonds in traversed paths.

        *AllAtomPathsWithoutRings* - All possible atom paths containing no
        rings and without any sharing of bonds in traversed paths.

        *AllAtomPathsWithRings* - All possible atom paths containing rings
        and with sharing of bonds in traversed paths.

        Atom path traversal is terminated at the ring atom.

        Based on values specified for -p, --PathMode, --MinPathLength
        and --MaxPathLength, all appropriate atom paths are generated for
        each atom in the molecule and collected in a list.

        For each atom path in the filtered atom paths list, an atom path
        string is created using the value of -a, --AtomIdentifierType and
        the values specified for that atom identifier type. The value of
        -u, --UseBondSymbols controls whether bond order symbols are used
        during generation of the atom path string. Atom symbol corresponds
        to element symbol and characters used to represent bond order are:
        *1 - None; 2 - '='; 3 - '#'; 1.5 or aromatic - ':'; others: bond
        order value*. By default, bond symbols are included in atom path
        strings. Exclusion of bond symbols in atom path strings results in
        fingerprints which correspond purely to atom paths without
        considering bonds.
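
The string-building step above can be sketched in Python. This is an illustration only: `bond_symbol` and `path_string` are hypothetical helpers, not MayaChemTools functions, and plain element symbols stand in for whatever -a, --AtomIdentifierType would produce. Only the bond symbol table is taken from the text.

```python
# Bond order symbols as listed in the text: single bonds get no symbol.
BOND_SYMBOLS = {1: "", 2: "=", 3: "#", 1.5: ":"}


def bond_symbol(order):
    """Return the symbol for a bond order; other orders fall back to the value."""
    return BOND_SYMBOLS.get(order, str(order))


def path_string(atom_ids, bond_orders, use_bond_symbols=True):
    """Join atom identifiers along a path, optionally interleaving bond symbols.

    atom_ids    -- identifiers of the atoms along the path (hypothetical input)
    bond_orders -- order of the bond between each consecutive pair of atoms
    """
    if len(bond_orders) != len(atom_ids) - 1:
        raise ValueError("need exactly one bond order per consecutive atom pair")
    parts = [atom_ids[0]]
    for order, atom_id in zip(bond_orders, atom_ids[1:]):
        if use_bond_symbols:
            parts.append(bond_symbol(order))
        parts.append(atom_id)
    return "".join(parts)


print(path_string(["C", "C", "O"], [1, 2]))         # CC=O
print(path_string(["C", "C", "O"], [1, 2], False))  # CCO
```

With bond symbols switched off, the two calls differ only in the `=` character, which is why such fingerprints reflect atom paths alone.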

        The --UseUniquePaths option controls whether structurally duplicate
        atom path strings are removed from the list.

        For the *PathLengthBits* value of -m, --mode option, each atom path
        is hashed to a 32-bit unsigned integer key using the
        TextUtil::HashCode function. Using the hash key as a seed for a
        random number generator, a random integer value between 0 and --Size
        is used to set the corresponding bit in the fingerprint bit-vector
        string. The value of the --NumOfBitsToSetPerPaths option controls
        the number of times a random number is generated to set
        corresponding bits.
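
As a rough illustration of the bit-setting scheme above, the Python sketch below hashes each path string, seeds a generator with the hash and sets `bits_per_path` bits. It is not the MayaChemTools implementation: `zlib.crc32` stands in for TextUtil::HashCode, Python's `random.Random` stands in for the generator, the bit index is assumed to lie between 0 and size - 1, and `set_path_bits` is a hypothetical name. The bit positions it produces will therefore not match those written by the script.

```python
import random
import zlib


def set_path_bits(path_strings, size=1024, bits_per_path=1):
    """Return a list of 0/1 values of length `size` with bits set for each path."""
    bits = [0] * size
    for path in path_strings:
        # Stand-in 32-bit unsigned hash (assumption, not TextUtil::HashCode).
        key = zlib.crc32(path.encode("utf-8")) & 0xFFFFFFFF
        rng = random.Random(key)  # the hash key seeds the generator
        for _ in range(bits_per_path):
            bits[rng.randrange(size)] = 1  # index in the range 0 .. size - 1
    return bits


fp = set_path_bits(["C", "CC", "CC=O"], size=64, bits_per_path=2)
print(sum(fp))  # number of bits set; at most 6, fewer if indices collide
```

Because the generator is seeded from the path string alone, the same path always sets the same bits, whichever molecule it occurs in.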

        For the *PathLengthCount* value of -m, --mode option, the number of
        times an atom path appears is tracked and a fingerprints
        count-string corresponding to the count of atom paths is generated.
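
The counting mode can be sketched in a few lines of Python. `path_counts` and `ids_and_values_pairs` are hypothetical helpers, and the alphabetical ordering of IDs is an arbitrary choice for the sketch rather than the order the script uses.

```python
from collections import Counter


def path_counts(path_strings):
    """Count how often each atom path string occurs."""
    return Counter(path_strings)


def ids_and_values_pairs(counts):
    """Render counts as space-separated 'ID value' pairs, sorted by ID."""
    return " ".join(f"{path} {count}" for path, count in sorted(counts.items()))


counts = path_counts(["C", "C", "O", "CC", "C=O"])
print(ids_and_values_pairs(counts))  # C 2 C=O 1 CC 1 O 1
```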

        For molecules containing rings, the combination of -p, --PathMode
        and --UseBondSymbols allows generation of up to 8 different types of
        atom path length strings:

            AllowSharedBonds  AllowRings  UseBondSymbols

            0                 0           1   - AtomPathsNoCyclesWithBondSymbols
            0                 1           1   - AtomPathsWithCyclesWithBondSymbols

            1                 0           1   - AllAtomPathsNoCyclesWithBondSymbols
            1                 1           1   - AllAtomPathsWithCyclesWithBondSymbols
                                                [ DEFAULT ]

            0                 0           0   - AtomPathsNoCyclesNoBondSymbols
            0                 1           0   - AtomPathsWithCyclesNoBondSymbols

            1                 0           0   - AllAtomPathsNoCyclesNoBondSymbols
            1                 1           0   - AllAtomPathsWithCyclesNoBondSymbols
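
The naming pattern in the table can be reproduced from the three flags. `path_type_name` below is a hypothetical Python helper written only to make the pattern explicit; it assumes that each flag contributes one fixed name fragment, as the table suggests.

```python
def path_type_name(allow_shared_bonds, allow_rings, use_bond_symbols):
    """Compose the atom path string type name from the three table flags."""
    prefix = "AllAtomPaths" if allow_shared_bonds else "AtomPaths"
    cycles = "WithCycles" if allow_rings else "NoCycles"
    bonds = "WithBondSymbols" if use_bond_symbols else "NoBondSymbols"
    return prefix + cycles + bonds


# The default row of the table: shared bonds, rings and bond symbols all enabled.
print(path_type_name(1, 1, 1))  # AllAtomPathsWithCyclesWithBondSymbols
```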

        Default atom path length fingerprints generation for molecules
        containing rings, with *AllAtomPathsWithRings* value for -p,
        --PathMode, *Yes* value for --UseBondSymbols, *1* value for
        --MinPathLength and *8* value for --MaxPathLength, is the most time
        consuming. Other combinations of these options can substantially
        speed up fingerprint generation for molecules containing complex
        ring systems.

        Additionally, the value of option -a, --AtomIdentifierType, in
        conjunction with the corresponding values specified for atom types,
        changes the nature of atom path length strings and the fingerprints.

    -q, --quote *Yes | No*
        Put quotes around column values in output CSV/TSV text file(s).
        Possible values: *Yes or No*. Default value: *Yes*.

    -r, --root *RootName*
        New file name is generated using the root: <Root>.<Ext>. Default for
        new file names: <SDFileName><PathLengthFP>.<Ext>. The file type
        determines <Ext> value. The sdf, fpf, csv, and tsv <Ext> values are
        used for SD, FP, comma/semicolon, and tab delimited text files,
        respectively. This option is ignored for multiple input files.

    -s, --size *number*
        Size of fingerprints. Default value: *1024*. Valid values correspond
        to any positive integer which satisfies the following criteria:
        power of 2, >= 32 and <= 2 ** 32.

        Examples:

            256
            512
            2048
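
The size criteria can be checked with a few lines of Python. `is_valid_size` is a hypothetical helper that mirrors the three stated conditions; it is not taken from the script.

```python
def is_valid_size(size):
    """True if size is a power of 2 with 32 <= size <= 2 ** 32."""
    return (
        isinstance(size, int)
        and 32 <= size <= 2 ** 32
        and size & (size - 1) == 0  # a power of 2 has exactly one bit set
    )


for candidate in (256, 512, 2048, 1000, 16):
    print(candidate, is_valid_size(candidate))
```

The three example values pass, while 1000 (not a power of 2) and 16 (below the minimum) are rejected.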

    -u, --UseBondSymbols *Yes | No*
        Specify whether to use bond symbols for atom paths during generation
        of atom path strings. Possible values: *Yes or No*. Default value:
        *Yes*.

        The *No* value for -u, --UseBondSymbols allows the generation of
        fingerprints corresponding purely to atoms, disregarding all bonds.

    --UsePerlCoreRandom *Yes | No*
        Specify whether to use Perl CORE::rand or MayaChemTools
        MathUtil::random function during random number generation for
        setting bits in fingerprints bit-vector strings. Possible values:
        *Yes or No*. Default value: *Yes*.

        The *No* value for --UsePerlCoreRandom allows the generation of
        fingerprints bit-vector strings which are the same across different
        platforms.

        The random number generator implemented in MayaChemTools is a
        variant of the linear congruential generator (LCG) as described by
        Miller et al. [ Ref 120 ]. It is also referred to as the Lehmer
        random number generator or the Park-Miller random number generator.

        Unlike Perl's core random number generator function rand, the random
        number generator implemented in MayaChemTools, MathUtil::random,
        generates consistent random values across different platforms for a
        specific random seed and leads to generation of portable
        fingerprints bit-vector strings.
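
For readers unfamiliar with this family of generators, here is a minimal Python sketch of a Lehmer / Park-Miller generator. The modulus 2^31 - 1 and multiplier 16807 are the classic "minimal standard" constants and are an assumption here: the text does not state which constants MathUtil::random uses, and `ParkMillerRandom` is a hypothetical class name.

```python
class ParkMillerRandom:
    """Lehmer / Park-Miller generator: state = (A * state) mod M."""

    M = 2 ** 31 - 1  # Mersenne prime modulus
    A = 16807        # classic "minimal standard" multiplier (assumed here)

    def __init__(self, seed):
        # Map the seed into 1 .. M - 1; a state of 0 would stay 0 forever.
        self.state = seed % self.M or 1

    def next_int(self):
        """Advance the state and return it as an integer in 1 .. M - 1."""
        self.state = (self.A * self.state) % self.M
        return self.state

    def random(self):
        """Return a float strictly between 0 and 1."""
        return self.next_int() / self.M


rng = ParkMillerRandom(1)
print(rng.next_int(), rng.next_int())  # 16807 282475249
```

The update uses only integer multiplication and remainder, so a given seed yields the same sequence on every platform. That property is what makes fingerprints generated this way portable.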

    --UseUniquePaths *Yes | No*
        Specify whether to use structurally unique atom paths during
        generation of atom path strings. Possible values: *Yes or No*.
        Default value: *Yes*.

        The *No* value for --UseUniquePaths allows usage of all atom
        paths generated by -p, --PathMode option value for generation of
        atom path strings, leading to duplicate path counts during
        *PathLengthCount* value of -m, --mode option. It doesn't change the
        fingerprint string generated during *PathLengthBits* value of -m,
        --mode.

        For example, during *AllAtomPathsWithRings* value of -p, --PathMode
        option, benzene has 12 linear paths of length 2 and 12 cyclic paths
        of length 7, but only 6 linear paths of length 2 and 1 cyclic path
        of length 7 are structurally unique.
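
The benzene counts can be reproduced with a small Python enumeration over a six-membered ring. The sketch assumes that path length counts atoms and that two paths are "structurally unique" when they traverse different sets of bonds. Both are interpretations of the text, not the script's actual comparison.

```python
N = 6  # benzene ring atoms numbered 0..5; atom i is bonded to i - 1 and i + 1 (mod N)


def bond(a, b):
    """Direction-independent key for the bond between atoms a and b."""
    return frozenset((a, b))


def bond_set(path):
    """Set of bonds traversed by a path given as a sequence of atom indices."""
    return frozenset(bond(a, b) for a, b in zip(path, path[1:]))


# Linear paths of length 2 (two atoms): every bond walked in both directions.
linear = [(i, (i + step) % N) for i in range(N) for step in (1, -1)]

# Cyclic paths of length 7: start at any atom and go once around, either way.
cyclic = [tuple((start + step * k) % N for k in range(N + 1))
          for start in range(N) for step in (1, -1)]

print(len(linear), len({bond_set(p) for p in linear}))  # 12 6
print(len(cyclic), len({bond_set(p) for p in cyclic}))  # 12 1
```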

    -v, --VectorStringFormat *IDsAndValuesString | IDsAndValuesPairsString |
    ValuesAndIDsString | ValuesAndIDsPairsString*
        Format of fingerprints vector string data in output SD, FP or
        CSV/TSV text file(s) specified by --output used during
        *PathLengthCount* value of -m, --mode option. Possible values:
        *IDsAndValuesString | IDsAndValuesPairsString | ValuesAndIDsString |
        ValuesAndIDsPairsString*. Default value: *IDsAndValuesString*.

        Examples:

            FingerprintsVector;PathLengthCount:AtomicInvariantsAtomTypes:MinLength
            1:MaxLength8;432;NumericalValues;IDsAndValuesPairsString;C.X1.BO1.H3 2
            C.X2.BO2.H2 4 C.X2.BO3.H1 14 C.X3.BO3.H1 3 C.X3.BO4 10 F.X1.BO1 1 N.X
            2.BO2.H1 1 N.X3.BO3 1 O.X1.BO1.H1 3 O.X1.BO2 2 C.X1.BO1.H3C.X3.BO3.H1
            2 C.X2.BO2.H2C.X2.BO2.H2 1 C.X2.BO2.H2C.X3.BO3.H1 4 C.X2.BO2.H2C.X3.BO
            4 1 C.X2.BO2.H2N.X3.BO3 1 C.X2.BO3.H1:C.X2.BO3.H1 10 C.X2.BO3.H1:C....

            FingerprintsVector;PathLengthCount:EStateAtomTypes:MinLength1:MaxLengt
            h8;454;NumericalValues;IDsAndValuesPairsString;aaCH 14 aasC 8 aasN 1 d
            O 2 dssC 2 sCH3 2 sF 1 sOH 3 ssCH2 4 ssNH 1 sssCH 3 aaCH:aaCH 10 aaCH:
            aasC 8 aasC:aasC 3 aasC:aasN 2 aasCaasC 2 aasCdssC 1 aasCsF 1 aasCssNH
            1 aasCsssCH 1 aasNssCH2 1 dO=dssC 2 dssCsOH 1 dssCssCH2 1 dssCssNH 1
            sCH3sssCH 2 sOHsssCH 2 ssCH2ssCH2 1 ssCH2sssCH 4 aaCH:aaCH:aaCH 6 a...
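
A string in *IDsAndValuesPairsString* format can be taken apart as sketched below in Python. The field layout (semicolon-separated header fields followed by alternating IDs and values) is inferred from the examples shown here. `parse_pairs_vector` is a hypothetical helper, the sample string is a shortened, made-up variant of the second example, and reading the third field as the number of values is an assumption.

```python
def parse_pairs_vector(fp_string):
    """Split a FingerprintsVector string in IDsAndValuesPairsString format."""
    kind, description, size, value_type, fmt, data = fp_string.split(";", 5)
    if kind != "FingerprintsVector" or fmt != "IDsAndValuesPairsString":
        raise ValueError("unsupported fingerprints string")
    tokens = data.split()
    if len(tokens) % 2:
        raise ValueError("expected alternating IDs and values")
    # Even positions hold path IDs, odd positions hold their counts.
    counts = {path_id: int(value)
              for path_id, value in zip(tokens[0::2], tokens[1::2])}
    return description, int(size), counts


sample = ("FingerprintsVector;PathLengthCount:EStateAtomTypes:MinLength1:"
          "MaxLength8;3;NumericalValues;IDsAndValuesPairsString;"
          "aaCH 14 aasC 8 dO=dssC 2")
description, size, counts = parse_pairs_vector(sample)
print(size, counts["dO=dssC"])  # 3 2
```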

    -w, --WorkingDir *DirName*
        Location of working directory. Default: current directory.

EXAMPLES
    To generate path length fingerprints corresponding to all unique paths
    from length 1 through 8 in hexadecimal bit-vector string format of size
    1024 and create a SamplePLFPHex.csv file containing sequential compound
    IDs along with fingerprints bit-vector strings data, type:

        % PathLengthFingerprints.pl -o -r SamplePLFPHex Sample.sdf

    To generate path length fingerprints corresponding to all unique paths
    from length 1 through 8 in hexadecimal bit-vector string format of size
    1024 and create SamplePLFPHex.sdf, SamplePLFPHex.fpf, and
    SamplePLFPHex.csv files containing sequential compound IDs in CSV file
    along with fingerprints bit-vector strings data, type:

        % PathLengthFingerprints.pl --output all -o -r SamplePLFPHex Sample.sdf

    To generate path length fingerprints corresponding to all unique paths
    from length 1 through 8 in binary bit-vector string format of size 2048
    and create a SamplePLFPBin.csv file containing sequential compound IDs
    along with fingerprints bit-vector strings data, type:

        % PathLengthFingerprints.pl --BitStringFormat BinaryString --size 2048
          -o -r SamplePLFPBin Sample.sdf

    To generate path length fingerprints corresponding to count of all
    unique paths from length 1 through 8 in IDsAndValuesString format and
    create a SamplePLFPCount.csv file containing sequential compound IDs
    along with fingerprints vector strings data, type:

        % PathLengthFingerprints.pl -m PathLengthCount -o -r SamplePLFPCount
          Sample.sdf

    To generate path length fingerprints corresponding to count of all
    unique paths from length 1 through 8 in IDsAndValuesString format using
    E-state atom types and create a SamplePLFPCount.csv file containing
    sequential compound IDs along with fingerprints vector strings data,
    type:

        % PathLengthFingerprints.pl -m PathLengthCount --AtomIdentifierType
          EStateAtomTypes -o -r SamplePLFPCount Sample.sdf

    To generate path length fingerprints corresponding to count of all
    unique paths from length 1 through 8 in IDsAndValuesString format using
    SLogP atom types and create a SamplePLFPCount.csv file containing
    sequential compound IDs along with fingerprints vector strings data,
    type:

        % PathLengthFingerprints.pl -m PathLengthCount --AtomIdentifierType
          SLogPAtomTypes -o -r SamplePLFPCount Sample.sdf

    To generate path length fingerprints corresponding to count of all
    unique paths from length 1 through 8 in ValuesAndIDsPairsString format
    and create a SamplePLFPCount.csv file containing sequential compound IDs
    along with fingerprints vector strings data, type:

        % PathLengthFingerprints.pl -m PathLengthCount --VectorStringFormat
          ValuesAndIDsPairsString -o -r SamplePLFPCount Sample.sdf

    To generate path length fingerprints corresponding to count of all
    unique paths from length 1 through 8 in IDsAndValuesString format using
    AS,X,BO as atomic invariants and create a SamplePLFPCount.csv file
    containing sequential compound IDs along with fingerprints vector
    strings data, type:

        % PathLengthFingerprints.pl -m PathLengthCount --AtomIdentifierType
          AtomicInvariantsAtomTypes --AtomicInvariantsToUse "AS,X,BO" -o
          -r SamplePLFPCount Sample.sdf

    To generate path length fingerprints corresponding to count of all paths
    from length 1 through 8 in IDsAndValuesString format and create a
    SamplePLFPCount.csv file containing compound IDs from MolName line along
    with fingerprints vector strings data, type:

        % PathLengthFingerprints.pl -m PathLengthCount --UseUniquePaths No
          -o --CompoundIDMode MolName -r SamplePLFPCount Sample.sdf

    To generate path length fingerprints corresponding to all unique paths
    from length 1 through 8 in hexadecimal bit-vector string format of size
    512 after folding and create SamplePLFPHex.sdf, SamplePLFPHex.fpf, and
    SamplePLFPHex.csv files containing sequential compound IDs along with
    fingerprints bit-vector strings data, type:

        % PathLengthFingerprints.pl --output all --Fold Yes --FoldedSize 512
          -o -r SamplePLFPHex Sample.sdf

    To generate path length fingerprints corresponding to all unique paths
    from length 1 through 8 containing no rings and without sharing of bonds
    in hexadecimal bit-vector string format of size 1024 and create a
    SamplePLFPHex.csv file containing sequential compound IDs along with
    fingerprints bit-vector strings data and all data fields, type:

        % PathLengthFingerprints.pl -p AtomPathsWithoutRings --DataFieldsMode All
          -o -r SamplePLFPHex Sample.sdf

    To generate path length fingerprints corresponding to all unique paths
    from length 1 through 8 containing rings and without sharing of bonds in
    hexadecimal bit-vector string format of size 1024 and create a
    SamplePLFPHex.tsv file containing compound IDs derived from combination
    of molecule name line and an explicit compound prefix along with
    fingerprints bit-vector strings data, type:

        % PathLengthFingerprints.pl -p AtomPathsWithRings --DataFieldsMode
          CompoundID --CompoundIDMode MolNameOrLabelPrefix --CompoundID Cmpd
          --CompoundIDLabel MolID --FingerprintsLabel PathLengthFP --OutDelim Tab
          -r SamplePLFPHex -o Sample.sdf

    To generate path length fingerprints corresponding to count of all
    unique paths from length 1 through 8 in IDsAndValuesString format and
    create a SamplePLFPCount.csv file containing sequential compound IDs
    along with fingerprints vector strings data using aromaticity specified
    in SD file, type:

        % PathLengthFingerprints.pl -m PathLengthCount --DetectAromaticity No
          -o -r SamplePLFPCount Sample.sdf

    To generate path length fingerprints corresponding to all unique paths
    from length 2 through 6 in hexadecimal bit-vector string format of size
    1024 and create a SamplePLFPHex.csv file containing sequential compound
    IDs along with fingerprints bit-vector strings data, type:

        % PathLengthFingerprints.pl --MinPathLength 2 --MaxPathLength 6
          -o -r SamplePLFPHex Sample.sdf

AUTHOR
    Manish Sud <msud@san.rr.com>

SEE ALSO
    InfoFingerprintsFiles.pl, SimilarityMatricesFingerprints.pl,
    AtomNeighborhoodsFingerprints.pl, ExtendedConnectivityFingerprints.pl,
    MACCSKeysFingerprints.pl, TopologicalAtomPairsFingerprints.pl,
    TopologicalAtomTorsionsFingerprints.pl,
    TopologicalPharmacophoreAtomPairsFingerprints.pl,
    TopologicalPharmacophoreAtomTripletsFingerprints.pl

COPYRIGHT
    Copyright (C) 2015 Manish Sud. All rights reserved.

    This file is part of MayaChemTools.

    MayaChemTools is free software; you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation; either version 3 of the License, or (at
    your option) any later version.
