comparison docs/scripts/txt/InfoFingerprintsFiles.txt @ 0:4816e4a8ae95 draft default tip

Uploaded
author deepakjadmin
date Wed, 20 Jan 2016 09:23:18 -0500
-1:000000000000 0:4816e4a8ae95
1 NAME
2 InfoFingerprintsFiles.pl - List information about fingerprints data in
3 SD, FP and CSV/TSV text file(s)
4
5 SYNOPSIS
6 InfoFingerprintsFiles.pl SDFile(s) FPFile(s) TextFile(s)...
7
8 InfoFingerprintsFiles.pl [-a, --all] [--AverageBitDensity]
9 [--BitDensity] [--count] [-c, --ColMode *ColNum | ColLabel*]
10 [--DataCheck] [-d, --detail *InfoLevel*] [-e, --empty]
11 [--FingerprintsCol *col number | col name*] [--FingerprintsField
12 *FieldLabel*] [--FingerprintsType] [--FingerprintsDescription]
13 [--FingerprintsSize] [--FingerprintsBitStringFormat]
14 [--FingerprintsBitOrder] [--FingerprintsVectorValuesType]
15 [--FingerprintsVectorValuesFormat] [-h, --help] [--InDelim *comma |
16 semicolon*] [--NumOfOnBits] [--NumOfNonZeroValues] [-w, --WorkingDir
17 dirname] SDFile(s) FPFile(s) TextFile(s)...
18
19 DESCRIPTION
20 List information about fingerprints data in *SD, FP and CSV/TSV* text
21 file(s): number of rows containing fingerprints data, type of
22 fingerprints vector, description and size of fingerprints, bit density
23 and average bit density for bit-vector fingerprints strings, and so on.
24
25 The scripts InfoFingerprintsSDFiles.pl and InfoFingerprintsTextFiles.pl
26 have been removed from the current release of MayaChemTools and their
27 functionality merged with this script.
28
29 The valid *SDFile* extensions are *.sdf* and *.sd*. All SD files in the
30 current directory can be specified either by **.sdf* or the current
31 directory name.
32
33 The valid *FPFile* extensions are *.fpf* and *.fp*. All FP files in the
34 current directory can be specified either by **.fpf* or the current
35 directory name.
36
37 The valid *TextFile* extensions are *.csv* and *.tsv* for
38 comma/semicolon and tab delimited text files respectively. All other
39 file names are ignored. All text files in the current directory can be
40 specified by **.csv*, **.tsv*, or the current directory name. The
41 --InDelim option determines the format of *TextFile(s)*. Any file that
42 doesn't correspond to the format indicated by the --InDelim option is
43 ignored.
44
45 Format of fingerprint strings data in *SDFile(s), FPFile(s) and
46 TextFile(s)* is automatically detected.
47
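The extension rules above can be sketched in Python as a small lookup. This is only an illustration, not the script's own logic: the function name classify is hypothetical, and treating extensions case-insensitively is an assumption the text does not confirm.

```python
from pathlib import Path

# Extension-to-kind mapping taken from the description above.
KINDS = {
    ".sdf": "SD", ".sd": "SD",
    ".fpf": "FP", ".fp": "FP",
    ".csv": "Text", ".tsv": "Text",
}

def classify(filename):
    """Return 'SD', 'FP' or 'Text', or None for a file name that is ignored."""
    # Lower-casing the suffix is an assumption made for this sketch.
    return KINDS.get(Path(filename).suffix.lower())

for name in ["SampleFPBin.sdf", "SampleFPBin.fpf", "SampleFPBin.csv", "notes.txt"]:
    print(name, classify(name))
```
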
48 Example of *FP* file containing fingerprints bit-vector string data:
49
50 #
51 # Package = MayaChemTools 7.4
52 # ReleaseDate = Oct 21, 2010
53 #
54 # TimeStamp = Mon Mar 7 15:14:01 2011
55 #
56 # FingerprintsStringType = FingerprintsBitVector
57 #
58 # Description = PathLengthBits:AtomicInvariantsAtomTypes:MinLength1:...
59 # Size = 1024
60 # BitStringFormat = HexadecimalString
61 # BitsOrder = Ascending
62 #
63 Cmpd1 9c8460989ec8a49913991a6603130b0a19e8051c89184414953800cc21510...
64 Cmpd2 000000249400840040100042011001001980410c000000001010088001120...
65 ... ...
66 ... ...
67
68 Example of *FP* file containing fingerprints vector string data:
69
70 #
71 # Package = MayaChemTools 7.4
72 # ReleaseDate = Oct 21, 2010
73 #
74 # TimeStamp = Mon Mar 7 15:14:01 2011
75 #
76 # FingerprintsStringType = FingerprintsVector
77 #
78 # Description = PathLengthBits:AtomicInvariantsAtomTypes:MinLength1:...
79 # VectorStringFormat = IDsAndValuesString
80 # VectorValuesType = NumericalValues
81 #
82 Cmpd1 338;C F N O C:C C:N C=O CC CF CN CO C:C:C C:C:N C:CC C:CF C:CN C:
83 N:C C:NC CC:N CC=O CCC CCN CCO CNC NC=O O=CO C:C:C:C C:C:C:N C:C:CC...;
84 33 1 2 5 21 2 2 12 1 3 3 20 2 10 2 2 1 2 2 2 8 2 5 1 1 1 19 2 8 2 2 2 2
85 6 2 2 2 2 2 2 2 2 3 2 2 1 4 1 5 1 1 18 6 2 2 1 2 10 2 1 2 1 2 2 2 2 ...
86 Cmpd2 103;C N O C=N C=O CC CN CO CC=O CCC CCN CCO CNC N=CN NC=O NCN O=C
87 O C CC=O CCCC CCCN CCCO CCNC CNC=N CNC=O CNCN CCCC=O CCCCC CCCCN CC...;
88 15 4 4 1 2 13 5 2 2 15 5 3 2 2 1 1 1 2 17 7 6 5 1 1 1 2 15 8 5 7 2 2 2 2
89 1 2 1 1 3 15 7 6 8 3 4 4 3 2 2 1 2 3 14 2 4 7 4 4 4 4 1 1 1 2 1 1 1 ...
90 ... ...
91 ... ...
92
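The layout of the two FP examples above (comment lines of the form "# Key = Value" followed by one record per line) can be read with a few lines of Python. This is a minimal sketch under stated assumptions: parse_fp_lines is a hypothetical helper, the sample fingerprint data is shortened for illustration, and the compound ID is assumed to be separated from the fingerprint string by whitespace, as it appears in the examples.

```python
def parse_fp_lines(lines):
    """Split FP-file lines into a header dict and a list of (id, data) records."""
    header, records = {}, []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if line.startswith("#"):
            # Header lines look like "# Key = Value"; a bare "#" is a separator.
            key, sep, value = line[1:].partition("=")
            if sep:
                header[key.strip()] = value.strip()
            continue
        # Data lines: compound ID, whitespace, then the fingerprint string.
        parts = line.split(None, 1)
        records.append((parts[0], parts[1] if len(parts) > 1 else ""))
    return header, records

# Shortened, illustrative input modeled on the bit-vector example above.
sample = [
    "#",
    "# FingerprintsStringType = FingerprintsBitVector",
    "#",
    "# Size = 1024",
    "# BitStringFormat = HexadecimalString",
    "# BitsOrder = Ascending",
    "#",
    "Cmpd1 9c8460989ec8a499",
    "Cmpd2 0000002494008400",
]
header, records = parse_fp_lines(sample)
print(header["FingerprintsStringType"], header["Size"], len(records))
```
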
93 Example of *SD* file containing fingerprints bit-vector string data:
94
95 ... ...
96 ... ...
97 $$$$
98 ... ...
99 ... ...
100 ... ...
101 41 44 0 0 0 0 0 0 0 0999 V2000
102 -3.3652 1.4499 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
103 ... ...
104 2 3 1 0 0 0 0
105 ... ...
106 M END
107 > <CmpdID>
108 Cmpd1
109
110 > <PathLengthFingerprints>
111 FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLengt
112 h1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a49913991a66
113 03130b0a19e8051c89184414953800cc2151082844a201042800130860308e8204d4028
114 00831048940e44281c00060449a5000ac80c894114e006321264401600846c050164462
115 08190410805000304a10205b0100e04c0038ba0fad0209c0ca8b1200012268b61c0026a
116 aa0660a11014a011d46
117
118 $$$$
119 ... ...
120 ... ...
121
122 Example of CSV *Text* file containing fingerprints bit-vector string
123 data:
124
125 "CompoundID","PathLengthFingerprints"
126 "Cmpd1","FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes
127 :MinLength1:MaxLength8;1024;HexadecimalString;Ascending;9c8460989ec8a4
128 9913991a6603130b0a19e8051c89184414953800cc2151082844a20104280013086030
129 8e8204d402800831048940e44281c00060449a5000ac80c894114e006321264401..."
130 ... ...
131 ... ...
132
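In the SD and CSV examples above, each fingerprints bit-vector string packs its metadata and data into semicolon-separated fields: string type, description, size, bit string format, bit order, and the bits themselves. The Python sketch below locates the first column whose label contains Fingerprints and splits one such string. Both helper names are hypothetical, the sample record is shortened to 16 bits for illustration, and the case-sensitive label match is an assumption.

```python
import csv
import io

def find_fingerprints_column(labels):
    """Index of the first column whose label contains 'Fingerprints', else None."""
    for index, label in enumerate(labels):
        if "Fingerprints" in label:
            return index
    return None

def split_bit_vector_string(text):
    """Split 'Type;Description;Size;BitStringFormat;BitOrder;Bits' into a dict."""
    fp_type, description, size, bit_format, bit_order, bits = text.split(";")
    return {
        "type": fp_type,
        "description": description,
        "size": int(size),
        "bit_format": bit_format,
        "bit_order": bit_order,
        "bits": bits,
    }

# Shortened, illustrative CSV modeled on the example above.
sample_csv = (
    '"CompoundID","PathLengthFingerprints"\n'
    '"Cmpd1","FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes'
    ':MinLength1:MaxLength8;16;HexadecimalString;Ascending;9c84"\n'
)
rows = list(csv.reader(io.StringIO(sample_csv)))
col = find_fingerprints_column(rows[0])
fields = split_bit_vector_string(rows[1][col])
print(col, fields["type"], fields["size"], fields["bit_format"])
```
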
133 The current release of MayaChemTools supports the following types of
134 fingerprint bit-vector and vector strings:
135
136 FingerprintsVector;AtomNeighborhoods:AtomicInvariantsAtomTypes:MinRadi
137 us0:MaxRadius2;41;AlphaNumericalValues;ValuesString;NR0-C.X1.BO1.H3-AT
138 C1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-ATC1 NR0-C.X
139 1.BO1.H3-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2-C.X1.BO1.H3-ATC1:NR2-C.X3.BO4-A
140 TC1 NR0-C.X2.BO2.H2-ATC1:NR1-C.X2.BO2.H2-ATC1:NR1-C.X3.BO3.H1-ATC1:NR2
141 -C.X2.BO2.H2-ATC1:NR2-N.X3.BO3-ATC1:NR2-O.X1.BO1.H1-ATC1 NR0-C.X2.B...
142
143 FingerprintsVector;AtomTypesCount:AtomicInvariantsAtomTypes:ArbitraryS
144 ize;10;NumericalValues;IDsAndValuesString;C.X1.BO1.H3 C.X2.BO2.H2 C.X2
145 .BO3.H1 C.X3.BO3.H1 C.X3.BO4 F.X1.BO1 N.X2.BO2.H1 N.X3.BO3 O.X1.BO1.H1
146 O.X1.BO2;2 4 14 3 10 1 1 1 3 2
147
148 FingerprintsVector;AtomTypesCount:SLogPAtomTypes:ArbitrarySize;16;Nume
149 ricalValues;IDsAndValuesString;C1 C10 C11 C14 C18 C20 C21 C22 C5 CS F
150 N11 N4 O10 O2 O9;5 1 1 1 14 4 2 1 2 2 1 1 1 1 3 1
151
152 FingerprintsVector;AtomTypesCount:SLogPAtomTypes:FixedSize;67;OrderedN
153 umericalValues;IDsAndValuesString;C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C
154 12 C13 C14 C15 C16 C17 C18 C19 C20 C21 C22 C23 C24 C25 C26 C27 CS N1 N
155 2 N3 N4 N5 N6 N7 N8 N9 N10 N11 N12 N13 N14 NS O1 O2 O3 O4 O5 O6 O7 O8
156 O9 O10 O11 O12 OS F Cl Br I Hal P S1 S2 S3 Me1 Me2;5 0 0 0 2 0 0 0 0 1
157 1 0 0 1 0 0 0 14 0 4 2 1 0 0 0 0 0 2 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0...
158
159 FingerprintsVector;EStateIndicies:ArbitrarySize;11;NumericalValues;IDs
160 AndValuesString;SaaCH SaasC SaasN SdO SdssC SsCH3 SsF SsOH SssCH2 SssN
161 H SsssCH;24.778 4.387 1.993 25.023 -1.435 3.975 14.006 29.759 -0.073 3
162 .024 -2.270
163
164 FingerprintsVector;EStateIndicies:FixedSize;87;OrderedNumericalValues;
165 ValuesString;0 0 0 0 0 0 0 3.975 0 -0.073 0 0 24.778 -2.270 0 0 -1.435
166 4.387 0 0 0 0 0 0 3.024 0 0 0 0 0 0 0 1.993 0 29.759 25.023 0 0 0 0 1
167 4.006 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
168 0 0 0 0 0 0 0 0 0 0 0 0 0 0
169
170 FingerprintsVector;ExtendedConnectivity:AtomicInvariantsAtomTypes:Radi
171 us2;60;AlphaNumericalValues;ValuesString;73555770 333564680 352413391
172 666191900 1001270906 1371674323 1481469939 1977749791 2006158649 21414
173 08799 49532520 64643108 79385615 96062769 273726379 564565671 85514103
174 5 906706094 988546669 1018231313 1032696425 1197507444 1331250018 1338
175 532734 1455473691 1607485225 1609687129 1631614296 1670251330 17303...
176
177 FingerprintsVector;ExtendedConnectivityCount:AtomicInvariantsAtomTypes
178 :Radius2;60;NumericalValues;IDsAndValuesString;73555770 333564680 3524
179 13391 666191900 1001270906 1371674323 1481469939 1977749791 2006158649
180 2141408799 49532520 64643108 79385615 96062769 273726379 564565671...;
181 3 2 1 1 14 1 2 10 4 3 1 1 1 1 2 1 2 1 1 1 2 3 1 1 2 1 3 3 8 2 2 2 6 2
182 1 2 1 1 2 1 1 1 2 1 1 2 1 2 1 1 1 1 1 1 1 1 1 2 1 1
183
184 FingerprintsBitVector;ExtendedConnectivityBits:AtomicInvariantsAtomTyp
185 es:Radius2;1024;BinaryString;Ascending;0000000000000000000000000000100
186 0000000001010000000110000011000000000000100000000000000000000000100001
187 1000000110000000000000000000000000010011000000000000000000000000010000
188 0000000000000000000000000010000000000000000001000000000000000000000000
189 0000000000010000100001000000000000101000000000000000100000000000000...
190
191 FingerprintsVector;ExtendedConnectivity:FunctionalClassAtomTypes:Radiu
192 s2;57;AlphaNumericalValues;ValuesString;24769214 508787397 850393286 8
193 62102353 981185303 1231636850 1649386610 1941540674 263599683 32920567
194 1 571109041 639579325 683993318 723853089 810600886 885767127 90326012
195 7 958841485 981022393 1126908698 1152248391 1317567065 1421489994 1455
196 632544 1557272891 1826413669 1983319256 2015750777 2029559552 20404...
197
198 FingerprintsVector;ExtendedConnectivity:EStateAtomTypes:Radius2;62;Alp
199 haNumericalValues;ValuesString;25189973 528584866 662581668 671034184
200 926543080 1347067490 1738510057 1759600920 2034425745 2097234755 21450
201 44754 96779665 180364292 341712110 345278822 386540408 387387308 50430
202 1706 617094135 771528807 957666640 997798220 1158349170 1291258082 134
203 1138533 1395329837 1420277211 1479584608 1486476397 1487556246 1566...
204
205 FingerprintsBitVector;MACCSKeyBits;166;BinaryString;Ascending;00000000
206 0000000000000000000000000000000001001000010010000000010010000000011100
207 0100101010111100011011000100110110000011011110100110111111111111011111
208 11111111111110111000
209
210 FingerprintsBitVector;MACCSKeyBits;322;BinaryString;Ascending;11101011
211 1110011111100101111111000111101100110000000000000011100010000000000000
212 0000000000000000000000000000000000000000000000101000000000000000000000
213 0000000000000000000000000000000000000000000000000000000000000000000000
214 0000000000000000000000000000000000000011000000000000000000000000000000
215 0000000000000000000000000000000000000000
216
217 FingerprintsVector;MACCSKeyCount;166;OrderedNumericalValues;ValuesStri
218 ng;0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
219 0 0 0 0 0 0 0 1 0 0 3 0 0 0 0 4 0 0 2 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0
220 0 0 0 0 1 1 8 0 0 0 1 0 0 1 0 1 0 1 0 3 1 3 1 0 0 0 1 2 0 11 1 0 0 0
221 5 0 0 1 2 0 1 1 0 0 0 0 0 1 1 0 1 1 1 1 0 4 0 0 1 1 0 4 6 1 1 1 2 1 1
222 3 5 2 2 0 5 3 5 1 1 2 5 1 2 1 2 4 8 3 5 5 2 2 0 3 5 4 1
223
224 FingerprintsVector;MACCSKeyCount;322;OrderedNumericalValues;ValuesStri
225 ng;14 8 2 0 2 0 4 4 2 1 4 0 0 2 5 10 5 2 1 0 0 2 0 5 13 3 28 5 5 3 0 0
226 0 4 2 1 1 0 1 1 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 22 5 3 0 0 0 1 0
227 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
228 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11 0 2 0 0 0 0 0 0 0 0 0
229 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
230
231 FingerprintsBitVector;PathLengthBits:AtomicInvariantsAtomTypes:MinLeng
232 th1:MaxLength8;1024;BinaryString;Ascending;001000010011010101011000110
233 0100010101011000101001011100110001000010001001101000001001001001001000
234 0010110100000111001001000001001010100100100000000011000000101001011100
235 0010000001000101010100000100111100110111011011011000000010110111001101
236 0101100011000000010001000011000010100011101100001000001000100000000...
237
238 FingerprintsVector;PathLengthCount:AtomicInvariantsAtomTypes:MinLength
239 1:MaxLength8;432;NumericalValues;IDsAndValuesPairsString;C.X1.BO1.H3 2
240 C.X2.BO2.H2 4 C.X2.BO3.H1 14 C.X3.BO3.H1 3 C.X3.BO4 10 F.X1.BO1 1 N.X
241 2.BO2.H1 1 N.X3.BO3 1 O.X1.BO1.H1 3 O.X1.BO2 2 C.X1.BO1.H3C.X3.BO3.H1
242 2 C.X2.BO2.H2C.X2.BO2.H2 1 C.X2.BO2.H2C.X3.BO3.H1 4 C.X2.BO2.H2C.X3.BO
243 4 1 C.X2.BO2.H2N.X3.BO3 1 C.X2.BO3.H1:C.X2.BO3.H1 10 C.X2.BO3.H1:C....
244
245 FingerprintsVector;PathLengthCount:MMFF94AtomTypes:MinLength1:MaxLengt
246 h8;463;NumericalValues;IDsAndValuesPairsString;C5A 2 C5B 2 C=ON 1 CB 1
247 8 COO 1 CR 9 F 1 N5 1 NC=O 1 O=CN 1 O=CO 1 OC=O 1 OR 2 C5A:C5B 2 C5A:N
248 5 2 C5ACB 1 C5ACR 1 C5B:C5B 1 C5BC=ON 1 C5BCB 1 C=ON=O=CN 1 C=ONNC=O 1
249 CB:CB 18 CBF 1 CBNC=O 1 COO=O=CO 1 COOCR 1 COOOC=O 1 CRCR 7 CRN5 1 CR
250 OR 2 C5A:C5B:C5B 2 C5A:C5BC=ON 1 C5A:C5BCB 1 C5A:N5:C5A 1 C5A:N5CR ...
251
252 FingerprintsVector;TopologicalAtomPairs:AtomicInvariantsAtomTypes:MinD
253 istance1:MaxDistance10;223;NumericalValues;IDsAndValuesString;C.X1.BO1
254 .H3-D1-C.X3.BO3.H1 C.X2.BO2.H2-D1-C.X2.BO2.H2 C.X2.BO2.H2-D1-C.X3.BO3.
255 H1 C.X2.BO2.H2-D1-C.X3.BO4 C.X2.BO2.H2-D1-N.X3.BO3 C.X2.BO3.H1-D1-...;
256 2 1 4 1 1 10 8 1 2 6 1 2 2 1 2 1 2 2 1 2 1 5 1 10 12 2 2 1 2 1 9 1 3 1
257 1 1 2 2 1 3 6 1 6 14 2 2 2 3 1 3 1 8 2 2 1 3 2 6 1 2 2 5 1 3 1 23 1...
258
259 FingerprintsVector;TopologicalAtomPairs:FunctionalClassAtomTypes:MinDi
260 stance1:MaxDistance10;144;NumericalValues;IDsAndValuesString;Ar-D1-Ar
261 Ar-D1-Ar.HBA Ar-D1-HBD Ar-D1-Hal Ar-D1-None Ar.HBA-D1-None HBA-D1-NI H
262 BA-D1-None HBA.HBD-D1-NI HBA.HBD-D1-None HBD-D1-None NI-D1-None No...;
263 23 2 1 1 2 1 1 1 1 2 1 1 7 28 3 1 3 2 8 2 1 1 1 5 1 5 24 3 3 4 2 13 4
264 1 1 4 1 5 22 4 4 3 1 19 1 1 1 1 1 2 2 3 1 1 8 25 4 5 2 3 1 26 1 4 1 ...
265
266 FingerprintsVector;TopologicalAtomTorsions:AtomicInvariantsAtomTypes;3
267 3;NumericalValues;IDsAndValuesString;C.X1.BO1.H3-C.X3.BO3.H1-C.X3.BO4-
268 C.X3.BO4 C.X1.BO1.H3-C.X3.BO3.H1-C.X3.BO4-N.X3.BO3 C.X2.BO2.H2-C.X2.BO
269 2.H2-C.X3.BO3.H1-C.X2.BO2.H2 C.X2.BO2.H2-C.X2.BO2.H2-C.X3.BO3.H1-O...;
270 2 2 1 1 2 2 1 1 3 4 4 8 4 2 2 6 2 2 1 2 1 1 2 1 1 2 6 2 4 2 1 3 1
271
272 FingerprintsVector;TopologicalAtomTorsions:EStateAtomTypes;36;Numerica
273 lValues;IDsAndValuesString;aaCH-aaCH-aaCH-aaCH aaCH-aaCH-aaCH-aasC aaC
274 H-aaCH-aasC-aaCH aaCH-aaCH-aasC-aasC aaCH-aaCH-aasC-sF aaCH-aaCH-aasC-
275 ssNH aaCH-aasC-aasC-aasC aaCH-aasC-aasC-aasN aaCH-aasC-ssNH-dssC a...;
276 4 4 8 4 2 2 6 2 2 2 4 3 2 1 3 3 2 2 2 1 2 1 1 1 2 1 1 1 1 1 1 1 2 1 1 2
277
278 FingerprintsVector;TopologicalAtomTriplets:AtomicInvariantsAtomTypes:M
279 inDistance1:MaxDistance10;3096;NumericalValues;IDsAndValuesString;C.X1
280 .BO1.H3-D1-C.X1.BO1.H3-D1-C.X3.BO3.H1-D2 C.X1.BO1.H3-D1-C.X2.BO2.H2-D1
281 0-C.X3.BO4-D9 C.X1.BO1.H3-D1-C.X2.BO2.H2-D3-N.X3.BO3-D4 C.X1.BO1.H3-D1
282 -C.X2.BO2.H2-D4-C.X2.BO2.H2-D5 C.X1.BO1.H3-D1-C.X2.BO2.H2-D6-C.X3....;
283 1 2 2 2 2 2 2 2 8 8 4 8 4 4 2 2 2 2 4 2 2 2 4 2 2 2 2 1 2 2 4 4 4 2 2
284 2 4 4 4 8 4 4 2 4 4 4 2 4 4 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 8...
285
286 FingerprintsVector;TopologicalAtomTriplets:SYBYLAtomTypes:MinDistance1
287 :MaxDistance10;2332;NumericalValues;IDsAndValuesString;C.2-D1-C.2-D9-C
288 .3-D10 C.2-D1-C.2-D9-C.ar-D10 C.2-D1-C.3-D1-C.3-D2 C.2-D1-C.3-D10-C.3-
289 D9 C.2-D1-C.3-D2-C.3-D3 C.2-D1-C.3-D2-C.ar-D3 C.2-D1-C.3-D3-C.3-D4 C.2
290 -D1-C.3-D3-N.ar-D4 C.2-D1-C.3-D3-O.3-D2 C.2-D1-C.3-D4-C.3-D5 C.2-D1-C.
291 3-D5-C.3-D6 C.2-D1-C.3-D5-O.3-D4 C.2-D1-C.3-D6-C.3-D7 C.2-D1-C.3-D7...
292
293 FingerprintsVector;TopologicalPharmacophoreAtomPairs:ArbitrarySize:Min
294 Distance1:MaxDistance10;54;NumericalValues;IDsAndValuesString;H-D1-H H
295 -D1-NI HBA-D1-NI HBD-D1-NI H-D2-H H-D2-HBA H-D2-HBD HBA-D2-HBA HBA-D2-
296 HBD H-D3-H H-D3-HBA H-D3-HBD H-D3-NI HBA-D3-NI HBD-D3-NI H-D4-H H-D4-H
297 BA H-D4-HBD HBA-D4-HBA HBA-D4-HBD HBD-D4-HBD H-D5-H H-D5-HBA H-D5-...;
298 18 1 2 1 22 12 8 1 2 18 6 3 1 1 1 22 13 6 5 7 2 28 9 5 1 1 1 36 16 10
299 3 4 1 37 10 8 1 35 10 9 3 3 1 28 7 7 4 18 16 12 5 1 2 1
300
301 FingerprintsVector;TopologicalPharmacophoreAtomPairs:FixedSize:MinDist
302 ance1:MaxDistance10;150;OrderedNumericalValues;ValuesString;18 0 0 1 0
303 0 0 2 0 0 1 0 0 0 0 22 12 8 0 0 1 2 0 0 0 0 0 0 0 0 18 6 3 1 0 0 0 1
304 0 0 1 0 0 0 0 22 13 6 0 0 5 7 0 0 2 0 0 0 0 0 28 9 5 1 0 0 0 1 0 0 1 0
305 0 0 0 36 16 10 0 0 3 4 0 0 1 0 0 0 0 0 37 10 8 0 0 0 0 1 0 0 0 0 0 0
306 0 35 10 9 0 0 3 3 0 0 1 0 0 0 0 0 28 7 7 4 0 0 0 0 0 0 0 0 0 0 0 18...
307
308 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:ArbitrarySize:
309 MinDistance1:MaxDistance10;696;NumericalValues;IDsAndValuesString;Ar1-
310 Ar1-Ar1 Ar1-Ar1-H1 Ar1-Ar1-HBA1 Ar1-Ar1-HBD1 Ar1-H1-H1 Ar1-H1-HBA1 Ar1
311 -H1-HBD1 Ar1-HBA1-HBD1 H1-H1-H1 H1-H1-HBA1 H1-H1-HBD1 H1-HBA1-HBA1 H1-
312 HBA1-HBD1 H1-HBA1-NI1 H1-HBD1-NI1 HBA1-HBA1-NI1 HBA1-HBD1-NI1 Ar1-...;
313 46 106 8 3 83 11 4 1 21 5 3 1 2 2 1 1 1 100 101 18 11 145 132 26 14 23
314 28 3 3 5 4 61 45 10 4 16 20 7 5 1 3 4 5 3 1 1 1 1 5 4 2 1 2 2 2 1 1 1
315 119 123 24 15 185 202 41 25 22 17 3 5 85 95 18 11 23 17 3 1 1 6 4 ...
316
317 FingerprintsVector;TopologicalPharmacophoreAtomTriplets:FixedSize:MinD
318 istance1:MaxDistance10;2692;OrderedNumericalValues;ValuesString;46 106
319 8 3 0 0 83 11 4 0 0 0 1 0 0 0 0 0 0 0 0 21 5 3 0 0 1 2 2 0 0 1 0 0 0
320 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100 101 18 11 0 0 145 132 26
321 14 0 0 23 28 3 3 0 0 5 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 61 45 10 4 0
322 0 16 20 7 5 1 0 3 4 5 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 5 ...
323
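The vector strings listed above share a common prefix (string type, description, number of values, values type, values format) followed by the data, whose layout depends on the values format. The Python sketch below handles only the three formats that appear in the examples: ValuesString, IDsAndValuesString, and IDsAndValuesPairsString. The function name is hypothetical, and the sketch assumes that IDs and values contain no embedded spaces or semicolons.

```python
def parse_vector_string(text):
    """Parse a FingerprintsVector string into its metadata, IDs and values."""
    parts = text.split(";")
    fp_type, description, num_values, values_type, values_format = parts[:5]
    data = parts[5:]
    if values_format == "ValuesString":
        ids, values = None, data[0].split()
    elif values_format == "IDsAndValuesString":
        # IDs and values are two separate space-delimited lists.
        ids, values = data[0].split(), data[1].split()
    elif values_format == "IDsAndValuesPairsString":
        # IDs and values alternate within a single space-delimited list.
        tokens = data[0].split()
        ids, values = tokens[0::2], tokens[1::2]
    else:
        raise ValueError("format not covered by this sketch: " + values_format)
    return {
        "type": fp_type,
        "description": description,
        "num_values": int(num_values),
        "values_type": values_type,
        "values_format": values_format,
        "ids": ids,
        "values": values,
    }

# The AtomTypesCount example from the list above, with line wraps removed.
sample = (
    "FingerprintsVector;AtomTypesCount:AtomicInvariantsAtomTypes:ArbitrarySize;"
    "10;NumericalValues;IDsAndValuesString;"
    "C.X1.BO1.H3 C.X2.BO2.H2 C.X2.BO3.H1 C.X3.BO3.H1 C.X3.BO4 F.X1.BO1 "
    "N.X2.BO2.H1 N.X3.BO3 O.X1.BO1.H1 O.X1.BO2;2 4 14 3 10 1 1 1 3 2"
)
parsed = parse_vector_string(sample)
counts = dict(zip(parsed["ids"], parsed["values"]))
print(parsed["num_values"], counts["C.X2.BO3.H1"])
```
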
324 OPTIONS
325 -a, --all
326 List all the available information.
327
328 --AverageBitDensity
329 List average bit density of fingerprint bit-vector strings.
330
331 --BitDensity
332 List bit density of fingerprints bit-vector strings data in each
333 row.
334
335 --count
336 List number of data entries containing fingerprints bit-vector or
337 vector strings data. This is the default behavior.
338
339 -c, --ColMode *ColNum | ColLabel*
340 Specify how columns are identified in CSV/TSV *TextFile(s)*: using
341 column number or column label. Possible values: *ColNum or
342 ColLabel*. Default value: *ColNum*
343
344 -d, --detail *InfoLevel*
345 Level of information to print about lines being ignored. Default:
346 *1*. Possible values: *1, 2 or 3*.
347
348 --DataCheck
349 Validate fingerprints data specified using --FingerprintsCol and
350 list information about missing and invalid data.
351
352 -e, --empty
353 List number of rows containing no fingerprints data.
354
355 --FingerprintsCol *col number | col name*
356 This value is -c, --ColMode specific. It corresponds to the column in
357 CSV/TSV *TextFile(s)* containing fingerprints data. Possible values:
358 *col number or col name*. Default value: *first column containing
359 the word Fingerprints in its column label*.
360
361 --FingerprintsField *FieldLabel*
362 Fingerprints field label to use during listing of fingerprints
363 information for *SDFile(s)*. Default value: *first data field label
364 containing the word Fingerprints in its label*.
365
366 --FingerprintsType
367 List types of fingerprint strings: FingerprintsBitVector or
368 FingerprintsVector.
369
370 --FingerprintsDescription
371 List types of fingerprints: PathLengthBits, PathLengthCount,
372 MACCSKeyCount, ExtendedConnectivity and so on.
373
374 --FingerprintsSize
375 List size of fingerprints.
376
377 --FingerprintsBitStringFormat
378 List format of fingerprint bit-vector strings: BinaryString or
379 HexadecimalString.
380
381 --FingerprintsBitOrder
382 List order of bits data in fingerprint bit-vector bit strings:
383 Ascending or Descending.
384
385 --FingerprintsVectorValuesType
386 List type of values in fingerprint vector strings:
387 OrderedNumericalValues, NumericalValues or AlphaNumericalValues.
388
389 --FingerprintsVectorValuesFormat
390 List format of values in fingerprint vector strings: ValuesString,
391 IDsAndValuesString, IDsAndValuesPairsString, ValuesAndIDsString or
392 ValuesAndIDsPairsString.
393
394 -h, --help
395 Print this help message.
396
397 --InDelim *comma | semicolon*
398 Input delimiter for CSV *TextFile(s)*. Possible values: *comma or
399 semicolon*. Default value: *comma*. For TSV files, this option is
400 ignored and *tab* is used as a delimiter.
401
402 --NumOfOnBits
403 List number of on bits in fingerprints bit-vector strings data in
404 each row.
405
406 --NumOfNonZeroValues
407 List number of non-zero values in fingerprints vector strings data
408 in each row.
409
410 -w, --WorkingDir *DirName*
411 Location of working directory. Default: current directory.
412
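The quantities reported by --NumOfOnBits, --BitDensity, --AverageBitDensity and --NumOfNonZeroValues can be illustrated with a short Python sketch. It rests on assumptions the text above does not spell out: bit density is taken to be the number of on bits divided by the fingerprint size, average bit density is taken to be the mean of the per-row densities, and the vector values are assumed to be numerical. All function names are hypothetical. The on-bit count does not depend on bit order, so the sketch ignores Ascending versus Descending.

```python
def count_on_bits(bits, bit_format):
    """Count the set bits in a binary or hexadecimal bit string."""
    if bit_format == "BinaryString":
        return bits.count("1")
    if bit_format == "HexadecimalString":
        return bin(int(bits, 16)).count("1")
    raise ValueError("unknown bit string format: " + bit_format)

def bit_density(bits, bit_format, size):
    """Fraction of bits that are on (assumed definition for this sketch)."""
    return count_on_bits(bits, bit_format) / size

def average_bit_density(rows, bit_format, size):
    """Mean of per-row bit densities (assumed definition for this sketch)."""
    densities = [bit_density(bits, bit_format, size) for bits in rows]
    return sum(densities) / len(densities)

def count_non_zero(values):
    """Count the values that are not zero; assumes numerical values."""
    return sum(1 for value in values if float(value) != 0)

# Illustrative 16-bit fingerprints, not taken from any sample file.
print(count_on_bits("9c84", "HexadecimalString"))
print(bit_density("9c84", "HexadecimalString", 16))
print(average_bit_density(["9c84", "0000"], "HexadecimalString", 16))
print(count_non_zero(["5", "0", "0", "2"]))
```
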
413 EXAMPLES
414 To count the number of data entries containing fingerprints bit-vector
415 or vector strings data in an FP file, in a text file column whose name
416 contains the substring Fingerprint, and in an SD file data field whose
417 label contains the substring Fingerprint, type:
418
419 % InfoFingerprintsFiles.pl SampleFPBin.csv
420
421 % InfoFingerprintsFiles.pl SampleFPBin.sdf SampleFPBin.fpf
422 SampleFPBin.csv
423
424 % InfoFingerprintsFiles.pl SampleFPHex.sdf SampleFPHex.fpf
425 SampleFPHex.csv
426
427 % InfoFingerprintsFiles.pl SampleFPcount.sdf SampleFPcount.fpf
428 SampleFPcount.csv
429
430 To list all available information about fingerprints bit-vector or
431 vector strings data in an FP file, in a text file column whose name
432 contains the substring Fingerprint, and in an SD file data field whose
433 label contains the substring Fingerprint, type:
434
435 % InfoFingerprintsFiles.pl -a SampleFPHex.sdf SampleFPHex.fpf
436 SampleFPHex.csv
437
438 % InfoFingerprintsFiles.pl -a SampleFPcount.sdf SampleFPcount.fpf
439 SampleFPcount.csv
440
441 To list all available information about fingerprints bit-vector or
442 vector strings data present in a column named Fingerprints in a text
443 file, type:
444
445 % InfoFingerprintsFiles.pl -a --ColMode ColLabel --FingerprintsCol
446 Fingerprints SampleFPHex.csv
447
448 % InfoFingerprintsFiles.pl -a --ColMode ColLabel --FingerprintsCol
449 Fingerprints SampleFPcount.csv
450
451 To list all available information about fingerprints bit-vector or
452 vector strings data present in a data field named Fingerprints in an SD
453 file, type:
454
455 % InfoFingerprintsFiles.pl -a --FingerprintsField Fingerprints
456 SampleFPHex.sdf
457
458 % InfoFingerprintsFiles.pl -a --FingerprintsField Fingerprints
459 SampleFPcount.sdf
460
461 To list bit density, average bit density, and number of on bits for
462 fingerprints bit-vector strings data in an FP file, in a text file
463 column whose name contains the substring Fingerprint, and in an SD file
464 data field whose label contains the substring Fingerprint, type:
465
466 % InfoFingerprintsFiles.pl --BitDensity --AverageBitDensity
467 --NumOfOnBits SampleFPBin.csv SampleFPBin.sdf SampleFPBin.fpf
468
469 To list fingerprints type and description along with vector values
470 type, format, and number of non-zero values for fingerprints vector
471 strings data in an FP file, in a text file column whose name contains
472 the substring Fingerprint, and in an SD file data field whose label
473 contains the substring Fingerprint, type:
474
475 % InfoFingerprintsFiles.pl --FingerprintsType --FingerprintsDescription
476 --FingerprintsVectorValuesType --FingerprintsVectorValuesFormat
477 --NumOfNonZeroValues SampleFPcount.csv SampleFPcount.sdf
478 SampleFPcount.fpf
479
480 AUTHOR
481 Manish Sud <msud@san.rr.com>
482
483 SEE ALSO
484 SimilarityMatricesFingerprints.pl, SimilaritySearchingFingerprints.pl,
485 AtomNeighborhoodsFingerprints.pl,
486 ExtendedConnectivityFingerprints.pl, MACCSKeysFingerprints.pl,
487 PathLengthFingerprints.pl, TopologicalAtomPairsFingerprints.pl,
488 TopologicalAtomTorsionsFingerprints.pl,
489 TopologicalPharmacophoreAtomPairsFingerprints.pl,
490 TopologicalPharmacophoreAtomTripletsFingerprints.pl
491
492 COPYRIGHT
493 Copyright (C) 2015 Manish Sud. All rights reserved.
494
495 This file is part of MayaChemTools.
496
497 MayaChemTools is free software; you can redistribute it and/or modify it
498 under the terms of the GNU Lesser General Public License as published by
499 the Free Software Foundation; either version 3 of the License, or (at
500 your option) any later version.
501