annotate docs/modules/txt/Lexer.txt @ 0:4816e4a8ae95 draft default tip

Uploaded
author deepakjadmin
date Wed, 20 Jan 2016 09:23:18 -0500
NAME
    Parsers::Lexer

SYNOPSIS
    use Parsers::Lexer;

    use Parsers::Lexer qw(:all);

DESCRIPTION
    The Lexer class provides the following methods:

    new, GetLex, Lex, Next, Peek, StringifyLexer

    The object-oriented chained Lexer is based on examples in the book
    Higher-Order Perl [ Ref 126 ] by Mark J. Dominus. It is designed to be
    used either in standalone mode or as a base class for YYLexer.

    A chained lexer is created by generating a lexer for the first token
    specification using the specified input and chaining it with lexers
    generated for all subsequent token specifications. The lexer generated
    for the first token specification uses the input iterator to retrieve
    any available input text. Each subsequent lexer in the chain uses the
    lexer generated for the previous token specification to get its next
    input, which might be unmatched input text or a reference to an array
    containing a token label and matched text pair.
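
    The chaining described above can be sketched in a few lines of plain
    Perl. This is a minimal illustration and not the Parsers::Lexer
    implementation: the helper names make_input, make_stage and make_lexer
    are hypothetical, actions receive only the label and matched text, and
    any action result other than an array reference discards the match.

```perl
use strict;
use warnings;

# Hypothetical input iterator: hands out each chunk once, then undef.
sub make_input {
    my @chunks = @_;
    return sub { return shift @chunks };
}

# Hypothetical stage: wraps an upstream iterator with one token specification.
# Tokens from earlier stages (array refs) pass through untouched; plain text
# is cut into [label, text] tokens and leftover unmatched text.
sub make_stage {
    my ($upstream, $label, $regex, $action) = @_;
    my @pending;
    return sub {
        while (!@pending) {
            my $item = $upstream->();
            return unless defined $item;          # upstream is exhausted
            if (ref $item) { push @pending, $item; next; }

            my $pos = 0;
            while ($item =~ /$regex/g) {
                my ($start, $end) = ($-[0], $+[0]);
                next if $end == $start;           # ignore empty matches
                push @pending, substr($item, $pos, $start - $pos)
                    if $start > $pos;             # unmatched text before match
                my $text  = substr($item, $start, $end - $start);
                my $token = $action ? $action->($label, $text) : [$label, $text];
                push @pending, $token if ref $token;
                $pos = $end;
            }
            push @pending, substr($item, $pos) if $pos < length $item;
        }
        return shift @pending;
    };
}

# Chain one stage per token specification, in the order given.
sub make_lexer {
    my ($input, @specs) = @_;
    my $lexer = $input;
    $lexer = make_stage($lexer, @$_) for @specs;
    return $lexer;
}

my $lexer = make_lexer(
    make_input('y = 3 + 4'),
    [ 'VAR',   qr/[[:alpha:]]+/ ],
    [ 'NUM',   qr/\d+/ ],
    [ 'SPACE', qr/\s+/, sub { return '' } ],      # discard whitespace
    [ 'OP',    qr/[-+=\/]/ ],
);

my @seen;
while (defined(my $token = $lexer->())) {
    push @seen, ref $token ? "$token->[0]:$token->[1]" : "TEXT:$token";
}
print join(' ', @seen), "\n";    # prints "VAR:y OP:= NUM:3 OP:+ NUM:4"
```

    Because each stage only reads from the stage before it, the order of
    the token specifications sets their priority in this sketch: text
    claimed by an earlier specification never reaches a later one.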

METHODS
    new
            $Lexer = new Parsers::Lexer($Input, @TokensSpec);

        Using the specified *Input* and *TokensSpec*, the new method
        generates a lexer and returns a reference to the newly created
        Lexer object. As the examples below show, *Input* can be a text
        string, a reference to a file handle, or a reference to an
        iterator subroutine.

        Example:

            # Token specifications supplied by the caller: an array of array
            # references, each containing a TokenLabel and TokenMatchRegex
            # pair along with an optional reference to code to be executed
            # after a match.
            #
            @LexerTokensSpec = (
                [ 'LETTER', qr/[a-zA-Z]/ ],
                [ 'NUMBER', qr/\d+/ ],
                [ 'SPACE', qr/[ ]*/,
                    sub { my($This, $TokenLabel, $MatchedText) = @_; return ''; }
                ],
                [ 'NEWLINE', qr/(?:\r\n|\r|\n)/,
                    sub { my($This, $TokenLabel, $MatchedText) = @_; return "\n"; }
                ],
                [ 'CHAR', qr/./ ]
            );

            # Input string...
            $InputText = 'y = 3 + 4';
            $Lexer = new Parsers::Lexer($InputText, @LexerTokensSpec);

            # Process input stream...
            while (defined($Token = $Lexer->Lex())) {
                print "Token: " . ((ref $Token) ? "@{$Token}" : "$Token") . "\n";
            }

            # Input file...
            $InputFile = "Input.txt";
            open INPUTFILE, '<', $InputFile or die "Couldn't open $InputFile: $!\n";
            $Lexer = new Parsers::Lexer(\*INPUTFILE, @LexerTokensSpec);

            # Input file iterator...
            $InputFile = "TestSimpleCalcParser.txt";
            open INPUTFILE, '<', $InputFile or die "Couldn't open $InputFile: $!\n";
            $InputIterator = sub { return <INPUTFILE>; };
            $Lexer = new Parsers::Lexer($InputIterator, @LexerTokensSpec);

            @LexerTokensSpec = (
                [ 'VAR', qr/[[:alpha:]]+/ ],
                [ 'NUM', qr/\d+/ ],
                [ 'OP', qr/[-+=\/]/,
                    sub { my($This, $Label, $Value) = @_;
                        $Value .= "; ord: " . ord $Value;
                        return [$Label, $Value];
                    }
                ],
                [ 'NEWLINE', qr/(?:\r\n|\r|\n)/, sub { return [$_[1], 'NewLine']; } ],
                [ 'SPACE', qr/\s*/, sub { return [$_[1], 'Space']; } ],
            );

            # Look ahead without removing...
            $Token = $Lexer->Lex('Peek');
            if (defined $Token && ref $Token) {
                print "PEEK: Token: @{$Token}\n\n";
            }

            # Process input stream...
            while (defined($Token = $Lexer->Lex())) {
                print "Token: " . ((ref $Token) ? "@{$Token}" : "$Token") . "\n";
            }

    GetLex
            $LexerRef = $Lexer->GetLex();

        Returns a reference to the *Lexer* method to the caller for use in
        a specific YYLexer.
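
        One common way to hand a method to a parser driver is to return a
        closure that remembers its object. The sketch below illustrates
        that idea with a hypothetical ToyLexer class; it is an assumption
        about how such a reference can be built and used, not the
        Parsers::Lexer source.

```perl
use strict;
use warnings;

# ToyLexer is a hypothetical stand-in, not Parsers::Lexer.
package ToyLexer;

sub new {
    my ($class, @tokens) = @_;
    return bless { Tokens => [@tokens] }, $class;
}

# Return the next token, or undef when none remain.
sub Lex {
    my ($This) = @_;
    return shift @{ $This->{Tokens} };
}

# Hand out a code reference that remembers its object.
sub GetLex {
    my ($This) = @_;
    return sub { return $This->Lex(@_) };
}

package main;

my $Lexer    = ToyLexer->new([ 'NUM', '3' ], [ 'OP', '+' ]);
my $LexerRef = $Lexer->GetLex();

# A parser driver only needs the code reference, not the object itself.
my @labels;
while (defined(my $Token = $LexerRef->())) {
    push @labels, $Token->[0];
}
print "@labels\n";    # prints "NUM OP"
```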

    Lex
            $TokenRefOrText = $Lexer->Lex($Mode);
            if (ref $TokenRefOrText) {
                ($TokenLabel, $TokenValue) = @{$TokenRefOrText};
            }
            else {
                $TokenText = $TokenRefOrText;
            }

        Gets the next available token label and value pair as an array
        reference, or unrecognized text, from the input stream. Depending
        on *Mode*, the item is either removed from the input stream or
        returned by peeking ahead without removing it.

        Possible *Mode* values: *Peek, Next*. Default: *Next*.

    Next
            $TokenRefOrText = $Lexer->Next();

        Gets the next available token label and value pair as an array
        reference, or unrecognized text, from the input stream and removes
        it from the input stream.

    Peek
            $TokenRefOrText = $Lexer->Peek();

        Gets the next available token label and value pair as an array
        reference, or unrecognized text, from the input stream by peeking
        ahead without removing it from the input stream.
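
        The difference between the *Peek* and *Next* modes can be
        illustrated with a one-item look-ahead buffer around any iterator.
        This is a minimal sketch under that assumption; make_peekable is a
        hypothetical helper and the module's internals may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap an iterator with a one-item look-ahead buffer.
sub make_peekable {
    my ($iterator) = @_;
    my ($buffered, $has_buffered);
    return sub {
        my ($mode) = @_;
        $mode = 'Next' unless defined $mode;      # default mode is Next

        # Fetch from the underlying iterator only when nothing is buffered.
        unless ($has_buffered) {
            $buffered     = $iterator->();
            $has_buffered = 1;
        }
        return $buffered if $mode eq 'Peek';      # keep the item buffered

        $has_buffered = 0;                        # Next consumes the item
        return $buffered;
    };
}

my @tokens = ([ 'VAR', 'y' ], [ 'OP', '=' ], [ 'NUM', '3' ]);
my $lex = make_peekable(sub { return shift @tokens });

my $peeked = $lex->('Peek');    # look ahead; nothing is consumed
my $first  = $lex->('Next');    # same token, now removed
my $second = $lex->();          # no mode given, so Next is used
print "$peeked->[1] $first->[1] $second->[1]\n";    # prints "y y ="
```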

    StringifyLexer
            $LexerString = $Lexer->StringifyLexer();

        Returns a string containing information about the *Lexer* object.

AUTHOR
    Manish Sud <msud@san.rr.com>

SEE ALSO
    YYLexer.pm, SimpleCalcYYLexer.pm, SimpleCalcParser.yy

COPYRIGHT
    Copyright (C) 2015 Manish Sud. All rights reserved.

    This file is part of MayaChemTools.

    MayaChemTools is free software; you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation; either version 3 of the License, or (at
    your option) any later version.