-
Notifications
You must be signed in to change notification settings - Fork 0
/
chapter3.html
266 lines (201 loc) · 9.68 KB
/
chapter3.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
<!DOCTYPE html>
<html><head>
<title>Chapter 3 - A small language for declarations</title>
<meta charset='utf-8'>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link href="//netdna.bootstrapcdn.com/twitter-bootstrap/2.2.2/css/bootstrap-combined.min.css" rel="stylesheet">
<link href="style.css" rel="stylesheet">
</head>
<body>
<div class="container">
<div class="row">
<div class="offset2 span8">
<p>Chapter 3</p>
<h1>A small language for declarations</h1>
<p><a href="index.html">Top</a></p>
<p>In this chapter we'll look a small language thats helps a declare which people
or groups of people are allowed to use something.</p>
<h3>Keywords</h3>
<p>Let's take a look at a small declarative part of a language with two keywords: 'Deny', 'Allow'.</p>
<pre class="prettyprint linenums"><code>Deny baduser
Allow admin
</code></pre>
<p>The semantics (or meaning) of the language could be anything. Parsing the
language can be really easy with Marpa.</p>
<pre class="prettyprint linenums"><code>:start ::= rules
rules ::= rule+
rule ::= cmd_type username
cmd_type ~ 'Deny' | 'Allow'
username ~ [\w]+
:discard ~ ws
ws ~ [\s]+
</code></pre>
<p class='example-filename'><a href='examples/deny-allow.pl'>examples/deny-allow.pl</a></p>
<pre class="prettyprint linenums"><code>Trying to parse:
Deny baduser
Allow admin
Output:
$VAR1 = [
[
'Deny',
'baduser'
],
[
'Allow',
'admin'
]
];
</code></pre>
<p>The output shows that Marpa parsed two rules. The first rule with the
'Deny' keyword and the second with the 'Allow' keyword. It seems that the
keywords could interfere with the usernames. The <code>cmd_type</code>s are a subset of
the usernames.</p>
<p>Let's see what happens when we change the first username from baduser to Allow.
Can Marpa see the difference between keywords and usernames?</p>
<pre class="prettyprint linenums"><code>Output:
$VAR1 = [
[
'Deny',
'Allow'
],
[
'Allow',
'admin'
]
];
</code></pre>
<p>It doesn't matter that the first username matches a <code>cmd_type</code>. Marpa knows
that the word 'Allow' could also be a username.</p>
<p>Now let's see what happens when we use an input that doesn't match by switching
around the username and the keyword. A keyword could be a username, but it
can't be the other way around.</p>
<p>I change the input to:</p>
<pre class="prettyprint linenums"><code>baduser Deny
admin Allow
</code></pre>
<p>When we run the examples we get the following output:</p>
<pre class="prettyprint linenums"><code>Error in SLIF G1 read: No lexemes accepted at position 0
* String before error: baduser
* The error was at line 1, column 8, and at character 0x0020 (non-graphic character), ...
* here: Deny\nadmin Allow\n
Marpa::R2 exception at examples/deny-allow.pl line 33.
</code></pre>
<p>Marpa tries to point to the error as best as it can. Marpa starts with the
problem: <code>No lexemes accepted at position 0</code>. This is Marpa's way of telling us
that Marpa couldn't find a way to match <code>baduser</code> to one of the expected
lexemes, <code>Deny</code> or <code>Allow</code>.</p>
<h3>More syntax</h3>
<p>Now let's add a way to specify groups of users. The syntax could look like this.</p>
<pre class="prettyprint linenums"><code>admins = admin root
Deny baduser
Allow @admins
</code></pre>
<p>The first line specifies a list of admin users. The second line stays the same
and the third line contains a reference to the <code>admins</code> list of users. The <code>@</code>
operators makes it a reference.</p>
<p>We start by changing the input in the file. We run the code and find that it
doesn't parse. That's what we expected.</p>
<p>Now we need to add a rule for parsing the <code>admins = ...</code> line. In this case
<code>admins</code> is similar to a username so let's use that.</p>
<pre class="prettyprint linenums"><code>user_list ::= username '=' username username
</code></pre>
<p>This line should work. We add it in the grammar. Let's add it to the
<code>rule</code> rule as an alternative. We use the <code>|</code> (or) operator.</p>
<pre class="prettyprint linenums"><code>rule ::= cmd_type username
| user_list
</code></pre>
<p>When we run this code we get the following:</p>
<pre class="prettyprint linenums"><code>Lexing failed at unacceptable character 0x0040 '@'
Marpa::R2 exception at examples/deny-allow2.pl line 37.
</code></pre>
<p>Marpa doesn't like the '@' character in the third line. Let's add that.
We add another line to <code>rule</code>. The reference is like a username.</p>
<pre class="prettyprint linenums"><code>rule ::= cmd_type username
| cmd_type '@' username
| user_list
</code></pre>
<p class='example-filename'><a href='examples/deny-allow2.pl'>examples/deny-allow2.pl</a></p>
<p>When we run this code, the parse succeeds, but the result is not completely right.</p>
<pre class="prettyprint linenums"><code>[
'Allow',
'@',
'admins'
]
</code></pre>
<p>The <code>@</code> and <code>admins</code> is parsed in two parts. We like the reference to be a
single thing. Another problem with this version is that there can be spaces
between the '@' and the name after it. This is not what we mean. The '@' and
the name should always be part of the same lexeme. We change this by making a
lexical rule for the <code>list_ref</code>.</p>
<p><strong>When two tokens should never be separated, make them part of the same lexical rule.</strong></p>
<pre class="prettyprint linenums"><code>rule ::= cmd_type username
| cmd_type list_ref
| user_list
list_ref ~ '@' username
</code></pre>
<p class='example-filename'><a href='examples/deny-allow3.pl'>examples/deny-allow3.pl</a></p>
<p>Now let's run <code>deny-allow3.pl</code>.</p>
<pre class="prettyprint linenums"><code>Symbol <username> is a lexeme in G1, but not in G0.
This may be because <username> was used on a RHS in G0.
A lexeme cannot be used on the RHS of a G0 rule.
Marpa::R2 exception at examples/deny-allow3.pl line 6.
</code></pre>
<p>This gives us a look into the inner workings of Marpa. This error message says
that we can't use a lexeme in both the structural rules (G1) and lexical rules
(G0). We need to create two lexical rules: one for <code>rule</code> and <code>user_list</code> and
one for <code>list_ref</code>.</p>
<pre class="prettyprint linenums"><code>rule ::= cmd_type user
| cmd_type list_ref
| user_list
user_list ::= user '=' user user
list_ref ~ '@' username
user ~ username
</code></pre>
<p class='example-filename'><a href='examples/deny-allow4.pl'>examples/deny-allow4.pl</a></p>
<p>With these changes we can parse the input without problems. The output shows
what we expected.</p>
<h3>Multiple users</h3>
<p>Even though the code works, it could be that you saw a problem with this. Ask
yourself what would happen if you specify more or less than two users on the
right side of a <code>user_list</code>? Try it.</p>
<p>The parser doesn't know what to do with the extra user in the list.
Let's change <code>user_list</code> to allow multiple users.</p>
<pre class="prettyprint linenums"><code>user_list ::= user '=' users
users ::= user+
</code></pre>
<p class='example-filename'><a href='examples/deny-allow5.pl'>examples/deny-allow5.pl</a></p>
<p>This change now allows us to specify multiple users in a user_list. The change
that allows us to specify multiple users on an <code>Allow</code> or <code>Deny</code> line is
featured in Exercise 1.1.</p>
<p>With that change this language allows us to easily specify users that are
allowed or denied access to a particular thing. The design of the language
leaves out the specification of that particular thing to keep the design simple
and general.</p>
<h3>Exercise</h3>
<ol>
<li><p><strong>Multiple users</strong> — Change the grammar to allow multiple users on the
<code>Allow</code> or <code>Deny</code> line. For example:</p>
<pre class="prettyprint linenums"><code>Allow root admin
Deny baduser cracker
</code></pre></li>
<li><p><strong>All users</strong> — Add a rule that allows you to specify <code>all</code> or
'everybody' as a rule that denies or allows everyone.</p></li>
<li><p><strong>Implementation</strong> — Implement a class <code>Authorization</code> with a method
<code>CanAccess($username)</code> that return true if a user is allowed access and false
if the user is denied access. The contructor of the class takes a source string
in the language that we specified above. The rules that the parser creates
should be checked in sequence.</p></li>
</ol>
<h2>Next step</h2>
<p><a href="chapter4.html">Chapter 4: Parsing an expression</a></p>
</div>
</div>
<div class="row footer">
<div class="offset2 span8">
<p>
<a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/"><img alt="Creative Commons License" style="float:left; margin-right:12px; border-width:0" src="http://i.creativecommons.org/l/by-sa/3.0/88x31.png" /></a><span xmlns:dct="http://purl.org/dc/terms/" property="dct:title">Marpa Guide</span> by <span xmlns:cc="http://creativecommons.org/ns#" property="cc:attributionName">Peter Stuifzand and Contributors</span> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">Creative Commons Attribution-ShareAlike 3.0 Unported License</a> and the <a rel="license" href="http://www.gnu.org/copyleft/fdl.html">GNU Free Documentation License 1.3</a></p>
</div>
</div>
</div>
<script src="//netdna.bootstrapcdn.com/twitter-bootstrap/2.2.2/js/bootstrap.min.js"></script>
</body></html>