-
Notifications
You must be signed in to change notification settings - Fork 14
/
label-translation.html
249 lines (231 loc) · 16.1 KB
/
label-translation.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
<!Doctype html>
<html lang="en">
<head>
<title>Label Translation</title>
<meta charset="UTF-8">
<!--<link rel="stylesheet" href="css/bootstrap.min.css">-->
<link rel="stylesheet" href="css/style_new.css">
<script src="js/jquery-1.12.1.min.js" charset="utf-8"></script>
<link rel="stylesheet" href="js/embed-2cd369fa1c0830bd3aa06c21d4f14a13e060d2d31bbaae740f4af4.css"><div id="gist28627206" class="gist">
<link rel="stylesheet" href="js/embed-cbe5b40fa72b0964f90d4919c2da8f8f94d7c9f6c2aa49c07f6fa3.css"><div id="gist28627206" class="gist">
<script src="js/bootstrap.min.js" charset="utf-8"></script>
</head>
<div class="container">
<header id="navtop">
<a href="index.html" class="logo fleft"><img src="img/logo.png" alt=""></a>
<nav class="fright">
<ul>
<li><a href="index.html">Home</a></li>
<li><a href="about.html">About</a></li>
<!-- <li><a href="help.html">Help</a></li> -->
<li><a href="roadmap.html">Roadmap</a></li>
<li><a href="documentation.html" class="navactive">Documentation</a></li>
</ul>
</nav>
</header>
<body>
<div class="Services-page main grid-wrap">
<header class="grid col-full">
<hr/>
<p class="fleft">Label Translation (Linking<span style="color: red;">*</span>)</p>
<br>
<br>
</header>
<aside class="grid col-one-quarter mq2-col-full sticky_sidebar" style="position: static;">
<menu>
<ul>
<li><a class="sec" href="#nav-introduction">Introduction</a></li>
<li><a class="sec" href="#nav-illustration">Illustration</a></li>
</ul>
</menu>
<!-- <a class="button" href="">Download as PDF</a> -->
</aside>
<section class="grid col-three-quarters mq2-col-full">
<div class="grid-wrap">
<article class="grid col-full" id="nav-introduction">
<h2>Introduction</h2>
<p>
Labels are generated when the compiler translates the high level program constructs such as <i>if-then-else</i>, <i>while-do</i> and <i>function calls</i>
into target machine code. Labels are needed because the target address of a JUMP instruction may not be known
at the time of generating code (Why?). However these labels must be replaced with target addresses in the final target code.
Thus the compiler designer needs to write a program to replace all the labels in the program with the correct memory
addresses eventually. This conversion is called Label translation.
</p>
<p>
The <b>input to the label translator program is the machine code with labels</b> (generated by <a href="codegen.html"> codeGen()</a>). The <b>output is the target machine code with the labels replaced with corresponding addresses</b>.
</p>
<p>
The main idea behind label translation is to execute the following two steps.<br>
<ul>
<li id="otis">1. The input ( machine code with labels ) is parsed and a table in which the labels are mapped with
their corresponding addresses is created.</li>
<li id="otis">2. Parse the input again and replace all the labels with their corresponding addresses by
looking into the table created above.</li>
</ul>
</p>
<p>
<b><span style="color: red;">*</span>Note:</b> In real systems, the compiler generates an <a href="https://en.wikipedia.org/wiki/Object_file" target="_blank"> object file</a> which contains
labels for functions as well as variables. The object file is processed by a separate software module called the
<a href="https://en.wikipedia.org/wiki/Linker_(computing)" target="_blank">linker</a> which converts the object file
into an excutable file. The advantage of this scheme is that a huge program could be divided into different stand alone
source modules, each compiled seperately
and finally linked together. A change in one module does not require recompilation of other modules and only the
linking phase needs to be run again. In this project, we keep the linking task as a part of the compilation itself as
ExpL does not allow a program to be compiled into several object modules.
</p>
<p>
In the ExpL project, variable addresses are resolved before reaching the Label translation phase by the compiler itself.
Hence the labels remaining will correspond to
<br>
a) The label of the first instruction to be executed (MAIN:),
<br>
b) Labels placed at the beginning of assembly instructions to which
there is some JMP/JZ/JNZ from somewhere.
<br>
c) Labels placed at the beginning of functions. (The compiler
translates a call to the function with a CALL instruction with this label).
</p>
</article>
<article class="grid col-full" id="nav-illustration">
<h2>Illustration</h2>
<p>
The above procedure is demonstrated with the help of a program snippet.
</p>
<div class="col-one-half">
<script src="js/36a210c1c92cd7ac6d45eb119b4d0020.js"></script>
</div>
<div class="grid col-full">
<p>
The code generated after the <a href="codegen.html"> codeGen()</a> phase is shown in the figure 1 ( It consists of labels ).
</p>
<div class="fleft col-one-third" id="fig1">
<p>
Figure 1
<script src="js/cbc271f4cff409923ac1989a9433bdfe.js"></script>
</p>
</div>
<div style="padding-right: 15em" class="fright col-one-third" id="fig2">
<p>
Figure 2
</p>
<script src="js/d2de8b9cd8a88a5054c9ead9e697e99a.js"></script>
</div>
</div>
<div>
<p>
In the machine code generated after codegen section, labels occur in two different ways.<br>
<ul>
<li id="otis">1. Label declarations in which labels are followed by semicolon (:).</li>
<li id="otis"><b>Examples</b> : F0: is a label for the instruction at address 2066, L0: for 2102, L1: for 2162 and MAIN: for 2186 as in <a href="#fig1">figure1</a></li>.
<li id="otis">2. Instructions which contain labels in them namely, JMP, JZ, JNZ and CALL.</li>
<li id="otis"><b>Examples</b> : JMP L1 at address 2100, JZ R0,L3 at address 2248, CALL MAIN at address 2062 etc., as in <a href="#fig1">figure1</a>.</li>
</ul>
</p>
<p>
Now, the label translation procedure can be explained with the help of an example label F0 from the <a href="#fig1">figure1</a>.<br>
<br>
First <b>"F0:"</b> needs to be recognized as a label and the memory address of the label occurence need to be stored in a table called the <b>Label-Address Table</b>. In the above example, the address corresponding to <b>F0:</b> is 2066.
We need to parse the entire machine code to identify all the labels in the program and store all the (label, address) pairs in the Label-Address table. We can remove the label F0: (and other labels) from the program once the corresponding (label,address) pair is entered into the label-address table.
</p>
<p>
The next task is to replace labels occuring in instructions with the addresses of the labels from the label-address table. Refering to the <a href="#fig1">figure1</a>, there are two instructions that use the label F0: <b> CALL F0 occuring at address 2136 </b> and <b>CALL F0 occuring at address 2302</b>. We will replace the labels in these instructions with the address of F0 (2066) by looking up the label-address table. Thus, after translation, both these instructions will translate to <b>CALL 2066</b> as shown in <a href="#fig2">figure2</a>.
</p>
<p>
The label translator program needs to parse the entire machine code two times. In the first parse we identify all the label declarations and the memory addresses in the label-address table.
The table constructed after the first parse of the above code is shown below.
<p>
<img src="img/label-table.png">
</p>
<p>
In the second parse, we remove the label declarations and replace the labels in instructions like JUMP, CALL etc. with the corresponding memory address from the label-address table.
</p>
<p>
The target code after the label translation is shown in the <a href="#fig2"> figure 2</a>.
</p>
<hr>
</div>
<div>
<h3>Implementing Label Translation</h3>
<!-- The above two pass translation process can be implemented using a single Lex program.The following functions provided by Lex can be useful while implementing the parser :
<ul>
<li id="otis">1. <b>yyless(k)</b> : Returns all but the first n characters of the current token back to the input stream, where they will be rescanned when the scanner looks for the next match. <b>yytext</b> and <b>yyleng</b> are adjusted appropriately (e.g., yyleng will now be equal to n ).<br>
For example, on the input "foobar" the following will write out "foobarbar":<br>
<div class="syntax">
%%<br>
foobar ECHO; yyless(3);<br>
[a-z]+ ECHO;<br>
</div>
An argument of 0 to yyless will cause the entire current input string to be scanned again. Unless you've changed how the scanner will subsequently process its input, this will result in an endless loop. Note that yyless is a macro and can only be used in the flex input file, not from other source files.</li>
<li id="otis">2. <b>yytext+k</b> : ignores the first k characters in yytext.</li>
</ul>
-->
<p>
As noted previously, the input to the label translator program is the machine code with labels (generated by <a href="codegen.html"> codeGen()</a>). The output is the target machine code with the labels replaced with corresponding addresses.
It is easy to design a simple lex program which does two passes on the input file
and perform label translation.
</p>
<p>
The first pass must collect all the label declarations and simultaneously keep track of the addresses of the
instructions corresponding to the labels. Since the pattern "<i>letter.(letter|digit)*:</i>"
identifies a label, it is easy to design a LEX rule to recognize all label declarations in the program.
The compiler must ensure that label names are different from instructions, to avoid confusion.
</p>
<p>
Since each XSM instruction takes two words of storage, and the first instruction is at address 2056,
the first pass can track the address corresponding to each label declaration
as 2056 + 2 *(InstructionNumber-1).
</p>
<p>
The program can maintain a simple linked list into
which each label address pair found is entered. The label declarations can be
removed from the input program, once it's table entry is made.
</p>
<p>
In the second pass, the program must "search and replace" each label occuring in the
control transfer instructions (JMP, JZ, JNZ and CALL)
with the corresponding address in the label table. Since the instruction set is fixed and labels
are lexically different from instructions, a simple set of LEX rules can identify all labels
and perform the replacement.
</p>
<p>
We leave the implementation details to you.
</p>
<p>
<b>Question</b> :<br>
What is the difficulty in implementing the label translation using single pass algorithm?
</p>
</div>
<!-- <script src="https://gist.github.com/vishnupriyam/6cde145e400e15cfda5f1a65366a264f.js"></script> -->
<!-- <script src="js/6ce685d686f903d3e976dd8b20b11763.js"></script> -->
<!-- <p>
The XSM instructions for the above while code (lines 11-12) will be as follows:
</p> -->
<!-- <script src="js/330a763b31980649846c3a2cbb103113.js"></script> -->
</article>
</section>
</div>
</div>
</body>
<footer class="center part clearfix">
<ul class="grid col-one-third social">
<li><a href="http://github.com/silcnitc">Github</a></li>
<li> <a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="img/creativecommons.png" /></a></li>
</ul>
<div class="grid col-one-third" style="color:black;">
<p style="font-weight: bold;">Contributed By : <a href="https://www.linkedin.com/in/dattathallam">Thallam Sai Sree Datta</a>, <a href="https://www.linkedin.com/in/n-ruthviik-0a0539100">N Ruthvik</a>
</p>
</div>
<nav class="grid col-one-third ">
<ul >
<li><a href="index.html">Home</a></li>
<li><a href="about.html">About</a></li>
<!-- <li><a href="uc.html">Contact</a></li> -->
</ul>
</nav>
<br>
</footer>
<script>window.jQuery || document.write('<script src="js/jquery-1.7.2.min.js"><\/script>')</script>
<script src="js/scripts.js"></script>
<script src="js/inject.js"></script>
</html>