Commit 0b28e78

Fixes to intermediate representations write up
1 parent 09b7a43 commit 0b28e78

10 files changed: +60 -24 lines changed

_sources/intermediate-representations.rst.txt

Lines changed: 15 additions & 11 deletions
@@ -50,7 +50,7 @@ instructions into basic blocks, and linking basic blocks through jump instructio
 equivalent, you can think of a label as indicating the start of a basic block, and a jump as ending
 a basic block.
 
-The idea is that inside a basic block, instructions executed linearly one after the other.
+The idea is that inside a basic block, instructions execute linearly one after the other.
 Each basic block ends with a branching instruction, something like a goto or a conditional jump.
 
 Here is a simple example of input source code and the IR you might see::
@@ -83,13 +83,14 @@ Each basic block begins with a label, which is just the unique name of the block
 
 * The ``jump`` instruction transfers control from a basic block to another.
 * The ``cbr`` instruction is the conditional branch. It consumes the top most value from the stack,
-and if this value is true, then control is transferred to the first block, else to the second block.
-* The ``eq`` instruction pops the topmost two values from the stack, and replaces them with integer value
-``1`` or ``0``.
+and if this value is true (in this case, a non-zero value), then control is transferred
+to the first block, else to the second block.
+* The ``eq`` instruction pops the two topmost values from the stack, compares them and pushes a result:
+``1`` for true or ``0`` for false.
 
 Advantages
 ----------
-* The IR is compact to represent in stored form as most instructions do not take have operands.
+* The IR is compact to represent in stored form as most instructions do not have operands.
 This is a reason why many languages choose to encode their compiled code in
 this form. Examples are Java, C#, Web Assembly.
 * The IR can be executed easily by an Interpreter.
@@ -98,6 +99,8 @@ Advantages
 Disadvantages
 -------------
 * Not easy to implement optimizations.
+* For a reader it is hard to trace values as they flow through instructions,
+as it requires tracking them through a conceptual stack.
 * Harder to analyze the IR, although there are methods available to do so.
 
 Examples
@@ -127,13 +130,13 @@ Produces::
 The instructions above are as follows:
 
 * ``%t1 = n+1`` - is a typical three-address instruction of the form ``result = value1 operator value2``. The name ``%t1``
-refers to a temporary, whereas ``n`` refers to the input argument ``n``.
+refers to a temporary, whereas ``n`` refers to the input argument ``n``. Both of these names are virtual registers.
 * ``ret %t1`` - is the return instruction, in this instance it references the temporary.
 
 The virtual registers in the IR are so called because they do not map to real registers in the target physical machine.
 Instead these are just named slots in the abstract machine responsible for executing the IR. Typically, the abstract machine
 will assign each virtual register a unique location in its stack frame. So we still end up using the function's
-stack frame, but the IR references locations within the stack frame via these virtual names, rather than implicitly
+stack frame, but the IR references locations within the stack frame directly using these virtual names, rather than implicitly
 through push and pop instructions. During optimization some of the virtual registers will end up in real hardware registers.
 
 Control flow is represented the same way as for the stack IR. Revisiting the same source example from above, we get following
@@ -157,7 +160,8 @@ IR::
 Advantages
 ----------
 * Readability: the flow of values is easier to trace, whereas with a stack IR you need to conceptualize a stack somewhere,
-and track values being pushed and popped.
+and track values being pushed and popped.
+* Fewer instructions are needed compared to stack IR.
 * The IR can be executed easily by an Interpreter.
 * Most optimization algorithms can be applied to this form of IR.
 * The IR can represent Static Single Assignment (SSA) in a natural way.
@@ -178,15 +182,15 @@ Sea of Nodes IR
 ===============
 The final example we will look at is known as the Sea of Nodes IR.
 
-It is quite different from the IRs we described above.
+This IR is quite different from the IRs we described above.
 
 The key features of this IR are:
 
 * Instructions are NOT organized into Basic Blocks - instead, intructions form a graph, where
 each instruction has as its inputs the definitions it uses.
 * Instructions that produce data values are not directly bound to a Basic Block, instead they "float" around,
-the order being defined purely in terms of the dependencies between the instructions.
-* Control flow is also represented in the same way, and control flows between control flow
+the order being defined purely in terms of the dependencies between the instructions.
+* Control flow is represented in a similar way, and control flows between control flow
 instructions. Dependencies between data instructions and control intructions occur at few well
 defined places.
 * The IR as described above cannot be readily executed, because to execute the IR, the instructions

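To illustrate the ``jump``, ``cbr`` and ``eq`` semantics described in the intermediate-representations.rst changes above, here is a minimal interpreter sketch in Python. It is not taken from the commit: the ``push`` and ``ret`` instructions, the tuple encoding of instructions, the block labels and the ``run`` function are all assumptions made for this example.

```python
def run(blocks, entry):
    """Interpret a tiny stack IR. `blocks` maps a label to a list of instructions."""
    stack = []
    label = entry
    while True:
        for op, *args in blocks[label]:
            if op == "push":      # assumed helper instruction: push a constant
                stack.append(args[0])
            elif op == "eq":      # pop the two topmost values, push 1 (true) or 0 (false)
                b = stack.pop()
                a = stack.pop()
                stack.append(1 if a == b else 0)
            elif op == "jump":    # unconditional transfer to another block
                label = args[0]
                break
            elif op == "cbr":     # pop the condition; non-zero selects the first block
                cond = stack.pop()
                label = args[0] if cond != 0 else args[1]
                break
            elif op == "ret":     # assumed: return the value on top of the stack
                return stack.pop()
            else:
                raise ValueError(f"unknown instruction: {op}")
        else:
            # The loop finished without a branch, so the block is malformed.
            raise ValueError(f"block {label} does not end with a branch or ret")


# Hypothetical program: compare 1 with 1, then branch to L2 (true) or L3 (false).
program = {
    "L1": [("push", 1), ("push", 1), ("eq",), ("cbr", "L2", "L3")],
    "L2": [("push", 2), ("jump", "L4")],
    "L3": [("push", 3), ("jump", "L4")],
    "L4": [("ret",)],
}
result = run(program, "L1")
```

Here ``eq`` leaves ``1`` on the stack, so ``cbr`` transfers control to ``L2`` and ``result`` is ``2``. Only ``push`` and the two branch instructions carry operands in this sketch, which is the compactness the write-up attributes to stack IRs.
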
abstract-syntax-tree.html

Lines changed: 4 additions & 0 deletions
@@ -73,6 +73,10 @@ <h3>Table of Contents</h3>
 <li class="toctree-l1"><a class="reference internal" href="type-systems.html">Type Systems</a></li>
 <li class="toctree-l1"><a class="reference internal" href="semantic-analysis.html">Semantic Analysis</a></li>
 </ul>
+<p class="caption" role="heading"><span class="caption-text">Backend Basics</span></p>
+<ul>
+<li class="toctree-l1"><a class="reference internal" href="intermediate-representations.html">Intermediate Representations</a></li>
+</ul>
 <p class="caption" role="heading"><span class="caption-text">Learning Resources</span></p>
 <ul>
 <li class="toctree-l1"><a class="reference internal" href="learning-resources.html">Learning Resources</a></li>

compiler-books.html

Lines changed: 4 additions & 0 deletions
@@ -175,6 +175,10 @@ <h3>Table of Contents</h3>
 <li class="toctree-l1"><a class="reference internal" href="type-systems.html">Type Systems</a></li>
 <li class="toctree-l1"><a class="reference internal" href="semantic-analysis.html">Semantic Analysis</a></li>
 </ul>
+<p class="caption" role="heading"><span class="caption-text">Backend Basics</span></p>
+<ul>
+<li class="toctree-l1"><a class="reference internal" href="intermediate-representations.html">Intermediate Representations</a></li>
+</ul>
 <p class="caption" role="heading"><span class="caption-text">Learning Resources</span></p>
 <ul>
 <li class="toctree-l1"><a class="reference internal" href="learning-resources.html">Learning Resources</a></li>

intermediate-representations.html

Lines changed: 13 additions & 9 deletions
@@ -86,7 +86,7 @@ <h2>Stack-Based IR<a class="headerlink" href="#stack-based-ir" title="Permalink
 instructions into basic blocks, and linking basic blocks through jump instructions. These two approaches are
 equivalent, you can think of a label as indicating the start of a basic block, and a jump as ending
 a basic block.</p>
-<p>The idea is that inside a basic block, instructions executed linearly one after the other.
+<p>The idea is that inside a basic block, instructions execute linearly one after the other.
 Each basic block ends with a branching instruction, something like a goto or a conditional jump.</p>
 <p>Here is a simple example of input source code and the IR you might see:</p>
 <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">func</span> <span class="n">foo</span><span class="p">()</span><span class="o">-&gt;</span><span class="n">Int</span>
@@ -118,14 +118,15 @@ <h2>Stack-Based IR<a class="headerlink" href="#stack-based-ir" title="Permalink
 <ul class="simple">
 <li><p>The <code class="docutils literal notranslate"><span class="pre">jump</span></code> instruction transfers control from a basic block to another.</p></li>
 <li><p>The <code class="docutils literal notranslate"><span class="pre">cbr</span></code> instruction is the conditional branch. It consumes the top most value from the stack,
-and if this value is true, then control is transferred to the first block, else to the second block.</p></li>
-<li><p>The <code class="docutils literal notranslate"><span class="pre">eq</span></code> instruction pops the topmost two values from the stack, and replaces them with integer value
-<code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code>.</p></li>
+and if this value is true (in this case, a non-zero value), then control is transferred
+to the first block, else to the second block.</p></li>
+<li><p>The <code class="docutils literal notranslate"><span class="pre">eq</span></code> instruction pops the two topmost values from the stack, compares them and pushes a result:
+<code class="docutils literal notranslate"><span class="pre">1</span></code> for true or <code class="docutils literal notranslate"><span class="pre">0</span></code> for false.</p></li>
 </ul>
 <section id="advantages">
 <h3>Advantages<a class="headerlink" href="#advantages" title="Permalink to this headline"></a></h3>
 <ul class="simple">
-<li><p>The IR is compact to represent in stored form as most instructions do not take have operands.
+<li><p>The IR is compact to represent in stored form as most instructions do not have operands.
 This is a reason why many languages choose to encode their compiled code in
 this form. Examples are Java, C#, Web Assembly.</p></li>
 <li><p>The IR can be executed easily by an Interpreter.</p></li>
@@ -136,6 +137,8 @@ <h3>Advantages<a class="headerlink" href="#advantages" title="Permalink to this
 <h3>Disadvantages<a class="headerlink" href="#disadvantages" title="Permalink to this headline"></a></h3>
 <ul class="simple">
 <li><p>Not easy to implement optimizations.</p></li>
+<li><p>For a reader it is hard to trace values as they flow through instructions,
+as it requires tracking them through a conceptual stack.</p></li>
 <li><p>Harder to analyze the IR, although there are methods available to do so.</p></li>
 </ul>
 </section>
@@ -168,13 +171,13 @@ <h2>Register Based IR or Three-Address IR<a class="headerlink" href="#register-b
 <p>The instructions above are as follows:</p>
 <ul class="simple">
 <li><p><code class="docutils literal notranslate"><span class="pre">%t1</span> <span class="pre">=</span> <span class="pre">n+1</span></code> - is a typical three-address instruction of the form <code class="docutils literal notranslate"><span class="pre">result</span> <span class="pre">=</span> <span class="pre">value1</span> <span class="pre">operator</span> <span class="pre">value2</span></code>. The name <code class="docutils literal notranslate"><span class="pre">%t1</span></code>
-refers to a temporary, whereas <code class="docutils literal notranslate"><span class="pre">n</span></code> refers to the input argument <code class="docutils literal notranslate"><span class="pre">n</span></code>.</p></li>
+refers to a temporary, whereas <code class="docutils literal notranslate"><span class="pre">n</span></code> refers to the input argument <code class="docutils literal notranslate"><span class="pre">n</span></code>. Both of these names are virtual registers.</p></li>
 <li><p><code class="docutils literal notranslate"><span class="pre">ret</span> <span class="pre">%t1</span></code> - is the return instruction, in this instance it references the temporary.</p></li>
 </ul>
 <p>The virtual registers in the IR are so called because they do not map to real registers in the target physical machine.
 Instead these are just named slots in the abstract machine responsible for executing the IR. Typically, the abstract machine
 will assign each virtual register a unique location in its stack frame. So we still end up using the function’s
-stack frame, but the IR references locations within the stack frame via these virtual names, rather than implicitly
+stack frame, but the IR references locations within the stack frame directly using these virtual names, rather than implicitly
 through push and pop instructions. During optimization some of the virtual registers will end up in real hardware registers.</p>
 <p>Control flow is represented the same way as for the stack IR. Revisiting the same source example from above, we get following
 IR:</p>
@@ -198,6 +201,7 @@ <h3>Advantages<a class="headerlink" href="#id1" title="Permalink to this headlin
 <ul class="simple">
 <li><p>Readability: the flow of values is easier to trace, whereas with a stack IR you need to conceptualize a stack somewhere,
 and track values being pushed and popped.</p></li>
+<li><p>Fewer instructions are needed compared to stack IR.</p></li>
 <li><p>The IR can be executed easily by an Interpreter.</p></li>
 <li><p>Most optimization algorithms can be applied to this form of IR.</p></li>
 <li><p>The IR can represent Static Single Assignment (SSA) in a natural way.</p></li>
@@ -223,14 +227,14 @@ <h3>Examples<a class="headerlink" href="#id3" title="Permalink to this headline"
 <section id="sea-of-nodes-ir">
 <h2>Sea of Nodes IR<a class="headerlink" href="#sea-of-nodes-ir" title="Permalink to this headline"></a></h2>
 <p>The final example we will look at is known as the Sea of Nodes IR.</p>
-<p>It is quite different from the IRs we described above.</p>
+<p>This IR is quite different from the IRs we described above.</p>
 <p>The key features of this IR are:</p>
 <ul class="simple">
 <li><p>Instructions are NOT organized into Basic Blocks - instead, intructions form a graph, where
 each instruction has as its inputs the definitions it uses.</p></li>
 <li><p>Instructions that produce data values are not directly bound to a Basic Block, instead they “float” around,
 the order being defined purely in terms of the dependencies between the instructions.</p></li>
-<li><p>Control flow is also represented in the same way, and control flows between control flow
+<li><p>Control flow is represented in a similar way, and control flows between control flow
 instructions. Dependencies between data instructions and control intructions occur at few well
 defined places.</p></li>
 <li><p>The IR as described above cannot be readily executed, because to execute the IR, the instructions

lexical-analysis.html

Lines changed: 4 additions & 0 deletions
@@ -113,6 +113,10 @@ <h3>Table of Contents</h3>
 <li class="toctree-l1"><a class="reference internal" href="type-systems.html">Type Systems</a></li>
 <li class="toctree-l1"><a class="reference internal" href="semantic-analysis.html">Semantic Analysis</a></li>
 </ul>
+<p class="caption" role="heading"><span class="caption-text">Backend Basics</span></p>
+<ul>
+<li class="toctree-l1"><a class="reference internal" href="intermediate-representations.html">Intermediate Representations</a></li>
+</ul>
 <p class="caption" role="heading"><span class="caption-text">Learning Resources</span></p>
 <ul>
 <li class="toctree-l1"><a class="reference internal" href="learning-resources.html">Learning Resources</a></li>

prelim-impl-lang.html

Lines changed: 4 additions & 0 deletions
@@ -87,6 +87,10 @@ <h3>Table of Contents</h3>
 <li class="toctree-l1"><a class="reference internal" href="type-systems.html">Type Systems</a></li>
 <li class="toctree-l1"><a class="reference internal" href="semantic-analysis.html">Semantic Analysis</a></li>
 </ul>
+<p class="caption" role="heading"><span class="caption-text">Backend Basics</span></p>
+<ul>
+<li class="toctree-l1"><a class="reference internal" href="intermediate-representations.html">Intermediate Representations</a></li>
+</ul>
 <p class="caption" role="heading"><span class="caption-text">Learning Resources</span></p>
 <ul>
 <li class="toctree-l1"><a class="reference internal" href="learning-resources.html">Learning Resources</a></li>
