On Github tjmonsi / cmsc129-slide2
Created by TJ Monserrat
For the whole semester: We will learn how to design and implement compilers.
Programs that can turn everyday human language to things that can be executed by computers
Takes in Source Code or Character Stream and turns it into an array of tokens or Lexemes
Each token is in the form of:
[token-name, attribute-value]
These tokens will be passed to the Syntax Analyzer
For Example
position = initial + rate * 60
For Example
position = initial + rate * 60
For Example
position = initial + rate * 60
For Example
position = initial + rate * 60
This makes the example:
position = initial + rate * 60
into...
Lexemes:
[ [id, 0], [eq], [id, 1], [add], [id, 2], [mul], [number, 3] ]
SymbolTable: {
0: { value: 'position', ... },
1: { value: 'initial', ... },
2: { value: 'rate', ... },
3: { value: 60, ... }
}
The Lexemes...
[ [id, 0], [eq], [id, 1], [add], [id, 2], [mul], [number, 3] ]
...will need to be fed to the syntax analyzer to produce a syntax tree like this...
[eq]
|--------
| |
[id, 0] [add]
|--------
| |
[id, 2] [mul]
|---------
| |
[id, 3] [number, 60]
To be able to do this...
[eq]
|--------
| |
[id, 0] [add]
|--------
| |
[id, 2] [mul]
|---------
| |
[id, 3] [number, 60]
We need to know how to parse them using Context-free grammars.
[eq]
|--------
| |
[id, 0] [add]
|--------
| |
[id, 2] [mul]
|---------
| |
[id, 3] [number, 60]
The system can create an intermediate code before translating it into the target program code.
t1 = 60 t2 = id3 * t1 t3 = id2 + t2 id1 = t3
t1 = 60 t2 = id3 * t1 t3 = id2 + t2 id1 = t3
We need to transform parse trees into a three-address code sequence. Note that three-address code sequences will be of the following...
t1 = 60 t2 = id3 * t1 t3 = id2 + t2 id1 = t3
We can then transform the intermediate code generation to the target code that can be understood by the system
LDF R2, id3 MULF R2, R2, #60.0 LDF R1, id2 ADDF R1, R1, R2 STF id1, R1
t1 = 60 t2 = id3 * t1 t3 = id2 + t2 id1 = t3
But for our class, we need to transform the intermediate code above to our pseudo-ASM that we will discuss next time.
load t1 60 load t2 mul id3 t1 load t3 add id2 t2 load id1 t3
An essential function of a compiler is to record the variable names used in the source program and collect information about various attributes of each name.
These attributes may provide information about the storage allocated for:
Things you need to ponder when designing a language:
We should consider static-scope rules for a language with blocks, where a block is a grouping of declarations and statements.
const y = 10;
const x = (z, a, b = 10) => {
let q = 0;
const m = (p) => {
return p + 5;
}
q = (b === 10) ? m(z) + a : m(b);
setTimeout(() => {
y = q + 5
}, 5000);
}
x(y, 25);
const y = 10;
const x = (z, a, b = 10) => {
...
}
x(y, 25);
// Outer scope has variables y and x, where x is a function
const x = (z, a, b = 10) => {
let q = 0;
const m = (p) => {
...
}
q = (b === 10) ? m(z) + a : m(b);
setTimeout(() => {
...
}, 5000);
}
// the function x has parameter values z, a, b, q, and m,
// where m is also a function
const y = 10;
const x = (z, a, b = 10) => {
let q = 0;
const m = (p) => {
...
}
q = (b === 10) ? m(z) + a : m(b);
setTimeout(() => {
...
}, 5000);
}
// Take note that function x also includes variable y
// as part of its scope.
const m = (p) => {
return p + 5;
}
// function m includes variable p.
const y = 10;
const x = (z, a, b = 10) => {
let q = 0;
const m = (p) => {
return p + 5;
}
q = (b === 10) ? m(z) + a : m(b);
setTimeout(() => {
...
}, 5000);
}
// But the same as function x, function m also includes:
// z, a, b, q, and y as part of its scope
setTimeout(() => {
y = q + 5
}, 5000);
// The anonymous function inside the parameter callback
// of the setTimeout function includes y and q even if not initialized...
const y = 10;
const x = (z, a, b = 10) => {
let q = 0;
const m = (p) => {
...
}
q = (b === 10) ? m(z) + a : m(b);
setTimeout(() => {
y = q + 5
}, 5000);
}
// Because the anonymous function is within the scope of
// function x.
Note that Scoping affects the Symbol Table structure.
Every exercise will have an effect on your project.
All exercises will building towards the project, Except the Target-Code Generation, because...
Translate own Source code to Target-code generation, passing through all of the phases:
// SOURCE CODE var y; var x = 2; y = x + 1
// PSEUDO-ASM start main var y var x load x 2 load y add x 1 end
// SOURCE CODE
var x;
var y;
...
if ((x < 3) && (y >= 3)) {
x = 4
}
else {
x = 2
}
// PSEUDO-ASM start main var x ... var temp1 load temp1 lt x 3 var temp2 load temp2 load gte y 3 var temp3 load temp3 and temp1 temp2 cond temp3 load x 4 else load x 2 endcond end
// SOURCE CODE
for (x=0; x<10; x++) {
console.log(x)
}
// or
var x = 0
while (x<10) {
console.log(x)
x++
}
// PSEUDO-ASM start main var x load x 0 var temp1 load temp1 lt x 10 loop temp1 print x load x add x 1 endloop end
// SOURCE CODE
var x = 0
do {
console.log(x)
x++;
} while(x<10)
// PSEUDO-ASM start main var x load x 0 floop print x add x 1 var temp1 load temp1 lt x 10 endfloop temp1 end
// SOURCE CODE
function c() {
console.log("Hah")
}
function b(y, z) {
console.log(y)
console.log(z)
console.log("Hey")
}
function a() {
var x = 0;
var y = 1;
b(x, y);
console.log("Yoh")
c();
return x
}
var x;
x = a();
// PSEUDO-ASM start c print "Hah" end start b param var y var z print y print z print "Hey" end start a var x var y load x 0 load y 1 call b x y print "Yoh" call c return x end start main var x load x call a end
Google Docs: https://goo.gl/SLMzVB