On Github zzag / slides

Compilers and programming languages

Agenda

  • History of programming languages
  • How does a compiler work?
  • PL and its implementation
  • Demo
  • Questions

Early programming languages

  • 1952 - Autocode
  • 1957 - FORTRAN
  • 1958 - LISP
  • 1958 - ALGOL
  • 1959 - COBOL
  • 1962 - Simula
  • 1964 - BASIC
  • 1967 - BCPL

Autocode (1952)

Was developed in 1952 for the Mark 1
It used a compiler to automatically convert the language into machine code
The first compiled high-level programming language

Alick Glennie developed Autocode at the University of Manchester in the early 1950s. The first code and compiler were developed in 1952 for the Mark 1 computer there, and Autocode is considered to be the first compiled high-level programming language.

Fortran (1957)

Invented at IBM by John Backus
The first widely used high-level PL
Still a popular language for high-performance computing

Establishing fundamental paradigms

  • 1969 - B
  • 1970 - Pascal
  • 1972 - C
  • 1972 - Smalltalk
  • 1972 - Prolog
  • 1973 - ML
  • 1975 - Scheme
  • C, an early systems programming language
  • Smalltalk (mid-1970s) provided a complete ground-up design of an object-oriented language
  • Prolog, designed in 1972, was the first logic programming language
  • ML built a polymorphic type system on top of Lisp

C (1972)

Created by Dennis Ritchie
The origin of C is closely tied to the development of the Unix operating system

The origin of C is closely tied to the development of the Unix operating system, originally implemented in assembly language on a PDP-7 by Ritchie and Thompson, incorporating several ideas from colleagues. Eventually, they decided to port the operating system to a PDP-11. The original PDP-11 version of Unix was also developed in assembly language. The developers considered rewriting the system in the B language, Thompson's simplified version of BCPL. However, B's inability to take advantage of some of the PDP-11's features, notably byte addressability, led to C.

Unix was one of the first operating system kernels implemented in a language other than assembly. Earlier instances include the Multics system (written in PL/I) and MCP (Master Control Program).

1980s

  • 1980 - C++ (C with Classes, renamed in 1983)
  • 1983 - Ada
  • 1984 - Common Lisp
  • 1984 - MATLAB
  • 1986 - Objective-C
  • 1986 - Erlang
  • 1987 - Perl
  • 1988 - Wolfram Language

The 1980s were years of relative consolidation in imperative languages. Rather than inventing new paradigms, all of these movements elaborated upon the ideas invented in the previous decade. C++ combined object-oriented and systems programming. The United States government standardized Ada, a systems programming language intended for use by defense contractors. In Japan and elsewhere, vast sums were spent investigating so-called fifth-generation programming languages that incorporated logic programming constructs. The functional languages community moved to standardize ML and Lisp. Research in Miranda, a functional language with lazy evaluation, began to take hold in this decade.

One important new trend in language design was an increased focus on programming for large-scale systems through the use of modules, or large-scale organizational units of code. Modula, Ada, and ML all developed notable module systems in the 1980s. Module systems were often wedded to generic programming constructs: generics are, in essence, parametrized modules (see also polymorphism in object-oriented programming).

The 1980s also brought advances in programming language implementation. The RISC movement in computer architecture postulated that hardware should be designed for compilers rather than for human assembly programmers. Aided by processor speed improvements that enabled increasingly aggressive compilation techniques, the RISC movement sparked greater interest in compilation technology for high-level languages.

1990s

  • 1990 - Haskell
  • 1991 - Python <3
  • 1993 - Ruby
  • 1993 - Lua
  • 1995 - Java
  • 1995 - JavaScript
  • 1995 - PHP(╬ಠ益ಠ)
  • 1999 - D

This era began the spread of functional languages

Python

Designed by Guido van Rossum
Large organizations that make use of Python include Google, Yahoo!, CERN, and NASA
Python has been used in artificial intelligence tasks

Current trends

  • 2001 - C#
  • 2003 - Scala
  • 2005 - F#
  • 2009 - Go
  • 2011 - Dart
  • 2014 - Swift
  • 2015 - Rust <3

Parse step

Sema step

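// Type of a binary '+' expression
typeOf(<e1, '+', e2>) =
  t1 = typeOf(e1)
  t2 = typeOf(e2)

  // Report a mismatch only when both operands are typed;
  // <no-type> means an error was already reported for that operand
  if t1 != t2 &&
      t1 != <no-type> &&
      t2 != <no-type> {
    error("type mismatch")
    return <no-type>
  }

  return t1

Use the Visitor pattern to traverse the AST!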
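
The rule above can be sketched as a small visitor in Python. This is only an illustration, not the compiler's actual code: the node classes `Num` and `Add`, the `TypeChecker` class, and the type names are all assumptions made up for the example.

```python
from dataclasses import dataclass

NO_TYPE = "<no-type>"  # marker for an expression whose type could not be determined


@dataclass
class Num:
    value: float
    type_name: str  # hypothetical type names such as "i32" or "f32"


@dataclass
class Add:
    lhs: object
    rhs: object


class TypeChecker:
    """Visitor that dispatches on the class name of each AST node."""

    def __init__(self):
        self.errors = []

    def visit(self, node):
        # Look up visit_Num, visit_Add, ... by the node's class name
        method = getattr(self, "visit_" + type(node).__name__)
        return method(node)

    def visit_Num(self, node):
        return node.type_name

    def visit_Add(self, node):
        t1 = self.visit(node.lhs)
        t2 = self.visit(node.rhs)
        # Report only when both sides are typed, so one mistake
        # does not produce a cascade of follow-up errors
        if t1 != t2 and t1 != NO_TYPE and t2 != NO_TYPE:
            self.errors.append("type mismatch")
            return NO_TYPE
        return t1
```

Returning `<no-type>` instead of aborting lets the checker keep walking the tree and collect further, unrelated errors in the same pass.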

Codegen step

ARM target
Simulate a stack machine
codegen(<e1, '+', e2>) =
  codegen(e1)
  print("push r0")
  codegen(e2)
  print("push r0")

  print("pop r2") # value of e2
  print("pop r1") # value of e1
  print("add r0, r1, r2")
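
A minimal Python sketch of the same scheme, assuming expressions are either small integers or `(e1, '+', e2)` tuples. The function names and the tuple encoding are made up for the example, and the emitted text keeps the slide's shorthand (`push r0`); a real ARM assembler expects a register list such as `push {r0}`.

```python
def codegen(expr, out):
    """Append instructions to `out` that leave the value of `expr` in r0."""
    if isinstance(expr, int):
        # Leaf: load the constant into r0 (assumes it fits an immediate)
        out.append(f"mov r0, #{expr}")
        return

    lhs, op, rhs = expr  # mirrors <e1, '+', e2>
    if op != "+":
        raise ValueError(f"unsupported operator: {op}")

    codegen(lhs, out)
    out.append("push r0")
    codegen(rhs, out)
    out.append("push r0")

    out.append("pop r2")  # value of e2
    out.append("pop r1")  # value of e1
    out.append("add r0, r1, r2")


def emit(expr):
    """Convenience wrapper that returns the instruction list."""
    out = []
    codegen(expr, out)
    return out
```

Calling `emit((1, '+', 2))` yields two `mov`/`push` pairs followed by the two pops and the `add`. Every subexpression obeys the same contract (result in r0, stack left balanced), which is what makes the recursion safe for nested expressions.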

Intermediate Language (IL)

class Band {
  var name: String
  var members: List<Dude>

  fn getName(): String {
    return name;
  }

  fn getMembers(): List<Dude> {
    return members;
  }

  fn addMember(dude: Dude) {
    members.add(dude);
  }
}

// The class above is lowered to a plain struct plus free functions:
struct Band {
  name: String,
  members: List<Dude>
}

fn Band__getName(self: Band): String {
  return self.name;
}

fn Band__getMembers(self: Band): List<Dude> {
  return self.members;
}

fn Band__addMember(self: Band, dude: Dude) {
  List_Dude__add(self.members, dude);
}
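
The renaming step in this lowering can be sketched in Python. The helper names `mangle` and `lower_method` are hypothetical; only the `Class__method` naming scheme and the explicit `self` parameter are taken from the slide. Mangling of generic types (as in `List_Dude__add`) is not covered.

```python
def mangle(class_name, method_name):
    """Build the free-function name: Class__method, as on the slide."""
    return f"{class_name}__{method_name}"


def lower_method(class_name, method_name, params, return_type=None):
    """Return the signature of the free function a method lowers to.

    `params` is a list of (name, type) pairs. The receiver is made
    explicit by prepending a `self` parameter of the class type.
    """
    all_params = [("self", class_name)] + list(params)
    args = ", ".join(f"{name}: {ty}" for name, ty in all_params)
    signature = f"fn {mangle(class_name, method_name)}({args})"
    if return_type is not None:
        signature += f": {return_type}"
    return signature
```

For instance, `lower_method("Band", "addMember", [("dude", "Dude")])` produces `fn Band__addMember(self: Band, dude: Dude)`, matching the lowered code above.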

Why IL?

  • Faster execution time (optimized code)
  • More precise type checking
  • Eliminating redundancy and complexity

IR

Defined by backend

Close enough to assembly

Target independent

SSA

%4 = load i32, i32* %j, align 4
%5 = sitofp i32 %4 to float
%6 = load float, float* %delta_x, align 4
%7 = fmul float %6, %5
%8 = load float, float* %x_min, align 4
%9 = fadd float %8, %7
store float %9, float* %re, align 4
%10 = load i32, i32* %i, align 4
%11 = sitofp i32 %10 to float
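
The listing computes, among other things, `re = x_min + delta_x * j`. How a nested expression turns into numbered single-assignment values can be sketched in Python. This is a simplification under stated assumptions: `to_ssa` is a made-up helper, the output only imitates LLVM IR (no types, no alignment, no conversions), and it handles straight-line code only, so the phi nodes that SSA needs at control-flow joins are left out.

```python
import itertools


def to_ssa(expr):
    """Flatten a nested expression into instructions.

    `expr` is either a variable name (str) or a tuple (op, lhs, rhs).
    Every %N is assigned exactly once, which is the defining SSA property.
    Returns (instructions, name_of_result).
    """
    counter = itertools.count(1)
    code = []

    def fresh():
        return f"%{next(counter)}"

    def walk(node):
        if isinstance(node, str):
            # A variable lives in memory: load it into a fresh value
            name = fresh()
            code.append(f"{name} = load {node}")
            return name
        op, lhs, rhs = node
        a = walk(lhs)
        b = walk(rhs)
        name = fresh()
        code.append(f"{name} = {op} {a}, {b}")
        return name

    result = walk(expr)
    return code, result
```

Feeding it `("fadd", "x_min", ("fmul", "delta_x", "j"))` gives three loads, one `fmul`, and one `fadd`, each defining a new value rather than overwriting an old one.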

LLVM

A collection of modular and reusable compiler and toolchain technologies

Started in the early 2000s
Languages with compilers that use LLVM include ActionScript, Ada, C#, Common Lisp, Crystal, D, Fortran, GLSL, Haskell, Lua, Objective-C, Python, R, Ruby, Rust, Scala, and Swift.

Programming language and implementation

  • Written in C++
  • Operating system: Unix-like
  • Influenced by Swift, Ruby and Rust
  • Statically typed
  • No implicit type casting
  • Type inference
  • Emits assembly, an object file or an ELF executable

Variables

var name: type = initValue;
var name = initValue;


var pi: f32 = 3.1415;
var goldenRatio = 1.618; // => f32
var rustIsTheBest = true; // => bool
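
How the two declaration forms might be checked can be sketched in Python. The function names are made up, and the mapping of integer literals to `i32` is an assumption (the slides only show `f32` and `bool` being inferred). The strict comparison reflects the "no implicit type casting" rule.

```python
def infer_literal_type(literal):
    """Infer a type name from the initializer literal."""
    # bool must be tested before int: in Python, bool is a subclass of int
    if isinstance(literal, bool):
        return "bool"
    if isinstance(literal, float):
        return "f32"
    if isinstance(literal, int):
        return "i32"  # assumption: integer literals default to i32
    raise TypeError(f"cannot infer a type for {literal!r}")


def declared_type(annotation, init_value):
    """Resolve the type of `var name[: annotation] = init_value;`."""
    inferred = infer_literal_type(init_value)
    if annotation is None:
        return inferred  # var name = initValue;
    if annotation != inferred:
        # No implicit casting: the annotation must match exactly
        raise TypeError(f"expected {annotation}, got {inferred}")
    return annotation
```

With this sketch, `declared_type(None, 1.618)` gives `f32`, while `declared_type("i32", 3.6)` is rejected and would need an explicit `as i32` in the source.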

Functions

func name(args): return-type { ... }


func add(a: i32, b: i32): i32 {
  return a + b;
}

Operations

binary: +, -, *, /, %, <<, >>, &, |, ^
unary: -, prefix -- and ++, postfix -- and ++
comparison: <, >, <=, >=, ==, !=
logical: &&, ||


x++; // postfix inc
++x; // prefix inc

x = 3 + 4;
x -= y; // x = x - y
x += 3.6 as i32;

if statement

if <cond> {
  ...
} else if <cond> {
  ...
} else if <cond> {
  ...
} else {
  ...
}


<stmt> if <cond>

<stmt> unless <cond>

Loops

while <cond> { ... }

unless <cond> { ... }

loop { ... }

for <init>, <cond>, <step> { ... }

CFG

// When a break statement is encountered inside a loop,
//   the loop is immediately terminated and program
//   control resumes at the next statement following the loop
break


// A continue statement forces the next iteration
//   of the loop to take place
continue


return <expr>

Example #1

// Hello world

// compile: bin/emit-elf hello.txt

func main(): i32 {
  puts("Hello, world!");
  return 0;
}

Example #2

// Get sum of numbers in range [1, 10]

var acc = 0;

for var i = 1, i <= 10, ++i {
  acc += i;
}

Demo

Draw fractals

Questions

Keep on hacking!
