Stages of compiler design

12.5.1.3 Describe program compilation stages: lexical and syntactic analysis, code generation and optimization

When a programmer writes a program in a high-level language, the statements they write are called source code.

Source code is a human-readable text written in a specific programming language.

The compiler translates source code into machine code (low level).

The compiled code is stored as object code in an object file, which is linked (where necessary) to produce an executable file. When the executable file runs, the machine code is executed by the CPU.

Stages in the compilation of a program


Lexical analysis

Lexical analysis is the process of converting a stream of individual characters (normally arranged as lines) into a sequence of lexical tokens (tokenization of words and symbols), which is then fed into the parser.
It is roughly the equivalent of splitting text written in a natural language (e.g. English) into a sequence of words and punctuation symbols.

Source code is stored as a stream of characters (for example, ASCII). During lexical analysis, the compiler breaks this stream down into its component parts, called "lexemes".

A lexeme is the smallest meaningful unit of the language, such as a keyword, an identifier or an operator. Lexemes cannot be broken down further without losing meaning.
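
The splitting of characters into lexemes can be sketched in Python as follows. This is a minimal illustration, not how any particular compiler is implemented: the token categories, the keyword set and the function name `tokenize` are all assumptions made for the example.

```python
import re

# Hypothetical token categories for a tiny illustrative language.
TOKEN_SPEC = [
    ("NUMBER",     r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[+\-*/=]"),
    ("LPAREN",     r"\("),
    ("RPAREN",     r"\)"),
    ("SKIP",       r"\s+"),   # whitespace separates lexemes but is not a token
    ("MISMATCH",   r"."),     # any other character is an error
]
KEYWORDS = {"if", "while", "print"}  # assumed keywords, for illustration only
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn a string of characters into a list of (kind, lexeme) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        if kind == "IDENTIFIER" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, lexeme))
    return tokens


for token in tokenize("total = price * 2"):
    print(token)
```

Each pair holds the category of the token and the lexeme (the actual text) it was built from.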

Syntactic analysis

Syntactic analysis is also known as parsing. This stage analyses the syntax of the statements to ensure they conform to the grammar rules of the programming language in question.
It is roughly the equivalent of checking that some ordinary text written in a natural language (e.g. English) is grammatically correct (without worrying about meaning).
The purpose of syntax analysis is to check that the tokens form a valid sequence. Tokens are the symbols, keywords, identifiers, etc. produced by lexical analysis.
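
Checking that a sequence of tokens is valid can be sketched with a small recursive-descent parser. The grammar below (simple arithmetic expressions), the (kind, lexeme) token representation and the function name `is_valid_expression` are assumptions chosen for illustration; real languages have much larger grammars.

```python
def is_valid_expression(tokens):
    """Return True if the tokens form a valid arithmetic expression.

    Assumed grammar (illustrative only):
        expr   := term (("+" | "-") term)*
        term   := factor (("*" | "/") factor)*
        factor := NUMBER | IDENTIFIER | "(" expr ")"
    """
    pos = 0

    def peek():
        # Look at the current token without consuming it.
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        kind, lexeme = peek()
        if kind in ("NUMBER", "IDENTIFIER"):
            advance()
        elif lexeme == "(":
            advance()
            expr()
            if peek()[1] != ")":
                raise SyntaxError("expected ')'")
            advance()
        else:
            raise SyntaxError(f"unexpected token {lexeme!r}")

    def term():
        factor()
        while peek()[1] in ("*", "/"):
            advance()
            factor()

    def expr():
        term()
        while peek()[1] in ("+", "-"):
            advance()
            term()

    try:
        expr()
        if peek()[0] != "EOF":
            raise SyntaxError("unexpected trailing tokens")
        return True
    except SyntaxError:
        return False


# price * (2 + tax) is grammatical; price * + is not.
good = [("IDENTIFIER", "price"), ("OPERATOR", "*"), ("LPAREN", "("),
        ("NUMBER", "2"), ("OPERATOR", "+"), ("IDENTIFIER", "tax"), ("RPAREN", ")")]
bad = [("IDENTIFIER", "price"), ("OPERATOR", "*"), ("OPERATOR", "+")]
print(is_valid_expression(good), is_valid_expression(bad))
```

The parser only checks structure. It does not check meaning, for example whether `price` has been declared.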

Code generation

The code generated by the compiler is object code in some lower-level language, for example assembly language or machine code.

Minimum properties of low-level object code:

  • It should carry the exact meaning of the source code.
  • It should be efficient in terms of CPU usage and memory management.
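
As a rough sketch of code generation, the example below turns an arithmetic expression into instructions for an imaginary stack machine, then runs them to show that the generated code keeps the meaning of the source. The mnemonics (`PUSH`, `LOAD`, `ADD`, ...) and the function names `generate` and `run` are invented for illustration and do not belong to any real CPU. Python's standard `ast` module stands in for the lexical and syntactic analysis stages.

```python
import ast
import operator

# Hypothetical stack-machine mnemonics chosen for illustration only.
OPCODES = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}
APPLY = {"ADD": operator.add, "SUB": operator.sub,
         "MUL": operator.mul, "DIV": operator.truediv}


def generate(node):
    """Emit instructions for an expression tree in post-order."""
    if isinstance(node, ast.Expression):
        return generate(node.body)
    if isinstance(node, ast.Constant):
        return [("PUSH", node.value)]
    if isinstance(node, ast.Name):
        return [("LOAD", node.id)]
    if isinstance(node, ast.BinOp) and type(node.op) in OPCODES:
        # Operands first, then the operation that combines them.
        return (generate(node.left) + generate(node.right)
                + [(OPCODES[type(node.op)], None)])
    raise NotImplementedError(f"unsupported construct: {type(node).__name__}")


def run(instructions, variables):
    """Execute the generated instructions on a simple stack."""
    stack = []
    for opcode, operand in instructions:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "LOAD":
            stack.append(variables[operand])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(APPLY[opcode](left, right))
    return stack.pop()


code = generate(ast.parse("price * 2 + tax", mode="eval"))
for instruction in code:
    print(instruction)
print(run(code, {"price": 10, "tax": 3}))
```

Running the instructions with `price = 10` and `tax = 3` gives the same result as evaluating the original expression, which is the first property listed above.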

Code optimization

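Code optimization aims to make the compiled program run as efficiently as possible; it is not about shortening the compile time. Optimization is a program transformation technique that tries to improve the code so that it consumes fewer resources (i.e. CPU, memory) and runs faster, without changing what the program does.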

Optimizations provided by a compiler include:

  • Inlining small functions 
  • Code hoisting 
  • Dead store elimination
  • Eliminating common sub-expressions 
  • Loop unrolling
  • Loop optimizations: Code motion, Induction variable elimination, and Reduction in strength.
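
Two of the loop optimizations listed above, code motion and reduction in strength, can be illustrated by applying them by hand. The sketch below is written in Python only for readability: a real compiler performs these transformations automatically on its intermediate code, and the function names here are hypothetical.

```python
def scaled_unoptimized(values, width, height):
    result = []
    for i in range(len(values)):
        area = width * height   # loop-invariant: same value on every iteration
        offset = i * 4          # multiplication by the loop counter
        result.append(values[i] * area + offset)
    return result


def scaled_optimized(values, width, height):
    result = []
    area = width * height       # code motion: computed once, outside the loop
    offset = 0                  # reduction in strength: i * 4 replaced by repeated addition
    for value in values:
        result.append(value * area + offset)
        offset += 4
    return result


data = [1, 2, 3, 4]
print(scaled_unoptimized(data, 2, 5) == scaled_optimized(data, 2, 5))
```

Both versions produce the same results; the second simply does less work per iteration, which is the goal of optimization.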

 

Questions:

Give two examples of high level languages. (Marks: 1)
  • e.g. Pascal, Python, etc.
A compiler is used to run them. What does it do? (Marks: 1)
  • A compiler will translate a high level language into machine code and each program instruction translates into many machine code instructions.
What is an advantage of writing a program using Pascal or Python compared to writing the same program in assembly code? (Marks: 1)
  • The code can be compiled and distributed without the source code.

Exercises:

Ex. 1

Test Revision "12.1B Programming paradigms"

Exam questions:

Why would a company not want to distribute source code when they sell a software package? (Marks: 2)
  • Retention of source code ensures control over it is kept with company or individual; (1)
  • software cannot be so easily reverse engineered (taking design knowledge and reusing it); (1)
  • code cannot be modified. (1)
Category: Programming languages | Added by: bzfar77 (17.09.2020)