Basic Compiler Function:-Edit
A compiler is a program that reads a program written in source language and translates it into an equivalent program in machine code. During compilation, it reports the presence of errors in the source program.
Phases of Compilation:-Edit
A) Analysis Phase:-Edit
a) Lexical Analysis: -Edit
It is also called as scanning. It performs scanning from top to bottom and for every line from left to right and generates tokens.
i : integer ;
a,b : real ;
a : = b + I ;
Taken for this statement are as follows:
i) a is an identifier - Id # 2
ii) : = is an operator OP # 5
iii) b is an identifier Id # 3
iv) + is an operator OP # 3
v) i is an identifier Id # 1
vi) ; is an operator OP # 6
Lexical token is having syntax - code # no
Code contains identifier (Id), operator (op) and constant (CON)
- Symbol Table
b) Syntax Analysis (Parsing):-Edit
It is also called parsing. It generates syntax tree a: = b+i;
So we are checking syntax of expression + sign requires two operands. := sign is having one identifier towards left hand side.
c) Semantic Analysis :-Edit
a) convert i to real to perform addition operation.
b) convert i to real giving i *
c) Add i * to b , giving temp. Store temp in a.
d) Intermediate Code Generator :-Edit
i * = int to real i ;
temp = b + i * ;
a = temp ;
The intermediate code is a kind of code which is easy to generate and this code can be easily converted to target code. This code is in variety of forms such as three address code, quadruple, triple, posix. Here we considered code as three address code.
B) Synthesis Phase :-Edit
a) Code Optimizer :-Edit
The code optimization phase attempt to improve the intermediate code. This is necessary to have a faster executing code or less consumption of memory. Thus by optimizing the code the overall running time of the target program can be improved. For given instruction we have intermediate code
i * = int to real i ;
temp = b + i * ;
a = temp ;
We get optimized code as
a = b + int to real i ;
b) Code Generation :-Edit
It is easy to translate an intermediate instruction into a sequence of machine or assembly instruction. Translation of the code is as below :
CON – R AREG, I
ADD – R AREG, B
MOVEM AREG, A
Where CONV – R converts the value of I into the real representation and leaves the result in AREG.
ADD – R performs the addition in real mode and MOVEM puts the result into the memory area allocated to A.
Q. Explain the working of the various phases of the compiler for the expression :X = Y + Z * 30 ,The variables X, Y and Z are of float type. –( 8 Mark)
Solution :- X = Y + Z * 30
Q. Explain various forms of intermediate code. ----- ( 6 m)
Intermediate code has following forms :
a) Three Address Code :-Edit
The three address code consists of a sequence of instructions, where each instruction has maximum of three operands.
For example, to evaluate z = x + y * a
Three address code is given by :- 1) temp1 = y * a 2) temp2 = x + temp1 3) z = temp2
b) Quadruple Representation :-Edit
A quadruple representation has the following format:
|Operator||Operand 1||Operand 2||Result|
The quadruple representation of z = x + y * a is shown here
|Expression||Operator||operand 1||operand 2||Result|
|temp1 = y * a||*||y||a||temp1|
|temp2 = x + temp1||+||x||temp1||temp2|
|z = temp2||temp2||-||z|
c) Triple Representation :-Edit
A Triple representation has the following format :
The triple representation of z – x + y * a is shown here
|Expression||Operator||operand 1||operand 2|
|temp1 = y * a||1||*||y||a|
|temp2 = x + temp1||2||+||x||(1)|
|z = temp2||3||z||(2)|
( 1 ) Stands for the result of the first triple y * A
( 2 ) Stands for the result of the second triple.
d) Post fix Notation :-Edit
In post fix notation, the operator appears after operands. The post fix representation of z = x + y * a is given by :- z x y a * + =
e) Syntax Tree :-Edit
A syntax tree for an expression z = x + y * a has following structure
Memory Allocation :-Edit
Memory is required for loading and running a program. Three important tasks :-
i) How much memory is required for data.
ii) Appropriate allocation model to decide lifetime and scope of data.
iii) Memory mapping done.
Static Memory :-Edit
It is a kind of memory allocation in which memory is allocated to a variable before the program execution. Memory is allocated at compile time . It is permanent type of memory.
Dynamic Memory :-Edit
Memory is allocated at time of execution .
There are two types of memory allocation possible in dynamic memory allocation:-
1) Automatic dynamic allocation – memory is allocated as well as deallocated at time of execution, so multiple programs can utilize it. ( Stack Memory )
2) Program controlled dynamic allocation :- Help memory is used. Recursion can be implemented by program controlled dynamic allocation.
(a) (b) (c) (d)
Figure (a) shows static memory allocation. Figure (b) shows dynamic allocation when only program unit A is active. Figure (c) shows the situation after A calls B while Figure (d) shows the situation after B returns to A and A calls C. C has been allocated part of the memory deallocated from B. It is clear that static memory allocation allocates more memory than dynamic memory allocation except when all program units are active.
Q. Explain static and dynamic memory allocation. -- 4 m
Q. Explain memory allocation in block structured language. --- 4 m
Ans :- The block is a sequence of statements containing the local data and declarations which are enclosed within the delimiters.
Thus block structure storage allocation can be done by stack
Display is pointer which will point to starting address of every procedure. In this way memory allocation is done in block structure.
Compilation of Control StructuresEdit
Q. Explain compilation of control structures ? --- 4 m
Ans :- Generally every program is executed sequentially. The control structure of a programming language is the collection of language features which govern the sequencing of control through a program. Control transfer implemented through conditional and unconditional control structure. Control structure like if, for or while are very impact full.
(1) if loop :- if ( e1 ) thenEdit
S1; where e1 is condition
else if condition is true it will execute S1
S2 ; if condition is false it will execute S2
S3 ; even if condition is true or false S3 will be executed
(2) While loop :-Edit
While ( e2 ) do where e2 is condition
S1; it will execute S1, S2 ,…….. Sn
S2; statements till condition is true.
end while ;
(3) for loop :- if n = 3Edit
For ( i = l to n ) do then S1 will be executed
S1 when value of i = 1, 2 or 3.
end for ; For 3 times it will be executed
Thus control structure work.
Code Optimization and TechniquesEdit
Q. What is the need for code optimization ? Explain various code optimization techniques. ---- 7 m
Ans :- Code optimization aims at improving the execution efficiency of a program. This is achieved in two ways :
1) Redundancies in a program are eliminated.
2) Computations in a program are rearranged or rewritten to make it execute efficiency.
Code optimization techniques :-
a) Compile time evaluation.
b) Elimination of common sub exertions.
c) Dead code elimination
d) Frequency reduction
e) Strength reduction
f) Local and global optimization
a) Compile Time Evaluation :-Edit
Execution efficiency can be improved by performing certain actions specified in a program during compilation itself. This eliminates the need to perform then during execution of the program there by reducing the execution time of the program when all operands in an operation are constants, the operation can be performed at compilation time the result of the operation, also a constant, can replace the original evaluation in the program. Thus an assignment a : = 5/2; can be replaced by a : = 2.5, there by eliminating a division operation.
b) Elimination of common sub expressions :-Edit
Common sub expressions are occurrences of expressions giving the same value.
Her b * c is a common sub expressions. So instead of calculating its value each and every time, we will calculate its value initially and assign to variable t and this t variable can be used in program.
c) Dead Code Elimination :-Edit
Code which can be omitted from a program without affecting its result is called dead code. Dead code is detected by checking whether the value assigned in an assignment statement is used anywhere in the program.
i : = o :
if ( i = 1 ) then i : = o ;
a:=p; z : = c ;
z : = c ;
a)With Dead code b) without dead code
d) Frequency Reduction :-Edit
Execution time of a program can be reduced by moving code from a part of a program which is executed very frequently to another part of the program which his executed fewer times. For example the transformation of loop optimization moves loop invariant code out of a loop and places it prior to loop entry.
Example : For i : = 1 to 100 do x : = 25 * a ;
begin for i : = 1 to 100 do
z : = i ; begin
x : = 25 * a ; z : = i ;
y : = x + z ; y : = x + z ;
Here x : = 25 * a ; is loop invariant. Hence in the optimized program it is computed only once before entering the for loop . y : = x + z ; is not loop invariant. Hence it cannot be subjected to frequency reduction.
e) Strength Reduction: -Edit
The strength reduction optimization replaces the occurrence of a time consuming operation (a’ high strength ‘operation) by an occurrence of a faster operation (a ‘low strength ‘ operation ) e.g replacement of a multiplication by an addition.
Example :- for i : = 1 to do temp : = 5
begin for i : = 1 to 10 do
k : = i * 5 ; begin
end k : = temp ;
temp : = k + 5 ;
In this example, the ‘high strength’ operator ‘*’ in ‘i * 5 ‘occurring inside the loop is replaced by a low strength operator ‘+ ‘in ‘k + 5 ‘.
f) Local and Global Optimization: -Edit
Local optimization is applied over small segments of a program consisting of a few statements. Global optimization is applied over an entire program unit. Local optimization provides limited benefits at a low cost compared to local optimization global optimization.
- Notes by Prof Dipak Pawar, Team WikiNote, PUNE
- WikiNote Foundation