INTERPRETERS vs COMPILERS

**Nick Chepurniy, Ph.D.**

**e-mail: nickc@sharcnet.ca**

**HPC Consultant - SHARCNET**

**http://www.sharcnet.ca**

**Compute Canada**

**http://www.computecanada.org**

### Overview

We start by defining what interpreters and compilers are and give examples of both. Two simple problems are then worked in OCTAVE (an interpreter), the second also in two compiled languages (C and Fortran), to illustrate how these tools are used. A more realistic problem (matrix inversion) is then run in OCTAVE for several matrix sizes and compared to LAPACK. Conclusions about the use of interpreters and compilers are drawn from this problem.

## Contents

### Introduction

In this online tutorial we start by briefly describing how compilers and interpreters work.
The most important issue is *which should you use*, and that depends on what kind of code
you are developing.

Fortran, C/C++, Pascal and ALGOL are compiled languages.

A compiler analyses all the statements in a program, translates them into machine code, and links that code (using libraries) to produce an executable. To run the program you run the executable, which produces some output. To change the program you must edit the source and repeat these steps.

OCTAVE, MATLAB, MAPLE, R and APL are interpreted.

An interpreter translates one high-level language command into low-level machine instructions and executes that command at once. The interpreter takes one command at a time; however, the user can place several commands into a script file and have the interpreter execute the commands in that file one at a time.
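The "one command at a time" behaviour can be sketched with a toy interpreter written in C. This is only an illustration: the command set (`set`, `add`, `mul`) and the function names `execute_command` and `run_script` are invented for this sketch and have nothing to do with how OCTAVE is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy accumulator "interpreter": each command is parsed and executed
 * immediately, before the next command is even looked at.
 * The command set (set/add/mul) is invented for this sketch. */
static int execute_command(const char *line, double *acc)
{
    char op[8];
    double value;

    /* Translate: split the text into an operation name and an operand. */
    if (sscanf(line, "%7s %lf", op, &value) != 2)
        return -1;                      /* syntax error */

    /* Execute: act on the accumulator right away. */
    if (strcmp(op, "set") == 0)
        *acc = value;
    else if (strcmp(op, "add") == 0)
        *acc += value;
    else if (strcmp(op, "mul") == 0)
        *acc *= value;
    else
        return -1;                      /* unknown command */
    return 0;
}

/* Run a "script": an array of command lines, executed one at a time.
 * Returns the number of commands executed before the first error. */
size_t run_script(const char *const lines[], size_t count, double *acc)
{
    size_t i;

    assert(lines != NULL && acc != NULL);
    for (i = 0; i < count; i++) {
        if (execute_command(lines[i], acc) != 0)
            break;
    }
    return i;
}
```

Calling `run_script` on the two lines `"set 3.0"` and `"mul 3.0"` leaves 9 in the accumulator. Note that an error in a later line is only discovered when that line is reached, whereas a compiler would reject the whole program before anything runs.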

### Simple Comparison

First let's illustrate how OCTAVE (an interpreter) works. The second example
is done in OCTAVE (an interpreter) and in both C and Fortran (compiled languages) to illustrate the two approaches. Finally, the third example (in the next segment),
a more realistic one, illustrates and answers the question:
*which should you use (a compiler or an interpreter)*.

OCTAVE and MATLAB are very similar. They have a lot of similar functions but there are some that are different.

Here is an example in OCTAVE:

    [nickc@nar316:~] octave
    GNU Octave, version 3.2.2
    octave:1> x=3.0;
    octave:2> xsq=x*x
    xsq = 9
    octave:3> quit
    [nickc@nar316:~]

The first line invokes the interpreter (OCTAVE) which responds on the next line with the name of the interpreter and version. Then the interpreter prints the prompt with the line number:

octave:1>

The user types a command (assigning the value 3.0 to the variable x) followed by a semicolon (;) and then presses Enter (carriage return). The semicolon at the end of the command instructs the interpreter not to print the result (i.e. the value of x is not printed on the screen).

The next command computes variable xsq as the product of x by x. Since the command does not end with a semicolon the answer is printed on the next line.

To terminate the interactive session the user types the command: quit

Here is a second example using OCTAVE:

    [nickc@nar316:~] octave
    GNU Octave, version 3.2.2
    octave:1> sum=0;
    octave:2> N=1000
    N = 1000
    octave:3> for i=1:N
    > sum=sum+i;
    > endfor
    octave:4> sum
    sum = 500500
    octave:5> my_sum=N*(N+1)/2
    my_sum = 500500
    octave:6> quit
    [nickc@nar316:~]

In the above example, the user sets sum=0 and N=1000 in the first two commands. The third command is a for-loop starting with the clause:

for i=1:N

The interpreter will read instructions until an endfor clause is detected.

In the body of the for-loop we add the index variable i to sum; the trailing semicolon suppresses printing of the intermediate values. On line 4 (prompt octave:4) we print the answer (the sum of the integers from 1 to N).

Line 5 verifies the answer with the analytical formula: my_sum=N*(N+1)/2
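For reference, the analytical formula follows from pairing each term i with its mirror term N+1-i, so that every pair adds up to N+1. With N = 1000, as in the session above:

$$
S = \sum_{i=1}^{N} i, \qquad
2S = \sum_{i=1}^{N} \bigl( i + (N+1-i) \bigr) = N(N+1), \qquad
S = \frac{N(N+1)}{2} = \frac{1000 \cdot 1001}{2} = 500500
$$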

When the algorithms become more complex, rather than typing each command at the OCTAVE prompt you can place all the commands into a script file. An OCTAVE script file has the .m extension (e.g. my_commands.m). To execute the commands in the script file, type the name of the file without the extension (e.g. my_commands) at the OCTAVE prompt.

Let's place the commands of the second example into a file called ex2.m as shown next:

    # OCTAVE script file ex2.m
    # Commands for example 2
    sum=0;
    N=1000
    for i=1:N
      sum=sum+i;
    endfor
    sum
    my_sum=N*(N+1)/2

To execute the commands in the OCTAVE script file ex2.m do as follows:

    [nickc@nar316:] octave
    GNU Octave, version 3.2.2
    octave:1> ex2
    N = 1000
    sum = 500500
    my_sum = 500500
    octave:2> quit

Alternatively, you can run it in batch mode by submitting it as a job with the following script:

    #!/bin/bash
    # script file: sub_job
    if [ $# -ne 1 ]; then
      echo
      echo "Must have 1 argument specifying the OCTAVE file"
      echo
      echo "For example: ./sub_job ex2.m "
      echo
      exit
    fi
    OCTAVE_FILE=$1
    sqsub -t -r 15m -o OUTPUT_${OCTAVE_FILE%%.*}_%J octave ${OCTAVE_FILE}

For completeness, here is the file sumN.c, which solves the problem in the second example (here with N = 10000) and is compiled with the C compiler pathcc:

    #include <stdio.h>

    /* This program prints the sum of integers from 1 to N */
    #define N 10000

    int main(int argc, char **argv)
    {
        int i, sum, sum_formula;

        sum = 0;
        for (i=1 ; i <= N; i++)
            sum = sum + i ;
        printf("For N = %d the sum = %d\n",N,sum);
        sum_formula = N*(N+1)/2;
        printf("and using formula sum = %d\n",sum_formula);
        return 0;
    }

and the equivalent Fortran file sumN.f90:

    ! file sumN.f90
    program sumN
      integer, parameter :: N = 10000
      integer :: i, sum, sum_formula
      sum = 0
      do i=1,N
        sum = sum + i
      enddo
      write(6,1001) N,sum
    1001 format("For N = ",i8," the sum = ",i10)
      sum_formula = N*(N+1)/2
      write(6,1002) sum_formula
    1002 format("and using formula sum = ",i10)
    end

To compile and execute the C program you would run these commands:

    [nickc@nar316:/work/nickc/INTERPRETERS_vs_COMPILERS] pathcc sumN.c
    [nickc@nar316:/work/nickc/INTERPRETERS_vs_COMPILERS] ./a.out
    For N = 10000 the sum = 50005000
    and using formula sum = 50005000

and for the Fortran program you would run:

    [nickc@nar316:/work/nickc/INTERPRETERS_vs_COMPILERS] pathf90 sumN.f90
    [nickc@nar316:/work/nickc/INTERPRETERS_vs_COMPILERS] ./a.out
    For N = 10000 the sum = 50005000
    and using formula sum = 50005000
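One practical difference from OCTAVE is worth noting: the compiled versions above use fixed-size integers. Assuming a 32-bit int, the product N*(N+1) in sumN.c no longer fits once N is above roughly 46,000. The following is a minimal sketch of the same computation with 64-bit integers; the function names `sum_loop` and `sum_formula` are chosen here for illustration and do not appear in the original sumN.c.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of the integers 1..n computed with a loop, in 64-bit arithmetic. */
int64_t sum_loop(int64_t n)
{
    int64_t i, sum = 0;

    assert(n >= 0);
    for (i = 1; i <= n; i++)
        sum += i;
    return sum;
}

/* The same sum from the closed form n*(n+1)/2. */
int64_t sum_formula(int64_t n)
{
    assert(n >= 0);
    return n * (n + 1) / 2;
}
```

To print the results, include `<inttypes.h>` and use the `PRId64` format macro, for example `printf("%" PRId64 "\n", sum_loop(100000));`.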

### Matrix Inversion (Example 3)

The third example in this online tutorial defines a STRIDWAD matrix, stored in the array A, and a vector X described in:

https://www.sharcnet.ca/help/index.php/LAPACK_and_ScaLAPACK_Examples#Defining_the_STRIDWAD_matrix

Once the matrix A and vector X are defined, the matrix-vector product, A*X, is computed and stored in the vector RHS. Then, using A and RHS, the linear system of equations is solved and the solution is saved in the vector XX, which should match X (up to rounding error).

In preparation for our third example, let's introduce OCTAVE functions. You can define an OCTAVE function by typing it into a *.m file which starts with the word: "function". Here are the definitions for two functions:

    # file: INIT_MY_MATRIX.m
    function A = INIT_MY_MATRIX (N)
      if (nargin != 1)
        usage ("A = INIT_MY_MATRIX (N)");
      endif
      NH = N/2;
      NH1 = NH + 1;
      # Diagonal elements
      MD = 3*eye(N);
      # Upper diagonal
      ii=-1*ones(N-1,1);
      UD=diag(ii,1);
      # Lower diagonal
      LD=diag(ii,-1);
      # ANTI-DIAGONAL
      AD=fliplr(eye(N));
      A = UD + MD + LD + AD ;
      return
    endfunction

    # file: INIT_MY_VECTOR.m
    function X = INIT_MY_VECTOR(N)
      if (nargin != 1)
        usage ("X = INIT_MY_VECTOR (N)");
      endif
      # Column vector with X(i) = i for i = 1..N
      X = find(ones(N,1));
      return
    endfunction

Note that the name of the file (e.g. INIT_MY_VECTOR.m) without the *.m extension
is identical to the function name inside the file.
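For comparison, here is a rough C sketch of what these two OCTAVE functions compute, plus the matrix-vector product used for RHS. It is not the LAPACK code from the companion tutorial; the names `init_my_matrix`, `init_my_vector` and `mat_vec` and the row-major storage are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Build the n x n matrix produced by INIT_MY_MATRIX: 3 on the diagonal,
 * -1 on the first upper and lower diagonals, and 1 added along the
 * anti-diagonal. Stored row-major; the caller frees the result. */
double *init_my_matrix(size_t n)
{
    size_t i;
    double *a;

    assert(n > 0);
    a = malloc(n * n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (i = 0; i < n * n; i++)
        a[i] = 0.0;
    for (i = 0; i < n; i++) {
        a[i * n + i] += 3.0;                 /* main diagonal   */
        if (i + 1 < n) {
            a[i * n + (i + 1)] += -1.0;      /* upper diagonal  */
            a[(i + 1) * n + i] += -1.0;      /* lower diagonal  */
        }
        a[i * n + (n - 1 - i)] += 1.0;       /* anti-diagonal   */
    }
    return a;
}

/* X(i) = i for i = 1..n, as in INIT_MY_VECTOR; the caller frees it. */
double *init_my_vector(size_t n)
{
    size_t i;
    double *x;

    assert(n > 0);
    x = malloc(n * sizeof *x);
    if (x == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        x[i] = (double)(i + 1);
    return x;
}

/* rhs = A * x for a row-major n x n matrix A. */
void mat_vec(size_t n, const double *a, const double *x, double *rhs)
{
    size_t i, j;

    assert(a != NULL && x != NULL && rhs != NULL);
    for (i = 0; i < n; i++) {
        double s = 0.0;
        for (j = 0; j < n; j++)
            s += a[i * n + j] * x[j];
        rhs[i] = s;
    }
}
```

The difference in effort is the point: the OCTAVE version builds the matrix from a handful of built-in functions (eye, diag, fliplr), while the C version needs explicit loops, index arithmetic and memory management.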

Once these functions are defined (by writing them into the *.m files), you can use them either interactively or by submitting a batch job from the same subdirectory where these functions appear.

Here is a script file for a short version of example 3:

    # file: short_ex3.m
    # OCTAVE commands for example 3
    t0 = clock ();
    N=10
    A = INIT_MY_MATRIX(N)
    X = INIT_MY_VECTOR(N)
    RHS = A*X
    # Now use A and RHS to find new XX
    XX = A \ RHS
    elapsed_time = etime (clock (), t0)
    [total, user, system] = cputime ();
    total
    user
    system
    N

To run it interactively you type:

    octave
    GNU Octave, version 3.2.2
    octave:1> short_ex3
    N = 10
    A =

       3  -1   0   0   0   0   0   0   0   1
      -1   3  -1   0   0   0   0   0   1   0
       0  -1   3  -1   0   0   0   1   0   0
       0   0  -1   3  -1   0   1   0   0   0
       0   0   0  -1   3   0   0   0   0   0
       0   0   0   0   0   3  -1   0   0   0
       0   0   0   1   0  -1   3  -1   0   0
       0   0   1   0   0   0  -1   3  -1   0
       0   1   0   0   0   0   0  -1   3  -1
       1   0   0   0   0   0   0   0  -1   3

    X =

        1
        2
        3
        4
        5
        6
        7
        8
        9
       10

    RHS =

       11
       11
       11
       11
       11
       11
       11
       11
       11
       22

    XX =

        1.0000
        2.0000
        3.0000
        4.0000
        5.0000
        6.0000
        7.0000
        8.0000
        9.0000
       10.0000

    elapsed_time = 0.0048676
    total = 0.24096
    user = 0.15298
    system = 0.087986
    N = 10
    octave:2> quit
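The single command `XX = A \ RHS` hides the whole linear solve. To give a feel for what a compiled-language user would otherwise write or call from a library, here is a minimal C sketch of Gaussian elimination with partial pivoting. This is a textbook method chosen for illustration, not a description of the algorithm OCTAVE or LAPACK actually use (those rely on optimized factorization routines), and the name `solve_dense` is invented here.

```c
#include <assert.h>
#include <stddef.h>

static double abs_value(double v)
{
    return v < 0.0 ? -v : v;
}

/* Solve A x = b by Gaussian elimination with partial pivoting.
 * a: n x n matrix, row-major, overwritten during elimination.
 * b: right-hand side of length n, overwritten with the solution x.
 * Returns 0 on success, -1 if a zero pivot is found (singular matrix). */
int solve_dense(size_t n, double *a, double *b)
{
    size_t i, j, k;

    assert(a != NULL && b != NULL);
    for (k = 0; k < n; k++) {
        /* Pick the row with the largest entry in column k as pivot. */
        size_t p = k;
        for (i = k + 1; i < n; i++)
            if (abs_value(a[i * n + k]) > abs_value(a[p * n + k]))
                p = i;
        if (a[p * n + k] == 0.0)
            return -1;
        if (p != k) {
            double t;
            for (j = 0; j < n; j++) {
                t = a[k * n + j];
                a[k * n + j] = a[p * n + j];
                a[p * n + j] = t;
            }
            t = b[k];
            b[k] = b[p];
            b[p] = t;
        }
        /* Eliminate column k from all rows below the pivot row. */
        for (i = k + 1; i < n; i++) {
            double factor = a[i * n + k] / a[k * n + k];
            for (j = k; j < n; j++)
                a[i * n + j] -= factor * a[k * n + j];
            b[i] -= factor * b[k];
        }
    }
    /* Back substitution on the resulting upper-triangular system. */
    for (k = n; k-- > 0; ) {
        double s = b[k];
        for (j = k + 1; j < n; j++)
            s -= a[k * n + j] * b[j];
        b[k] = s / a[k * n + k];
    }
    return 0;
}
```

For the N = 3 version of the matrix above (rows 3 -1 1, -1 4 -1, 1 -1 3) and right-hand side (4, 4, 8), the routine should recover the vector (1, 2, 3) up to rounding error.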

For a larger system of equations (N=12000) we use the script file long_ex3.m:

    # file: long_ex3.m
    # Commands for example 3
    t0 = clock ();
    N=12000
    A = INIT_MY_MATRIX(N);
    X = INIT_MY_VECTOR(N);
    RHS = A*X;
    # Now use A and RHS to find new XX
    XX = A \ RHS
    elapsed_time = etime (clock (), t0)
    [total, user, system] = cputime ();
    total
    user
    system
    N

and submit the job in batch mode using the script sub_job (described earlier) with argument long_ex3.m as follows:

    [nickc@nar316:] ./sub_job long_ex3.m
    submitted as jobid 1524992

The output file, OUTPUT_long_ex3_1524992, for the job was:

    GNU Octave, version 3.2.2
    N = 12000
    XX =
       1.0000e-00
       2.0000e+00
       3.0000e+00
       ...
       1.1998e+04
       1.1999e+04
       1.2000e+04
    elapsed_time = 263.65
    total = 263.69
    user = 253.32
    system = 10.373
    N = 12000
    ------------------------------------------------------------
    Subject: Job 1524992: <srun octave long_ex3.m> Done
    Job <srun octave long_ex3.m> was submitted from host <nar316> by user <nickc>.
    ------------------------------------------------------------
    # LSBATCH: User input
    srun octave long_ex3.m
    ------------------------------------------------------------
    Successfully completed.
    Resource usage summary:
        CPU time   : 264.45 sec.
        Max Memory : 2528 MB
        Max Swap   : 3478 MB

### Comparison of execution times

The following table compares OCTAVE and LAPACK CPU times for the matrix inversion problem for different orders of the matrix (N):

| N (order of matrix) | OCTAVE CPU time [sec] | LAPACK CPU time [sec] |
|---------------------|-----------------------|-----------------------|
| 10000               | 169.12                | 344.93                |
| 13000               | 314.98                | 594.70                |
| 15000               | 448.33                | 853.51                |
| 16000               | 541.83                | 1000.37               |
| 17000               | N/A                   | 1183.27               |
| 20000               | N/A                   | 1813.09               |
| 30000               | N/A                   | 3599.94               |
| 32000               | N/A                   | N/A                   |

Note: N/A means the problem cannot be run because there is not enough memory.
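The memory limit behind the N/A entries can be estimated with simple arithmetic: a dense N x N matrix of 8-byte double-precision values needs 8*N*N bytes, before counting any work arrays the solver allocates. The sketch below computes that figure; the function names are invented for illustration, and the 8-byte element size is an assumption that holds for standard double precision.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold one dense n x n matrix of 8-byte doubles. */
uint64_t dense_matrix_bytes(uint64_t n)
{
    assert(n <= 1000000000u);   /* keeps n*n*8 within 64 bits */
    return n * n * 8u;
}

/* The same figure in MiB (1 MiB = 1024 * 1024 bytes). */
double dense_matrix_mib(uint64_t n)
{
    return (double)dense_matrix_bytes(n) / (1024.0 * 1024.0);
}
```

For N = 12000 this gives about 1099 MiB for a single copy of A. The job summary above reports a maximum of 2528 MB, which is consistent with roughly two matrix-sized arrays plus overhead, although the job output does not break the figure down. For N = 32000 a single copy already needs about 7.6 GiB.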

### Conclusions

The following conclusions can be drawn from the OCTAVE program in Example 3 of this online tutorial and from the results presented in the LAPACK and ScaLAPACK Examples online tutorial (https://www.sharcnet.ca/help/index.php/LAPACK_and_ScaLAPACK_Examples):

(1) Developing a model with an interpreter is much easier because interpreters have many more built-in commands and the development is carried out interactively. This results in a much faster development cycle.

(2) OCTAVE and other interpreters can run as fast as compiled code for smaller models. As seen from the table in the previous segment, in every case where both could run (N up to 16,000) OCTAVE took roughly half the CPU time of LAPACK.

(3) Compiled code, however, can run much larger models.

(4) Since MPI can be used with compiled codes, the size of the models can be extended even further. In this case LAPACK programs can be extended to ScaLAPACK. See: https://www.sharcnet.ca/help/index.php/LAPACK_and_ScaLAPACK_Examples#LAPACK_RESULTS

Answer to *which should you use*:

For smaller models an interpreter would be more appropriate. However, if the model exceeds the available memory then you have no choice but to switch to a compiler.

For the development of new algorithms it is easier to do all the preliminary testing using an interpreter. If the memory requirements are met then there is no need to switch to a compiler.

But for much larger models a compiler with MPI is more adequate.

Using OCTAVE we were able to invert matrices up to N=16,000. For higher dimensions LAPACK was able to go up to N=30,000 and for even higher models ScaLAPACK went up to N=250,000 (using 256 processors).

Last update: 12 Sept 2013