Description
1. The serial portion of an algorithm constitutes 20 %.
(a) What is the maximum achievable speedup for this algorithm? [5pts]
(b) If the desirable speedup is 50. What should be maximum percentage of the serial
portion for the algorithm? [5pts]
2. Consider a hypothetical Von Neumann type machine shown in the figure. This machine
has a cache of 1 KiloByte and a main memory of 10 MegaBytes. The cost of fetching
one floating point number (8 bytes in size) from the cache is 1 CPU cycle (or one clock
tick).
In the event the data needed is not in the cache, a chunk of data 1 KiloByte
in size is fetched from the main memory to cache, replacing the contents of the cache.
The cost of fetching data from main memory is 150 CPU cycles.
CPU
Cache
1 KiloByte
Main Memory
10 Megabytes
8 bytes
1 CPU cycle
1 KiloByte
150 CPU cycles
Figure 1: Hypothetical Von Neumann Machine
Now, considering the following snippet code, where matrix A is initialised to some
integer values
#define N 1024
sum = 0;
for( i=0; i<=N; i++ )
for( j=0; j<=N; j++ )
sum = sum + A[i][j];
(a) What is the cost in CPU cycles to fetch the requisite data for the matrix A to
perform the computation (code is in C)? Disregard the cost for the variable ”sum”.
[20pts]
(b) What will be the cost again in terms of CPU cycles, if the exact same code is
written in FORTRAN? [20pts]