Description
1. Write a Java, Python, C, or C++ program that outputs a longest common subsequence of three
input sequences X, Y , and Z of symbols from a fixed alphabet. For example, if the input is:
X = hN, I, N, Ei,
Y = hS, I, N, G, I, N, Gi,
Z = hN, I, N, J, A, Si
then the output should be hN, I, Ni. For this input, the optimal solution is unique.
For inputs where
the optimal solution is not unique, your program may output any longest common subsequence.
(If you want to use another programming language, get the instructors’ approval first.) (4 marks)
2. In the report, briefly explain how your program works and analyze its time and space complexity.
Also, show a small example input created by you and the corresponding output. (2 marks)
3. When all three input sequences have the same length n, what is the largest n for which your program
manages to compute a solution in less than one minute? Is the execution time or running out of
memory the main issue here? Put this “extreme” example in “Max99999999Z.txt”. (1 mark)
4. A binary sequence is a sequence over the alphabet {0, 1}. For example, h0, 1, 0, 1, 1, 0, 0, 0i is a
binary sequence.
Give an example of three binary sequences X, Y , Z, each of length 8, such
that the length of a longest common subsequence of X and W, where W is a longest common
subsequence of Y and Z, is different from the length of a longest common subsequence of X, Y , Z.
You may use your program to find such an example. (2 marks)
5. For 10 different values of n selected by you, use your program to compute A(n), defined as follows.
A(n) is the average length of a longest common subsequence of three randomly generated binary
sequences X, Y , Z of length n, where all symbols in the sequences are chosen independently and
uniformly at random from {0, 1} and the average is taken over 10 independent runs.
In other
words, you need to run your program 10 times for each n and therefore 100 times in total.
Present your results in a table with ten rows and three columns: one column for n, one column
for A(n), and one column for the ratio A(n)/n. According to your experiments, does A(n)/n seem
converge to any constant, and if so, what is its value? (3 marks)
6. This question concerns the case of two input sequences. Prove that every algorithm that outputs all
longest common subsequences of two input sequences has a worst-case running time that is exponential.
To do so, show how to define, for every positive integer n, two length-n sequences Xn, Yn
having Ω(c
n) different longest common subsequences, where c is a constant. You are allowed to use
an alphabet of size n, i.e., the symbols in Xn, Yn can come from a set {a1, a2, . . . , an}. (3 marks)