## Description

Problem 1: Pattern-Matching

Problem 1.1: The brute-force pattern-matching algorithm [10 pt.]

Describe a text D and a pattern P such that the brute-force pattern-matching algorithm runs in Ω(dp) time, where d and p are the lengths of D and P, respectively.

Problem 1.2: Python's str class and pattern-matching [20 pt.]

In this part, you are asked to modify the three pattern-matching programs given to you (see the Appendix). Run your modified programs with varying-length patterns and show your results.

The count method of Python's str class, called as D.count(P) on a text D with a pattern P, returns the maximum number of non-overlapping occurrences of P within D. For example, 'cdcdcdcdc'.count('cdc') returns 2.

1. Modify the brute-force pattern-matching program so that it returns the number of non-overlapping occurrences of P within D.

2. As in Problem 1.2.1, modify the Boyer-Moore program in the same way.

3. As in Problem 1.2.1, modify the KMP program in the same way.
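The difference between non-overlapping and overlapping counts is easy to miss, so here is a minimal sketch that reproduces the behavior of str.count using str.find. The helper names count_non_overlapping and count_overlapping are illustrative assumptions, not part of the Appendix programs, and both assume a non-empty pattern.

```python
def count_non_overlapping(D, P):
    """Count non-overlapping occurrences of P in D, scanning left to right.

    Illustrative helper (not from the Appendix); assumes P is non-empty.
    """
    count = 0
    start = D.find(P)
    while start != -1:
        count += 1
        # Resume after the end of this match so occurrences cannot overlap.
        start = D.find(P, start + len(P))
    return count


def count_overlapping(D, P):
    """Count every occurrence of P in D, overlaps included (P non-empty)."""
    count = 0
    start = D.find(P)
    while start != -1:
        count += 1
        # Advance by a single position, so overlapping matches are counted.
        start = D.find(P, start + 1)
    return count


print("cdcdcdcdc".count("cdc"))                    # 2
print(count_non_overlapping("cdcdcdcdc", "cdc"))   # 2
print(count_overlapping("cdcdcdcdc", "cdc"))       # 4
```

The only difference between the two helpers is how far the search position advances after a match: by the pattern length, or by one character.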

Problem 2: Experimental Analysis of Pattern-Matching Algorithms [20 pt.]

Perform an experimental analysis of the pattern-matching algorithms in terms of:

1. Number of character comparisons: Perform an experimental analysis of the efficiency of the brute-force, KMP, and Boyer-Moore pattern-matching algorithms for varying-length patterns.

2. Relative speed comparison: Perform an experimental comparison of the brute-force, KMP, and Boyer-Moore pattern-matching algorithms. Run each algorithm against large text documents using varying-length patterns and report the relative running times.
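One possible way to collect both measurements is to instrument a matcher so that it counts character comparisons, and to time it with time.perf_counter. The sketch below does this for the brute-force algorithm only. The names find_brute_counted and time_search are hypothetical, and the random DNA-like text is a stand-in for a real document.

```python
import random
import time


def find_brute_counted(T, P):
    """Brute-force search that also reports character comparisons.

    Hypothetical instrumented variant of the Appendix's find_brute.
    Returns (index of the first match or -1, number of comparisons).
    """
    n, m = len(T), len(P)
    comparisons = 0
    for i in range(n - m + 1):
        k = 0
        while k < m:
            comparisons += 1          # one comparison of T[i+k] with P[k]
            if T[i + k] != P[k]:
                break
            k += 1
        if k == m:
            return i, comparisons
    return -1, comparisons


def time_search(search, T, P, repeats=5):
    """Best-of-`repeats` wall-clock time of search(T, P), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        search(T, P)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    random.seed(0)                    # fixed seed for repeatable runs
    text = "".join(random.choice("ACGT") for _ in range(20_000))
    for m in (4, 16, 64, 256):
        pattern = text[-m:]           # a pattern that occurs in the text
        index, count = find_brute_counted(text, pattern)
        elapsed = time_search(find_brute_counted, text, pattern)
        print(f"m={m:4d} index={index} comparisons={count} time={elapsed:.4f}s")
```

Taking the best of several runs reduces noise from other processes. The same two helpers could wrap instrumented versions of the KMP and Boyer-Moore programs.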

Assignment № 5 Page 2

Problem 3: Matrix-chain Multiplication

The matrix-chain multiplication problem: Given a chain $\langle D_1, D_2, \ldots, D_n \rangle$ of $n$ matrices, fully parenthesize the product $D_1 \cdot D_2 \cdots D_n$ so that the number of scalar multiplications is minimized. Matrix $D_i$ has dimension $p_{i-1} \times p_i$ for $i = 1, 2, \ldots, n$.

1. Brute Force [10 pt.]: Implement a Python program to solve the matrix-chain multiplication problem using the brute-force algorithm.

2. Bottom-up Dynamic Programming [20 pt.]: Implement a Python program to solve the matrix-chain multiplication problem using the bottom-up dynamic programming approach.

3. Dynamic Programming with Memoization [Extra Credit, 10 pt.]: Implement a Python program to solve the matrix-chain multiplication problem using dynamic programming with memoization.
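To make the cost model in Problem 3 concrete, the short sketch below compares the two ways of parenthesizing a chain of three matrices. It uses the standard count of rows × inner × cols scalar multiplications for one matrix product. The dimension list p and the helper multiply_cost are illustrative assumptions.

```python
def multiply_cost(rows, inner, cols):
    """Scalar multiplications for a (rows x inner) times (inner x cols) product."""
    return rows * inner * cols


# Hypothetical dimensions p = [p0, p1, p2, p3]:
# D1 is 10 x 100, D2 is 100 x 5, D3 is 5 x 50.
p = [10, 100, 5, 50]

# (D1 D2) D3: first a 10 x 5 intermediate, then multiply it by D3.
left_first = multiply_cost(p[0], p[1], p[2]) + multiply_cost(p[0], p[2], p[3])

# D1 (D2 D3): first a 100 x 50 intermediate, then multiply D1 by it.
right_first = multiply_cost(p[1], p[2], p[3]) + multiply_cost(p[0], p[1], p[3])

print(left_first, right_first)   # 7500 75000
```

Even for three matrices the two orders differ by a factor of ten, which is why the choice of parenthesization matters.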

Problem 4: Longest Common Sub-sequence (LCS) Problem [20 pt.]

Implement a Python program to solve the LCS problem using dynamic programming. Run your program to find the best sequence alignment between DNA strings. Show your results.

Longest Common Sub-sequence (LCS) problem: Given two character strings over some alphabet, find a longest string that is a sub-sequence of both given strings.

Data source: https://www.ncbi.nlm.nih.gov/genbank/
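As a small illustration of the sub-sequence definition in Problem 4, the sketch below checks whether a candidate string is a common sub-sequence of two strings. It is a checking aid under assumed names (is_subsequence, is_common_subsequence) with made-up short DNA-like strings. It does not compute a longest common sub-sequence.

```python
def is_subsequence(candidate, text):
    """Return True if candidate can be formed by deleting characters from text."""
    position = 0
    for ch in candidate:
        # Find the next occurrence of ch at or after the current position.
        position = text.find(ch, position)
        if position == -1:
            return False
        position += 1   # the next character must come strictly later
    return True


def is_common_subsequence(candidate, X, Y):
    """Return True if candidate is a sub-sequence of both X and Y."""
    return is_subsequence(candidate, X) and is_subsequence(candidate, Y)


# Illustrative strings, not taken from GenBank.
X = "GTTCCTAATA"
Y = "CGATAATTGAGA"

print(is_common_subsequence("GTTTAA", X, Y))   # True
print(is_common_subsequence("GTC", X, Y))      # False: Y has no C after its first T
```

A checker like this can help validate the output of an LCS program: whatever string the program returns should pass is_common_subsequence for its two inputs.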

Directions

Please follow the syllabus guidelines when turning in your homework. While testing your programs, run them with a variety of inputs and discuss your findings. This homework is due Sunday, Nov 14, 2021, at 10:00 pm. OBSERVE THE TIME. Absolutely no homework will be accepted after that time. All the work should be your own.


Appendix

Python program for the Brute-Force pattern-matching algorithm

    # Brute force
    def find_brute(T, P):
        """Return the lowest index of T at which P begins, or -1 if absent."""
        n, m = len(T), len(P)
        # try every potential starting position
        for i in range(n - m + 1):
            k = 0
            # conduct O(k) comparisons
            while k < m and T[i + k] == P[k]:
                k += 1
            if k == m:
                return i
        return -1

Python program for the Boyer-Moore pattern-matching algorithm

    # Boyer-Moore
    def find_boyer_moore(T, P):
        """Return the lowest index of T at which P begins, or -1 if absent."""
        n, m = len(T), len(P)
        if m == 0:
            return 0
        # last[c] is the index of the last occurrence of c in P
        last = {}
        for k in range(m):
            last[P[k]] = k
        i = m - 1
        k = m - 1
        while i < n:
            # If match, decrease i, k
            if T[i] == P[k]:
                if k == 0:
                    return i
                else:
                    i -= 1
                    k -= 1
            # Not match, reset the positions
            else:
                j = last.get(T[i], -1)
                i += m - min(k, j + 1)
                k = m - 1
        return -1


Python program for the Knuth-Morris-Pratt pattern-matching algorithm

    # KMP failure function
    def compute_kmp_fail(P):
        m = len(P)
        fail = [0] * m
        j = 1
        k = 0
        while j < m:
            if P[j] == P[k]:
                fail[j] = k + 1
                j += 1
                k += 1
            elif k > 0:
                k = fail[k - 1]
            else:
                j += 1
        return fail

    # KMP
    def find_kmp(T, P):
        """Return the lowest index of T at which P begins, or -1 if absent."""
        n, m = len(T), len(P)
        if m == 0:
            return 0
        fail = compute_kmp_fail(P)
        # print(fail)
        j = 0
        k = 0
        while j < n:
            if T[j] == P[k]:
                if k == m - 1:
                    return j - m + 1
                j += 1
                k += 1
            elif k > 0:
                k = fail[k - 1]
            else:
                j += 1
        return -1
