HARVARD CS 124
DATA STRUCTURES AND ALGORITHMS
MIDTERM EXAMINATION
Spring 1996 (Exam Date: March 18, 1996)

This is a CLOSED BOOK exam. You have 80 minutes to answer the following
questions, in any order. The number in brackets is the estimated time in
minutes needed to answer the question, and is also the maximum score for that
question. The highest possible grade is therefore 80.

GIVE REASONS FOR ALL YOUR ANSWERS. CORRECT ANSWERS WITHOUT REASONS WILL NOT
RECEIVE FULL CREDIT. FOR PROBLEMS WHICH ASK YOU TO GIVE TWO OR THREE ANSWERS,
EACH ADDITIONAL CORRECT ANSWER YOU PROVIDE WILL EARN CREDIT OF INCREASING
WEIGHT.


(1) [15] Given two strings (X[1], X[2], ..., X[m]) and (Y[1], Y[2], ..., Y[n]),
we have learned in class that, by using dynamic programming, the length C(m,n)
of their longest common subsequence can be computed in O(m*n) time. Now, given
three strings (X[1], X[2], ..., X[l]), (Y[1], Y[2], ..., Y[m]) and
(Z[1], Z[2], ..., Z[n]), you will use the same dynamic programming technique to
compute the length C(l,m,n) of their longest common subsequence.

(1a) [10] Write down the recurrence relation on C(i,j,k) for the three-string
algorithm. (Hint: Recall the recurrence relation for the two-string case:

    C(i,j) = C(i-1,j-1) + 1,               if X[i] = Y[j], or
             max(C(i-1,j), C(i,j-1)),      otherwise.)

(1b) [5] Give the running time of the three-string algorithm.

*************************
Solution:

(1a) C(i,j,k) = C(i-1,j-1,k-1) + 1,                        if X[i] = Y[j] = Z[k], or
                max(C(i-1,j,k), C(i,j-1,k), C(i,j,k-1)),   otherwise.

(1b) O(l*m*n)
*************************
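
As a concrete illustration of the recurrence in (1a), here is a minimal
bottom-up sketch in Python. The function name lcs3_length and the table
layout are illustrative choices, not anything prescribed by the problem;
the strings are 0-indexed in the code, while the recurrence is stated with
1-indexed prefixes.

def lcs3_length(X, Y, Z):
    """Length of a longest common subsequence of three sequences,
    computed bottom-up from the recurrence in (1a)."""
    l, m, n = len(X), len(Y), len(Z)
    # C[i][j][k] = LCS length of the prefixes X[1..i], Y[1..j], Z[1..k].
    C = [[[0] * (n + 1) for _ in range(m + 1)] for _ in range(l + 1)]
    for i in range(1, l + 1):
        for j in range(1, m + 1):
            for k in range(1, n + 1):
                if X[i - 1] == Y[j - 1] == Z[k - 1]:
                    C[i][j][k] = C[i - 1][j - 1][k - 1] + 1
                else:
                    C[i][j][k] = max(C[i - 1][j][k],
                                     C[i][j - 1][k],
                                     C[i][j][k - 1])
    return C[l][m][n]

# Example: lcs3_length("xaby", "abz", "aab") returns 2 (the subsequence "ab").

The three nested loops fill an (l+1) x (m+1) x (n+1) table, so both the time
and the space are O(l*m*n), matching (1b).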

(2) [8] Consider the following binary search tree, which has five internal
nodes 1, 2, 3, 4 and 5, and six leaves a, b, c, d, e and f:

    1
   / \
  a   2
     / \
    b   3
       / \
      c   4
         / \
        d   5
           / \
          e   f

Note that the tree has height 5 and is not balanced. Demonstrate two
consecutive tree rotation operations that will reduce the tree height from
5 to 3.

*************************
Solution:

Rotation 1 (rotate the edge between 1 and 2, making 2 the root):

            2
          /   \
        1       3
       / \     / \
      a   b   c   4
                 / \
                d   5
                   / \
                  e   f

Rotation 2 (rotate the edge between 2 and 3, making 3 the root):

            3
          /   \
        2       4
       / \     / \
      1   c   d   5
     / \         / \
    a   b       e   f

*************************

(3) [13] Suppose that we are interested in efficient searching of an N-element
set. Hash tables and binary search trees are two data structures that could be
used.

(3a) [8] For the purpose of searching, describe two scenarios under which
hashing is more desirable than binary search.

(3b) [5] For the purpose of searching, describe one scenario under which
binary search is more desirable than hashing.

*************************
Solution:

(3a) Scenario 1: The given elements are not ordered.
     Scenario 2: N is so large that the average Theta(lg N) time for binary
     search can be significant.

(3b) Scenario 1: With a balanced binary search tree, the worst-case search
     time is guaranteed not to exceed O(lg N), while the worst-case time for
     hashing can be O(N).

*************************

(4) [10] Consider a radix sort of N numbers of b bits each. The radix sort
will view each number as having b/r digits of r bits each, and will perform
b/r passes of counting sort.

(4a) [5] What is the running time of this radix sort?

(4b) [5] What is the required storage, in bits, for this radix sort?

*************************
Solution:

(4a) O((b/r) * (N + 2**r))

(4b) N*b + (lg N)*2**r bits (an output array of N b-bit numbers, plus 2**r
     counters of lg N bits each for the counting sort).

*************************

(5) [12] We have studied several algorithms in class based on the
divide-and-conquer principle. For each of these algorithms, either the
dividing phase or the combining phase must be computationally nontrivial;
otherwise the whole algorithm would be trivial.

(5a) [6] List two algorithms for which the combining phase is more expensive
than the dividing phase.

(5b) [6] List two algorithms for which the dividing phase is more expensive
than the combining phase.

*************************
Solution:

(5a) Strassen's fast matrix multiplication and merge sort.

(5b) Quicksort and the linear-time median algorithm.

*************************

(6) [9] John Smith has learned the divide-and-conquer principle, and wants to
apply it to searching. To search for a key in a set of N keys, his algorithm
will divide the set into two subsets of N/2 keys each and then search each
subset recursively. What is the running time of his algorithm? Show your
analysis.

*************************
Solution:

T(N) = 2T(N/2) + O(1). By the Master method (or by noting that the recursion
tree has about 2N nodes, each doing O(1) work), T(N) = O(N).

*************************

(7) [13] Suppose that a cousin of V. Strassen comes up with an ingenious 3x3
matrix multiplication algorithm which takes 16 scalar multiplications and 50
scalar additions. Moreover, suppose that this algorithm works even if the
entries in the 3x3 matrices are themselves matrices.

(7a) [3] Outline how you could use this 3x3 result to devise a fast NxN matrix
multiplication algorithm for large N.

(7b) [10] What is the running time of your NxN matrix multiplication
algorithm?

*************************
Solution:

(7a) Divide each input matrix into a 3x3 matrix of (N/3)x(N/3) blocks. Apply
the 3x3 algorithm to these 3x3 block matrices, computing the 16 block products
recursively.

(7b) T(N) = 16*T(N/3) + O(N**2). Since N**2 / N**(log3 16) = O(N**(-epsilon))
where epsilon ~ 0.5, the Master method (case 1) gives
T(N) = Theta(N**(log3 16)), which is approximately O(N**2.52).

*************************
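
To make the divide step of (7a) concrete, here is a Python sketch of the
block-recursive structure, assuming N is a power of 3. Because the
hypothetical 16-multiplication 3x3 formula is not spelled out in the problem,
the combining rule shown is the straightforward one with 27 recursive block
products; substituting a 16-product, 50-addition rule at the marked spot is
what would give the recurrence T(N) = 16*T(N/3) + O(N**2) solved in (7b).
The names mat_mul, block and add are illustrative only.

def mat_mul(A, B):
    """Multiply two N x N matrices (lists of lists), where N is a power of 3,
    by partitioning each into a 3 x 3 grid of (N/3) x (N/3) blocks."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    t = n // 3

    def block(M, i, j):
        # The (i, j) block of the 3 x 3 partition of M.
        return [row[j * t:(j + 1) * t] for row in M[i * t:(i + 1) * t]]

    def add(X, Y):
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

    # Combining rule: the straightforward 27 recursive block products.
    # The hypothetical algorithm of problem (7) would use only 16 block
    # products here (plus 50 block additions).
    C = [[None] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            acc = [[0] * t for _ in range(t)]
            for k in range(3):
                acc = add(acc, mat_mul(block(A, i, k), block(B, k, j)))
            C[i][j] = acc

    # Stitch the 3 x 3 grid of result blocks back into one N x N matrix.
    return [C[i][0][r] + C[i][1][r] + C[i][2][r]
            for i in range(3) for r in range(t)]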
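
Going back to problem (2), the two rotations in the solution are both "left"
rotations at the current root. A minimal sketch of a single rotation, with an
illustrative Node class (the problem itself does not fix any representation):

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_left(x):
    """Rotate the edge between x and its right child y: y becomes the root
    of the subtree, x becomes y's left child, and y's old left subtree
    becomes x's new right subtree.  Returns the new subtree root."""
    y = x.right
    x.right = y.left
    y.left = x
    return y

Starting from the tree in problem (2), two calls of root = rotate_left(root)
first promote 2 and then 3 to the root, producing exactly the trees shown in
the solution and reducing the height from 5 to 3.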
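
Going back to problem (4), here is a minimal sketch of the LSD radix sort
being analyzed, assuming the keys are non-negative b-bit integers and that r
divides b evenly; the name radix_sort and the list-based representation are
illustrative choices. Each pass uses a counter array of size 2**r and an
output array of N keys, the two terms counted in the storage bound of (4b).

def radix_sort(keys, b, r):
    """Sort non-negative b-bit integers using b/r passes of counting sort,
    one pass per r-bit digit, least-significant digit first."""
    mask = (1 << r) - 1
    for shift in range(0, b, r):
        # Stable counting sort on the digit (key >> shift) & mask.
        count = [0] * (1 << r)              # 2**r counters of up to lg N bits
        for key in keys:
            count[(key >> shift) & mask] += 1
        for d in range(1, 1 << r):          # prefix sums give final positions
            count[d] += count[d - 1]
        out = [0] * len(keys)               # output array of N b-bit keys
        for key in reversed(keys):          # reversed scan keeps the sort stable
            d = (key >> shift) & mask
            count[d] -= 1
            out[count[d]] = key
        keys = out
    return keys

# Example: radix_sort([7, 1, 5, 2], b=4, r=2) returns [1, 2, 5, 7].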