CS124
Problem Set 6
Due on Friday, March 21th, 1997 in Pierce or 10DeWolfe #33
Please write the names of your collaborators on your problem sets!
1. The Name Game (40 points)
The problem is that I always forget people's names!. Suppose I meet a person
whose name is the string S. I am allowed to ask her substring queries
of the form "Does the substring x occur in your name S?"
and she will respond yes or no. My challenge is to figure our her name
using as few queries as possible. Assume that all characters in the name
are from the 26-character alphabet {a,b,c...z} (take no offense
if your name has an accent agu).
(A) Suppose I knew that her name was n characters long.
Give a lower bound on the number of queries.
(B) Now give an algorithm to find this person's name. Prove your
correctness, and analyze time and space requirements.
(*C) Suppose you did not know the length of the name. Give and analyze
a new strategy.
(**D) Suppose you were allowed to query whether "x is a subsequence
of S." How does this change the problem? Give an algorithm
and analyze it.
2. The Four Russians trick. (5 points)
HT is offering extra credit to anyone who gives a lower bound on LCS. I
have found one algorithm that is faster than O(mn); it achieves
the speedup by using a technique called the Four Russians Algorithm (ask
HT the story behind this). This problem should give you some insight into
that technique and perhaps help you with the extra credit. This problem
will take a bit of work; skip it if it becomes a chore.
Given two n x n bit matrices, A,B, find their product AB=C in time
O(n3 / log n). Note that this bound can even be lowered
to O(n2 / log n) if we allow bit vector operations (as
is the case with most processors). A bit matrix is a matrix whose elements
are just ones and zeros (bits). The problem takes a completely different
approach than Strassen or Winograd.
Hint: The first insight is to break the matrix up into "chunks"
of size log n x n. You can assume for simplicity sake that log
n divides n evenly. The next step is to notice how you can use
a table lookup to speed up some basic operations. The point here is that
it might be worthwhile to amortize the cost of precomputing all possibilities
(if say the total number of possible outcomes is small), and then use lookups
to do the computation you want. Now find a spiffy way to create this table--
problem (3) might help.
Aside: This result is far more general. I believe Prof Valiant proved that
any computation that takes O(n2) on a turing machine
can be done in O(n^2 / log n) on a machine that allows random access
to memory using the same trick as above.
3. Bit by Bit (25 points)
You want to enumerate all the numbers from (0...2k -1), that is
all, k bit words in such a way that only one bit changes between
any two consecutive numbers (Gray code). For example, for k=3, there
is the sequence:
000 001 011 010 110 111 101 100
0 1 3
2 6 7
5 4
This type of enumeration is useful in CS141 (K-maps), and in problem (2).
Give an algorithm that produces this sequence
efficiently.
4. Chomsky Normal Form (30 points)
From CS121, consider a context free grammar in Chomsky Normal Form, G
= {N, E, P, S} where
-
N - finite set of non terminal symbols
-
E - finite set of terminals
-
P - set of production rules of the form A->BC
or A-> a where B,C are in N and a is in E
-
S - starting symbol from N
Write an O(n3) algorithm to determine whether a string
w=a_1...a_n can be generated by the grammar G. Use dynamic programming.