Honours Projects Winter 2001, proposed by Prof. Lucia Moura
e-mail: lucia@site.uottawa.ca
home page: http://www.site.uottawa.ca/~lucia

Check the general rules at the SITE project page.
Please contact me by e-mail if you are interested in any of this projects or if you have questions about the proposed projects:

Project code: moura1
Project title: How to share a secret
Status:  not available

Description: ``A bank teller has a vault that must be opened every day. The bank employs three senior tellers, but they do not want to trust any individual with the key combination. Hence, they would like a system whereby any two of the three senior tellers can gain access to the vault.'' (example by Gopalakrishnan and Stinson, "Applications of designs to cryptography", The CRC Handbook of Combinatorial Designs, 1996).
In general, partial information (shadows) should be shared among w participants, so that any t participants together can compute a key to a secret information, but no group with t-1 participants can. A (t,w,v)-threshold scheme is a way of distributing v shadows in this way, and the threshold scheme is perfect if it has the additional property that no group of t-1 participants can conspire to determine any partial information regarding the key.  The bank teller example requires a (2,3,v)-threshold scheme for suitable v (the larger the number of possible shadows, the larger the number of possible keys). We will look at the construction of a perfect (3,3,v)-threshold scheme for arbitrarily large v. This threshold scheme has v possible shadows and v-2 keys; as described above,  the 3 people can put their shadows together to discover the key, but if 2 people put their shadows together, any of the v-2 keys are equally likely to be the secret key.
In this project, you are going to study threshold schemes in general and implement a particular one, namely a (3,3,v)-threshold scheme based on a combinatorial construction by Wilson (see: Stinson and Vanstone, "A combinatorial approach to threshold schemes", SIAM J. Disc. Math. 1 (1988)). You are going to write two main tools. The first one will produce a complete (3,3,v)-threshold scheme, given a suitable v. The second tool will efficiently calculate the key, given 3 shadows and suitable v, in average time O(1). The construction uses arithmetic module p (for p a prime number), so you will have to implement basic arithmetic in a finite field of prime order.

Number of students required: 1 or 2
Additional information: You will need some basic algebra knowledge in order to implement arithmetic over a finite field of order p (F_p) and understand Wilson's construction. You will write a C++ class for handling basic operations over F_p. Operations should be carefully implemented in order to guarantee the O(1) average time for key calculation.  Extense bibliography on the topic can be found online . We will provide background reading. Regular meetings with supervisor will be scheduled.



Project code: moura2
Project title: Generating erasure-resilient codes for large disk arrays
Status:  not available

Description: The disk array architecture organizes many independent small disks into one large logical disk. Suppose each part (some bits) of a piece of information is stored in each of the physical disks. In order to tolerate disk failures, redundancy must be incorporated. Thus, some extra disks are added to store redundant information so that lost information can be quickly retrieved and the contents of the erased disks can be reconstructed. This forms the so-called RAIDs (redundant arrays of inexpensive disks). Error-correcting codes could be used but erasure-resilient codes provide better solutions for this problem (there is a tradeoff between reliability and check-disk overhead/update penalty). It has been shown that certain combinatorial designs can be used to construct the matrix defining optimal erasure-resilient codes.
In this project, you will design and experiment with algorithms for generating certain triple systems (you will learn what these animals are!) for the construction of erasure-resilient codes for RAIDs. The general construction method to be used is called hill-climbing, which is a general randomized search algorithm. Care should be taken in the use of efficient data structures for this particular problem. Different heuristics should be tried and compared through experiments.
You will spend some time understanding the disk array application and the use of codes and combinatorial designs in this application. Then you will learn the general framework of the hill-climbing algorithm, and work on a careful implementation of this algorithm. A lot of time will be devoted to the design of this algorithm, its implementation and experimenting with variations.

Number of students required: 1 or 2
Additional information: Basic linear algebra knowledge is required for understanding linear codes. Good grasp of data structures and practice in programming is required for the design and implementation of the algorithm. We will provide background reading. Regular meetings with supervisor will be scheduled.



Project code: moura3
Project title:  Covering arrays for software test generation
Status:  not available

Description: You must design a bunch of tests for a program, so as to catch most errors while keeping the number of tests not too high.  Suppose the program has 4 input parameters and each parameter can be assigned one of 2 possible values.  For example, a program computes life insurance premiums based on the following parameters: gender (M/F), smoker (Y/N), age (young/old) and serious disease among ancestors (Y/N). There is experimental evidence that in many applications most errors occur from the interaction between two parameters (for example, there is an error in the program when calculating premiums for non-smoker males regardless of the other parameter values). Therefore, our objective is to generate  the minimum number of tests, while guaranteeing pairwise coverage of parameter values.
 Example:  In the example above, testing all possible parameter combinations would require 2^4 = 16 tests. But we can achieve pairwise coverage with only 5 tests:
 
test 1 test 2 test 3 test 4 test 5
gender male female female female male
smoker smoker non-smoker non-smoker smoker non-smoker
age old young old young young
 anc. disease yes yes no no no

In this project, you are going to write a tool for generating a set of tests guaranteeing some desired coverage while minimizing the number of tests. The input parameters for your program will be the number of parameters and the number of values for each parameter; the user will also specify the desired coverage (complete pairwise coverage or coverage of specified pairs). We are going to experiment with 2 types of algorithms: one is based on an integer programming (IP) formulation for this problem (your program will create an integer programming model which will be fed into a package for its solution) and the second will be a greedy heuristic (GH) implemented by you. The IP algorithm will give the minimum number of tests but it is expected to take a long time, while the GH will be quick but may generate too many tests. The idea is to compare both methods in terms of running time spent in the generation and quality of solution (how big is the test set generated). If two students are involved in this project, the experimental comparison can be done more professionally.
Number of students required:  1 or 2
Additional information: Experience with programming and algorithms is required. You will need C or C++ in order to communicate with the Integer Programming solver (Cplex).  Note that the problem of designing an efficient test generator is a hard research problem!!! The objective of this project is to play with two possible solution methods and see how far each method can go. In order to do a reasonable comparison between the methods, the test problems should be carefully chosen. So, we could have a much nicer project with 2 participants sharing the programming tasks. I strongly encourage that the pair of students know each other well and have previous experience working or studying together. We will provide background reading. Regular meetings with supervisor will be scheduled.



Project code: mouraN, N=4,5,6,...
Project title: ?????
Status:  available
Description:  The problems described above involve some type of combinatorial design and some combinatorial optimization algorithm to solve the problem. If you have interest in a similar type of problem in another application area, you can contact me to discuss the possibility of a different project. I can provide you with survey papers on applications of combinatorial designs to computer science, communications and cryptography. I may also come back with new ideas for projects from this workshop  in November.