List of suggested projects (2003)

The projects proposed below are problems that I am particularly interested on.
You may be in touch with me throughout the project in order to help you with the design of your algorithm.
Through email I can send you the links to the references cited here which I have but should not be publicly
available (the authors authorised the use for this course).

Please contact me by email saying the first two projects you would like in order of preference.
I'll have to make sure there are no repeated projects, and will respect your choice if possible.

1) The next two projects involve permutation codes.

Definition: a permutation code of length n and distance d is a set B of permutations from
some fixed set of n symbols such that the Hamming distance between any two distinct
x,y in B is at least d.
M(n,d) is the maximum number of codewords on a permutation code of length n and distance d.

We know that M(n,n-1)=n(n-1) when n is a prime power.
M(6,5)=18; M(10,9)>= 35 (best bound found by computer); M(12,11)>=60 (very weak bound).
So the first interesting cases to solve for M(n,n-1) are n=10 and n=12.
The paper below gives you more results.

Reference:
Chu, Colbourn and Dukes, "Constructions for permutation codes in powerline communications",
to appear in Designs, Codes and Cryptogtaphy (preprint).

For both 1A and 1B you can (implicitly) consider the following graph:
verticies are permutations, and two vertices from an edge if the permutations
have distance at least d. The problem of determining M(n,d) reduces to fiinding
the largest clique in this graph.

1A) Permutation codes: comparison of several greedy heuristics.
Compare various greedy clique finding algorithms.
Each algorithm adds vertices according to the following greedy heuristics:

1. Choose the next possible vertex in lexicographical ordering.
2. Choose the next possible vertex in Johnson-Trotter ordering.
3. Choose the next possible vertex in order of largest degree in the remaining subgraph considered.

Do a carefull implementation of the 3 heuristics and perform a thorough testing.
In order to deal with interesting cases you must be careful with memory issues and use
appropriatte data structures.
Hopefully imporvememnts on the bounds in the paper above can be acchieved.

1B) Permutation codes: heuristic searches

Fix M,n,d and design algorithms to check whether there exists a permutation code of length n, distance d
and M codewords.
Suggested methods: pick two out of Hillclimbing, Tabu Search or Simulated Annealing.

Suggested neighbourhood funtion:
Represent the code as an M by n array.
A neighbour could be one obtained by a single swap (transposition) in one of the permutations (one
fo the rows of the matrix).
The minimum distance may be very small to start with, and we want to increase it until it reaches d.

2) Intersecting partition systems: finding all non-isomorph systems which are maximal (setinclusionwise)

We will be dealing with partitions of a set of size n into g equal parts, so n=k*g.
Two partitions are said to intersect if they share a common part.
Example: P1={{1,2},{3,4},{5,6}} and P2 ={{1,3}{2,4},{5,6}} intersect since they share the part {5,6}.
A partition system (a set of partitions) I is intersecting if any two partitions in I intersect.

Problem: Given k and g, determine all non-isomorphic intersecting partition systems that are
maximal with respect to set inclusion.

Approach:
1) Build a graph with partitions being the vertices and partitions being connected by an edge
if they intersect.
2) Find all cliques in the graph that are maximal with respect to set inclusion.
3) Remove isomorphic copies of partition systems by coverting a partition system into
a graph a using Nauty software for ismorphism detection. Nauty software is available
from the web page of the researcher Brendan McKay in Australia.

I'm particularly interested in having a lot of information on k=2 and 3.

References:
Consult with me for specifics of the problem. I may give you a paper, but it is not so relevant to understand the project.
Use textbook for an algorithm for finding all cliques.
Consult the nauty user guide for graph isomorphism testing (may also ask me).

3) Finding covering arrays via Tabu Search

A covering array CA(N,k,g) is an N by K array A with symbols from the alphabet G={0,1,...,g-1}
such that for any two rows i, j and any two numbers a,b in G, there exists at least a column
l such that a_i,l= a and a_j,l=b.
We are interested in CA(N,k,g) with the smallest possible N.

Please see the undergraduate project "moura3" to see an application of covering arrays
to software testing. Covering arrays are used to design a test set for software which garantees
that pairwise interactions of input parameters for the software are covered/tested.

In this project we'll concentrate on "almost uniform" arrays, that is arrays with rows having
almost the same number of occurrence of each symbol. So that the number of occurences
of a symbol is either ceil(N/g) or floor (N/g).

Given N,g,k, design a Tabu Search algorithm for constructing an almost uniform CA(N,k,g).
Suggested neighbourhood function:
we can move from one array to another array by swapping distinct symbols in a row.

Compare several different combinations of variations on the Tabu search algorithm:
* exhaustive neighbourhood search versus sampling the neighbourhood
* variations on the lifetime L
* variations on c_max

Careful data structures should be designed to make sure the changes on the objective
function (measuring degree of infeasibility) can be quickly computed.

There exists a conjecture that there exists always an almost uniform one that is
optimal (has minimum N among). I'd like to verify it for small cases with small g=3,4,...

Reference:
1) John Stardom, "Metaheuristics and the search for covering and packing arrays",
Master's thesis, Simon Fraser University, 2001.
2) Textbook, for Tabu Search description.
(I may give you other references)