Speaker: David Sankoff, University of Ottawa
Time: Tuesday, November 26, 2:30 p.m.
Place: Room SITE 5084, SITE building, University of Ottawa
Title: GENOME RECONSTRUCTION WITH PARALOGY

Abstract: A genome can be modeled as a set of strings (the set of chromosomes) over an alphabet A, the set of different genes. The evolutionary divergence of two genomes can be represented by a series of edit operations of a few types, each of which may involve arbitrarily long substrings of these chromosomes. Good algorithms for calculating edit distances based on these operations are known only when the genomes are permutations, i.e., when each genome contains exactly one copy of each of the |A| genes.

We address the problem of reconstructing ancestral genomes, given N present-day genomes and a fixed phylogenetic tree T, in full generality, i.e., where the strings making up a present-day or reconstructed genome may include any number of occurrences (copies, paralogs, members of the same gene family) of each gene g in A. As supplementary information, we are given a gene tree T(g) for each g in A, representing the degree of evolutionary relationship among all the instances of gene g in all N genomes.

To formulate and solve a combinatorial optimization version of the reconstruction problem, we draw on three separate and hitherto unrelated types of analysis. The first is reconciliation analysis, which explains a gene tree T(g) in terms of the species tree T and an optimal set of duplication and loss events. The second is exemplar analysis, which extends genomic edit distances on permutations to distances on general genomes, with no constraints on the number of occurrences of each gene. The third is the concept of the median genome, which extends genomic distance to more than two genomes. These three techniques fit together as a natural way to approach the problem of reconstructing the genomes at the ancestral nodes of T.
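
To make the idea of exemplar analysis concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method presented in the talk. Each genome is taken to be a single linear chromosome of unsigned genes. Exactly one occurrence of each gene is kept (an "exemplar" choice), which turns each genome into a permutation. The distance between genomes is then the smallest permutation distance over all such choices. The abstract does not name a particular permutation distance; the breakpoint count used here is an assumption chosen for brevity, and the function names are hypothetical. The search is brute force and is practical only for tiny inputs.

```python
from itertools import product


def exemplar_strings(genome):
    """Yield every string obtained by keeping exactly one occurrence
    of each gene in `genome` and deleting all other copies."""
    positions = {}
    for i, gene in enumerate(genome):
        positions.setdefault(gene, []).append(i)
    # One kept position per gene family; all combinations are tried.
    for choice in product(*positions.values()):
        yield tuple(genome[i] for i in sorted(choice))


def breakpoints(p, q):
    """Count adjacencies of p that are absent from q.
    Genes are unsigned, so the adjacency (x, y) equals (y, x)."""
    adjacencies_q = {frozenset(pair) for pair in zip(q, q[1:])}
    return sum(frozenset(pair) not in adjacencies_q for pair in zip(p, p[1:]))


def exemplar_breakpoint_distance(g1, g2):
    """Minimum breakpoint count over all exemplar choices in both genomes
    (assumed illustrative distance, exhaustive search)."""
    if set(g1) != set(g2):
        raise ValueError("both genomes must contain the same set of genes")
    return min(
        breakpoints(p, q)
        for p in exemplar_strings(g1)
        for q in exemplar_strings(g2)
    )
```

For example, exemplar_breakpoint_distance("abcab", "abc") is 0, because deleting the second copies of a and b from the first genome leaves "abc". The number of exemplar choices grows as the product of the gene family sizes, which is why the exhaustive search above does not scale.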