bioinformatics
“Don’t say you don’t have enough time. You have exactly the same number of hours per day that were given to Helen Keller, Pasteur, Michelangelo, Mother Teresa, Leonardo da Vinci, Thomas Jefferson, and Albert Einstein.” H. Jackson Brown, Jr., writer

CSI 4900 Projects

  1. Evaluating space-filling curve representations of protein sequences with convolutional and recurrent networks.
  2. Splice site prediction in mitochondrial genomes
    • End goal (long term, possibly beyond the scope of a CSI 4900 project):
      • Creating a tool to predict splice sites specifically for mitochondrial (mt) genomes.
        • There are many splice site prediction tools, but I don’t know of any tool specific to organelle genomes.
        • By splice site, we mean the boundary between an exon and an intron in eukaryotic genomes.
      • Steps:
        • Creating a comprehensive, large-scale data set of splice site junctions.
          • This would be done using RNA-Seq data.
            • RNA-Seq is the high-throughput sequencing of RNA transcripts, i.e., the expressed portion of the genome.
            • This involves mapping the RNA-Seq reads to the reference genome using existing tools.
        • Creating a tool
          • Using deep learning (CNN, BLSTM, etc.) to predict splice site junctions.
        • Future work:
          • Adding expression level information to quantify the expression of the transcripts.
          • Adding exon/intron predictors to possibly improve accuracy.
      Many papers have been published on the subject, including this one.
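
The data set step above (mapping RNA-Seq reads to a reference genome, then collecting junctions) can be sketched as follows. This is a minimal illustration, not the project's method: it assumes the aligner reports spliced alignments with SAM-style CIGAR strings, in which the `N` operation marks a skipped reference region (a putative intron). The function names, the `(ref_start, cigar)` input format, and the 0-based half-open coordinates are assumptions made for the example.

```python
import re
from collections import Counter

# CIGAR operations that advance the position on the reference genome.
REF_CONSUMING = set("MDN=X")
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def junctions_from_cigar(ref_start, cigar):
    """Return (intron_start, intron_end) pairs for one aligned read.

    Assumption: ref_start is the 0-based leftmost reference position of the
    alignment, and the returned intervals are 0-based, half-open.
    """
    junctions = []
    pos = ref_start
    for length, op in CIGAR_TOKEN.findall(cigar):
        length = int(length)
        if op == "N":  # skipped reference region, i.e. a putative intron
            junctions.append((pos, pos + length))
        if op in REF_CONSUMING:
            pos += length
    return junctions


def count_junctions(alignments):
    """Count how many reads support each junction.

    alignments: iterable of (ref_start, cigar) tuples (hypothetical format).
    """
    support = Counter()
    for ref_start, cigar in alignments:
        support.update(junctions_from_cigar(ref_start, cigar))
    return support


# Toy input: two spliced reads sharing one junction, one unspliced read.
reads = [(100, "50M200N50M"), (120, "30M200N70M"), (400, "100M")]
support = count_junctions(reads)
print(support)
```

Counting the reads that support each junction gives a simple way to filter out spurious junctions before they are used as training labels; the threshold to apply would have to be chosen empirically.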

Past Projects:

  1. D3.js visualisation of sequence and structure RNA motifs

    2014 F, Joseph Sleiman

  2. An Efficient and Effective Algorithm to Evolve Regular Expressions

    2012 F, Manuel Belmadani

  3. Application Web pour la gestion des ressources du 412e Escadron [Web application for managing the resources of 412 Squadron]

    2011, Jean-Philippe Pellerin

  4. Iterative Maximum Parsimony Multiple Sequence Alignment (ParAli)

    2010 W, Derek O'Brien

  5. A Genetic Programming Approach to RNA-RNA Interaction Motif Discovery (GP-RNA^2)

    2009 F, Christopher Saunders

  6. Approximate Matching of RNA Secondary Structure Expressions Containing Pseudoknots (pkSeed)

    2006 F, Penny J.X. Pan

  7. Progressive Simultaneous Alignment and Structure Prediction of Multiple RNA Sequences (hD)

    2006 W, Luke Cen

  8. Implementation of Range Minimum Query Algorithm (RMQ)

    2006 W, Ayse Abacioglu

  9. Approximate Matching of RNA Secondary Structure Expressions (RNA Matching)

    2005 W, Sol Ackerman

  10. Implementing a Parallel Version of Dynalign for the SunFire V880 architecture (pD)

    2004 W, Philippe Desjardins

  11. A Genetic Programming Approach to RNA Secondary Structure Motif Discovery (GP)

    2003 F, Robert Collier

  12. Simulating Genetic Drift (Sim)

    2003 F, Alain Gagnon

  13. RNA Secondary Structure Viewer (RS2V)

    2003 F, Dina Bilenkis

  14. String Algorithms in Java (Suffix Trees)

    2003 W, Daniela Cernea

  15. Learning Representations of Protein Inter-Domain Linkers Using Inductive Logic Programming (Linkers)

    2003 W, Patrick Wisking

  16. Intelligent Agents for Updating Biological Databases (AgentDB)

    2003 W, Navneet Bhalla

  17. Simulator for the TC-1101 Computer (VM)

    2002 W, Yvgeniya Lozdernik

  18. Protein Viewer/Modeler Written with Java 3D (Java Protein 3D Viewer)

    2002 W, Andrew Henry, Elton Lum and Devin Kennedy