Welcome to CSI5387, Data Mining and Machine Learning: Concepts, Techniques, and Applications

Offered in the Fall 2012 semester (Sep. – Dec. 2012):
Fri 4-7PM, CBY B012

Class starts Sep. 7.
CBY building is in the southern part of the uOttawa campus.
Check the map at http://www.uottawa.ca/maps/


Instructor: Dr. Stan Matwin

Office hours: TBA. For now, just send me an email to arrange a meeting: stan@eecs.uottawa.ca

This course changes from year to year, and the existing
slides may be modified.

Course syllabus

If you have not taken an AI class:

I recommend you read the following chapters from "Artificial Intelligence: A Modern Approach" by Russell and Norvig, 3rd edition:




Assignment 1 is here, due Oct 8, and its dataset is here
Solution for Assignment 1 is here.

Assignment 1 marks are here. Some marks are still TBD.


Here is the project statement

project presentation SM
project presentation Erico
Here are Cohen papers 1 and 2
Here is our paper.
dataset Estrogen
dataset Triptan
dataset Oral Hypoglycemics
dataset Beta Blockers

Paper 1
Paper 2

Course marks (including project and final exam) will be here

Final exam:

Dec. 14, 4-7PM, CBY B205

2010 final exam
2009 final exam

Course marks


Course material for download

set 1
additional material for classes 1 and 2 (by V. Kumar, University of Minnesota)
set 2
additional material on instability of DTs for class 3
additional slides on instability of DTs for class 3 (with thanks to Rob Holte and Ken Dwyer, U of A)
additional material on PAC for class 3
set 3
set 4
set 5
set 6
set 7
set 8 
cost curves slides by Dr. C. Drummond
set 9

set 10





I do not recommend a single textbook for this class because there isn’t one that will cover all our needs. I will use material from three books:


Mitchell, T., “Machine Learning”. This is an excellent book, but at this point quite outdated. Tom promises to finish a new edition, and we are all waiting!

Flach, "The Art and Science of Algorithms that Make Sense of Data", to appear, Cambridge University Press, Sep. 2012. See here for the first two chapters and the list of contents.

Witten, I., Frank, E., “Data Mining: Practical Machine Learning Tools and Techniques”, Second Edition (Morgan Kaufmann Series in Data Management Systems). A great companion to the practical component of the class and to the WEKA system that we will use.

Torgo, "Data Mining with R: Learning with Case Studies" (Chapman & Hall/CRC Data Mining and Knowledge Discovery Series). There is a Kindle edition.

Hastie, T., Tibshirani, R., Friedman, J., "The Elements of Statistical Learning". An excellent textbook for part of the material, with a highly mathematical slant.

Cornuéjols, A., Miclet, L., “Apprentissage artificiel: concepts et algorithmes” (don’t worry, you can make it through this class without knowing French)

Han, J., Kamber, M., "Data Mining: Concepts and Techniques"


And don't panic: you do NOT need to buy five or six books. I am working to have all of them on reserve in the uOttawa library. Some of these books are available as e-books, and only chapters will be recommended as additional reading. The main material for the course will be my .ppt slides and the papers listed below.