CSI5388: Topics in Machine Learning
Instructor Nathalie Japkowicz
Office: STE 5029
Telephone: 562-5800 ext. 6693 (Note: e-mail is more reliable)
Meeting Times and Locations
- Location: MCD 121
Office Hours and Locations
- Times: Monday, 2:45pm-3:45pm and Wednesday 1:00pm-2:00pm,
or by appointment
- Location: STE 5029
Machine Learning and Data Mining are the areas of Artificial Intelligence
concerned with the problem of building computer programs that automatically
improve with experience. This course will focus on advanced issues from these
fields. Issues such as Feature Selection, Class Imbalances, Cost-sensitivity,
One-class learning, Classifier Combination, Performance Evaluation and
Visualization (and other topics) will be discussed in depth. Students will be
expected to read and criticize articles from the recent literature, complete
practical assignments, and propose and complete a research project.
csi5387, although the two courses can be taken at the same time with
permission of the instructor.
Machine Learning is the area of Artificial Intelligence concerned with the
problem of building computer programs that automatically improve with
experience. This course will cover, in depth, some advanced topics in the
The course will consist of a mixture of regular lectures and student
presentations. The regular lectures will cover broad introductions to some of
the major areas of research currently under investigation. The student
presentations will be based on recent research papers that describe new
results in these areas.
Students will be evaluated on short written commentaries of research papers
(20%), on oral presentations of research papers (20%),
and on a final class project of the student's choice (60%). For the class
project, students can propose their own topic or choose from a list of
suggested topics which will be made available at the beginning of the
term. Project proposals will be due in mid-semester. Group discussions are
highly encouraged for the research paper commentaries and students will be
allowed to submit their reviews in teams of 3 or 4. However,
projects must be submitted individually.
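As a rough illustration of how the three weights above combine, the short sketch below computes a weighted course grade. The function name `final_grade`, the component keys, and the sample scores are hypothetical and chosen only for this example; the weights are the ones stated above. It assumes each component is scored out of 100.

```python
# Weights taken from the evaluation scheme described above.
WEIGHTS = {"commentaries": 0.20, "presentations": 0.20, "project": 0.60}


def final_grade(scores):
    """Return the weighted course grade from per-component scores (0-100).

    `scores` maps each component name in WEIGHTS to a percentage.
    This helper is illustrative only, not an official grading tool.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"commentaries": 80.0, "presentations": 90.0, "project": 85.0}
print(round(final_grade(example), 2))
```

Because the project carries 60% of the weight, a change of ten points on the project moves the final grade by six points, while the same change on either of the other components moves it by two.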
- Week 1: Review of Machine Learning's main concepts
- Week 2: Topics in the Evaluation of Machine Learning algorithms
(ROC Curves, Cost Curves) based on:
ROC Graphs: Notes and Practical Considerations for Data Mining Researchers, by Tom Fawcett, Submitted to Knowledge Discovery and Data Mining, 2003.
- Week 3: Intro to Genetic Algorithms (Fitness Functions, Genetic Operators,
- Week 4: Intro to Unsupervised Learning
- Week 5: Combining Supervised and Unsupervised Learning, Intro
to Radial-Basis Functions.
- Week 6: Advice on how to formulate a project
proposal, presentation by a librarian on resources available for ML
- Week 7: Topics in Feature Selection (Wrapper, Filter and Hybrid Methods)
- Week 8: BREAK
- Week 9: Single Class Learning Methods (Autoassociators, single-class SVMs)
- Supervised versus Unsupervised Binary-Learning by Feedforward Neural Networks, Japkowicz, N., Machine Learning, Volume 42, Issue 1/2, pp. 97-122, January 2001.
- One-Class SVMs for Document Classification, Larry M. Manevitz and Malik Yousef, Journal of Machine Learning Research, 2(Dec):139-154, 2001.
- Combining One-Class Classifiers, David M. J. Tax and Robert P. W. Duin, Lecture Notes in Computer Science, 2096, 2001.
- Week 10: The Class Imbalance Problem
- Week 11: Support Vector Machines and Set Covering Machines
- Week 12: Combination of Classifiers (Bagging, Random Forests, Boosting,
Error-Correcting Codes, Mixtures-of-Experts)
- Week 13: Project Presentations
A compilation of selected research papers from the recent literature.
- Machine Learning, Tom Mitchell, McGraw Hill, 1997.
- Introduction to Machine Learning, Nils J. Nilsson (Draft of a Proposed
New Textbook available on the Web)
- Data Mining, Witten, I & Frank, E., Morgan-Kaufmann, 2000.
Machine Learning Resources on the Web: