CSI5387 Concept Learning Systems/Machine Learning


Instructor

Nathalie Japkowicz
Office: STE 5-029
Phone: 562-5800 ext. 6693
E-mail: nat@site.uottawa.ca

Meeting Times and Locations

  • Time: Mondays 1:00pm-4:00pm
  • Location: CBY B-202

Office Hours and Locations

  • Times:
    Mondays 4:15pm-5:15pm
    Thursdays 11:45am-12:45pm
  • Location: STE 5-029

Overview

Machine Learning is the area of Artificial Intelligence concerned with the problem of building computer programs that automatically improve with experience. The intent of this course is to present a broad introduction to the principles and paradigms underlying machine learning, including presentations of its main approaches, discussions of its major theoretical issues, and overviews of its most important research themes.

Course Format

The course will consist of a mixture of regular lectures and student presentations. The regular lectures will cover descriptions and discussions of the major approaches to Machine Learning as well as of its major theoretical issues. The student presentations will focus on the most important themes we survey. These themes will mostly be approached through recent research articles from the Machine Learning literature.

Evaluation

Students will be evaluated on short written commentaries and oral presentations of research papers (20%), on a few homework assignments (30%), and on a final class project of the student's choice (50%). For the class project, students can propose their own topic or choose from a list of suggested topics which will be made available at the beginning of the term. Project proposals will be due in mid-semester. Group discussions are highly encouraged for the research paper commentaries and students will be allowed to submit their reviews in teams of 3 or 4. However, homework and projects must be submitted individually.

Pre-Requisites

Students should have reasonable exposure to Artificial Intelligence and some programming experience in a high-level language.

Required Textbooks

Additional References

 

Other Reading Material

Research papers will be drawn from conference proceedings and journals available on the Web.

(Links appear in the Syllabus below, under Readings.)

List of Major Approaches Surveyed

  • Version Spaces
  • Decision Trees
  • Artificial Neural Networks
  • Bayesian Learning
  • Instance-Based Learning
  • Support Vector Machines
  • Meta-Learning Algorithms
  • Rule Learning/Inductive Logic Programming
  • Unsupervised Learning/Clustering
  • Genetic Algorithms

List of Theoretical Issues Considered

  • Experimental Evaluation of Learning Algorithms
  • Computational Learning Theory

List of Major Themes Surveyed

To expose you to current topics of interest in the data mining and machine learning community, I have chosen to study the topics listed in the following poll of Dec 1-5, 2008.

KDnuggets : Polls : Important Data Mining Topics

Which of the following Data Mining (DM) topics are most important for your work or research? (Choose top 3) [113 voters]

Topic                                             Votes  Share
Scaling up DM algorithms for huge data               46  41%
Mining text                                          33  29%
Automating data cleaning                             30  27%
Dealing with unbalanced and cost-sensitive data      29  26%
Mining data streams                                  20  18%
Mining links and networks                            19  17%
Unified theory of DM                                 18  16%
DM for biological problems                           16  14%
DM with privacy                                      10  8.9%
Mining images                                         8  7.1%
DM for security applications                          6  5.4%
Distributed (multi-agent) DM                          4  3.6%
Other                                                21  8%

Course Support:

  • Schedule of Presentations
  • Timetable for Homework
  • Suggested Outline for Paper Commentaries
  • Project Description
  • Guidelines for the Final Project Report

Machine Learning Resources on the Web:

  • David Aha's Machine Learning Resource Page
  • UCI Machine Learning
  • WEKA
  • Free Book: Information Theory, Inference, and Learning Algorithms, David MacKay



Syllabus:

Week 1: Jan 12

  • Topics:
    Introduction 1: Organizational Meeting
    Introduction 2: Overview of Machine Learning
  • Readings:
    Texts: Witten & Frank, Chapter 1

Week 2: Jan 19

  • Topics:
    Approach: Version Space Learning
    Additional Slides on: inductive learning theory, version spaces, decision trees and neural nets
    Approach: Decision Tree Learning
  • Readings:
    Texts: Nilsson, Chapter 3
    Texts: Witten & Frank, Sections 4.3 and 6.1

Week 3: Jan 26

  • Topics:
    Homework 1 HANDED OUT on Monday
    Theoretical Issue: Experimental Evaluation of Learning Algorithms I
    Theme: Text Mining
  • Readings:
    Texts: Witten & Frank, Chapter 5
    Theme: Text Mining

Week 4: Feb 2

  • Topics:
    Theoretical Issue:
    Theme: Evaluation of Learning Systems
  • Readings:
    Texts: Witten & Frank, Chapter 5; Japkowicz & Shah, Chapter 6
    Theme: Evaluation of Learning Systems

Week 5: Feb 9

  • Topics:
    Homework 1 DUE on Monday
    Approach: Artificial Neural Networks
    Theme: Cost-Sensitive Learning
  • Readings:
    Texts: Witten & Frank, pp. 223-235
    Theme: Cost-Sensitive Learning and Class Imbalances

Week 6: Feb 16

  STUDY BREAK

Week 7: Feb 23

  • Topics:
    Project Proposal DUE on Monday
    Homework 2 HANDED OUT on Monday
    Approach: Bayesian Learning
    Theme: Scaling up Data Mining
  • Readings:
    Texts: Witten & Frank, Sections 4.2 and 6.7
    Theme: Scaling up Data Mining

Week 8: Mar 2

  • Topics:
    Approach: Instance-Based Learning
    Theme: Mining Data Streams
  • Readings:
    Texts: Witten & Frank, Sections 4.7 and 6.4
    Theme: Mining Data Streams

Week 9: Mar 9

  • Topics:
    Homework 2 DUE on Monday
    Approach: Rule Learning
    Theme: Data Cleaning
  • Readings:
    Texts: Witten & Frank, Sections 4.4 and 6.2
    Theme: Data Cleaning

Week 10: Mar 16

  • Topics:
    Homework 3 HANDED OUT on Monday
    Approach: Support Vector Machines
    Theme: Privacy Preserving Data Mining
  • Readings:
    Texts: Witten & Frank, Sections 4.6 and 6.3
    Theme Papers: Your choice of 3 papers from this very nice bibliography (each presenter chooses a paper):

Week 11: Mar 23

  • Topics:
    Approach: Classifier Combination
    Theme: Mining Link and Network Data
  • Readings:
    Texts: Witten & Frank, Section 7.5
    Theme Papers: Mining Link and Network Data

Week 12: Mar 30

  • Topics:
    Homework 3 DUE on Monday
    Theoretical Issue: Computational Learning Theory
    Theme: Data Mining for Security Applications
  • Readings:
    Texts: See Tom Mitchell’s book
    Theme Papers: Data Mining for Security Applications

Week 13: Apr 6

  • Topics:
    Approach: Unsupervised Learning
    Approach: Genetic Algorithms
    Projects Presentation
  • Readings:
    Texts: Witten & Frank, Sections 4.8 and 6.6
    Texts: See Tom Mitchell’s book