CSI5387 Concept Learning Systems/Machine Learning


Instructor

Nathalie Japkowicz
Office: STE 5-029
Phone: 562-5800 ext. 6693
E-mail: nat@site.uottawa.ca

Meeting Times and Locations

  • Time: Mondays, 1:00pm-4:00pm
  • Location: CBY B-202

Office Hours and Locations

  • Times: TBA
  • Location: STE 5-029

Overview

Machine Learning is the area of Artificial Intelligence concerned with the problem of building computer programs that automatically improve with experience. The intent of this course is to present a broad introduction to the principles and paradigms underlying machine learning, including presentations of its main approaches, discussions of its major theoretical issues, and overviews of its most important research themes.

Course Format

The course will consist of a mixture of regular lectures and student presentations. The regular lectures will cover descriptions and discussions of the major approaches to Machine Learning as well as of its major theoretical issues. The student presentations will focus on the most important themes we survey. These themes will mostly be approached through recent research articles from the Machine Learning literature.

Evaluation

Students will be evaluated on short written commentaries and oral presentations of research papers (20%), on a few homework assignments (30%), and on a final class project of the student's choice (50%). For the class project, students can propose their own topic or choose from a list of suggested topics, which will be made available at the beginning of the term. Project proposals are due mid-semester. Group discussion is highly encouraged for the research paper commentaries, and students may submit their commentaries in teams of 3 or 4. Homework and projects, however, must be submitted individually.

Prerequisites

Students should have reasonable exposure to Artificial Intelligence and some programming experience in a high-level language.

Required Textbooks

Additional References

 

Other Reading Material

Research papers will be drawn from conference proceedings and journals available on the Web.

(Links appear in the Syllabus table below, in the Readings column)

List of Major Approaches Surveyed

  • Version Spaces
  • Decision Trees
  • Artificial Neural Networks
  • Bayesian Learning
  • Instance-Based Learning
  • Support Vector Machines
  • Meta-Learning Algorithms
  • Rule Learning/Inductive Logic Programming
  • Unsupervised Learning/Clustering
  • Genetic Algorithms

List of Theoretical Issues Considered

  • Experimental Evaluation of Learning Algorithms
  • Computational Learning Theory

List of Major Themes Surveyed

To expose you to current topics of interest in the data mining and machine learning community, I have chosen the themes from the topics listed in the following poll, conducted December 1-5, 2008.

KDnuggets Polls: Important Data Mining Topics

Which of the following Data Mining (DM) topics are most important for your work or research? (Choose top 3) [113 voters]

  • Scaling up DM algorithms for huge data (46): 41%
  • Mining text (33): 29%
  • Automating data cleaning (30): 27%
  • Dealing with unbalanced and cost-sensitive data (29): 26%
  • Mining data streams (20): 18%
  • Mining links and networks (19): 17%
  • Unified theory of DM (18): 16%
  • DM for biological problems (16): 14%
  • DM with privacy (10): 8.9%
  • Mining images (8): 7.1%
  • DM for security applications (6): 5.4%
  • Distributed (multi-agent) DM (4): 3.6%
  • Other (21): 8%

Course Support

  • Schedule of Presentations (TBA)
  • Timetable for Homework
  • Suggested Outline for Paper Commentaries
  • Project Description
  • Guidelines for the Final Project Report

Machine Learning Resources on the Web

  • David Aha's Machine Learning Resource Page
  • UCI Machine Learning
  • WEKA
  • Free Book: Information Theory, Inference, and Learning Algorithms, David MacKay



Syllabus

Week 1: Jan 4-8

Topics:
  • Introduction: Organizational Meeting

Week 2: Jan 9-15

Topics:
  • Introduction: Overview of Machine Learning
  • Approach: Version Space Learning

Readings:
  • Texts: Witten & Frank, Chapter 1
  • Texts: Nilsson, Chapter 3

Week 3: Jan 16-22
Homework 1 HANDED OUT on Monday

Topics:
  • Approach: Decision Tree Learning
  • Theme: Text Mining

Readings:
  • Texts: Witten & Frank, Sections 4.3 & 6.1
  • Theme: Text Mining

Week 4: Jan 23-29

Topics:
  • Theoretical Issue: Experimental Evaluation of Learning Algorithms

Readings:
  • Texts: Witten & Frank, Chapter 5; Japkowicz & Shah, Chapter 6
  • Theme: Evaluation of Learning Systems

Week 5: Jan 30 - Feb 5
Homework 1 DUE on Monday

Topics:
  • Approach: Artificial Neural Networks
  • Theme: Cost-Sensitive Learning

Readings:
  • Texts: Witten & Frank, pp. 223-235
  • Theme: Cost-Sensitive Learning and Class Imbalances

Week 6: Feb 6-12
Project Proposal DUE on Thursday
Homework 2 HANDED OUT on Thursday

Topics:
  • Approach: Bayesian Learning
  • Theme: Scaling up Data Mining

Readings:
  • Texts: Witten & Frank, Sections 4.2 and 6.7
  • Theme: Scaling up Data Mining

Week 7: Feb 13-19

Topics:
  • Approach: Instance-Based Learning
  • Theme: Mining Data Streams

Readings:
  • Texts: Witten & Frank, Sections 4.7 and 6.4
  • Theme: Mining Data Streams

Week 8: Feb 20-26

STUDY BREAK

Week 9: Feb 27 - Mar 5
Homework 2 DUE on Monday

Topics:
  • Approach: Rule Learning
  • Theme: Data Cleaning

Readings:
  • Texts: Witten & Frank, Sections 4.4 and 6.2
  • Theme: Data Cleaning

Week 10: Mar 6-12
Homework 3 HANDED OUT on Monday

Topics:
  • Approach: Support Vector Machines
  • Theme: Privacy Preserving Data Mining

Readings:
  • Texts: Witten & Frank, Sections 4.6 and 6.3
  • Theme Papers: Your choice of 3 papers from this very nice bibliography (each presenter chooses a paper)

Week 11: Mar 13-19

Topics:
  • Approach: Classifier Combination
  • Theme: Mining Link and Network Data

Readings:
  • Texts: Witten & Frank, Section 7.5
  • Theme Papers: Mining Link and Network Data

Week 12: Mar 20-26
Homework 3 DUE on Monday

Topics:
  • Theoretical Issue: Computational Learning Theory
  • Theme: Data Mining for Security Applications

Readings:
  • Texts: See Tom Mitchell’s book
  • Theme Papers: Data Mining for Security Applications

Week 13: Mar 27 - Apr 2

Topics:
  • Approach: Unsupervised Learning
  • Approach: Genetic Algorithms

Readings:
  • Texts: Witten & Frank, Sections 4.8 and 6.6
  • Texts: See Tom Mitchell’s book

Week 14: Apr 3-9

Topics:
  • Project Presentations

Week 15: Apr 10

Topics:
  • Project Presentations