CSI 7163B – COMP6605P

 

Advanced topic in Computer Systems:  Machine Learning and Data Mining Systems

Category: S, A

 

Tue, 1-4; room LPR154 for the first class, later SITE 5084

 

Instructor:  Dr. Stan Matwin, SITE 5100, stan@site.uottawa.ca

 

Website: www.site.uottawa.ca/~stan/csi/7163

 

 Assignment 1 is here

 

Project is here

Message from William 29/3 is here:

Dear Students,

I apologize for the short notice, but I think I can manage an information session to help you out with questions you have regarding your projects. I will make myself available from 13:00 to 15:00 on Friday 31/03/2007 in the Tamale Lab, room SITE 3-033. One note, though: I must leave for a 15:00 meeting right afterwards, so please keep that in mind.

In addition, I will have your assignments ready to hand back on Tuesday after the Tamale seminar. You can come to the same room shown above to get your assignments back (but only after Tamale, since I have an engagement before then).

Kind Regards

WE

 

Project help for data conversion to the BOW and TF/IDF format is here

Project is due Apr. 3
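
For orientation, the BOW and TF/IDF conversion mentioned above can be sketched in a few lines of Python. This is only an illustrative sketch, not the format or procedure described in the project help file: the function names `bag_of_words` and `tf_idf`, the toy documents, and the choice of idf = log(N / df) (one common variant among several) are all assumptions made for the example.

```python
import math
from collections import Counter

def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0] * len(vocab)
        for tok, n in Counter(doc).items():
            vec[index[tok]] = n  # raw term frequency
        vectors.append(vec)
    return vocab, vectors

def tf_idf(vectors):
    """Weight raw counts by idf = log(N / df) (assumed variant)."""
    n_docs = len(vectors)
    n_terms = len(vectors[0]) if vectors else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for vec in vectors if vec[j] > 0) for j in range(n_terms)]
    return [
        [vec[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
        for vec in vectors
    ]

# Toy documents, invented for illustration only.
docs = [
    "data mining with weka".split(),
    "text mining with svm".split(),
    "weka weka svm".split(),
]
vocab, bow = bag_of_words(docs)
weighted = tf_idf(bow)
```

Each row of `bow` or `weighted` corresponds to one document and each column to one vocabulary term, which is the general shape of data a learner expects as numeric attributes.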

 

Slides on the t-test are here
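
As a rough illustration of how a paired t-test might be used to compare two classifiers evaluated on the same cross-validation folds, the sketch below computes the t statistic by hand. The accuracy values are invented for the example and the function name `paired_t_statistic` is hypothetical; the slides may present the test differently.

```python
import math

def paired_t_statistic(a, b):
    """Paired t statistic for two equal-length lists of per-fold scores."""
    diffs = [x - y for x, y in zip(a, b)]
    k = len(diffs)
    mean = sum(diffs) / k
    # Sample variance of the differences (k - 1 in the denominator).
    var = sum((d - mean) ** 2 for d in diffs) / (k - 1)
    return mean / math.sqrt(var / k)

# Made-up per-fold accuracies for two classifiers on the same five folds.
acc_a = [0.80, 0.82, 0.78, 0.85, 0.81]
acc_b = [0.78, 0.79, 0.77, 0.80, 0.80]
t = paired_t_statistic(acc_a, acc_b)
```

The resulting statistic is compared against a t distribution with k - 1 degrees of freedom (here 4) to decide whether the difference is significant.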

In this class, we will cover the basics of Machine Learning and Data Mining (classifier induction; Decision Trees, Support Vector Machines, Naïve Bayes; selected applications: text mining, bioinformatics).
The class provides hands-on experience with WEKA, the open-source, state-of-the-art data mining suite, and is combined with a research seminar in data and text mining.

 

There are no formal prerequisites beyond general familiarity with Artificial Intelligence. Please consult the instructor if in doubt.

 

The grade will be based on a hands-on data mining project in WEKA, participation in the seminar part, and an essay; the exact breakdown will be given later.
Depending on the number of students, a presentation may be added.

 

Textbook:

Data Mining: Practical Machine Learning Tools and Techniques, Second Edition, by Ian H. Witten and Eibe Frank (Paperback, 2005)
Class notes (.ppt) will be published later

 

Slides used in class are here, .ppt and .pdf (2 per page)

 

WEKA workshop slides by William Elazmeh are here

 

The paper on feature selection for text classification is here

Introduction to SVM is here

The Schölkopf paper is here