Data Mining
Fall 2008
York University

Semester: Fall 2008
Course/Sect#: CSE-6412
Time: Tue 2:30pm-4:00pm
Thu 2:30pm-4:00pm
Location: FS 106
Instructor: Aijun An
Office: CSB 2048
Office Hours: Tue and Thu, 4:00pm-5:00pm
Phone #: 416-736-2100 x44298

Welcome to the Data Mining course, CSE-6412, for Fall 2008. Materials, instructions, and notices for the course will accumulate here over the semester.

Message Board

February 25, 2009
The project presentation schedule is posted.
February 24, 2009
Project presentations will take place in room 306 Lumbers at 1pm on Thursday February 26, 2009.
February 18, 2009
Please be reminded that the final exam will take place tomorrow from 1:30pm to 3:30pm in CSE 3033.
February 2, 2009
Classes at York resume today. Our first class meeting will be tomorrow at the regular class time. Welcome back!
November 3, 2008
The paper presentation schedule is posted.
October 31, 2008
An FAQ page for A2 is set up. Please see A2 Frequently Asked Questions.
October 29, 2008
The reading list for student paper presentations is posted. See the links below in the "Paper Review and Presentation" section for the reading list and requirements for the presentation.
October 28, 2008
Today's class is cancelled because the instructor has to take care of a sick child. In the next class we will discuss when to hold a makeup class. I hope you all stay well during the flu/cold season.
October 22, 2008
Assignment 2 is posted. See the link below in the "Assignments" section.
September 24, 2008
Assignment 1 is posted. See the link below under "Assignments".
September 2, 2008
The web site is set up. Welcome to the course!


Course Description

Data mining, or knowledge discovery from databases (KDD), is one of the most active areas of research in databases. It is at the intersection of database systems, statistics, AI/machine learning, and data visualization. In this course, we will introduce the concepts of data mining and present data mining algorithms and applications. Topics include association rule mining, sequential pattern mining, classification models, and clustering.


Prerequisites

  • Required: an introductory course on database systems and an introductory course on probability.
  • Preferred: basic knowledge of statistics.

Reference Books and Materials

  • Jiawei Han and Micheline Kamber, Data Mining -- Concepts and Techniques, Morgan Kaufmann, Second Edition, 2006.
  • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
  • Ian H. Witten and Eibe Frank, Data Mining -- Practical Machine Learning Tools and Techniques (Second Edition), Morgan Kaufmann, 2005.
  • Margaret H. Dunham, Data Mining -- Introductory and Advanced Topics, Prentice Hall, 2003.
  • Some conference/journal papers (More will be posted over the semester).

Grading Scheme

  • Assignments (25%)
  • Final exam (30%)
  • Paper review and presentation (10%)
  • Course project (25%)
  • Participation (10%)

Lecture Notes


Paper Review and Presentation


Useful On-line Information