Data Mining
Fall 2010
York University

Semester: Fall 2010
Course/Sect#: CSE-6412
Time: Tue and Thu 2:30pm-4:00pm
Location: CC 335
Instructor: Aijun An
Office: CSB 2048
Office Hours: Tue and Thur: 4:00pm - 5:00pm
Phone #: 416-736-2100 x44298

Welcome to the Data Mining course, CSE-6412, for Fall 2010. Materials, instructions, and notices for the course will accumulate here over the semester.

Message Board

December 20, 2010
Please be reminded that project presentations will take place on Wednesday December 22 at 1:00-4:00pm in room CSE 3033. The project report and related programs are due on Friday December 24 at 5pm. For what to submit and how to submit it, see the project requirements page (follow the link below in the project section).
December 9, 2010
Please be reminded that the final exam is scheduled for Wednesday December 15 at 12:00-2:30pm in room CSE 3033. Project presentations will take place on Wednesday December 22 at 1:00-4:00pm in room CSE 3033.
November 16, 2010
Project requirements have been posted. Please see the link under the Project Section below.
November 10, 2010
I will be away on university business from November 25 to December 1. The make-up lesson for this period has been scheduled for Friday November 12 from 11:00am to 3pm in room 306 Lumbers. Note that this make-up class also includes make-up for yesterday's class, which was suspended due to the gas leak.
November 9, 2010
Paper presentation schedule is posted.
November 6, 2010
An FAQ page for A2 is set up. Please see A2 Frequently Asked Questions.
November 4, 2010
The reading list for student paper presentations is posted. See the links below in the "Paper Review and Presentation" section for the reading list and requirements for the presentation.
November 2, 2010
Assignment 2 is posted. See the link below in the "Assignments" section.
October 2, 2010
Assignment 1 is posted. See the link below under "Assignments".
September 13, 2010
This web site is set up. Welcome to the course! The first lecture will be at 2:30 - 4:00pm on Tuesday September 14.


Data mining, or knowledge discovery in databases (KDD), is one of the most active areas of research in databases. It lies at the intersection of database systems, statistics, AI/machine learning, and data visualization. This course introduces the concepts of data mining and presents data mining algorithms and applications. Topics include association rule mining, sequential pattern mining, classification models, and clustering.


Prerequisites

  • Required: an introductory course on database systems and an introductory course on probability.
  • Preferred: basic knowledge of statistics.

Reference Books and Materials

  • Jiawei Han and Micheline Kamber, Data Mining -- Concepts and Techniques, Morgan Kaufmann, Second Edition, 2006.
  • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
  • Ian H. Witten and Eibe Frank, Data Mining -- Practical Machine Learning Tools and Techniques (Second Edition), Morgan Kaufmann, 2005.
  • Margaret H. Dunham, Data Mining -- Introductory and Advanced Topics, Prentice Hall, 2003.
  • Some conference/journal papers (More will be posted over the semester).

Grading Scheme

  • Assignments (25%)
  • Final exam (Wed December 15 at 12:00 - 2:30pm in room CSE 3033) (30%)
  • Paper review and presentation (10%)
  • Course project (25%)
  • Participation (10%)

Lecture Notes


Assignments

  • Assignment 1 (12%) (Due Tuesday October 19 in class) Please note that you need a user name and a password to access the assignment. Please check your email for the user name and password.
  • Assignment 2 (13%) (Due Monday November 15 by 5pm)

Paper Review and Presentation


Useful On-line Information