Data Mining
Winter 2009
York University

Semester: Winter 2009
Course/Sect#: CSE-4412
Time: Tue 1pm-2:30pm, Thu 1pm-2:30pm
Location: CB 120
Instructor: Aijun An
Office: CSB 2048
Office Hours: Tue and Thu 4-5pm
Phone #: 416-736-2100 x44298

Welcome to the Data Mining course, CSE-4412, for Winter 2009. Materials, instructions, and notices for the course will accumulate here over the semester.

Message Board

June 8, 2009
Grades are posted. You can check yours by using "courseInfo 4412 2008-09 W". Feedback on your project will be mailed to you.
May 24, 2009
Please be reminded that the final exam will take place from 9:00am to 11:30am on Tuesday May 26 in SLH 107. More information regarding the exam was emailed to you last Friday night. Please check your email.
May 8, 2009
Project is posted. Please see the link below in the Assignments and Project section.
May 3, 2009
The deadline for A2 submission is extended to Friday May 8 by 10pm.
April 30, 2009
An FAQ page for Assignment #2 has been created. Please see here.
April 23, 2009
Assignment #2 is posted. Please see the link below in the Assignments and Project section.
April 21, 2009
A1 and midterm marks are posted. You can see yours using "courseInfo 4412".
April 13, 2009
Please note that you are allowed to bring a calculator to the midterm test tomorrow.
April 12, 2009
Solutions to A1 questions are posted. Please see here. The user name and password are the same as the ones for accessing A1.
April 8, 2009
Please be reminded that the midterm test will be held on Tuesday April 14 at the class time in SLH 107. Please note that the room is different from our regular lecture room.
March 24, 2009
Assignment #1 is posted. Please see the link below in the Assignments and Project section. Access to the assignment is password-protected. The username and password have been sent to your cse email account. If you do not receive them, let me know by email.
March 19, 2009
If you are interested in York programming contests, the first contest of the Winter term will take place tomorrow at 3pm-5pm. See here for details.
March 4, 2009
This web page is set up. Welcome to the course!


Data mining, or knowledge discovery from databases (KDD), is one of the most active areas of research in databases. It is at the intersection of database systems, statistics, AI/machine learning, and data visualization. In this course, we will introduce the concepts of data mining and present data mining algorithms and applications. Topics include association rule mining, sequential pattern mining, classification models, clustering, and text mining.


  • Required: an introductory course on database systems.
  • Preferred: basic concepts in probability and statistics.


  • Textbook
  • Reference Books
    • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
    • Ian H. Witten and Eibe Frank, Data Mining -- Practical Machine Learning Tools and Techniques (Second Edition), Morgan Kaufmann, 2005.
    • Margaret H. Dunham, Data Mining -- Introductory and Advanced Topics, Prentice Hall, 2003.
    • Some conference/journal papers (will be posted over the semester).

Grading Scheme

  • Assignments (25%)
  • Midterm (20%) (April 14 in class time. Location: SLH 107)
  • Project (20%)
  • Final exam (35%)


Assignments and Project

To be posted over the semester
  • Assignment 1 (11%) (Due Tuesday April 7 in class)
  • Assignment 2 (14%) (Due Wednesday May 6 by 5pm, extended to Friday May 8 by 10pm)
  • Project (20%) (Due Thursday May 21 at 6pm)

Useful On-line Information

Academic Honesty