Data Mining
Winter 2010
York University

Semester: Winter 2010
Course/Sect#: CSE-4412
Time: Tue 2:30pm-4:00pm, Thu 2:30pm-4:00pm
Location: BC 325
Instructor: Aijun An
Office: CSB 2048
Office Hours: Thursdays and Fridays 12:00-1:00pm
Phone #: 416-736-2100 x44298

Welcome to the Data Mining course, CSE-4412, for Winter 2010. Materials, instructions, and notices for the course will accumulate here over the semester.

Message Board

April 30, 2010
Grades are posted. You can check yours by using "courseInfo 4412 2009-10 W".
April 6, 2010
Please be reminded that the final exam will take place from 2pm to 4:30pm on Thursday, April 8. The location is VH 1154. Information regarding the exam was emailed to you this morning. Please check your email.
March 19, 2010
Project is posted. Please see the link below in the Assignments and Project section.
March 16, 2010
The deadline for A2 submission is extended to Friday March 19 by 6pm.
March 5, 2010
An FAQ page for Assignment #2 has been created. Please see here.
March 5, 2010
A1 and midterm marks are posted for those who have not yet gotten back their A1 or midterm paper. You can see yours using "courseInfo 4412".
March 3, 2010
Assignment #2 is posted. Please see the link below in the Assignments and Project section.
February 17, 2010
Please be reminded that the midterm test will be held on Tuesday, February 23, during the regular class time in HNE 001. Please note that this room is different from our regular lecture room. For sample test questions, click here. For sample solutions to A1, see here.
February 1, 2010
I have set up an FAQ page to answer frequently-asked questions about A1. Please click here. The username and password are the same as the ones used for accessing the A1 questions.
January 21, 2010
Assignment #1 is posted. Please see the link below in the Assignments and Project section. Access to the assignment is password-protected. The username and password have been sent to your CSE email account. If you do not receive them, let me know by email.
January 3, 2010
This web page is set up. Welcome to the course!


Data mining, or knowledge discovery from databases (KDD), is one of the most active areas of research in databases. It lies at the intersection of database systems, statistics, AI/machine learning, and data visualization. In this course, we will introduce the concepts of data mining and present data mining algorithms and applications. Topics include association rule mining, sequential pattern mining, classification models, clustering, and text mining.


  • Required: an introductory course on database systems.
  • Preferred: basic concepts in probability and statistics.


  • Textbook
  • Reference Books
    • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
    • Ian H. Witten and Eibe Frank, Data Mining -- Practical Machine Learning Tools and Techniques (Second Edition), Morgan Kaufmann, 2005.
    • Margaret H. Dunham, Data Mining -- Introductory and Advanced Topics, Prentice Hall, 2003.
    • Some conference/journal papers (will be posted over the semester).

Grading Scheme

  • Assignments (25%)
  • Midterm (20%)
  • Project (20%)
  • Final exam (35%) (Time: 2pm-4:30pm on April 8. Location: VH 1154)


Assignments and Project

Useful On-line Information

Academic Honesty