Data Mining
Winter 2011
York University

Semester: Winter 2011
Course/Sect#: CSE-4412
Time: Tue 2:30pm-4:00pm, Thu 2:30pm-4:00pm
Location: VH 1018
Instructor: Aijun An
Office: CSE 2048
Office Hours: Tuesdays and Thursdays 4:10-5:10pm
Phone #: 416-736-2100 x44298

Welcome to the Data Mining course, CSE-4412, for Winter 2011. Materials, instructions, and notices for the course will accumulate here over the semester.

Message Board

April 29, 2011
Grades are posted. You can check yours by using ePost.
April 5, 2011
Please be reminded that the final exam will take place from 10:00am to 12:30pm on Sunday, April 10. The location is VH 1016. Information regarding the exam has been emailed to you.
March 15, 2011
The project is posted. Please see the link below in the Assignments and Project section.
March 7, 2011
An FAQ page for Assignment #2 has been created. Please see here.
March 4, 2011
Midterm marks are posted. You can see yours using ePost.
March 3, 2011
Assignment #2 is posted. Please see the link below in the Assignments and Project section.
February 23, 2011
Please be reminded that the midterm test will be held on Tuesday March 1 in class. For sample test questions, click here. The username and password are the same as the ones used for accessing the A1 questions.
January 22, 2011
Assignment #1 is posted. Please see the link below in the Assignments and Project section. Access to the assignment is password-protected. The username and password have been sent to your cse email account. If you did not receive them, let me know by email.
January 3, 2011
This web page is set up. Welcome to the course!


Data mining, or knowledge discovery from databases (KDD), is one of the most active areas of research in databases. It is at the intersection of database systems, statistics, AI/machine learning, and data visualization. In this course, we will introduce the concepts of data mining and present data mining algorithms and applications. Topics include association rule mining, sequential pattern mining, classification models, clustering, and text mining.


  • Required: an introductory course on database systems.
  • Preferred: basic concepts in probability and statistics.


  • Textbook
  • Reference Books
    • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
    • Ian H. Witten and Eibe Frank, Data Mining -- Practical Machine Learning Tools and Techniques (Second Edition), Morgan Kaufmann, 2005.
    • S.M. Weiss and N. Indurkhya, Predictive Data Mining, Morgan Kaufmann, 1998.
    • Margaret H. Dunham, Data Mining -- Introductory and Advanced Topics, Prentice Hall, 2003.
    • Some conference/journal papers (will be posted over the semester).

Grading Scheme

  • Assignments (25%)
  • Midterm (20%)
  • Project (20%)
  • Final exam (35%)


Assignments and Project

Useful On-line Information

Academic Honesty