Data Mining
Fall 2014
York University

Semester: Fall 2014
Course/Sect#: CSE-6412
Time: Mon 10:00am-11:30am
Wed 10:00am-11:30am
Location: BC 322
Instructor: Aijun An
Office: CSB 2048
Office Hours: Mon and Wed: 12:00pm - 1:00pm
Phone #: 416-736-2100 x44298

Welcome to the Data Mining course, CSE-6412, for Fall 2014. Materials, instructions, and notices for the course will accumulate here over the semester.

Message Board

December 10, 2014
Please be reminded that the final exam is scheduled for Monday, December 15, from 10:00am to 12:30pm in LAS 3033. You can find some sample questions here.
December 2, 2014
The project proposal presentation schedule is posted.
November 17, 2014
Some sample project reports from previous terms are posted. See the link below under "Project". To access these reports, use the same user name and password that you use to download the lecture notes.
November 10, 2014
Project requirements are posted. See below under "Project".
November 6, 2014
The paper presentation schedule is posted.
November 3, 2014
The reading list for student paper presentations is posted. See the links below in the "Paper Review and Presentation" section for the reading list and requirements for the presentation.
October 24, 2014
An FAQ page for A2 has been set up. Please see A2 Frequently Asked Questions. Also, as mentioned in class, the deadline for A2 has been moved to Tuesday, November 4, at 5pm.
October 17, 2014
Assignment 2 is posted. See the link below in the "Assignments" section.
September 18, 2014
Assignment 1 is posted. See the link below under "Assignments".
September 5, 2014
This web site is set up. Welcome to the course! The first lecture will be from 10:00am to 11:30am on Monday, September 8.


Data mining or knowledge discovery from databases (KDD) is one of the most active areas of research in databases. It is at the intersection of database systems, statistics, AI/machine learning, and data visualization. In this course, we will introduce the concepts of data mining and present data mining algorithms and applications. Topics include association rule mining, sequential pattern mining, classification models, and clustering.


  • Required: an introductory course on database systems and an introductory course on probability.
  • Preferred: basic knowledge of statistics.

Reference Books and Materials

  • Jiawei Han, Micheline Kamber and Jian Pei, Data Mining -- Concepts and Techniques, Morgan Kaufmann, Third Edition, 2011.
  • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
  • Ian H. Witten and Eibe Frank, Data Mining -- Practical Machine Learning Tools and Techniques (Second Edition), Morgan Kaufmann, 2005.
  • S.M. Weiss and N. Indurkhya, Predictive Data Mining, Morgan Kaufmann, 1998.
  • Margaret H. Dunham, Data Mining -- Introductory and Advanced Topics, Prentice Hall, 2003.
  • Some conference/journal papers
  • More books can be found here

Grading Scheme

  • Assignments (25%)
  • Final exam (30%)
  • Paper review and presentation (10%)
  • Course project (25%)
  • Participation (10%)

Lecture Notes


Assignments

  • Assignment 1 (13%) (Due Wednesday October 8 in class) Please note that you need a user name and a password to access the assignment. Please check your email for the user name and password.
  • Assignment 2 (12%) (Due Tuesday November 4 by 5pm)

Paper Review and Presentation


Useful On-line Information