Data Mining
CSE-4412
Fall 2012
York University


Semester: Fall 2012
Course/Sect#: CSE-4412
Time: Tue and Thu 1:00pm-2:30pm
Location: CC 109
Instructor: Aijun An
Office: CSE 2048
Office Hours: Tuesdays and Thursdays 2:40-3:30pm
Phone #: 416-736-2100 x44298
e-mail: aan@cse.yorku.ca


Welcome to the Data Mining course, CSE-4412, for Fall 2012. Materials, instructions, and notices for the course will accumulate here over the semester.


Message Board

December 29, 2012
Grades are posted. You can check yours by using ePost.
December 6, 2012
Please be reminded that the final exam will take place at 7:00pm-9:30pm on Friday December 7. The location is ACW 004.
November 15, 2012
Project is posted. Please see the link below in the Assignments and Project section.
November 7, 2012
Marks are posted. You can see yours using ePost.
November 7, 2012
The due date for Assignment 2 is extended to Wednesday November 14 at 8pm.
November 6, 2012
Solutions to midterm questions are posted. Click here to download.
November 6, 2012
An FAQ page for Assignment #2 is created. Please see here.
October 31, 2012
Assignment #2 is posted. Please see the link below in the Assignments and Project section.
October 21, 2012
Please be reminded that the midterm test will be held on Thursday October 25 during class time in CB 129. For sample test questions, click here. The username and password are the same as the ones used for accessing the lecture notes.
October 18, 2012
A sample solution to A1 is posted. Click here to download.
October 11, 2012
The midterm test will be held on Thursday October 25 during the class time. The classroom for the midterm is CB 129.
September 25, 2012
Assignment #1 is posted. Please see the link below in the Assignments and Project section. Access to the assignment is password-protected. The username and password are the same as the ones you use for downloading the lecture notes.
September 5, 2012
This web page is set up. Welcome to the course!


Description

Data mining, or knowledge discovery from databases (KDD), is one of the most active areas of research in databases. It lies at the intersection of database systems, statistics, AI/machine learning, and data visualization. This course introduces the concepts of data mining and presents data mining algorithms and applications. Topics include association rule mining, sequential pattern mining, classification models, clustering, and text mining.


Prerequisites

  • Required: a course on data structures and an introductory course on database systems.
  • Preferred: basic concepts in probability and statistics.


Materials

  • Textbook
  • Reference Books
    • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
    • Ian H. Witten and Eibe Frank, Data Mining -- Practical Machine Learning Tools and Techniques (Second Edition), Morgan Kaufmann, 2005.
    • S.M. Weiss and N. Indurkhya, Predictive Data Mining, Morgan Kaufmann, 1998.
    • Margaret H. Dunham, Data Mining -- Introductory and Advanced Topics, Prentice Hall, 2003.
    • Some conference/journal papers (will be posted over the semester).


Grading Scheme

  • Assignments (25%)
  • Midterm (20%) (Thursday October 25 in CB 129 during class time)
  • Project (20%)
  • Final exam (35%)


Lectures


Assignments and Project

  • Assignment 1 (13%) (Due Thursday October 11 in class)
  • Assignment 2 (12%) (Due Wednesday November 14 by 8pm)
  • Project (20%) (Due Monday December 3 at 5pm)

Useful On-line Information

Academic Honesty