CSE-4411M
Database Management Systems

York University
Winter 2013
Syllabus
Instructor: Parke Godfrey
Office: CS&E Building #2050
Office Hours: 3-5pm Wednesdays
& by appointment
Ph#: 416-736-2100 x66671
e-mail: godfrey@cse.yorku.ca
Term: Winter 2013
Time: Tu&Th 10:00-11:30pm
Place: CB #122
Textbook: Raghu Ramakrishnan & Johannes Gehrke
Database Management Systems
Third Edition, 2003
WCB/McGraw Hill.
ISBN: 0-07-246563-8
URL: http://www.cs.wisc.edu/~dbbook
Class URL: http://www.cse.yorku.ca/course/4411/
  The Course
Description (from the academic calendar)

This course is the second course in database management. It introduces concepts, approaches, and techniques required for the design and implementation of database management systems.

Course Objectives and Content

In this course, we go "under the hood" to learn how a relational database management system is built. Students will learn the issues involved in designing efficient database systems, and the strategies, data-structures, and algorithms used in the implementation of such systems.

The course is designed in three parts: the physical database, the query processor, and database management. Specific contents include the following.

I. The Physical Database
  • file organizations
  • indexes
    • tree-structured indexes
    • hash-based indexes
  • external sorting
II. The Query Processor
  • evaluation of relational operators
    • selection
    • projection
    • joins (the many ways)
    • set operations
    • aggregate operations
  • relational query optimization
    • query evaluation plans
    • translating SQL queries into algebra
    • considering alternative plans
    • cost models and estimations
III. Database Management
  • transaction management
  • concurrency control
  • crash recovery
  • physical database design and tuning

The content of CSE-3421 is assumed.

Required Textbook / Reading

Course materials will be primarily drawn from the assigned readings in the assigned textbook. Some auxiliary readings may be assigned during the course.

 
  Grading Criteria / Course Requirements
Components
              percentage   when
test #1       20%          12 February
test #2       20%          19 March
final exam    35%          ? April
assignments   3 × 3⅓%      three, due through the semester
projects      2 × 7%       two, due through the semester

The grading policy is standard. Collaboration and discussion on the assignments are fine. Note, however, that unless you work through the exercises yourself, you will not get much benefit from them.

York University's rules for academic honesty and plagiarism are always in effect. (See below.) Discussion is fine on the assignments and projects. However, collaboration is not. The work must be your own.

Course Projects & Assignments

There will be two projects, in sum worth 10% of the final grade. Students will be required to design and implement some of the key algorithms and data-structures for a relational database system. Your code will be in Java.

The projects will probably be organized as follows. We may use the Minibase system. (See Appendix B in the textbook.) The Minibase system provides a code base for a simple database system. Your code will augment the Minibase code. The good news is that you do not have to do everything from scratch; the bad news is that you will have to understand the Minibase code to use it.

Late projects will not be accepted, unless prior approval has been obtained with good reason.

There will be three assignment sets through the semester. It is important to put effort into the assignments, both to understand the material and because similar problems will appear on the exams. Solutions to assignment sets will be made available after the turn-in dates for study purposes.

Written work for the assignments need not be typed, but if you hand-write it, you must write legibly to receive credit. Late assignment sets will not be accepted, unless prior approval has been obtained with good reason.

 
  Calendar

The following schedule is tentative, which means we may deviate from it somewhat during the term. We may need to slow down, or speed up, on certain topics, depending on how things go. As well, we might not get to everything on the schedule, though we shall try. It is your responsibility to keep track of due dates and what we are covering.

Wk#  Days       Topic                                              Reading  Due
#1   Tu 8 Jan   You want me to code what?! Introduction
I. The Physical Database
     Th 10 Jan  Where can I keep this? Storage & the DSM           Ch 8
#2   Tu 15 Jan  Paging Dr. Codd...                                 Ch 9
     Th 17 Jan  The Buffer Pool & Formats
#3   Tu 22 Jan  Disks & Files
     Th 24 Jan  Yo. Trees are for the birds.                       Ch 10
#4   Tu 29 Jan  Tree-structured Indexes
     Th 31 Jan  Hash-based Indexes                                 Ch 11    A#1
#5   Tu 5 Feb   Take it outside!                                   Ch 13
     Th 7 Feb   External Sorting                                            P#1
#6   Tu 12 Feb  Test #1 Arghhh!
II. The Query Processor
     Th 14 Feb  You ought to see some of my relations!             Ch 12
Reading Week: 16–22 Feb
#7   Tu 26 Feb  Evaluation of The Relational Operators (overview)
     Th 28 Feb  I once joined these two tables.                    Ch 14
#8   Tu 5 Mar   Relational Query Optimization: Algorithms
     Th 7 Mar   Listen up. Here's the plan...                      Ch 15    A#2
#9   Tu 12 Mar  Relational Query Optimization: Plans
     Th 14 Mar
#10  Tu 19 Mar  Test #2 Not again!
III. Database Management
     Th 21 Mar  One at a time, please.                             Ch 16
#11  Tu 26 Mar  Transaction Management
     Th 28 Mar  & Concurrency Control                              Ch 17    P#2
#12  Tu 2 Apr   Crash & burn. Crash Recovery                       Ch 18
     Th 4 Apr   Wrap-up                                                     A#3
final exam period (10–26 Apr)
#13  ? Apr      Final Exam

A#: Assignment '#' due.
P#: Project '#' due.

Assignments and projects are due on the Thursday of the week indicated above. For example, Assignment #2, due in Week #8 (with lectures on Tuesday 5 March and Thursday 7 March), is due on Thursday 7 March. Specific details for turning in assignments and projects will be set. Assignment solutions should be dropped off in the class's dropbox next to #1003, the main office for CS&E, in the Lassonde (CS&E) building. Project solutions should be submitted online with the submit command on a PRISM machine by 11:59pm.

 
  Policies
Exams & Attendance

Exams must be taken when scheduled unless the student has medical documentation or can demonstrate special circumstances that require a rescheduled exam. The student must obtain approval from the instructor.

Class attendance is important, as the student will have an opportunity to ask for clarification of course and text material. There will be problem-solving sessions during the class period so that students gain experience applying the theory in practice. However, attendance itself is not part of the grade or otherwise enforced.

Academic Integrity / Honesty / Plagiarism

The Department of Computer Science (& Engineering) Academic Honesty Guidelines are in effect for this course, as, indeed, they are for any CS&E course.

Plagiarism is defined as taking the language, ideas, or thoughts of another, and representing them as your own. If you use someone else's ideas, cite them. If you use someone else's words, clearly mark them as a quotation. Note that plagiarism includes using another's computer programs or pieces of a program. All noted instances of plagiarism will be reported.

These policies are not intended to keep students from working with other students. One can learn much working with others, so this is to be encouraged. Should you encounter any situation in which you are uncertain whether collaboration is permitted, please ask.