CSE-4411M
Database Management Systems

York University
Winter 2013
Test #2: Preparation
  Layout

Probably four or five questions. I am aiming to make it shorter than midterm #1 so there is not as much time pressure. The type of questions and layout of the test will be similar the first test.

exercise
  • E.g., What is the I/O cost of this query plan?
    How many tuples is this query likely to return? (Use assumptions of System R to estimate.)
short answer
  • E.g., Which join algorithm would be most appropriate for this query?
analysis
  • E.g., How would we need to modify the 2-pass sort-merge join to accommodate...

 
  Coverage

Midterm #2 is not intended to be cumulative, and so will not test directly on the subjects covered up to Midterm #1, the physical database, per se.

So not covered:

  • physical database
    • disk space manager
    • buffer pool manager
    • file, page, and record formats
    • internals of index data structures
      • B+ trees
      • extendable and linear hashs
    • basics of the external sort algorithm

That said, everything we have done since Test I builds on the material from before. So in that sense, you are responsible for the previous material for this exam, by necessity.

  1. indexes (needed for access paths) (Ch.8) REVIEW
    • prefix matching
    • cost model for using
    • clustered vs unclustered
    • ...
  2. external sorting (Ch.13)
    • basics REVIEW
      Many other relational algorithms are based on this, and have the same performance issues.
    • advanced issues (e.g., double buffering, block reads and writes, etc.)
  3. query evaluation, overview (Ch.12)
    1. access paths
    2. cost model & System R
      1. reduction factors
      2. cost estimation
    3. query trees & alternative plans
      1. pipelining
      2. optimization "rules" like pushing selects
      3. ...
  4. evaluating the relational operators (Ch.14)
    1. select & access paths
    2. project with distinct
    3. join
      1. nested loops join
        1. block nested loops join
        2. index nested loops join
      2. sort-merge join
        1. na´ve: sort first, then merge
        2. 2-pass: accomplishes join in two passes
      3. hash join (two pass)
    4. the set operations
    5. aggregation
  5. relational optimizer (Ch.12&15)
    As covered in Chapter 12, have gone through Chapter 15 very generally. Nothing advanced from Chapter 15.
    1. System R
    2. left-join trees
    3. pipelining
    4. general about the join-enumeration algorithm

 
  Policies

The test will be closed-note, closed-book. You may bring a calculator, but the test will be designed so that any math shouldn't need one.

There will be space on the text packet for writing answers. I will bring extra paper, in case anyone needs it, to attach.

The test will be for the class lecture time, so 75 minutes.

 
Parke Godfrey