Test #2 Preparation Guide | CSE-4411M : Database Management Systems | winter term 2013

CSE-4411M
Database Management Systems
York University
Winter 2013

Test #2: Preparation

Layout

Probably four or five questions. I am aiming to make it shorter than midterm #1 so there is not as much time pressure. The type of questions and layout of the test will be similar the first test.

exercise

E.g., What is the I/O cost of this query plan?
How many tuples is this query likely to return? (Use assumptions of System R to estimate.)

short answer

E.g., Which join algorithm would be most appropriate for this query?

analysis

E.g., How would we need to modify the 2-pass sort-merge join to accommodate...

Coverage

Midterm #2 is not intended to be cumulative, and so will not test directly on the subjects covered up to Midterm #1, the physical database, per se.

So not covered:

physical database
- disk space manager
- buffer pool manager
- file, page, and record formats
- internals of index data structures
  - B+ trees
  - extendable and linear hashs
- basics of the external sort algorithm

That said, everything we have done since Test I builds on the material from before. So in that sense, you are responsible for the previous material for this exam, by necessity.

indexes (needed for access paths) (Ch.8) REVIEW
- prefix matching
- cost model for using
- clustered vs unclustered
- ...
external sorting (Ch.13)
- basics REVIEW
  Many other relational algorithms are based on this, and have the same performance issues.
- advanced issues (e.g., double buffering, block reads and writes, etc.)
query evaluation, overview (Ch.12)
1. access paths
2. cost model & System R
  1. reduction factors
  2. cost estimation
3. query trees & alternative plans
  1. pipelining
  2. optimization "rules" like pushing selects
  3. ...
evaluating the relational operators (Ch.14)
1. select & access paths
2. project with distinct
3. join
  1. nested loops join
    1. block nested loops join
    2. index nested loops join
  2. sort-merge join
    1. naīve: sort first, then merge
    2. 2-pass: accomplishes join in two passes
  3. hash join (two pass)
4. the set operations
5. aggregation
relational optimizer (Ch.12&15)
As covered in Chapter 12, have gone through Chapter 15 very generally. Nothing advanced from Chapter 15.
1. System R
2. left-join trees
3. pipelining
4. general about the join-enumeration algorithm

Policies

The test will be closed-note, closed-book. You may bring a calculator, but the test will be designed so that any math shouldn't need one.

There will be space on the text packet for writing answers. I will bring extra paper, in case anyone needs it, to attach.

The test will be for the class lecture time, so 75 minutes.

Parke Godfrey