COSC-6421: Advanced Database Systems
Assn #2: SQL
Querying the York River Bookseller Database
you will work with an existing database,
the York River Bookseller's Database (YRB DB).
each will install his or her own private copy of the database
on which to work.
You are to implement some SQL queries over the YRB DB,
using our IBM DB2 server.
There are five queries that you are to write.
Each is worth four points,
for a total of twenty points for Assn #2.
I am the manager and I provide you with English specifications
for the queries that I want answered from the database.
You are providing SQL implementations of these queries for YRB DB
to do the job.
Each query is given a name for bookkeeping.
Also, the answer table for each is provided!
You have successfully implemented the SQL query
if you get the same result when run in DB2 on YRB DB.
(And, of course, you have not cheated by making a simple query on purpose
that manages to print the same results!)
Otherwise, your query is unsuccessful.
Grading per query is as follows:
two points for style and logical correctness;
and two points for "compiling" and producing the correct answer set.
scripts are provided:
The script yrb-create will create the YRB DB schema for you.
It will also populate the tables with mock data.
The script yrb-drop is provided for convenience.
It will drop your copy of YRB DB from your DB2 schema space.
If you mess things up,
you can always drop YRB DB and then re-create it easily.
To create the YRB DB in your DB2 schema space:
Read the schema definition in yrb-create for YRB DB
to understand the design and what the YRB DB is about.
York River Booksellers is an online bookstore.
Customers belong to various clubs.
Everybody belongs to at least the club basic.
Books are available via different offers per club.
Thus the price of a book is determined by which offer (thus club)
it was bought under.
Consider an order to be all the books
a customer bought at the same time (when in purchase).
These will be shipped together.
DB2 database named c6421 has been created on the DB2 server
on Prism for your use.
Log into, or open a window, on any Prism machine
There, you can connect to the database
to set up your YRB DB and execute your queries.
See the resources links on our class homepage for basic
information about DB2, connecting, and basic commands.
To how many university clubs
(for example, Guelph, UofT, Waterloo, and York)
does each customer belong?
Do not allow duplicate rows in the answer table.
If a customer belongs to none, list zero for that customer.
Also, you are not allowed to use the words
Guelph, UofT, Waterloo, or York in your query.
We may add new university clubs in the near future.
Order by name + city.
Answer table for uniclub.
See the SQL operator like in the textbook
(pp 140-141, 3rd ed.).
Which pairs of customers have bought at least three books in common?
Print three columns: two with the customers' names and
one with the number of books in common.
Do not return any duplicates.
Furthermore, say 'Mark Dogfurry' and 'Zebulon
Zilio' have four books in common, only output ('Mark Dogfurry',
'Zebulon Zilio', 4) and not ('Zebulon Zilio', 'Mark Dogfurry', 4)!
If 'Mark Dogfurry' and 'Zebulon Zilio' have each bought the same
book three times, this does not count. It has to be at least three
Order by the names.
For each tuple, be certain that name first (column one)
is alphabetically before name second (column two).
Answer table for pairs.
All the books a customer orders at the same time (when) are considered
to be part of the same "order".
Those books are shipped together to the customer and the customer is billed
for the entire order.
What is the bill for each order?
Show the customer's name and city,
the date and time of the order (not the timestamp!), and the bill.
(The column function date extracts the date from a timestamp,
and the column function time extracts the date from a timestamp.)
Order by name + city + when.
Answer table for orders.
Want to shorten one of your decimal column types so your lines do not
wrap on output?
You can cast it to be smaller.
- cast(col_a as decimal(5,2))
This will print decimals (numbers) with 5 numerals to the left of the
"." and 2 to the right.
What is the total weight of each order?
Order by weight, descending,
and then by customers' names (in cases of ties on weight).
Answer table for weights.
For each language,
what is the city that has the largest
percentage of book sales in that language,
across cities which have had at least 20 books in total sold?
Measure book sales in number of books.
For example, say 25% of the books sold in Yorktown are
French, 22% in Harrisonburg are French, 18% in Richmond, and so
forth (and each of this cities qualifies in that at least 20 books
have been sold in each),
"French Yorktown 25" would be in the output.
Show language, city, percentage, and number of books
(sold in that city in that language) for the output rows.
Order by language + city.
Answer table for percentage.
Due by 11:59pm Friday 16 March 2007.
write your query in a file with the corresponding name as above
Do not include the answers generated by each of your queries.
I will test your queries myself.
When you are finished,
use the submit command to turn in your work.
We shall publish soon the submit script that you are to use.
The command will look something like:
- % submit 6421 two
uniclub pairs orders weights percentage
SQL provides a useful clause called with.
It is easiest to show it's use by example.
Who has bought a book via some club
that he or she could have gotten less expensively
using another club to which he or she belongs?
List the customer's name, the book's title and year,
the quantity bought, the price for which it was bought,
and the best price for which it could have been bought.
best (cid, title, year, lowest) as
(select distinct M.cid, O.title, O.year, min(price)
from yrb_member M, yrb_purchase P, yrb_offer O
where M.club = O.club and
M.cid = P.cid and
P.title = O.title and P.year = O.year
group by M.cid, O.title, O.year)
select C.name, P.title, P.year, qnty, price, lowest
from yrb_customer C, yrb_purchase P, Best B, yrb_offer O
where P.cid = B.cid and P.title = B.title and
P.year = B.year and P.title = O.title and
P.year = O.year and P.club = O.club and
C.cid = P.cid and
O.price > B.lowest
order by C.name, P.title, P.year;
Answer table for query refund.
One may have more tables in the with clause, if need be:
first (...) as
second (...) as
third (...) as
Note that you can use the (temporary) view first
inside the definition for second,
and the views first and second
inside the definition for third,
and so forth.
DB2 is picky about nested SELECTS always being named
from (select ...) as A,
(select ...) as B,
Even if you do not need the alias names A or B
later on, give the sub-queries (nested SELECTS) names anyway.
Otherwise, DB2 will complain.