Assignment 14

University of Konstanz
Databases & Information Systems Group
Prof. M. H. Scholl / Jens Teubner / Svetlana Vinnik
Information Systems
Assignments
Winter 2003/04
Assignment 14
(Handout date: Feb 3, 2004, Due date: Feb 10, 2004)
Exercise 1
Relational Algebra / SQL
12 Pts
A grocery store uses the following relation schema:
OpenOrders
order id
ClosedOrders
prod id
order id
prod id
Suppliers
suppl id
name
Products
prod id
descr
city
amount
date
amount
Offers
in stock
min
date
suppl no
price
suppl no
suppl id
price
prod id
price
max
• ‘Offers’ contains the products that supplier suppl id currently offers for price.
• ‘Products’ contains all products; amount describes how many items are in stock, there should be at
least min items in stock, but not more than max.
Express the following queries in Relational Algebra and SQL:
(a) The descriptions of all products that are currently in any open order.
(b) Which products appear in an open order, but there is a better offer by (maybe another) supplier (output:
product id, description, price in order, current offer)?
Express the following queries in SQL:
(a) What is the current minimum price for ‘Milk’ ?
(b) For which products will we go above the stock maximum, if all open orders are shipped?
The database administrator wants to redesign the database schema. He decides to put all orders (open
and closed) into a single table ‘Orders’. The status shall be marked with an additional field ‘status’ that can
either be ‘o’ or ‘c’. To make sure all applications will still be working when he has finished his modifications, he
wants to create views that return the same results as the original tables. Write down the SQL statements
the administrator will have to enter. (Create the new table, enter the existing content, delete the old
tables and create the new views.)
Exercise 2
Prolog
Implement Assignment 4, Exercise 1 in Prolog (Movie database).
(a) Define a fact base that represents the given tables and their content.
(b) Express the following queries as Prolog predicates:
1. List the titles of all films.
8 Pts
2. List the names of all female actresses.
3. Which actors/actresses starred in ‘Bourne Identity, The’ ? List their names.
4. List the titles of all films with female roles.
5. List the titles of all films with no female roles.
6. List the names of all actors/actresses that played together with ‘Grant, Hugh’ in any film.
Exercise 3
B+-Indexstrukturen (Extra-Aufgabe ohne Wertung)
Fahrer
fahrer id
15
6
10
3
9
2
4
17
name
Antonio Pizzonia
Kimi Räikkönen
Heinz-Harald Frentzen
Juan Pablo Montoya
Nick Heidfeld
Rubens Baricello
Ralf Schumacher
Jacques Villeneuve
team
Jaguar-Racing
McLaren-Mercedes
Sauber-Petronas
Williams-BMW
Sauber-Petronas
Ferrari
Williams-BMW
BAR-Honda
0 Pts
nation
Brasilien
Finnland
Deutschland
Kolumbien
Deutschland
Brasilien
Deutschland
Kanada
Der Betreiber einer Formel-Eins-Informationsseite im Internet will die Daten aller Fahrer erfassen. Ein
Praktikant gibt die obige Liste in das Datenbanksystem ein. Das Datenbanksystem verwendet zur Indizierung
des Attributs fahrer id einen B+-Baum der Ordnung 1 (2 Schlüsselwerte je Knoten; Fan-out: 3). Nachdem
der Praktikant die ersten drei Tupel eingegeben hat, sieht der B+-Baum wie folgt aus:
10
6
10 15
Der Praktikant gibt nun die restlichen Tupel in die Datenbank in der obigen Reihenfolge von oben nach
unten ein (also beginnend mit ‘Juan Pablo Montoya’).
(a) Wie sieht der B+-Baum nach jedem einzelnen Schritt aus wenn die Tupel weiter eingegeben
werden?
Kommt es zum Split eines Knoten, verhält sich der B+-Baum so, wie in der Vorlesung angegeben: d. h. die
jeweils größere Hälfte“ wandert in den rechten Knoten. Redistribution kennt das Datenbanksystem nicht.
”
(b) Schätzen Sie für B+- und Hash-Index den Aufwand in Anzahl Seitenzugriffen ab, der für eine
große Anzahl Datensätze n für das Auffinden eines Tupels notwendig ist. Wie groß ist der Aufwand
im Regelfall, wie können sich ungünstig verteilte Daten auswirken?
(c) Beim erstmaligen Laden von großen Tabellen (“Bulk Loading”) liegen die Quelldaten oft bereits vorsortiert vor. Welche Nachteile entstehen, wenn beim Ladevorgang der normale Einfüge-Algorithmus
verwendet wird? Wie könnte man einen Bulk-Loading-Algorithmus konzipieren, der diese Nachteile behebt (und natürlich nur mit vorsortierten Daten funktioniert)?