Overview • Homework • SQL data definition language • SQL data manipulation language (apart from querying) • SQL ≠ SQL • Some advanced SQL concepts Relational Database Systems 1 Wolf-Tilo Balke Joachim Selke query block SELECT Institut für Informationssysteme Technische Universität Braunschweig www.ifis.cs.tu-bs.de ALL expression AS DISTINCT column name * table name . attribute names Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig Exercise 7.1 Exercise 7.1 2. Explain how the GROUP BY clause works. What is the difference between the WHERE and the HAVING clause? • Please answer the following questions briefly. Use your own words! 1. What are the two major components of SQL and what function do they serve? • • • Data definition language (DDL): Used to create/modify the “structure” of the database, e.g. by defining schemas, tables, domains, and lots of other stuff. Data manipulation language (DML): Used to deal with the actual data, that is, insert/delete/modify data tuples from tables. Also used for querying. Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig • 3 4. What is the difference between correlated and uncorrelated subqueries? • If there is a GROUP BY clause, then no item in the SELECT list is allowed to include any reference to an attribute, unless that attribute is the argument to some aggregate function. NULL values are ignored by aggregate functions, with COUNT(*) being the only exception. Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 4 Exercise 7.1 3. What restrictions apply to the use of the aggregate functions within the SELECT statement? How do NULL values affect the aggregate functions? • GROUP BY partitions the rows of a table according to some given list of attributes. Exactly those tuples make up a partition that share identical values with respect to all attributes in the list. Special aggregate functions can be used to access group properties (e.g. number of rows). WHERE is imposes conditions on individual rows (before grouping), HAVING imposes conditions on groups. For example, HAVING COUNT(*) > 5 selects exactly those groups containing more than five rows. Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig Exercise 7.1 • 2 • 5 A subquery is correlated (to its surrounding query context) iff it uses a non-static value defined in the outer query. Therefore, correlated subqueries typically need to be evaluated once for each row processed by the outer query. Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 6 Exercise 7.2 Exercise 7.2 city(id, name) company(id, name, headquarters → city) hassubsidiary(company → company, city → city) employee(id, name, street, city → city) worksfor(employee → employee, company → company, salary) managedby(employee → employee, manager → employee) • An employee database: city(id, name) company(id, name, headquarters → city) hassubsidiary(company → company, city → city) employee(id, name, street, city → city) worksfor(employee → employee, company → company, salary) managedby(employee → employee, manager → employee) a) SELECT e.name, ci.name FROM employee e JOIN city ci ON e.city = ci.id JOIN worksfor w ON e.id = w.employee JOIN company co ON w.company = co.id WHERE co.name = ’First Bank Corporation’ • Give an expression in SQL for each of the following queries: ... Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 7 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig Exercise 7.2 city(id, name) company(id, name, headquarters → city) hassubsidiary(company → company, city → city) employee(id, name, street, city → city) worksfor(employee → employee, company → company, salary) managedby(employee → employee, manager → employee) b) Find all employees who live in the same city and on the same street as their manager. c) SELECT e.name, e.street, eci.name FROM employee e JOIN city eci ON e.city = eci.id JOIN managedby mb ON e.id = mb.employee JOIN employee m ON mb.manager = m.id WHERE e.street = m.street AND e.city = m.city Find all employees who live in city in which their company has its headquarters. SELECT e.name, e.street, eci.name FROM employee e JOIN city eci ON e.city = eci.id JOIN worksfor w ON e.id = w.employee JOIN company co ON w.company = co.id WHERE e.city = co.headquarters 9 Exercise 7.2 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 10 Exercise 7.2 city(id, name) company(id, name, headquarters → city) hassubsidiary(company → company, city → city) employee(id, name, street, city → city) worksfor(employee → employee, company → company, salary) managedby(employee → employee, manager → employee) city(id, name) company(id, name, headquarters → city) hassubsidiary(company → company, city → city) employee(id, name, street, city → city) worksfor(employee → employee, company → company, salary) managedby(employee → employee, manager → employee) d) Find all employees who live in a city in which their company has its headquarters or a subsidiary. WITH cc (company, city) AS (SELECT id, headquarters FROM company UNION SELECT company, city FROM hassubsidiary) SELECT e.name, e.street, eci.name FROM employee e JOIN city eci ON e.city = eci.id JOIN worksfor w ON e.id = w.employee JOIN cc ON w.company = cc.company WHERE e.city = cc.city Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8 Exercise 7.2 city(id, name) company(id, name, headquarters → city) hassubsidiary(company → company, city → city) employee(id, name, street, city → city) worksfor(employee → employee, company → company, salary) managedby(employee → employee, manager → employee) Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig Find the names and cities of all employees who work for the First Bank Corporation. e) Find the company that has the most employees. (Let’s assume there is a least one company that has employees.) WITH ce (company, numepms) AS (SELECT c.id, COUNT(*) FROM company c JOIN worksfor w ON c.id = w.company GROUP BY c.id) SELECT name, numemps FROM ce WHERE numemps = (SELECT MAX(numemps) FROM ce) 11 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 12 Exercise 7.2 Exercise 7.2 city(id, name) company(id, name, headquarters → city) hassubsidiary(company → company, city → city) employee(id, name, street, city → city) worksfor(employee → employee, company → company, salary) managedby(employee → employee, manager → employee) f) city(id, name) company(id, name, headquarters → city) hassubsidiary(company → company, city → city) employee(id, name, street, city → city) worksfor(employee → employee, company → company, salary) managedby(employee → employee, manager → employee) Find the company that has the smallest payroll (the payroll of a company is the sum of salaries of all their employees). g) Find all employees who earn more than each employee of the Small Bank Corporation. WITH payroll (company, p) AS (SELECT c.id, SUM(w.salary) FROM company c LEFT JOIN worksfor w ON c.id = w.company GROUP BY c.id) SELECT name, p FROM payroll WHERE p = (SELECT MIN(p) FROM payroll) Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig SELECT e.name, e.street, eci.name FROM employee e JOIN city eci ON e.city = eci.id JOIN worksfor w ON e.id = w.employee WHERE w.salary > (SELECT MAX(salary) FROM worksfor w2 JOIN company co2 ON w2.company = co2.id WHERE co2.name = ’Small Bank Corporation’) 13 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig Exercise 7.2 Exercise 7.2 city(id, name) company(id, name, headquarters → city) hassubsidiary(company → company, city → city) employee(id, name, street, city → city) worksfor(employee → employee, company → company, salary) managedby(employee → employee, manager → employee) city(id, name) company(id, name, headquarters → city) hassubsidiary(company → company, city → city) employee(id, name, street, city → city) worksfor(employee → employee, company → company, salary) managedby(employee → employee, manager → employee) i) h) Find all employees who do not have a manager. SELECT e.name, e.street, eci.name FROM employee e JOIN city eci ON e.city = eci.id LEFT JOIN managedby mb ON e.id = mb.employee WHERE mb.manager IS NULL Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig Find all companies that have subsidiaries in exactly the same cities as the Small Bank Corporation. WITH sbc (city) AS (SELECT city FROM hassubsidiary hs JOIN company co ON hs.company = co.id WHERE co.name = ’Small Bank Corporation’) SELECT co.name FROM company co LEFT JOIN hassubsidiary hs ON co.is = hs.company LEFT JOIN sbc ON hs.city = sbc.city GROUP BY co.id, co.name HAVING COUNT(sbc.city) = COUNT(*) AND COUNT(*) = (SELECT COUNT(*) FROM sbc) 15 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig Exercise 7.2 • Assume that there is a table numbers that consists of a single integer-valued attribute value. Explain in natural language what is returned by the following SQL statements: Find those companies whose employees earn a higher salary, on average, than the average salary at the First Bank Corporation. a) WITH avgsal (coid, coname, salary) AS (SELECT co.id, co.name, AVG(salary) FROM company co JOIN worksfor w ON co.id = w.company GROUP BY co.id, co.name) SELECT coname FROM avgsal WHERE salary > (SELECT MAX(salary) FROM avgsal WHERE coname = ’First Bank Corporation’) Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 16 Exercise 7.3 city(id, name) company(id, name, headquarters → city) hassubsidiary(company → company, city → city) employee(id, name, street, city → city) worksfor(employee → employee, company → company, salary) managedby(employee → employee, manager → employee) j) 14 SELECT * FROM numbers WHERE value > (SELECT AVG(value) FROM numbers) Returns all values that are large than the average value (in numbers). 17 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 18 Exercise 7.3 b) Exercise 7.3 SELECT value FROM (SELECT n.value, (SELECT COUNT(*) FROM numbers WHERE value < n.value) x FROM numbers n) WHERE (SELECT 0.5 * COUNT(*) FROM numbers) <= x Returns all values (in numbers) for that there are at least ⌊m / 2⌋ smaller values (in numbers), where m is the cardinality of the table numbers. c) SELECT value - a FROM numbers, (SELECT AVG(value) a FROM numbers) WHERE value >= a Returns all values (in numbers) that are at least as large as the mean (of the values in the table numbers), where the mean is subtracted from each value. Or: Returns all values that are at least as large as the median (of the values in the table numbers). 19 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.1 Recap 8.1 SQL DDL Environment “BigCompany database server” Catalog “humanResources” Schema “people” • Last week, you learned how to query an existing relational database ... • What’s missing? Catalog “production” Schema “taxes” Schema “products” ... ... Table “staff” Table “has Office” 20 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig Schema “training” Schema “testing” ... ... 21 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.1 SQL DDL: Schemas – How to create schemas, tables, ... – How to drop schemas, tables, ... – How to alter schemas, tables, ... DDL – How to insert new tuples into existing tables? – How to delete tuples from existing tables? – How to update tuples in existing tables? DML 8.1 SQL DDL: Tables • CREATE SCHEMA creates a new schema with a given name for a given owner • CREATE TABLE creates a new table with a given name – If no schema name is provided, the current username is used – If no explicit owner is provided, also the current user is used – Contains column definition for each column – Contains additional table-specific structural constraints • Example – CREATE SCHEMA heroes AUTHORIZATION balke , CREATE TABLE schema name AUTHORIZATION schema name name ( column def ) table constraint … CREATE SCHEMA 22 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig … user name AUTHORIZATION user name Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 23 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 24 8.1 SQL DDL: Tables 8.1 SQL DDL: Tables data type SMALLINT – Each column has a name and a data type – Each column may have multiple column options – Example: INTEGER DECIMAL ( precision • CREATE TABLE person ( name VARCHAR(200), age INTEGER) column def ) , scale FLOAT CHAR ( length ) VARCHAR ( length ) column name data type DATE column option TIME TIMESTAMP … Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 25 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 7.3 SQL DDL:Tables Name Syntax description Integer INTEGER Integer value Float FLOAT Floating point number of approximate precision (the supported precision is implementation-dependent) Numeric NUMERIC(p, s) 8.1 SQL DDL: Tables column def column name CHAR(x) A textual string of length x Character varying VARCHAR (x) A textual string of length at most x Date DATE Year, month, and day Time TIME Hours, minutes, and seconds Timestamp TIMESTAMP A date and a time data type column option column option NOT NULL DEFAULT A number with p digit before the decimal point and s digits after the decimal point Character 26 NULL constant … CHECK ( check condition ) CONSTRAINT PRIMARY KEY const. name UNIQUE reference clause … Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 27 8.1 SQL DDL: Tables 28 8.1 SQL DDL: Tables • Column constraints: – NOT NULL: The NULL value is not allowed for the column – Example: – Restricts possible values for the current column – May have a unique name indicated by CONSTRAINT <name> • CREATE TABLE person ( name VARCHAR(200) NOT NULL, age INTEGER NOT NULL) • If name is omitted, system creates a default name – CHECK: User-defined constraint. To be valid, values have to satisfy the condition. – Example: – DEFAULT: Defines the default value if a value is not explicitly set • CREATE TABLE person ( name VARCHAR(200), age INTEGER CONSTRAINT adult CHECK (age >= 18)) • Usually a constant or NULL • If omitted, NULL is the default Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 29 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 30 8.1 SQL DDL:Tables 8.1 SQL DDL:Tables – UNIQUE: No duplicate values are allowed within this attribute – PRIMARY KEY: Each table may have a primary key (optionally, but recommended) made up of at least one column • This option can only be used if the uniqueness constraints concerns only a single attribute • For multi-attribute uniqueness constraints, there is a different option (later) • Implies NOT NULL • This option can only be used if the primary key consists of only one column • For multi-column primary keys, you need a different option (later) • Implies NOT NULL and UNIQUE – Note: In DB2, NOT NULL has to be specified nevertheless ... – Example: – Again, DB2 needs an explicit NOT NULL – Additionally, a referential clause may be specified (see next slides) • CREATE TABLE person ( name VARCHAR(200) NOT NULL UNIQUE, age INTEGER NOT NULL) Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 31 32 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.1 SQL DDL: Referential Integrity 8.1 SQL DDL: Referential Integrity Example: • Rows in tables may refer to rows in other tables to capture relationships • Of course, you may not refer to a non-existing row Conceptual ER model power strength alias * Hero id has id * Super power name real name – Referential integrity between primary keys and foreign keys ensures that references are correct description ER model in 1NF Alias alias name has 1 * Hero id power strength * id * has Super power real name Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 33 8.1 SQL DDL: Referential Integrity – Either used by constraints in column options or within table constraints primary key Hero id realName foreign key reference clause HasPower heroId powerId 34 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig • Referential integrity can be defined using the REFERENCES clause Tables refer to others by primary keys and foreign keys aliasName description 8.1 SQL DDL: Referential Integrity Resulting tables: HasAlias heroId name REFERENCES table name ( powerStrength column name ) , primary key Power id name ON DELETE description Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 35 NO ACTION SET NULL CASCADE ON UPDATE NO ACTION CASCADE Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 36 8.1 SQL DDL: Referential Integrity 8.1 SQL DDL: Referential Integrity • Optionally, you may specify what happens if a row that is referenced will be deleted or modified • Example: – CREATE TABLE employee( id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(100) NOT NULL) – CREATE TABLE managedBy( employee INTEGER NOT NULL REFERENCES employee, manager INTEGER NOT NULL REFERENCES employee) – ON DELETE: If a referenced row is deleted, ... • NO ACTION: ...reject the deletion (that is, it cannot be performed) • SET NULL: ...delete it and set all referencing foreign keys to NULL • CASCADE: ...delete it along with all rows referring to it – ON UPDATE : If the primary key of a referenced row is modified, ... • NO ACTION: ...reject the modification (that is, it cannot be performed) • CASCADE: ...change all values of referencing foreign keys – Default is ON DELETE NO ACTION ON UPDATE NO ACTION ON DELETE 37 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig NO ACTION SET NULL CASCADE ON UPDATE NO ACTION CASCADE Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 38 8.1 SQL DDL:Table Constraints 8.1 SQL DDL:Table Constraints , name CREATE TABLE column def ( ) table constraint • Table constraints behave similar to constraints in column options … table constraint – If no name is provided, a name is automatically generated – The CHECK condition may contain any Boolean predicate – In contrast to column options, table constraints may declare primary keys consisting of multiple attributes – Foreign keys declare references to primary keys of other tables CHECK ( check condition ) PRIMARY KEY const. name CONSTRAINT ( column name ) , FOREIGN KEY column name ( ) reference clause • See referential integrity , Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 39 8.1 SQL DDL:Table Example HasAlias heroId primary key HasPower heroId realName powerId foreign key powerStrength primary key Power id name description 8.1 SQL DDL:Table Example delete alias if hero is deleted Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 40 • CREATE TABLE hero ( id INTEGER NOT NULL PRIMARY KEY, realName VARCHAR(100)) • CREATE TABLE power( id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(100), description VARCHAR(255)) link hasAlias to hero • CREATE TABLE hasAlias ( heroId INTEGER REFERENCES hero ON DELETE CASCADE ON UPDATE CASCADE, aliasName VARCHAR(100) NOT NULL, PRIMARY KEY (heroId, aliasName)) aliasName Hero id Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 41 update alias if hero is updated Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 42 8.1 SQL DDL:Table Example 8.1 SQL DDL: Drop Tables • For deleting tables, there is the DROP TABLE command • CREATE TABLE hasPower ( heroId INTEGER, powerId INTEGER, powerStrength INTEGER NOT NULL, PRIMARY KEY (heroId, powerId), FOREIGN KEY (heroId) REFERENCES hero ON DELETE CASCADE ON UPDATE CASCADE, FOREIGN KEY (powerId) REFERENCES power ON DELETE CASCADE ON UPDATE CASCADE ) , DROP TABLE table name CASCADE RESTRICT – If RESTRICT is used, you may only drop empty tables that are not referenced by any other table – If CASCADE is used, all referencing tables are also deleted (including all stored rows) • DB2 does not support CASCADE … – If neither is used, the table does not have to be empty, but must not be referenced by another one – Example • DROP TABLE hero CASCADE, power CASCADE Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 43 8.1 SQL DDL: Alter Tables 8.1 SQL DDL: Alter Tables – If you add a new column with a NOT NULL constraint, you also need to provide a default value – When dropping a column, you must either choose • After a table has been created, you may alter it by adding/removing columns or constraints ALTER TABLE name ADD COLUMN • CASCADE to also delete all tables referring to the column • RESTRICT to allow the drop only if there is no referring column (default) column def – If the name of a constraint is auto-generated, you need to look it up in the system catalog – Example: table constraint DROP PRIMARY KEY CONSTRAINT name • ALTER TABLE hasPower DROP powerStrength ADD COLUMN powerCharacteristic VARCHAR(100) CHECK FOREIGN KEY RESTRICT column name CASCADE Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 45 8.2 SQL DML • Now we have wonderful, empty tables • We need to put data into them! – Creating, changing, altering schemas, tables, ... CREATE SCHEMA CREATE TABLE ALTER TABLE DROP TABLE – INSERT INTO TABLE statement – You can specify into what columns you want to insert data • Default: All columns • Data manipulation language (DML) – New values may be stated explicitly as a table matching the insert columns – Querying • Alternatively, values may be provided by a query result • SELECT – Adding and updating data literal table • INSERT INTO • UPDATE • DELETE Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 46 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.2 SQL DML: Insert • Data definition language (DDL) • • • • 44 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig INSERT INTO table name ( column name ) , 47 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig query 48 8.2 SQL DML: Insert 8.2 SQL DML: Insert NULL literal table VALUES expression ( • INSERT INTO hero (id, realName) VALUES (1, ’Charles F. Xavier’), (2, ’Jean Grey’) • INSERT INTO hasAlias VALUES ( 1, ’Professor X’), ( 1, ’Onslaught’), ( 2, ’Phoenix’), ( 2, ’Marvel Girl’) ) , NULL expression , • A literal table is defined extensionally: James Howlet Charles Xavier Jean VALUES (’James’, ’Howlet’) Grey VALUES (’Charles’, ’Xavier’), (’Jean’, ’Grey’) Wolverine VALUES ’Wolverine’, (’Professor X’), ’Phoenix’ Professor X Phoenix , 49 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.2 SQL DML: Insert • Existing rows can also be changed using the UPDATE statement – Very similar to the SELECT statement – Update finds rows fulfilling a given condition and changes some of its rows by assigning new values – INSERT INTO heroesStartingWithA ( SELECT * FROM heroes WHERE realName LIKE ’A%’) 51 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.2 SQL DML: Update – UPDATE hero SET realName = NULL alias name • Multiply all powerStrength values by 10 – UPDATE hasPower SET powerStrength = powerStrength * 10 expression = • Change the name of hero with id 1 NULL , ( column name – UPDATE hero SET name= ’Charles Francis Xavier’ WHERE id = 1 DEFAULT , )= expression ( • Change name and id of Jean Grey ) – UPDATE hero SET (id, name)=(’3’, ’Jean Grey-Summers’) WHERE name = ’Jean Grey’ – Change of id is propagated to other tables when ON UPDATE CASCADE is used in table definition NULL DEFAULT ( WHERE 52 • Hide the real name of each hero table name column name Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.2 SQL DML: Update AS SET 50 8.2 SQL DML: Update • Of course, subqueries may also be used in INSERT statements UPDATE TABLE Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig query ) • Again, subqueries can be used in the WHERE clause search condition Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 53 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 54 8.2 SQL DML: Delete 8.2 SQL DML: Delete • The DELETE statement is used to delete rows from a table • Again, subqueries can be used here – DELETE FROM heroes h WHERE NOT EXISTS (SELECT * FROM hasAlias a WHERE a.heroId = h.id) – Deletes all rows satisfying a certain search condition – Example: • Delete Jean Grey – DELETE FROM hero WHERE name = ’Jean Grey’ • Delete all heroes – DELETE FROM hero table name DELETE FROM AS alias name WHERE search condition Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 55 8.3 SQL ≠ SQL Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 56 8.3 SQL ≠ SQL A timeline of SQL standardization: • First, the good news: – SQL has been standardized by the ISO in 1987 – The standard is well-maintained and under active development • 1986 – ANSI SQL – Relations, attributes, views – SELECT ... FROM ... WHERE ... • SQL-86, SQL-89, SQL-92, SQL:1999, SQL:2003, SQL:2006, SQL:2008 • 1987 – Many “big” database vendors participate in the standardization process: – SQL-86 (ISO/IEC 9075:1986) • 1989 • IBM, Ingres, Microsoft, Oracle, Sybase, ... – SQL-89 (SQL1) • ≈ SQL-86 + restricted referential integrity Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 57 8.3 SQL ≠ SQL Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 58 8.3 SQL ≠ SQL • 1992 • 1999/2000 – SQL:1999 (SQL3) – 5 parts, 2,084 pages – ≈ SQL-92 + object-orientation, recursive queries, triggers, OLAP, user-defined types, regular expressions – Computationally complete, object-oriented database programming language, descriptive and procedural – Core (about180 features) – SQL-92 (SQL2) – 3 parts, 1,120 pages – Entry Level • ≈ SQL-89 + CHECK (attribute) – Intermediate Level • ⊇ Entry Level + domains, CHECK (relation), CASE, CAST, JOIN, EXCEPT, INTERSECT • ≈ SQL92 Entry Level + parts of Intermediate and Full Level – Full Level – 9 Packages (about 240 features) • ⊇ Intermediate Level + assertions, nested select, nested from Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig • enhanced date/time, enhanced integrity, OLAP, PSM, CLI, basic object support, enhanced object support, trigger, SQL/MM 59 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 60 8.3 SQL ≠ SQL 8.3 SQL ≠ SQL • Solution: • Recursive queries (SQL:1999): – How to find all rivers that flow into the North Sea? – Multiple joins? flows_into river mouth Oker Aller Aller Weser Weser North Sea Elbe North Sea Edder Wietze Flöth Wietze Wietze Aller Isar Danube Inn Danube Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig WITH RECURSIVE flows_into_ns (river) AS (SELECT river FROM flows_into WHERE mouth = ’North Sea’ UNION SELECT fi.river FROM flows_into AS fi JOIN flows_into_ns AS fins ON fi.mouth = fins.river) SELECT river FROM flows_into_ns • Note: DB2 uses a different syntax – No RECURSIVE, UNION ALL instead of UNION, no explicit JOIN 61 8.3 SQL ≠ SQL Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 62 8.3 SQL ≠ SQL • 2003 • 2006 – SQL:2006 – Successor of SQL:2003 – A new extension for XML handling – SQL:2003 – 14 parts, 3,606 pages – MULTISET as an explicit construct (with numerous operations, such as: MULTISET UNION, MULTISET EXCEPT, MULTISET INTERSECT, CARDINALITY) – Sequence generators • Importing, storing, querying, and manipulating XML data • Support for XQuery • Concurrently access (object-)relational data and XML data • 2008 • CREATE SEQUENCE <sequence name> AS <type name> [START WITH <value>] [INCREMENT BY <value>] [NO MINVALUE | MINVALUE <value>] [NO MAXVALUE | MAXVALUE <value>] [NO CYCLE | CYCLE] – SQL:2008 – Successor of SQL:2006 – “Maintenance release” (some new but minor features) – Base type XML for mappings between SQL and XML Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 63 8.3 SQL ≠ SQL Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 64 8.3 SQL ≠ SQL • A common myth among software developers: • Well, here are the bad news: – There are still too many variants of SQL (both syntactic and semantic differences) If your application uses only standard SQL, then it is portable. • True application portability remains a challenge – The standard has been used to introduce two kinds of features: 1. Features that are well-understood and widely implemented 2. New and largely untried technologies, hoping that vendors follow the lead and deliver new functionalities • If you don’t believe me, here are some examples ... – Vendors don’t care too much about the standard Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 65 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 66 8.3 SQL ≠ SQL 8.3 SQL ≠ SQL • CREATE TABLE test (x VARCHAR(100)) • CREATE TABLE name (first VARCHAR(100), middle VARCHAR(100), last VARCHAR(100)) • INSERT INTO test VALUES (’a’) INSERT INTO test VALUES (’ä’) INSERT INTO test VALUES (’_’) INSERT INTO test VALUES (NULL) • INSERT INTO name VALUES (’George’, ’Walker’, ’Bush’) INSERT INTO name VALUES (’Horst’, ’’, ’Köhler’) INSERT INTO name VALUES (’Angela’, NULL, ’Merkel’) • SELECT * FROM test ORDER BY x • What result do you expect? • What’s the SQL standard’s opinion? – ’’ (empty string) means that we know that there is no middle name – NULL means that we don’t know whether there is a middle name – a, ä, and _ are ordered as defined in the unicode standard (default collation, possibly language-specific) – The SQL standard leaves it open where to place NULL values ... • Sounds like a good design? What do you think? • Optional NULLS FIRST / NULLS LAST keyword in ORDER BY clause • What’s happening in practice? – According to the SQL standard, this approach is fine ... – ... unless your RDBMS is Oracle (’’ is the same as NULL) Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig – Depending on the RDBMS and its configuration, this query may return any order ... 67 8.3 SQL ≠ SQL • Whitespace is whitespace, isn’t it? – The SQL standard defines the following notions: – Since SQL:1999, all unicode whitespace characters (ordinary space, tabulator, non-breaking space, em space, ...) are equivalent – However, using non-breaking spaces in DB2 produces strange errors: Environment Cluster Catalog Schema – The reality: • • • • • SELECT * FROM test Database server (unsupported) Database Schema – But attention: • In MySQL, there are no catalogs, “schema” and “database” are synonyms • In Oracle, there is exactly one schema per user; CREATE/ALTER SCHEMA x <command> executes <command> on all objects located in schema x Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 69 8.3 SQL ≠ SQL SQL0104N An unexpected token "SELECT " was found following "BEGIN-OF-STATEMENT". Expected tokens may include: "SELECT". SQLSTATE=42601 • Hint: You cannot see it, but after SELECT, there is a non-breaking space. • This typically becomes a problem when copy-and-pasting code ... – It’s probably the same for other RDBMS • I haven’t tried yet ... Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 70 8.3 SQL ≠ SQL • The BOOLEAN data type • The statement terminator “;” – CREATE TABLE customers (id INTEGER PRIMARY KEY, name VARCHAR(100), is_vip BOOLEAN, is_blacklisted VARCHAR(100)) – SELECT id, name FROM customers WHERE is_vip AND NOT is_blacklisted – Practice? – According to the SQL standard, (almost) every SQL statement has to be terminated by a semicolon – What’s happening in practice? • Many RDBMS treat the terminator as being optional (which is fine, but may cause some problems) • Some RDBMS either strictly require a terminator or complain if it is present • In some RDBMS, this behavior can be configured ... • Not supported by Oracle, DB2, and MS SQL Server – Summary: No matter what you do, it causes problems! Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 68 8.3 SQL ≠ SQL • What about terminology? • • • • Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig – Official workarounds: Use CHAR or INTEGER ... • Supported by MySQL (non-conforming) and PostgreSQL 71 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 72 8.3 SQL ≠ SQL 8.3 SQL ≠ SQL • Helpful tools: • Summary – SQL::Translator (Perl module) http://search.cpan.org/dist/SQL-Translator – SQL is not SQL – In some cases, even identical SQL code works differently on different RDBMS – SwisSQL Console (GUI tool, discontinued) http://www.swissql.com/products/sql-translator/sql-converter.html – SwisSQL API (.NET/Java SQL wrapper, discontinued) • Current trends? – Open-source RDBMS (PostgreSQL, MySQL, Derby, ...) typically try to adhere to the standard http://www.swissql.com/products/sqlone-apijava/sqlone-apijava.html http://www.swissql.com/products/sqlone-apidotnet/sqlone-apidotnet.html • However, many advanced features are not supported yet – Very recently, DB2 added support for Oracle’s SQL (in DB2 9.7) Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 73 8.4 Advanced Concepts 74 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.4 Type Casting • Type casting • Ranking functions • CASE expressions • SQL is a strongly typed language – Basically, this means that e.g. INTEGER is different from VARCHAR(100) • If data types are incompatible, type casting may be used to make them compatible – CAST expression – During casting, precision may be lost (e.g. FLOAT → INTEGER) – Example: • CAST (powerStrength AS NUMERIC(3, 2)) • CAST (alias || realName AS CHAR(255)) CAST expression CAST ( expression AS data type ) NULL Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 75 8.4 Type Casting Target SMALLINT, INTEGER, DECIMAL, FLOAT SMALLINT, INTEGER, DECIMAL, FLOAT CHAR, VARCHAR, LONG VARCHAR, CLOB CHAR, VARCHAR, LONG VARCHAR, CLOB, BLOB CHAR, VARCHAR SMALLINT, INTEGER, DECIMAL, DATE, TIME, TIMESTAMP, VARGRAPHICS SMALLINT, INTEGER, DECIMAL CHAR DATE, TIME, TIMESTAMP CHAR, VARCHAR DATE DATE TIME TIME TIMESTAMP TIMESTAMP GRAPHICS, VARGRAPHICS, LONG VARGRAPHICS, DBCLOB GRAPHICS, VARGRAPHICS, LONG VARGRAPHICS, DBCLOB Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 76 8.4 Ranking Functions • In DB2, possible castings are: Source Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig • Since SQL:2003, there are special functions for working with result lists • Examples: – Output only every other row of the list – Create a ranking with explicit ranks (1, 2, 3, ...) – On what rank position is some given row? 77 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 78 8.4 Ranking Functions 8.4 Ranking Functions – Example: At which position is Wolf-Tilo? • ROW_NUMBER() returns the position of each row in the result list – SELECT name, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS pos WHERE name = ’Wolf-Tilo’ – Example: SELECT name, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS pos person name salary name Christoph 45000 Larry Wolf-Tilo 75000 Larry 200000000 Joachim 45000 salary pos name salary 200000000 1 person Christoph 45000 Wolf-Tilo 75000 2 Wolf-Tilo 75000 Joachim 45000 3 Larry Christoph 45000 4 Joachim 200000000 name salary pos Wolf-Tilo 75000 2 45000 Depending on the implementation, the last two rows may switch positions 79 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 80 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.4 Ranking Functions 8.4 Ranking Functions – Example: Show only rows at even positions: • RANK() returns the rank of each row in the result list – SELECT name, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS pos WHERE (pos % 2) = 0 • % means modulo person – Example: SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS rank name salary name salary name Christoph 45000 name salary pos Christoph 45000 Larry Wolf-Tilo 75000 Wolf-Tilo 75000 2 Wolf-Tilo 75000 200000000 Christoph 45000 4 Larry Larry Joachim person 45000 200000000 Joachim 81 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 45000 salary rank 200000000 1 Wolf-Tilo 75000 2 Joachim 45000 3 Christoph 45000 3 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.4 Ranking Functions 8.4 CASE Expressions • DENSE_RANK() works like RANK() but does not skip ranks on ties (as it is usually done) • Very often codes are used for storing more complex information – Example: – Retrieving the account information for owner “Clark” with appropriate account descriptions needs a join – Indicate all customers with a negative balance with the string “not creditworthy” in the query result SELECT name, salary, RANK() OVER (ORDER BY salary ASC) AS rank, DENSE_RANK() OVER (ORDER BY salary ASC) AS drank, person name salary name salary rank drank Christoph 45000 Christoph 45000 1 1 Wolf-Tilo 75000 Joachim 45000 1 1 Wolf-Tilo 75000 3 2 200000000 4 3 Larry 200000000 Joachim 45000 Larry Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 82 83 account owner balance type acc_type type description Clark 367,00 0 0 checking account Louis -675,00 0 1 savings account Clark 54987,00 1 2 credit card account Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 84 8.4 CASE Expressions 8.4 CASE Expressions – The simple WHEN clause: • The CASE expression allows a value to be selected based on the evaluation of one or more conditions (similar to “if-then-else”) • Compares an expression to each case expression one by one – If expression is equal to search value, the corresponding result expression is returned – Comes in two flavors: • If no match is found, then some default (ELSE clause) is returned case expression CASE simpleWHEN-clause searchedWHEN-clause – If ELSE is omitted, then NULL is returned ELSE NULL ELSE END result expression Simple when clause expression WHEN case expression THEN result expression NULL 85 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.4 CASE Expressions Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 86 8.4 CASE Expressions – The searched WHEN clause: • Example: simple WHEN clause • Checks search conditions from left to right • Stops as soon as a search condition evaluates to true – Directly decode the account type – Returns the corresponding result then • SELECT owner, CASE type WHEN 0 THEN ’checking account’ WHEN 1 THEN ’savings account’ WHEN 2 THEN ’credit card account’ END account owner balance type FROM account Clark 367,00 0 Louis -675,00 0 Clark 54987,00 1 • If no condition is true, the value given by the ELSE clause is returned (or NULL, if there is no ELSE clause) WHEN searched when clause 87 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 8.4 CASE Expressions search condition THEN result expression NULL Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 88 8.5 More on SQL • Example: searched WHEN clause – Retrieve credit rating of customers based on their checking accounts • SELECT owner, CASE WHEN balance < 0 THEN ’not credit-worthy’ WHEN balance = 0 THEN ’questionable’ ELSE ’credit-worthy’ END account owner balance type FROM account Clark 367,00 0 WHERE type = 0 Louis -675,00 0 Clark 54987,00 1 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 89 • There are many more SQL statements than we are covering in our lectures • Moreover, there are many different SQL dialects • If you don’t want to get mad, do the following: – Don’t care too much about the SQL standard (unless you are actually implementing an RDBMS) – Read the SQL manuals of your RDBMS! Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 90 8.5 More on SQL • Example: DB2’s SQL reference (version 9.7) – http://publib.boulder.ibm.com/infocenter/db2luw/v9r7 Relational Database Systems 1 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 91