6. SQL/XML 6.1Introduction 6.2 Publishing relational data in XML 6.3 XML data type XML Databases 6.4 Queries 6. SQL/XML 6.5 Validation Silke Eckstein Andreas Kupfer Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 6.6 SQL/XML standard 6.7 Overview 6.8 References XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 6.1 Introduction 6.1 Introduction • Creating XML documents from a database • SQL/XML – Introduced in the last chapter – On a more or less conceptual level – Storage of XML in all big commercial DBMS available – Proprietary solution for embedding in SQL – SQL/XML = Part 14 of the SQL-Standard: XML functionality – Incorporates the corresponding standards for XML (XML Schema, XQuery) – Basic idea: • Not handled so far – Creating XML documents inside a database – Retrieving data from XML documents – Changing XML document content • Solution: Integration in database • Mapping of SQL concepts to XML (see last chapter) • Own datatype to store XML – SQL/XML 3 XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig [Kud07] 6.1 Introduction Storing XML documents inside the database as values of type XML – SQL charset to unicode (depends on implementation) – SQL identifiers to XML names – SQL data types to XML schema data types – SQL values to XML values – SQL tables to XML and XML schema documents – SQL schemas to XML and XML schema documents – SQL catalogues to XML and XML schema documents Generating XML documents using SQL/XML functions – Datatype XML with Mapping between SQl and XML belonging functions SQL XQuery – Mapping between SQL and XML SQL database XML datatype – Embedding XQuery in SQL [Tür08] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 4 • Mapping SQL database to XML <City> <City> <Name> Braunschweig </Name> <Zip>38100</Zip> <Zip>38106</Zip> <State> Niedersachsen </State> </City> XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 6.1 Introduction <City> • SQL/XML 2 5 [Tür08] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 6 Mapping SQL tables CREATE TABLE Account ( Name CHAR(20), Balance NUMERIC(12,2), ); Name Joe Jim Mapping SQL table columns to XML elements <xsd:complexType name="ROW.ACCOUNT"> <xsd:sequence> <xsd:element name="NAME" type="CHAR_20"/> <xsd:element name="BALANCE" type="NUMERIC_12_2"/> </xsd:sequence> </xsd:complexType> 6.1 Introduction • Relational table: Cities Balance Mapping table <xsd:complexType name="TABLE.ACCOUNT"> 2000 rows to XML <xsd:annotation><xsd:appinfo> <row> <xqlxml:sqlname type="BASE TABLE" 3500 elements localName="ACCOUNT"/> </xsd:appinfo></xsd:annotation> <ACCOUNT> <xsd:sequence> <row> <xsd:element name="row" <NAME>Joe</NAME> type="ROW.ACCOUNT"/> <BALANCE>2000</BALANCE> </xsd:sequence> </row> </xsd:complexType> <row> <NAME>Jim</NAME> <BALANCE>3500</BALANCE> </row> </ACCOUNT> [Tür08] <xsd:element name="ACCOUNT" type="TABLE.ACCOUNT"/> XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig Niedersachsen Braunschweig 38106 Niedersachsen Hannover 30159 Niedersachsen ... <City> <Name>Braunschweig</Name> <Zip>38100</Zip> <Zip>38106</Zip> <State>Niedersachsen</State> </City> ... ... <State name="Niedersachsen"> <City name="Braunschweig"> <Zip>38100</Zip> <Zip>38106</Zip> </City> </State> ... XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 8 6.2 Publishing relational data • XMLELEMENT creates an XML element – Example: creating name and content 6.2 Publishing relational data in XML XMLELEMENT( NAME "City", 'Bad Oeynhausen' ) 6.3 XML data type Creates <City>Bad Oeynhausen</City> 6.4 Queries – Can contain attributes, comments and other elements and options 6.5 Validation XMLELEMENT( NAME "City", XMLCOMMENT ( "Example 2" ), XMLATTRIBUTES('Bayern' AS "State", '80469' AS "Zip" ),'München' ) 6.6 SQL/XML standard 6.7 Overview Creates <City State="Bayern" Zip="80469"><!– Example 2 --> München</City> 6.8 References 9 [Kud07] 6.2 Publishing relational data • XMLELEMENT referencing the database XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 10 6.2 Publishing relational data • XMLELEMENT nesting – Can be used directly from an SQL statement – Example SELECT XMLELEMENT( NAME "City", XMLCOMMENT ( "Example 3" ), XMLATTRIBUTES( "State", "Zip" AS "PLZ" ), "City" ) FROM Cities WHERE …; SELECT XMLELEMENT( NAME XMLELEMENT( XMLELEMENT( XMLELEMENT( FROM Cities WHERE …; • Creates "City", NAME "Name", "City" ), NAME "State", "State" ), NAME "Zip", "Zip" ) ) • Creates <City STATE="Niedersachsen" PLZ="38100"> <!– Example 3 --> Braunschweig </City> XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig State 38100 7 6.1Introduction [Kud07] Zip Braunschweig • Many possible XML documents 6. SQL/XML XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig City <City> <Name>Braunschweig</Name> <State>Niedersachsen</State> <Zip>38100</Zip> </City> 11 [Kud07] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 12 6.2 Publishing relational data • XMLELEMENT syntax diagram 6.2 Publishing relational data • XMLFOREST – Constructs a forest of elements without attributes SELECT XMLFOREST ( "City", "State" ) FROM Cities; • Creates <City>Braunschweig</City><State>Niedersachsen</State> <City>Braunschweig</City><State>Niedersachsen</State> <City>Hannover</City><State>Niedersachsen</State> [IBM] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 13 [Kud07] 6.2 Publishing relational data • XMLFOREST syntax diagram XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 14 6.2 Publishing relational data • XMLCONCAT – Concatenates multiple XML fragments into a single XML pattern – Compare outputs SELECT XMLELEMENT("city", City) AS "CITY", XMLELEMENT("zip", Zip) AS "ZIP", XMLELEMENT("state", State) AS "STATE" FROM Cities; SELECT XMLCONCAT( XMLELEMENT("city", CITY), XMLELEMENT("zip", ZIP), XMLELEMENT("state", STATE) ) FROM Cities; [IBM] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 15 [Pow07] 6.2 Publishing relational data 16 6.2 Publishing relational data • XMLAGG • XMLAGG – Allows sorting – Aggregates seperate lines of output into a single string [Tür08] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig SELECT CITY, XMLAGG( XMLELEMENT(NAME "Zip", Zip)) AS "Zipcodes" FROM Cities GROUP BY City; SELECT XMLAGG( XMLELEMENT("address", Zip||' '||City) ORDER BY Zip DESC) FROM Cities; • Creates <address>38106 Braunschweig</address> <address>38100 Braunschweig</address> <address>30159 Hannover</address> City Braunschweig Zipcodes <Zip>38100</Zip> <Zip>38106</Zip> Hannover <Zip>30159</Zip> XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig • Creates – Disadvantage: Can only aggregate a single element, and thus fields are concatenated 17 [Pow07] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 18 6. SQL/XML 6.3 XML data type • Storing XML in relational databases is possible as 6.1Introduction – Character data (VARCHAR, Character Large OBject) – New data type XML 6.2 Publishing relational data in XML 6.3 XML data type • A value of the data type XML can contain 6.4 Queries – whole XML document – XML element – a set of XML elements 6.5 Validation 6.6 SQL/XML standard • All XML publishing operators from chapter 6.2 create values of the data type XML, not a string 6.7 Overview 6.8 References 19 XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 6.3 XML data type 6.3 XML data type • Specification of XML type XML(SEQUENCE) XML [({DOCUMENT|CONTENT|SEQUENCE} [({ANY|UNTYPED|XMLSCHEMA schema name})])] NULL or document node Untyped elements & attributes, elements not NULL XML(CONTENT(ANY)) Validated against schema • Modifiers are optional • Primary type modifier 1 element child XML(CONTENT(UNTYPED)) 1 element child – DOCUMENT (XML document) – CONTENT (XML element) – SEQUENCE (sequence of XML elements) XML(CONTENT(XMLSCHEMA)) XML(DOCUMENT(ANY)) 1 element child • Secondary type modifier – UNTYPED – XMLSCHEMA (typed) – ANY (may be typed) Validated against schema XML(DOCUMENT(UNTYPED)) [Tür08] XML(DOCUMENT(XMLSCHEMA)) XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 21 [Tür08] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 6.3 XML data type • Example: Definition of an XML type column CREATE TABLE Groups ( ID INTEGER, Name XML ); – CREATE TABLE XMLDOCUMENT OF XMLTYPE; • Create a table containing an XMLType data type column [Pow07] NUMBER NOT NULL, XMLTYPE, XPK PRIMARY KEY (ID) XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 22 6.3 XML data type • Create a table that is an XML data type in itself CREATE TABLE XML ( ID XML CONSTRAINT ); 20 23 ID Name 123 <Groups>Annabelle</Groups> 234 <Groups>Magdalena, Marius</Groups> 345 <?xml version 1.0?> <Groups> <Person>Patrick</Person> <Person>Robert</Person> </Groups> 654 <Groups>Rebecca</Groups> <Groups>Torben</Groups> XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 24 6.3 XML data type 6.3 XML data type • Characteristics • Parsing & Serialization – Allowed values: – XMLParse: • XML documents (including prolog) • XML content according to XML 1.0 (includes pure text comments, PI?) • NULL • Parses a string value using an XML parser • Produces value whose specific type is XML(DOCUMENT(ANY)), or …CONTENT…, or – No comparison possible (compare CLOB in SQL) • User can define an order, if comparison is necessary – XMLSerialize – No corresponding type in programming languages for embedding in SQL available – Standard defines operators to convert to other SQL data types [Kud07] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig <City> <Name> Braunschweig </Name> <Zip>38100</Zip> <Zip>38106</Zip> <State> Niedersachsen </State> </City> • Transforms an XML value into a string value (CHAR, VARCHAR, CLOB, or BLOB) <City> <Name> Braunschweig </Name> <Zip>38100</Zip> <Zip>38106</Zip> <State> Niedersachsen </State> </City> 25 [Mel05] 6. SQL/XML XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 26 6.4 Queries • Motivation 6.1Introduction – How can SQL applications locate and retrieve information in XML documents stored in an SQL database cell? – Invoking XML query language within SQL statements 6.2 Publishing relational data in XML 6.3 XML data type 6.4 Queries 6.5 Validation • Retrieve information — in SELECT list • Locate information — in WHERE clause 6.6 SQL/XML standard – Details on XML query language XQuery later 6.7 Overview 6.8 References XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 27 [Mel05] 6.4 Queries • XMLQuery syntax XMLQUERY(<XQuery expression> [PASSING <argument list>] {NULL | EMPTY} ON EMPTY) • argument list := <SQL value> AS <XQuery variable> – A new SQL expression, invoked as a pseudofunction, whose data type can be an XML type —such as XML(CONTENT(ANY))—or an ordinary SQL type – Example SELECT XMLQUERY( • • XMLExists '<State name="{$Name}"><City>{$City}</City></State>' PASSING State as $Name, City AS $City NULL ON EMPTY) AS CityList FROM Cities; – A new SQL predicate, invoked as a pseudo-function, returning true when the contained XQuery expression returns anything other than the empty sequence (false) or SQL null value (unknown) XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 28 6.4 Queries • XMLQuery [Mel05] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 29 CityList <State name="Niedersachsen"><City>Braunschweig</City></State> <State name="Niedersachsen"><City>Hannover</City></State> [Tür08] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 30 6.4 Queries 6.4 Queries CREATE TABLE Papers (ID INTEGER, Paper XML); • XMLTABLE – Provides an SQL view of XML data ID Paper 123 <Paper>…<author>Alice</author><title>Perpetual Motion</title><year>1999</year></Paper> 345 <Paper><year>2005</year><author>Bob</author><author>Charlie </author><title>Beer</title>…</Paper> • Output is not of the XML type – Evaluates an XQuery “row pattern” with optional arguments (as with XMLQuery) – Element/attribute values mapped to columns using XQuery “column patterns” – Names & types of columns required; default values optional – Syntax: SELECT ID, XMLQUERY( 'FOR $a IN $p//author RETURN <Authors>{$a/text()}</Authors>' PASSING Paper AS "p") AS AuthorNames FROM Papers; ID AuthorNames 123 <Authors>Alice</Authors> 345 <Authors>Bob</Authors> <Authors>Charlie</Authors> [Tür08] XMLTABLE (<XQuery expression> PASSING <argument list> COLUMNS <column list>) column := <name> <type> PATH <path expression> XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 31 6.4 Queries 32 6.1Introduction SELECT ID, t.* FROM Papers p, XMLTABLE( 'for $root in $papers where $root//author/text() = "Bob" return $root/Paper' PASSING p.Paper as "papers" COLUMNS About VARCHAR(30) PATH '/Paper/title', Created INTEGER PATH '/Paper/year' ) AS t; ID About Created 345 Beer 2005 XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 6.2 Publishing relational data in XML 6.3 XML data type 6.4 Queries 6.5 Validation 6.6 SQL/XML standard 6.7 Overview 6.8 References 33 6.5 Validation 34 • Schema registration – Is like integrity constraints in DBs – Requires an XML Schema – XML Schemas may be registered with the SQL-server • Implementation-defined mechanism • Known by SQL name & by target namespace URI Register XMLSCHEMA 'http://www.Alfred-Moos.de/GrussSchema.xsd' FROM 'file://c:/XML_Schemata/GrussSchema.xsd' AS GrussSchema COMPLETE ; CREATE TABLE Dokument_XML (Dokument_XML_Nr CHAR (4) NOT NULL PRIMARY KEY, Dokument XML, CONSTRAINT validieren CHECK (Dokument IS VALIDATED ACCORDING TO XMLSCHEMA ID GrussSchema ) ) ; – Schema does need a unique name • Used by XMLValidate(), IS VALID, and to restrict values of XML(DOCUMENT-or-CONTENT(XMLSCHEMA)) XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 6.5 Validation • Validation of XML [Mel05] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 6. SQL/XML • XMLTable: Example [Tür08] [Mel05] 35 XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 36 6.5 Validation 6.5 Validation • New functions and predicates: • Schema definition – XMLValidate – Syntax • Validates an XML value against an XML Schema (or target namespace), returning new XML value with type annotations XML(CONTENT(XMLSCHEMA) <schema> [<elements>])) – IS VALID <schema> := URI <namespace> [LOCATION <loc>] | NO NAMESPACE [LOCATION <loc>] | ID <registered schema name> – <element> := [NAMESPACE <namespace>] ELEMENT <element name> • Tests an XML value to determine whether or not it is valid according to an XML Schema (or target namespace); return true/false without altering the XML value itself – IS DOCUMENT • determines whether an XML value satisfies the (SQL/XML) criteria for an XML document – IS CONTENT • determines whether an XML value satisfies the (SQL/XML) criteria for XML content [Tür08] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 37 [Mel05] 6.5 Validation • Predefined schemas (build-in namespaces) – xs:http://www.w3.org/2001/XMLSchema – xsi:http://www.w3.org/2001/XMLSchema-instance – sqlxml:http://standards.iso.org/iso/9075/2003/sqlxml – Security issues • Schemas cannot “disappear” without SQLserver knowing about it • Schemas cannot be “hijacked” (altered in inappropriate ways) without SQL-server knowing about it • Documents cannot be marked “valid” against schemas unless SQL-server knows about them XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 38 6.5 Validation • Benefits of schema registration [Mel05] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig – More depending on the DB implementation • Completely supported per XML+Namespaces: – XMLElement, XMLForest, XMLTable – Default namespace, explicit namespace (prefix) – Declare namespace within scopes of WITH clause, column definitions, constraint definitions, insert/delete/update statements, compound statements 39 [Mel05] 6. SQL/XML XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 40 6.6 SQL/XML standard • SQL/XML standard published as 6.1Introduction 6.2 Publishing relational data in XML – ISO/IEC 9074-14:2003 6.3 XML data type – ISO/IEC 9075-14:2006 • Mappings and Publishing Functions 6.4 Queries • Adds XQuery, including Data Model,Validation – ISO/IEC 9075-14:2008 6.5 Validation • Updates • Something else? 6.6 SQL/XML standard 6.7 Overview 6.8 References XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 41 [Mel05] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 42 6.6 SQL/XML standard 6.6 SQL/XML standard • Overview of some operators for the XML type • SQL/XML:2003 plus – Additional publishing functions – XQuery data model – More precise XML type (modifiers) – XMLQuery, XMLTable – XMLValidate, IS VALID – XMLExists, IS DOCUMENT, IS CONTENT – Casting between XML type and SQL types [Mel05] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig – XMLELEMENT – creates an XML element node – XMLFOREST – creates a sequence of XML element nodes from a table – XMLCOMMENT – creates an XML comment node – XMLTEXT – creates a text node – XMLPI – creates a processing instruction – XMLAGG – aggregates XML values of a group – XMLCONCAT – concatenates XML type values – XMLTRANSFORM – applies an XSL to a document 43 [Tür08] 6.6 SQL/XML standard • Review of SQL/XML – Two components • A data type XML to store XML data • Functions to map relational structures to XML – XMLPARSE – a well-formed SQL text to XML value – XMLSERIALIZE – converts an XML value to a SQL text – XMLDOCUMENT – creates an XML document node from an XML value – XMLVALIDATE – validates an XML value with a schema – XMLQUERY – evaluates an XQuery expression – XMLTABLE – transforms an XQuery result to a SQL table – XMLITERATE – transforms an XQery sequence to a SQL table XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 45 6.7 Overview 1. 2. 3. 4. 5. Introduction XML Basics Schema definition XML query languages I Mapping relational data to XML 6. SQL/XML 7. XML processing 44 6.6 SQL/XML standard • ... Overview of some operators for the XML type [Tür08] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig – Only construction operators • No extraction of values or search • But construction operators are based on XQuery – Mapping of tables, schemas, catalogues ignores some information from the relational schema • UNIQUE • REFERENCES • CHECK – Further extensions are expected [Kud07] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 46 6.8 References • "XML und Datenbanken" [Tür08] 8. XML query languages II 9. XML storage I 10. XML storage - index 11. XML storage - native 12. Updates / Transactions 13. Systems 14. XML Benchmarks – Can Türker – Lecture, University of Zurich, 2008 • Beginning XML Databases. [Pow07] – Gavin Powell – Wiley & Sons, 2007, ISBN 0471791202 • "XML-Datenbanken", [Kud07] – Thomas Kudraß – Lecture, HTWK Leipzig, WS2007/2008 • "SQL/XML", [Mel05] – Jim Melton, – Oracle Corp. 2005 XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 47 XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 48 6.8 References Questions, Ideas, Comments • XQuery und SQL/XML in DB2-Datenbanken: Verwaltung und Erzeugung von XMLDokumenten in DB2 [Moo08] • Now, or ... – Alfred Moos – Vieweg+Teubner, 2008 • ISO/IEC 9075-14:2003 Information Technology Database Languages - SQL - Part 14: XML-Related Specifications (SQL/XML) • DB2 SQL-Reference, IBM, March 2008 [IBM] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 49 • Room: IZ 232 • Office our: Tuesday, 12:30 – 13:30 Uhr or on appointment • Email: [email protected] XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 50