XML Databases - IfIS - Technische Universität Braunschweig

Werbung
XML Databases
13. Systems
Silke Eckstein
Andreas Kupfer
Institut für Informationssysteme
Technische Universität Braunschweig
http://www.ifis.cs.tu-bs.de
13. Systems
13.1 Introduction
13.2 DB2
13.3 Oracle
13.4 SQL Server
13.5 Tamino
13.6 Summary
13.7 Overview and References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
2
13.1 Introduction
• So far, the lecture and exercises focused on
methods and DB2
– Does it already represent all XML databases?
– How applicable are DB2
methods with other DBMS?
• ... we are now going to have a closer look at some
of the database systems
– Differences in storage structure
– Some syntax examples
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
3
13. Systems
13.1 Introduction
13.2 DB2
13.3 Oracle
13.4 SQL Server
13.5 Tamino
13.6 Summary
13.7 Overview and References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
4
13.2 DB2 storage
• IBM DB2 XML storage since V9
XML documents
Application
file
system
Database
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
5
13.2 DB2 storage
• Mapping XML data to relational databases
– Variants:
• XML column approach: based on XML data type
• XML collection approach: based on decomposition of XML documents
into database tables and attributes
– Table with XML column:
• Diverse XML datatypes:
PureXML
– XML:
modelbased / hierarchical storage
– XMLCLOB:
XML documents stored as CLOBs
– XMLVARCHAR:
XML documents stored as VARCHAR XML
– XMLFILE:
XML documents stored in file system Extender
• XML schema validation for datatype XML only
• In addition: materialized views
– Extract selected XML content from documents
– Materialise this content into so-called side tables
– Side tables are defined in Document Access Definition (DAD)
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
6
13.2 DB2 storage
• "pureXML and relational hybrid database"
[IBM06a]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
7
13.2 DB2
• Ways to put XML
data into the
database
(PureXML)
[IBM06b]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
8
13.2 DB2 storage
• Ways to get
XML data out of
the database
(PureXML)
[IBM06b]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
9
13.2 DB2 syntax
• PureXML – Queries and Indexes
– Application of SQL in XQuery:
XQUERY db2-fn:xmlcolumn (‘t1.xml1’)
– Delivers the value of column xml1 of table t1 as a node
sequence (column must be of type XML)
XQUERY db2-fn:sqlquery (’SELECT xml1 FROM t1’)
– Delivers the single-column XML value of table t1 as a node
sequence (column must be of type XML)
– Definition of a path index:
CREATE INDEX Idx_Author_Path ON Book (Content)
GENERATE KEY
USING XMLPATTERN '//Author' AS SQL VARCHAR(50)
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
10
13.2 DB2 syntax
• XML Extender – Mapping between XML and SQL
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
11
13.2 DB2 syntax
• XML Extender – Tables with XML Types
– XML extension setup with XML Extender Admin Wizard
or command line:
> dxxadm enable_db XMLDB
– Definition of tables accepting XML documents:
• Variant 1: Create with XML Extender Admin Wizard
• Variant 2: SQL
CREATE TABLE Buch
(Inhalt DB2XML.XMLVARCHAR)
– Insertion of an XML document:
INSERT INTO Buch (Inhalt)
VALUES (DB2XML.XMLVARCHARFromFile('C:\XMLDIR\buch01.xml'))
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
12
13.2 DB2 syntax
• XML Extender – Queries
– SQL-XML Extender offers functions for queries and
updates
• Extract functions:
DB2XML.EXTRACT<datatype>(<XML value expression>, <XPath expression>)
• Example:
SELECT a.RETURNEDVARCHAR
FROM Buchlob, TABLE(DB2XML.EXTRACTVARCHARS(Inhalt, '//Autor')) a
– Limited support of SQL/XML standard
• XMLAGG
• XMLELEMENT
• XMLATTRIBUTE
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
13
13.2 DB2 syntax
• ExtractXXX(<XML value expression>, <XPath expression>)
"IBM DB2 Universal Database XML Extender
Administration and Programming, Version 8, 2002"
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
14
13.2 DB2 syntax
• XML Extender – Updates
– Updates possible with special XML Extender methods
– Syntax:
DB2XML.UPDATE(<XML value expression>, <XPath expression >, <new value>)
– Restriction: predicates with elements are not supported
• Example: not supported predicate
UPDATE Buchlob
SET Inhalt = DB2XML.UPDATE(Inhalt, '//Verlag[text()="dpunkt"]/@Ort'
'Zürich')
• Example: supported predicate
UPDATE Buchlob
SET Inhalt = DB2XML.UPDATE(Inhalt, '//Buch[@ISBN="3-89864-1481"]/Verlag/@Ort', 'Köln')
– With XML column approach updates are transferred to side tables
automatically
– In PureXML an XML value can only be fully replaced
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
15
13.2 DB2 syntax
• XML Extender – Indexing
– Index support
• Value index (B-Tree, Bitmap, etc.) on side tables (XML
Extender)
• Full text index (with Text Extender) on XML types
– Extension of full text index for IR on XML
• Path information included in index
• Support for path expressions
• Example:
Retrival model
SELECT Inhalt
FROM Buchlob
WHERE contains(dscrHandel, ‘MODEL order
SECTION(//Buch/Beschreibung) "Datenbank"‘) = 1
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
16
13.2 DB2
• Summary IBM DB2 XML Support
XML storage model
Extensible, object relational
Schema definition
Validation possible
Storage type
Model-based (PureXML), text-based or
userdefined schema-based (XML Extender)
Mapping DB
[Tür08]
XML
DAD (XML Extender)
XML data type
Available (PureXML)
Value/function index
Standard DBS indexes on side tables
Full text index
With TextExtender
Path index
Available
Queries
SQL/XML with XQuery support
Full text search
With TextExtender
Manipulation
SQL functions with XPath
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
17
13. Systems
13.1 Introduction
13.2 DB2
13.3 Oracle
13.4 SQL Server
13.5 Tamino
13.6 Summary
13.7 Overview and References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
18
13.3 Oracle 11g storage
Architecture
Figure taken from Oracle®
XML Developer's Kit
Programmer's Guide 11g
Release 1 (11.1), April 2008
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
19
13.3 Oracle 11g storage
Architecture (2)
Figure taken from Oracle®
XML DB Developer’s
Guide 11g Release 1 (11.1)
October 2007
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
20
13.3 Oracle 11g storage
• Mapping variants from XML to databases
– XML column approach: Column is based on XML type
vs.
– XML table approach: Table is based on XML type
• Using objectrelational extensions of Oracle
– XMLTYPE as predefined object type with SQL/XML
functions as methods
– Intermedia-Text-Package with full text functions
– DBMS_XMLDOM package with DOM methods
– DBMS_XMLSCHEMA package with administration and
generation methods
– DBMS_XMLGEN package with methods to generate XML
from SQL
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
21
13.3 Oracle 11g storage
• Storage options
–
–
–
–
text-based (unstructured as CLOB)
binary (compact storage in XML binary format)
schema-based (object-relational storage requires XML Schema)
hybrid (semistructured)
Figure taken from
Oracle® XML DB
Developer’s Guide
11g Release 1 (11.1)
October 2007
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
22
13.3 Oracle 11g storage
Figure taken from
Oracle® XML DB
Developer’s Guide
11g Release 1 (11.1)
October 2007
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
23
13.3 Oracle 11g syntax
• XML-column vs. XML-table approach
– Table with XML column
CREATE TABLE <table name> ( <column name> XMLTYPE)
[XMLTYPE [COLUMN] <column name>
[STORE AS {OBJECT RELATIONAL | CLOB ( <LOB parameter>) | BINARY XML ( <LOB parameter>) })
[XMLSCHEMA <url> ELEMENT [ <url> #] <element> ]]
schema-based
text-based
binary
– XML table
CREATE TABLE <table name> OF XMLTYPE
[XMLTYPE
[STORE AS {OBJECT RELATIONAL | CLOB ( <LOB parameter>) | BINARY XML ( <LOB parameter>) })
[XMLSCHEMA <url> ELEMENT [ <url> #] <element> ]]
– Inserting documents in both cases
INSERT INTO table VALUES (XMLTYPE (getDocument('input1.xml')));
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
24
13.3 Oracle 11g syntax
• User-defined function getDocument(file) to
read XML documents
CREATE DIRECTORY xmldir AS 'c:\xmldir';
GRANT READ ON DIRECTORY xmldir TO PUBLIC WITH GRANT OPTION;
CREATE FUNCTION getDocument(filename VARCHAR2)
RETURN CLOB
AUTHID CURRENT_USER IS
xbfile BFILE;
xclob CLOB;
BEGIN
xbfile := BFILENAME('xmldir', filename);
DBMS_LOB.open(xbfile);
DBMS_LOB.createTemporary(xclob TRUE, DBMS_LOB.session);
DBMS_LOB.loadFromFile(xclob, xbfile,
DBMS_LOB.getLength(xbfile));
DBMS_LOB.close(xbfile);
RETURN xclob;
END;
/
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
25
13.3 Oracle 11g syntax
• Package DBMS_XMLSCHEMA offers methods
to register, compile, generate and delete XML
Schemas
DBMS_XMLSCHEMA.registerSchema( 'schema-URL', 'schema-name' );
DBMS_XMLSCHEMA.registerSchema( 'text.xsd', getDocument('test.xsd') );
DBMS_XMLSCHEMA.compileSchema( 'schema-URL' );
DBMS_XMLSCHEMA.generateSchema( 'schema-URL', 'type-name' );
DBMS_XMLSCHEMA.deleteSchema( 'schema-URL', DeleteOption );
DeleteOption:
DELETE_RESTRICT
DELETE_INVALIDATE
DELETE_CASCADE
DELETE_CASCADE_FORCE
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
26
13.3 Oracle 11g syntax
• Some methods of the XMLTYPE
– XMLTYPE(<value-expr>) is the constructor. Expression can be a string or a
user defined type
– getClobVal()/getStringVal() returns XML value as CLOB or string
– getNumVal() only applicable to text nodes containing a numeric string
– isFragment() returns 1 if instance has more than one root element
– existsNode(<XPath-expr>) returns 1 if the expression returns a node
– extract(<XPath-expr>) extracts a part of the XML value
– transform(<XML-value-expr>) transforms according to a stylesheet
– toObject() converts to an object
– isSchemaBased() returns 1 if the XML value is based on a schema
– getSchemaURL() returns the URL to the schema
– getRootElement() returns the root element or NULL for fragments
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
27
13.3 Oracle 11g syntax
• Queries
– Support of SQL/XML functions
•
•
•
•
•
•
•
XMLQUERY
XMLTABLE
XMLAGG
XMLELEMENT
XMLATTRIBUTE
XMLFOREST
…
– And additional functions
• EXTRACT
• EXISTSNODE
• ...
– Full text search with the Intermedia-Text-Package
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
28
13.3 Oracle 11g syntax
• EXTRACT
– extracts an excerpt of the XML value described by an
XPath query
EXTRACT( <XML-value-expression>, <XPath-expression> [, <Namespace>] )
SELECT
EXTRACT( VALUE(b), '//@ISBN' ) AS ISBNumber,
EXTRACT( VALUE(b), '//Title/text()' ) AS Title_content,
EXTRACT( VALUE(b), '//Title' ) AS Title_element
FROM Book b;
ISBNumber
Title_content
Title_element
3-89864-148-1
XML & Datenbanken
<Title>XML & Datenbanken</Title>
3-89864-219-4
SQL-1999 & SQL:2003
<Title>SQL-1999 & SQL:2003</Title>
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
29
13.3 Oracle 11g syntax
• EXISTSNODE
– Returns 0 if the query returns the empty sequence
EXISTSNODE( <XML-value-expression>, <XPath-expression> [, <Namespace>] )
Example:
SELECT
EXTRACT( VALUE(b), '//@ISBN' ) AS ISBNumber,
EXTRACT( VALUE(b), '//Title/text()' ) AS Title_content,
EXTRACT( VALUE(b), '//Title' ) AS Title_element
FROM Book b
WHERE EXISTSNODE( VALUE(b), '//Book[@ISBN="3-89864-219-4"]' ) = 1;
ISBNumber
Title_content
Title_element
3-89864-219-4
SQL-1999 & SQL:2003
<Title>SQL-1999 & SQL:2003</Title>
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
30
13.3 Oracle 11g syntax
• Indexing
– Full text index
CREATE INDEX xmlfulltextidx ON Book b
(VALUE(b)) INDEXTYPE IS CTXSYS.CONTEXT;
– Path index
CREATE INDEX xmlpathidx ON Book b
(VALUE(b)) INDEXTYPE IS CTXSYS.CTXXPATH;
– Functional index
(value index)
– XML index
CREATE INDEX xmlfunctionalidx ON Book b
(EXTRACTVALUE(VALUE(b),'//@year'));
CREATE INDEX xmlidx ON Book b
(VALUE(b)) INDEXTYPE IS XDB.XMLIndex;
• Creates a set of secondary indexes
– Path index with all XML tags and fragments
– Value index with the order of the document (node positions)
– Value index to index the values of the nodes
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
31
13.3 Oracle 11g syntax
• Using indexes
– Query using the path index:
SELECT EXTRACTVALUE (VALUE(b),'//Title') AS Title FROM Book b
WHERE EXISTSNODE (VALUE(b),'/Book/Publisher[text()="dpunkt"]') = 1;
– Query using the full text index:
SELECT SCORE (o), EXTRACT(VALUE(b),'//@ISBN') AS ISBN FROM Book b
WHERE CONTAINS (VALUE(b),'Java', o) > o ORDER BY SCORE (o) DESC;
– Query using the functional index:
SELECT EXTRACTVALUE (VALUE(b),'//Title') AS Title FROM Book b
WHERE EXTRACTVALUE (VALUE(b),'//Year') = 2009;
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
32
13.3 Oracle 11g syntax
• Manipulation – UPDATEXML
– Change a part (defined by an XPath query) of the XML value
UPDATEXML (<XML-value-expr>, <replacement-list> [, <namespace>])
<replacement-list> := <XPath-expr>, <value-expr>
– Example to change the value of an attribute:
UPDATE Book b
SET VALUE(b) = UPDATEXML (VALUE(b),'//Publisher[text()="dpunkt"]/@City', 'Zürich');
• Manipulation – DELETEXML
– Deletes a sequence of nodes (selected by an XPath query)
from the XML value
DELETEXML (<XML-value-expr>, <XPath-expr> [, <namespace>])
– Example to delete a specific Author node:
UPDATE Book b SET VALUE(b) =
DELETEXML (VALUE(b),'//Book[@ISBN="3-89864-148-1"]/Author[text()="Holger Meyer"]');
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
33
13.3 Oracle 11g syntax
• XML views
– Allow XML-based views on SQL and XML values
– Are based on the principle of object views
• The object type is XMLTYPE in this case
– Example:
CREATE VIEW DpunktBooks OF XMLTYPE
WITH OBJECT ID DEFAULT
AS SELECT VALUE (b) FROM Book b
WHERE EXISTSNODE (VALUE(b),'//Publisher[text()="dpunkt"]') ;
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
34
13.3 Oracle 11g syntax
• Export of database contents with XML syntax
– Standard mapping: SQL
XML with
DBMS_XMLGEN.getXML('query')
• Top level elements result from columns
• Simple types (with scalar values) as elements with PCDATA
• Structured types and their attributes as elements with subelements for
attributes
• Complex attributes as hierarchically nested elements
• Collection types are mapped to lists of elements
• Object references and referential integrity as ID/IDREF within the
document
• Table content is mapped to ROWSET elements:
<ROWSET>
<ROW num="1" > … </ROW>…<ROW num="n" > … </ROW>
</ROWSET>
– User defined transformation from SQL to XML is possible with XSLT
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
35
13.3 Oracle 11g
• Summary Oracle XML support
XML storage model
Extensible, object relational
Schema definition
Validation possible
Storage type
Text-based or schema-based
Mapping DB
[Tür08]
XML
By SQL/XML functions, schema
generators, XML views
XML data type
Available
Value/function index
Available
Full text index
Available
Path index
Available
Queries
SQL/XML with XQuery support
Full text search
With the Intermedia-Text-Package
Manipulation
SQL methods with XPath
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
36
13. Systems
13.1 Introduction
13.2 DB2
13.3 Oracle
13.4 SQL Server
13.5 Tamino
13.6 Summary
13.7 Overview and References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
37
13.4 SQL Server storage
• Microsoft SQL Server Architecture
Application
XML documents
Database
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
38
13.4 SQL Server storage
• Mapping XML data to relational databases
– 4 storage variants:
•
•
•
•
Native (binary) storage
Text-based storage as CLOB
Model-based storage according to EDGE approach
Schema-based storage via STORED-queries
– Datatype XML with methods based on XQuery
•
•
•
•
•
Query() – evaluates an XQuery and returns a value of type XML
Value() – evaluates an XQuery and returns a scalar SQL value
Exist() – returns true, if XQuery result is not empty
Modify() – updates a value of type XML
Nodes() – returns subtree of XML value
– Integrated Usage of SQL and XQuery
• Access to SQL data in XQuery via sql:column() and
sql:variable()
• Evaluation of XQuery expressions in SQL via XML methods from above
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
39
13.4 SQL Server syntax
• Native storage – table definition
– Schema registration
CREATE XML SCHEMA COLLECTION BuchXSD
AS '<?xml version="1.0"?>…'
– Table definition
CREATE TABLE Buch (
Id
INT PRIMARY KEY,
Inhalt XML BuchXSD)
)
– Insertion of an XML document from a file
INSERT INTO Buch
SELECT 1, xCol
FROM (SELECT *
FROM OPENROWSET
(BULK 'C:\XMLDIR\buch1.xml', SINGLE_BLOB) AS xCol)
AS R(xCol)
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
40
13.4 SQL Server syntax
• Native storage – SQL/XML queries & updates
– Find all author elements from books whose first author is
"Gunter Saake"
SELECT Inhalt.query('//Autor') AS Autoren
FROM Buch
WHERE Inhalt.exist('/Buch[Autor[1] = "Gunter Saake"]') = 1
Autoren
<Autor>Gunter Saake</Autor><Autor>Ingo Schmitt</Autor>
<Autor>Can Türker</Autor>
<Autor>Gunter Saake</Autor><Autor>Kai-Uwe Sattler</Autor>
– Update the value of the attributes "City" from all those
publisher elements to "Zürich", where the publisher is "dpunkt"
UPDATE Buch
SET Inhalt.modify
('replace value of (//Verlag[. = "dpunkt"]/@Ort)[1]
with "Zürich"')
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
41
13.4 SQL Server syntax
• Native storage – indexing
– Definition of a primary XML indexes
CREATE PRIMARY XML INDEX Idx_Inhalt ON Buch (Inhalt)
• Creates clustered index with entries of form (ID, ORDPATH,TAG, NODETYPE,
VALUE, PATH_ID, ...)
• necessary in order to create secondary indexes
– Secondary XML index types:
PATH | PROPERTY | VALUE
• Path index (path, value)
• Property index (primary key, path, value)
• Value index (value, path)
– Definition of a secondary XML index:
CREATE XML INDEX Idx_Inhalt_Path ON Buch (Inhalt)
USING XML INDEX Idx Inhalt
FOR <Indextyp>
– Full text index is also supported:
CREATE FULLTEXT INDEX Idx_Inhalt_FT ON Buch (Inhalt)
KEY INDEX b
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
42
13.4 SQL Server syntax
• Model-based storage with EDGE
– Invocation of OPENXML without WITH clause creates EDGE
table
• Schema structure of the EDGE tables:
[Tür08]
Column
Datatype
Task
id
bigint
unique node id
parentid
bigint
parent node id
nodetype
int
distinguishes elements, attributes, comments
localname
nvarchar
tag
prefix
nvarchar
XML namespace prefix
namespaceuri
nvarchar
XML namespace URI
datatype
nvarchar
datatype (derived from DTD or XML schema)
prev
bigint
id of previous node (in document order)
text
ntext
node content
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
43
13.4 SQL Server syntax
• Model-based storage with EDGE
EXEC sp_xml_preparedocument @hdoc OUTPUT, @xmldoctext
INSERT INTO EDGE
SELECT *
FROM OpenXML (@hdoc, '', 0)
EXEC sp_xml_removedocument @hdocC
– Results in EDGE table:
id
parent
nodetype
localname
prefix
namespaceuri
datatype
prev
text
0
NULL
1
book
NULL
NULL
NULL
NULL
NULL
6
3
#text
NULL
NULL
NULL
NULL
'Vossen'
...
17
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
44
13.4 SQL Server syntax
• Schema-based storage of STORED queries
– SQL extension with OPENXML
– OPENXML transforms XML contents into database tables
(shredding)
– OPENXML therefore offers possibility to implement STORED
queries
– Example for the realization of a STORED query:
EXEC sp_xml_preparedocument @hdoc OUTPUT, @xmldoctext
INSERT INTO book
SELECT *
FROM OpenXML (@hdoc, '//book/', 0) WITH
(
title
NVARCHAR(3000) './title',
publisher NVARCHAR(200) './publisher',
isbn
NVARCHAR(15)
'./isbn'
)
EXEC sp_xml_removedocument @hdoc
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
45
13.4 SQL Server syntax
• Mapping of databases to XML
– Variant 1: Standard transformation with SQL SELECT and FOR
XML clause
• FOR XML RAW: Transformation in ROW-XML elements and XML
attributes
• FOR XML AUTO:
– Semantically rich XML element names
– Foreign key relationships are transformed into hierarchies
• FOR XML EXPLICIT: User controls XML assembling through metadata
(EDGE)
– Variant 2: User defined XML view
• Use of a (available) XML schema
• Annotation of the schema with information about tables and columns
• Accesss from the application to the XML view via:
– IIS functionality
– ADO (ActiveX Data Objects) – middleware for DB access
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
46
13.4 SQL Server syntax
• Updates
– SQL Server does not offer functions to update XML documents
stored as CLOBs
• Results in heavy restrictions of text-based approach
– Updates for schema-based approach possible via so called
updategrams
• Builds on annotated XML schemas
• Updates are specified as an XML document
• New namespace:
xmlns:updg="urn:schemas-microsoft-com:xml-updategram"
– Element before: Definition of a previous state (to be modified)
– Element after:
Definition of the new state
• Different update operations through varying element contents
– Insert:
before element remains empty
– Delete:
after element remains empty
– Update:
both elements have non-empty contents
• Automatic execution of necessary database operations
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
47
13.4 SQL Server syntax
• Updates: updategram example
– Update of publisher information
<ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram">
<updg:sync >
<updg:before>
<Buch>
<Titel> Objektdatenbanken </Titel>
<ISBN>3-8266-00258-7 </ISBN>
<Verlag> Thomson </Verlag>
</Buch>
</updg:before>
<updg:after>
<Buch>
<Titel> Objektdatenbanken </Titel>
<ISBN>3-8266-00258-7 </ISBN>
<Verlag> International Thomson Publishing </Verlag>
<Buch>
</updg:after>
</updg:sync>
</ROOT>
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
48
13.4 SQL Server
• Summary SQL Server XML support
XML storage model
Relational
Schema definition
inline DTD or XML schema
Storage type
Native: XML column
text-based: CLOB column
modelbased: with OPENXML
user-defined schema-based: with OPENXML-STORED queries
Mapping DB
[Tür08]
XML
Automatically: FOR XML clause
user-defined: XSD annotations
XML data type
Available
Value index
Available
Full text index
No XML specific functions
Path index
Available
Queries
SQl extensions (query and value not compatible with
SQL/XML), XQuery
Manipulation
XML method modify with updategrams
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
49
13. Systems
13.1 Introduction
13.2 DB2
13.3 Oracle
13.4 SQL Server
13.5 Tamino
13.6 Summary
13.7 Overview and References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
50
13.5 Tamino storage
• Architecture
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
51
13.5 Tamino storage
• Architecture (2)
XML Output
Data from external
sources and/or
internal data storage
[Tür08]
Query (URL)
XML Objects, DTDs
Data to external
sources and/or
internal data storage
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
52
13.5 Tamino storage
• Storage structures: Mapping of XML
– Tamino uses native storage structures for XML data
– Native storage is supplemented with diverse classical index types
• B-Tree index
• Full text index
• Path index
– Storage alternatives:
• Storage of well-formed XML documents without schema
• Storage of valid XML documents
– Annotation of schema definition with storage alternatives
– Storage hierarchy:
•
•
•
•
[Tür08]
Tier 1: Tamino
Tier 2: Collection
Tier 3: Document type (defined by set of XML schema definitions)
Tier 4: document instance
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
53
13.5 Tamino syntax
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:tsd="namespaces.softwareag.com/tamino/TaminoSchemaDefinition">
<xs:annotation> <xs:appinfo>
<tsd:schemaInfo name="book">
<tsd:collection name="books"></tsd:collection>
<tsd:doctype name="book">
<tsd:logical> <tsd:content>open<tsd:content></tsd:logical>
</tsd:doctype>
</tsd:schemaInfo>
</xs:appinfo> </xs:annotation>
<xs:element name = "book">
<xs:complexType> <xs:sequence>
<xs:element name = "title" type = "xs:string"></xs:element>
…
<xs:element name = "summary" type = "xs:string">
<xs:annotation> <xs:appinfo>
<tsd:elementInfo>
<tsd:physical> <tsd:native>
<tsd:index> <tsd:text></tsd:text> </tsd:index>
</tsd:native> </tsd:physical>
</tsd:elementInfo>
</xs:appinfo> </xs:annotation>
</xs:element>
</xs:sequence> </xs:complexType>
</xs:element>
Storage:
Example schema
with annotations
for text index
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
54
13.5 Tamino syntax
• Queries
– Access possibilities
• Program controlled, e.g. via DCOM components
• Ad-hoc queries with X-Plorer query tool
• "Interactive Interface"
– Supported query languages
• XPath 1.0 dialect with extensions for text search (also possible without
index)
– Containedness (~=)
/Buch[Titel ~= "Datenmodelle"]/Beschreibung
– Wildcard character (*)
/*[. ~= "*XML*"]
– Consideration of context (NEAR)
/*[/Autor ~= "Gunter" NEAR "Saake"]
• XQuery dialect
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
55
13.5 Tamino syntax
• Updates
– Operations
• Delete:
UPDATE DELETE $buch//Verlag[@Ort="Zürich"]/@Ort
• Insert:
UPDATE INSERT <Preis Waehrung="EUR">35</Preis>
INTO $buch[@ISBN="3-8266-0258-7"]
• Replace:
UPDATE REPLACE $buch//Verlag[@Ort="Zürich"]/@Ort
WITH ATTRIBUTE Ort {"Wiesbaden"}
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
56
13.5 Tamino syntax
• Indexing
– Classical indexes for data
• Numbers and strings
– Text indexes for document centric parts
• With wildcards
– Structure index
• Full
• Condensed
– Combined index
• Multiple elements and attributes, even on different levels
– Multi path index
• Different paths indexed together
– Reference index
• Hierarchy aware index
[Tür08]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
57
13.5 Tamino
• Summary Tamino
Model
Native
Schema definition
Validation possible
Storage type
Model-based
Mapping DB
[Tür08]
XML
Native
XML data type
Available
Value index
Available
Full text index
Available
Path index
Available
Queries
Tamino X-Query (with extensions and small
differences compared to W3C XQuery)
Full text search
Supported
Manipulation
Supported
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
58
13. Systems
13.1 Introduction
13.2 DB2
13.3 Oracle
13.4 SQL Server
13.5 Tamino
13.6 Summary
13.7 Overview and References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
59
13.6 Summary
• DB2 knowledge is transferable to other XML DB
– Big commercial database software
• Small syntax differences between big commercial DBMS
• Many different storage possibilities, but quite similar
– Bigger syntax differences with native XML DBMS
• XML data type and SQL/XML widely supported
and best choice for many tasks
– But optimizing performance (physical tuning) requires
different storage possibilities, because it is application
dependant
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
60
13.6 Summary
• Notable features
– DB2
• offers XML Extender in addition to PureXML
• allowing physical tuning
– Oracle
• has an objectrelational design
• XML integrates more seamlessly
• double access to XMLTYPE
– SQL Server
• offers straight forward access to EDGE tables
• new updategrams for schema-based storage
– Tamino
• only uses a single data model
• own query language with extensions to XPath and XQuery
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
61
13.7 References
• "XML und Datenbanken" [Tür08]
– Can Türker
– Lecture, University of Zurich, 2008
• "XML und Datenbanken" [KM03]
– M. Klettke, H. Meier
– dpunkt.verlag, 2003
• " DB2 9 pureXML Guide" [IBM06a]
– IBM
– December 2006
• "DB2 Version 9. XML Guide" [IBM06b]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
62
12.6 Overview
Introduction and Basics
1. Introduction
2. XML Basics
3. Schema Definition
4. XML Processing
Querying XML
5. XPath & SQL/XML
Queries
6. XQuery Data Model
7. XQuery
XML Updates
8. XML Updates & XSLT
Producing XML
9. Producing XML
Storing XML
10. XML storage
11. Relational XML storage
12. Storage Optimization
Systems
13. Technology Overview
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
63
Questions, Ideas, Comments
• Now, or ...
• Room:
IZ 232
• Office our:
Tuesday, 12:30 – 13:30 Uhr
or on appointment
• Email:
[email protected]
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
64
Herunterladen