12c SQL Pattern Matching – wann werde ich das benutzen?

Werbung
12c SQL Pattern Matching –
wann werde ich das benutzen?
Andrej Pashchenko
Senior Consultant
Trivadis GmbH
BASEL BERN BRUGG DÜSSELDORF
HAMBURG KOPENHAGEN LAUSANNE
FRANKFURT A.M. FREIBURG I.BR. GENF
MÜNCHEN STUTTGART WIEN ZÜRICH
Unser Unternehmen.
Trivadis ist führend bei der IT-Beratung, der Systemintegration, dem Solution
Engineering und der Erbringung von IT-Services mit Fokussierung auf
und
-Technologien in der Schweiz, Deutschland, Österreich und
Dänemark. Trivadis erbringt ihre Leistungen aus den strategischen Geschäftsfeldern:
BETRIEB
Trivadis Services übernimmt den korrespondierenden Betrieb Ihrer IT Systeme.
2
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Mit über 600 IT- und Fachexperten bei Ihnen vor Ort.
KOPENHAGEN
14 Trivadis Niederlassungen mit
über 600 Mitarbeitenden.
HAMBURG
Über 200 Service Level Agreements.
Mehr als 4'000 Trainingsteilnehmer.
Forschungs- und Entwicklungsbudget:
CHF 5.0 Mio.
DÜSSELDORF
Finanziell unabhängig und
nachhaltig profitabel.
FRANKFURT
STUTTGART
FREIBURG
BASEL
WIEN
MÜNCHEN
Erfahrung aus mehr als 1'900 Projekten
pro Jahr bei über 800 Kunden.
BRUGG
ZÜRICH
BERN
LAUSANNE
GENF
3
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Über mich
Senior Consultant bei der Trivadis GmbH, Düsseldorf
Schwerpunkt Oracle
– Application Development
– Application Performance
– Data Warehousing
22 Jahre IT-Erfahrung, davon 16 Jahre mit Oracle DB
Kurs-Referent „Oracle 12c New Features für Entwickler“
und „Beyond SQL and PL/SQL“
4
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Agenda
1. Introduction
2. Find consecutive ranges and gaps
3. Trouble Ticket roundtrip
4. Grouping on fuzzy criteria
5. Merge temporal intervals
5
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Introduction
6
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Introduction
Pattern
Matching
LISTAGG
NTH_VALUE
SQL Model
Clause
Analytic
functions
enhancements
Analytic
functions
7
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
PIVOT/UNPIVOT
clause
Top-N
Introduction
Oracle 12c database supports SQL Pattern Matching with the new
clause - MATCH_RECOGNIZE
pattern matching in a sequences of rows
nothing to do with string patterns (PL/SQL REGEXP_...
functions)
it‘s a clause, not a function
after the table name in FROM clause
patterns are expressed with regular expression syntax over
pattern variables
pattern variables are defined as SQL expressions
8
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Introduction
MATCH_RECOGNIZE
( [ PARTITION BY <cols> ]
[ ORDER BY <cols> ]
[ MEASURES <cols> ]
[ ONE ROW PER MATCH | ALL ROWS PER MATCH ]
[ SKIP_TO <option> ]
PATTERN ( <row pattern> )
[ SUBSET <subset list> ]
DEFINE <definition list> )
9
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Introduction
Example: Find Mappings in the ETL logging table, which were
increasingly faster over a period of four days. Output: start and end dates
of the period, elapsed time at the beginning and the end of the period,
average elapsed time.
10
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Introduction
SELECT etl_date, mapping_name, elapsed
FROM
dwh_etl_runs;
...
04-NOV-14 MAP_STG_S_ORDER_ITEM
+000000
05-NOV-14 MAP_STG_S_ORDER
+000000
05-NOV-14 MAP_STG_S_ORDER_ITEM
+000000
05-NOV-14 MAP_STG_S_ASSET
+000000
06-NOV-14 MAP_STG_S_ASSET
+000000
06-NOV-14 MAP_STG_S_ORDER
+000000
06-NOV-14 MAP_STG_S_ORDER_ITEM
+000000
07-NOV-14 MAP_STG_S_ORDER_ITEM
+000000
07-NOV-14 MAP_STG_S_ORDER
+000000
07-NOV-14 MAP_STG_S_ASSET
+000000
08-NOV-14 MAP_STG_S_ASSET
+000000
08-NOV-14 MAP_STG_S_ORDER
+000000
08-NOV-14 MAP_STG_S_ORDER_ITEM
+000000
...
11
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
00:14:54.42738
00:10:13.44989
00:15:06.24587
00:14:15.22855
00:14:00.49513
00:11:05.07337
00:10:12.67410
00:19:29.64314
00:14:59.80953
00:13:33.80789
00:10:14.65652
00:13:30.77744
00:17:15.11789
Introduction
As for analytic functions:
partition and order
SELECT *
FROM
dwh_etl_runs MATCH_RECOGNIZE (
Define measures, which are
accessible in the main query
PARTITION BY mapping_name
ORDER BY etl_date
MEASURES FIRST (etl_date) AS start_date
,
LAST (etl_date) AS end_date
Define search
, pattern with FIRST (elapsed) AS first_elapsed
regular expression over boolean
LAST (elapsed) AS last_elapsed
pattern,variables
,
AVG(elapsed) AS avg_elapsed
PATTERN (STRT DOWN{3})
DEFINE DOWN AS elapsed < PREV(elapsed) )
Define pattern variables
12
19.11.2015
Navigation operators:
PREV, NEXT – physical offset
FIRST, LAST – logical offset
12c SQL Pattern Matching – wann werde ich das benutzen?
Introduction
PATTERN: Subset of Perl syntax for regular expressions
– * — 0 or more iterations
– + — 1 or more iterations
– ? — 0 or 1 iterations
– {n} — n iterations (n > 0)
– {n,} — n or more iterations (n >= 0)
– {n,m} — between n and m (inclusive) iterations (0 <= n <= m, 0 < m)
– {,m} — between 0 and m (inclusive) iterations (m > 0)
– ( ) – Grouping
– | – Alternation
– {- … -} – Exclusion
– ^ - before the first row in the Partition
– $ - after the last row in the partition
– ? – “reluctant” vs. “greedy”
– ….
13
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Introduction
Patterns are everywhere
Network Activity
CRM
Stock Market
Financial
Money
Laundering
Telcos
Trouble Ticketing
Quality of Service
Fraud Detection
Automotive
Advertising
Campaigns
Buying Patterns
Retail
Sessionization
Sensor Data
Traffic
Price Trends
14
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Frequent Flyer
Programms
Transport /
Logistics
Process Chain
Introduction
SQL had no efficient way to handle such questions
pre 12c solutions
self-joins, subqueries (NOT) IN, (NOT) EXISTS
switch to PL/SQL - „Do it yourself“, often multiple SQL queries
transfer some logic to pipelined functions and integrate them in
the main query
analytic (window) functions
– ORA-30483: window functions are not allowed here
– not possible to use in WHERE clause
– not possible to nest them
– unable to access the output of analytic functions in other rows
– often leads to nesting queries, self-joins, etc.
15
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Agenda
1. Introduction
2. Find consecutive ranges and gaps
3. Trouble Ticket roundtrip
4. Grouping on fuzzy criteria
5. Merge temporal intervals
16
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Find consecutive ranges and gaps
17
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
mr_consecutive.sql
Find Consecutive Ranges / Gaps
ID
Start of Range
End of Range
1
1
3
2
5
6
3
10
12
5
6
10
11
12
14
SLA, QoS: find the longest period without outage
Table T_GAPS
20
Find consecutive ranges in the values of column ID
21
Output: Start- and End-ID of consecutive range
…
18
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Find Consecutive Ranges / Gaps
ID
1
2
3
5
6
10
11
12
14
Pre 12c solution using analytic functions
WITH groups_marked AS (
SELECT id
,
CASE
WHEN id != LAG(id,1,id) OVER(ORDER BY id) + 1 THEN 1
ELSE 0
END new_grp
FROM
t_gaps)
,
sum_grp AS (
SELECT id, SUM(new_grp) OVER(ORDER BY id) grp_sum
FROM groups_marked )
SELECT MIN(id) start_of_range
,
MAX(id) end_of_range
FROM
sum_grp
GROUP BY grp_sum
ORDER BY grp_sum;
20
21
…
19
19.11.2015
mr_consecutive.sql
12c SQL Pattern Matching – wann werde ich das benutzen?
Find Consecutive Ranges / Gaps
ID
1
2
3
5
6
„Tabibitosan“- method*
SELECT MIN(id) start_of_range
,
MAX(id) end_of_range
FROM (SELECT id
,
id - ROW_NUMBER() OVER(ORDER BY id) distance
FROM
t_gaps)
GROUP BY distance
ORDER BY distance;
10
11
12
14
20
21
…
* - https://community.oracle.com/message/3991177#3991177
20
19.11.2015
mr_consecutive.sql
12c SQL Pattern Matching – wann werde ich das benutzen?
Find Consecutive Ranges / Gaps
ID
1
2
3
5
6
10
12c solution with MATCH_RECOGINZE
SELECT *
FROM
t_gaps MATCH_RECOGNIZE (
ORDER BY id
MEASURES FIRST(id) start_of_range
,
LAST(id) end_of_range
,
COUNT(*) cnt
ONE ROW PER MATCH
PATTERN (strt cont*)
DEFINE cont AS id = PREV(id)+1
);
11
12
14
20
21
…
21
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
mr_consecutive.sql
mr_gaps.sql
Find Consecutive Ranges / Gaps
ID
Start of Gap
End of Gap
1
4
4
2
7
9
3
13
13
5
15
19
6
10
11
12
Table T_GAPS, numeric column ID with gaps
14
Find the gaps in the values of column ID
20
21
Output: start- and end-ID of the gap
…
22
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Find Consecutive Ranges / Gaps
ID
1
2
3
5
6
10
mr_gaps.sql
Solution with analytic functions
SELECT start_of_gap, end_of_gap
FROM ( SELECT id + 1 start_of_gap
,
LEAD(id) OVER(ORDER BY id) - 1 end_of_gap
,
CASE
WHEN id + 1 != LEAD(id) OVER(ORDER BY id) THEN 1
ELSE 0
END is_gap
FROM
t_gaps)
WHERE is_gap = 1;
11
12
14
20
21
„Tabibitosan“-method*
SELECT MAX(id) + 1 start_of_gap
,
LEAD(MIN(id)) OVER (ORDER BY distance) -1 end_of_gap
FROM (SELECT id
,
id - ROW_NUMBER() OVER(ORDER BY id) distance
FROM
t_gaps)
GROUP BY distance;
…
* - https://community.oracle.com/message/3991177#3991177
23
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Find Consecutive Ranges / Gaps
ID
1
2
3
5
6
10
12c solution with MATCH_RECOGINZE
SELECT *
FROM
t_gaps MATCH_RECOGNIZE (
ORDER BY id
MEASURES PREV(gap.id)+1 start_of_gap
,
gap.id - 1 end_of_gap
ONE ROW PER MATCH
PATTERN (strt gap+)
DEFINE gap AS id != PREV(id)+1
);
11
12
14
20
21
…
24
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
mr_gaps.sql
Agenda
1. Introduction
2. Find consecutive ranges and gaps
3. Trouble Ticket roundtrip
4. Grouping on fuzzy criteria
5. Merge temporal intervals
25
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Trouble Ticket roundtrip
26
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Trouble Ticket Roundtrip
SCOTT
KING
ADAMS
Find the tickets, which went
again to the same assignee
27
19.11.2015
ID Assignee
Datum
1
SCOTT
01.02.2015
1
SCOTT
02.02.2015
1
ADAMS
03.02.2015
1
SCOTT
04.02.2015
2
ADAMS
01.02.2015
2
ADAMS
02.02.2015
2
SCOTT
03.02.2015
3
KING
01.02.2015
3
ADAMS
02.02.2015
3
ADAMS
03.02.2015
3
KING
04.02.2015
3
ADAMS
05.02.2015
4
KING
01.02.2015
4
ADAMS
02.02.2015
4
SCOTT
03.02.2015
4
KING
05.02.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Trouble Ticket Roundtrip
Pre12c solution using self-joins
SELECT DISTINCT t1.ticket_id
,
t1.assignee AS first_assignee
,
t3.change_date AS last_change
FROM
trouble_ticket t1
,
trouble_ticket t2
,
trouble_ticket t3
WHERE t1.ticket_id
= t2.ticket_id
AND
t1.assignee
!= t2.assignee
AND
t2.change_date > t1.change_date
AND
t3.assignee
= t1.assignee
AND
t3.ticket_id
= t1.ticket_id
AND
t3.change_date > t2.change_date
ORDER BY ticket_id
28
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
mr_trouble_ticket.sql
Trouble Ticket Roundtrip
mr_trouble_ticket.sql
12c solution using MATCH_RECOGINZE clause
SELECT *
FROM
trouble_ticket
MATCH_RECOGNIZE(
PARTITION BY ticket_id
ORDER BY change_date
MEASURES strt.assignee as first_assignee
,
LAST(same.change_date) as letzte_bearbeitung
AFTER MATCH SKIP TO FIRST another
Where to start over after a
PATTERN (strt another+ same+)
match is found?
DEFINE same AS same.assignee = strt.assignee,
another AS another.assignee != strt.assignee
);
New:
– Row Pattern Skip To:
where to start over after
match?
– match overlaping patterns
29
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Agenda
1. Introduction
2. Find consecutive ranges and gaps
3. Trouble Ticket roundtrip
4. Grouping on fuzzy criteria
5. Merge temporal intervals
30
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Grouping on fuzzy criteria
31
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Grouping over fuzzy criteria
„Sessionization“
– Group rows together where the gap between the timestamps is less
than defined
...
PATTERN (STRT SESS+)
DEFINE SESS AS SESS.ins_date – PREV(SESS.ins_date)<= 10/24/60
– Group rows together that are within a defined interval relatively to the
first row, otherwise start next group
https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID
:13946369553642#3478381500346951056
...
PATTERN (A+)
DEFINE A AS ins_date < FIRST(ins_date) + 6/24
Group over running totals
– Split the data into the groups of defined capacity
32
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Grouping over fuzzy criteria
Example-Schema SH (Sales History)
Task: split the data into the group of fixed
capacity
Fit all customers ordered by age into
groups providing that total sales in every
group < 200 000$
33
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Grouping over fuzzy criteria
mr_group_running_total.sql
12c solution with MATCH_RECOGINZE clause
WITH q AS (SELECT c.cust_id, c.cust_year_of_birth
,
SUM(s.amount_sold) cust_amount_sold
FROM
customers c JOIN sales s ON s.cust_id = c.cust_id
GROUP BY c.cust_id, c.cust_year_of_birth
)
function returns the macth
SELECT *
number
FROM
q
MATCH_RECOGNIZE(
Aggregates in MEASURES:
ORDER BY cust_year_of_birth
Running vs. Final
MEASURES MATCH_NUMBER() gruppe
,
SUM(cust_amount_sold) running_sum
,
FINAL SUM(cust_amount_sold) final_sum
ALL ROWS PER MATCH
PATTERN (gr*)
DEFINE gr AS SUM(cust_amount_sold)<=200000
);
We need all matches
34
19.11.2015
Aggregate function in
pattern variable‘s condition
12c SQL Pattern Matching – wann werde ich das benutzen?
Agenda
1. Introduction
2. Find consecutive ranges and gaps
3. Trouble Ticket roundtrip
4. Grouping on fuzzy criteria
5. Merge temporal intervals
35
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Merge temporal intervals
36
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Merge temporal intervals
Temporal version of SCOTT-Schema: the data in EMP, DEPT and
JOB have temporal validity (VALID_FROM - VALID_TO)
37
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Merge temporal intervals
Task: Query the data for one employee joining four tables with
respect of temporal validity:
38
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Merge temporal intervals
WITH
joined AS (
SELECT e.empno,
g.valid_from,
LEAST( e.valid_to, d.valid_to, j.valid_to,
NVL(m.valid_to, e.valid_to),
LEAD(g.valid_from - 1, 1, e.valid_to) OVER(
PARTITION BY e.empno ORDER BY g.valid_from )
) AS valid_to,
e.ename, j.job, e.mgr, m.ename AS mgr_ename, e.hiredate,
e.sal, e.comm, e.deptno, d.dname
FROM empv e
INNER JOIN (SELECT valid_from FROM empv
UNION
SELECT valid_from FROM deptv
UNION
SELECT valid_from FROM jobv
UNION
SELECT valid_to + 1 FROM empv
WHERE valid_to != DATE '9999-12-31'
UNION
SELECT valid_to + 1 FROM deptv
WHERE valid_to != DATE '9999-12-31'
UNION
SELECT valid_to + 1 FROM jobv
WHERE valid_to != DATE '9999-12-31') g
ON g.valid_from BETWEEN e.valid_from AND e.valid_to
INNER JOIN deptv d
ON d.deptno = e.deptno AND g.valid_from BETWEEN d.valid_from AND d.valid_to
INNER JOIN jobv j
ON j.jobno = e.jobno AND g.valid_from BETWEEN j.valid_from AND j.valid_to
LEFT JOIN empv m
ON m.empno = e.mgr AND g.valid_from BETWEEN m.valid_from AND m.valid_to )
...
Quelle: Philipp Salvisberg:
http://www.salvis.com/blog/2012/12/28/joining-temporal-intervals-part-2/
39
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Merge temporal intervals
...
SELECT empno, valid_from, valid_to, ename, job, mgr,
mgr_ename, hiredate, sal, comm, deptno, dname
FROM joined
MATCH_RECOGNIZE (
PARTITION BY empno, ename, job, mgr,
mgr_ename, hiredate, sal, comm,
deptno, dname
ORDER BY valid_from
MEASURES FIRST(valid_from) valid_from,
LAST(valid_to) valid_to
PATTERN ( strt nxt* )
DEFINE nxt as valid_from = prev(valid_to) + 1
)
WHERE empno = 7788;
40
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Conclusion
Very powerful feature
Significantly simplifies a lot of queries (self-joins, semi-, anti-joins, nested queries),
mostly with performance benefit
Since 2007 a proposal for ANSI-SQL
Requires thinking in patterns
Complicated syntax (at first sight
)
But in many cases the code looks like the requirement in „plain English“
41
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Further information...
Database Data Warehousing Guide - SQL for Pattern Matching http://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8956
Stewart Ashton‘s Blog - https://stewashton.wordpress.com
Oracle Whitepaper - Patterns everywhere - Find them Fast! http://www.oracle.com/ocom/groups/public/@otn/documents/webcontent/1965433.pdf
42
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Trivadis an der DOAG 2015
Ebene 3 - gleich neben der Rolltreppe
Wir freuen uns auf Ihren Besuch.
Denn mit Trivadis gewinnen Sie immer.
43
19.11.2015
12c SQL Pattern Matching – wann werde ich das benutzen?
Herunterladen