12c SQL Pattern Matching – wann werde ich das benutzen? Andrej Pashchenko Senior Consultant Trivadis GmbH BASEL BERN BRUGG DÜSSELDORF HAMBURG KOPENHAGEN LAUSANNE FRANKFURT A.M. FREIBURG I.BR. GENF MÜNCHEN STUTTGART WIEN ZÜRICH Unser Unternehmen. Trivadis ist führend bei der IT-Beratung, der Systemintegration, dem Solution Engineering und der Erbringung von IT-Services mit Fokussierung auf und -Technologien in der Schweiz, Deutschland, Österreich und Dänemark. Trivadis erbringt ihre Leistungen aus den strategischen Geschäftsfeldern: BETRIEB Trivadis Services übernimmt den korrespondierenden Betrieb Ihrer IT Systeme. 2 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Mit über 600 IT- und Fachexperten bei Ihnen vor Ort. KOPENHAGEN 14 Trivadis Niederlassungen mit über 600 Mitarbeitenden. HAMBURG Über 200 Service Level Agreements. Mehr als 4'000 Trainingsteilnehmer. Forschungs- und Entwicklungsbudget: CHF 5.0 Mio. DÜSSELDORF Finanziell unabhängig und nachhaltig profitabel. FRANKFURT STUTTGART FREIBURG BASEL WIEN MÜNCHEN Erfahrung aus mehr als 1'900 Projekten pro Jahr bei über 800 Kunden. BRUGG ZÜRICH BERN LAUSANNE GENF 3 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Über mich Senior Consultant bei der Trivadis GmbH, Düsseldorf Schwerpunkt Oracle – Application Development – Application Performance – Data Warehousing 22 Jahre IT-Erfahrung, davon 16 Jahre mit Oracle DB Kurs-Referent „Oracle 12c New Features für Entwickler“ und „Beyond SQL and PL/SQL“ 4 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Agenda 1. Introduction 2. Find consecutive ranges and gaps 3. Trouble Ticket roundtrip 4. Grouping on fuzzy criteria 5. Merge temporal intervals 5 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Introduction 6 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Introduction Pattern Matching LISTAGG NTH_VALUE SQL Model Clause Analytic functions enhancements Analytic functions 7 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? PIVOT/UNPIVOT clause Top-N Introduction Oracle 12c database supports SQL Pattern Matching with the new clause - MATCH_RECOGNIZE pattern matching in a sequences of rows nothing to do with string patterns (PL/SQL REGEXP_... functions) it‘s a clause, not a function after the table name in FROM clause patterns are expressed with regular expression syntax over pattern variables pattern variables are defined as SQL expressions 8 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Introduction MATCH_RECOGNIZE ( [ PARTITION BY <cols> ] [ ORDER BY <cols> ] [ MEASURES <cols> ] [ ONE ROW PER MATCH | ALL ROWS PER MATCH ] [ SKIP_TO <option> ] PATTERN ( <row pattern> ) [ SUBSET <subset list> ] DEFINE <definition list> ) 9 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Introduction Example: Find Mappings in the ETL logging table, which were increasingly faster over a period of four days. Output: start and end dates of the period, elapsed time at the beginning and the end of the period, average elapsed time. 10 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Introduction SELECT etl_date, mapping_name, elapsed FROM dwh_etl_runs; ... 04-NOV-14 MAP_STG_S_ORDER_ITEM +000000 05-NOV-14 MAP_STG_S_ORDER +000000 05-NOV-14 MAP_STG_S_ORDER_ITEM +000000 05-NOV-14 MAP_STG_S_ASSET +000000 06-NOV-14 MAP_STG_S_ASSET +000000 06-NOV-14 MAP_STG_S_ORDER +000000 06-NOV-14 MAP_STG_S_ORDER_ITEM +000000 07-NOV-14 MAP_STG_S_ORDER_ITEM +000000 07-NOV-14 MAP_STG_S_ORDER +000000 07-NOV-14 MAP_STG_S_ASSET +000000 08-NOV-14 MAP_STG_S_ASSET +000000 08-NOV-14 MAP_STG_S_ORDER +000000 08-NOV-14 MAP_STG_S_ORDER_ITEM +000000 ... 11 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? 00:14:54.42738 00:10:13.44989 00:15:06.24587 00:14:15.22855 00:14:00.49513 00:11:05.07337 00:10:12.67410 00:19:29.64314 00:14:59.80953 00:13:33.80789 00:10:14.65652 00:13:30.77744 00:17:15.11789 Introduction As for analytic functions: partition and order SELECT * FROM dwh_etl_runs MATCH_RECOGNIZE ( Define measures, which are accessible in the main query PARTITION BY mapping_name ORDER BY etl_date MEASURES FIRST (etl_date) AS start_date , LAST (etl_date) AS end_date Define search , pattern with FIRST (elapsed) AS first_elapsed regular expression over boolean LAST (elapsed) AS last_elapsed pattern,variables , AVG(elapsed) AS avg_elapsed PATTERN (STRT DOWN{3}) DEFINE DOWN AS elapsed < PREV(elapsed) ) Define pattern variables 12 19.11.2015 Navigation operators: PREV, NEXT – physical offset FIRST, LAST – logical offset 12c SQL Pattern Matching – wann werde ich das benutzen? Introduction PATTERN: Subset of Perl syntax for regular expressions – * — 0 or more iterations – + — 1 or more iterations – ? — 0 or 1 iterations – {n} — n iterations (n > 0) – {n,} — n or more iterations (n >= 0) – {n,m} — between n and m (inclusive) iterations (0 <= n <= m, 0 < m) – {,m} — between 0 and m (inclusive) iterations (m > 0) – ( ) – Grouping – | – Alternation – {- … -} – Exclusion – ^ - before the first row in the Partition – $ - after the last row in the partition – ? – “reluctant” vs. “greedy” – …. 13 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Introduction Patterns are everywhere Network Activity CRM Stock Market Financial Money Laundering Telcos Trouble Ticketing Quality of Service Fraud Detection Automotive Advertising Campaigns Buying Patterns Retail Sessionization Sensor Data Traffic Price Trends 14 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Frequent Flyer Programms Transport / Logistics Process Chain Introduction SQL had no efficient way to handle such questions pre 12c solutions self-joins, subqueries (NOT) IN, (NOT) EXISTS switch to PL/SQL - „Do it yourself“, often multiple SQL queries transfer some logic to pipelined functions and integrate them in the main query analytic (window) functions – ORA-30483: window functions are not allowed here – not possible to use in WHERE clause – not possible to nest them – unable to access the output of analytic functions in other rows – often leads to nesting queries, self-joins, etc. 15 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Agenda 1. Introduction 2. Find consecutive ranges and gaps 3. Trouble Ticket roundtrip 4. Grouping on fuzzy criteria 5. Merge temporal intervals 16 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Find consecutive ranges and gaps 17 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? mr_consecutive.sql Find Consecutive Ranges / Gaps ID Start of Range End of Range 1 1 3 2 5 6 3 10 12 5 6 10 11 12 14 SLA, QoS: find the longest period without outage Table T_GAPS 20 Find consecutive ranges in the values of column ID 21 Output: Start- and End-ID of consecutive range … 18 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Find Consecutive Ranges / Gaps ID 1 2 3 5 6 10 11 12 14 Pre 12c solution using analytic functions WITH groups_marked AS ( SELECT id , CASE WHEN id != LAG(id,1,id) OVER(ORDER BY id) + 1 THEN 1 ELSE 0 END new_grp FROM t_gaps) , sum_grp AS ( SELECT id, SUM(new_grp) OVER(ORDER BY id) grp_sum FROM groups_marked ) SELECT MIN(id) start_of_range , MAX(id) end_of_range FROM sum_grp GROUP BY grp_sum ORDER BY grp_sum; 20 21 … 19 19.11.2015 mr_consecutive.sql 12c SQL Pattern Matching – wann werde ich das benutzen? Find Consecutive Ranges / Gaps ID 1 2 3 5 6 „Tabibitosan“- method* SELECT MIN(id) start_of_range , MAX(id) end_of_range FROM (SELECT id , id - ROW_NUMBER() OVER(ORDER BY id) distance FROM t_gaps) GROUP BY distance ORDER BY distance; 10 11 12 14 20 21 … * - https://community.oracle.com/message/3991177#3991177 20 19.11.2015 mr_consecutive.sql 12c SQL Pattern Matching – wann werde ich das benutzen? Find Consecutive Ranges / Gaps ID 1 2 3 5 6 10 12c solution with MATCH_RECOGINZE SELECT * FROM t_gaps MATCH_RECOGNIZE ( ORDER BY id MEASURES FIRST(id) start_of_range , LAST(id) end_of_range , COUNT(*) cnt ONE ROW PER MATCH PATTERN (strt cont*) DEFINE cont AS id = PREV(id)+1 ); 11 12 14 20 21 … 21 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? mr_consecutive.sql mr_gaps.sql Find Consecutive Ranges / Gaps ID Start of Gap End of Gap 1 4 4 2 7 9 3 13 13 5 15 19 6 10 11 12 Table T_GAPS, numeric column ID with gaps 14 Find the gaps in the values of column ID 20 21 Output: start- and end-ID of the gap … 22 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Find Consecutive Ranges / Gaps ID 1 2 3 5 6 10 mr_gaps.sql Solution with analytic functions SELECT start_of_gap, end_of_gap FROM ( SELECT id + 1 start_of_gap , LEAD(id) OVER(ORDER BY id) - 1 end_of_gap , CASE WHEN id + 1 != LEAD(id) OVER(ORDER BY id) THEN 1 ELSE 0 END is_gap FROM t_gaps) WHERE is_gap = 1; 11 12 14 20 21 „Tabibitosan“-method* SELECT MAX(id) + 1 start_of_gap , LEAD(MIN(id)) OVER (ORDER BY distance) -1 end_of_gap FROM (SELECT id , id - ROW_NUMBER() OVER(ORDER BY id) distance FROM t_gaps) GROUP BY distance; … * - https://community.oracle.com/message/3991177#3991177 23 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Find Consecutive Ranges / Gaps ID 1 2 3 5 6 10 12c solution with MATCH_RECOGINZE SELECT * FROM t_gaps MATCH_RECOGNIZE ( ORDER BY id MEASURES PREV(gap.id)+1 start_of_gap , gap.id - 1 end_of_gap ONE ROW PER MATCH PATTERN (strt gap+) DEFINE gap AS id != PREV(id)+1 ); 11 12 14 20 21 … 24 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? mr_gaps.sql Agenda 1. Introduction 2. Find consecutive ranges and gaps 3. Trouble Ticket roundtrip 4. Grouping on fuzzy criteria 5. Merge temporal intervals 25 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Trouble Ticket roundtrip 26 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Trouble Ticket Roundtrip SCOTT KING ADAMS Find the tickets, which went again to the same assignee 27 19.11.2015 ID Assignee Datum 1 SCOTT 01.02.2015 1 SCOTT 02.02.2015 1 ADAMS 03.02.2015 1 SCOTT 04.02.2015 2 ADAMS 01.02.2015 2 ADAMS 02.02.2015 2 SCOTT 03.02.2015 3 KING 01.02.2015 3 ADAMS 02.02.2015 3 ADAMS 03.02.2015 3 KING 04.02.2015 3 ADAMS 05.02.2015 4 KING 01.02.2015 4 ADAMS 02.02.2015 4 SCOTT 03.02.2015 4 KING 05.02.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Trouble Ticket Roundtrip Pre12c solution using self-joins SELECT DISTINCT t1.ticket_id , t1.assignee AS first_assignee , t3.change_date AS last_change FROM trouble_ticket t1 , trouble_ticket t2 , trouble_ticket t3 WHERE t1.ticket_id = t2.ticket_id AND t1.assignee != t2.assignee AND t2.change_date > t1.change_date AND t3.assignee = t1.assignee AND t3.ticket_id = t1.ticket_id AND t3.change_date > t2.change_date ORDER BY ticket_id 28 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? mr_trouble_ticket.sql Trouble Ticket Roundtrip mr_trouble_ticket.sql 12c solution using MATCH_RECOGINZE clause SELECT * FROM trouble_ticket MATCH_RECOGNIZE( PARTITION BY ticket_id ORDER BY change_date MEASURES strt.assignee as first_assignee , LAST(same.change_date) as letzte_bearbeitung AFTER MATCH SKIP TO FIRST another Where to start over after a PATTERN (strt another+ same+) match is found? DEFINE same AS same.assignee = strt.assignee, another AS another.assignee != strt.assignee ); New: – Row Pattern Skip To: where to start over after match? – match overlaping patterns 29 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Agenda 1. Introduction 2. Find consecutive ranges and gaps 3. Trouble Ticket roundtrip 4. Grouping on fuzzy criteria 5. Merge temporal intervals 30 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Grouping on fuzzy criteria 31 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Grouping over fuzzy criteria „Sessionization“ – Group rows together where the gap between the timestamps is less than defined ... PATTERN (STRT SESS+) DEFINE SESS AS SESS.ins_date – PREV(SESS.ins_date)<= 10/24/60 – Group rows together that are within a defined interval relatively to the first row, otherwise start next group https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID :13946369553642#3478381500346951056 ... PATTERN (A+) DEFINE A AS ins_date < FIRST(ins_date) + 6/24 Group over running totals – Split the data into the groups of defined capacity 32 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Grouping over fuzzy criteria Example-Schema SH (Sales History) Task: split the data into the group of fixed capacity Fit all customers ordered by age into groups providing that total sales in every group < 200 000$ 33 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Grouping over fuzzy criteria mr_group_running_total.sql 12c solution with MATCH_RECOGINZE clause WITH q AS (SELECT c.cust_id, c.cust_year_of_birth , SUM(s.amount_sold) cust_amount_sold FROM customers c JOIN sales s ON s.cust_id = c.cust_id GROUP BY c.cust_id, c.cust_year_of_birth ) function returns the macth SELECT * number FROM q MATCH_RECOGNIZE( Aggregates in MEASURES: ORDER BY cust_year_of_birth Running vs. Final MEASURES MATCH_NUMBER() gruppe , SUM(cust_amount_sold) running_sum , FINAL SUM(cust_amount_sold) final_sum ALL ROWS PER MATCH PATTERN (gr*) DEFINE gr AS SUM(cust_amount_sold)<=200000 ); We need all matches 34 19.11.2015 Aggregate function in pattern variable‘s condition 12c SQL Pattern Matching – wann werde ich das benutzen? Agenda 1. Introduction 2. Find consecutive ranges and gaps 3. Trouble Ticket roundtrip 4. Grouping on fuzzy criteria 5. Merge temporal intervals 35 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Merge temporal intervals 36 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Merge temporal intervals Temporal version of SCOTT-Schema: the data in EMP, DEPT and JOB have temporal validity (VALID_FROM - VALID_TO) 37 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Merge temporal intervals Task: Query the data for one employee joining four tables with respect of temporal validity: 38 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Merge temporal intervals WITH joined AS ( SELECT e.empno, g.valid_from, LEAST( e.valid_to, d.valid_to, j.valid_to, NVL(m.valid_to, e.valid_to), LEAD(g.valid_from - 1, 1, e.valid_to) OVER( PARTITION BY e.empno ORDER BY g.valid_from ) ) AS valid_to, e.ename, j.job, e.mgr, m.ename AS mgr_ename, e.hiredate, e.sal, e.comm, e.deptno, d.dname FROM empv e INNER JOIN (SELECT valid_from FROM empv UNION SELECT valid_from FROM deptv UNION SELECT valid_from FROM jobv UNION SELECT valid_to + 1 FROM empv WHERE valid_to != DATE '9999-12-31' UNION SELECT valid_to + 1 FROM deptv WHERE valid_to != DATE '9999-12-31' UNION SELECT valid_to + 1 FROM jobv WHERE valid_to != DATE '9999-12-31') g ON g.valid_from BETWEEN e.valid_from AND e.valid_to INNER JOIN deptv d ON d.deptno = e.deptno AND g.valid_from BETWEEN d.valid_from AND d.valid_to INNER JOIN jobv j ON j.jobno = e.jobno AND g.valid_from BETWEEN j.valid_from AND j.valid_to LEFT JOIN empv m ON m.empno = e.mgr AND g.valid_from BETWEEN m.valid_from AND m.valid_to ) ... Quelle: Philipp Salvisberg: http://www.salvis.com/blog/2012/12/28/joining-temporal-intervals-part-2/ 39 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Merge temporal intervals ... SELECT empno, valid_from, valid_to, ename, job, mgr, mgr_ename, hiredate, sal, comm, deptno, dname FROM joined MATCH_RECOGNIZE ( PARTITION BY empno, ename, job, mgr, mgr_ename, hiredate, sal, comm, deptno, dname ORDER BY valid_from MEASURES FIRST(valid_from) valid_from, LAST(valid_to) valid_to PATTERN ( strt nxt* ) DEFINE nxt as valid_from = prev(valid_to) + 1 ) WHERE empno = 7788; 40 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Conclusion Very powerful feature Significantly simplifies a lot of queries (self-joins, semi-, anti-joins, nested queries), mostly with performance benefit Since 2007 a proposal for ANSI-SQL Requires thinking in patterns Complicated syntax (at first sight ) But in many cases the code looks like the requirement in „plain English“ 41 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Further information... Database Data Warehousing Guide - SQL for Pattern Matching http://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8956 Stewart Ashton‘s Blog - https://stewashton.wordpress.com Oracle Whitepaper - Patterns everywhere - Find them Fast! http://www.oracle.com/ocom/groups/public/@otn/documents/webcontent/1965433.pdf 42 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen? Trivadis an der DOAG 2015 Ebene 3 - gleich neben der Rolltreppe Wir freuen uns auf Ihren Besuch. Denn mit Trivadis gewinnen Sie immer. 43 19.11.2015 12c SQL Pattern Matching – wann werde ich das benutzen?