Remarks to Optimization Rules for the End User Programming Language OttoVonG
Opatia 8.6.2007
Klaus Benecke
[email protected]
IWS/FIN, Otto-von-Guericke-Universität Magdeburg
Postfach 4120
Magdeburg, Germany 39016, Sachsen/Anhalt

1 Aims of OttoVonG
Universal Enduser Query Language for
• Documents (XML)
• Tables (databases)
• Internet (of XML-Documents)
• Graphics

1 Objects of OttoVonG
Generating Operations
• El_tab
• Tag0
• Tuple_t
• Coll_t
• Alternate_t

2 Counter Examples for Optimization Rules (1)
<< M( A, L( B, C)) ::
      1     2  3
            4  5 >>   (Tabment T0)
(a) B=4(B::C=3(T0)) ≠ B::C=3(B=4(T0))

2 Counter Examples for Optimization Rules (2)
Tabment T0 as in (1)
(b) B::pos(B)=1(B::B=4(T0)) ≠ B::B=4(B::pos(B)=1(T0))

2 Counter Examples for Optimization Rules (3)
Tabment T0 as in (1)
(c) B::C=3(L(C)[-1]=5(T0)) ≠ L(C)[-1]=5(B::C=3(T0))

2 Attributes
• name
• C(name) (C collection symbol; M; B; L)
• pos(name)
• Attribute[i] (i: integer)

2 Nonrecursive Example DTD
NAME      TYPOS
TABMENT   L(A?, B, M(C, D))
C         E, F
D         M(H)
F         M(G)
A,...     TEXT

2 Example Extended Tree
L
|
(A: TEXT)?, (B: TEXT), M
                       |
      (C: (E: TEXT), (F: M)), (D: M)
                         |        |
                  (G: TEXT)   (H: TEXT)

2 Condition Types (1)
• Simple condition
  name1:: cond1, where cond1 contains only names and the deepest name of cond1 is as deep as name1
  example: G:: G=B
  counter example 1: E:: G=B
  counter example 2: n:: G=H

2 Condition Types (2)
• Relational condition
  name1::: cond1, where name1:: cond1 is simple
  example: G::: G=B
  abbreviates: G:: G=B
               E:: G=B
               A:: G=B

3 Commuting Conditions (1)
• EMPS: M(ENO, NAME, FIRSTNAME, LOCATION, SALARY, SEX, PATENTCNT, INSTITUTE, M(HOBBY), M(PROJECT, TIME))
• Query 1: non-commuting conditions
  aus EMPS
  gib B-(SALARY, NAME, LOCATION, SEX)
  mit LOCATION="Magdeburg"   ## simple condition
  mit pos(SALARY) < 50       ## position selecting condition

3 Commuting Conditions (2)
• EMPS: as in (1)
• Query 2: commuting conditions
  aus EMPS
  mit PROJECT:: TIME > 10    # simple condition
  mit LOCATION="Magdeburg"   # simple condition

3 Commuting Conditions (3)
• cond2(cond1(tab)) = cond1(cond2(tab)),
• if one of the following conditions is satisfied:
1 cond1 and cond2 are simple.
2 One condition does not select in a fixed level of the other.
3 cond1 and cond2 are relational.
4 cond1 and cond2 refer to the same level and are not position selecting.
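
The contrast between Query 1 and Query 2 can be sketched in Python. This is only an illustration on a flat list, not the OttoVonG implementation: the helpers `select_value` and `select_pos` and the sample rows are hypothetical, and positions are assumed to be 1-based and recomputed after every selection.

```python
def select_value(rows, pred):
    """Keep the rows whose content satisfies pred (no positions involved)."""
    return [r for r in rows if pred(r)]


def select_pos(rows, pred):
    """Keep the rows whose 1-based position in the current list satisfies pred."""
    return [r for i, r in enumerate(rows, start=1) if pred(i)]


# Hypothetical sample data, not taken from the slides.
emps = [
    {"NAME": "Ahrens", "LOCATION": "Berlin", "SALARY": 3000},
    {"NAME": "Bauer", "LOCATION": "Magdeburg", "SALARY": 4000},
    {"NAME": "Conrad", "LOCATION": "Magdeburg", "SALARY": 5000},
]


def in_magdeburg(row):
    return row["LOCATION"] == "Magdeburg"


def well_paid(row):
    return row["SALARY"] > 3500


def is_first(pos):
    return pos < 2


# Two value conditions on the same level commute.
vv = select_value(select_value(emps, in_magdeburg), well_paid)
vv_swapped = select_value(select_value(emps, well_paid), in_magdeburg)

# A value condition and a position selecting condition do not commute in general:
# the first selection changes the positions seen by the second one.
value_then_pos = select_pos(select_value(emps, in_magdeburg), is_first)
pos_then_value = select_value(select_pos(emps, is_first), in_magdeburg)
```

With this sample data, `value_then_pos` keeps the first Magdeburg employee, while `pos_then_value` is empty because the first row overall is not located in Magdeburg.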

4 Absorption of a Condition
• Query 3:
  aus EMPS
  mit PROJECT:: PROJECT in L("otto" "SQL" "XML")
  mit TIME > 10   # existential condition

  aus EMPS
  mit TIME>10 i PROJECT in L("otto" "SQL" "XML")
  mit PROJECT:: PROJECT in L("otto" "SQL" "XML")

5 Smuggling a Condition (1)
• Query 4:
  aus EMPS
  mit LOCATION = "Magdeburg"
  mit PROJECT:: TIME > 10
  gib M(PROJECT, M(NAME, ENO, TIME))

  aus EMPS
  mit LOCATION="Magdeburg"
  mit PROJECT:: TIME>10
  mit PROJECT = PROJECT
  gib M(PROJECT, M(NAME, ENO, TIME))

5 Smuggling a Condition (2)
• Query 4 continued:
  aus EMPS
  mit LOCATION="Magdeburg"
  mit TIME > 10
  mit PROJECT:: TIME>10
  gib M(PROJECT, M(NAME, ENO, TIME))

  aus EMPS
  mit LOCATION="Magdeburg"
  mit PROJECT ::: TIME>10
  gib M(PROJECT, M(NAME, ENO, TIME))

6 Smuggling the Forget Operation (1)
• The forget operation is a relatively simple operation that resembles relational projection but differs from it in three points:
1. The argument of forget is not the list of attributes that remain in the resulting structure, but the list of attributes to be omitted.
2. forget does not eliminate duplicates in sets.
3. forget can also be used in recursive structures.

6 Smuggling the Forget Operation (2)
Query 5 a:
  aus EMPS
  mit LOCATION = "Magdeburg"
  gib M(INSTITUTE, B(NAME, SALARY))

  aus EMPS
  mit LOCATION = "Magdeburg"
  forget HOBBY, PROJECT, TIME, SEX,…
  gib M(INSTITUTE, B(NAME, SALARY))

6 Smuggling the Forget Operation (3)
Query 5 b:
  aus EMPS
  gib M(NAME, HOBBY, PROJECT)

  aus EMPS
  forget NAME, HOBBY, PROJECT, TIME, SEX,…
  gib M(NAME, HOBBY, PROJECT)

7 Rules with the Extension Operation ext () (1)
• Query 7: a hierarchical join
  FACULTIES: M(FAC, DEAN, FACBUDGET)
  INSTITUTES: M(INSTI, MGR, BUDGET, FAC)

  ext F := FACULTIES
  ext G := INSTITUTES at FACBUDGET
  mit INSTI:: F/FAC = G/FAC
  mit FACBUDGET > 100000
  mit INSTI:: BUDGET > 10000

7 Rules with the Extension Operation ext () (2)
• Query 7: a hierarchical join
  aus INSTITUTES
  mit BUDGET > 10000
  =: $instis

  aus F:=FACULTIES
  mit FACBUDGET > 100000
  ext G := $instis at FACBUDGET
  mit INSTI:: F/FAC = G/FAC

7 Rules with the Extension Operation ext () (3)
sel-ext1
  cond(assi(tab)) = assi(cond(tab)),
  this rule holds if all operations are applicable and the extension does not introduce a name that is used in the condition.
Counter example for sel-ext1:
  << L(A, B):: 1 2 >>
  ext B := 3 at B
  mit B=3

7 Rules with the Extension Operation ext () (4)
sel-ext2
  cond(X:=tab2 at Y(tab1)) = X:=(cond(tab2)) at Y(tab1),
  here it is presupposed that all operations are applicable, that cond is a ::-condition, and that cond does not contain a name from tab1.

7 Rules with the Extension Operation ext () (5)
ext-ext
  assi1(assi2(tab)) = assi2(assi1(tab)),
  this rule holds if the right-hand and left-hand sides are defined and assi1 and assi2 have no common names or tabment names.
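
Why ext-ext needs a side condition can be sketched in Python on the level of result types only. This is an illustration, not the OttoVonG implementation: the helper `ext` is hypothetical, values are ignored, and it is assumed that `ext NAME := ... at X` places the new name directly after every occurrence of X, which is the behaviour suggested by the result types stated in the counter examples on these slides.

```python
def ext(scheme, new_name, at):
    """Return a new scheme with new_name inserted after every occurrence of `at`.

    Assumption: this insertion rule is inferred from the slides' result types,
    it is not a documented definition of the ext operation.
    """
    result = []
    for name in scheme:
        result.append(name)
        if name == at:
            result.append(new_name)
    return result


base = ["A"]  # scheme after "ext A := 1"

# Same anchor name A: the order of the two extensions is visible in the result.
c_then_b = ext(ext(base, "C", "A"), "B", "A")
b_then_c = ext(ext(base, "B", "A"), "C", "A")

# A common name (A is introduced twice): even the number of columns differs.
b_then_a = ext(ext(base, "B", "A"), "A", "A")
a_then_b = ext(ext(base, "A", "A"), "B", "A")
```

Under this assumed semantics, `c_then_b` and `b_then_c` contain the same names in a different order, whereas `b_then_a` and `a_then_b` differ in length, which is the situation the rule excludes by requiring that the two assignments share no names.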
Counter example for ext-ext (of minor importance):
  ext A := 1
  ext C := 3 at A
  ext B := 2 at A   # result type: A, B, C

7 Rules with the Extension Operation ext () (6)
Counter example for ext-ext (of minor importance):
  ext A := 1
  ext B := 2 at A
  ext A := 3 at A   # result type: (A, A, B)
but
  ext A := 1
  ext A := 3 at A
  ext B := 2 at A   # result type: (A, B, A, B)

Summary
• We have powerful operations, which are implemented for XML documents and TAB files;
• this implementation must be improved and generalized in several points:
• include optimization strategies;
• generalize it to databases and Intranet

Thank you for your attention