Hewlett-Packard Laboratories Technische Universität München Managing Long-Running Queries Stefan KrompassTUM, Harumi KunoHPL, Janet WienerHPL, Kevin WilkinsonHPL, Umeshwar DayalHPL, and Alfons KemperTUM TUMTechnische Universität München Munich, Germany HPLHewlett-Packard Laboratories Palo Alto, CA, USA Technische Universität München Hewlett-Packard Laboratories Motivation Business analysis Maintenance Order entry Customer relations DBMS Sales Administrator 1 Hewlett-Packard Laboratories Technische Universität München Goals of this work • Develop technology to study policies for mixed workloads • Initial study of managing mixed workloads, in particular: impact of long-running queries on a workload – Unreliable cost estimates under-informed admission control and scheduling decisions – Unobserved resource contention monitored resource not the source of contention – System overload 2 Hewlett-Packard Laboratories Technische Universität München Outline • Workload management components & workload management policies • Experiments • Conclusions 3 Technische Universität München Hewlett-Packard Laboratories Workload management overview Workload Objectives Client Workload Manager Admission Controller Database Engine Scheduler Queries Execution Controller DBMS 4 Hewlett-Packard Laboratories Technische Universität München Experimental approach • Create workloads that inject “problem” queries (our workloads are derived from actual mixed workload queries) • Develop a workload management software that implements admission control, scheduling, and execution control policies • Workload manager feeds queries into database engine simulator – Investigate workloads that run for hours – Obtain reproducible results – Experiment with comprehensive set of workload management policies – Inject problem queries 5 Technische Universität München Hewlett-Packard Laboratories Experimental input: queries query type size of query pool queries per workload average elapsed time short 2807 400 30 sec medium 247 23 10 min long 48 3 1 hr Problem queries 6 Technische Universität München Hewlett-Packard Laboratories Taxonomy of long-running queries Query type Query Query Uses equal expected progress share of to be long reasonable resources expected-long yes yes yes expected-hog yes yes no (> equal) surprise-long no yes yes surprise-hog no yes no (> equal) overload no no yes starving no no no (< equal) 7 Hewlett-Packard Laboratories Technische Universität München Experimental inputs • Workload types: expected-long, surprise-long, surprise-hog • Admission control – Policies: none, limit expected costs of a query – Thresholds: 0.2m, 0.5m, and 1.0m • Scheduling – Queue: FIFO – Multiprogramming level (MPL) • Execution control – Policies: none, kill, kill&requeue, suspend&resume – Thresholds: absolute 5000 (time units), absolute 12000, absolute 5000 & progress < 30%, relative 1.2x 8 Technische Universität München Hewlett-Packard Laboratories How did we choose the thresholds? Expected = actual CPU costs Surprise-* 9 Technische Universität München Hewlett-Packard Laboratories Experimental measure - weighted makespan Q4 13 Q3 4 Q1 9 31 Elapsed time of queries 4.5 9 5 Weighted makespan Q2 Makespan (Q4 filtered, Penalty for as expected treating Q2 as no Makespan penalty) positive (Qfalse 2 and Q4 filtered) 10 Hewlett-Packard Laboratories Technische Universität München Experimental measure - weighted makespan 1A1C “one long query admitted” (1A) “one long query completed” (1C) 11 Technische Universität München Hewlett-Packard Laboratories Can WM handle unreliable cost estimates? Admission control thresholds Weighted makespan (in simulator time units) 160 140 penalty none 0.2m 0.5m 1.0m 120 100 80 60 3A3C 3A3C 0A0C 0A0C 0A0C 2A2C 2A2C 0A0C 40 20 0 expectedlong surpriselong Workload type 12 Technische Universität München Hewlett-Packard Laboratories Can WM handle unreliable cost estimates? Adm ctl + exec ctl with different kill thresholds 160 penalty none absolute 12000 Weighted makespan (in simulator time units) 140 120 100 80 3A3C absolute 5000 absolute 5000, progress <30% relative 1.2x 3A3C 3A0C 3A0C 3A2C 0A0C 0A0C 0A0C 0A0C0A0C 60 40 20 0 none 1.0m Admission control threshold 13 Technische Universität München Hewlett-Packard Laboratories Can WM handle unreliable cost estimates? Execution control actions (w/ admission control 1.0m) 160 penalty Weighted makespan (in simulator time units) 140 absolute 5000 relative 1.2x 120 2A2C 100 80 2A2C 2A2C 2A0C 2A2C 2A2C 2A0C 60 40 20 0 none kill kill & requeue suspend& resume 14 Technische Universität München Hewlett-Packard Laboratories Can workload management handle system overload? 160 penalty none absolute 12000 Weighted makespan (in simulator time units) 140 120 2A2C 100 80 absolute 5000 absolute 5000, progress <30% relative 1.2x 2A0C 2A0C 2A2C 2A0C2A0C 2A0C 2A0C 2A0C 2A0C 60 40 20 0 MPL 4 MPL 10 (overload) 15 Hewlett-Packard Laboratories Technische Universität München Conclusion • Systematic study of workload management policies to mitigate the impact of long-running queries • Can workload management handle… – unreliable cost estimates – unobserved resource contention – system overload • Value of this work: experimental framework for studying more challenging workload management problems 16 Hewlett-Packard Laboratories Technische Universität München Related work (excerpt) D. G. Benoit. Automated Diagnosis and Control of DBMS Resources. EDBT PhD. Workshop, 2000 S. Chaudhuri, R. Kaushik, and R. Ramamurthy. When Can We Trust Progress Estimators for SQL Queries? SIGMOD 2005 S. Chaudhuri, R. Kaushik, R. Ramamurthy, and A. Pol. Stop-and-Restart Style Execution for Long Running Decision Support Queries. VLDB 2007 S. Krompass, H. Kuno, U. Dayal, and A. Kemper. Dynamic Workload Management for Very Large Data Warehouses: Juggling Feathers and Bowling Balls. VLDB 2007 G. Luo, J. F. Naughton, and P. S. Yu. Multi-query SQL Progress Indicators, EDBT 2006 Workload management tools: HP Neoview, IBM Workload Manager for DB2, Microsoft SQL Server, Oracle Database Resource Manager, Teradata Dynamic Workload Manager 17