IBM® PureData™ System for Analytics
Carlo Marchesi, PureData Specialist
February 13, 2013
© 2013 IBM Corporation

Would you use Google if it took 3 days and 7 people to get an answer?

IBM PureData System for Analytics Takes Analytics Beyond Reporting
• Optimization: What is the best choice?
• Predictive Analytics: What will happen? What will the impact be?
• BI Reporting and Ad-Hoc Analysis: What happened? When and where? How much?

Traditional Data Warehouse Applications Are Simply Too Complex
They are built on databases that were optimized for transaction processing, NOT for the demands of advanced analytics on large data volumes.
• Infrastructure too complex
• Analytics too inefficient
• Deployment too complicated
• Too many staff needed for maintenance
• Too much tuning required
• Too costly to operate
• Too time-consuming for fast answers

IBM PureSystem Family
• Infrastructure: Delivering Infrastructure Services
• Application Platform: Delivering Platform Services
• Data Platform: Delivering Data Services

IBM PureData System
Meeting Big Data Challenges – Fast and Easy!
• For apps like E-commerce… System for Transactions: Database cluster services optimized for transactional throughput and scalability
• For apps like Customer Analysis… System for Analytics: Data warehouse services optimized for high-speed, peta-scale analytics and simplicity. Powered by Netezza technology
• For apps like Real-time Fraud Detection… System for Operational Analytics: Operational data warehouse services optimized to balance high performance analytics and real-time operational throughput

Built-In Expertise Makes This as Simple as an Appliance
• Dedicated device
• Optimized for purpose
• Complete solution
• Fast installation
• Very easy operation
• Standard interfaces
• Low cost

IBM PureData System for Analytics
The Simple Appliance for Serious Analytics
Built-in Expertise
• No indexes or tuning
• Data model agnostic
• Hardware accelerated, fully parallel, optimized, In Database Analytics
Integration by Design
• Server, Storage, Database in one easy to use package
• Automatic parallelization and resource optimization to scale efficiently and economically
• Enterprise-class security and platform management
Simplified Experience
• Up and running in hours
• Minimal up front design and tuning
• Minimal ongoing administration
• Standard interfaces to best of breed Analytics, Business Intelligence, and data integration tools
• Built-in, complex analytical capabilities allow users to derive insight from their data quickly
• Easy connectivity to other Big Data Platform components

IBM PureData System for Analytics Transforms the User Experience
• Purpose-built analytics engine
• Integrated database, server and storage
• Standard interfaces
• Low total cost of ownership
• Speed: 10-100x faster than traditional systems
• Simplicity: Minimal administration and tuning
• Scalability: Peta-scale user data capacity
• Smart: High-performance advanced analytics

Speed
• 15,000 users run 800,000+ queries per day.
• Now 50X faster than before
“…when something took 24 hours I could only do so much with it, but when something takes 10 seconds, I may be able to completely rethink the business …”
- SVP Application Development, Nielsen
Source: http://www.youtube.com/watch?v=yOwnX14nLrE&feature=player_embedded

Simplicity
• In production 6 months before the first training
• 200X faster than the old Oracle system
• ROI in less than 3 months
[Graphic labels: MONTHS, WEEKS, DAYS]
“Allowing the business users access to the Netezza box was what sold it.”
Steve Taff, Executive Dir. of IT Services

Scalability
• 1 PB on Netezza systems
• 7 years of historical data
• 100-200% annual data growth
“NYSE … has replaced an Oracle IO relational database with a data warehousing appliance from Netezza, allowing it to conduct rapid searches of 650 terabytes of data.”
ComputerWeekly.com
Source: http://www.computerweekly.com/Articles/2008/04/14/230265/NYSE-improves-data-management-with-datawarehousing.htm

Smart
• 30% redemption increase
“Using results derived from Netezza system, Catalina Marketing is able to give shoppers (more relevant) coupons at point of sale for items they would likely want to buy in future visits.”
Editorial Director, DM Review

Integrated by Design
IBM Netezza In-Database Analytics Version 2.0
Netezza In-Database Analytics: Transformations, Mathematical, Geospatial, Predictive, Statistics, Time Series, Data Mining
• No data movement
• Analyze deep and wide data
• High performance, parallel computation

Pre-Built In-Database Analytics
Statistics: Descriptive Statistics+, Distance Measures*, Hypothesis Testing*, Chi-Square & Contingency Tables*, Univariate & Multivariate Distributions+
Transformations and Time Series: Data Profiling / Descriptive Statistics+, General Diagnostics, Autoregressive+, Forecasting*, Statistics+, Sampling, Data prep, Monte Carlo Simulation*
Mathematical: Basic Math*, Permutation and Combination*, Greatest Common Divisor and Least Common Multiple*, Conversion of Values*, Exponential and Logarithm*, Gamma and Beta Functions, Matrix Algebra+, Area Under Curve*, Interpolation Methods*
Data Mining: Association Rules+, Clustering+, Feature Extraction+, Discriminant Analysis*
Predictive: Linear Regression+, Logistic Regression+, Classification, Bayesian, Sampling, Model Testing
Geospatial: Geospatial Data Type, Geometric Functions, Geometric Analysis
* Fuzzy Logix DB Lytix capabilities
+ Netezza Analytics and Fuzzy Logix DB Lytix capabilities

Combining Spatial and Corporate Data
Delivering More Insight
Combine location with 100 million call data records.
Customer usage and location can now be used together to
– Report on usage by area
– Develop new marketing campaigns to gain customers in low usage areas
– Plan for additional towers and network capacity

PureData System for Analytics Optimization With Other IBM Products
Categories: Big Data Platform, Data Integration, Business Intelligence / Performance Management, System Z
• InfoSphere Streams
• InfoSphere BigInsights
• System ML (Machine Learning)
• Information Server v9.1
• InfoSphere Discovery v4.5
• InfoSphere Data Architect v8.1
• InfoSphere CDC Heterogeneous Replication
• InfoSphere Optim Data Archive 9.1
• Industry Models v8.4 – Banking, Insurance, Healthcare
• Industry Model Packs – Supply Chain, Customer, Market & Campaign
• Tivoli Storage Manager
• Vivisimo Data Explorer v8.2
• Cognos v10.2
• Cognos TM1 v9.5
• Guardium DB Monitoring v9
• SPSS Modeler v15
• Unica EMM Marketing Analytics 8.6
• Unica NetInsights 8.6
• IBM DB2 Analytics Accelerator (IDAA)
• zLinux ODBC driver
Coming Soon:
• PureData System for Operational Analytics
• Guardium
• Informix Data Warehouse Edition
• SPSS v16

Loading the PureData System for Analytics
Data In: JDBC, ODBC, SQL, OLE-DB
Data Integration: Ab Initio, Cloudera, Composite Software, IBM Big Insights, IBM Information Server, IBM InfoSphere Streams, Informatica, Oracle Data Integrator, Oracle GoldenGate, SAP Business Objects

Querying the PureData System for Analytics
Data Out: JDBC, ODBC, SQL, OLE-DB
Reporting and Analysis: IBM Cognos, IBM SPSS, IBM Unica, Information Builders, Kalido, KXEN, Microsoft Excel, MicroStrategy, Oracle OBIEE, SAP Business Objects, SAS, Actuate

PureData System for Analytics Hardware Overview: Model N2001
12 Disk Enclosures
• 288 600 GB SAS2 drives
  – 240 for user data
  – 14 for S-Blades
  – 34 spare
• RAID 1 mirroring
2 Hosts (Active-Passive)
• 2 6-core Intel 3.46 GHz CPUs
• 7 x 300 GB SAS drives
• Red Hat Linux 6 64-bit
7 PureData for Analytics S-Blades™
• 2 Intel 8-core 2+ GHz CPUs
• 2 8-engine Xilinx Virtex-6 FPGAs
• 128 GB RAM + 8 GB slice buffer
• Linux 64-bit kernel
Scales from ½ Rack to 4 Racks
User Data Capacity: 192 TB*
Data Scan Speed: 478 TB/hr*
Load Speed (per system): 5+ TB/hr
Power Requirements: 7.5 kW
Cooling Requirements: 27,000 BTU/hr
* Assuming 4X compression

Big Data Meets Deep Analytics
Analytics without constraint

PureExperience Program
Let Us Prove it at No Charge
1. Guided analysis of business value
2. PureSystems Technology Demonstration
3. On-Site Trial & Support
• Free execution of on-site service engagement
• Continued use of the PureSystems offering for 30 days
• Access to a technical advocate for usage questions and advice
• Single point of IBM support and maintenance
www.ibm.com/PureExperience

Take the next step
Discover the value and begin your journey with IBM PureSystems
• Visit ibm.com/puresystems to learn more
• Join the conversation about this new category of computing
  – Twitter: @IBM PureSystems • Hashtag: #IBMPureSystems or #expertintsys
  – YouTube Channel: expertintegratedsys
  – Blog: expertintegratedsystemsblog.com
• Developers – Get started today with our no charge trial offerings!
  – ibm.com/developerworks/puresystems/try
• Explore PureSystem partner solutions
  – Ibm.com/puresystems/centre
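
Note on the Model N2001 figures
The capacity and scan-speed figures quoted on the Model N2001 hardware slide can be cross-checked with simple arithmetic. The sketch below is an illustration only and is not part of the original deck. It assumes decimal units (1 TB = 1,000 GB) and that the stated 4X compression factor applies uniformly to both capacity and scan speed. The function and constant names are hypothetical.

```python
# Figures taken from the Model N2001 hardware slide.
USER_DATA_DRIVES = 240          # drives reserved for user data
DRIVE_SIZE_GB = 600             # size of each SAS2 drive
QUOTED_CAPACITY_TB = 192        # user data capacity, assuming 4X compression
QUOTED_SCAN_TB_PER_HR = 478     # data scan speed, assuming 4X compression
COMPRESSION_FACTOR = 4          # compression factor stated in the footnote


def n2001_summary():
    """Derive uncompressed figures from the quoted ones.

    Assumption (not from the slide): decimal units, 1 TB = 1,000 GB,
    and 1 TB = 1,000,000 MB.
    """
    raw_tb = USER_DATA_DRIVES * DRIVE_SIZE_GB / 1000
    uncompressed_capacity_tb = QUOTED_CAPACITY_TB / COMPRESSION_FACTOR
    # Share of raw user-data drive space that holds uncompressed user data.
    usable_fraction = uncompressed_capacity_tb / raw_tb
    uncompressed_scan_tb_per_hr = QUOTED_SCAN_TB_PER_HR / COMPRESSION_FACTOR
    # Implied physical read rate of each user-data drive.
    per_drive_mb_per_s = (
        uncompressed_scan_tb_per_hr * 1_000_000 / 3600 / USER_DATA_DRIVES
    )
    return {
        "raw_tb": raw_tb,
        "uncompressed_capacity_tb": uncompressed_capacity_tb,
        "usable_fraction": usable_fraction,
        "uncompressed_scan_tb_per_hr": uncompressed_scan_tb_per_hr,
        "per_drive_mb_per_s": per_drive_mb_per_s,
    }


if __name__ == "__main__":
    for name, value in n2001_summary().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the quoted 192 TB corresponds to 48 TB of uncompressed user data on 144 TB of raw drive space, or about one third of the raw space. The slide does not break down the remainder; RAID 1 mirroring presumably accounts for part of it.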