IBM Smarter Analytics für Big Data

Werbung
IBM Smarter Analytics für „Big Data”
Technologie Workshop „Big Data”
FZI Forschungszentrum Informatik
am Karlsruher Institut für Technologie
22. Juni 2015
Axel J. Schwarz
Mobil: +49 171 5619419
E-Mail: [email protected]
IBM Smarter Analytics für „Big Data”
1
v1.3
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Paradigm shift enabled by „Big Data“
Analytics is expanding from enterprise data to big data,
creating new opportunities for competitive advantage
Traditional Approach
Structured, analytical, logical
New Approach
Creative, holistic thought, intuition
Hadoop &
Streaming
Data
Data
Warehouse
Web Logs, URLs
Transaction Data
Contact Center notes
Customer Data
Core System Data
Structured
Repeatabl
e
Linear
Enterprise
Integration
Unstructured
Exploratory
Iterative
Text Data: emails, chats
Social Data
Log data
3rd
Party Data
Traditional
Sources
2
Geolocation data
New
Sources
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Paradigm shift enabled by „Big Data“
Leverage more of the data being captured
TRADITIONAL APPROACH
All available
information
Analyze small subsets
of information
3
BIG DATA APPROACH
Analyzed
information
All available
information
analyzed
Analyze
all information
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Paradigm shift enabled by „Big Data“
Reduce effort required to leverage data
TRADITIONAL APPROACH
Small
amount of
carefully
organized
information
Carefully cleanse information
before any analysis
4
BIG DATA APPROACH
Large
amount of
messy
information
Analyze information as is,
cleanse as needed
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Paradigm shift enabled by „Big Data“
Data leads the way—and sometimes correlations are good enough
TRADITIONAL APPROACH
Hypothesis
Question
Data
Exploration
Answer
Data
Insight
Correlation
Start with hypothesis and
test against selected data
5
BIG DATA APPROACH
Explore all data and
identify correlations
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Paradigm shift enabled by „Big Data“
Leverage data as it is captured
TRADITIONAL APPROACH
BIG DATA APPROACH
Data
Analysis
Data
Repository
Analysis
Insight
Insight
Analyze data after it’s been
processed and landed in a warehouse
or mart
6
Analyze data in motion as it’s
generated, in real-time
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Paradigm shift enabled by „Big Data“ - Summary
Analyze small subsets
of information
Analyzed
information
All available
information
analyzed
BIG DATA APPROACH
TRADITIONAL APPROACH
All available
information
Analyze
all information
Hypothesis
Question
Data
Exploration
Answer
Data
Insight
Correlation
Start with hypothesis and
test against selected data
7
Explore all data and
identify correlations
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Paradigm shift enabled by „Big Data“ - Summary
Analyze small subsets
of information
EXACT
EVERY
THING
SLOW
LACK OF
CONTROL
Analyzed
information
Hypothesis
Question
Answer
Data
Start with hypothesis and
test against selected data
8
Data
Exploration
Insight
Correlation
All available
information
analyzed
BIG DATA APPROACH
TRADITIONAL APPROACH
All available
information
Analyze
all information
Explore all data and
identify correlations
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
IBM Big Data & Analytics Referenz-Architektur
Komponenten - Überblick
9
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
IBM Big Data & Analytics Referenz-Architektur
Komponenten - Detaillierung
10
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
IBM Big Data & Analytics Referenz-Architektur
Komponenten – Fokus
11
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
In-Memory „Big Data“ Processing - Apache Spark
IBM Investing in Four Catalysts for Big Data Adoption
12
Open Source Innovation
Technical Standards
Familiar Interfaces & Integration
with Established Tools
New Analytics Capabilities
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
In-Memory „Big Data“ Processing - Apache
…potentially the most significant open source project of the next decade.
To further accelerate open source innovation for the
Spark ecosystem, IBM is taking the following actions:
 IBM will build Spark into the core of the company’s analytics
and commerce platforms.
 IBM's Watson Health Cloud will leverage Spark as a key under-
pinning for its insight platform, helping to deliver faster time
to value for medical providers and researchers as they access
new analytics around population health data.
 IBM will open source its breakthrough IBM SystemML machine
learning technology and collaborate with Databricks to advance
Spark’s machine learning capabilities.
 IBM will offer Spark as a Cloud service on IBM Bluemix to make it possible for
app developers to quickly load data, model it, and derive the predictive
artifact to use in their app.
 IBM will commit more than 3,500 researchers and developers to work on
Spark-related projects […], and open a Spark Technology Center in San
Francisco […].
 IBM will educate more than 1 million data scientists and data engineers on
13
Spark through extensive partnerships with AMPLab, DataCamp, MetiStream,
Galvanize and BigData University MOOC.
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
In-Memory „Big Data“ Processing - Apache Spark
The Combination: The Flexibility of Spark on a Stable Hadoop Platform
14
Unlimited Scale
Ease of Development
In-Memory Performance
Enterprise Platform
Wide Range of
Data Formats
Combine Workflows
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
In-Memory „Big Data“ Processing - Apache Spark
Die wichtigsten Bewohner des „Hadoop-Zoos”
Quelle: Uwe Seiler: Zoo voller Gehege, in: iX Developer Big Data, 2/2015, S.37
15
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
In-Memory „Big Data“ Processing - Apache Spark
What’s the scoop with Hadoop?
Der Link zum Video findet sich im Anhang dieser Präsentation.
16
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
In-Memory „Big Data“ Processing - Apache Spark
Spark Libraries
Spark SQL
Spark
Streaming
GraphX
MLlib
SparkR
Apache Spark
17
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
In-Memory „Big Data“ Processing - Apache Spark
Spark on Hadoop
Spark SQL
Spark
Streaming
GraphX
MLlib
SparkR
Apache Spark
Slave node 1
18
Apache Hadoop-YARN
Resource
management
Apache Hadoop-HDFS
Storage
management
Slave node 2
…
Slave node n
Compute
layer
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
In-Memory „Big Data“ Processing - Apache Spark
IBM Open Platform with Apache Hadoop

100% open source code
 Commitment to currency: “days, not months”
 Includes Spark

Free for production use
 Decoupled Apache Hadoop from IBM analytics and data science technologies
 Production support offering available
IBM Open Platform with Apache Hadoop
HDFS
MapReduce
Spark
Hive
HCatalog
Pig
YARN
Ambari
HBase
Flume
Sqoop
Solr/Lucene
Apache Open Source Components
19
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
In-Memory „Big Data“ Processing - Apache Spark
IBM is Committed to Open Source

Open source technologies are the base for IBM software and solutions

IBM’s long history of deep open source commitment





Apache Software Foundation: Founding member in 1999
Cloud Foundry: #1 contributor; Basis for Bluemix
OpenStack: #4 contributor; Basis for IBM’s IaaS
Linux: #3 contributor; IBM first enterprise backer of Linux
Hadoop/Spark: Extensive investment in open source contribution; Integration
with Analytics software
Application
Systems
Infrastructure
20
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
In-Memory „Big Data“ Processing - Apache Spark
IBM BigInsights for Apache Hadoop
for Apache Hadoop
IBM BigInsights
Data Scientist
Text Analytics
IBM BigInsights
Analyst
Machine Learning
with Big R
IBM BigInsights
Enterprise Management
Big R
POSIX Distributed File System
Big SQL
Big SQL
BigSheets
BigSheets
Multi-workload, Multi-tenant
scheduling
IBM Open Platform with Apache Hadoop
21
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Link: http://www.sidap.de
22
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: SIDAP
Echtzeit-Betriebsanalyse – Betrieb auf Basis globaler Erfahrung
Otto von Bismarck, Winston Churchill:
„Der Weise lernt von den Fehlern
der Anderen, der Dumme aus den
Eigenen”
Device: Gerät, Apparat, Einrichtung,
Verfahren, Einheit
23
Quelle: Dr. T. Pötter, Dr. M. Steffen (beide Bayer Technology Services)
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: SIDAP
Echtzeit-Betriebsanalyse – Betrieb auf Basis globaler Erfahrung
Quelle: Dr. T. Pötter, Dr. M. Steffen (beide Bayer Technology Services)
24
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: SIDAP
Echtzeit-Betriebsanalyse – Betrieb auf Basis globaler Erfahrung
Quelle: Dr. T. Pötter, Dr. M. Steffen (beide Bayer Technology Services)
25
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: SIDAP
Echtzeit-Betriebsanalyse – Betrieb auf Basis globaler Erfahrung
Die Qualität der
Vorhersagen von
Fehlern wächst
mit der Anzahl
der Einträge in
die Datenbasis.
Quelle: Dr. T. Pötter, Dr. M. Steffen (beide Bayer Technology Services)
26
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: SIDAP
Datenaggregation im Rahmen von SIDAP
Quelle: Dr. T. Pötter, Dr. M. Steffen (beide Bayer Technology Services)
27
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: SIDAP
Use-Cases
28
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: SIDAP
Use-Cases und Aufgabenbereiche der Partner
29
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: SIDAP
Echtzeit-Betriebsanalyse (Operations Analysis)
Raw Logs and Machine Data
Real-time Monitoring
InfoSphere
Streams
Capture Data Stream
SPSS
Modeler
Identify Anomaly
Decision
Management
Historical Reporting and Analysis
InfoSphere
BigInsights
Raw Data
SPSS
Modeler
Aggregate Results
Dashboard/BI
Predict and Classify
Data Warehouse
SPSS
Modeler
Predict and Score
30
Store Results
Federated
Navigation
and Discovery
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
IBM’s Internet of Things Foundation






Secure Device Registration
Scalable Device Connectivity
Historian
Visual wiring
PAYG SaaS pricing
Powered by IBM MessageSight technology
Manage
Assemble
Collect
Connect
31
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
IoT becomes a Composable Business
IoT end-end solutions
Connected appliance solutions, Smarter home solutions…
IoT-related Bluemix services
Rules, Push, Geo location, Analytics, Asset management, Predictive Maintenance…
App tips open
community
IoT Foundation
Secure Device Registration, Scalable Device Connectivity, Historian, Visual wiring
Device recipe
open
community
Devices & Gateways
32
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
IBM Bluemix – How to build a smarter App
Der Link zum Video findet sich im Anhang dieser Präsentation.
33
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
IBM Internet of Things Foundation
34
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
IBM Internet of Things Foundation – Status Quo
35
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
IBM Internet of Things Foundation – Status Quo
36
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
IBM Internet of Things Foundation – Status Quo
37
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
IBM Internet of Things Foundation – Status Quo
38
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
IBM Internet of Things Foundation – Status Quo
39
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
Device Recipes make it faster
Connect
40
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
Connect
Application Connection
 Publish the same data
to many applications
with MQTT
 Visualisation on app
or Brower based
interface
App on Bluemix
 Access control via
Application Registration
& Secure Token
 Compose with other IoT
Services in Bluemix
 e.g. HD Insights,
Notification services,
Twitter analytics
41
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
Beispiel
42
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
IBM Smarter Analytics für „Big Data”
Exkurs: Internet of Things
Beispiel
43
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Empfehlung Ia
IBM Bluemix
Link:
http://www.ibm.com/bluemix
44
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Empfehlung Ib
IBM Internet of Things Zone auf Bluemix
IBM Internet of Things
Foundation
IoT Zone in Bluemix
ibm.biz/try_iot
oder
https://bluemix.net/solutions/iot
45
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Empfehlung II
IBM Watson Analytics
Link:
http://www.ibm.co
m/analytics/watson
-analytics/
46
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Empfehlung III
„Big Data“ University
Link:
http://bigdatauniversity.co
m/
47
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Dipl. Wirtsch.-Ing.
Axel J. Schwarz
Mobik: +49 171 5619419
E-Mail: [email protected]
Software Client Architect
Vielen Dank!
48
© 2015 IBM Corporation
IBM Smarter Analytics für „Big Data”
Quellen und Referenzen
Literatur/Blog-Beiträge
 Ryan Baxter: Bluemix and the Internet of Things
http://ryanjbaxter.com/2014/07/16/bluemix-and-the-internet-of-things/
 Bernard Marr: Spark Or Hadoop -- Which Is The Best Big Data Framework?
http://www.forbes.com/sites/bernardmarr/2015/06/22/spark-or-hadoop-which-is-the-best-big-data-framework/2/
 Paul Miller - IBM Backs Apache Spark For Big Data Analytics
http://www.forbes.com/sites/paulmiller/2015/06/15/ibm-backs-apache-spark-for-big-data-analytics/
Videos




IBM IBM Big Data: How it works - https://www.youtube.com/watch?v=u5jWC89xBzI
What’s the Scoop with Hadoop? - https://www.youtube.com/watch?v=BcGqiJXcDo8
How to build a smarter app - https://www.youtube.com/watch?v=E5TKNbWl2Co
Learn how to create a mobile app quickly using IBM BlueMix! - https://www.youtube.com/watch?v=DbXN2mz90b0
Links
 IBM Emerging Technology - Node-RED: http://nodered.org
 IBM developerWorks - IoT: https://www.ibmdw.net/iot/
 IBM developerWorks - Build a cloud-ready temperature sensor with the Arduino Uno and the IBM IoT Foundation:
http://www.ibm.com/developerworks/cloud/library/cl-bluemix-arduino-iot1/index.html Bluemi
Bildquellen
 IBM sowie lizenzfreie digitale Bilder der Medien-Bibliotheken: 123RF (http://www.123rf.com) und Stock.XCHNG
(http:///www.sxc.hu).
49
© 2015 IBM Corporation
Herunterladen