select fun, profit from real_world where rational = false

Teaching Software Engineering

db4o: Great stuff!
• 20 years with relational databases
• PostgreSQL, SQLite, Oracle ...

2006
2004 Really great stuff!
2006 Sorry. No books available!
2008 nosqlberlin.de, nosqlfrankfurt.de, nosql powerdays
2010

Oracle, IBM, etc. want in too… also non-relational

And don't forget…
• Easy installation / handling + fun!
• Scale-out!
• Web-Scale = history
• Open source: only indirect financial interests
• Schema free
• Strange Loop Pattern: Design for Crash!
• Replication fun? More on that later…

nosqltapes.com !

NoSQL is specialization!

NoSQL foundations: Parallelization Contracts
Stonebraker: "A giant step back! Incompatible, missing features, not new, …"
Strong competition: Stratosphere (TUB), ePic, SwissBox, etc.
compile, analyze, optimize on a "breathing" (elastic) cloud!

Eventually Consistent

Consistency Models
ACID
BASE
• Amazon Dynamo
• MySQL replication

© Wilfried Springer
NoSQL Rollercoaster

CAP Theorem: Pick 2!
"Don't throw C away so easily! It's complex."
• Availability: system is always "on", clients find replicas
• Consistency: clients see equal data
• Partition Tolerance
Classic: ACID / Isolation
NoSQL

What you really have is:
1. Application errors
2. Repeatable DBMS errors
3. Unrepeatable DBMS errors
4. Operating system errors
5. Hardware failure in cluster
6. Network partition in local cluster
7. A disaster
8. WAN failure
• 6 = network partition is rare
• 3, 4, 5, 6 mostly affect a single node
• Algorithms can help!
"Give up P rather than sacrificing C. Use VoltDB or NimbusDB"
Pessimistic locking?

Consistent Hashing
M:[0,5)  N:[5,10)  Q:[20,25)  R:[25,30)
O:[10,15)  P:[15,20)

HASH  NODE  REPLICAS
2     M     N,O
8     N     O,P
10    O     P,Q
17    P     Q,R
22    Q     R,M
26    R     M,N

W = 2*W   R = 1*R
• fail-safe
• easy to extend
• well distributed / vnodes

[Figure: Anna (A), Laura (L) and Paul (P) exchange the choices "running" and "surfing"; each message carries per-person counters such as A:1 L:1, and the exchange ends with surfing => P:2 A:1 L:2]

Google Protocol Buffers
• automatic RPC generation

=> Column Family
DocumentDBs
Key/ValueDBs: Voldemort, Chordless, Scalaris, Dynamo / Dynomite
GraphDBs
Others: db4o, Versant, Objectivity, Gemstone, Progress, Mark Logic, EMC Momentum, Tamino, GigaSpaces, Hazelcast, Terracotta, …

Wide Column Stores / Column Families

+ scaling = new node
+ community
+ API
- replication
- setup, tuning, maintenance

+ stress-free SaaS solution
+ transparent scaling
- UTF-8 string
- data is stored at Amazon
- no tuning / config

+ scaling = new node
+ replication
+ configuration (r, w)
- documentation
- queries
- (storage-conf.xml)

Document DBs
Views
+ memory mapped, indexes, queries, marketing
- durability, single instance design
1 URL replication
67 GB: 2 days off

NoSQL divergence: CouchDB = NoNoSQL

K/V-Stores
+ very fast: > 100,000 /sec
+ configurable disk sync
NoSQL convergence:
+ API for custom integration
+ simple replication
Data Structure Server ->
+ hash, list, set, sorted set, messages
+ installation: UNIX 38 sec, Windows 18 sec
- not yet scalable (2.*)

Property Graph
Neo4j, Sones, HyperGraphDB, InfiniteGraph, InfoGrid, Dex, VertexDB, Filament, OrientDB

X DB
DataModel, Query Methods, Languages, License, Protocol
Key/ValueDB + DocumentDB + ObjectDB + GraphDB
+ faster…

Decision makers
• > 220 DBs
• 10-page analysis
• asking for help
• gut decision
• non-functional requirement!

NoSQL Consulting
Start-ups ☺
Established companies
1. Data
2. Transactions
3. Performance
4. Queries
5. Architecture
6.
Other non-functional requirements

Graph

Analyse your Data
Domain data, log data, event data, message data, critical data, business data, metadata, temp data, session data, geo data, etc.
Data / storage model: NoSQL, relational, column-oriented, document-like, graphs, objects, etc.
What types / type system?
Data navigation, data amount, data complexity (deep XML?)
ACID vs. BASE vs. mixture? CAP decisions

Lessons learned

Performance Dimension Analysis
Latency, request behaviour, throughput
Scale-up vs. scale-out

Query Requirements
Typical queries, tools, ad-hoc queries, SQL / LINQ needed, Map/Reduce? …

Distribution Architecture
local, parallel, distributed / grid, service, cloud, mobile, p2p, …

Data Access Patterns
read / write distribution, random / sequential, access design patterns

Non-Functional Requirements:
Replication, refactoring frequency, DB support, qualification / simplicity, company restrictions, DB diversity (allowed?), security, safety / backup & restore, crash resistance, licence…

RAM + SSD rocks
Think outside the MyOracleSql box
RethinkDB, VoltDB, Vertica, GenieDB, MonetDB, Hadoop++ ...
Lots of >1 PT RAM DBs in CA!
Queries in FPGA!
Tons of clever hybrid solutions out there!

The world is diverse! Act accordingly!
OO model! Document! Relational & SQL! Excel! Map & Reduce! Coffee? XML. Tuple! Graphs! Key-Value! NoSQL, DaaS
=> best mix!

http://edlich.de
This talk is supported by …they want you for Cloud Computing!

Polyglot Persistence
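
Appendix sketch for the Consistent Hashing slide: the ring ranges M:[0,5) through R:[25,30) and the hash / node / replica table can be reproduced with a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any particular database. The names `SLIDE_RING`, `placement` and the extra node `X` are hypothetical. The hash values from the slide are used directly instead of running a real hash function, and vnodes are not modelled. The rule "replicas are the next two nodes clockwise" is read off the slide's table.

```python
from bisect import bisect_right

RING_SIZE = 30  # hash space [0, 30), as on the slide

# Range start -> node, copied from the slide: M:[0,5), N:[5,10), ...
SLIDE_RING = {0: "M", 5: "N", 10: "O", 15: "P", 20: "Q", 25: "R"}


def placement(ring, hash_value, replicas=2):
    """Return (owner, replica_list) for a hash value on the given ring."""
    starts = sorted(ring)
    # The owner is the node whose range start is the largest one <= hash value.
    # An index of -1 wraps around to the last node on the ring.
    i = bisect_right(starts, hash_value % RING_SIZE) - 1
    # Replicas are the next nodes clockwise after the owner.
    nodes = [ring[starts[(i + k) % len(starts)]] for k in range(replicas + 1)]
    return nodes[0], nodes[1:]


if __name__ == "__main__":
    # Reproduce the slide's table for the hash values 2, 8, 10, 17, 22, 26.
    for h in (2, 8, 10, 17, 22, 26):
        owner, reps = placement(SLIDE_RING, h)
        print(h, owner, ",".join(reps))

    # "Easy to extend": adding a hypothetical node X at position 12
    # only moves the keys in [12, 15) away from their previous owner O.
    bigger = dict(SLIDE_RING)
    bigger[12] = "X"
    moved = [h for h in range(RING_SIZE)
             if placement(SLIDE_RING, h)[0] != placement(bigger, h)[0]]
    print("keys that changed owner:", moved)
```

The second half of the demo is the reason the slide lists "easy to extend": when a node joins, only the keys of one neighbouring range change owner, while a plain `hash mod number_of_nodes` scheme would remap most keys.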