Why NoSQL?
Hinnerk Haardt <[email protected]>

Not only SQL

SQL
Structured Query Language
Programming language for relational databases

Why?
Blame the Internet!

1980s: data bank

ACID
• Atomicity: all or nothing
• Consistency: guarantees integrity
• Isolation: shields concurrent transactions from one another
• Durability: all changes persist

»Large« databases scale vertically
[Diagram: RAM/CPU/Storage machines; axes: more expensive →, larger →]

21st century

Example: Facebook
• 30,000 servers
• 25 terabytes of log data per day
• 300,000,000 users
• 230 engineers

Blame the Internet.

Horizontal
[Diagram: grid of RAM/CPU/Storage nodes; labels: more data →, horizontal scaling, vertical scaling, more throughput & higher availability →]
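To make the horizontal scaling slide concrete, here is a minimal sketch of how keys might be spread across several small nodes instead of one large machine. The names `shard_for` and `ShardedStore` are hypothetical and each "node" is just an in-memory dict; this illustrates the idea only and is not how any of the databases named in this talk is implemented.

```python
import hashlib


def shard_for(key: str, num_nodes: int) -> int:
    """Map a key to a node index using a stable hash (assumption: modulo sharding)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


class ShardedStore:
    """Toy key-value store; every node is a plain dict standing in for a server."""

    def __init__(self, num_nodes: int) -> None:
        self.nodes = [dict() for _ in range(num_nodes)]

    def put(self, key: str, value) -> None:
        self.nodes[shard_for(key, len(self.nodes))][key] = value

    def get(self, key: str):
        return self.nodes[shard_for(key, len(self.nodes))].get(key)


store = ShardedStore(num_nodes=4)
for i in range(1000):
    store.put(f"user:{i}", {"id": i})

# Each node holds only a part of the data set.
sizes = [len(node) for node in store.nodes]
```

Plain modulo sharding moves most keys to a different node whenever the node count changes. Systems such as Amazon's Dynamo use consistent hashing to limit that reshuffling.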
Availability
Safety (ACID)

Availability
Unlimited growth

CAP theorem
• Consistency
• Availability
• Partition Tolerance

»in larger distributed-scale systems, network partitions are a given; therefore, consistency and availability cannot be achieved at the same time«
Werner Vogels, Amazon.com

2009: NoSQL

Definition…
»Group of non-conventional databases«

Welcome to the zoo!
• CouchDB
• MongoDB
• Redis
• MemcacheDB
• Tokyo Cabinet
• Google BigTable
• Amazon Dynamo
• Apache Cassandra
• Project Voldemort
• Mnesia (Erlang)
• HBase (Apache Hadoop)
• Hypertable
• Twitter Gizzard

no ACID
limited transactions
no »JOIN«
no SQL
simple to access
schema-free
scales horizontally
replication
eventual consistency
probabilistic worldview

Amazon's Dynamo
• »applications have received successful responses […] for 99.9995% of its requests«
• »no data loss event has occurred to date«
[2007, after 2 years of operation]

Event »ACID vs. BASE«
• more on NoSQL, ACID, BASE and the CAP theorem
• Objekt-Stammtisch Kiel
• 27.04.2010, 19:00
• Eckernförder Straße 20, Kiel (Toppoint)

References
• Bigtable: A Distributed Storage System for Structured Data
• Eventually Consistent - Revisited
• Keynote address to the PODC conference in 2000 by Eric Brewer
• Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services
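Backup: eventual consistency in a few lines. The following is a minimal sketch of two replicas that accept writes independently and converge once they exchange state. The `Replica` class and the integer `version` with a last-write-wins rule are assumptions made for this illustration; Dynamo itself tracks versions with vector clocks.

```python
class Replica:
    """Toy replica: stores key -> (version, value) and keeps the highest version."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data = {}

    def write(self, key: str, value, version: int) -> None:
        current = self.data.get(key)
        # Last write wins: only accept strictly newer versions (assumption).
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key: str):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other: "Replica") -> None:
        """Pull every entry from another replica (simple anti-entropy pass)."""
        for key, (version, value) in other.data.items():
            self.write(key, value, version)


a, b = Replica("a"), Replica("b")

a.write("cart", ["book"], version=1)
stale = b.read("cart")   # b has not heard of the write yet

b.sync_from(a)           # replication catches up
a.sync_from(b)
fresh = b.read("cart")   # both replicas now agree
```

Between the write and the sync, a reader on replica `b` sees outdated data. That window is the price paid for staying available while the replicas cannot reach each other.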