Technical Implementation of CERA

Hannes Thiemann
Max-Planck-Institut für Meteorologie, Modelle und Daten
hannes.thiemann@zmaw.de
Jena, 24 January 2007

Contents
• Task and motivation
• Implementation
  • Databases
  • Connection to the HSM
• Outlook

Climate system

Climate model: grid

Climate model: resolution
• T42 (300 km)
• T106 (120 km)

Data volumes
Horizontal resolution of the climate model
• T42: 128 * 64 = 8192 points per global field
• T106: 320 * 160 = 51200 points per global field
Required storage units (GRIB format)
• Horizontal field (access unit): 17.1 kB (T42) / 100.1 kB (T106)
• Unix file size for monthly accumulated results with a 6-hour storage interval and 300 2D variables (physical unit): 616 MB (T42) / 3500 MB (T106)
• 240 years of model integration (logical unit): 1.7 TB (T42) / 10 TB (T106)

Implementation: Databases

"The Winter TopTen Program identifies the world's largest and most heavily used databases. ... Congratulations on achieving Grand Prize award winner status (1) in Database Size, Other, All and TopTen Winner status Database Size, Other, Linux; Workload, Other, Linux in Winter Corp.'s 2005 TopTen Program! ... (1) Grand prizes are awarded for first place winners in the All Environments categories only."

WDCC's CERA DB has been identified as the largest Linux DB.

Wintercorp (2005) - DB Size: Scientific, Archive, and other

Company       Size (TB)   DBMS          Platform              System Vendor
Max-Planck    222         Oracle        Federated/SMP         NEC
USGS/EROS     17          Oracle        Centralized/SMP       Sun
USGS/EROS     17          Oracle        Centralized/SMP       Sun
HP            1           NonStop SQL   Centralized/MPP       HP
T-Systems     1           Oracle RAC    Centralized/Cluster   Sun

See: www.wintercorp.com

Wintercorp (2005) - DB Size: Data Warehouse

Company       Size (TB)   DBMS          Platform              System Vendor
Yahoo         100         Oracle        Centralized/SMP       Fujitsu Siemens
AT&T 1)       94          Daytona       Federated/SMP         HP
KT IT-Group   50          DB2           Centralized/Cluster   IBM
LGR           25          Oracle        Centralized/SMP       HP
Amazon        25          Oracle RAC    Centralized/Cluster   HP

1) 330 GB normalized data volume

See: www.wintercorp.com

CERA: Some Facts
• Oracle 9.2 single instance running on TX7
  • Enterprise Edition
  • Partitioning Option
  • Advanced Security
• 24 Tbyte disk attached to database nodes
• Database size ~260 Tbyte (logical)
• Database nodes connected to HSM system
• Data accessible on the internet
• 800 named users worldwide
• Daily access 300 GB/day (average)
• New data 250 GB/day (average)

[Figure: data flow from the climate model into the Oracle database, © NEC Corporation. Labelled components: Users, Oracle Application Server, Climate Model, SX-6, Post Processing System (GFS client), Post Process Application, OCI Application, GFS servers, Oracle instances, AsAmA 16-way and 4-way nodes (DXDM, DXSM, DXSN, DXDB), local disk, DiskXtender (migration and staging, disk cache), Oracle DB BLOB, GE network.]
Steps shown in the figure:
1. Climate model writes raw output (GFS I/O)
2. PP reads raw data (GFS I/O); PP writes data (local I/O)
3. OCI reads data (local I/O)
4. OCI writes BLOB (via networks)
5. Data inquiry (OCI)

WDCC Data Topology
• Level 1 interface: metadata entries (XML, ASCII) + data files
• Level 2 interface: separate files containing BLOB table data in an application-adapted structure (time series of single variables)
[Figure: an experiment description with a pointer to Unix files; dataset descriptions 1 to n, each with its BLOB data table.]
A BLOB DB table corresponds to a scalable, virtual file at the operating system level.

Databases: partitioning
[Figure: the system is split into three databases: OID (Enterprise User Security), data, and metadata. Labels in the figure: 100,000 tables; 800 GB; metadata blocks Entry, Reference, Status, Distribution, Contact, Data Org, Local Adm., Data Access, Coverage, Parameter, Spatial Reference.]

Data matrix of model experiment
[Figure: data matrix with the model variables as columns (2D variables: T2M, Precip, SLP, ...; 3D variables: Temp, water vapour, ...) and the time steps T1, T2, T3, ..., Tn as rows.]
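The Level 2 structure (time series of single variables) can be illustrated with a small regrouping step. This is only a sketch under assumptions: the slides do not show the loader, so the record layout `(time_step, variable, field)` and the function name `to_time_series` are hypothetical, and the fields are stand-in byte strings rather than real GRIB records.

```python
from collections import defaultdict


def to_time_series(raw_records):
    """Regroup raw model output into one time-ordered series per variable.

    raw_records: iterable of (time_step, variable, field) tuples, where
    `field` stands in for one horizontal field (one BLOB).
    Returns a dict mapping each variable to a list of (time_step, field),
    i.e. one "column" of the data matrix per variable.
    """
    series = defaultdict(list)
    # Sort by time step only; fields of the same step keep their input order.
    for time_step, variable, field in sorted(raw_records, key=lambda r: r[0]):
        series[variable].append((time_step, field))
    return dict(series)


if __name__ == "__main__":
    # Hypothetical raw output: all variables of one time step arrive together.
    raw = [
        (1, "T2M", b"t2m-step1"),
        (1, "SLP", b"slp-step1"),
        (2, "T2M", b"t2m-step2"),
        (2, "SLP", b"slp-step2"),
    ]
    for variable, column in to_time_series(raw).items():
        print(variable, [step for step, _ in column])
```

Each resulting series corresponds to one column of the data matrix, which is the unit a user typically asks for when downloading a single variable over a long period.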
• Raw data file in the DKRZ archive: direct model output (0.7 to 16.2 GB)
• Axes of the matrix: model run time (up to Tend) and model variables
• 2D: small BLOBs (16 KB)
• 3D: large BLOBs (3 MB)
• Each column is one BLOB table and one META table in the CERA DB

Structure of metadata tables
Metadata table columns: Blob_id, Blob_size, Start_date, Blob_min, Blob_max, Blob_mean
The metadata hold the information needed to
• answer simple queries without accessing the data itself,
• check consistency with the data itself,
• carry out quality control.
The metadata tables are kept on disk.
The metadata allow the blob_id to be mapped to the actual model time.

Structure of blob tables
Range partitioning
BLOB data table columns: blob_id, blob_data

Table partition   Datafile   blob_id      Time
Partition 1       1          1 .. n       t0 .. tn
Partition 2       2          n+1 .. m     tn+1 .. tm
…                 …          …            …
Partition n       n          m+1 .. k     tm+1 .. tk

Implementation: HSM

Connection to the HSM
[Figure: table partitions (Tbl Partition 1 ... Tbl Partition 21) in read-write (TBS - RW) and read-only (TBS - RO) tablespaces, with migout to and migin from dxdb.]
All tablespaces are moved "at once" to dxdb.

Migout / Migin
• Migout takes place after files haven't been modified for x minutes.
• Only one migout process runs per dxdb filesystem.
• Migin takes place immediately after a file is requested.
• Only the parts accessed are retrieved from the backend storage.
• One migin process runs per requested file.
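Because migin is triggered per requested file, it helps to see how a blob_id resolves to exactly one partition and therefore one datafile. The following is a minimal sketch of that lookup, not the Oracle mechanism itself: the partition names, bounds and file names are invented for illustration, and the bounds are treated as inclusive to match the slide (1 .. n, n+1 .. m), whereas Oracle range partitions are declared with exclusive upper bounds.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    name: str
    max_blob_id: int  # inclusive upper bound of the blob_id range
    datafile: str     # hypothetical datafile name


# Hypothetical layout: three partitions of 1000 blobs each, sorted by bound.
PARTITIONS = [
    Partition("partition_1", 1000, "blob_part_1.dbf"),
    Partition("partition_2", 2000, "blob_part_2.dbf"),
    Partition("partition_3", 3000, "blob_part_3.dbf"),
]


def find_partition(blob_id, partitions=PARTITIONS):
    """Return the partition whose blob_id range contains blob_id."""
    if blob_id < 1:
        raise ValueError("blob_id starts at 1")
    bounds = [p.max_blob_id for p in partitions]
    # First partition whose inclusive upper bound is >= blob_id.
    index = bisect_left(bounds, blob_id)
    if index == len(partitions):
        raise ValueError(f"blob_id {blob_id} is beyond the last partition")
    return partitions[index]


if __name__ == "__main__":
    for blob_id in (1, 1000, 1001, 2500):
        part = find_partition(blob_id)
        print(blob_id, part.name, part.datafile)
```

Since blob_id increases with model time, a request for a contiguous time range touches a contiguous run of partitions, so only those datafiles need to be staged back from the HSM.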
Purging
[Figure: dxdb with high-water mark (HWM) and low-water mark (LWM).]
Criteria for purging
• Size of datafiles doesn't matter
  • Except: "small" datafiles can stay on disk
• Time not modified (easy for read-only tablespaces)
• Time not touched
  • Oracle has a tendency to touch datafiles quite often
  • The Oracle parameter read_only_open_delayed could be an option
• Prerequisite: 2 copies on tape

Inside the datafile
[Figure: layout of a datafile: Header (128k), Table, Lob Index, Primary Key, Blob data.]

Frontend versus backend
[Figure: the datafile as seen in the frontend filesystem (Header 128k) and as stored in the HSM backend (Header 128k, Part 1 = 512 MB, Part 2 = 512 MB, ...).]

Retrieving data
[Figure: tape request for a datafile consisting of the 128k header and parts labelled 3, 1, 2, 4, 5.]

Usage: Downloads
[Figure: downloads per year, 1999 to 2006; vertical axis from 0 to 900000.]

Statistics: Size
[Figure: database size in TByte by year, 1998 to 2006; vertical axis from 0 to 300.]

Outlook: global model T213 (atmosphere)
Horizontal resolution of the climate model
• T213: 640 * 320 = 204800 points per global field
• T106: 320 * 160 = 51200 points per global field
Required storage units (GRIB format)
• Horizontal field (access unit): 400.1 kB (T213) / 100.1 kB (T106)
• Unix file size for monthly accumulated results with a 6-hour storage interval and 300 2D variables (physical unit): 14000 MB (T213) / 3500 MB (T106)
• 240 years of model integration (logical unit): 40 TB (T213) / 10 TB (T106)

Outlook: regional model
Resolution and data volumes
REMO-UBA model domain
• Resolution: 10x10 km
• Data volume: 5 TB / 100 years (surface fields only)
[Figure: orography.]

Thank you!
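The storage figures quoted for T42, T106 and T213 can be roughly reproduced from the GRIB field size, the 6-hour storage interval and the 300 variables. The sketch below does this under two assumptions that the slides do not state: 30-day months and decimal units (1 MB = 1000 kB, 1 TB = 1,000,000 MB). The function names are illustrative.

```python
# Values taken from the slides.
STEPS_PER_DAY = 4   # 6-hour storage interval
VARIABLES = 300     # 2D variables stored per time step
YEARS = 240         # length of the model integration

# Assumptions, not stated on the slides.
DAYS_PER_MONTH = 30
KB_PER_MB = 1000
MB_PER_TB = 1_000_000

RESOLUTIONS = {
    # name: (longitudes, latitudes, GRIB field size in kB)
    "T42": (128, 64, 17.1),
    "T106": (320, 160, 100.1),
    "T213": (640, 320, 400.1),
}


def points_per_field(nlon, nlat):
    """Number of grid points in one global horizontal field."""
    return nlon * nlat


def monthly_file_mb(field_kb):
    """Size in MB of one monthly file holding all fields of all variables."""
    fields_per_month = STEPS_PER_DAY * DAYS_PER_MONTH * VARIABLES
    return field_kb * fields_per_month / KB_PER_MB


def experiment_tb(field_kb, years=YEARS):
    """Size in TB of a whole model integration of `years` years."""
    return monthly_file_mb(field_kb) * 12 * years / MB_PER_TB


if __name__ == "__main__":
    for name, (nlon, nlat, field_kb) in RESOLUTIONS.items():
        print(
            f"{name}: {points_per_field(nlon, nlat)} points, "
            f"{monthly_file_mb(field_kb):.0f} MB per month, "
            f"{experiment_tb(field_kb):.1f} TB per {YEARS} years"
        )
```

With these assumptions the T42 result comes out close to the 616 MB and 1.7 TB on the slides, while T106 and T213 land slightly above the rounded 3500 MB / 10 TB and 14000 MB / 40 TB.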