Technical Implementation of
CERA
Hannes Thiemann
Max-Planck-Institut für Meteorologie
Modelle und Daten
hannes.thiemann @ zmaw.de
Jena, 24 January 2007
Contents
• Task and motivation
• Implementation: databases
• Connection to the HSM
• Outlook
Climate System
Climate Model: Grid
Climate Model: Resolution
T42 (300 km)
T106 (120 km)
Data Volumes
Horizontal resolution of the climate model
• T42: 128 * 64 = 8192 points per global field
• T106: 320 * 160 = 51200 points per global field
Required storage units (GRIB format)
• Horizontal field (access unit): 17.1 kB (T42) / 100.1 kB (T106)
• Unix file size for monthly accumulated results with a 6-hour storage interval and 300 2D variables (physical unit): 616 MB (T42) / 3500 MB (T106)
• 240 years of model integration (logical unit): 1.7 TB (T42) / 10 TB (T106)
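The volumes on this slide follow from simple arithmetic on the field size. The sketch below is a rough check, assuming 30-day months and decimal units (1 MB = 1000 kB, 1 TB = 10^6 MB); neither convention is stated on the slide, and the function names are illustrative only.

```python
# Back-of-the-envelope check of the data volumes quoted on the slide.
# Assumptions (not stated in the source): 30-day months, decimal units.

def monthly_file_mb(field_kb, variables=300, interval_hours=6, days=30):
    """Size in MB of one monthly file holding all 2D fields."""
    fields_per_month = variables * (24 // interval_hours) * days
    return fields_per_month * field_kb / 1000.0

def experiment_tb(monthly_mb, years=240):
    """Size in TB of a complete model integration."""
    return monthly_mb * 12 * years / 1e6

for name, field_kb in [("T42", 17.1), ("T106", 100.1)]:
    month = monthly_file_mb(field_kb)
    print(f"{name}: {month:.0f} MB per month, "
          f"{experiment_tb(month):.1f} TB per 240 years")
```

Under these assumptions the results agree with the slide's rounded figures to within about 5 percent; the remaining difference probably comes from rounding and unit conventions.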

Implementation: Databases
The Winter TopTen Program
identifies the world’s largest and
most heavily used databases.
"... Congratulations on achieving Grand Prize award winner status (1) in Database Size, Other, All and TopTen Winner status Database Size, Other, Linux; Workload, Other, Linux in Winter Corp.'s 2005 TopTen Program! ..."
(1) Grand prizes are awarded for first place winners in the All Environments categories only.
WDCC's CERA DB has been identified as
the largest Linux DB.
Wintercorp (2005) - DB Size: Scientific, Archive, and other
Company      Size (TB)  DBMS         Platform             System Vendor
Max-Planck   222        Oracle       Federated/SMP        NEC
USGS/EROS    17         Oracle       Centralized/SMP      Sun
USGS/EROS    17         Oracle       Centralized/SMP      Sun
HP           1          NonStop SQL  Centralized/MPP      HP
T-Systems    1          Oracle RAC   Centralized/Cluster  Sun
See: www.wintercorp.com
Wintercorp (2005) - DB Size: Data Warehouse
Company      Size (TB)  DBMS        Platform             System Vendor
Yahoo        100        Oracle      Centralized/SMP      Fujitsu Siemens
AT&T 1)      94         Daytona     Federated/SMP        HP
KT IT-Group  50         DB2         Centralized/Cluster  IBM
LGR          25         Oracle      Centralized/SMP      HP
Amazon       25         Oracle RAC  Centralized/Cluster  HP
1) 330 GB Norm. Data Volume
See: www.wintercorp.com
CERA: Some Facts
• Oracle 9.2 single instance running on TX7
  - Enterprise Edition
  - Partitioning Option
  - Advanced Security
• 24 TByte disk attached to database nodes
• Database size ~260 TByte (logical)
• Database nodes connected to HSM system
• Data accessible on the internet
• 800 named users worldwide
• Daily access: 300 GB/day (average)
• New data: 250 GB/day (average)
[Figure: CERA system architecture (© NEC Corporation). Labels: Users; Oracle Application Server (Oracle AS); Climate Model on SX-6; GFS environment with GFS/Server and GFS/Client; post-processing system; OCI application; Oracle instances with META + data on local disk; AsAmA 16-way and 4-way nodes; DXDM, DXSM, DXSN, DXDB; DiskXtender disk cache for migration and staging; Oracle DB BLOB; GE network.]
Data flow:
1. Climate model writes raw output (GFS I/O)
2. Post-processing (PP) reads raw data (GFS I/O) and writes data (local I/O)
3. OCI reads data (local I/O)
4. OCI writes BLOB (via networks)
5. Data inquiry (OCI)
WDCC Data Topology
• Level 1 interface: metadata entries (XML, ASCII) + data files
  - Experiment description, with pointer to Unix files
• Level 2 interface: separate files containing BLOB table data in an application-adapted structure (time series of single variables)
  - Dataset 1 description ... Dataset n description, each with its BLOB data table
A BLOB DB table corresponds to a scalable, virtual file at the operating system level.
Databases: Layout
[Figure: the databases OID (Enterprise User Security), Data, and Metadata. Figure labels: 100,000 tables; 800 GB. Metadata blocks shown: Entry, Reference, Status, Distribution, Contact, Data Org., Local Adm., Data Access, Coverage, Parameter, Spatial Reference.]
Data matrix of model experiment
[Figure: matrix of model variables (2D variables: T2M, Precip, SLP, ...; 3D variables: Temp, Water vapour, ...) against model run time (T1, T2, T3, ..., Tn, ..., Tend). Label: raw data file in DKRZ archive.]
• 2D: small BLOBs (16 KB)
• 3D: large BLOBs (3 MB)
• Raw data file: direct model output (0.7 to 16.2 GB)
• Each column is one BLOB table and one META table in CERA-DB
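The data matrix implies a regrouping step: raw output is ordered by time, while the database holds one time series per variable. The following sketch illustrates that regrouping with plain Python containers; the input layout (one dict per time step) and the function name are assumptions made for illustration, not the actual CERA loader.

```python
from collections import defaultdict

def to_time_series(raw_steps):
    """Regroup raw output (ordered by time step) into one series per variable.

    raw_steps: list of dicts in time order, each mapping a variable name
               to that time step's field (bytes).
    Returns a dict: variable name -> list of (blob_id, blob) rows,
    where blob_id counts time steps starting at 1.
    """
    tables = defaultdict(list)
    for blob_id, step in enumerate(raw_steps, start=1):
        for variable, field in step.items():
            tables[variable].append((blob_id, field))
    return dict(tables)

# Hypothetical two-step example with two 2D variables.
raw = [
    {"T2M": b"\x01", "SLP": b"\x0a"},  # time step T1
    {"T2M": b"\x02", "SLP": b"\x0b"},  # time step T2
]
series = to_time_series(raw)
print(series["T2M"])
```

Each key of the result plays the role of one column of the matrix, that is, one BLOB table.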
Structure of metadata tables
Metadata table columns: Blob_id, Blob_size, Start_date, Blob_min, Blob_max, Blob_mean
Information used to:
• answer simple queries without accessing the data itself
• check consistency with the data itself
• perform quality control
The metadata tables are kept on disk.
The metadata map each blob_id to the actual model time.
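A small sketch of how such a metadata row could be derived from a blob and later used for a consistency check. The encoding of a blob as little-endian 32-bit floats is a stand-in chosen for illustration (the real data are in GRIB format), and all function names are hypothetical.

```python
import struct
from datetime import datetime

def decode(blob):
    """Decode a blob as little-endian 32-bit floats (illustrative encoding only)."""
    return struct.unpack(f"<{len(blob) // 4}f", blob)

def metadata_row(blob_id, blob, start_date):
    """Build one metadata row with the columns named on the slide."""
    values = decode(blob)
    return {
        "blob_id": blob_id,
        "blob_size": len(blob),          # assumed to be the size in bytes
        "start_date": start_date,
        "blob_min": min(values),
        "blob_max": max(values),
        "blob_mean": sum(values) / len(values),
    }

def is_consistent(row, blob):
    """Check stored metadata against the blob it describes."""
    values = decode(blob)
    return (row["blob_size"] == len(blob)
            and row["blob_min"] == min(values)
            and row["blob_max"] == max(values))

blob = struct.pack("<3f", 271.5, 280.0, 290.25)
row = metadata_row(1, blob, datetime(1860, 1, 1, 0))
print(row["blob_min"], row["blob_max"], is_consistent(row, blob))
```

A query such as "which time steps exceed a threshold" can then be answered from blob_max alone, without reading any blob.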
Structure of blob tables
BLOB data table columns: blob_id, blob_data
Range partitioning: each table partition is stored in its own datafile.
Table partition  blob_id     Time
1                1 .. n      t0 .. tn
2                n+1 .. m    tn+1 .. tm
...              ...         ...
n                m+1 .. k    tm+1 .. tk
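Range partitioning assigns each blob_id to exactly one partition, and therefore to one datafile, by comparing it with the partition bounds. A minimal sketch of that lookup, assuming the bounds are given as the highest blob_id per partition; the concrete numbers are hypothetical.

```python
from bisect import bisect_left

def partition_for(blob_id, upper_bounds):
    """Return the 1-based number of the partition that holds blob_id.

    upper_bounds: sorted list with the highest blob_id of each partition,
                  for example [n, m, k] in the notation of the slide.
    """
    if blob_id < 1 or blob_id > upper_bounds[-1]:
        raise ValueError(f"blob_id {blob_id} is outside all partitions")
    return bisect_left(upper_bounds, blob_id) + 1

bounds = [1000, 2000, 3000]  # hypothetical values for n, m, k
print(partition_for(1, bounds), partition_for(1001, bounds))
```

Because blob_id increases with model time, a request for a time interval touches only the datafiles of the partitions covering that interval.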
Implementation: HSM
Connection to the HSM
[Figure: tablespaces (TBS) change from read-write (RW) to read-only (RO); table partitions 1 to 21 reside in these tablespaces. All tablespaces are moved "at once" to dxdb, where Migout and Migin take place.]
Migout / Migin
• Migout takes place after files have not been modified for x minutes
• Only one migout process per dxdb filesystem
• Migin takes place immediately after a file is requested
• Only the parts accessed are retrieved from the backend storage
• One migin process per requested file
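The migout rule is an idle-time threshold on the modification time. The sketch below shows that selection in isolation; it is not DiskXtender's implementation, and the function name, the input layout, and the value chosen for x are assumptions.

```python
import time

def migout_candidates(mtimes, idle_minutes, now=None):
    """Return the paths not modified for at least idle_minutes.

    mtimes: dict mapping a path to its last-modification time
            (seconds since the epoch).
    """
    if now is None:
        now = time.time()
    threshold = idle_minutes * 60
    return sorted(path for path, mtime in mtimes.items()
                  if now - mtime >= threshold)

# Hypothetical datafiles; x = 30 minutes is an arbitrary choice.
files = {"a.dbf": 0, "b.dbf": 3000}
print(migout_candidates(files, idle_minutes=30, now=3600))
```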
Purging
[Figure: fill level of the dxdb filesystem with high-water mark (HWM) and low-water mark (LWM).]
Criteria for purging
• Size of datafiles doesn't matter
  - Except: "small" datafiles can stay on disk
• Time not modified (easy for read-only tablespaces)
• Time not touched
  - Oracle tends to touch datafiles quite often
  - Oracle parameter read_only_open_delayed could be an option
• Prerequisite: 2 copies on tape
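The criteria above can be combined into a simple selection policy between the two water marks. The following is a sketch under stated assumptions: purging starts above the HWM, proceeds from the least recently touched file, and stops at the LWM. The threshold values, the "small" limit, and all names are hypothetical; the slide does not specify the ordering or the numbers.

```python
from dataclasses import dataclass

@dataclass
class Datafile:
    name: str
    size: int            # bytes held on the dxdb disk cache
    last_access: float   # seconds since the epoch
    tape_copies: int     # number of copies already on tape

def select_for_purge(files, used, capacity, hwm=0.9, lwm=0.7,
                     small=128 * 1024 * 1024):
    """Return the names of files to purge, least recently touched first."""
    if used <= hwm * capacity:
        return []                          # below the high-water mark
    purged = []
    for f in sorted(files, key=lambda f: f.last_access):
        if used <= lwm * capacity:
            break                          # low-water mark reached
        if f.tape_copies < 2 or f.size < small:
            continue                       # not safe on tape yet, or small
        purged.append(f.name)
        used -= f.size
    return purged

files = [
    Datafile("a", 300, 1.0, 2),
    Datafile("b", 300, 2.0, 1),   # only one tape copy: must stay
    Datafile("c", 5, 0.0, 2),     # small: may stay on disk
    Datafile("d", 300, 3.0, 2),
]
print(select_for_purge(files, used=905, capacity=1000, small=10))
```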
Inside the datafile
Header 128k
Table
Lob Index
Primary Key
Blob data
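Frontend versus Backend
[Figure: the datafile on the filesystem frontend next to its copy on the HSM backend. Both begin with the 128k header; on the backend the file is stored in parts of 512 MB each (Part 1, Part 2, ...).]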
Retrieving data
[Figure: numbered steps 1 to 5 of a retrieval for a datafile with its 128k header, ending in a tape request.]
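Since the backend stores a datafile in 512 MB parts and only the parts accessed are retrieved, a read of a byte range translates into a small set of part numbers. A minimal sketch of that mapping, assuming the parts tile the file contiguously from offset 0 and that MB means 2^20 bytes; both points are assumptions, and the function name is illustrative.

```python
PART_SIZE = 512 * 1024 * 1024  # 512 MB per part (binary MB assumed)

def parts_for_range(offset, length, part_size=PART_SIZE):
    """Return the 1-based part numbers covering bytes [offset, offset + length)."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // part_size
    last = (offset + length - 1) // part_size
    return list(range(first + 1, last + 2))

# A 3 MB read that straddles the boundary between part 1 and part 2.
print(parts_for_range(PART_SIZE - 1024 * 1024, 3 * 1024 * 1024))
```

A read that stays inside one part needs a single tape request, no matter how large the whole datafile is.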
Usage: Downloads
[Figure: downloads per year, 1999 to 2006; vertical axis from 0 to 900,000.]
Statistics: Size
[Figure: database size in TByte per year, 1998 to 2006; vertical axis from 0 to 300.]
Outlook: Global Model T213 (Atmosphere)
Horizontal resolution of the climate model
• T213: 640 * 320 = 204800 points per global field
• T106: 320 * 160 = 51200 points per global field
Required storage units (GRIB format)
• Horizontal field (access unit): 400.1 kB (T213) / 100.1 kB (T106)
• Unix file size for monthly accumulated results with a 6-hour storage interval and 300 2D variables (physical unit): 14000 MB (T213) / 3500 MB (T106)
• 240 years of model integration (logical unit): 40 TB (T213) / 10 TB (T106)
Outlook: Regional Model Resolution and Data Volumes
REMO-UBA model domain
• Resolution: 10 x 10 km
• Data volume: 5 TB / 100 years (surface fields only)
[Figure: orography]
Thank you!