The Earth Science Platform
Ted Habermann, Mike Folk, The HDF Group
Conventions

Tools

Formats

Services

December 12, 2013

AGU, Fall 2013

1
Formats with HDF Inside

HDF5

2
High Performance / Parallel Computing
Problem: Support I/O and analysis needs for state-of-the-art plasma physics code

Novel Accomplishments:
• Ran trillion-particle VPIC* simulation on 120,000 Hopper cores and generated 350 TB dataset
• Parallel HDF5 obtained peak 35 GB/s I/O rate and 80% sustained bandwidth
• Developed hybrid parallel FastQuery using FastBit to utilize multicore hardware
• FastQuery took 10 minutes to index and 3 seconds to query energetic particles
• SC12 paper, XLDB 2012 poster

Impact:
• Demonstrated software scalability for writing and analyzing ~40 TB HDF5 files
• Enabled novel discoveries in plasma physics

*VPIC: Vector Particle-in-Cell
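
To give the quoted I/O figures some scale, here is a rough back-of-envelope calculation. It is only a sketch under stated assumptions: decimal units (1 TB = 10^12 bytes), a constant transfer rate, and "80% sustained bandwidth" read as 80% of the 35 GB/s peak. The slide does not spell out that interpretation, and the function name `write_seconds` is made up for illustration.

```python
# Back-of-envelope timing from the figures quoted on the slide.
# Assumptions (not stated on the slide): decimal units, a constant rate,
# and "80% sustained" read as 80% of the 35 GB/s peak.

TB = 10**12  # bytes in a decimal terabyte
GB = 10**9   # bytes in a decimal gigabyte


def write_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Time to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s


peak_rate = 35 * GB
sustained_rate = 0.8 * peak_rate  # assumed interpretation of "80% sustained"

file_size = 40 * TB      # "~40 TB HDF5 files"
dataset_size = 350 * TB  # "350 TB dataset"

for label, size in [("one ~40 TB file", file_size), ("350 TB dataset", dataset_size)]:
    at_peak = write_seconds(size, peak_rate) / 60
    at_sustained = write_seconds(size, sustained_rate) / 60
    print(f"{label}: {at_peak:.1f} min at peak, {at_sustained:.1f} min sustained")
```

Under these assumptions a single ~40 TB file takes on the order of 20 minutes to write, and the full 350 TB dataset takes a few hours. Real runs would differ with file-system load and access pattern.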

3
Grouping Data and Metadata (HDF-EOS)
HDF File with HDF-EOS Conventions
  Grids
    Grid_1 to Grid_N
      Data Fields: Data Field.1, Data Field.2
      Attributes
  Swaths
    Swath_1 to Swath_N
      Geolocation Fields: Latitude, Longitude, Time, Colatitude
      Data Fields: Data Field.1, Data Field.2
      Profile Fields: Profile Field.1, Profile Field.2
  Points
  Zonal Averages
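
The grouping in the diagram can be sketched as a nested structure. The example below uses only the Python standard library and plain dicts. The nesting is an assumption read off the slide labels, `hdf_eos_file` and `list_paths` are hypothetical names, and the group names in actual HDF-EOS files may differ from the labels used here.

```python
# Sketch of the slide's grouping as nested dicts (stdlib only).
# The nesting is an assumption based on the slide labels; real files
# would be written with an HDF library, not Python dicts.

hdf_eos_file = {
    "Grids": {
        "Grid_1": {
            "Data Fields": ["Data Field.1", "Data Field.2"],
            "Attributes": [],
        },
    },
    "Swaths": {
        "Swath_1": {
            "Geolocation Fields": ["Latitude", "Longitude", "Time", "Colatitude"],
            "Data Fields": ["Data Field.1", "Data Field.2"],
            "Profile Fields": ["Profile Field.1", "Profile Field.2"],
        },
    },
    "Points": {},
    "Zonal Averages": {},
}


def list_paths(node, prefix=""):
    """Return every slash-separated path in the hierarchy, depth first."""
    paths = []
    if isinstance(node, dict):
        for name, child in node.items():
            path = f"{prefix}/{name}"
            paths.append(path)
            paths.extend(list_paths(child, path))
    else:  # a list of field names at the bottom of the hierarchy
        paths.extend(f"{prefix}/{name}" for name in node)
    return paths


all_paths = list_paths(hdf_eos_file)
for p in all_paths:
    print(p)
```

Walking the structure this way shows how data fields and their geolocation metadata sit side by side under one named object, which is the point of grouping data and metadata in a single file.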

4
Conventions / History
Processing Level
3: Model Results / Variables mapped on uniform space-time grid scales, usually with some completeness and consistency.
2: Derived geophysical variables at the same resolution and location as Level 1 source data.
1: Reconstructed, unprocessed instrument data at full resolution, time-referenced, and annotated with ancillary information, including radiometric and geometric calibration coefficients and georeferencing parameters (e.g., platform ephemeris) computed and appended but not applied to Level 0 data.

[Diagram labels: HDF-EOS (Grid, Zonal Average, Points, Swath); CF (CF, CF, ?, ?)]

CF Feature Types:
Points
Timeseries
Trajectory
Profile
TimeSeriesProfile
TrajectoryProfile

5
Convention Governance
Community / Users

Operational Data
Processing System

6
Community

Using HDF to share data?
Tweet #HDFInside

7
Acknowledgements

thabermann@hdfgroup.org

This work was partially supported by NASA contract number NNG10HP02C.
Any opinions, findings, conclusions, or recommendations expressed in this material are
those of the author and do not necessarily reflect the views of NASA or The HDF Group.
8
