Toward Cloud Native HPC
Outline
Cloud Native Paradigm
CNCF Ecosystem
HPC Adoption
Public Cloud Use Cases
What’s Next?
Non-Business Use
Adoption of Public Clouds in HPC sites
2011: 13%
2018: 74%
Hyperion Research study “Cloud Computing Comes of Age”, 2019
Cloud Native
• Develop, Deploy & Run
• Mostly Open Source
• Cloud Computing Model
Microservices
Architectural design that breaks an application into independent, loosely coupled, individually deployable services.
• Portability was a challenge.
Containers
Bundling of an application and all its dependencies as a package to be deployed regardless of environment.
Orchestration
Automation of the operational effort required to run the lifecycle of a container and its workloads and services.
• Provisioning, deployment, scaling (up and down), networking, load balancing, and more.
• Enabling DevOps and CI/CD
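The scaling part of orchestration (up and down) can be pictured as a loop that compares a desired state with the observed state and acts on the difference. The sketch below is a toy illustration of that idea only: `Cluster`, `start_container`, `stop_container`, and `reconcile` are hypothetical names invented for this example, not the API of Kubernetes or any other orchestrator.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy in-memory stand-in for a cluster; not a real orchestrator API."""
    running: list[str] = field(default_factory=list)
    next_id: int = 0

    def start_container(self, image: str) -> None:
        # Record a new running container with a unique suffix.
        self.running.append(f"{image}-{self.next_id}")
        self.next_id += 1

    def stop_container(self) -> None:
        # Stop the most recently started container.
        self.running.pop()


def reconcile(cluster: Cluster, image: str, desired_replicas: int) -> list[str]:
    """Drive the observed replica count toward the desired count.

    Returns the actions taken so the behaviour is easy to inspect.
    """
    actions = []
    while len(cluster.running) < desired_replicas:  # scale up
        cluster.start_container(image)
        actions.append("start")
    while len(cluster.running) > desired_replicas:  # scale down
        cluster.stop_container()
        actions.append("stop")
    return actions


cluster = Cluster()
scale_up = reconcile(cluster, "solver", 3)    # desired state: 3 replicas
scale_down = reconcile(cluster, "solver", 1)  # desired state: 1 replica
print(scale_up, scale_down, cluster.running)
```

The operator only declares the desired replica count; the loop decides which start and stop actions are needed, which is the part that orchestration automates.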
Cloud Native Computing Foundation (CNCF)
• Google & Linux Foundation project
• Founded in 2015
• Advances container technology
• App Definition & Development: Database, Streaming & Messaging, App Definition & Image Build, CI/CD
• Orchestration & Management: Scheduling & Orchestration, Coordination & Service Discovery, Remote Procedure Call, Service Proxy, API Gateway, Service Mesh
• Runtime: Cloud Native Storage, Container Runtime, Cloud Native Network
• Provisioning: Automation & Configuration, Container Registry, Security & Compliance, Key Management
• Special: Kubernetes Certified Service Provider, Kubernetes Training Partner
• Platform: Certified Kubernetes (Distribution, Host, Installer)
• Observability & Analysis: Monitoring, Logging, Tracing, Chaos Engineering, Continuous Optimization
• Serverless
High Performance Computing
• Scheduling
• Observability
• Storage
• Network
• UX
Evolution Timeline
Years shown: 2011, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022
Phase labels: Cloud, Cloud Native, Distributed Cloud Native
• Cycle Computing: running cloud HPC around 8 regions
• Kubernetes v1.0 GA; CNCF launched
• Huawei Cloud Container Engine (CCE)
• Google Kubernetes Engine (GKE)
• CERN: 1000-node POC
• Slurmnetes: failed attempts at batch scheduling
• Kubeflow: machine learning framework for operations, pipelines, training & deployment
• KubeEdge: CNCF’s first intelligent edge computing project
• Volcano: CNCF’s first batch scheduling project
• MindSpore: deep learning framework for mobile, edge, and cloud scenarios
• Karmada: CNCF’s first multi-cloud container orchestration project
• Kueue: Kubernetes-native job queueing
Expanded upon chart from https://bit.ly/FrontiersCloudNative
HPC Cloud Adoption Challenges: Special Hardware
• Network latency, as with specialized InfiniBand (IB) interconnects
• GPUs, accelerators, NUMA, etc.
• CPU architecture and topology
TOP 500
HPC Cloud Adoption Challenges: Data Gravity
• Data governance
• Data residency
• Egress cost
• The higher the availability, the higher the cost
Diagram labels: Services, Data, Apps, Throughput, Latency
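A back-of-the-envelope calculation shows why egress cost and throughput make large datasets hard to move. The sketch below is illustrative only: the function names, the flat per-GB price, the dataset size, and the link speed are all assumptions chosen for the example, not figures from the slides or from any provider's price list.

```python
def egress_cost_usd(dataset_tb: float, price_per_gb_usd: float) -> float:
    """Cost of moving a dataset out of a cloud once, at a flat per-GB price."""
    gb = dataset_tb * 1000  # decimal units: 1 TB = 1000 GB
    return gb * price_per_gb_usd


def transfer_time_hours(dataset_tb: float, throughput_gbit_s: float) -> float:
    """Time to move the dataset at a sustained throughput, ignoring latency."""
    bits = dataset_tb * 1e12 * 8  # 1 TB = 1e12 bytes, 8 bits per byte
    seconds = bits / (throughput_gbit_s * 1e9)
    return seconds / 3600


# Hypothetical inputs: 100 TB of results, 0.05 USD/GB, 10 Gbit/s sustained.
cost = egress_cost_usd(100, 0.05)
hours = transfer_time_hours(100, 10)
print(f"egress cost: {cost:,.0f} USD, transfer time: {hours:.1f} h")
```

Both quantities grow linearly with dataset size, so services and applications tend to move to where the data already sits rather than the other way around.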
HPC Cloud Adoption Challenges: Paradigm Shift
• Both learning and adoption
• Distributing workloads as images (registry)
Diagram labels: Kubernetes Control Plane, K8s Kubelet (×3), Image Registry, Persistent Storage
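Distributing a workload as an image means the job description points at a registry rather than at software installed on the nodes. A minimal sketch, assuming a Kubernetes `batch/v1` Job: the helper `batch_job_manifest`, the image `registry.example.com/hpc/solver:1.0`, and the command line are hypothetical placeholders, not taken from the slides.

```python
import json


def batch_job_manifest(name: str, image: str, command: list[str],
                       parallelism: int = 1, cpus: str = "1") -> dict:
    """Build a Kubernetes batch/v1 Job manifest as a plain dict."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "parallelism": parallelism,   # pods allowed to run at once
            "completions": parallelism,   # successful pods required
            "backoffLimit": 0,            # do not retry failed pods
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": name,
                        "image": image,   # pulled from the image registry
                        "command": command,
                        "resources": {"requests": {"cpu": cpus}},
                    }],
                }
            },
        },
    }


manifest = batch_job_manifest(
    name="solver",
    image="registry.example.com/hpc/solver:1.0",     # hypothetical image
    command=["./solver", "--input", "/data/case1"],  # hypothetical command
    parallelism=4,
)
print(json.dumps(manifest, indent=2))
```

Since `kubectl` accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` on a cluster that can reach the registry.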
Research End User: CERN
https://bit.ly/HPCSAUDI-cern-org
CERN is the European Organization for Nuclear Research.
• Kubernetes use case: particle physics
• Experimented with virtualization early to enable ease of management and automation.
• 2017: first Kubernetes POC
• 1000 worker nodes
• Data: 330 PB
• Hybrid on-demand infra: 3 hrs > 15 min
Public Cloud Use Cases
“Focus on your application and results”
• Dynamically provisions resources
• Plans, schedules, and executes
• Fully managed, “serverless”
• Free
• Integration with AWS services
2020 Statistics
• Largest cluster: 1,243,000 vCPUs
• Largest container image: 30 GB
• Number of simultaneous jobs: 500,000
• Customers: thousands
The CNCF Community
“It's very hard right now to justify developing a new product in-house. There is
really no real reason to keep doing that. It's much easier for us to try it out,
and if we see it's a good solution, we try to reach out to the community and
start working with that community.”
Where to next?
• Kubernetes Batch HPC Day North America 2022
• SC22 Containers and New Orchestration Paradigms for Isolated Environments in HPC
• CNCF Research User Group
• CNCF Technical Advisory Group for Runtime
• Kubernetes Community: Batch Working Group
• CNCF Batch System Initiative Working Group

