Container Rebalancing: Towards
Proactive Linux Containers Placement
Optimization in a DataCenter
PONGSAKORN U-CHUPALA, YASUHIRO WATASHIBA, KOHEI ICHIKAWA,
SUSUMU DATE* AND HAJIMU IIDA
Nara Institute of Science and Technology, Nara, Japan
* Osaka University, Osaka, Japan
July 14, 2017 COMPSAC 2017 1
Agenda
1. Introduction & Background
◦ Linux Containers (LXC)
◦ Rapid Container Migration
◦ Problem Statement
◦ LXC Scheduling and Overcommitting
2. Container Rebalancing
◦ Design Goals
◦ Illustrated Example
◦ Implementation
3. Evaluation
◦ Workload Data
◦ Simulation Method
◦ Metrics
◦ Simulation Results
4. Conclusion
◦ Future Work
Linux Containers (LXC)
LXC allows the creation of a
contained process called a
“container”
“Lightweight” Virtualization
◦ Compared to a VM, an LXC
container has significantly lower
overhead [1, 2]
◦ A container may take seconds to
boot up whereas a similar VM
may take minutes
[1] M. G. Xavier, M. V. Neves, F. D. Rossi, T. C. Ferreto, T. Lange, and C. A. F. De Rose, “Performance Evaluation of Container-based Virtualization for High Performance
Computing Environments,” Proceedings of the 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, pp. 233–240, 2013.
[2] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An Updated Performance Comparison of Virtual Machines and Linux Containers,” Technology, vol. 25482, 2014.
Rapid Container Migration
Long VM migration time is a problem for migration-based
scheduling strategies [3]
With the significant reduction in migration time that LXC offers,
rapid container migration becomes a viable optimization
strategy
[Figure annotations: “Typically smaller than VM”; “Very small for LXC”]
[3] J. Hu, J. Gu, G. Sun, and T. Zhao, “A scheduling strategy on load balancing of virtual machine resources in cloud computing environment,”
Proceedings - 3rd International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2010, pp. 89–96, 2010.
[Figure: Migration time (not to scale) of a container versus a virtual machine, each broken down into t_disk-copy, t_instantiation, and t_mem-copy.]
None of the existing container
orchestration solutions take advantage of
rapid container migration
We explore the possibility of leveraging rapid
container migration to increase data center efficiency
LXC Scheduling and Overcommitting
Existing scheduling solutions for LXC clusters are
typically designed as general-purpose scheduling
platforms
◦ No solution takes advantage of LXC’s unique capabilities
Overcommitting allows the scheduler to allocate more
resources than the actual capacity of the system
◦ Assumption: Allocated resources are typically higher than
actual utilization
◦ Commonly done by setting a static overcommit
ratio
◦ o.c. ratio too high => Instability
◦ o.c. ratio too low => Underutilization
◦ There is an optimal o.c. ratio for a system
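To make the overcommit ratio concrete, the sketch below shows how a static ratio changes the admission check on a single host. The function name `can_allocate` and all numbers are illustrative assumptions, not taken from the presentation.

```python
def can_allocate(capacity: float, allocated: float, request: float, ratio: float) -> bool:
    """Return True if `request` fits under the overcommitted capacity.

    With a static overcommit ratio, the scheduler treats the host as if it
    had `capacity * ratio` units of resource available for allocation.
    """
    return allocated + request <= capacity * ratio


# Hypothetical host: 16 CPU cores, 14 already allocated, new request for 4.
print(can_allocate(16, 14, 4, ratio=1.0))  # no overcommit: 18 cores exceed 16
print(can_allocate(16, 14, 4, ratio=1.3))  # overcommit: 18 cores fit under 20.8
```

A ratio that is too high admits requests the host cannot actually serve once utilization rises; a ratio that is too low rejects requests that would have fit in practice.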
[Figure: Resource utilization of a host, with the overcommit-able region marked.]
Container
Rebalancing
A novel method to increase LXC cluster
efficiency by increasing the optimal overcommit
ratio using rapid container migration
CR | Illustrated Example
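[Figure: Two hosts before and after swapping tasks with comparable allocation and different utilization. Before: one host is marked “Overcommit OK” and the other “Overcommit NG”. After: both hosts are marked “Overcommit OK”.]
Increase Optimal Overcommit Ratio
=> Increased Utilization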
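The effect of such a swap can be worked through with a small numeric sketch. Everything here is hypothetical: the host capacity, the per-task usage, and the rule that a host is safe to overcommit only while at least `HEADROOM` cores are idle are assumptions made for illustration.

```python
CAPACITY = 8.0   # cores per host (hypothetical)
HEADROOM = 2.0   # idle cores required before overcommitting is safe (hypothetical)

# Each task: (name, allocated cores, cores actually used)
host_a = [("a1", 4, 3.6), ("a2", 4, 3.8)]
host_b = [("b1", 4, 0.6), ("b2", 4, 0.8)]


def overcommit_ok(tasks):
    """A host can accept extra allocations only if enough cores sit idle."""
    used = sum(u for _, _, u in tasks)
    return CAPACITY - used >= HEADROOM


before = (overcommit_ok(host_a), overcommit_ok(host_b))

# Swap a1 and b1: both are allocated 4 cores, so each host's total
# allocation is unchanged while its actual utilization evens out.
host_a[0], host_b[0] = host_b[0], host_a[0]

after = (overcommit_ok(host_a), overcommit_ok(host_b))
print(before, after)
```

Because the swapped tasks have the same allocation, the scheduler's bookkeeping is untouched; only the distribution of actual load changes.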
Container Rebalancing (CR) | Goals
1. Proactive-Optimization: Anticipates future workloads
and proactively optimizes container placement
accordingly
◦ Online load-balancing in anticipation of future workloads
◦ This approach requires rapid migration which is viable with LXC
2. Compatibility: Should work alongside the existing
scheduling process
◦ CR is a separate process that works alongside the scheduling process while
minimizing interference with the scheduler
3. Scalability: Should be able to handle a large number of
containers efficiently
◦ Only load-balance long-lived containers for scalability
CR | Implementation
The container rebalancing process is divided into four steps:
1. Container Classification: Containers are classified
into long-lived containers or short-lived containers as
they are inserted into the system
2. Building Comparable Container Space: Long-lived
containers are grouped together according to the
amounts of their allocated resources and according
to their assigned hosts
3. Searching Comparable Container Space: A pair of
hosts with a significant resource utilization
difference is selected
4. Container Swapping: Each container in the
swappable container pair is migrated to the host of
its counterpart.
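The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the lifetime threshold, the utilization gap that counts as "significant", and the choice of swapping the busiest container on the hotter host with the idlest container on the cooler host are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations

LONG_LIVED_SECONDS = 3600  # hypothetical threshold; the slides do not give one
MIN_UTIL_GAP = 0.2         # hypothetical "significant" utilization difference


@dataclass
class Container:
    name: str
    host: str
    allocated: float  # allocated CPU cores
    used: float       # CPU cores actually in use
    lifetime: float   # expected lifetime in seconds


def is_long_lived(c):
    """Step 1: classify a container as it enters the system."""
    return c.lifetime >= LONG_LIVED_SECONDS


def build_space(containers):
    """Step 2: group long-lived containers by allocation, then by host."""
    space = defaultdict(lambda: defaultdict(list))
    for c in containers:
        if is_long_lived(c):
            space[c.allocated][c.host].append(c)
    return space


def find_swap(space, host_util):
    """Step 3: pick a host pair with a large utilization gap and a
    container pair within one allocation group worth swapping."""
    for by_host in space.values():
        for h1, h2 in combinations(sorted(by_host), 2):
            hot, cold = (h1, h2) if host_util[h1] >= host_util[h2] else (h2, h1)
            if host_util[hot] - host_util[cold] < MIN_UTIL_GAP:
                continue
            busy = max(by_host[hot], key=lambda c: c.used)
            idle = min(by_host[cold], key=lambda c: c.used)
            if busy.used > idle.used:
                return busy, idle
    return None


def swap(busy, idle):
    """Step 4: migrate each container to its counterpart's host."""
    busy.host, idle.host = idle.host, busy.host
```

Grouping by allocation first means any swap leaves each host's allocated total unchanged, which keeps the rebalancer from interfering with the scheduler's view of the cluster. The comparable container space would need to be rebuilt after each swap in this sketch.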
[Figure: Resource utilization of Host A and Host B, each with its overcommit-able region marked; containers are labeled as long-lived or short-lived.]
Evaluation
The evaluation was done with an LXC cluster
simulation
◦ Measures scheduling performance and cluster utilization
◦ Compares a general scheduling mechanism (scheduling)
to the container rebalancing mechanism (scheduling +
rebalancing)
◦ Driven by a real-world workload from Google’s cluster
data [4, 5]
[4] J. Wilkes, “More Google cluster data,” Google research blog, Nov. 2011.
[5] C. Reiss, J. Wilkes, and J. L. Hellerstein, “Google cluster-usage traces: format + schema,” Google Inc., Mountain View, CA, USA, Technical Report, Nov. 2011.
Workload Data
The workload is organized into jobs and containers
◦ A job contains one or more identical containers that have to be scheduled
together
Google’s cluster data contains 672,074 jobs and 24,281,242 containers
from a 1-month period
CPU cores are the only resources taken into account in the simulation
◦ Simplifies the simulation process
◦ Shortens the simulation time
[Figure legend: Requested, Actual]
Simulation
The simulation is an event-driven simulation with four processes running
simultaneously
1. Producer: Categorizes each container as long-lived or short-lived and inserts a job
into the job_queue when the simulation time reaches the start time
of that job
2. Scheduler: Implements a common scheduling strategy (random first-fit)
with overcommitting
3. Rebalancer: Searches through scheduled long-lived containers for
swappable container-pairs and migrates each container to its
counterpart’s host
4. Monitor: Keeps track of resource utilization of hosts and containers in
the simulated cluster and generates reports
Processes 1, 2, and 4 are used to simulate the general scheduling mechanism, while
all four processes are used to simulate the container rebalancing mechanism
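As a rough illustration of the Scheduler process, the sketch below reads "random first-fit with overcommitting" as: visit hosts in random order and place the request on the first host whose overcommitted capacity can hold it. That reading, the function name `random_first_fit`, and the host dictionary layout are assumptions for illustration.

```python
import random


def random_first_fit(request, hosts, ratio, rng=random):
    """Place `request` cores on the first randomly visited host that fits.

    hosts: dict mapping host name -> {"capacity": cores, "allocated": cores}
    ratio: static overcommit ratio applied to every host's capacity
    Returns the chosen host name, or None if no host can take the request.
    """
    names = list(hosts)
    rng.shuffle(names)  # random visiting order
    for name in names:
        host = hosts[name]
        if host["allocated"] + request <= host["capacity"] * ratio:
            host["allocated"] += request
            return name
    return None


# Hypothetical two-host cluster.
cluster = {
    "h1": {"capacity": 8, "allocated": 8},
    "h2": {"capacity": 8, "allocated": 2},
}
print(random_first_fit(2, cluster, ratio=1.0))  # only h2 has room without overcommit
```

Passing a seeded `random.Random` instance as `rng` makes a simulation run reproducible.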
Evaluation Metrics
Scheduling Performance Metrics
◦ Container Scheduled Rate (CSR)
◦ Long-lived Container Scheduled Rate (LCSR)
◦ Short-lived Container Scheduled Rate (SCSR)
Cluster Utilization Metrics
◦ Average Cluster Utilization
◦ Cluster Utilization Over Time
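The slides name the scheduling metrics without defining them. The sketch below assumes the simplest reading: each rate is the fraction of submitted containers in that class that were successfully scheduled. This definition and the function name `scheduled_rate` are assumptions.

```python
def scheduled_rate(containers, long_lived=None):
    """Fraction of containers that were scheduled.

    containers: iterable of (is_long_lived, was_scheduled) boolean pairs
    long_lived: None for all containers (CSR), True for LCSR, False for SCSR
    """
    picked = [ok for is_long, ok in containers
              if long_lived is None or is_long == long_lived]
    return sum(picked) / len(picked) if picked else float("nan")


# Hypothetical outcome for four containers.
outcome = [(True, True), (True, False), (False, True), (False, True)]
csr = scheduled_rate(outcome)
lcsr = scheduled_rate(outcome, long_lived=True)
scsr = scheduled_rate(outcome, long_lived=False)
print(csr, lcsr, scsr)
```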
Evaluation Results (1/3)
[Figure: Four panels comparing Container Rebalancing (CR) and General Scheduling (GS) against overcommit ratio: Container Scheduled Rate (CSR), Long-lived Container Scheduled Rate (LCSR), and Short-lived Container Scheduled Rate (SCSR) for overcommit ratios 1.0 to 2.0, and Average Cluster Utilization for overcommit ratios 1.3 and 1.4. The panels mark the optimal overcommit ratio for CR and for GS, and some panels mark outliers.]
CR generally produces better results
Evaluation Results (2/3)
[Figure: Utilization over time of the simulated cluster. Cluster utilization (0 to 0.65) is plotted against simulation time (hours 1 to 166) for General Scheduling and Container Rebalancing at overcommit ratios 1.3 and 1.4.]
CR generally produces better results
Comparing CR at overcommit ratio 1.4 to GS at overcommit ratio 1.3,
1.8% more containers are executed
Evaluation Results (3/3)
Distribution of unique containers by their migration count
throughout the CR simulation at overcommit ratio 1.4
[Figure: Histogram of the number of unique containers (0 to 33,000) by number of migrations (0 to 160).]
Around half of the affected containers are migrated only once
Almost all of the affected containers are migrated
fewer than 20 times
• Minimal effect on the containers
• 1.32% of all containers are migrated
• This accounts for 5.96% of all long-lived
containers in the simulation
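The speaker notes for this slide report that 67,718 unique containers were migrated. Combined with the two percentages above, that count implies rough totals for the simulated workload. The totals computed below are back-of-the-envelope estimates derived from rounded percentages, not figures stated in the presentation.

```python
migrated = 67_718        # unique migrated containers, from the speaker notes
share_of_all = 0.0132    # 1.32% of all containers in the simulation
share_of_long = 0.0596   # 5.96% of all long-lived containers

# Implied totals (approximate, because the percentages are rounded).
total_estimate = migrated / share_of_all
long_lived_estimate = migrated / share_of_long
long_lived_fraction = long_lived_estimate / total_estimate

print(round(total_estimate), round(long_lived_estimate), round(long_lived_fraction, 2))
```

Under these estimates, long-lived containers make up roughly a fifth of the simulated workload, which is consistent with the scalability goal of rebalancing only that subset.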
Conclusion
Rapid container migration is a property of LXC clusters that can be
leveraged to increase data center efficiency
Container rebalancing is a novel scheduling mechanism with a
rebalancing process working in conjunction with an existing scheduling
process of LXC clusters
◦ Increases optimal overcommit factor with online container load-balancing
The simulation is used to evaluate the performance and validate the
feasibility of container rebalancing
◦ The results suggest that container rebalancing is a promising method
More work is being done to investigate the effectiveness of this
method, to improve the accuracy of the simulation, and to see the
effect of this method with multi-objective optimization
Thank You
Q & A
PONGSAKORN U-CHUPALA, D3, SDLAB, NAIST
PONGSAKORN.UCHUPALA.PM7@IS.NAIST.JP


Editor's Notes

  • #4: An isolated view of the OS environment with only an allocated amount of resources
  • #7: We explore the possibility of leveraging rapid container migration as a resource management technique in conjunction with existing optimization techniques to increase data center efficiency
  • #8: Even with a fairly efficient scheduling algorithm, actual resource utilization could still be at about 50-60%, while available resources (such as CPU cores and memory) are mostly allocated [4]. [4] C. Reiss, A. Tumanov, G. R. Ganger, R. H. Katz, and M. A. Kozuch, “Heterogeneity and dynamicity of clouds at scale,” Proceedings of the Third ACM Symposium on Cloud Computing - SoCC ’12, pp. 1–13, 2012.
  • #11: Because a container typically requires fewer resources than a VM with a similar configuration, an LXC cluster is expected to handle a higher number of containers
  • #13: An LXC cluster simulation is used to evaluate the performance and validate the feasibility of the container rebalancing mechanism
  • #14: The values provided by the trace data are normalized by the number of cores in the machine with the most cores in Google’s cluster, for obfuscation
  • #17: Fit more containers, same resources, same time
  • #19: Minimal effect on the containers. 67,718 unique containers are migrated. This number is 1.32% of all containers in the simulation and 5.96% of all long-lived containers in the simulation.