IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 
__________________________________________________________________________________________ 
Volume: 03 Special Issue: 03 | May-2014 | NCRIET-2014, Available @ http://www.ijret.org
LOAD BALANCING IN PUBLIC CLOUD COMBINING THE CONCEPTS OF DATA MINING AND NETWORKING
Priyanka R
M. Tech Student, Dept. of Computer Science and Engineering, AIET, Karnataka, India
Abstract
Load balancing in the cloud computing environment has an important impact on the performance of the whole system. A good load balancing method makes cloud computing more efficient and improves user satisfaction. This paper combines concepts from networking, data mining and cloud computing to achieve a good load balancing strategy. Cloud partitioning helps to simplify the load balancing problem, and algorithms from networking and data mining are used to form the cloud partitions. The cloud partitioning concept introduced here is used in a load balancing model for the public cloud, with a switch mechanism that chooses different strategies for different situations. After cloud partitioning, the algorithm applies Round Robin and game theory as the load balancing strategies to improve efficiency in the public cloud environment.
Keywords: Load balancing, public cloud, cloud partition
-----------------------------------------------------------------------***-----------------------------------------------------------------------
1. INTRODUCTION
Cloud computing, the long-held dream of computing as a utility, is expected to accelerate innovation and business agility in the IT industry, making software more attractive as a service and shaping the way IT hardware is designed and purchased [1]. Cloud computing provides a distributed computing environment that gives a wide range of users distributed access to virtualized, scalable hardware and/or software infrastructure over the internet. Hardware and software packages are delivered as a service to clients over a large-scale network [2].
1.1 Load balancing concepts
Load balancing is the process of distributing the total load over the individual nodes of the collective system to maximize throughput and resource utilization and to minimize response time, while avoiding a situation in which some nodes are heavily loaded and others are lightly loaded [3]. A load balancer is a software program that receives connection requests from clients and forwards each request to one of the backend servers, which replies accordingly. The client is therefore unaware of where the data is stored. This separation also has a security benefit: it hides the structure of the internal network and helps to prevent attacks. In a cloud computing environment, jobs arrive randomly and have random CPU utilization requirements. Such jobs can load specific resources heavily while other resources remain lightly loaded.
2. RELATED WORK 
Many studies are being conducted in this area. However, load balancing in the cloud is still a new problem that needs new architectures which can adapt to changing needs. Soumya Ray and Ajanta De Sarkar [15] presented a brief review of existing load balancing algorithms. K. Ramana, A. Subramanyam and A. Ananda Rao [16] observed that a load balancing algorithm tries to balance the total system load by transferring workload from heavily loaded nodes to lightly loaded nodes, and presented a performance analysis of various load balancing algorithms based on different qualitative parameters, considering both static and dynamic approaches. In the existing system, the cloud is partitioned by geographic area, which is not well suited to all situations. Hence, there is a need for a new partitioning method that achieves better performance.
3. SYSTEM MODEL
The load balancing strategy involves creating cloud partitions. A cloud partition is a sub-area of the public cloud. Here the divisions are based on the VDBSCAN algorithm. Once the public cloud is partitioned, load balancing starts when a job arrives at the system. The main controller decides which cloud partition should receive the job. The load balancer assigned to each partition then decides how to assign the jobs to the nodes. When the load status of a cloud partition is idle or normal, the job can be handled locally. If the cloud partition is overloaded, the job should be transferred to another partition. The whole process is shown in Fig. 1.
We use the following techniques:
1. Dijkstra’s algorithm to find the shortest distance between two nodes
2. VDBSCAN cluster-based cloud partitioning
3. Round Robin technique
4. Game theory technique
The following algorithm shows the procedure.
Algorithm 1
Begin
    Apply Dijkstra’s algorithm to find the shortest path
    Apply VDBSCAN algorithm to find clusters and make each cluster a partition
    while job do
        searchBestPartition(job);
        if (partitionState == idle || partitionState == normal) then
            Send job to partition;
        else
            search for another partition;
        end if
    end while
End
Fig -1: Job assignment strategy 
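The dispatch loop of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation behind the paper: the `Partition` class, the state strings, and the preference order used by `search_best_partition` (idle first, then normal) are hypothetical names and choices introduced here, and the partitions are assumed to have been formed already.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Partition:
    """Hypothetical stand-in for a cloud partition and its balancer."""
    name: str
    state: str  # "idle", "normal", or "overloaded"
    jobs: List[str] = field(default_factory=list)


def search_best_partition(partitions: List[Partition]) -> Optional[Partition]:
    """Return the first idle partition, else the first normal one, else None.

    The preference order is an assumption; Algorithm 1 only requires
    that the chosen partition is idle or normal.
    """
    for wanted in ("idle", "normal"):
        for partition in partitions:
            if partition.state == wanted:
                return partition
    return None


def dispatch(jobs: List[str], partitions: List[Partition]) -> List[str]:
    """Send each job to an idle or normal partition.

    Returns the jobs that could not be placed because every
    partition was overloaded.
    """
    unplaced = []
    for job in jobs:
        target = search_best_partition(partitions)
        if target is None:
            unplaced.append(job)
        else:
            target.jobs.append(job)
    return unplaced
```

In this sketch the partition states are fixed inputs; in the model described here they would be refreshed by the partition balancers as load changes.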
3.1 Cloud Partition 
A cloud partition is a sub-area of the public cloud. Partitioning simplifies the process of load balancing.
3.1.1 Dijkstra’s Algorithm 
Networking concepts are used to determine the distance between servers. For this, we use the distance formula to compute the edge weights and then apply Dijkstra’s algorithm [4]. Dijkstra’s algorithm solves the single-source shortest-paths problem on a weighted directed graph G = (V, E) in which all edge weights are non-negative. In this section, therefore, we assume that w(u, v) ≥ 0 for each edge (u, v) ∈ E [5]. Dijkstra’s algorithm maintains a set S of vertices whose final shortest-path weights from the source s have already been determined. The algorithm repeatedly selects the vertex u ∈ V − S with the minimum shortest-path estimate, adds u to S, and relaxes all edges leaving u. Here, we use a min-priority queue Q of vertices, keyed by their values of d. The following algorithm is taken from [5].
DIJKSTRA(G, w, s)
    INITIALIZE-SINGLE-SOURCE(G, s)
    S = ∅
    Q = G.V
    while Q ≠ ∅
        u = EXTRACT-MIN(Q)
        S = S ∪ {u}
        for each vertex v ∈ G.Adj[u]
            RELAX(u, v, w)
Thus, we first find the shortest paths using Dijkstra’s algorithm and then use the shortest-path distances in the VDBSCAN method to form clusters.
3.1.2 DBSCAN
In the previous step, the shortest distance between any two points is calculated. This value is used in the next step to form clusters. Here we use the VDBSCAN (Varied Density Based Spatial Clustering of Applications with Noise [14]) method to form clusters. DBSCAN can find clusters of arbitrary shape. However, clusters that lie close to each other tend to belong to the same class. Its computation is based on six definitions, which give rise to two lemmas [6][13].
Definition 1: (The Eps-neighborhood of a point) N_Eps(p) denotes the Eps-neighborhood of a point p and is defined by
N_Eps(p) = {q ∈ D | dist(p, q) < Eps} (1)
Here D is the set of given points. For a point to belong to a cluster, it needs at least one other point that lies closer to it than the distance Eps.
Definition 2: (Directly density-reachable) Two kinds of points belong to a cluster: border points and core points [6]. “The Eps-neighborhood of a border point tends to have significantly less points than the Eps-neighborhood of a core point”. Border points are still part of the cluster; to be included, a border point p must belong to the Eps-neighborhood of a core point q:
p ∈ N_Eps(q) (2)
For point q to be a core point, it needs a minimum number of points within its Eps-neighborhood:
|N_Eps(q)| ≥ MinPts (core point condition) (3)
Definition 3: (Density-reachable) “A point p is density-reachable from a point q with respect to Eps and MinPts if there is a chain of points p1, ..., pn, p1 = q, pn = p such that pi+1 is directly density-reachable from pi.”
Definition 4: (Density-connected) There are cases in which two border points belong to the same cluster but do not share a specific core point. In these situations the points are not density-reachable from each other. There must, however, be a core point q from which both are density-reachable. “A point p is density-connected to a point q with respect to Eps and MinPts if there is a point o such that both, p and q are density-reachable from o with respect to Eps and MinPts.”
Definition 5: (Cluster) If point p is part of a cluster C and point q is density-reachable from p with respect to a given distance and a minimum number of points within that distance, then q is also part of cluster C.
1) ∀ p, q: if p ∈ C and q is density-reachable from p with respect to Eps and MinPts, then q ∈ C.
Saying that two points p and q belong to the same cluster C is the same as saying that p is density-connected to q with respect to the given distance and the minimum number of points within that distance.
2) ∀ p, q ∈ C: p is density-connected to q with respect to Eps and MinPts.
Definition 6: (noise) 
Noise is the set of points that do not belong to any of the clusters.
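Definitions 1 to 6 can be turned into a short clustering routine. The sketch below first computes shortest-path distances with Dijkstra's algorithm (Section 3.1.1) and then runs plain DBSCAN over them. It is an illustration under assumptions, not the paper's implementation: the graph layout (a dict of adjacency dicts keyed by node name), the function names, and the label convention (-1 for noise) are choices made here. The neighborhood test uses the strict `<` of Equation (1).

```python
import heapq
from typing import Dict, List

Graph = Dict[str, Dict[str, float]]


def dijkstra(graph: Graph, source: str) -> Dict[str, float]:
    """Shortest-path weights from source; edge weights must be non-negative.

    Every vertex must appear as a key of `graph`.
    """
    dist = {v: float("inf") for v in graph}
    dist[source] = 0.0
    queue = [(0.0, source)]  # min-priority queue keyed by distance estimate
    done = set()             # the set S of finished vertices
    while queue:
        d, u = heapq.heappop(queue)
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u].items():
            if d + w < dist[v]:  # RELAX(u, v, w)
                dist[v] = d + w
                heapq.heappush(queue, (dist[v], v))
    return dist


def all_pairs(graph: Graph) -> Dict[str, Dict[str, float]]:
    """Run Dijkstra once per vertex to get every pairwise distance."""
    return {v: dijkstra(graph, v) for v in graph}


def dbscan(dist: Dict[str, Dict[str, float]], eps: float, min_pts: int) -> Dict[str, int]:
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    points = list(dist)

    def neighbours(p: str) -> List[str]:
        # Eps-neighborhood of p (Definition 1); p itself is included.
        return [q for q in points if dist[p][q] < eps]

    labels: Dict[str, int] = {}
    cluster = -1
    for p in points:
        if p in labels:
            continue
        seeds = neighbours(p)
        if len(seeds) < min_pts:
            labels[p] = -1  # provisional noise (Definition 6)
            continue
        cluster += 1        # p satisfies the core point condition (3)
        labels[p] = cluster
        frontier = [q for q in seeds if q != p]
        while frontier:
            q = frontier.pop()
            if labels.get(q) == -1:
                labels[q] = cluster  # border point reached from a core point
            if q in labels:
                continue
            labels[q] = cluster
            q_neighbours = neighbours(q)
            if len(q_neighbours) >= min_pts:  # q is also core: keep expanding
                frontier.extend(q_neighbours)
    return labels
```

For example, on a graph made of two tightly connected groups of servers joined by one long link, `dbscan(all_pairs(graph), eps, min_pts)` with a suitably small `eps` places each group in its own cluster, and each cluster can then serve as a cloud partition.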
Lemma 1: A cluster can be formed from any of its core points and will always have the same shape. 
Lemma 2: Let p be a core point in cluster C with a given distance (Eps) and a minimum number of points within that distance (MinPts). If the set O is density-reachable from p with respect to the same Eps and MinPts, then C is equal to the set O.
VDBSCAN
The VDBSCAN [13] algorithm detects clusters of varied density and automatically selects several values of the input parameter Eps for the different densities. Peng Liu, Dong Zhou and Naijun Wu give a detailed description in [14].
Description of the Algorithm
In general the algorithm has two steps: choosing the parameters Eps_i and clustering with varied densities. The algorithm [13] is as follows:
(i) It calculates and stores the k-dist for each object and partitions the k-dist plot.
(ii) The number of densities is given intuitively by the k-dist plot.
(iii) The parameter Eps_i is selected automatically for each density.
(iv) It scans the dataset and clusters the different densities using the corresponding Eps_i.
(v) It displays the valid clusters with respect to the varied density.
Algorithm:
1. Partition the k-dist plot.
2. Give thresholds of the parameters Eps_i (i = 1, 2, ..., n).
3. For each Eps_i (i = 1, 2, ..., n):
    a) Eps = Eps_i
    b) Apply the DBSCAN algorithm to the points that are not yet marked.
    c) Mark the clustered points as C_i.
4. Display all the marked points as their corresponding clusters.
Using the VDBSCAN algorithm, we can form clusters of nodes and make each cluster a partition.
3.1.3 Assigning Jobs to the Cloud Partition
When a job arrives at the public cloud, the first step is to choose the right partition. The job assignment strategy follows [7]. The cloud partition status can be divided into three types:
(1) Idle: When the percentage of idle nodes exceeds α, change to idle status. 
(2) Normal: When the percentage of the normal nodes exceeds β, change to normal load status.
(3) Overload: When the percentage of the overloaded nodes exceeds γ, change to overloaded status. 
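The three partition status rules can be sketched as a small classification function. This is a hedged illustration: the function name, the state strings, the order in which the rules are checked (overloaded, then idle, then normal), and the fallback to the previous status when no threshold is exceeded are all assumptions made here, since the text does not say what happens when several conditions hold at once or none does. The thresholds α, β and γ are passed as fractions between 0 and 1.

```python
from collections import Counter
from typing import Iterable


def partition_status(node_states: Iterable[str], alpha: float, beta: float,
                     gamma: float, previous: str = "normal") -> str:
    """Classify a partition from the states of its nodes.

    alpha, beta, gamma are fractions in [0, 1]. The check order and the
    fallback to `previous` are assumptions, not taken from the paper.
    """
    states = list(node_states)
    if not states:
        return previous
    counts = Counter(states)
    total = len(states)
    if counts["overloaded"] / total > gamma:
        return "overloaded"
    if counts["idle"] / total > alpha:
        return "idle"
    if counts["normal"] / total > beta:
        return "normal"
    return previous
```

Checking the overloaded rule first is the conservative choice in this sketch, because it keeps new jobs away from a partition as soon as its share of overloaded nodes passes γ.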
The parameters α, β and γ are set by the cloud partition balancers [7]. When the load status of a cloud partition is idle or normal, the job can be handled locally [7]. If the cloud partition load status is neither normal nor idle, the job should be transferred to another partition. The partition load balancer then decides how to assign the jobs to the nodes. Server load status is likewise divided into three types. If one cloud server is overloaded and receives a new client request while other servers are in the Idle or Normal state, the following algorithms are used:
• Idle: If the receiving partition is in idle status, the job is transferred to it using the Round Robin algorithm.
• Normal: If the receiving partition is in normal status, the job is transferred to it using the game theory algorithm.
• Overload: If the receiving partition is overloaded, the job is transferred to yet another partition, which is selected using the two algorithms above.
Step 1: Define a load parameter set F = {F1, F2, ..., Fm}, where each Fi (1 ≤ i ≤ m, Fi ∈ [0, 1]) is either a static or a dynamic parameter and m is the total number of parameters [7].
Step 2: Compute the load degree as
$$\mathrm{Load\_degree}(N) = \sum_{i=1}^{m} \alpha_i F_i \qquad (4)$$
where the weights $\alpha_i$ (with $\sum_{i=1}^{m} \alpha_i = 1$) may differ for different kinds of jobs and N represents the current node [7].
Step 3: Define evaluation benchmarks. Calculate the average cloud partition load degree from the node load degree statistics as
$$\mathrm{Load\_degree}_{avg} = \frac{\sum_{i=1}^{n} \mathrm{Load\_degree}(N_i)}{n} \qquad (5)$$
where n is the number of nodes in the partition. The benchmark Load_degree_high is then set for different situations based on Load_degree_avg [7].
Step 4: Three node load status levels are then defined as [7]:
• Idle: when
Load_degree(N) = 0 (6)
no job is being processed by this node, so its status is changed to Idle.
• Normal: for
0 < Load_degree(N) ≤ Load_degree_high (7)
the node is normal and can process other jobs.
• Overloaded: when
Load_degree_high < Load_degree(N) (8)
the node is not available and cannot receive jobs until it returns to normal [7].
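Steps 1 to 4 can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the function names are hypothetical, the benchmark Load_degree_high is passed in directly because the text does not say how it is derived from Load_degree_avg, and a load degree exactly equal to the benchmark is treated as Normal.

```python
from typing import Sequence


def load_degree(params: Sequence[float], weights: Sequence[float]) -> float:
    """Equation (4): weighted sum of the load parameters F_i in [0, 1]."""
    if len(params) != len(weights):
        raise ValueError("one weight per load parameter is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(a * f for a, f in zip(weights, params))


def average_load_degree(degrees: Sequence[float]) -> float:
    """Equation (5): mean load degree over the nodes of one partition."""
    return sum(degrees) / len(degrees)


def node_status(degree: float, high: float) -> str:
    """Equations (6)-(8); the boundary value `high` counts as normal."""
    if degree == 0:
        return "idle"
    if degree <= high:
        return "normal"
    return "overloaded"
```

A balancer could call `load_degree` once per node at every refresh, store the results in its Load Status Table, and apply `node_status` to each entry.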
The load degree results are entered into the Load Status Tables created by the cloud partition balancers. Each balancer has a Load Status Table and refreshes it every fixed period T. The balancers then use the table to calculate the partition status. Each partition status has a different load balancing solution. When a job arrives at a cloud partition, the balancer assigns the job to a node based on its current load strategy. The balancers change this strategy as the cloud partition status changes.
3.1.4 Round Robin Algorithm
In the regular Round Robin algorithm, every node has an equal opportunity to be chosen. However, in a public cloud, the configuration and performance of each node are not the same, so this method may overload some nodes. An improved Round Robin algorithm, called “Round Robin based on the load status”, is therefore used.
Step 1: A job arrives at the main controller.
Step 2: The job is assigned to a balancer based on the request location and the balancer status.
Step 3: In the particular partition P:
    Set i = 1
    Set s[1..n] as the servers in P, arranged in increasing order of their number of jobs
    Set s[1..c] as the idle servers in P taken from s[1..n]
Step 4: When a job arrives:
    If s[1..c] is not empty then
        send the connection to s[i]
        i = i + 1
        if i > c then i = 1
    Else
        go to game theory
    End if
Step 5: Go to Step 1.
3.1.5 Game Theory
The players in the game are the nodes, and they contend for jobs. Suppose there are n nodes in the current cloud partition with N jobs arriving; then define the following parameters:
$\mu_i$: processing ability of each node, i = 1, ..., n.
$\varphi_j$: time spent for each job.
$\Phi = \sum_{j=1}^{N} \varphi_j$: time spent by the entire cloud partition, with $\Phi < \sum_{i=1}^{n} \mu_i$.
$S_{ij}$: fraction of job j that is assigned to node i ($\sum_{i=1}^{n} S_{ij} = 1$ and $0 \le S_{ij} \le 1$).
In this model, the most important step is finding an appropriate value of $S_{ij}$. Here “the best reply” method proposed by Grosu et al. [12] can be used to calculate $S_{ij}$ for each node.
The Nash Equilibrium here is to minimize the response time of each job.
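The two strategies of Sections 3.1.4 and 3.1.5 can be sketched together. The first part cycles over the idle servers of a partition and signals when the caller should fall back to game theory. The second part is only an assumption-laden illustration of a best-reply style allocation: the paper does not spell out the computation, so the sketch assumes that each node behaves as an M/M/1 queue with service rate `mu[i]`, that `phi` is the total job arrival rate to be split (which differs from the "time spent" reading of Φ above), and that the fractions are chosen to minimize the mean response time. Class and function names are hypothetical.

```python
import math
from typing import Dict, List, Optional


class LoadStatusRoundRobin:
    """Cycle over the idle servers of one partition (Section 3.1.4)."""

    def __init__(self) -> None:
        self._next = 0  # position of the next idle server to use

    def pick(self, job_counts: Dict[str, int]) -> Optional[str]:
        """Return the next idle server, or None to signal the
        fallback to the game-theoretic strategy."""
        # Servers in increasing order of running jobs; idle ones have none.
        ordered = sorted(job_counts, key=job_counts.get)
        idle = [s for s in ordered if job_counts[s] == 0]
        if not idle:
            return None
        server = idle[self._next % len(idle)]
        self._next = (self._next + 1) % len(idle)
        return server


def best_reply_fractions(mu: List[float], phi: float) -> List[float]:
    """Split a demand `phi` over nodes with rates `mu` (all positive).

    Assumes M/M/1 nodes; minimizing the sum of s_i / (mu_i - s_i * phi)
    subject to sum(s_i) = 1 gives s_i = (mu_i - t * sqrt(mu_i)) / phi,
    with slow nodes dropped while that value would be negative.
    """
    if not 0 < phi < sum(mu):
        raise ValueError("need 0 < phi < total processing rate")
    order = sorted(range(len(mu)), key=lambda i: mu[i], reverse=True)
    active = len(order)
    while True:
        used = [mu[i] for i in order[:active]]
        t = (sum(used) - phi) / sum(math.sqrt(m) for m in used)
        if active == 1 or t < math.sqrt(used[-1]):
            break
        active -= 1  # the slowest remaining node receives no share
    fractions = [0.0] * len(mu)
    for i in order[:active]:
        fractions[i] = (mu[i] - t * math.sqrt(mu[i])) / phi
    return fractions
```

In this sketch `pick` works on a snapshot of the Load Status Table, so the idle list only changes when the table is refreshed; whether [12] uses exactly this closed form has not been checked here.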
4. CONCLUSIONS
Load balancing in public clouds is a difficult task, with the cloud division problem being the major one. Using clustering algorithms is one way of dividing the cloud, and there may be several other ways that get the job done. This problem can be addressed by combining the concepts of networking and data mining in the load balancing problem of the public cloud.
FUTURE WORK
Since this work is a conceptual framework, more work is needed to implement the framework and resolve new problems.
(1) Cloud division rules: The division rules can be further enhanced by devising other methods and comparing their performance. The division rule should not simply be based on geographic location.
(2) Set the refresh period: The refresh period for communicating the load status should be properly set. It should be neither too short nor too long. Tests and statistical tools can be used to set a reasonable refresh period.
(3) Other load balance strategy: Other load balancing strategies may also be used, based on tests comparing different strategies. Many tests are needed to guarantee system availability and efficiency. 
REFERENCES
[1]. Michael Armbrust, Armando Fox, Gunho Lee and Ion Stoica, “Above the clouds: a Berkeley view of cloud computing,” University of California at Berkeley, Technical Report No. UCB/EECS-2009-28, 2009
[2]. Soumya Ray and Ajanta De Sarkar, “Execution analysis of load balancing algorithms in cloud computing environment,” International Journal on Cloud Computing: Services and Architecture (IJCCSA), Vol. 2, No. 5, October 2012
[3]. K. Ramana, A. Subramanyam and A. Ananda Rao, “Comparative analysis of distributed web server system load balancing algorithms using qualitative parameters,” VSRD-IJCSIT, Vol. 1 (8), 2011, 592-600
[4]. J. Moy (Editor), “OSPF protocol analysis,” Network Working Group, Request for Comments: 1245, Proteon, Inc., July 1991
[5]. T. H. Cormen et al., “Introduction to algorithms,” 3rd edition, Prentice-Hall of India, 2010
[6]. Henrik Bäcklund, Anders Hedblom and Niklas Neijman, “A density-based spatial clustering of application with noise,” Data Mining TNM033, Linköpings Universitet - ITN, 2011-11-30
[7]. Gaochao Xu, Junjie Pang and Xiaodong Fu, “A load balancing model based on cloud partitioning for the public cloud,” IEEE Transactions on Cloud Computing, 2013
[8]. Suriya Begum and Prashanth C.S.R, “Review of load balancing in cloud computing,” IJCSI International Journal of Computer Science Issues, Vol. 10, Issue 1, No 2, January 2013
[9]. P. Mell and T. Grance, “The NIST definition of cloud computing,” NIST Special Publication 800-145, September 2011
[10]. Martin J. Osborne, “An introduction to game theory 1995–2000,” textbook, University of Toronto, Canada, Oxford University Press, 2000, pp. 1-8
[11]. Jeffrey M. Galloway, Karl L. Smith and Susan S. Vrbsky, “Power aware load balancing for cloud computing,” World Congress on Engineering and Computer Science 2011, Vol. I, WCECS 2011, October 19-21, 2011, San Francisco, USA
[12]. Daniel Grosu and Anthony T. Chronopoulos, “Noncooperative load balancing in distributed systems,” Journal of Parallel and Distributed Computing, Elsevier, 65 (2005) 1022-1034
[13]. M. Parimala, Daphne Lopez and N.C. Senthilkumar, “A survey on density based clustering algorithms for mining large spatial databases,” International Journal of Advanced Science and Technology, Vol. 31, June 2011
[14]. Peng Liu, Dong Zhou and Naijun Wu, “VDBSCAN: Varied density based spatial clustering of applications with noise,” 1-4244-0885-7/07/$20.00 ©2007 IEEE
[15]. Soumya Ray and Ajanta De Sarkar, “Execution analysis of load balancing algorithms in cloud computing environment,” International Journal on Cloud Computing: Services and Architecture (IJCCSA), Vol. 2, No. 5, October 2012
[16]. K. Ramana, A. Subramanyam and A. Ananda Rao, “Comparative analysis of distributed web server system load balancing algorithms using qualitative parameters,” VSRD-IJCSIT, Vol. 1 (8), 2011, 592-600

More Related Content

PDF
A study of localized algorithm for self organized wireless sensor network and...
PDF
Securing Privacy of User’s Data on Cloud Using Back Propagation Neural Networks
PPT
Data miningpresentation
PDF
Clustering Algorithms for Data Stream
PDF
Redundant Actor Based Multi-Hole Healing System for Mobile Sensor Networks
PDF
A density based micro aggregation technique for privacy-preserving data mining
PDF
I04503075078
PDF
International Refereed Journal of Engineering and Science (IRJES)
A study of localized algorithm for self organized wireless sensor network and...
Securing Privacy of User’s Data on Cloud Using Back Propagation Neural Networks
Data miningpresentation
Clustering Algorithms for Data Stream
Redundant Actor Based Multi-Hole Healing System for Mobile Sensor Networks
A density based micro aggregation technique for privacy-preserving data mining
I04503075078
International Refereed Journal of Engineering and Science (IRJES)

What's hot (16)

PDF
A survey on weighted clustering techniques in manets
PDF
Ripple Algorithm to Evaluate the Importance of Network Nodes
PDF
Intrusion detection in heterogeneous network by multipath routing based toler...
PDF
Intrusion detection in heterogeneous network by multipath routing based toler...
PDF
D031202018023
PDF
REDUCING FREQUENCY OF GROUP REKEYING OPERATION
PDF
A FAST FAULT TOLERANT PARTITIONING ALGORITHM FOR WIRELESS SENSOR NETWORKS
PDF
Erca energy efficient routing and reclustering
PDF
Survey on Load Rebalancing for Distributed File System in Cloud
PDF
E035425030
PDF
Chaos in Small-World Networks
PDF
Distributed vertex cover
PDF
Cloud Partitioning of Load Balancing Using Round Robin Model
PDF
Q026201030106
PDF
NODE FAILURE TIME ANALYSIS FOR MAXIMUM STABILITY VS MINIMUM DISTANCE SPANNING...
PDF
Node failure time analysis for maximum stability vs minimum distance spanning...
A survey on weighted clustering techniques in manets
Ripple Algorithm to Evaluate the Importance of Network Nodes
Intrusion detection in heterogeneous network by multipath routing based toler...
Intrusion detection in heterogeneous network by multipath routing based toler...
D031202018023
REDUCING FREQUENCY OF GROUP REKEYING OPERATION
A FAST FAULT TOLERANT PARTITIONING ALGORITHM FOR WIRELESS SENSOR NETWORKS
Erca energy efficient routing and reclustering
Survey on Load Rebalancing for Distributed File System in Cloud
E035425030
Chaos in Small-World Networks
Distributed vertex cover
Cloud Partitioning of Load Balancing Using Round Robin Model
Q026201030106
NODE FAILURE TIME ANALYSIS FOR MAXIMUM STABILITY VS MINIMUM DISTANCE SPANNING...
Node failure time analysis for maximum stability vs minimum distance spanning...
Ad

Viewers also liked (20)

PDF
A virtual analysis on various techniques using ann with
PDF
An experimental study of square footing resting on geo grid reinforced sand
PDF
Design of fuzzy logic controller for starch
PDF
Pilot aided scheduling for uplink ofdma
PDF
Analysis of methane diffusion flames
PDF
Scour investigation around single and two piers
PDF
Three dimensional finite element modeling of pervious
PDF
Partial encryption of compresed video
PDF
An improved color image encryption algorithm with
PDF
Gsm or x10 based scada system for industrial
PDF
Parametric study of response of an asymmetric building for various earthquake...
PDF
A novel block cipher involving keys in a key bunch
PDF
Online stream mining approach for clustering network traffic
PDF
Radiation effects on heat and mass transfer of a mhd
PDF
Aggregates sustainability through preparation of
PDF
An experiental investigation of effect of cutting parameters and tool materia...
PDF
Accuracy enhancement of srtm and aster dems using weight estimation regressio...
PDF
Heat transfer augmentation in different geometries of dimpled surface under n...
PDF
Image compression using negative format
PDF
I slip algorithm for low latency on hybrid noc architecture
A virtual analysis on various techniques using ann with
An experimental study of square footing resting on geo grid reinforced sand
Design of fuzzy logic controller for starch
Pilot aided scheduling for uplink ofdma
Analysis of methane diffusion flames
Scour investigation around single and two piers
Three dimensional finite element modeling of pervious
Partial encryption of compresed video
An improved color image encryption algorithm with
Gsm or x10 based scada system for industrial
Parametric study of response of an asymmetric building for various earthquake...
A novel block cipher involving keys in a key bunch
Online stream mining approach for clustering network traffic
Radiation effects on heat and mass transfer of a mhd
Aggregates sustainability through preparation of
An experiental investigation of effect of cutting parameters and tool materia...
Accuracy enhancement of srtm and aster dems using weight estimation regressio...
Heat transfer augmentation in different geometries of dimpled surface under n...
Image compression using negative format
I slip algorithm for low latency on hybrid noc architecture
Ad

Similar to Load balancing in public cloud combining the concepts of data mining and networking (20)

PDF
A study of localized algorithm for self organized wireless sensor network and...
PDF
Load balancing with switching mechanism in cloud computing environment
PDF
Dijkstra Shortest Path Visualization
PDF
An efficient routing approach for aggregated data transmission along with per...
PDF
New Approach for K-mean and K-medoids Algorithm
PDF
Spatial correlation based clustering algorithm for random and uniform topolog...
PDF
A novel load balancing model for overloaded cloud
PDF
Routing management for mobile ad hoc networks
PDF
Scalable and efficient cluster based framework for multidimensional indexing
PDF
Scalable and efficient cluster based framework for
PDF
Analysis and implementation of modified k medoids
PDF
Communication synchronization in cluster based wireless sensor network a re...
PDF
Vol 3 No 1 - July 2013
PDF
Ameliorate the performance using soft computing approaches in wireless networks
PDF
Density Based Clustering Approach for Solving the Software Component Restruct...
PDF
Stability routing in fso manets
PDF
Managing cost and performing balancing at cloud platform
PDF
Optimising Data Using K-Means Clustering Algorithm
PDF
Machine Learning Algorithms for Image Classification of Hand Digits and Face ...
PDF
Ie3514301434
A study of localized algorithm for self organized wireless sensor network and...
Load balancing with switching mechanism in cloud computing environment
Dijkstra Shortest Path Visualization
An efficient routing approach for aggregated data transmission along with per...
New Approach for K-mean and K-medoids Algorithm
Spatial correlation based clustering algorithm for random and uniform topolog...
A novel load balancing model for overloaded cloud
Routing management for mobile ad hoc networks
Scalable and efficient cluster based framework for multidimensional indexing
Scalable and efficient cluster based framework for
Analysis and implementation of modified k medoids
Communication synchronization in cluster based wireless sensor network a re...
Vol 3 No 1 - July 2013
Ameliorate the performance using soft computing approaches in wireless networks
Density Based Clustering Approach for Solving the Software Component Restruct...
Stability routing in fso manets
Managing cost and performing balancing at cloud platform
Optimising Data Using K-Means Clustering Algorithm
Machine Learning Algorithms for Image Classification of Hand Digits and Face ...
Ie3514301434

More from eSAT Publishing House (20)

PDF
Likely impacts of hudhud on the environment of visakhapatnam
PDF
Impact of flood disaster in a drought prone area – case study of alampur vill...
PDF
Hudhud cyclone – a severe disaster in visakhapatnam
PDF
Groundwater investigation using geophysical methods a case study of pydibhim...
PDF
Flood related disasters concerned to urban flooding in bangalore, india
PDF
Enhancing post disaster recovery by optimal infrastructure capacity building
PDF
Effect of lintel and lintel band on the global performance of reinforced conc...
PDF
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
PDF
Wind damage to buildings, infrastrucuture and landscape elements along the be...
PDF
Shear strength of rc deep beam panels – a review
PDF
Role of voluntary teams of professional engineers in dissater management – ex...
PDF
Risk analysis and environmental hazard management
PDF
Review study on performance of seismically tested repaired shear walls
PDF
Monitoring and assessment of air quality with reference to dust particles (pm...
PDF
Low cost wireless sensor networks and smartphone applications for disaster ma...
PDF
Coastal zones – seismic vulnerability an analysis from east coast of india
PDF
Can fracture mechanics predict damage due disaster of structures
PDF
Assessment of seismic susceptibility of rc buildings
PDF
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
PDF
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Likely impacts of hudhud on the environment of visakhapatnam
Impact of flood disaster in a drought prone area – case study of alampur vill...
Hudhud cyclone – a severe disaster in visakhapatnam
Groundwater investigation using geophysical methods a case study of pydibhim...

Load balancing in public cloud combining the concepts of data mining and networking

  • 1. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 Volume: 03 Special Issue: 03 | May-2014 | NCRIET-2014, Available @ http://www.ijret.org 167

LOAD BALANCING IN PUBLIC CLOUD COMBINING THE CONCEPTS OF DATA MINING AND NETWORKING

Priyanka R
M. Tech Student, Dept. of Computer Science and Engineering, AIET, Karnataka, India

Abstract

Load balancing in the cloud computing environment has an important impact on the performance of the whole system. A good load balancing method makes cloud computing more efficient and increases user satisfaction. The combined concepts of networking, data mining and cloud computing technology are used to achieve a good load balancing strategy. Cloud partitioning helps to simplify the load balancing problem, and algorithms from networking and data mining are used to obtain the cloud partitions. The cloud partitioning concept introduced here is used in a load balancing model for the public cloud, with a switch mechanism that chooses different strategies for different situations. After cloud partitioning, the algorithm applies round robin and game theory as the load balancing strategies to improve efficiency in the public cloud environment.

Keywords: Load balancing, public cloud, Cloud partition

1. INTRODUCTION

Cloud computing, the long-held dream of computing, is similar to utility computing. It is expected to accelerate innovation and business agility in the IT industry, making software more attractive as a service and shaping the way IT hardware is designed and purchased [1].
Cloud computing provides a distributed computing environment that gives a wide range of users distributed access to virtualized, scalable hardware and/or software infrastructure over the internet. Hardware and software packages are delivered as a service to clients over a large-scale network [2].

1.1 Load balancing concepts

Load balancing is the process of distributing the total load over the individual nodes of the collective system to maximize throughput and resource utilization and to minimize response time, while removing the condition in which some nodes are heavily loaded and others are lightly loaded [3]. A load balancer is a software program that receives connection requests from clients and forwards each one to one of the backend servers, which replies accordingly. Because of this, the client is unaware of where the data is stored. This separation also has a security benefit: it hides the structure of the internal network and helps to prevent attacks. In a cloud computing environment, jobs arrive randomly and with random CPU utilization. Such workloads can load specific resources heavily while other resources remain lightly loaded.

2. RELATED WORK

Many studies have been conducted in this area. However, load balancing in the cloud is still a new problem that needs new architectures which can adapt to changing needs. Soumya Ray and Ajanta De Sarkar [15] presented a brief review of existing load balancing algorithms. K. Ramana, A. Subramanyam and A. Ananda Rao [16] state that a load balancing algorithm tries to balance the total system load by transferring workload from heavily loaded nodes to lightly loaded nodes. They also presented a performance analysis of various load balancing algorithms based on different qualitative parameters, considering static and dynamic load balancing approaches.
In the existing system, the cloud is partitioned by geographic area, which is not well suited to all situations. Hence, there is a need for a new partitioning method with which better performance can be achieved.

3. SYSTEM MODEL

The load balancing strategy involves creating cloud partitions. A cloud partition is a sub-area of the public cloud. Here the divisions are based on the VDBSCAN algorithm. Once the public cloud is partitioned, load balancing starts when a job arrives at the system. The main controller decides which cloud partition should receive the job. The load balancer assigned to each partition then decides how to assign the jobs to the nodes. When the load status of a cloud partition is idle or normal, this task can be accomplished locally. If the cloud partition is overloaded, the job should be transferred to another partition. The whole process is shown in Fig. 1.
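The dispatch flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation: the names `Partition`, `search_best_partition` and `dispatch` are hypothetical, and preferring idle partitions over normal ones is an assumption, since the text only requires that the chosen partition is idle or normal.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    IDLE = "idle"
    NORMAL = "normal"
    OVERLOADED = "overloaded"


@dataclass
class Partition:
    name: str
    status: Status
    jobs: List[str] = field(default_factory=list)


def search_best_partition(partitions: List[Partition]) -> Optional[Partition]:
    """Return a partition that can accept work.

    Assumption: idle partitions are preferred over normal ones.
    Overloaded partitions are never selected.
    """
    for wanted in (Status.IDLE, Status.NORMAL):
        for partition in partitions:
            if partition.status is wanted:
                return partition
    return None  # every partition is overloaded


def dispatch(job: str, partitions: List[Partition]) -> Optional[str]:
    """Main-controller step: hand the job to a partition, if one is available."""
    target = search_best_partition(partitions)
    if target is None:
        return None  # the job has to wait or be retried later
    target.jobs.append(job)
    return target.name
```

In this sketch the per-partition balancer (Round Robin or game theory) would take over once `dispatch` has placed the job in `target.jobs`.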
  • 2. We use the following techniques:
1. Dijkstra’s algorithm to find the shortest distance between two nodes
2. VDBSCAN cluster-based cloud partitioning
3. Round Robin technique
4. Game theory technique

The following algorithm shows the procedure.

Algorithm 1
Begin
  Apply Dijkstra’s algorithm to find the shortest path
  Apply VDBSCAN algorithm to find clusters and make each cluster a partition
  while job do
    searchBestPartition(job);
    if (partitionState == idle || partitionState == normal) then
      Send Job to Partition;
    else
      search for another Partition;
    end if
  end while
End

Fig -1: Job assignment strategy

3.1 Cloud Partition

A cloud partition is a sub-area of the public cloud. Partitioning simplifies the process of load balancing.

3.1.1 Dijkstra’s Algorithm

A networking concept is used to determine the distance between servers. We use the distance formula to compute edge weights and then apply Dijkstra’s algorithm [4]. Dijkstra’s algorithm solves the single-source shortest-paths problem on a weighted, directed graph G = (V, E) in which all edge weights are non-negative. In this section, therefore, we assume that w(u, v) ≥ 0 for each edge (u, v) ∈ E [5]. Dijkstra’s algorithm maintains a set S of vertices whose final shortest-path weights from the source s have already been determined. The algorithm repeatedly selects the vertex u ∈ V − S with the minimum shortest-path estimate, adds u to S, and relaxes all edges leaving u. Here, we use a min-priority queue Q of vertices, keyed by their d values. The following algorithm is taken from [5].
DIJKSTRA(G, w, s)
  INITIALIZE-SINGLE-SOURCE(G, s)
  S = ∅
  Q = G.V
  while Q ≠ ∅
    u = EXTRACT-MIN(Q)
    S = S ∪ {u}
    for each vertex v ∈ G.Adj[u]
      RELAX(u, v, w)

Thus, using Dijkstra’s algorithm, we first find the shortest paths and then use the shortest-path distances in the VDBSCAN method to form clusters.

3.1.2 DBSCAN

In the previous step, the shortest distance between any two points is calculated. This value is used in the next step to form clusters. Here we use the VDBSCAN (Varied Density Based Spatial Clustering of Applications with Noise) method to form clusters. DBSCAN can find clusters of arbitrary shape. However, clusters that lie close to each other tend to belong to the same class. The DBSCAN procedure is based on six definitions and two lemmas [6][13].

Definition 1 (The Eps-neighborhood of a point): N_Eps(p) denotes the Eps-neighborhood of a point p and is defined by
$N_{Eps}(p) = \{\, q \in D \mid dist(p, q) < Eps \,\}$ (1)
  • 3. Here D is the set of given points. For a point to belong to a cluster, it needs at least one other point that lies closer to it than the distance Eps.

Definition 2 (Directly density-reachable): There are two kinds of points belonging to a cluster: border points and core points [6]. “The Eps-neighborhood of a border point tends to have significantly less points than the Eps-neighborhood of a core point”. Border points are still part of the cluster; to be included, they must belong to the Eps-neighborhood of a core point q:
$p \in N_{Eps}(q)$ (2)
For q to be a core point, it needs a minimum number of points within its Eps-neighborhood:
$|N_{Eps}(q)| \geq MinPts$ (core point condition) (3)

Definition 3 (Density-reachable): “A point p is density-reachable from a point q with respect to Eps and MinPts if there is a chain of points p1, ..., pn, p1 = q, pn = p such that pi+1 is directly density-reachable from pi.”

Definition 4 (Density-connected): There are cases in which two border points belong to the same cluster but do not share a specific core point. In these situations the points are not density-reachable from each other. There must, however, be a core point q from which they are both density-reachable.
“A point p is density-connected to a point q with respect to Eps and MinPts if there is a point o such that both p and q are density-reachable from o with respect to Eps and MinPts.”

Definition 5 (Cluster): If point p is part of a cluster C and point q is density-reachable from p with respect to a given distance and a minimum number of points within that distance, then q is also part of cluster C.
1) ∀ p, q: if p ∈ C and q is density-reachable from p with respect to Eps and MinPts, then q ∈ C.
Saying that two points belong to the same cluster C is the same as saying that p is density-connected to q with respect to the given distance and the number of points within that distance.
2) ∀ p, q ∈ C: p is density-connected to q with respect to Eps and MinPts.

Definition 6 (Noise): Noise is the set of points that do not belong to any of the clusters.

Lemma 1: A cluster can be formed from any of its core points and will always have the same shape.

Lemma 2: Let p be a core point in cluster C with a given distance (Eps) and a minimum number of points within that distance (MinPts). If the set O is density-reachable from p with respect to the same Eps and MinPts, then C is equal to the set O.

VDBSCAN

The VDBSCAN algorithm [13] detects clusters of varied density and automatically selects several values of the input parameter Eps for the different densities. Peng Liu, Dong Zhou and Naijun Wu give a detailed description in [14].

Description of the Algorithm

In general, the algorithm has two steps: choosing the parameters Eps_i and clustering with varied densities. The algorithm [13] is as follows:
(i) Calculate and store the k-dist of each object and partition the k-dist plot.
(ii) The number of densities is given intuitively by the k-dist plot.
(iii) The parameter Eps_i is selected automatically for each density.
(iv) Scan the dataset and cluster the different densities using the corresponding Eps_i.
(v) Display the valid clusters with respect to the varied densities.
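The partitioning pipeline of Sections 3.1.1 and 3.1.2 (shortest-path distances from Dijkstra’s algorithm fed into repeated DBSCAN passes) can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names are hypothetical, the graph is assumed to be undirected with every node present as a key, and the list of Eps values is passed in by hand rather than derived from a k-dist plot as VDBSCAN does.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]  # node -> [(neighbor, weight)]


def dijkstra(graph: Graph, source: str) -> Dict[str, float]:
    """Single-source shortest-path distances for non-negative edge weights."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0.0
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale queue entry, u was already finalized
        for v, w in graph[u]:
            if d + w < dist[v]:  # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(queue, (dist[v], v))
    return dist


def dbscan_pass(active, dist, eps, min_pts, labels, cluster_id):
    """One DBSCAN pass over the still-unmarked nodes in `active`.

    Returns the next unused cluster id.
    """
    def neighbours(p):
        # Eps-neighborhood of Eq. (1), restricted to the nodes of this pass.
        return [q for q in active if dist[p][q] < eps]

    for p in active:
        if p in labels:
            continue
        seeds = neighbours(p)
        if len(seeds) < min_pts:
            continue  # p is not a core point; it may become a border point later
        labels[p] = cluster_id
        frontier = [q for q in seeds if q not in labels]
        while frontier:
            q = frontier.pop()
            if q in labels:
                continue
            labels[q] = cluster_id
            q_seeds = neighbours(q)
            if len(q_seeds) >= min_pts:  # q is a core point: keep expanding
                frontier.extend(r for r in q_seeds if r not in labels)
        cluster_id += 1
    return cluster_id


def partition_cloud(graph: Graph, eps_values: List[float], min_pts: int) -> Dict[str, int]:
    """Label each node with a partition id; unlabeled nodes are noise."""
    nodes = list(graph)
    dist = {u: dijkstra(graph, u) for u in nodes}
    labels: Dict[str, int] = {}
    next_id = 0
    for eps in sorted(eps_values):  # smallest Eps (densest clusters) first
        active = [v for v in nodes if v not in labels]
        next_id = dbscan_pass(active, dist, eps, min_pts, labels, next_id)
    return labels
```

Running Dijkstra’s algorithm from every node is the simplest way to obtain all pairwise distances here; for large clouds a different all-pairs strategy may be preferable.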
Algorithm:
Partition the k-dist plot.
Give thresholds of the parameters Eps_i (i = 1, 2, ..., n).
For each Eps_i (i = 1, 2, ..., n):
  a) Eps = Eps_i
  b) Apply the DBSCAN algorithm to the points that are not yet marked.
  c) Mark the points as C_i.
Display all the marked points as the corresponding clusters.

Using the VDBSCAN algorithm, we form clusters of nodes and make each cluster a partition.

3.1.3 Assigning Jobs to the Cloud Partition

When a job arrives at the public cloud, the first step is to choose the right partition. The job assignment strategy is taken from [7]. The cloud partition status can be divided into three types:
(1) Idle: When the percentage of idle nodes exceeds α, change to idle status.
(2) Normal: When the percentage of normal nodes exceeds β, change to normal load status.
  • 4. (3) Overload: When the percentage of overloaded nodes exceeds γ, change to overloaded status.

The parameters α, β and γ are set by the cloud partition balancers [7]. When the load status of a cloud partition is idle or normal, the job can be handled locally [7]. If the cloud partition load status is not normal or idle, the job should be transferred to another partition. The partition load balancer then decides how to assign the jobs to the nodes.

Server load status is divided into three types. If one cloud server is overloaded and receives a new client request while other servers are in the Idle or Normal state, the following algorithms are used.
- Idle: If the status is idle, the job should be transferred to another partition using the Round Robin algorithm.
- Normal: If the status is normal, the job should be transferred to another partition using the Game Theory algorithm.
- Overload: If the status is overloaded, the job should be transferred to another partition, which is selected using the two algorithms above.

Step 1: Define a load parameter set F = {F1, F2, ..., Fm}, with each parameter Fi (1 <= i <= m, Fi ∈ [0, 1]) being either static or dynamic. m represents the total number of parameters [7].

Step 2: Compute the load degree as
$Load\_degree(N) = \sum_{i=1}^{m} \alpha_i F_i$ (4)
where the weights $\alpha_i$ (with $\sum_{i=1}^{m} \alpha_i = 1$) may differ for different kinds of jobs and N represents the current node [7].

Step 3: Define evaluation benchmarks.
Calculate the average cloud partition degree from the node load degree statistics as
$Load\_degree_{avg} = \frac{\sum_{i=1}^{n} Load\_degree(N_i)}{n}$ (5)
The benchmark Load_degree_high is then set for different situations based on Load_degree_avg [7].

Step 4: Three node load status levels are then defined as [7]:
- Idle: when
  $Load\_degree(N) = 0$ (6)
  no job is being processed by the node, so its status is changed to Idle.
- Normal: for
  $0 < Load\_degree(N) \leq Load\_degree_{high}$ (7)
  the node is normal and can process other jobs.
- Overloaded: when
  $Load\_degree_{high} \leq Load\_degree(N)$ (8)
  the node is not available and cannot receive jobs until it returns to normal [7].

The load degree results are entered into the Load Status Tables created by the cloud partition balancers. Each balancer has a Load Status Table and refreshes it every fixed period T. The balancers then use the table to calculate the partition status. Each partition status has a different load balancing solution. When a job arrives at a cloud partition, the balancer assigns the job to the nodes based on its current load strategy. The balancers change this strategy as the cloud partition status changes.

3.1.4 Round Robin Algorithm

In the regular Round Robin algorithm, every node has an equal opportunity to be chosen. However, in a public cloud, the configuration and performance of the nodes are not the same, so this method may overload some nodes. An improved Round Robin algorithm is therefore used, called “Round Robin based on the load status”.

Step 1: A job arrives at the main controller.
Step 2: The job is assigned to a balancer based on the request location and the balancer status.
Step 3: In a particular partition P
  Set i = 1
  Set s[1..n] as the servers in P arranged in increasing order of number of jobs
  Set s[1..c] as the idle servers in P taken from s[1..n]
Step 4: When a job arrives
  If s[c] != NULL Then
    send the connection to s[i]
    i = i + 1
    If i > c Then i = 1
  Else
    go to game theory
  End if
Step 5: Go to step 1

3.1.5 Game Theory

The players in the game are the nodes, and they contend for jobs. Suppose there are n nodes in the current cloud partition with N jobs arriving; then define the following parameters:
$\mu_i$: processing ability of each node, i = 1, ..., n.
$\Phi_j$: time spent for each job, j = 1, ..., N.
$\Phi = \sum_{j=1}^{N} \Phi_j$: time spent by the entire cloud partition, with $\Phi < \sum_{i=1}^{n} \mu_i$.
$S_{ij}$: fraction of job j that is assigned to node i ($\sum_{i=1}^{n} S_{ij} = 1$ and $0 \leq S_{ij} \leq 1$).

In this model, the most important step is finding appropriate values of $S_{ij}$. The “best reply” method proposed by Grosu et al. [12] can be used to calculate $S_{ij}$ for each node. The Nash equilibrium here is to minimize the response time of each job.
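The load-degree computation of Eqs. (4) to (8) and the “Round Robin based on the load status” of Section 3.1.4 can be sketched together as below. This is a hedged illustration rather than the model of [7]: all names are hypothetical, a load degree exactly equal to Load_degree_high is treated as normal (Eqs. (7) and (8) overlap at that value), and the fall-back to the game-theory strategy is only signalled by returning `None`.

```python
from enum import Enum
from typing import Dict, Optional, Sequence


class NodeStatus(Enum):
    IDLE = "idle"
    NORMAL = "normal"
    OVERLOADED = "overloaded"


def load_degree(params: Sequence[float], weights: Sequence[float]) -> float:
    """Eq. (4): weighted sum of the load parameters F_i, each in [0, 1]."""
    if len(params) != len(weights):
        raise ValueError("one weight per load parameter is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(a * f for a, f in zip(weights, params))


def average_load_degree(degrees: Sequence[float]) -> float:
    """Eq. (5): mean load degree over the nodes of one partition."""
    return sum(degrees) / len(degrees)


def node_status(degree: float, high: float) -> NodeStatus:
    """Eqs. (6)-(8). Assumption: degree == high counts as normal."""
    if degree == 0:
        return NodeStatus.IDLE
    if degree <= high:
        return NodeStatus.NORMAL
    return NodeStatus.OVERLOADED


class LoadStatusRoundRobin:
    """Cycle over the idle nodes of a partition in a fixed order."""

    def __init__(self, degrees: Dict[str, float], high: float):
        # Only idle nodes take part, as in Step 3 of the algorithm above.
        self.idle = [
            name for name, degree in degrees.items()
            if node_status(degree, high) is NodeStatus.IDLE
        ]
        self._next = 0

    def pick(self) -> Optional[str]:
        """Return the next idle node, or None if there is none."""
        if not self.idle:
            return None  # caller would switch to the game-theory strategy
        node = self.idle[self._next]
        self._next = (self._next + 1) % len(self.idle)
        return node
```

In practice the idle list would be rebuilt from the Load Status Table every refresh period T; the sketch keeps it fixed for one period.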
  • 5. 4. CONCLUSIONS

Load balancing in public clouds is a difficult task, and cloud division is its major problem. Using clustering algorithms is one way of dividing the cloud, and there may be several other ways to do so. The problem can be addressed by combining the concepts of networking and data mining in the load balancing problem of the public cloud.

FUTURE WORK

Since this work is a conceptual framework, more work is needed to implement the framework and resolve new problems.
(1) Cloud division rules: The division can be further enhanced by devising other methods and comparing their performance. The division rule should not simply be based on geographic location.
(2) Set the refresh period: The refresh period for communicating the load status should be set properly. It should be neither too short nor too long. Tests and statistical tools can be used to set a reasonable refresh period.
(3) Other load balance strategies: Other load balancing strategies may also be used, based on tests comparing different strategies. Many tests are needed to guarantee system availability and efficiency.

REFERENCES

[1]. Michael Armbrust, Armando Fox, Gunho Lee and Ion Stoica (2009), “Above the clouds: a Berkeley view of cloud computing,” University of California at Berkeley, Technical Report No. UCB/EECS-2009-28
[2]. Soumya Ray and Ajanta De Sarkar, “Execution analysis of load balancing algorithms in cloud computing environment,” International Journal on Cloud Computing: Services and Architecture (IJCCSA), Vol. 2, No. 5, October 2012
[3]. K. Ramana, A. Subramanyam and A. Ananda Rao, “Comparative analysis of distributed web server system load balancing algorithms using qualitative parameters,” VSRD-IJCSIT, Vol. 1 (8), 2011, 592-600
[4]. J. Moy (Editor), “OSPF protocol analysis,” Network Working Group, Request for Comments: 1245, Proteon, Inc., July 1991
[5]. T. H. Cormen et al., “Introduction to algorithms,” 3rd edition, Prentice-Hall of India, 2010
[6]. Henrik Bäcklund, Anders Hedblom and Niklas Neijman, “A density-based spatial clustering of application with noise,” Data Mining TNM033, 2011-11-30, Linköpings Universitet - ITN
[7]. Gaochao Xu, Junjie Pang and Xiaodong Fu, “A load balancing model based on cloud partitioning for the public cloud,” IEEE Transactions on Cloud Computing, 2013
[8]. Suriya Begum and Dr. Prashanth C.S.R, “Review of load balancing in cloud computing,” IJCSI International Journal of Computer Science Issues, Vol. 10, Issue 1, No. 2, January 2013
[9]. P. Mell and T. Grance, “The NIST definition of cloud computing,” NIST Special Publication 800-145, September 2011
[10]. Martin J. Osborne, “An introduction to game theory 1995–2000,” textbook, University of Toronto, Canada, Oxford University Press, 2000, pp. 1-8
[11]. Jeffrey M. Galloway, Karl L. Smith and Susan S. Vrbsky, “Power aware load balancing for cloud computing,” World Congress on Engineering and Computer Science 2011, Vol. I, WCECS 2011, October 19-21, 2011, San Francisco, USA
[12]. Daniel Grosu and Anthony T. Chronopoulos, “Noncooperative load balancing in distributed systems,” Journal of Parallel and Distributed Computing, Elsevier, 65 (2005) 1022-1034
[13]. M. Parimala, Daphne Lopez and N.C. Senthilkumar, “A survey on density based clustering algorithms for mining large spatial databases,” International Journal of Advanced Science and Technology, Vol. 31, June 2011
[14]. Peng Liu, Dong Zhou and Naijun Wu, “VDBSCAN: Varied density based spatial clustering of applications with noise,” 1-4244-0885-7/07/$20.00 ©2007 IEEE
[15]. Soumya Ray and Ajanta De Sarkar, “Execution analysis of load balancing algorithms in cloud computing environment,” International Journal on Cloud Computing: Services and Architecture (IJCCSA), Vol. 2, No. 5, October 2012
[16]. K. Ramana, A. Subramanyam and A. Ananda Rao, “Comparative analysis of distributed web server system load balancing algorithms using qualitative parameters,” VSRD-IJCSIT, Vol. 1 (8), 2011, 592-600