UNIT -II
UNIT I INTRODUCTION
Evolution of Distributed computing: Scalable computing
over the Internet – Technologies for network based
systems – clusters of cooperative computers - Grid
computing Infrastructures – cloud computing -
service oriented architecture – Introduction to Grid
Architecture and standards – Elements of Grid –
Overview of Grid Architecture.
2 Dr Gnanasekaran Thangavel 8/30/2016
Distributed Computing
Definition
• “A distributed system consists of multiple autonomous computers that communicate through a computer network.”
• “Distributed computing utilizes a network of many computers, each accomplishing a portion of an overall task, to achieve a computational result much more quickly than with a single computer.”
• “Distributed computing is any computing that involves multiple computers remote from each other that each have a role in a computation
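The second definition, with each computer accomplishing a portion of an overall task, can be illustrated with a minimal sketch. It assumes threads stand in for networked computers, and the helper names `split` and `distributed_sum` are hypothetical, not part of the slides.

```python
from concurrent.futures import ThreadPoolExecutor

def split(items, parts):
    """Split items into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def distributed_sum(items, workers=4):
    """Each worker sums one chunk; the partial results are then combined."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, split(items, workers)))
    return sum(partials)

print(distributed_sum(list(range(1, 101))))  # prints 5050
```

In a real distributed system the chunks would travel over the network as messages, so the communication cost has to be weighed against the time saved by working in parallel.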
Introduction
• A distributed system is one in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing.
• In the term distributed computing, the word distributed means spread out across space. Thus, distributed computing is an activity performed on a spatially distributed system.
• These networked computers may be in the same
Introduction
[Figure: agents cooperating over the Internet to serve a large-scale application; labels: Job Request, Resource Management, Subscription, Distribution, Agent, Cooperation]
Motivation
• Inherently distributed applications
• Performance/cost
• Resource sharing
• Flexibility and extensibility
• Availability and fault tolerance
• Scalability
• Network connectivity is increasing.
• A combination of cheap processors is often more cost-effective than one expensive fast system.
• Potential increase in reliability.
History
• 1975 – 1985
– Parallel computing was favored in the early years
– Primarily vector-based at first
– Gradually more thread-based parallelism was introduced
– The first distributed computing programs were a pair of programs called Creeper and Reaper, written in the 1970s
– Ethernet was invented in the 1970s
– ARPANET e-mail was invented in the early 1970s and is probably the earliest example of a large-scale distributed application
History
• 1985 – 1995
– Massively parallel architectures started rising, and the Message Passing Interface (MPI) and other libraries were developed
– Bandwidth was a big problem
– The first Internet-based distributed computing project was started in 1988 by the DEC System Research Center
– Distributed.net, a project founded in 1997, is considered the first to use the Internet to distribute data for calculation and collect the results
History
• 1995 – Today
– Cluster/grid architecture increasingly dominant
– Special node machines eschewed in favor of COTS (commercial off-the-shelf) technologies
– Web-wide cluster software
– Google takes this to the extreme (thousands of nodes per cluster)
– SETI@Home started in May 1999 to analyze the radio signals collected by the Arecibo Radio Telescope in Puerto Rico
Goal
• Making resources accessible
– Data sharing and device sharing
• Distribution transparency
– Access, location, migration, relocation, replication, concurrency, failure
• Communication
– Make human-to-human communication easier, e.g. electronic mail
• Flexibility
– Spread the workload over the available machines in the most cost-effective way
• To coordinate the use of shared resources
• To solve large computational problems
Characteristics
• Resource Sharing
• Openness
• Concurrency
• Scalability
• Fault Tolerance
• Transparency
Architecture
• Client-server
• 3-tier architecture
• N-tier architecture
• Loose coupling or tight coupling
• Peer-to-peer
• Space based
Application
• Examples of commercial applications:
– Database Management System
– Distributed computing using mobile agents
– Local intranet
– Internet (World Wide Web)
– Java Remote Method Invocation (RMI)
Distributed Computing Using Mobile Agents
• Mobile agents can wander around a network, using free resources for their own computations.
Local Intranet
• A portion of the Internet that is separately administered and supports internal sharing of resources (file/storage systems and printers) is called a local intranet.
Internet
• The Internet is a global system of interconnected computer networks that use the standardized Internet Protocol Suite (TCP/IP).
Java RMI
• Embedded in the Java language:
– Object variant of remote procedure call
– Adds naming compared with RPC (Remote Procedure Call)
– Restricted to Java environments
[Figure: RMI Architecture]
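The two ideas on this slide, a remote call made on an object and a naming step to find that object, can be sketched in a language-neutral way. The sketch below is not Java RMI and uses none of its API: `Registry`, `Stub`, and `Calculator` are hypothetical names, the "network" is simulated in-process, and JSON stands in for RMI's own serialization.

```python
import json

class Stub:
    """Client-side proxy: marshals each call into a message, then dispatches it."""
    def __init__(self, target):
        self._target = target

    def __getattr__(self, method):
        def call(*args):
            request = json.dumps({"method": method, "args": list(args)})  # marshal request
            message = json.loads(request)                                  # unmarshal on the "server"
            result = getattr(self._target, message["method"])(*message["args"])
            return json.loads(json.dumps(result))                          # marshal the reply
        return call

class Registry:
    """Maps names to server-side objects (the naming step)."""
    def __init__(self):
        self._objects = {}

    def bind(self, name, obj):
        self._objects[name] = obj

    def lookup(self, name):
        return Stub(self._objects[name])

class Calculator:
    def add(self, a, b):
        return a + b

registry = Registry()
registry.bind("calc", Calculator())
calc = registry.lookup("calc")   # the client only knows the name
print(calc.add(2, 3))            # prints 5
```

The client code reads like a local method call, while the stub hides the marshalling. That is the sense in which RMI is an object variant of RPC.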
Categories of Applications in distributed
computing
• Science
• Life Sciences
• Cryptography
• Internet
• Financial
• Mathematics
• Language
• Art
• Puzzles/Games
• Miscellaneous
• Distributed Human Project
• Collaborative Knowledge Bases
• Charity
Advantages
• Economics:
– Computers harnessed together give a better price/performance ratio than mainframes.
• Speed:
– A distributed system may have more total computing power than a mainframe.
• Inherent distribution of applications:
– Some applications are inherently distributed, e.g. an ATM banking application.
• Reliability:
– If one machine crashes, the system as a whole can still survive if it has multiple server machines and multiple storage devices (redundancy).
• Extensibility and incremental growth:
– It is possible to scale up gradually (in terms of processing power and functionality) by adding more resources (both hardware and software).
Disadvantages
• Complexity:
– Lack of experience in designing and implementing distributed systems, e.g. which platform (hardware and OS) to use, which language to use, etc.
• Network problems:
– If the network underlying a distributed system saturates or goes down, the distributed system is effectively disabled, negating most of its advantages.
• Security:
– Security is a major concern, since easy access to data means easy access to secret data as well.
Issues and Challenges
• Heterogeneity of components:
– Variety and differences apply to computer hardware, networks, operating systems, programming languages, and implementations by different developers.
– All differences in representation must be dealt with if messages are to be exchanged.
– Example: the calls for exchanging messages in UNIX differ from those in Windows.
• Openness:
– The system can be extended and re-implemented in various ways.
– This cannot be achieved unless the specification and documentation are made available to software developers.
– The main challenge for designers is to tackle the complexity of
Issues and Challenges cont…
• Transparency:
– Aim: make certain aspects of distribution invisible to application programmers, so that they can focus on the design of their particular application.
– They need not be concerned with the locations of resources or the details of how they operate, whether replicated or migrated.
– Failures can be presented to application programmers in the form of exceptions, which must be handled.
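The last point, failures surfacing as exceptions that the application must handle, can be sketched as follows. This is a minimal illustration under stated assumptions: `RemoteError`, `call_with_retry`, and `make_flaky` are hypothetical names, and the remote call is simulated locally.

```python
class RemoteError(Exception):
    """Raised when a (simulated) remote call fails."""

def call_with_retry(remote_call, attempts=3):
    """Invoke remote_call, retrying on RemoteError up to `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            return remote_call()
        except RemoteError as err:
            last_error = err   # remember the failure and try again
    raise last_error           # every attempt failed: surface the failure

def make_flaky(failures):
    """Return a call that fails `failures` times, then succeeds."""
    state = {"left": failures}
    def call():
        if state["left"] > 0:
            state["left"] -= 1
            raise RemoteError("replica unreachable")
        return "ok"
    return call

print(call_with_retry(make_flaky(2)))  # prints ok
```

Retrying masks transient failures, but a persistent failure still reaches the caller, which is why full failure transparency cannot be promised.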
Issues and Challenges cont…
• Transparency:
– This concept can be summarized as shown in this figure:
Issues and Challenges cont…
• Security:
– Security for information resources in a distributed system has three components:
a. Confidentiality: protection against disclosure to unauthorized individuals.
b. Integrity: protection against alteration or corruption.
c. Availability: protection against interference with the means to access the resources.
– The challenge is to send sensitive information over the Internet in a secure manner and to identify a remote user or other agent correctly.
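As one illustration of the integrity component, the sketch below uses a keyed hash (HMAC) from the Python standard library so that a receiver can detect an altered message. It assumes the two parties already share a secret key; the key and messages are placeholders. It does not provide confidentiality, which would need encryption as well.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag that travels with the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)

key = b"demo-shared-secret"                 # assumption: placeholder key
message = b"transfer 100 to account 42"
tag = sign(key, message)

print(verify(key, message, tag))                         # prints True
print(verify(key, b"transfer 900 to account 42", tag))   # prints False
```

Because only holders of the key can produce a valid tag, the same check also gives the receiver some assurance about who sent the message.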
Issues and Challenges cont…
• Scalability:
– Distributed computing operates at many different scales, ranging from a small intranet to the Internet.
– A system is scalable if it remains effective when there is a significant increase in the number of resources and users.
– The challenges are:
a. controlling the cost of physical resources.
b. controlling the performance loss.
c. preventing software resources from running out.
d. avoiding performance bottlenecks.
Issues and Challenges cont…
• Failure handling:
– Failures in a distributed system are partial: some components fail while others continue to function.
– That is why handling failures is difficult:
a. Detecting failures: some failures cannot be detected but may be suspected, and the system must cope with that.
b. Masking failures: hiding a failure is not guaranteed in the worst case.
• Concurrency:
– Where applications or services process requests concurrently, their operations may conflict with one another and produce inconsistent results.
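The point that failures can often only be suspected, not detected, is commonly handled with timeouts. The following is a minimal sketch of a heartbeat-based detector. The class name, node names, and timeout value are assumptions for illustration, and time is passed in explicitly so the example stays deterministic.

```python
class HeartbeatDetector:
    """Suspects a node when no heartbeat has arrived within `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, node, now):
        """Record that `node` was heard from at time `now`."""
        self.last_seen[node] = now

    def suspected(self, now):
        """Nodes that have been silent for longer than the timeout."""
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)

detector = HeartbeatDetector(timeout=5.0)
detector.heartbeat("node-a", now=0.0)
detector.heartbeat("node-b", now=0.0)
detector.heartbeat("node-a", now=4.0)
print(detector.suspected(now=6.0))  # prints ['node-b']
```

A suspected node may only be slow or cut off by the network, so the detector can be wrong. A later heartbeat simply removes the suspicion.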
Conclusion
• Distributed computing is an efficient way to make the best use of computing resources.
• Distributed computing is everywhere: intranets, the Internet, and mobile ubiquitous computing (laptops, PDAs, pagers, smart watches, hi-fi systems).
• It deals with hardware and software systems that contain more than one processing or storage element and run concurrently.
• The main motivating factor is resource sharing, such as files, printers, web pages, or database records.
• Grid computing and cloud computing are forms of distributed computing.
Grid Computing
Grid computing is a form of distributed computing whereby a "super and virtual computer" is composed of a cluster of networked, loosely coupled computers acting in concert to perform very large tasks.
Grid computing (Foster and Kesselman, 1999) is a growing technology that facilitates the execution of large-scale, resource-intensive applications on geographically distributed computing resources.
It facilitates flexible, secure, coordinated large-scale resource sharing among dynamic collections of individuals, institutions, and resources.
Enable communities (“virtual organizations”) to share
Criteria for a Grid:
• Coordinates resources that are not subject to centralized control.
• Uses standard, open, general-purpose protocols and interfaces.
• Delivers nontrivial qualities of service.
Benefits
• Exploit underutilized resources
• Resource load balancing
• Virtualize resources across an enterprise
• Data Grids, Compute Grids
• Enable collaboration for virtual organizations
Grid Applications
Data and computationally intensive applications:
This technology has been applied to computationally intensive scientific, mathematical, and academic problems such as drug discovery, economic forecasting, and seismic analysis, and to back-office data processing in support of e-commerce.
• A chemist may utilize hundreds of processors to screen thousands of compounds per hour.
• Teams of engineers worldwide pool resources to analyze terabytes of structural data.
• Meteorologists seek to visualize and analyze petabytes of climate data with enormous computational demands.
Resource sharing
• Computers, storage, sensors, networks, …
• Sharing is always conditional: issues of trust, policy, negotiation, payment, …
Grid Topologies
• Intragrid
– Local grid within an organization
– Trust based on personal contracts
• Extragrid
– Resources of a consortium of organizations
connected through a (Virtual) Private Network
– Trust based on Business to Business contracts
• Intergrid
– Global sharing of resources through the internet
– Trust based on certification
Computational Grid
“A computational grid is a hardware and software
infrastructure that provides dependable, consistent, pervasive,
and inexpensive access to high-end computational
capabilities.”
“The Grid: Blueprint for a New Computing Infrastructure”, Kesselman & Foster
Example: Science Grid (US Department of Energy)
Data Grid
• A data grid is a grid computing system that deals with data: the controlled sharing and management of large amounts of distributed data.
• A data grid is the storage component of a grid environment. Scientific and engineering applications require access to large amounts of data, and often this data is widely distributed. A data grid provides seamless access to the local or remote data required to complete compute-intensive calculations.
Example:
Biomedical Informatics Research Network (BIRN),
the Southern California Earthquake Center (SCEC).
Methods of Grid Computing
• Distributed Supercomputing
• High-Throughput Computing
• On-Demand Computing
• Data-Intensive Computing
• Collaborative Computing
• Logistical Networking
Distributed Supercomputing
• Combines multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer.
• Tackles problems that cannot be solved on a single system.
High-Throughput Computing
• Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles to work.
On-Demand Computing
• Uses grid capabilities to meet short-term requirements for resources that are not locally accessible.
• Models real-time computing demands.
Collaborative Computing
• Concerned primarily with enabling and enhancing human-to-human interactions.
• Applications are often structured in terms of a virtual shared space.
Data-Intensive Computing
• The focus is on synthesizing new information from data that is maintained in geographically distributed repositories, digital libraries, and databases.
• Particularly useful for distributed data mining.
Logistical Networking
• Logistical networks focus on exposing storage resources inside networks by optimizing the global scheduling of data transport and data storage.
• Contrasts with traditional networking, which does not explicitly model storage resources in the network.
• High-level services for Grid applications.
• Called "logistical" because of the analogy with systems of warehouses, depots, and distribution channels.
P2P Computing vs Grid Computing
• They differ in their target communities.
• A Grid system deals with a more complex, more powerful, more diverse, and more highly interconnected set of resources than P2P.
• VO (virtual organization)
A typical view of Grid environment
• User: sends a computation- or data-intensive application to the Global Grid in order to speed up its execution.
• Resource Broker: distributes the jobs in an application to the Grid resources, based on the user’s QoS requirements and the details of available Grid resources, for further execution.
• Grid Resources (cluster, PC, supercomputer, database, instruments, etc.): execute the user jobs in the Global Grid.
• Grid Information Service: collects the details of the available Grid resources and passes the information to the resource broker.
[Figure: numbered flows 1–4 between these components, labeled Grid application, Details of Grid resources, Computational jobs, Processed jobs, Computation result]
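The interaction between the four components can be sketched in a few lines. This is an illustrative model only, not any real grid middleware API. The class names, the MIPS and cost figures, and the "cheapest resource that meets the deadline" policy are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    mips: int          # processing capability (million instructions per second)
    cost: float        # price per job, in a hypothetical unit
    free: bool = True

class InformationService:
    """Collects details of Grid resources and reports the available ones."""
    def __init__(self, resources):
        self._resources = list(resources)

    def available(self):
        return [r for r in self._resources if r.free]

class ResourceBroker:
    """Maps a job to a resource according to the user's QoS requirements."""
    def __init__(self, info_service):
        self.info = info_service

    def submit(self, job_mi, deadline_s, budget):
        """Pick the cheapest free resource that meets the deadline and budget."""
        candidates = [r for r in self.info.available()
                      if job_mi / r.mips <= deadline_s and r.cost <= budget]
        if not candidates:
            return None
        chosen = min(candidates, key=lambda r: r.cost)
        chosen.free = False                       # the resource is now busy
        return chosen.name, job_mi / chosen.mips  # (resource, run time in seconds)

grid = InformationService([
    Resource("pc", mips=100, cost=1.0),
    Resource("cluster", mips=1000, cost=5.0),
    Resource("supercomputer", mips=10000, cost=50.0),
])
broker = ResourceBroker(grid)
print(broker.submit(job_mi=5000, deadline_s=10, budget=20))  # prints ('cluster', 5.0)
```

The PC is too slow for the deadline and the supercomputer exceeds the budget, so the broker settles on the cluster. A second identical job would find no candidate until the cluster is free again.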
Grid Middleware
• Grids are typically managed by gridware, a special type of middleware that enables sharing and manages grid components based on user requirements and resource attributes (e.g., capacity, performance).
• Gridware is software that connects other software components or applications to provide the following functions:
– Run applications on suitable available resources: brokering, scheduling
– Provide uniform, high-level access to resources: semantic interfaces; Web Services, Service Oriented Architectures
– Address inter-domain issues of security, policy, etc.: federated identities
– Provide application-level status monitoring and control
Middleware
• Globus – University of Chicago
• Condor – University of Wisconsin – high-throughput computing
• Legion – University of Virginia – virtual workspaces, collaborative computing
• IBP – Internet Backplane Protocol – University of Tennessee – logistical networking
• NetSolve – solving scientific problems in heterogeneous environments – high-throughput and data-intensive computing
Two Key Grid Computing Groups
The Globus Alliance (www.globus.org)
• Composed of people from Argonne National Laboratory, the University of Chicago, the University of Southern California Information Sciences Institute, the University of Edinburgh, and others.
• OGSA/I standards were initially proposed by the Globus Group.
The Global Grid Forum (www.ggf.org)
• Heavy involvement of academic groups and industry (e.g. IBM Grid Computing, HP, United Devices, Oracle, UK e-Science Programme, US DOE, US NSF, Indiana University, and many others)
• Process
– Meets three times annually
– Solicits involvement from industry, research groups, and academics
Some of the Major Grid Projects
Name | URL/Sponsor | Focus
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org; European Union | Create tech for remote access to supercomputing resources & simulation codes; in GRIP, integrate with Globus Toolkit™
Fusion Collaboratory | fusiongrid.org; DOE Off. Science | Create a national computational collaboratory for fusion research
Globus Project™ | globus.org; DARPA, DOE, NSF, NASA, Msoft | Research on Grid technologies; development and support of Globus Toolkit™; application and deployment
GridLab | gridlab.org; European Union | Grid technologies and applications
GridPP | gridpp.ac.uk; U.K. eScience | Create & apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center | grids-center.org; NSF | Integration, deployment, support of the NSF Middleware Infrastructure for research & education
Grid Architecture
The Hourglass Model
• Focus on architecture issues
– Propose a set of core services as basic infrastructure
– Used to construct high-level, domain-specific solutions (diverse)
• Design principles
– Keep participation cost low
– Enable local control
– Support for adaptation
– “IP hourglass” model
[Figure: hourglass, top to bottom: Applications, Diverse global services, Core services, Local OS]
Layered Grid Architecture
(By Analogy to Internet Architecture)
Grid layers (top to bottom):
• Application
• Collective – “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
• Resource – “Sharing single resources”: negotiating access, controlling use
• Connectivity – “Talking to things”: communication (Internet protocols) & security
• Fabric – “Controlling things locally”: access to, & control of, resources
Internet Protocol Architecture layers (top to bottom): Application, Transport, Internet, Link
Example:
Data Grid Architecture
• App: Discipline-Specific Data Grid Application
• Collective (App): Coherency control, replica selection, task management, virtual data catalog, virtual data code catalog, …
• Collective (Generic): Replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs, …
• Resource: Access to data, access to computers, access to network performance data, …
• Connect: Communication, service discovery (DNS), authentication, authorization, delegation
• Fabric: Storage systems, clusters, networks, network caches, …
Simulation tools
• GridSim – job scheduling
• SimGrid – single-client, multi-server scheduling
• Bricks – scheduling
• GangSim – Ganglia VO
• OptorSim – Data Grid simulations
• G3S – Grid Security Services Simulator – security services
Simulation tool
GridSim is a Java-based toolkit for modeling and simulation of distributed resource management and scheduling for conventional Grid environments.
GridSim is based on SimJava, a general-purpose discrete-event simulation package implemented in Java.
All components in GridSim communicate with each other through message-passing operations defined by SimJava.
Salient features of the GridSim
• It allows modeling of heterogeneous types of resources.
• Resources can be modeled as operating in space-shared or time-shared mode.
• Resource capability can be defined in the form of a MIPS (Million Instructions Per Second) benchmark rating.
• Resources can be located in any time zone.
• Weekends and holidays can be mapped depending on the resource’s local time to model non-Grid (local) workload.
• Resources can be booked for advance reservation.
• Applications with different parallel application models can be simulated.
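The difference between space-shared and time-shared modes, together with a MIPS rating, can be worked through numerically. The sketch below is plain Python and does not use the GridSim API. It assumes a single processing element, jobs that all arrive at time 0, job lengths given in million instructions (MI), and ideal equal sharing in time-shared mode.

```python
def space_shared(lengths_mi, mips):
    """FCFS on one processing element: each job runs alone at full speed."""
    finish, clock = [], 0.0
    for length in lengths_mi:
        clock += length / mips
        finish.append(clock)
    return finish

def time_shared(lengths_mi, mips):
    """All jobs start at t=0 and share the processing element equally."""
    order = sorted(range(len(lengths_mi)), key=lambda i: lengths_mi[i])
    finish = [0.0] * len(lengths_mi)
    clock, done = 0.0, 0.0          # `done` = work already completed by every active job
    for rank, i in enumerate(order):
        active = len(order) - rank  # jobs still running while job i finishes
        clock += (lengths_mi[i] - done) * active / mips
        done = lengths_mi[i]
        finish[i] = clock
    return finish

jobs = [100, 200, 300]                 # job lengths in MI
print(space_shared(jobs, mips=100))    # prints [1.0, 3.0, 6.0]
print(time_shared(jobs, mips=100))     # prints [3.0, 5.0, 6.0]
```

Both modes finish the whole batch at the same time, but space sharing completes short jobs earlier, while time sharing lets every job make progress from the start.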
Salient features of the GridSim
• Application tasks can be heterogeneous, and they can be CPU or I/O intensive.
• There is no limit on the number of application jobs that can be submitted to a resource.
• Multiple user entities can submit tasks for execution simultaneously in the same resource, which may be time-shared or space-shared. This feature helps in building schedulers that can use different market-driven economic models for selecting services competitively.
• Network speed between resources can be specified.
• It supports simulation of both static and dynamic schedulers.
• Statistics of all or selected operations can be recorded, and they can be analyzed using GridSim statistics analysis methods.
A Modular Architecture for GridSim Platform and Components.
[Figure: layered architecture, top to bottom]
• Application, User, Grid Scenario’s Input and Results: application configuration, resource configuration, user requirements, grid scenario, output
• Grid Resource Brokers or Schedulers
• GridSim Toolkit: application modeling, resource entities, information services, job management, resource allocation, statistics
• Resource Modeling and Simulation: single CPU, SMPs, clusters, load, network, reservation
• Basic Discrete Event Simulation Infrastructure: SimJava, Distributed SimJava
• Virtual Machine
• Distributed Resources: PCs, workstations, clusters, SMPs
Web 2.0, Clouds, and Internet of Things
HPC: High - Performance Computing HTC: High - Throughput Computing
P2P: Peer to Peer MPP: Massively Parallel Processors
What is a Service Oriented Architecture?
What is a Service Oriented Architecture (SOA)?
• A method of design, deployment, and management of both applications and the software infrastructure where:
– All software is organized into business services that are network accessible and executable.
– Service interfaces are based on public standards for interoperability.
Key Characteristics of SOA
• Quality of service, security, and performance are specified.
• Software infrastructure is responsible for managing.
• Services are cataloged and discoverable.
• Data are cataloged and discoverable.
• Protocols use only industry standards.
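The idea that services are cataloged, discoverable, and described with explicit quality-of-service attributes can be sketched with a small in-memory catalog. This illustrates the concept only. `ServiceCatalog`, its fields, and the endpoints are hypothetical, and a real deployment would use a registry standard rather than a Python dictionary.

```python
class ServiceCatalog:
    """In-memory catalog: services publish a description, clients discover by criteria."""

    def __init__(self):
        self._entries = {}

    def publish(self, name, endpoint, latency_ms, secure):
        """Register a service together with its declared QoS and security properties."""
        self._entries[name] = {"endpoint": endpoint, "latency_ms": latency_ms, "secure": secure}

    def discover(self, max_latency_ms, require_secure=False):
        """Names of services that satisfy the requested QoS and security criteria."""
        return sorted(
            name for name, e in self._entries.items()
            if e["latency_ms"] <= max_latency_ms and (e["secure"] or not require_secure)
        )

catalog = ServiceCatalog()
catalog.publish("inventory", "https://example.invalid/inventory", latency_ms=40, secure=True)
catalog.publish("reports", "http://example.invalid/reports", latency_ms=300, secure=False)

print(catalog.discover(max_latency_ms=100, require_secure=True))  # prints ['inventory']
```

Clients select a service by its published properties rather than by a hard-wired address, which is what allows the infrastructure to manage services centrally.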
What is a “Service”?
• A Service is a reusable component.
• A Service changes business data from one state to another.
• A Service is the only way data is accessed.
• If you can describe a component in WSDL, it is a Service.
Information Technology is Not SOA
[Figure: layered stack, top to bottom: Business Mission, Information Management, Information Systems, Systems Design, Computing & Communications; brackets mark which layers belong to Information Technology and which to SOA]
Why Getting SOA Will be Difficult
• Managing for Projects:
– Software: 1 - 4 years
– Hardware: 3 - 5 years
– Communications: 1 - 3 years
– Project Managers: 2 - 4 years
– Reliable funding: 1 - 4 years
– User turnover: 30%/year
– Security risks: 1 minute or less
• Managing for SOA:
– Data: forever
– Infrastructure: 10+ years
Why Managing Business Systems is Difficult?
• The 40 million lines of code in Windows XP are unknowable.
• Testing an application (3 million lines) requires >10^15 tests.
• The probability of correct data entry for a supply item is <65%.
• There are >100 formats that identify a person in DoD.
How to View Organizing for SOA
[Figure: levels of organization, ordered from stability to variety and separated by security barriers]
STABILITY HERE
• GLOBAL LEVEL: Industry Standards, Commercial Off-the-Shelf Products and Services
• ENTERPRISE LEVEL: Corporate Policy, Corporate Standards, Reference Models, Data Management and Tools, Integrated Systems Configuration Data Base, Shared Computing and Telecommunications
• PROCESS LEVEL: Functional Process A, Functional Process B, Functional Process C, Functional Process D
• BUSINESS LEVEL
• APPLICATION LEVEL: Applications Development & Maintenance
• LOCAL LEVEL: Graphic InfoWindow, Personal Tools, Inquiry Languages, Customized Applications, Prototyping Tools, Local Applications and Files
• PERSONAL LEVEL: Private Applications and Files
VARIETY HERE
Security barriers: Process Security Barrier, Business Security Barrier, Applications Security Barrier, Privacy and Individual Security Barrier
Other labels: OSD, Service A, Service B
SOA Must Reflect Timing
[Figure: levels of organization, ordered from long-term to short-term]
LONG TERM STABILITY & TECHNOLOGY COMPLEXITY
• GLOBAL: Industry Standards, Commercial Off-the-Shelf Products and Services
• ENTERPRISE: Corporate Policy, Corporate Standards, Reference Models, Data Management and Tools, Integrated Systems Configuration Data Base, Shared Computing and Telecommunications, Security and Survivability
• PROCESS: Functional Process A, Functional Process B, Functional Process C, Functional Process D
• BUSINESS
• APPLICATION: Applications Development & Maintenance
• LOCAL: Graphic InfoWindow, Personal Tools, Inquiry Languages, Customized Applications, Prototyping Tools, Local Applications and Files
• PERSONAL: Private Applications and Files
SHORT TERM ADAPTABILITY & TECHNOLOGY SIMPLICITY
Other labels: Business A, Business B, Infrastructure Support
SOA Must Reflect Conflicting Interests
Enterprise
Missions
Organizations
Local
Personal
Organization of Infrastructure Services
Infrastructure Services (Enterprise Information):
• Data Services
• Security Services
• Computing Services
• Communication Services
• Application Services
Organization of Data Services
Data Services:
• Discovery Services
• Management Services
• Collaboration Services
• Interoperability Services
• Semantic Services
Data Interoperability Policies
• Data are an enterprise resource.
• Single-point entry of unique data.
• Enterprise certification of all data definitions.
• Data stewardship defines data custodians.
• Zero defects at point of entry.
• De-conflict data at source, not at higher levels.
• Data aggregations from source data, not from reports.
Data Concepts
• Data Element Definition
– Text associated with a unique data element within a data dictionary that describes the data element, gives it a specific meaning, and differentiates it from other data elements. The definition is precise, concise, non-circular, and unambiguous. (ISO/IEC 11179 Metadata Registry specification)
• Data Element Registry
– A label kept by a registration authority that describes a unique meaning and representation of data elements, including registration identifiers, definitions, names, value
Data and Services Deployment Principles
• Data, services, and applications belong to the Enterprise.
• Information is a strategic asset.
• Data and applications cannot be coupled to each other.
• Interfaces must be independent of implementation.
• Data must be visible outside of the applications.
• Semantics and syntax are defined by a community of interest.
• Data must be understandable and trusted.
Organization of Security Services
Security Services:
• Transfer Services
• Protection Services
• Certification Services
• Systems Assurance
• Authentication Services
Security Services = Information Assurance
• Conduct Attack/Event Response
– Ensure timely detection and appropriate response to attacks.
– Manage measures required to minimize the network’s vulnerability.
• Secure Information Exchanges
– Secure information exchanges that occur on the network with a level of protection that is matched to the risk of compromise.
• Provide Authorization and Non-Repudiation Services
Organization of Computing Services
Computing Services:
• Computing Facilities
• Resource Planning
• Control & Quality
• Configuration Services
• Financial Management
Computing Services
• Provide Adaptable Hosting Environments
– Global facilities for hosting to the “edge”.
– Virtual environments for data centers.
• Distributed Computing Infrastructure
– Data storage and shared spaces for information sharing.
• Shared Computing Infrastructure Resources
Organization of Communication Services
Communication Services:
• Interoperability Services
• Spectrum Management
• Connectivity Arrangements
• Continuity of Services
• Resource Management
Network Services Implementation
• From point-to-point communications (push communications) to network-centric processes (pull communications).
• Data posted to shared space for retrieval.
• Network controls assure data synchronization and access security.
Communication Services
• Provide Information Transport
– Transport information, data, and services anywhere.
– Ensure transport between end-user devices and servers.
Organization of Application Services
Application Services:
• Component Repository
• Code Binding Services
• Maintenance Management
• Portals
• Experimental Services
Example of Development Tools
• Business Process Execution Language (BPEL) is an executable modeling language. Through XML it enables code generation.
Traditional Approach | BPEL Approach
Hard-coded decision logic | Externalized decision logic
Developed by IT | Modeled by business analysts
Maintained by IT | Maintained by policy managers
Managed by IT | Managed by IT
Dependent upon custom logs | Automatic logs and process capture
Hard to modify and reuse | Easy to modify and reuse
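The contrast between hard-coded and externalized decision logic can be illustrated without BPEL itself. The sketch below is not BPEL and generates no XML. It only shows, under assumed rule names and thresholds, how moving a decision out of code and into data lets it change without touching the program.

```python
import operator

def approve_hard_coded(order):
    """Traditional approach: the policy lives inside the code."""
    return order["amount"] <= 1000 and order["region"] == "EU"

# Externalized approach: the policy is data that could be loaded from a file
# maintained by policy managers (hypothetical rule format).
RULES = [
    {"field": "amount", "op": "le", "value": 1000},
    {"field": "region", "op": "eq", "value": "EU"},
]
OPS = {"le": operator.le, "ge": operator.ge, "eq": operator.eq}

def approve_externalized(order, rules):
    """Generic engine: evaluates whatever rules it is given."""
    return all(OPS[r["op"]](order[r["field"]], r["value"]) for r in rules)

order = {"amount": 750, "region": "EU"}
print(approve_hard_coded(order), approve_externalized(order, RULES))  # prints True True
```

Raising the approval limit in the externalized version means editing one entry in `RULES`, while the hard-coded version needs a code change and a redeployment.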
A Few Key SOA Protocols
• Universal Description, Discovery, and Integration (UDDI) defines the publication and discovery of web service implementations.
• The Web Services Description Language (WSDL) is an XML-based language that describes Web Services.
• SOAP, originally the Simple Object Access Protocol, is an XML-based messaging protocol widely used in SOA, in which a network node (the client) sends a request to another node (the server).
• The Lightweight Directory Access Protocol (LDAP) is a protocol for querying and modifying directory services.
• Extract, Transform, and Load (ETL) is a process of moving

Cloud and Grid Computing PPT computer science.pdf

  • 2. UNIT I INTRODUCTION Evolution of Distributed computing: Scalable computing over the Internet – Technologies for network based systems – clusters of cooperative computers - Grid computing Infrastructures – cloud computing - service oriented architecture – Introduction to Grid Architecture and standards – Elements of Grid – Overview of Grid Architecture. 2 Dr Gnanasekaran Thangavel 8/30/2016
  • 3. Distributed Computing Definition y“A distributed system consists of multiple autonomous computers that communicate through a computer network. y“Distributed computing utilizes a network of many computers, each accomplishing a portion of an overall task, to achieve a computational result much more quickly than with a single computer.” y“Distributed computing is any computing that involves multiple computers remote from each other that each have a role in a computation 3 Dr Gnanasekaran Thangavel 8/30/2016
  • 4. Introduction y A distributed system is one in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing. y In the term distributed computing, the word distributed means spread out across space. Thus, distributed computing is an activity performed on a spatially distributed system. y These networked computers may be in the same 4 Dr Gnanasekaran Thangavel 8/30/2016
  • 6. Motivation y Inherently distributed applications y Performance/cost y Resource sharing y Flexibility and extensibility y Availability and fault tolerance y Scalability y Network connectivity is increasing. y Combination of cheap processors often more cost-effective than one expensive fast system. y Potential increase of reliability. 6 Dr Gnanasekaran Thangavel 8/30/2016
  • 7. History y 1975 – 1985 yParallel computing was favored in the early years yPrimarily vector-based at first yGradually more thread-based parallelism was introduced yThe first distributed computing programs were a pair of programs called Creeper and Reaper invented in 1970s yEthernet that was invented in 1970s. yARPANET e-mail was invented in the early 1970s and probably the earliest example of a large-scale distributed application. 7 Dr Gnanasekaran Thangavel 8/30/2016
  • 8. History y 1985 -1995 yMassively parallel architectures start rising and message passing interface and other libraries developed yBandwidth was a big problem yThe first Internet-based distributed computing project was started in 1988 by the DEC System Research Center. yDistributed.net was a project founded in 1997 - considered the first to use the internet to distribute data for calculation and collect the results, 8 Dr Gnanasekaran Thangavel 8/30/2016
  • 9. History y 1995 – Today yCluster/grid architecture increasingly dominant ySpecial node machines eschewed in favor of COTS technologies yWeb-wide cluster software yGoogle take this to the extreme (thousands of nodes/cluster) ySETI@Home started in May 1999 - analyze the radio signals that were being collected by the Arecibo Radio Telescope in Puerto Rico. 9 Dr Gnanasekaran Thangavel 8/30/2016
  • 10. Goal y Making Resources Accessible yData sharing and device sharing y Distribution Transparency yAccess, location, migration, relocation, replication, concurrency, failure y Communication yMake human-to-human comm. easier. E.g.. : electronic mail y Flexibility ySpread the work load over the available machines in the most cost effective way y To coordinate the use of shared resources y To solve large computational problem 10 Dr Gnanasekaran Thangavel 8/30/2016
  • 12. Architecture yClient-server y3-tier architecture yN-tier architecture yloose coupling, or tight coupling yPeer-to-peer ySpace based 12 Dr Gnanasekaran Thangavel 8/30/2016
  • 13. yExamples of commercial application : yDatabase Management System yDistributed computing using mobile agents yLocal intranet yInternet (World Wide Web) yJAVA Remote Method Invocation (RMI) Application 13 Dr Gnanasekaran Thangavel 8/30/2016
  • 14. Distributed Computing Using Mobile Agents y Mobile agents can be wandering around in a network using free resources for their own computations. 14 Dr Gnanasekaran Thangavel 8/30/2016
  • 15. Local Intranet y A portion of Internet that is separately administered & supports internal sharing of resources (file/storage systems and printers) is called local intranet. 15 Dr Gnanasekaran Thangavel 8/30/2016
  • 16. Internet y The Internet is a global system of interconnected computer networks that use the standardized Internet Protocol Suite (TCP/IP). 16 Dr Gnanasekaran Thangavel 8/30/2016
  • 17. JAVA RMI y Embedded in language Java:- y Object variant of remote procedure call y Adds naming compared with RPC (Remote Procedure Call) y Restricted to Java environments RMI Architecture 17 Dr Gnanasekaran Thangavel 8/30/2016
  • 18. Categories of Applications in distributed computing y Science y Life Sciences y Cryptography y Internet y Financial y Mathematics y Language y Art y Puzzles/Games y Miscellaneous y Distributed Human Project y Collaborative Knowledge Bases y Charity 18 Dr Gnanasekaran Thangavel 8/30/2016
  • 19. Advantages y Economics:- y Computers harnessed together give a better price/performance ratio than mainframes. y Speed:- y A distributed system may have more total computing power than a mainframe. y Inherent distribution of applications:- y Some applications are inherently distributed. E.g., an ATM-banking application. y Reliability:- y If one machine crashes, the system as a whole can still survive if you have multiple server machines and multiple storage devices (redundancy). y Extensibility and Incremental Growth:- y Possible to gradually scale up (in terms of processing power and functionality) by adding more sources (both hardware and software). 19 Dr Gnanasekaran Thangavel 8/30/2016
  • 20. Disadvantages y Complexity :- y Lack of experience in designing, and implementing a distributed system. E.g. which platform (hardware and OS) to use, which language to use etc. y Network problem:- y If the network underlying a distributed system saturates or goes down, then the distributed system will be effectively disabled thus negating most of the advantages of the distributed system. y Security:- y Security is a major hazard since easy access to data means easy access to secret data as well. 20 Dr Gnanasekaran Thangavel 8/30/2016
  • 21. Issues and Challenges y Heterogeneity of components :- yvariety or differences that apply to computer hardware, network, OS, programming language and implementations by different developers. yAll differences in representation must be deal with if to do message exchange. yExample : different call for exchange message in UNIX different from Windows. y Openness:- ySystem can be extended and re-implemented in various ways. yCannot be achieved unless the specification and documentation are made available to software developer. yThe most challenge to designer is to tackle the complexity of 21 Dr Gnanasekaran Thangavel 8/30/2016
  • 22. Issues and Challenges cont… yTransparency:- yAim : make certain aspects of distribution are invisible to the application programmer ; focus on design of their particular application. yThey not concern the locations and details of how it operate, either replicated or migrated. yFailures can be presented to application programmers in the form of exceptions – must be handled. 22 Dr Gnanasekaran Thangavel 8/30/2016
  • 23. Issues and Challenges cont… y Transparency:- yThis concept can be summarize as shown in this Figure: 23 Dr Gnanasekaran Thangavel 8/30/2016
  • 24. Issues and Challenges cont… y Security:- ySecurity for information resources in distributed system have 3 components : a. Confidentiality : protection against disclosure to unauthorized individuals. b. Integrity : protection against alteration/corruption c. Availability : protection against interference with the means to access the resources. yThe challenge is to send sensitive information over Internet in a secure manner and to identify a remote user or other agent correctly. 24 Dr Gnanasekaran Thangavel 8/30/2016
  • 25. Issues and Challenges cont.. y Scalability :- yDistributed computing operates at many different scales, ranging from small Intranet to Internet. yA system is scalable if there is significant increase in the number of resources and users. yThe challenges is : a. controlling the cost of physical resources. b. controlling the performance loss. c. preventing software resource running out. d. avoiding performance bottlenecks. 25 Dr Gnanasekaran Thangavel 8/30/2016
  • 26. Issues and Challenges cont… y Failure Handling :- yFailures in a distributed system are partial – some components fail while others can function. yThat’s why handling the failures are difficult a. Detecting failures : to manage the presence of failures cannot be detected but may be suspected. b. Masking failures : hiding failure not guaranteed in the worst case. y Concurrency :- yWhere applications/services process concurrency, it will effect a conflict in operations with one another and produce inconsistence results. 26 Dr Gnanasekaran Thangavel 8/30/2016
  • 27. Conclusion y The concept of distributed computing is the most efficient way to achieve the optimization. y Distributed computing is anywhere : intranet, Internet or mobile ubiquitous computing (laptop, PDAs, pagers, smart watches, hi-fi systems) y It deals with hardware and software systems, that contain more than one processing / storage and run in concurrently. y Main motivation factor is resource sharing; such as files , printers, web pages or database records. y Grid computing and cloud computing are form of distributed computing. 27 Dr Gnanasekaran Thangavel 8/30/2016
  • 28. Grid Computing Grid computing is a form of distributed computing whereby a "super and virtual computer" is composed of a cluster of networked, loosely coupled computers, acting in concert to perform very large tasks. Grid computing (Foster and Kesselman, 1999) is a growing technology that facilitates the executions of large-scale resource intensive applications on geographically distributed computing resources. Facilitates flexible, secure, coordinated large scale resource sharing among dynamic collections of individuals, institutions, and resource Enable communities (“virtual organizations”) to share 8/30/2016 28 Dr Gnanasekaran Thangavel
  • 29. Criteria for a Grid: Coordinates resources that are not subject to centralized control. Uses standard, open, general-purpose protocols and interfaces. Delivers nontrivial qualities of service. Benefits ƒ Exploit Underutilized resources ƒ Resource load Balancing ƒ Virtualize resources across an enterprise ƒ Data Grids, Compute Grids ƒ Enable collaboration for virtual organizations 29 Dr Gnanasekaran Thangavel 8/30/2016
  • 30. Grid Applications Data and computationally intensive applications: This technology has been applied to computationally-intensive scientific, mathematical, and academic problems like drug discovery, economic forecasting, seismic analysis back office data processing in support of e- commerce y A chemist may utilize hundreds of processors to screen thousands of compounds per hour. y Teams of engineers worldwide pool resources to analyze terabytes of structural data. y Meteorologists seek to visualize and analyze petabytes of climate data with enormous computational demands. Resource sharing y Computers, storage, sensors, networks, … y Sharing always conditional: issues of trust, policy, negotiation, payment, … 8/30/2016 30 Dr Gnanasekaran Thangavel
  • 31. Grid Topologies • Intragrid – Local grid within an organization – Trust based on personal contracts • Extragrid – Resources of a consortium of organizations connected through a (Virtual) Private Network – Trust based on Business to Business contracts • Intergrid – Global sharing of resources through the internet – Trust based on certification 8/30/2016 31 Dr Gnanasekaran Thangavel
  • 32. 8/30/2016 Dr Gnanasekaran Thangavel 32 Computational Grid “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” ”The Grid: Blueprint for a New Computing Infrastructure”, Kesselman & Foster Example : Science Grid (US Department of Energy)
  • 33. Data Grid y A data grid is a grid computing system that deals with data — the controlled sharing and management of large amounts of distributed data. y Data Grid is the storage component of a grid environment. Scientific and engineering applications require access to large amounts of data, and often this data is widely distributed. A data grid provides seamless access to the local or remote data required to complete compute intensive calculations. Example : Biomedical informatics Research Network (BIRN), the Southern California earthquake Center (SCEC). 8/30/2016 33 Dr Gnanasekaran Thangavel
  • 34. Methods of Grid Computing y Distributed Supercomputing y High-Throughput Computing y On-Demand Computing y Data-Intensive Computing y Collaborative Computing y Logistical Networking 8/30/2016 34 Dr Gnanasekaran Thangavel
  • 35. Distributed Supercomputing y Combining multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer. y Tackle problems that cannot be solved on a single system. 8/30/2016 35 Dr Gnanasekaran Thangavel
  • 36. High-Throughput Computing y Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles to work. On-Demand Computing | Uses grid capabilities to meet short-term requirements for resources that are not locally accessible. | Models real-time computing demands. 8/30/2016 36 Dr Gnanasekaran Thangavel
  • 37. Collaborative Computing y Concerned primarily with enabling and enhancing human-to- human interactions. y Applications are often structured in terms of a virtual shared space. Data-Intensive Computing | The focus is on synthesizing new information from data that is maintained in geographically distributed repositories, digital libraries, and databases. | Particularly useful for distributed data mining. 8/30/2016 37 Dr Gnanasekaran Thangavel
  • 38. Logistical Networking y Logistical networks focus on exposing storage resources inside networks by optimizing the global scheduling of data transport, and data storage. y Contrasts with traditional networking, which does not explicitly model storage resources in the network. y high-level services for Grid applications y Called "logistical" because of the analogy it bears with the systems of warehouses, depots, and distribution channels. 8/30/2016 38 Dr Gnanasekaran Thangavel
  • 39. P2P Computing vs Grid Computing yDiffer in Target Communities yGrid system deals with more complex, more powerful, more diverse and highly interconnected set of resources than P2P. yVO 8/30/2016 39 Dr Gnanasekaran Thangavel
  • 40. A typical view of Grid environment User Resource Broker Grid Resources Grid Information Service A User sends computation or data intensive application to Global Grids in order to speed up the execution of the application. A Resource Broker distribute the jobs in an application to the Grid resources based on user’s QoS requirements and details of available Grid resources for further executions. Grid Resources (Cluster, PC, Supercomputer, database, instruments, etc.) in the Global Grid execute the user jobs. Grid Information Service system collects the details of the available Grid resources and passes the information to the resource broker. Computation result Grid application Computational jobs Details of Grid resources Processed jobs 1 2 3 4 40 Dr Gnanasekaran Thangavel 8/30/2016
  • 41. Grid Middleware y Grids are typically managed by grid ware - a special type of middleware that enable sharing and manage grid components based on user requirements and resource attributes (e.g., capacity, performance) y Software that connects other software components or applications to provide the following functions: Run applications on suitable available resources – Brokering, Scheduling Provide uniform, high-level access to resources – Semantic interfaces – Web Services, Service Oriented Architectures Address inter-domain issues of security, policy, etc. – Federated Identities Provide application-level status monitoring and control 8/30/2016 41 Dr Gnanasekaran Thangavel
  • 42. Middleware y Globus –chicago Univ y Condor – Wisconsin Univ – High throughput computing y Legion – Virginia Univ – virtual workspaces- collaborative computing y IBP – Internet back pane – Tennesse Univ – logistical networking y NetSolve – solving scientific problems in heterogeneous env – high throughput & data intensive 8/30/2016 42 Dr Gnanasekaran Thangavel
  • 43. Two Key Grid Computing Groups The Globus Alliance (www.globus.org) y Composed of people from: Argonne National Labs, University of Chicago, University of Southern California Information Sciences Institute, University of Edinburgh and others. y OGSA/I standards initially proposed by the Globus Group The Global Grid Forum (www.ggf.org) y Heavy involvement of Academic Groups and Industry y (e.g. IBM Grid Computing, HP, United Devices, Oracle, UK e-Science Programme, US DOE, US NSF, Indiana University, and many others) y Process y Meets three times annually y Solicits involvement from industry, research groups, and academics 8/30/2016 43 Dr Gnanasekaran Thangavel
  • 44. Some of the Major Grid Projects Name URL/Sponsor Focus EuroGrid, Grid Interoperability (GRIP) eurogrid.org European Union Create tech for remote access to super comp resources & simulation codes; in GRIP, integrate with Globus Toolkit™ Fusion Collaboratory fusiongrid.org DOE Off. Science Create a national computational collaboratory for fusion research Globus Project™ globus.org DARPA, DOE, NSF, NASA, Msoft Research on Grid technologies; development and support of Globus Toolkit™; application and deployment GridLab gridlab.org European Union Grid technologies and applications GridPP gridpp.ac.uk U.K. eScience Create & apply an operational grid within the U.K. for particle physics research Grid Research Integration Dev. & Support Center grids-center.org NSF Integration, deployment, support of the NSF Middleware Infrastructure for research & education 8/30/2016 44 Dr Gnanasekaran Thangavel
  • 45. Grid Architecture 8/30/2016 45 Dr Gnanasekaran Thangavel
  • 46. The Hourglass Model y Focus on architecture issues y Propose set of core services as basic infrastructure y Used to construct high-level, domain-specific solutions (diverse) y Design principles y Keep participation cost low y Enable local control y Support for adaptation y “IP hourglass” model Diverse global services Core services Local OS A p p l i c a t i o n s 8/30/2016 46 Dr Gnanasekaran Thangavel
  • 47. Layered Grid Architecture (By Analogy to Internet Architecture) Application Fabric “Controlling things locally”: Access to, & control of, resources Connectivity “Talking to things”: communication (Internet protocols) & security Resource “Sharing single resources”: negotiating access, controlling use Collective “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services Internet Transport Application Link Internet Protocol Architecture 8/30/2016 47 Dr Gnanasekaran Thangavel
  • 48. Example: Data Grid Architecture Discipline-Specific Data Grid Application Coherency control, replica selection, task management, virtual data catalog, virtual data code catalog, … Replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs, Access to data, access to computers, access to network performance data, … Communication, service discovery (DNS), authentication, authorization, delegation Storage systems, clusters, networks, network caches, … Collective (App) App Collective (Generic) Resource Connect Fabric 8/30/2016 48 Dr Gnanasekaran Thangavel
  • 49. Simulation tools y GridSim – job scheduling y SimGrid – single client multiserver scheduling y Bricks – scheduling y GangSim- Ganglia VO y OptoSim – Data Grid Simulations y G3S – Grid Security services Simulator – security services 49 Dr Gnanasekaran Thangavel 8/30/2016
  • 50. Simulation tool GridSim is a Java-based toolkit for modeling, and simulation of distributed resource management and scheduling for conventional Grid environment. GridSim is based on SimJava, a general purpose discrete- event simulation package implemented in Java. All components in GridSim communicate with each other through message passing operations defined by SimJava. 50 Dr Gnanasekaran Thangavel 8/30/2016
  • 51. Salient features of the GridSim y It allows modeling of heterogeneous types of resources. y Resources can be modeled operating under space- or time- shared mode. y Resource capability can be defined (in the form of MIPS (Million Instructions Per Second) benchmark. y Resources can be located in any time zone. y Weekends and holidays can be mapped depending on resource’s local time to model non-Grid (local) workload. y Resources can be booked for advance reservation. y Applications with different parallel application models can be simulated. 51 Dr Gnanasekaran Thangavel 8/30/2016
  • 52. Salient features of the GridSim y Application tasks can be heterogeneous and they can be CPU or I/O intensive. y There is no limit on the number of application jobs that can be submitted to a resource. y Multiple user entities can submit tasks for execution simultaneously in the same resource, which may be time- shared or space-shared. This feature helps in building schedulers that can use different market-driven economic models for selecting services competitively. y Network speed between resources can be specified. y It supports simulation of both static and dynamic schedulers. y Statistics of all or selected operations can be recorded and they can be analyzed using GridSim statistics analysis methods. 52 Dr Gnanasekaran Thangavel 8/30/2016
  • 53. A Modular Architecture for GridSim Platform and Components. Appn Conf Res Conf User Req Grid Sc Output Application, User, Grid Scenario’s input and Results Grid Resource Brokers or Schedulers … Appn modeling Res entity Info serv Job mgmt Res alloc Statis GridSim Toolkit Single CPU SMPs Clusters Load Netw Reservation Resource Modeling and Simulation SimJava Distributed SimJava Basic Discrete Event Simulation Infrastructure PCs Workstation Clusters SMPs Distributed Resources Virtual Machine 53 Dr Gnanasekaran Thangavel 8/30/2016
  • 54. Web 2.0, Clouds, and Internet of Things HPC: High - Performance Computing HTC: High - Throughput Computing P2P: Peer to Peer MPP: Massively Parallel Processors 54 Dr Gnanasekaran Thangavel 8/30/2016
  • 55. 55 What is a Service Oriented Architecture?
  • 56. 56 What is a Service Oriented Architecture (SOA)? yA method of design, deployment, and management of both applications and the software infrastructure where: yAll software is organized into business services that are network accessible and executable. yService interfaces are based on public standards for interoperability.
  • 57. 57 Key Characteristics of SOA yQuality of service, security and performance are specified. ySoftware infrastructure is responsible for managing. yServices are cataloged and discoverable. yData are cataloged and discoverable. yProtocols use only industry standards.
  • 58. 58 What is a “Service”? yA Service is a reusable component. yA Service changes business data from one state to another. yA Service is the only way how data is accessed. yIf you can describe a component in WSDL, it is a Service.
  • 59. 59 Information Technology is Not SOA Business Mission Information Management Information Systems Systems Design Computing & Communications Information Technology SOA
  • 60. 60 Why Getting SOA Will be Difficult y Managing for Projects: ySoftware: 1 - 4 years yHardware: 3 - 5 years; yCommunications: 1 - 3 years; yProject Managers: 2 - 4 years; yReliable funding: 1 - 4 years; yUser turnover: 30%/year; ySecurity risks: 1 minute or less. y Managing for SOA: yData: forever. yInfrastructure: 10+ years.
  • 61. 61 Why Managing Business Systems is Difficult? y 40 Million lines of code in Windows XP is unknowable. y Testing application (3 Million lines) requires >1015 tests. y Probability correct data entry for a supply item is <65%. y There are >100 formats that identify a person in DoD.
  • 62. 62 How to View Organizing for SOA STABILITY HERE VARIETY HERE C orp o rate Po licy, C o rp o rate Stan d ard s, Referen ce M o d els, D ata M anagem en t and To o ls, I n tegrated System s C o nfigu ratio n D ata Base, Shared C o m p utin g an d Telecom m un icatio n s A p p licatio ns D evelo p m en t & M ainten ance EN T ERPRI SE LEV EL PRO C ESS LEV EL BU SI N ESS LEV EL A PPLI C AT I O N LEV EL LO C A L LEV EL G rap h ic I n fo W in d o w , Perso nal To o ls, I n q u iry Lan gu ages C u sto m iz ed A p plicatio n s, Pro to typ in g To ols, Lo cal A pp licatio n s and Files A p p l icati on s Secu rity Barri er Bu si n ess Secu ri ty Barrier Process Secu ri ty B arrier Priv acy an d I n d i v i d u al Secu rity Barri er G LO BA L LEV EL I n d ustry Stan d ard s, C o m m ercial O ff-the-Sh elf Pro du cts an d Services PERSO N A L LEV EL Private A p plication s and Files Fu n ctio n al Pro cess A Fu n ction al Pro cess B Fu n ction al Pro cess C Fun ction al Pro cess D OSD Service A Service B
  • 63. 63 SOA Must Reflect Timing Corporate Policy, Corporate Standards, Reference Models, Data Management and Tools, Integrated Systems Configuration Data Base, Shared Computing and Telecommunications, Security and Survivability Business A Business B Infrastructure Support Applications Development & Maintenance ENTERPRISE PROCESS BUSINESS APPLICATION LOCAL Graphic InfoWindow, Personal Tools, Inquiry Languages Customized Applications, Prototyping Tools, Local Applications and Files GLOBAL Industry Standards, Commercial Off-the-Shelf Products and Services PERSONAL Private Applications and Files Functional Process A Functional Process B Functional Process C Functional Process D LONG TERM STABILITY & TECHNOLOGY COMPLEXITY SHORT TERM ADAPTABILITY & TECHNOLOGY SIMPLICITY
  • 64. 64 SOA Must Reflect Conflicting Interests Enterprise Missions Organizations Local Personal
  • 65. 65 Organization of Infrastructure Services Infrastructure Services (Enterprise Information) Data Services Security Services Computing Services Communication Services Application Services
  • 66. 66 Organization of Data Services Data Services Discovery Services Management Services Collaboration Services Interoperability Services Semantic Services
  • 67. 67 Data Interoperability Policies y Data are an enterprise resource. y Single-point entry of unique data. y Enterprise certification of all data definitions. y Data stewardship defines data custodians. y Zero defects at point of entry. y De-conflict data at source, not at higher levels. y Data aggregations from sources data, not from reports.
  • 68. 68 Data Concepts y Data Element Definition yText associated with a unique data element within a data dictionary that describes the data element, give it a specific meaning and differentiates it from other data elements. Definition is precise, concise, non-circular, and unambiguous. (ISO/IEC 11179 Metadata Registry specification) y Data Element Registry yA label kept by a registration authority that describes a unique meaning and representation of data elements, including registration identifiers, definitions, names, value
  • 69. 69 Data and Services Deployment Principles y Data, services and applications belong to the Enterprise. y Information is a strategic asset. y Data and applications cannot be coupled to each other. y Interfaces must be independent of implementation. y Data must be visible outside of the applications. y Semantics and syntax is defined by a community of interest. y Data must be understandable and trusted.
  • 70. Organization of Security Services: Security Services comprise Transfer Services, Protection Services, Certification Services, Systems Assurance, and Authentication Services.
  • 71. Security Services = Information Assurance • Conduct Attack/Event Response - Ensure timely detection and appropriate response to attacks. - Manage measures required to minimize the network’s vulnerability. • Secure Information Exchanges - Secure information exchanges that occur on the network with a level of protection that is matched to the risk of compromise. • Provide Authorization and Non-Repudiation Services
  • 72. Organization of Computing Services: Computing Services comprise Computing Facilities, Resource Planning, Control & Quality, Configuration Services, and Financial Management.
  • 73. Computing Services • Provide Adaptable Hosting Environments - Global facilities for hosting to the “edge”. - Virtual environments for data centers. • Distributed Computing Infrastructure - Data storage and shared spaces for information sharing. • Shared Computing Infrastructure Resources
  • 74. Organization of Communication Services: Communication Services comprise Interoperability Services, Spectrum Management, Connectivity Arrangements, Continuity of Services, and Resource Management.
  • 75. Network Services Implementation • From point-to-point communications (push communications) to network-centric processes (pull communications). • Data posted to shared space for retrieval. • Network controls assure data synchronization and access security.
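The shift from push to pull described above can be illustrated with a toy shared space. This is a minimal sketch under stated assumptions, not a real network service: the class `SharedSpace`, its methods, and the user and topic names are all hypothetical. A version counter stands in for data synchronization, and a set of authorized users stands in for access security.

```python
class SharedSpace:
    """Hypothetical shared space: producers post, consumers pull when ready."""

    def __init__(self, authorized: set) -> None:
        self._items = {}              # topic -> (version, payload)
        self._authorized = authorized

    def post(self, topic: str, payload: str) -> int:
        """Publish a new version of a topic and return its version number."""
        version = self._items.get(topic, (0, ""))[0] + 1
        self._items[topic] = (version, payload)
        return version

    def retrieve(self, user: str, topic: str) -> tuple:
        """Pull the latest (version, payload) if the user is authorized."""
        if user not in self._authorized:
            raise PermissionError(f"{user} may not read {topic}")
        return self._items[topic]


space = SharedSpace(authorized={"analyst"})
space.post("inventory", "42 units")
space.post("inventory", "40 units")
latest = space.retrieve("analyst", "inventory")
```

The producer never needs to know who its consumers are; each consumer pulls the latest version when it needs it, and the space itself enforces who may read.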
  • 76. Communication Services • Provide Information Transport - Transport information, data and services anywhere. - Ensure transport between end-user devices and servers.
  • 77. Organization of Application Services: Application Services comprise Component Repository, Code Binding Services, Maintenance Management, Portals, and Experimental Services.
  • 78. Example of Development Tools • Business Process Execution Language, BPEL, is an executable modeling language. Through XML it enables code generation. Traditional Approach vs. BPEL Approach: - Hard-coded decision logic vs. externalized decision logic - Developed by IT vs. modeled by business analysts - Maintained by IT vs. maintained by policy managers - Managed by IT vs. managed by IT - Dependent upon custom logs vs. automatic logs and process capture - Hard to modify and reuse vs. easy to modify and reuse
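The contrast between hard-coded and externalized decision logic can be sketched without BPEL itself. The Python below is not BPEL and is not how a BPEL engine works; it only illustrates the idea of moving rules out of code and into data. The thresholds, the JSON layout, and the function names are assumptions made for this example.

```python
import json


def approve_hard_coded(amount: float) -> str:
    """Traditional approach: thresholds live in code, so changes need a developer."""
    if amount < 1000:
        return "auto-approve"
    if amount < 10000:
        return "manager-review"
    return "director-review"


# Externalized approach: the same rules held as data (hypothetical format)
# that could be maintained outside the program.
RULES_JSON = """
[
  {"below": 1000,  "decision": "auto-approve"},
  {"below": 10000, "decision": "manager-review"},
  {"below": null,  "decision": "director-review"}
]
"""


def approve_externalized(amount: float, rules: list, log: list) -> str:
    """Apply the first matching rule and record the decision automatically."""
    for rule in rules:
        limit = rule["below"]
        if limit is None or amount < limit:
            log.append((amount, rule["decision"]))
            return rule["decision"]
    raise ValueError("no rule matched")


rules = json.loads(RULES_JSON)
audit_log = []
decision = approve_externalized(2500, rules, audit_log)
```

Changing a threshold in the externalized version means editing the rule data rather than the function, and every decision is logged as a side effect of evaluation, loosely mirroring the "automatic logs and process capture" row of the comparison.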
  • 79. A Few Key SOA Protocols • Universal Description, Discovery, and Integration, UDDI, defines the publication and discovery of web service implementations. • The Web Services Description Language, WSDL, is an XML-based language that describes Web Services. • SOAP, originally the Simple Object Access Protocol, is an XML-based messaging protocol. It is a key SOA protocol in which a network node (the client) sends a request to another node (the server). • The Lightweight Directory Access Protocol, or LDAP, is a protocol for querying and modifying directory services. • Extract, Transform, and Load, ETL, is a process of moving