Performance Evaluation of Open source E-commerce application (Konakart) on private cloud
Performance evaluation of open source
E-commerce Framework in Private cloud
infrastructure
By
Onkar Ramesh Kadam
A Project
in
The Department of
Electrical and Computer Engineering
Presented in Partial Fulfillment of the Requirements
for the Degree of Master of Engineering at
Concordia University
Montreal, Quebec, Canada
2013
©Onkar Kadam,2013
Faculty of Engineering and Computer Science
Expectations of Originality
This is to Certify that project prepared by
Onkar Ramesh Kadam
Submitted in partial fulfillment of the requirements for the
degree of
Master of Engineering
Complies with the regulations of this University and meets the
accepted standards with respect to originality and quality
Onkar Kadam, 6614590
FALL 2013
©Onkar kadam, 6614590, Concordia University, [email protected] 1
Acknowledgement
I take this opportunity to express my profound gratitude and deep regards to my guide,
Professor Dr. Yan Liu, for her exemplary guidance, monitoring and constant encouragement
throughout the course of this project. The blessing, help and guidance she has given me from
time to time shall carry me a long way in the journey of life on which I am about to embark. I
thank her for giving me an opportunity to work on various emerging technologies.
I also take this opportunity to express my gratitude to the Department of Electrical and Computer
Engineering, Concordia University, for allowing me to take up this project towards my Master of
Engineering degree.
Lastly, I thank the Almighty, my parents, brother, sisters and friends for their constant
encouragement, without which this project would not have been possible.
Onkar Ramesh Kadam
6614590
Abstract
Performance evaluation of open source E-commerce Framework in Private
cloud infrastructure
Onkar Kadam
The term "Cloud Computing" has been mentioned in relation to services or infrastructural
resources, which can be contracted over a network. Thus, the idea of renting instead of buying IT
is nothing new. The players in the large world of clouds are Software as a Service providers,
outsourcing and hosting providers, network and IT infrastructure providers and, above all, the
companies whose names are closely linked with the Internet's commercial boom. But, all these
services in combination outline the complete package known as Cloud Computing – depending
on the source with the appropriate focus. That which long ago established itself in the private
environment of the Internet is now, noticeably, coming to the attention of businesses too. Not
only developers and startups but also large companies with international activities recognize that
there is more to Cloud Computing than just marketing hype. Cloud Computing offers the
opportunity to access IT resources and services with appreciable convenience and speed. Behind
this is, primarily, a solution that provides users with services that can be drawn upon on demand
and invoiced as and when used. Suppliers of cloud services, in turn, benefit as their IT resources
are used more fully and eventually achieve additional economies of scale.
This project evaluates the performance of the open source e-commerce platform KonaKart on
the open source private cloud Eucalyptus, using the free application performance management
solution AppDynamics Lite. It examines how open source platforms can be used to improve the
performance of an e-commerce application at no cost. The project also proposes how an
application should be distributed and scaled over the cloud to handle a large volume of
day-to-day transactions and traffic. It can serve as a foundation for auto-scaling at the
Platform as a Service level in multi-tenant cloud infrastructures.
FALL 2013
©Onkar kadam, 6614590, Concordia University, [email protected] 3
Table Of Contents
List of Figures
Cloud Computing Fundamentals
1.1. What is “Cloud Computing”?
1.2. What is a “Cloud”?
1.2.1. Types of Cloud deployment models
1.3. Advantages and Features of Cloud Computing Services
1.4. Cloud Computing Layers
1.4.1. Infrastructure-as-a-Service
1.4.2. Software-as-a-Service
1.4.3. Platform-as-a-Service
1.5. Characteristics of Cloud Computing
1.5.1. On-Demand self-service
1.5.2. Broad Network Access
1.5.3. Resource Pooling
1.5.4. Elasticity
1.5.5. Measured Capacity
1.5.6. Multi-tenancy
1.6. Scalability in cloud systems
1.7. Virtualization Technology
1.7.1. Virtual Machines
1.7.2. Virtualization platforms
1.7.3. Virtual Infrastructure Management
1.7.4. Cloud Infrastructure Manager
Cloud Computing Infrastructures
2.1. Types of clouds
2.1.1. Public Cloud
2.1.2. Private Cloud
2.1.3. Community Cloud
2.1.4. Hybrid or Mixed Cloud
2.2. Case Studies
2.2.1. Amazon Web services in a nutshell (IAAS provider)
2.2.2. Joyent (IAAS provider)
2.2.3. Rackspace Cloud Servers (IAAS provider)
2.2.4. Flexiscale (IAAS provider)
2.2.5. Gogrid
2.2.6. App Engine
2.2.7. Microsoft Windows Azure
2.2.8. Heroku
2.2.9. Aneka
2.2.10. AppScale
2.2.11. VMWare Vsphere and Vcloud
2.2.12. Nimbus
2.2.13. OpenNebula
2.2.14. Eucalyptus
Eucalyptus: Open Source, AWS compatible private cloud
3.1. Architecture
3.1.1. Node Controller (NC)
3.1.2. Cluster Controller (CC)
3.1.3. Walrus Storage Controller
3.1.4. Storage Controller (SC)
3.1.5. Cloud Controller (CLC)
3.1.6. VMware Broker
3.2. Eucalyptus Machine Images (EMI)
3.3. Security
Monitoring in Cloud Infrastructures
4.1. Need Of Monitoring in Cloud
4.2. Abstraction levels of monitoring in Cloud
4.2.1. Low-Level Monitoring
4.2.2. High-Level Monitoring
4.3. Commercial Cloud Monitoring Platforms and Services
4.3.1. Amazon EC2
4.3.2. Microsoft Azure
4.3.3. Rackspace
4.3.4. CloudStatus
4.3.5. GoGrid
4.3.6. Nimsoft
4.3.7. AppDynamics Pro
4.4. Open Source Monitoring Platforms and Services
4.4.1. Nagios
4.4.2. OpenNebula
4.4.3. CloudStack Zenpack
4.4.4. Nimbus
4.4.5. Dargos
4.5. Monitoring Applications in Cloud infrastructures (PaaS Level)
KonaKart
5.1. KonaKart Community and Enterprise Versions
5.2. KonaKart Enterprise version features
5.3. Architecture
5.3.1. Software Architecture
5.3.2. Deployment Architecture
5.4. KonaKart Use Case UML Diagrams
5.4.1. Top Level Use Case Diagram
5.4.2. View Items Use Case Diagram
5.4.3. Check-out use case Diagram
AppDynamics-Application Performance management and Monitoring
6.1. AppDynamics Lite components
6.1.1. AppDynamics Lite Application Server Agent
6.1.2. AppDynamics Lite Viewer
6.2. AppDynamics Lite Features and Uses
6.3. AppDynamics Lite vs AppDynamics Pro
Performance Evaluation
7.1. Deployment Architecture
7.2. Test Goals
7.3. Test Environment
7.4. Test Phase 1
7.4.1. PaaS Metric results (AppDynamics)
7.4.2. IaaS Metric Results
7.5. Testing Phase 2
7.6. Testing Phase 3
7.6.1. Phase 3 Results
7.7. Comparing IaaS performance metrics with PaaS performance metrics
Conclusion and Future Work
8.1. Conclusion
8.2. Future Work
List of Figures
· Fig 1.1 Layers of cloud Computing
· Fig 2.1 Types of Cloud Infrastructures
· Fig 2.2 List of Amazon Web services
· Fig 3.1 Logical Eucalyptus architecture
· Fig 5.1 Layers of KONAKART e-commerce application
· Fig 5.2 Sample Multi-tier KONAKART deployment architecture
· Fig 5.3 Top level UML use case diagram for Konakart application
· Fig 5.4 View Items Use case Diagrams
· Fig 5.5 Checkout use case diagram
· Fig 6.1 AppDynamics Lite Components
· Fig 7.1 Deployment Architecture
· Fig 7.2 No of HTTP Requests Vs Web server Throughput
· Fig 7.3 No of HTTP requests Vs Database Server throughput
· Fig 7.4 No of HTTP requests Vs average CPU usage (jiffies)
· Fig 7.5 No of HTTP requests Vs Average System Load
· Fig 7.6 MySql Plugin Metric
List of Tables
· Table 6.1 AppDynamics Lite vs AppDynamics Pro
· Table 7.1 ECC Instance Configuration
· Table 7.2 Apache Jmeter load testing results
· Table 7.3 Apache Jmeter load testing results (IAAS metrics)
· Table 7.4 Apache Jmeter Load Testing results Phase 2
· Table 7.5 No of MySql Commands per second
1. Cloud Computing Fundamentals
1.1. What is “Cloud Computing”?
When we use services such as Gmail, Facebook, or Dropbox, we are consuming cloud
computing services. The vendors of such services store our data at remote locations and make
it available to us through an easy-to-use web interface or a web browser. The idea of storing,
sharing, and using resources from remote locations or data centers can be termed cloud
computing. In other words, cloud computing can be defined as the on-demand delivery of
information technology (IT) resources or services over the Internet [1]. The term on-demand
means using resources when we need them and paying only for the time we use them. The
National Institute of Standards and Technology (NIST), U.S. Department of Commerce,
defines cloud computing as follows: “Cloud computing is a model for enabling ubiquitous,
convenient, on-demand network access to a shared pool of configurable computing resources
(e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned
and released with minimal management effort or service provider interaction” [2]. In simple
words, cloud computing is a service model that enables consumers to access a shared pool of
applications (software) and resources (hardware) that can be increased or decreased
dynamically depending on the need of the consumer. In cloud computing, dynamically
scalable and virtualized resources are provided as a service over the Internet [3]. Cloud
computing technology is expected to reshape the IT industry in the coming years.
Consumers use a variety of devices, such as PCs, laptops, smartphones, and PDAs, to access
applications, storage, and application-development platforms over the Internet, via services
offered by cloud computing vendors. Technologies such as cluster, grid, and now cloud
computing have all aimed at allowing access to large amounts of computing power in a fully
virtualized manner, by aggregating resources and offering a single system view. In addition,
an important aim of these technologies has been delivering computing as a utility. Utility
computing describes a business model for on-demand delivery of computing power:
consumers pay providers based on usage (“pay-as-you-go”), similar to the way in which we
currently obtain services from traditional public utilities such as water, electricity, gas, and
telephony. Cloud computing has been coined as an umbrella term to describe a category of
sophisticated on-demand computing services initially offered by commercial providers such
as Amazon, Google, and Microsoft. It denotes a model in which a computing infrastructure
is viewed as a “cloud” from which businesses and individuals access applications from
anywhere in the world on demand [7]. The main principle behind this model is offering
computing, storage, and software “as a service.”
1.2. What is a “Cloud”?
The “Cloud” in cloud computing refers to a group of devices of comparable computing
power that are assembled or deployed to work together to achieve certain computational
goals.
1.2.1. Types of Cloud deployment models
1.2.1.1. Private Cloud: The cloud infrastructure is deployed for exclusive use by
a single organization comprising multiple consumers (e.g., business units). It
may be owned, managed, and maintained by the organization itself, by a third
party vendor, or by a combination of both [2].
1.2.1.2. Public Cloud: The cloud infrastructure is deployed for open use by the
general public. It may be owned, managed, and operated by a business,
academic, or government organization, or some combination of them. It exists
on the premises of the cloud provider [2].
1.2.1.3. Community Cloud: The cloud infrastructure is deployed for exclusive
use by a specific community of consumers from organizations that have shared
concerns, such as common research motives. It may be owned, managed, and
operated by one or more of the organizations in the community, a third party,
or some combination of them, and it may exist on or off premises [2].
1.2.1.4. Hybrid Cloud: The cloud infrastructure is a composition of two or more
distinct cloud infrastructures (private, community, or public) that remain
unique entities, but are bound together by standardized or proprietary
technology that enables data and application portability [2].
1.3. Advantages and Features of Cloud Computing Services:
The expense of deploying an on-premise computing infrastructure is not an issue for major
organizations, whose revenue is high, but it is a major hindrance to small business owners
and developers who would like to bring their services to market. Such small business
owners and developers can instead use cloud computing services, i.e., third party services,
to reach the market. The main feature that differentiates cloud computing services from
traditional web hosting or storage services is that they are on-demand, i.e., we pay only for
what we use. The main advantage of cloud computing services is that they are fully
managed and maintained by the provider or vendor; the consumer or developer does not
need to worry about the maintenance of the remote servers. Further advantages of cloud
computing technology include cost savings, high availability, and scalability.
1.4. Cloud Computing Layers:
Cloud computing can be viewed as a collection of services, which can be presented as a
layered cloud computing architecture.
Fig 1.1 Layers of cloud Computing
1.4.1. Infrastructure-as-a-Service:
Cloud providers or vendors give the cloud user remote access to physical machines,
servers, and virtual machines. The hardware is owned, managed, and maintained by the
cloud provider. The provider offers access to such physical or virtual machines for a
certain cost, which depends upon the computational power of the machines. Users
install all software and utilities related to their work by themselves, and they are
responsible for maintaining the software and applications that they deploy on the
physical or virtual machines provided by the vendors. Examples: Amazon EC2, NetApp, etc.
1.4.2. Software-as-a-Service:
The cloud providers or vendors give cloud users access to various software applications
and tools via a simple web browser interface. Users do not need to download the
application to their local machine; SaaS allows them to run applications remotely from
the cloud. The problem of accessing the services during high load is addressed by
hosting the application and data on multiple servers. Examples: Gmail, Google Docs,
Dropbox.
1.4.3. Platform-as-a-Service:
Cloud providers offer users a computing platform or software stack on which they can
develop and test their own software applications on remote servers. PaaS is similar to
IaaS, but also includes operating systems and the services required for a particular
application. In other words, PaaS is IaaS with a custom software stack for the given
application. The software stack may include an operating system and an integrated
development environment. Vendors can provide a virtual machine or a physical machine
as per the requirements of the user. Users can develop and run their applications on
servers provided by the vendors without having to install environments, servers, etc. on
their local machine. The main benefits of using PaaS-level cloud services are zero
infrastructure, lower cost and improved profitability, easy and quick development,
reusability of code structure and logic, integration with other web services, no up-front
investment, centralized information management, secured and customized access, and
minimized operational costs [4].
1.5. Characteristics of Cloud Computing:
1.5.1. On-Demand self-service:
On-demand self-service means that customers can request and manage their own
computing resources [2].
1.5.2. Broad Network Access:
Allows services to be offered over the Internet or private networks using a wide range of
devices such as personal computers, laptops, mobile phones, tablets, workstations, etc. [2].
1.5.3. Resource Pooling:
Pooled resources means that consumers draw from a pool of computing resources,
usually in remote data centres. The consumers don’t have knowledge about the exact
location of the resources or the data centres.
1.5.4. Elasticity:
Elasticity in cloud computing is a performance property of a service that describes how
the service or application performs when the workload increases or decreases. Basically,
it describes the ability of the cloud to respond to workload changes. Resources are
elastically allotted or released based on the demand of the application or the consumer;
in some cases the allocation and release of resources is done automatically, based on the
current demand for the service [2].
1.5.5. Measured Capacity:
The total usage of the resources can be measured, controlled, monitored, and reported
continuously, both to provide transparency to the consumers and to provide
elasticity based on the current health of the resources.
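As an illustration of how measured usage feeds pay-per-use invoicing, the following is a minimal sketch in Python. The resource names and hourly rates are hypothetical values chosen for the example; they are not taken from any provider's price list.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    resource: str         # hypothetical resource name, e.g. "vm-small"
    hours: float          # metered running time in hours
    rate_per_hour: float  # hypothetical price per hour


def invoice(records):
    """Aggregate metered usage into a per-resource bill and a grand total."""
    per_resource = {}
    for record in records:
        cost = record.hours * record.rate_per_hour
        per_resource[record.resource] = per_resource.get(record.resource, 0.0) + cost
    return per_resource, sum(per_resource.values())


if __name__ == "__main__":
    usage = [
        UsageRecord("vm-small", 10, 0.5),
        UsageRecord("vm-small", 2, 0.5),
        UsageRecord("vm-large", 4, 2.0),
    ]
    bill, total = invoice(usage)
    print(bill, total)
```

The consumer is charged only for the hours actually metered, which is the transparency this characteristic refers to.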
1.5.6. Multi-tenancy:
In a multi-tenant software architecture, a single software instance is shared by multiple
clients or client organizations, and the clients do not have any knowledge of one another.
The resources of the software stack are virtually divided such that no thread
interferes with another thread running concurrently.
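The idea of one shared instance with virtually divided data can be sketched as follows. This is only an illustrative Python sketch; the class name SharedCatalogService and its methods are hypothetical and do not correspond to any real product's API.

```python
class SharedCatalogService:
    """One running instance that serves many tenants while keeping data apart.

    Hypothetical example class: every operation is keyed by a tenant id, so a
    tenant can only ever read or write its own partition of the data.
    """

    def __init__(self):
        self._data = {}  # tenant_id -> {key: value}

    def put(self, tenant_id, key, value):
        # Create the tenant's partition on first use, then store the value.
        self._data.setdefault(tenant_id, {})[key] = value

    def get(self, tenant_id, key):
        # Look only inside the caller's partition; other tenants are invisible.
        return self._data.get(tenant_id, {}).get(key)

    def keys(self, tenant_id):
        return sorted(self._data.get(tenant_id, {}))


if __name__ == "__main__":
    service = SharedCatalogService()
    service.put("shop-a", "price", 10)
    service.put("shop-b", "price", 20)
    print(service.get("shop-a", "price"), service.get("shop-b", "price"))
```

Both shops use the same object, yet each sees only its own value for the same key.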
1.6. Scalability in cloud systems
One of the most essential features of cloud computing services is scalability. Scalability
is the ability to increase or decrease capacity and resources to handle the current load;
from the consumer's point of view, processing and storage capacity appears practically
unlimited. A scalable cloud infrastructure is one that can easily increase or decrease
capacity according to the real-time demand of the application or consumer. Providing
scalable cloud solutions is not as easy as the definition suggests. Scalability depends on
many factors, such as how complex and how well organized the architecture is.
Scalability deals with the real-time performance of a system (IaaS), an application
(SaaS), or both (PaaS). Providing a scalable cloud solution involves continuously
monitoring the real-time performance metrics of the physical or virtual systems or of an
application. Cloud providers have provisions to monitor these metrics and, based on
load, scale the resources in or out. Cloud computing is discussed in more depth in the
upcoming chapters.
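The monitor-then-scale loop described above can be illustrated with a simple threshold rule. The sketch below is in Python and rests on assumptions: the metric (average CPU percentage), the thresholds, and the instance limits are hypothetical defaults, and real providers use richer policies.

```python
def desired_instances(current, avg_cpu_percent,
                      scale_out_above=75.0, scale_in_below=25.0,
                      min_instances=1, max_instances=10):
    """Return the instance count a simple threshold rule would ask for.

    All thresholds and limits are hypothetical defaults for illustration.
    """
    if avg_cpu_percent > scale_out_above:
        target = current + 1   # load is high: scale out by one instance
    elif avg_cpu_percent < scale_in_below:
        target = current - 1   # load is low: scale in by one instance
    else:
        target = current       # load is within the band: do nothing
    # Never go below the minimum or above the maximum pool size.
    return max(min_instances, min(max_instances, target))


if __name__ == "__main__":
    for cpu in (90.0, 50.0, 10.0):
        print(cpu, desired_instances(2, cpu))
```

A monitoring component would call such a function periodically with fresh metrics and then start or stop instances to reach the returned count.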
1.7. Virtualization Technology
Virtualization is the idea of partitioning the resources of a single server into
multiple isolated virtual machines (VMs). Virtualization technology has been proposed and
developed over a relatively long period. The earliest use of VMs was by IBM in the 1960s,
intended to leverage investments in expensive mainframe computers [8]. The idea was to
enable multitasking – running multiple applications and processes for different users
simultaneously. Robert P. Goldberg described the need for virtual machines in 1974:
“Virtual machine systems were originally developed to correct some of the shortcomings of
the typical third generation architectures and multiprogramming operating systems – e.g.,
OS/360” [9]. Recently, owing to the rapid growth in IT infrastructure, we have seen the
emergence of multicore processors and a wide variety of hardware, operating systems, and
software. In this environment, virtualization has had a resurgence of popularity.
Virtualization can provide dramatic benefits for a computing system, including increased
utilization, energy saving, rapid deployment, improved maintenance capability, isolation,
and encapsulation. Moreover, virtualization enables applications to migrate from one server
to another while they are still running, without downtime, providing flexible workload
management, and high availability during planned maintenance or unplanned events.
1.7.1. Virtual Machines
A VM is a software implementation of a machine (i.e., a computer) that executes
programs like a physical machine [10]. This differs from a process VM, which is
designed to run a single program, such as the Java Runtime Environment (JRE). A system
VM provides a complete system platform that supports the execution of a complete
operating system (OS). The VM lifecycle has six phases: create, suspend, resume, save,
migrate, and destroy. Multiple VMs can run simultaneously in the same physical node.
Each VM can have a different OS, and a Virtual Machine Monitor (VMM) is used to
control and manage the VMs on a single physical node. A VMM is often referred to as a
hypervisor. Above this level, Virtual Infrastructure Managers (VIMs) are used to
manage, deploy, and monitor VMs on a distributed pool of resources (cluster or data
center). In addition, Cloud Infrastructure Managers (CIMs) are web-based management
solutions that run on top of IaaS providers.
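The six lifecycle phases listed above can be pictured as operations on a small state machine. The following is only a minimal sketch: the operation names come from the text, but the state names and the allowed transitions are assumptions made for illustration, not the behavior of any particular hypervisor.

```python
from enum import Enum


class VMState(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"
    SAVED = "saved"
    DESTROYED = "destroyed"


# (current state, operation) -> next state.
# The operations are the six phases named in the text; which transitions
# are legal is an illustrative assumption.
TRANSITIONS = {
    (None, "create"): VMState.RUNNING,
    (VMState.RUNNING, "suspend"): VMState.SUSPENDED,
    (VMState.SUSPENDED, "resume"): VMState.RUNNING,
    (VMState.RUNNING, "save"): VMState.SAVED,
    (VMState.SAVED, "resume"): VMState.RUNNING,
    (VMState.RUNNING, "migrate"): VMState.RUNNING,
    (VMState.RUNNING, "destroy"): VMState.DESTROYED,
    (VMState.SUSPENDED, "destroy"): VMState.DESTROYED,
    (VMState.SAVED, "destroy"): VMState.DESTROYED,
}


class VirtualMachine:
    """Toy model of a VM that tracks its state and its physical host."""

    def __init__(self, name, host):
        self.name = name
        self.host = host
        self.state = None  # no state until "create" has been applied
        self.apply("create")

    def apply(self, operation, target_host=None):
        """Apply one lifecycle operation, rejecting illegal transitions."""
        key = (self.state, operation)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {operation} a VM in state {self.state}")
        if operation == "migrate":
            if target_host is None:
                raise ValueError("migrate needs a target host")
            self.host = target_host  # the VM keeps running on the new node
        self.state = TRANSITIONS[key]
        return self.state
```

For example, `VirtualMachine("web01", "node-a")` starts in the running state, and `apply("migrate", target_host="node-b")` moves it to another node without leaving that state.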
1.7.2. Virtualization platforms
Virtualization technology has been developed to best utilize computing capacity. Server
virtualization has been described as follows: “In most cases, server virtualization is
accomplished by the use of a hypervisor (VMM) to logically assign and separate physical
resources. The hypervisor allows a guest operating system, running on the virtual
machine, to function as if it were solely in control of the hardware, unaware that other
guests are sharing it. Each guest operating system is protected from the others and is thus
unaffected by any instability or configuration issues of the others” [12]. Virtualization
methods can be classified into two categories according to whether or not the guest OS
kernel needs to be modified. The two main types of virtualization methods are (1) full
virtualization and (2) para-virtualization. Full virtualization emulates the entire hardware
environment by utilizing hardware virtualization support, binary code translation, or
binary code rewriting, so the guest OS kernel does not need to be modified.
Para-virtualization requires the guest OS kernel to be modified to become aware of the
hypervisor. Because it need not emulate the entire hardware environment,
para-virtualization can attain better performance than full virtualization [11].
1.7.3. Virtual Infrastructure Management
A Virtual Infrastructure Manager (VIM) is responsible for the efficient management of a
virtual infrastructure as a whole, by providing basic functionality for deploying,
controlling, and monitoring VMs on a distributed pool of resources. This is done by
communicating with their VMMs [11].
1.7.4. Cloud Infrastructure Manager
A Cloud Infrastructure Manager (CIM) is a web-based solution focused on deploying
and managing services (deploying, monitoring, and maintaining the VMs) on top of
Infrastructure as a Service (IaaS) clouds. Third-party application-hosting framework
service companies provide higher-level application deployment tools on top of IaaS [11].
REFERENCES
[1] http://aws.amazon.com/what-is-cloud-computing
[2] The NIST Definition of Cloud Computing, Peter Mell, Timothy Grance.
[3] Handbook of Cloud Computing, Borko Furht, Armando Escalante
[4] http://www.zoho.com/creator/paas.html
[5] http://support.rightscale.com/06-FAQs/FAQ_0043_-_What_is_autoscaling%3F
[6] Auto-scaling Developer Guide , Amazon Web Services.
[7] R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic, Cloud computing and
emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility,
Future Generation Computer Systems, 25:599–616, 2009
[8] VMware (2009) http://www.vmware.com/virtualization/history.html
[9] Goldberg RP (1974) Survey of virtual machine research. IEEE Comput Mag 7(6):34–45
[10] Virtual Machine (Wikipedia) (2009) http://en.wikipedia.org/wiki/Virtual_machine
[11] Nick Antonopoulos, Lee Gillam, Cloud Computing Principles, Systems and Applications,
Springer
[12] IBM white paper (2009) Seeding the Clouds: Key Infrastructure Elements for Cloud
Computing. ftp://ftp.software.ibm.com/common/ssi/sa/wh/n/oiw03022usen/OIW03022USEN.PDF
2. Cloud Computing Infrastructures
2.1. Types of clouds
Although cloud computing has emerged mainly from the appearance of public computing
utilities, other deployment models, with variations in physical location and distribution,
have been adopted. In this sense, regardless of its service class, a cloud can be classified
as public, private, community, or hybrid based on its deployment model, as shown in the
figure below.
Fig 2.1 Types of Cloud Infrastructures
2.1.1. Public Cloud
Armbrust et al. [2] propose definitions for public cloud as a “cloud made available in a
pay-as-you-go manner to the general public” and private cloud as “internal data center of
a business or other organization, not made available to the general public.”
2.1.2. Private Cloud
Establishing a private cloud means restructuring an existing infrastructure by adding
virtualization and cloud-like interfaces. This allows users to interact with the local data
center while experiencing the same advantages of public clouds, most notably self-
service interface, privileged access to virtual servers, and per-usage metering and billing.
2.1.3. Community Cloud
A community cloud is “shared by several organizations and supports a specific
community that has shared concerns (e.g., mission, security requirements, policy, and
compliance considerations)” [1].
2.1.4. Hybrid or Mixed Cloud
A hybrid cloud takes shape when a private cloud is supplemented with computing
capacity from public clouds [3]. The approach of temporarily renting capacity to handle
spikes in load is known as “cloud-bursting” [4].
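The idea of cloud-bursting can be illustrated with a small capacity calculation. This is a minimal sketch under a simplifying assumption, namely that demand and capacity are both counted in identical instances; the function name and the returned dictionary are hypothetical and do not correspond to any provider's API.

```python
def place_instances(demand, private_capacity):
    """Split a demand for instances between the private and the public cloud.

    The private cloud is filled first; only the overflow is rented from a
    public cloud, which is the cloud-bursting behavior described above.
    """
    if demand < 0 or private_capacity < 0:
        raise ValueError("demand and capacity must be non-negative")
    on_private = min(demand, private_capacity)
    burst_to_public = demand - on_private
    return {"private": on_private, "public": burst_to_public}
```

With a private capacity of 10 instances, a demand of 8 stays entirely in the private cloud, while a spike to 14 leaves 4 instances to be rented temporarily from a public provider.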
2.2. Case Studies:
2.2.1. Amazon Web services in a nutshell (IAAS provider)
Amazon has a long history of using a decentralized IT infrastructure. This arrangement
enabled its development teams to access compute and storage resources on demand, and
it increased overall productivity and agility. By 2005, Amazon had spent over a
decade and millions of dollars building and managing the large-scale, reliable, and
efficient IT infrastructure that powered one of the world’s largest online retail platforms.
Amazon launched Amazon Web Services (AWS) so that other organizations could
benefit from Amazon’s experience and investment in running a large-scale distributed,
transactional IT infrastructure. AWS has been operating since 2006, and today serves
hundreds of thousands of customers worldwide. Today Amazon.com runs a global web
platform serving millions of customers and managing billions of dollars’ worth of
commerce every year. Using AWS, you can requisition compute power, storage, and
other services in minutes and have the flexibility to choose the development platform or
programming model that makes the most sense for the problems you are trying to solve.
You pay only for what you use, with no up-front expenses or long-term commitments,
making AWS a cost-effective way to deliver applications.
AWS offers a variety of cloud services, most notably: S3 (storage), EC2 (virtual
servers), CloudFront (content delivery), CloudFront Streaming (video streaming),
SimpleDB (structured datastore), RDS (relational database), SQS (reliable messaging),
and Elastic MapReduce (data processing). The Elastic Compute Cloud (EC2) offers
Xen-based virtual servers (instances) that can be instantiated from Amazon Machine
Images (AMIs). Instances are available in a variety of sizes, operating systems,
architectures, and prices. The CPU capacity of an instance is measured in EC2 Compute
Units and, although fixed for each instance, varies among instance types from 1 (small
instance) to 20 (high-CPU instance). Each instance provides a certain amount of
non-persistent disk space; a persistent disk service (Elastic Block Store) allows attaching
virtual disks of up to 1 TB to instances. Elasticity can be achieved by combining
the CloudWatch, Auto Scaling, and Elastic Load Balancing features, which allow the
number of instances to scale up and down automatically based on a set of customizable
rules, and traffic to be distributed across available instances. Fixed IP addresses (Elastic
IPs) are not available by default, but can be obtained at an additional cost. In summary,
Amazon EC2 provides the following features: multiple data centers available in the
United States (East and West) and Europe; CLI, Web services (SOAP and Query), and
Web-based console user interfaces; access to instances mainly via SSH (Linux) and Remote
Desktop (Windows); advance reservation of capacity (reserved instances) that
guarantees availability for periods of 1 and 3 years; a 99.5% availability SLA; per-hour
pricing; Linux and Windows operating systems; automatic scaling; and load balancing [5][6].
Fig 2.2 List of Amazon Web services
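The rule-based scaling described for EC2 can be sketched as a threshold check on a monitored metric. The code below is an illustration only and not the Auto Scaling API: the `ScalingRule` fields, the use of average CPU utilization as the metric, and the step size of one instance are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class ScalingRule:
    scale_out_above: float  # average CPU % above which one instance is added
    scale_in_below: float   # average CPU % below which one instance is removed
    min_instances: int
    max_instances: int


def desired_instances(current, cpu_samples, rule):
    """Return the instance count after evaluating one monitoring period."""
    if not cpu_samples:
        return current  # no data for this period: leave the group unchanged
    average = sum(cpu_samples) / len(cpu_samples)
    if average > rule.scale_out_above:
        current += 1
    elif average < rule.scale_in_below:
        current -= 1
    # Never leave the configured bounds of the group.
    return max(rule.min_instances, min(rule.max_instances, current))
```

A monitoring service would supply the samples, this rule would decide the group size, and a load balancer would then spread traffic across whatever instances are currently running.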
2.2.2. Joyent (IAAS provider)
Joyent’s Public Cloud offers servers based on Solaris containers virtualization
technology. These servers, dubbed accelerators, allow deploying various specialized
software stacks based on a customized version of the OpenSolaris operating system, which
includes by default a Web-based configuration tool and several pre-installed software
packages, such as Apache, MySQL, PHP, Ruby on Rails, and Java. Software load balancing is
available as an accelerator in addition to hardware load balancers. A notable feature of
Joyent’s virtual servers is automatic vertical scaling of CPU cores, which means a virtual
server can make use of additional CPUs automatically up to the maximum number of
cores available in the physical host. In summary, the Joyent public cloud offers the
following features: multiple geographic locations in the United States; Web-based user
interface; access to virtual servers via SSH and a Web-based administration tool; 100%
availability SLA; per-month pricing; OS-level virtualization with Solaris containers;
OpenSolaris operating systems; and automatic scaling (vertical) [6]. Joyent uses
OS-level virtualization to maximize the performance of its computing
resources. Unlike hardware-virtualization designs, the underlying operating
system, SmartOS, runs on bare metal, meaning there are no extra layers to navigate
before gaining access to hardware resources [7].
2.2.3. Rackspace Cloud Servers (IAAS provider)
Rackspace Cloud Servers is an IaaS solution that provides fixed-size instances in the
cloud. Cloud Servers offers a range of Linux-based pre-made images. A user can request
different-sized images, where the size is measured by requested RAM, not CPU. Cloud
Servers also offers a hybrid approach in which dedicated and cloud server infrastructures
can be combined to take the best aspects of both styles of hosting as required. Cloud
Servers, as part of its default offering, enables fixed (static) IP addresses, persistent
storage, and load balancing (via A-DNS) at no additional cost [8][6].
2.2.4. Flexiscale (IAAS provider)
Flexiscale is a Europe-based provider offering services similar in nature to Amazon Web
Services. However, its virtual servers offer some distinct features, most notably:
persistent storage by default, fixed IP addresses, dedicated VLAN, a wider range of
server sizes, and runtime adjustment of CPU capacity (aka CPU bursting/vertical
scaling). Like the other clouds described here, this service is priced by the hour. In
summary, the Flexiscale cloud provides the following features: available in the UK; Web
services (SOAP) and Web-based user interfaces; access to virtual servers mainly via SSH
(Linux) and Remote Desktop (Windows); 100% availability SLA with automatic recovery of
VMs in case of hardware failure; per-hour pricing; Linux and Windows operating systems;
and automatic scaling (horizontal/vertical) [9][6].
2.2.5. GoGrid (IAAS provider)
GoGrid, like many other IaaS providers, allows its customers to utilize a range of pre-
made Windows and Linux images, in a range of fixed instance sizes. GoGrid also offers
“value-added” stacks on top for applications such as high-volume Web serving, e-
Commerce, and database stores. It offers some notable features, such as a “hybrid
hosting” facility, which combines traditional dedicated hosts with auto-scaling cloud
server infrastructure. In this approach, users can take advantage of dedicated hosting
(which may be required due to specific performance, security or legal compliance
reasons) and combine it with on-demand cloud infrastructure as appropriate, taking the
benefits of each style of computing. As part of its core IaaS offerings, GoGrid also
provides free hardware load balancing, auto-scaling capabilities, and persistent storage,
features that typically incur an additional cost with most other IaaS providers [10][6].
2.2.6. App Engine (PAAS provider)
Google App Engine lets you run your Python and Java Web applications on elastic
infrastructure supplied by Google. App Engine allows your applications to scale
dynamically as your traffic and data storage requirements increase or decrease. Google
App Engine supports apps written in several programming languages. With App Engine's
Java runtime environment, you can build your app using standard Java technologies,
including the JVM, Java servlets, and the Java programming language, or any other
language using a JVM-based interpreter or compiler, such as JavaScript or Ruby. App
Engine also features a Python runtime environment, which includes a fast Python
interpreter and the Python standard library, and a PHP runtime with
native support for Google Cloud SQL and Google Cloud Storage that works just like
using a local MySQL instance and doing local file writes. Finally, App Engine provides a
Go runtime environment that runs natively compiled Go code. These runtime
environments are built to ensure that your application runs quickly, securely, and without
interference from other apps on the system. The App Engine serving architecture is
notable in that it allows real-time auto-scaling without virtualization for many common
types of Web applications. However, such auto-scaling depends on the application
developer using a limited subset of the native APIs on each platform, and in some
instances you need to use specific Google APIs such as URLFetch, Datastore, and
memcache in place of certain native API calls. For example, a deployed App Engine
application cannot write to the file system directly (you must use the Google Datastore),
nor can it open a socket or access another host directly (you must use the Google URL
fetch service). A Java application cannot create a new thread either [11][6].
2.2.7. Microsoft Windows Azure
Windows Azure delivers a 99.95% monthly SLA and enables you to build and run highly
available applications without focusing on the infrastructure. It provides automatic OS
and service patching, built-in network load balancing, and resiliency to hardware failure.
It supports a deployment model that enables you to upgrade your application without
downtime. Microsoft Azure Cloud Services offers developers a hosted .NET stack (C#,
VB.NET, ASP.NET). In addition, Java and Ruby SDKs for .NET Services are also available.
The Azure system consists of a number of elements. The Windows Azure Fabric
Controller provides auto-scaling and reliability, and it manages memory resources and
load balancing. The .NET Service Bus registers and connects applications together.
Windows Azure enables you to use any language, framework, or tool to build
applications. Features and services are exposed using open REST protocols. The
Windows Azure client libraries are available for multiple programming languages, and
are released under an open source license and hosted on GitHub. Windows Azure
delivers a flexible cloud platform that can satisfy a wide range of application needs. It
enables you to reliably host and scale out your application code within compute roles.
You can store data using relational SQL databases, NoSQL
table stores, and unstructured blob stores, and optionally use Hadoop and business
intelligence services to data-mine it. You can take advantage of Windows Azure’s
messaging capabilities to enable scalable distributed applications, as well as deliver
hybrid solutions that run across a cloud and on-premises enterprise environment.
Windows Azure’s distributed caching and CDN services allow you to reduce latency and
deliver good application performance anywhere in the world [11][6].
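To make the SLA percentages quoted in these case studies more concrete, an availability figure can be converted into the downtime it still permits. This is a rough sketch only: it assumes a 30-day month and a 365-day year and treats availability as plain wall-clock time, whereas each provider's SLA defines its own measurement rules. The function name is hypothetical.

```python
def allowed_downtime_minutes(availability_percent, period_days):
    """Return the downtime (in minutes) that an availability level permits."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    period_minutes = period_days * 24 * 60
    return (100 - availability_percent) / 100 * period_minutes


# A 99.95% monthly SLA over an assumed 30-day month.
monthly = allowed_downtime_minutes(99.95, 30)

# A 99.5% SLA over an assumed 365-day year, converted to hours.
yearly_hours = allowed_downtime_minutes(99.5, 365) / 60
```

Under these assumptions, 99.95% per month leaves roughly 21.6 minutes of downtime, and 99.5% per year leaves roughly 43.8 hours.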
2.2.8. Heroku (PAAS provider)
Heroku is a polyglot cloud application platform. With Heroku, you don’t need to think
about servers at all. You can write apps using modern development practices in the
programming language of your choice and back them with add-on resources such as SQL
and NoSQL databases, Memcached, and many others. You manage your app using the
Heroku command-line tool and deploy code using the Git revision control system, all
running on the Heroku infrastructure. In the Heroku system, servers are invisibly
managed by the platform and are never exposed to users. Applications are automatically
dispersed across different CPU cores and servers, delivering high performance and
minimizing contention. Heroku has an advanced logic layer that can automatically route
around failures to keep the service available [12][6].
2.2.9. Aneka (PAAS product)
Aneka plays the role of Application Platform as a Service for Cloud Computing. Aneka
supports various programming models, including Task Programming, Thread
Programming, and MapReduce Programming, which enable a variety of data-mining
and search applications, and it provides tools for the rapid creation of applications and
their seamless deployment on private or public clouds. One of the notable
features of Aneka PaaS is its support for provisioning private cloud resources, ranging from
desktops and clusters to virtual data centers using VMware and Citrix XenServer, as well as
public cloud resources such as Windows Azure, Amazon EC2, and GoGrid Cloud Service. Each
server in an Aneka deployment (dubbed an Aneka cloud node) hosts the Aneka container,
which provides the base infrastructure that consists of services for persistence, security
(authorization, authentication, and auditing), and communication (message handling and
dispatching). Cloud nodes can be physical servers, virtual machines (XenServer and
VMware are supported), or instances rented from Amazon EC2 [12][6].
2.2.10. AppScale (Open Source PAAS)
AppScale is an open source distributed software framework, or stack, used to
implement a cloud platform as a service. It is built from a collection of
open source technologies such as Hadoop, NoSQL and MySQL databases, Apache ZooKeeper,
a load balancer (HAProxy), and Memcached. AppScale’s design is based on the Google
App Engine APIs, so developers can build applications anywhere as long as they are
compatible with Google App Engine. Hence, an application
developed on a local AppScale-based virtual machine can be deployed on any AppScale-based
cloud and on Google App Engine. AppScale thus adds portability to the application and
makes it independent of the cloud beneath. AppScale also provides an additional layer of
metrics, namely application-level metrics. It can run on a single VM
instance or be distributed across multiple instances by assigning different roles to
different VMs, each of which runs an AppController to monitor and control its role. The
AppScale platform also provides the scalability, ease of use, and high availability that
users have come to expect from public cloud platforms and infrastructures. This includes
elasticity and fault detection/recovery, authentication and user control, monitoring and
logging, cross-cloud data and application migration, hybrid cloud multitasking, and
offline analytics and disaster recovery. AppScale couples elasticity and fault tolerance to
start and stop platform components within and across VMs, and it ultimately relies on
Apache ZooKeeper, which is employed for distributed coordination and state management,
for system survivability.
2.2.11. VMware vSphere and vCloud
VMware’s vSphere is a suite of tools aimed at transforming IT infrastructures into private
clouds. It stands out from other Virtual Infrastructure Managers as one of the most
feature-rich, owing to the company’s offerings at all levels of the architecture. It
is designed to help IT meet SLAs (service-level agreements) for demanding
business-critical applications at a low TCO (total cost of ownership). vSphere accelerates
the shift to cloud computing for existing data centers and also underpins compatible public
cloud offerings, forming the foundation for a hybrid cloud model. With
VMware vSphere, a comprehensive set of management tools, and the VMware vCloud
Initiative to ensure cloud compatibility, VMware enables IT organizations to abstract
applications and information from the complexity of the underlying infrastructure,
whether it is internal or external, and to deliver applications as a service over which they
have complete control [13][14].
2.2.12. Nimbus (IAAS)
In a nutshell, Nimbus allows a client to lease remote resources by deploying virtual
machines (VMs) on those resources and configuring them to represent an environment
desired by the user. It is an open source "infrastructure-as-a-service" (IaaS) solution. Also
included in a Nimbus installation is a storage cloud implementation called Cumulus that
has been tightly integrated with the other central services. Cumulus is compatible with
the Amazon Web Services S3 REST API, but extends it as well. One set of software needs
to be installed on a single service node under a non-root account, and a separate piece of
software needs to be installed on any number of virtual machine monitor (VMM) nodes.
There is also a separate download for a client, called the cloud client, which is the easiest
client to use and is geared towards getting users up and running in minutes. Clients
interact with the service using credentials over multiple protocols. The service must be
configured in the cloud configuration to serve requests from the cloud client. These
clients implement web services messaging, but Nimbus also provides an implementation of
the Amazon Elastic Compute Cloud (EC2) interface that allows you to use clients
developed for the real EC2 system against Nimbus-based clouds [15].
2.2.13. OpenNebula (Open Source Data Center Virtualization)
OpenNebula is an open-source solution for data center virtualization. It offers a
feature-rich, flexible solution for the comprehensive management of virtualized
data centers to enable on-premise Infrastructure as a Service clouds. OpenNebula
provides many different interfaces that can be used to interact with the functionality
offered to manage physical and virtual resources. It provides a
scalable and secure multi-tenant cloud platform for fast delivery and elasticity of virtual
resources. Multi-tier applications can be deployed and consumed as pre-configured
virtual appliances from catalogs [16].
2.2.14. Eucalyptus (Open Source private cloud)
Eucalyptus (“Elastic Utility Computing Architecture Linking Your Programs To Useful
Systems”) is an open-source software infrastructure for implementing cloud computing on
clusters. The current interface to Eucalyptus is compatible with Amazon’s EC2, S3, and EBS
interfaces, but the infrastructure is designed to support multiple client-side interfaces. The
Eucalyptus framework was one of the first open-source projects to focus on building IaaS
clouds. It has been developed with the intent of providing an open-source implementation
nearly identical in functionality to the Amazon Web Services APIs. Therefore, users can
interact with a Eucalyptus cloud using the same tools they use to access Amazon EC2. It
also distinguishes itself from other tools because it provides a storage cloud API,
emulating the Amazon S3 API, for storing general user data and VM images. In
summary, Eucalyptus provides the following features: Linux-based controller with
administration Web portal; EC2-compatible (SOAP, Query) and S3-compatible (SOAP,
REST) CLI and Web portal interfaces; Xen, KVM, and VMware backends; Amazon
EBS-compatible virtual storage devices; interface to the Amazon EC2 public cloud; and
virtual networks [17].
REFERENCES
[1] P. Mell and T. Grance, The NIST Definition of Cloud Computing, National Institute of
Standards and Technology, Information Technology Laboratory, Technical Report Version 15,
2009
[2] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, and R. Katz, Above the Clouds: A Berkeley
View of Cloud Computing, UC Berkeley Reliable Adaptive Distributed Systems Laboratory
White Paper, 2009
[3] B. Sotomayor, R. S. Montero, I. M. Llorente, and I. Foster, Virtual infrastructure
management in private and hybrid clouds, IEEE Internet Computing,
13(5):14–22, September/October 2009
[4] P. T. Jaeger, J. Lin, J. M. Grimes, and S. N. Simmons, Where is the cloud? Geography,
economics, environment, and jurisdiction in cloud computing, First Monday, 14(4–5), 2009
[5] http://d36cz9buwru1tt.cloudfront.net/AWS_Overview.pdf
[6] William Voorsluys, James Broberg, and Rajkumar Buyya, Cloud Computing Principles and
Paradigms, Wiley, 2011
[7] http://www.joyent.com/technology
[8] www.rackspace.com/Cloud_Servers
[9] http://www.flexiscale.com/
[10] http://www.gogrid.com
[11] http://www.windowsazure.com/en-us/overview/what-is-windows-azure/
[12] https://devcenter.heroku.com/articles/quickstart
[13] VMware Inc., VMware vSphere, the First Cloud Operating, White Paper, 2009
[14] VMware Inc., VMware vSphere, http://www.vmware.com/products/vsphere/, 22/4/2010
[15] http://www.nimbusproject.org/docs/2.10.1/summary.html
[16] http://opennebula.org/documentation:rel4.2:intro
[17] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and
D. Zagorodnov, The Eucalyptus open source cloud computing system, in Proceedings of
IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2009),
Shanghai, China, pp. 124–131. University of California, Santa Barbara. (2009, Sep.) Eucalyptus
[online]. http://open.eucalyptus.com.
3. Eucalyptus: Open Source, AWS compatible private cloud
Eucalyptus is open source software for building on-premise, AWS-compatible private clouds.
It is a set of web services modeled after, and compatible with, Amazon Web Services
(AWS). Eucalyptus is written mostly in Java and integrates components from over 100
open source projects, tested and packaged into a single easy-to-install and easy-to-use product.
It runs on virtualized infrastructure (Linux + KVM or VMware). As a software framework, or
stack, Eucalyptus provides an AWS-compatible API that is used to communicate with its
AWS-style services, and it helps in creating and managing a private or
even a publicly accessible cloud. Eucalyptus has become very popular and is seen as one of the
key open source cloud platforms. Because Eucalyptus is compatible with AWS, client tools
written for AWS can be used with Eucalyptus as well.
3.1. Architecture
3.1.1. Node Controller (NC)
The Node Controller runs on each node and controls the life cycle of all instances running
on the node. The NC interacts with the OS and the hypervisor (KVM or Xen) running on
the node on one side and with the Cluster Controller (CC) on the other. The NC queries the
operating system running on the node to discover the node's physical resources (the
number of cores, the size of memory, and the available disk space) and to learn about the
state of the VM instances running on the node, and it propagates this data up to the CC. Its
main functions are therefore collecting data on resource availability
and utilization on the node, reporting that data to the CC, and managing the
instance life cycle.
3.1.2. Cluster Controller (CC)
The CC manages one or more Node Controllers and deploys and manages instances on
them. It communicates with the Cloud Controller (CLC) on one side and with the Node
Controllers on the other. The CC receives requests from the CLC to deploy instances and
decides which NCs are to be used for deploying them. It is also responsible for network
management: under certain Eucalyptus networking modes, it manages the networking for
the instances running on the nodes and controls the virtual network available to them.
Finally, the CC collects information about the NCs registered with it and reports it to
the CLC.
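The division of work between the NC and the CC can be sketched as follows. This is a minimal illustration and not Eucalyptus code: the `NodeReport` fields mirror the resources the text says an NC reports (cores, memory, disk), while the first-fit placement policy, the function names, and the units are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class NodeReport:
    """Free resources that a Node Controller reports for its node."""
    name: str
    free_cores: int
    free_memory_mb: int
    free_disk_gb: int


def choose_node(reports, cores, memory_mb, disk_gb):
    """Return the name of the first node that can host the instance, or None.

    First-fit is a hypothetical policy used only for illustration.
    """
    for report in reports:
        if (report.free_cores >= cores
                and report.free_memory_mb >= memory_mb
                and report.free_disk_gb >= disk_gb):
            return report.name
    return None


def cluster_summary(reports):
    """Aggregate node reports, as a CC might before reporting to the CLC."""
    return {
        "nodes": len(reports),
        "free_cores": sum(r.free_cores for r in reports),
        "free_memory_mb": sum(r.free_memory_mb for r in reports),
        "free_disk_gb": sum(r.free_disk_gb for r in reports),
    }
```

Given reports from two nodes, `choose_node` picks the node on which a requested instance would be deployed, and `cluster_summary` produces the kind of aggregate view that would be passed up to the CLC.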
3.1.3. Walrus Storage Controller (WS3)
WS3 provides a persistent simple storage service using REST and SOAP APIs
compatible with S3 APIs. WS3 is responsible for storing the machine images, snapshots
and storing and serving files using the S3 API. WS3 can be considered a simple file
storage system. Using Walrus, users can store persistent data, which is organized as
buckets and objects. WS3 is a file-level storage system, in contrast to the block-level
storage provided by the Storage Controller.
3.1.4. Storage Controller (SC)
The SC provides persistent block storage for use by the instances. This is similar to the
Elastic Block Store (EBS) service from AWS. The main function of the Storage Controller
is to create persistent EBS devices. It provides block storage to the instances over the
AoE (ATA-over-Ethernet) or iSCSI protocol. The SC also allows snapshots of volumes to be
created.
3.1.5. Cloud Controller (CLC)
The Cloud Controller (CLC) is the front end to the entire Eucalyptus cloud infrastructure.
The CLC provides an AWS-compliant web services interface to the client tools on one side
and interacts with the rest of the components of the Eucalyptus infrastructure on the other
side. The CLC also provides a web interface to users for managing certain aspects of the
cloud infrastructure. Its main function is to monitor the availability of resources
on the various components of the cloud infrastructure, including the hypervisor nodes that
are used to actually provision the instances and the cluster controllers that manage the
hypervisor nodes. The CLC is responsible for deciding which clusters will be used for
provisioning the instances, and it also monitors the running instances.
Of all the components, the CLC has the most complete knowledge of the availability and
usage of resources in the cloud and of the state of the cloud [1][2][3][4].
3.1.6. VMware Broker
The Eucalyptus framework includes an optional component, the VMware Broker. The
VMware Broker mediates all interaction between Eucalyptus and VMware infrastructure
components (that is, ESX/ESXi and vCenter); it acts as the interface between the
Eucalyptus cloud infrastructure and the VMware infrastructure [1][2][3][4].
3.2. Eucalyptus Machine Images (EMI)
A Eucalyptus Machine Image (EMI) is a combination of one or more virtual disk images, a
kernel image, and a ram-disk image, together with XML files containing metadata about the
images. These images are stored on WS3 and are used as templates for creating instances.
An EMI is described by the following XML manifest files:
· An XML file with a name like "linux.img.manifest.xml" with information
about one or more hard disk images, a kernel image and a ram-disk image (id
emi-66441F7E).
· An XML file with a name like "vmlinuz-2.6.28-11-server.manifest.xml" with
information about the corresponding kernel image (id eki-39FC1255).
· An XML file with a name like "initrd.img-2.6.28-11-server.manifest.xml"
with information about the corresponding ram-disk image (id eri-71ED1644).
Fig 3.1 Logical Eucalyptus architecture
Each of these images has its own ID that can be used while running the instances. Since most
enterprise/individual users of Eucalyptus have a need for bringing up instances based on
custom images, image management plays a key role in Eucalyptus administration. Bundling
an EMI is a multi-step procedure involving the following steps:
· Creating a virtual disk image.
· Installing the OS.
· Installing required applications.
· Making the OS ready to run on the Eucalyptus cloud.
· Registering the images with the Eucalyptus cloud.
· Testing the image.
3.3. Security
Eucalyptus provides ingress filtering on instances based on security groups, a concept
similar to security groups in AWS. Users create custom rules according to their
requirements and can specify a security group when launching an instance. Each security
group can have multiple rules associated with it. Each rule specifies the source IP/network,
protocol type, destination ports, etc. Any packet matching the parameters specified in a
rule is allowed to reach the Eucalyptus instance; all other packets are blocked. A security
group that has no rules associated with it blocks all incoming traffic.
The security group mechanism does not provide any egress filtering, so all outgoing traffic
from the instance is allowed [1][2][3][4].
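The ingress filtering behaviour described above can be illustrated with a minimal Python sketch. It is not Eucalyptus code: the Rule class, the is_allowed function and the sample web_group are hypothetical names introduced only for illustration, and the model assumes that each rule consists of a source network in CIDR form, a protocol and an inclusive range of destination ports.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    """One hypothetical ingress rule of a security group."""
    source: str      # source network in CIDR form, e.g. "10.0.0.0/8"
    protocol: str    # "tcp" or "udp"
    port_from: int   # first allowed destination port
    port_to: int     # last allowed destination port (inclusive)


def is_allowed(rules, src_ip, protocol, dst_port):
    """Return True if at least one rule matches the incoming packet.

    An empty rule list matches nothing, so all incoming traffic is blocked.
    """
    addr = ip_address(src_ip)
    for rule in rules:
        if (addr in ip_network(rule.source)
                and protocol == rule.protocol
                and rule.port_from <= dst_port <= rule.port_to):
            return True
    return False


# Example group: HTTP from anywhere, SSH from a single subnet only.
web_group = [
    Rule("0.0.0.0/0", "tcp", 80, 80),
    Rule("192.168.1.0/24", "tcp", 22, 22),
]
```

With this group, a TCP packet to port 80 is accepted from any address, while a TCP packet to port 22 is accepted only from 192.168.1.0/24. Outgoing traffic is not modelled because the mechanism does no egress filtering.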
REFERENCES
[1] www.eucalyptus.com
[2] http://www.eucalyptus.com/docs/eucalyptus/3.3/admin-guide/
[3] http://www.eucalyptus.com/docs/eucalyptus/3.3/user-guide/
[4] https://github.com/eucalyptus
4. Monitoring in Cloud Infrastructures
Monitoring the Cloud is of great importance for both Cloud service providers and Cloud
users. It is a key aspect of controlling, maintaining and managing hardware and software
infrastructures. Monitoring provides information and key performance parameters for platforms
and applications. Real-time monitoring of the Cloud and of its SLAs provides both the Vendor
and the Consumers with information such as the workload generated by the latter and the
performance and QoS offered through the Cloud. It also makes it possible to implement
mechanisms to prevent or recover from violations and breakdowns. Cloud Computing involves
many activities for which monitoring is an essential task.
4.1. Need for Monitoring in Cloud
4.1.1. Capacity and Resource Planning
One of the most challenging tasks for application and system developers, before the
large-scale adoption of Cloud Computing, has always been resource and capacity planning. In
order to guarantee the performance required by applications and services, developers
have to (i) quantify the capacity and resources (e.g. CPU, memory, storage) to be
purchased, depending on how such applications and services are designed and
implemented, and (ii) determine the estimated workload for the application or service [1].
However, while an estimate can be obtained through static analysis, testing and
monitoring, the real values are unpredictable and highly variable. Cloud vendors usually
offer guarantees in terms of QoS, and thus of resources and capacity for their services, as
specified in SLAs [2], and they are in charge of their own resource and capacity planning so
that service and application developers do not have to worry about it [3]. Monitoring is
the most important means for Cloud vendors to predict and keep track of the evolution of
all the parameters involved in the process of QoS assurance.
4.1.2. Capacity and Resource Management
The first step in managing a complex system like a Cloud is to have a monitoring
system able to accurately capture its real-time state [4]. Virtualization has become a key
component in implementing Cloud Computing. While hiding the high heterogeneity of the
resources of the physical infrastructure, virtualization technologies introduce another
level of complexity for the infrastructure provider, which has to manage both physical and
virtualized resources [1]. Virtualized resources may migrate from one physical machine to
another at any time. Hence, in Cloud Computing scenarios monitoring is necessary to cope with
the volatility of resources and fast-changing network conditions. The main concern of Cloud
vendors is to provide 100% uptime to the consumer's application.
4.1.3. Data Center Management
Cloud services are provided through large-scale data centers, located at multiple remote
places, whose management is a very important activity. This activity is in fact part of
resource management, but it is reported separately here because of its importance and its
peculiar requirements. Data center management activities (e.g. data center control) imply
two fundamental tasks: (i) monitoring, which keeps track of the desired hardware and software
metrics; (ii) data analysis, which processes such metrics to infer system or application states
for resource provisioning, troubleshooting, or other management actions [5]. In order to
properly manage such data centers, both the monitoring and the data analysis tasks must support
real-time operation and scale up to tens of thousands of heterogeneous nodes, dealing
with complex network topologies and I/O structures [1].
4.1.4. SLA Management
Monitoring may allow Cloud providers to formulate more realistic and dynamic SLAs
and better pricing models by exploiting the knowledge of user-perceived performance [6].
4.1.5. Billing
One of the most essential characteristics of Cloud Computing is to offer “measured
services”, allowing the Consumer to pay proportionally to the use of the service with
different metrics and different granularity, according to the type of service and the price
model adopted [1].
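To make the notion of billing granularity concrete, the following Python sketch computes a charge from monitored usage. It is only an illustration under stated assumptions: the function name metered_charge, the prices and the rule of rounding usage up to whole billing units are hypothetical and are not taken from any provider's price model.

```python
def metered_charge(usage_seconds, price_per_unit, unit_seconds=3600):
    """Return the charge for usage billed in whole units.

    price_per_unit is an integer amount in an arbitrary currency unit.
    Usage is rounded up to the next full billing unit, so a coarser
    granularity (larger unit_seconds) can bill for time that was not used.
    """
    if usage_seconds < 0 or unit_seconds <= 0:
        raise ValueError("usage must be non-negative and unit length positive")
    billed_units = -(-usage_seconds // unit_seconds)  # ceiling division
    return billed_units * price_per_unit


# 90 minutes of instance time under two hypothetical price models
# with the same nominal hourly price (12000 per hour = 200 per minute).
hourly = metered_charge(5400, price_per_unit=12000)                       # billed as 2 hours
per_minute = metered_charge(5400, price_per_unit=200, unit_seconds=60)    # billed as 90 minutes
```

Under these assumptions the hourly model charges for two full hours, while the per-minute model charges only for the 90 minutes that monitoring actually recorded.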
4.1.6. Troubleshooting
A comprehensive, reliable and timely monitoring platform is needed for
Providers to locate a problem inside their complex infrastructure
and for Consumers to understand whether a performance issue or failure is caused
by the Provider, the network infrastructure, or the application itself [7].
4.1.7. Performance and Security Management
Monitoring the perceived performance of the cloud is necessary to adapt to changes
or to apply corrective measures to overcome failures. Monitoring is also necessary because
it may considerably improve the performance of real applications [8] and affect activity
planning and the repeatability of experiments. Cloud security is important for a number
of reasons. Security is considered one of the most significant obstacles to the spread of
Cloud Computing, especially for certain kinds of applications. To host
critical services for public agencies, Clouds have to satisfy strict regulations and prove
that they do so. This can be done through a monitoring system that enables auditing.
4.2. Abstraction levels of monitoring in Cloud
Monitoring can be done on two different levels, depending on the beneficiary of the
monitoring information.
4.2.1. Low-Level Monitoring
Low-level monitoring concerns information that is collected by the Cloud Vendor and
usually not exposed to the Consumer. It deals with the status of the
physical infrastructure of the whole Cloud (e.g. servers and storage areas). In the
context of IaaS, both levels are of interest to both Consumers and Providers. Utilities
specific to low-level monitoring collect information at the hardware, operating
system, middleware, network and facility layers. For
commercial Cloud providers, the low-level monitoring service is usually kept
confidential.
4.2.2. High-Level Monitoring
High-level monitoring information is typically of interest to Cloud consumers. It
relates to the status and health of the virtual platform.
This information is collected at the middleware, application and user layers by Providers
or Consumers through platforms and services operated by themselves or by third parties.
In the case of SaaS, high-level monitoring information is generally of more interest to
the Consumer than to the Provider.
4.3. Commercial Cloud Monitoring Platforms and Services
Commercial platforms implement both low-level and high-level monitoring.
4.3.1. Amazon EC2
As a commercial cloud, Amazon does not provide any information about the low-level
monitoring system that it uses to gather information on the physical infrastructure. The
way the monitoring data is collected and analyzed is kept confidential. For high-level
resource monitoring, Amazon has adopted the approach of providing a
service called CloudWatch, in which the collected information relates mainly to the
virtual platforms [9]. CloudWatch is able to monitor Amazon services such as EC2, Elastic
Load Balancing and Amazon's Relational Database Service. CloudWatch gathers several
kinds of monitoring information and stores them for two weeks. From these data, users
can build plots, statistics, indicators, temporal behaviors, thresholds, alarms, etc. Alarms
can trigger specific actions such as event notification, through the Amazon SNS service, or
Auto-scaling [9]. CloudWatch is essentially a generic mechanism for the measurement,
aggregation and querying of historic data. All the measurements are aggregated
over a period of one minute [9]. In association with the Elastic Load Balancer service and
the Auto-scaling feature, CloudWatch can be configured to automatically replace
unhealthy platform instances with new instances. CloudWatch comes with an alarm
feature. An alarm has a number of actions that are triggered when a measure of a
parameter acquired by the monitoring service rises above a threshold or falls
below a threshold. The parameters that CloudWatch monitors
are configurable, and the thresholds correspond to configurable limits for these
measures. The possible actions are either a scaling action (Auto-scaling) or a notification
action (messaging alert). The basic monitoring plan (Free Tier) is free of charge; under it,
CloudWatch monitors basic metrics at a sampling interval of five minutes. The advanced
plan is charged; under it, CloudWatch provides advanced performance metrics at a
one-minute sampling interval [10].
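The combination of period aggregation and threshold alarms described in this section can be sketched in a few lines of Python. This is a simplified illustration and not the CloudWatch API: the function names aggregate_per_period and evaluate_alarm, the use of the average as the aggregation statistic, the sample data and the two state names are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def aggregate_per_period(samples, period=60):
    """Average (timestamp_seconds, value) samples into fixed-length periods.

    Returns one averaged datapoint per period, ordered by time.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[timestamp // period].append(value)
    return [mean(buckets[key]) for key in sorted(buckets)]


def evaluate_alarm(datapoints, threshold, above=True):
    """Return "ALARM" if the latest datapoint breaches the threshold, else "OK".

    With above=True the alarm fires when the value rises above the threshold,
    with above=False when it falls below it. No data is treated as "OK" here,
    which is a simplification.
    """
    if not datapoints:
        return "OK"
    latest = datapoints[-1]
    breached = latest > threshold if above else latest < threshold
    return "ALARM" if breached else "OK"


# Hypothetical CPU utilisation samples (seconds since start, percent).
cpu = [(0, 40.0), (30, 50.0), (60, 85.0), (90, 95.0)]
points = aggregate_per_period(cpu)            # one datapoint per minute
state = evaluate_alarm(points, threshold=80.0)
```

In a real deployment the "ALARM" state would be the trigger for one of the two kinds of action named above, a scaling action or a notification.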
4.3.2. Microsoft Azure
Microsoft does not disclose information about its low-level monitoring system. For
monitoring applications deployed on the Microsoft Windows Azure cloud infrastructure, the
application developer is given a software library that facilitates diagnostics
and monitoring for Azure applications. This library is integrated into the Windows Azure
SDK. It features performance counters, logging, and log monitoring. Several third-party
monitoring services have been developed around this library. Among them,
AzureWatch is the monitoring service most used by developers [11].
AzureWatch monitors and aggregates key performance metrics from Azure resources such as
instances, databases, database federations, storage, websites and web applications.
It supports user-defined performance counters related to application
metrics. According to the information available on its website, it explicitly addresses
Scalability, Adaptability, Autonomicity and Extensibility [11].
4.3.3. Rackspace
The information about the low-level monitoring system has not been made public. Rackspace
uses CloudSites to provide its consumers with monitoring data such as CPU utilization and
traffic volume. In addition, Rackspace provides tools, called Cloud Tools, that can be used
to build a complete monitoring solution with a specific focus on virtual machines and alerting
mechanisms. Rackspace has recently acquired CloudKick [12], a multi-Cloud
management platform with a wide range of both high-level and low-level monitoring features
and metrics, and provisions for developing custom plugins. Monitoring data can be visualized
in real time, and alert systems can be configured to inform users in real time (e.g. through
email or SMS) [12]. The platform mainly provides Scalability and Adaptability.
4.3.4. CloudStatus
CloudStatus is the first independent Cloud monitoring service that can be used to monitor
Amazon Web Services and Google's App Engine. It is built on top of Hyperic HQ. It
provides monitoring of user application performance, a methodology for
root cause analysis of performance changes and degradations, and both real-time and
weekly trends of the monitored metrics. The main feature of this platform is Timeliness [13].
4.3.5. GoGrid
GoGrid has not disclosed the low-level monitoring system that it uses to monitor the
physical infrastructure. GoGrid runs a collaboration program called
GoGrid Exchange [14], which provides third-party services that can be useful for
cloud consumers. Some of the services offered by the program include monitoring
features such as platform security monitoring, resource usage monitoring and
database monitoring. These services also allow configurable alerts
based on the values of the monitored metrics.
4.3.6. Nimsoft
CA Nimsoft Monitor aims to improve service quality and reduce the cost of IT
service delivery by providing a single, unified IT monitoring solution spanning both
traditional data centers and newer virtualization and cloud environments. It
allows users to proactively monitor the quality of service delivered by their entire
IT environment, including servers, applications, networks, databases, storage, cloud
environments and end-user experience. IT professionals and business users gain
end-to-end visibility into their IT services through customizable portals, dashboards and
reports. Nimsoft provides features such as scalability and multi-tenancy [15].
4.3.7. AppDynamics Pro
AppDynamics Pro is an advanced application performance monitoring service.
It can be used to monitor a single system, an entire data center or a Cloud
infrastructure, and it is independent of the underlying cloud fabric.
AppDynamics Pro provides real-time monitoring of many types of applications,
in particular database-oriented applications. Detailed information about AppDynamics is
provided in the following chapters.
There are many other commercial monitoring services, such as Monitis, LogicMonitor, Aneka
and GroundWork, that can be used as monitoring solutions for cloud infrastructures.
4.4. Open Source Monitoring Platforms and Services
4.4.1. Nagios
Nagios is a powerful monitoring system that enables organizations to identify and resolve
IT infrastructure problems before they affect critical business processes [16].
Nagios has been extended with monitoring capabilities for both virtual instances and
storage services in cloud infrastructures [17]. Thanks to these extensions it has been
adopted for monitoring Eucalyptus, a well-known open source platform for Cloud
Computing that is compatible with both the EC2 and S3 Amazon services. It is also used for
monitoring OpenStack, an open source Cloud platform for IaaS.
4.4.2. OpenNebula
OpenNebula is an open source project used for building and managing virtualized
enterprise data centers and enterprise private, public and hybrid clouds [18]. Using an entity
called the Information Manager, it monitors the physical infrastructure of the Cloud and
provides information to Cloud providers. Monitoring data are collected through probes
installed on the nodes and queried through SSH connections, and they
concern the status of the physical nodes. It provides Scalability and Adaptability as its main
features [1].
4.4.3. CloudStack Zenpack
CloudStack [19] is open source software written in Java, designed to deploy and
manage large networks of virtual machines, as a highly available and scalable Cloud
platform. It currently supports the most popular hypervisors (e.g. VMware, Oracle VM,
KVM, XenServer, and Xen Cloud Platform), and offers three ways to manage Cloud
computing environments: via a web interface, a command line tool, and a RESTful API.
In order to monitor CloudStack virtual and physical devices, a Zenoss extension called
ZenPack [20] can be used. It manages both alerts and events and provides the parameters
(aggregated from all zones, pods, clusters and hosts) related to the memory, CPU, and
storage, as well as to the network. The main feature offered by the CloudStack ZenPack
is Timeliness.
4.4.4. Nimbus
The Nimbus [21] platform is an integrated set of tools (application instantiation,
configuration, monitoring, repair, etc.) to implement infrastructure Clouds for scientific
users supporting the combination of OpenStack, Amazon, and other Clouds.
4.4.5. DARGOS
DARGOS [22] is a distributed Cloud monitoring architecture that uses a hybrid push/pull
approach to disseminate resource monitoring information. DARGOS provides accurate
measurements of the physical and virtual resources in the Cloud while maintaining a low
overhead. In addition, it is designed to be flexible and adaptable, so that new metrics can
be defined and monitored easily. The proposed
monitoring architecture and related tools have been integrated into a real Cloud
deployment based on the OpenStack platform [22].
Other open source monitoring services include Ganglia, Hyperic HQ, PCMONS and
Sensu.
4.5. Monitoring Applications in Cloud infrastructures (PaaS Level):
Businesses are driving their IT operatives to improve performance and boost productivity by
becoming increasingly application-centric, a radical change from the earlier
infrastructure-centric approach. At the same time, however, applications themselves are
becoming increasingly difficult to manage as they move toward highly distributed, multi-tier,
multi-element constructs that in many cases rely on application development frameworks such as
Microsoft .NET or Java. Moreover, businesses try to host multiple applications on a single
physical or virtual machine to minimize cost. Small businesses outsource the
hosting of their applications to public data centers or public clouds to lower costs further.
Virtualization has further reduced the cost of hosting applications in the cloud.
Multi-tenancy has become one of the main features of public data centers and cloud
infrastructures. In a multi-tenant architecture, multiple cloud clients share computing
resources. The main properties a multi-tenant cloud architecture should satisfy are
scalability, security and cost-effectiveness. Because of this cost-effectiveness, businesses
are migrating to public cloud infrastructures, but affordability comes at a price: the sharing
of resources. To provide scalability in a multi-tenant architecture, real-time monitoring at
the PaaS level is required so that applications can be scaled horizontally or vertically.
Application performance management (APM) is a discipline within systems management that
focuses on monitoring the performance and availability of software applications. APM looks at
how fast transactions are completed for an end user, or how fast information is delivered to
the end user, via a particular network or web services infrastructure. Virtualization and the
sharing of resources have added to the already existing complexity of cloud infrastructures,
so application performance management and monitoring tools have become very important.
REFERENCES
[1] Giuseppe Aceto, Alessio Botta, Walter de Donato, Antonio Pescapè, University of Napoli
Federico II, "Cloud Monitoring: a Survey"
[2] P. Hasselmeyer and N. d'Heureuse, "Towards holistic multi-tenant monitoring for virtual
data centers," in Network Operations and Management Symposium Workshops (NOMS Wksps),
2010 IEEE/IFIP, Apr. 2010, pp. 350-356.
[3] J. Shao and Q. Wang, "A performance guarantee approach for cloud applications based on
monitoring," in Computer Software and Applications Conference Workshops (COMPSACW),
2011 IEEE 35th Annual, Jul. 2011, pp. 25-30.
[4] A. Viratanapanu, A. K. A. Hamid, Y. Kawahara, T. Asami, "On demand fine grain resource
monitoring system for server consolidation," Kaleidoscope: Beyond the Internet? Innovations for
Future Networks and Services, 2010 ITU-T, IEEE, pp. 1-8.
[5] C. Wang, K. Schwan, V. Talwar, G. Eisenhauer, L. Hu, and M. Wolf, "A flexible architecture
integrating monitoring and analytics for managing large scale data centers," Proceedings of
ICAC, 2011.
[6] A. Khurshid, A. Al Nayeem, and I. Gupta, "Performance evaluation of the Illinois cloud
computing testbed," unpublished, Tech. Rep., Jun. 2009.
[7] L. Romano, D. D. Mari, Z. Jerzak, and C. Fetzer, "A novel approach to QoS monitoring in the
cloud," Data Compression, Communications and Processing, International Conference on, vol. 0,
pp. 45-51, 2011.
[8] J. Schad, J. Dittrich, and J. A. Q. Ruiz, "Runtime measurements in the cloud: observing,
analyzing, and reducing variance," Proc. VLDB Endow., vol. 3, no. 1-2, pp. 460-471, Sep. 2010.
[9] http://awsdocs.s3.amazonaws.com/AmazonCloudWatch/latest/acw---dg.pdf
[10] http://aws.amazon.com/pricing/cloudwatch/
[11] http://www.paraleap.com/azurewatch
[12] https://www.cloudkick.com/home
[13] http://www.hyperic.com/products/cloud-status-monitoring
[14] http://exchange.gogrid.com/
[15] http://www.ca.com/us/lpg/nimsoft.aspx
[16] http://www.nagios.org/about/
[17] Frédéric Desprez, Eddy Caron, Luis Rodero-Merino, Adrian Muresan, "Auto scaling, load
balancing and monitoring in commercial and open source clouds", CRC Press, Chapter in
"Cloud computing: methodology, system and applications" book, 2011.
[18] http://opennebula.org/about:about
[19] http://www.cloudstack.org/
[20] https://github.com/zenoss/ZenPacks.zenoss.CloudStack
[21] http://www.nimbusproject.org/
[22] http://community.rti.com/paper/dargos-highly-adaptable-and-scalable-monitoring-architecture-multi-tenant-clouds
5. KonaKart
KonaKart is a software application that implements an enterprise Java eCommerce / online
shopping cart system. Its main modules are [1][2]:
· A shop web application used by customers to buy your products online.
· An Administration application to enable you to manage your online store.
· Many Customization and Extension features.
5.1. KonaKart Community and Enterprise Versions
KonaKart comes in two separate editions: a free Community Edition, which can be
downloaded from the KonaKart website, and an Enterprise Edition, which can
be purchased from the KonaKart website. The Community Edition is intended for use by
small businesses and charitable organizations. A condition of its license agreement is to
display "Powered By KonaKart", with a link to the KonaKart web site, on the main page of
the online store. The Enterprise Extensions are available as a separate installation kit that
is installed on top of the Community Edition to provide more features and functionality. The
"Powered by KonaKart" link is not mandatory for a KonaKart-based store when the
Enterprise Extensions are installed over the Community Edition. The full source code of the
storefront application, including the KonaKart client engine, the Struts action classes, the
JSPs, the payment modules, the order total modules and the shipping modules, is included in
the Enterprise Edition of KonaKart. The Community Edition includes all of the above except
the source of the client engine. KonaKart supports most popular databases through JDBC
(e.g. MySQL, PostgreSQL, Oracle, DB2 and MS SQL Server are all supported in the download
package). KonaKart is written in Java and needs a servlet engine such as Apache Tomcat to
run. It takes a modular approach, with APIs at various levels. The APIs are available as
Java APIs, SOAP, JSON and RMI. The JSON APIs are used by the jQuery plugin, which allows the
KonaKart application engine to be called using AJAX JavaScript calls. The variety of
supported protocols promotes connectivity even from outside the company firewall and allows
client-side applications (e.g. .NET, MS Excel) to use the KonaKart engine [1][2].
5.2. KonaKart Enterprise version features:[1][2]
· KonaKart Client Engine source code which includes the full source code of the
client engine as well as a utility for creating an Eclipse project for customizing the
storefront application.
· Multi-Store mode allows you to run an unlimited number of stores with a single
KonaKart installation and a single database schema.
· Indexed search using Lucene search technology gives web users a lightning fast
search experience even for very large product catalogs. As you type into the
search box, a list of suggested search items matching the typed letters appears. The
suggestions are weighted by popularity so the most common suggestions are
shown first.
· Advanced marketing functionality that allows you to capture customer data as the
customer uses your KonaKart eCommerce store; and to use that data within
complex expressions in order to show dynamic content, activate promotions and
send eMail communications (Real Time Data Processing).
· Promotion evaluation directly for products, rather than as Order Total modules.
This allows a customer to view the available promotions for a product without
having to add it to the cart.
· Unlimited number of custom product attributes. Each attribute may include
metadata for validation and widget selection during data entry using the Admin
App.
· Unlimited number of miscellaneous objects may be associated with products and
categories.
· Wish List functionality.
· Gift Registry functionality.
· Reward Points (redeemable by the customers during checkout)
· Gift Certificates.
· The KonaKart APIs are available via Java RMI (Remote Method Invocation),
JSON (application engine only with JSON) and JavaScript.
· jQuery plugin.
· Shopping Widgets.
· Job Scheduling.
· Support for Recurring Billing.
· Google Base integration which allows you to publish your product information
for inclusion in Google search results (Search Engine Optimization)
· Product Synchronization feature to allow the synchronization of products between
pre-production and production environments.
· XML Import/Export feature for KonaKart objects such as product, customer,
order etc.
· Java Message Queue Integration (Apache ActiveMQ) to support the guaranteed
delivery of messages to external systems.
· The language of the admin app may be changed dynamically.
· PDF invoices can be created and sent to customers as email attachments and
downloaded from the storefront application.
· LDAP module to connect to an LDAP directory in order to validate customer and
admin user credentials.
5.3. Architecture:
5.3.1. Software Architecture
KonaKart has a modular architecture consisting of different software layers as can be
seen in the diagram below. Source code is provided for the blocks colored in light blue.
The diagram shows how storefront applications written in Java and other technologies
may interface to the KonaKart engines using one of the supported API technologies.
Fig 5.1 Layers of KONAKART e-commerce application
The KonaKart Server is a multi-threaded component that contains the core functionality
of the e-commerce application. It exposes a SOAP Web Service interface, an RMI
interface, a JSON interface and a Java API. It interfaces to a persistence layer and to
plug-in modules for calculating shipping costs, promotional discounts and for connecting to
payment gateways. The persistence layer supports databases from many different
database vendors such as Oracle, Microsoft’s SQL Server, DB2 from IBM, MySQL,
PostgreSQL and many others. The KonaKart Client manages the state of a user
(associated with the user's session) as the user navigates around the application. Struts
[http://struts.apache.org/] is a popular framework that implements the
Model-View-Controller (MVC) architecture. The source code of the Struts Action classes (for the
store front application [http://www.konakart.com/konakart/Welcome.action]) is
included in the download package in order to provide examples of how to call the
KonaKart Client API. The store front application uses JSPs to generate the UI.
However, other UI technologies can easily be used thanks to the modular
approach of KonaKart [1][2].
5.3.2. Deployment Architecture
KonaKart can be deployed on a single machine, but in a production environment
it is usually deployed in a distributed manner.
Fig 5.2 Sample Multi-tier KONAKART deployment architecture.
5.4. KonaKart Use Case UML Diagrams
5.4.1. Top Level Use Case Diagram
Fig 5.3 Top level UML use case diagram for Konakart application.
[Figure: elements shown are Customer, Login, Register, View Items, Add to Cart, Checkout,
Authentication, Identity Provider and Payment Service, connected by include and extend
relationships.]
5.4.2. View Items Use Case Diagram
Fig 5.4 View Items Use case Diagrams
[Figure: elements shown are View Items, Search Items, Browse Sections, Browse New Arrivals,
Browse Featured Product, Browse Items on Sale and Add to Cart, connected by extend
relationships.]
5.4.3. Check-out use case Diagram
Fig 5.5 Checkout use case diagram
[Figure: elements shown are Checkout, Login, Register, View/Update Cart, Shipping Calculation,
Payment, Pay by Debit and Pay by Credit, connected by include relationships.]
REFERENCES
[1] www.konakart.com
[2] KonaKart User and Developer Guide
6. AppDynamics - Application Performance Management and Monitoring
AppDynamics is an application performance management solution that combines
intelligent technology with an intuitive user interface for managing modern applications [1].
It is described as a next-generation application performance management solution, used by
leading business organizations to manage and monitor the performance and availability of
their revenue-critical Java applications [2]. AppDynamics is
available in two editions: AppDynamics Lite (free) and AppDynamics Pro (commercial).
AppDynamics is one of the first APM products built for highly distributed, service-oriented
environments. It enables fast root cause diagnostics at the method/class level while creating
no more than 2% overhead, even in high-volume production deployments [2]. AppDynamics can
also dynamically scale applications in cloud and virtual environments.
6.1. AppDynamics Lite components
AppDynamics Lite enables its users to monitor application requests in real time. It organizes
these requests into Business Transactions and calculates the health of each transaction based
on the number and rate of requests that are slow, stalled or in error. AppDynamics Lite
provides continuous, automatic diagnostic visibility and helps the user drill down to find the
root cause of a problem [2]. The AppDynamics Application Server Agent collects overall
performance data and diagnostic data and sends them to the AppDynamics Lite Viewer, where
the health of each Business Transaction can be monitored from a single dashboard. The user can
drill down into any Business Transaction to troubleshoot code hot spots, exception stack
traces and SQL queries [2].
6.1.1. AppDynamics Lite Application Server Agent
The main function of the Application Server Agent is to collect performance statistics,
data about business transactions, and application performance data. Users install and run
the Agent inside a JVM or CLR process on an application server. The Agent and Viewer
communicate through a one-way standard HTTP connection (Agent to Viewer). The Agent sends
all the performance data that it monitors to the Viewer for storage and analysis [2].
6.1.2. AppDynamics Lite Viewer
The Viewer stores and analyzes application performance data about a single JVM or
CLR, using data collected by the Agent. You install the Viewer on the same machine as
the application server or Java process that you want to monitor. The Viewer requires very
little disk space and memory, and you can open multiple Viewers for multiple application
servers running on the same machine. The Viewer is a browser-based Flash application
that runs on commonly used computer systems. To monitor multiple applications running
on the same machine, the user can start a Lite Viewer on a separate port for each one.
Every application has its own Lite Viewer and App Agent.
Fig 6.1 AppDynamics Lite Components
6.2. AppDynamics Lite Features and Uses:
6.2.1. Organizing User Requests Into Business Transactions
To simplify and organize the flow of traffic, AppDynamics Lite groups all application
requests into categories of requests called business transactions. AppDynamics Lite
applies automatic discovery rules for identifying the entities in a business application.[2]
6.2.2. Monitoring Business Transaction Health
AppDynamics Lite measures business transaction health based on the individual requests
for that transaction. If an ‘Add to cart’ operation from a shopping cart is identified as a
business transaction, AppDynamics Lite reports how many ‘Add to cart’ requests were normal,
slow, very slow, or had errors. It also monitors stalled checkouts [2]. AppDynamics Lite
tracks these metrics for different time ranges across a 2-hour period and displays a
graphical health indicator on the dashboard next to the business transaction.
6.2.3. Monitoring Backend Activity
Outbound calls from an App Server can be very performance intensive. In addition, they
are sensitive to availability of resources such as connections and sockets. Any delays or
errors in the external backends result in immediate performance problems for the
application. AppDynamics Lite monitors external calls from the monitored App Server to
standard backends.[2]
6.2.4. Identifying And Diagnosing Slow Requests
Identifying slow requests is an important part of monitoring business transaction health.
Slow requests are caused by problems with code or its design, database queries, resource
contention, and so on. AppDynamics Lite watches every request to isolate those that were
slow or very slow and keeps track of the total number of slow requests over a two-hour
period. It distinguishes two levels of slowness (slow and very slow) to separate a bad
user experience from a worse one. To detect slow requests, AppDynamics Lite compares the
time taken to complete a request against a set of thresholds, marks the request slow or
very slow accordingly, and sends diagnostic information to the Viewer. AppDynamics Lite
supports both dynamic and static thresholds for determining slow requests. It supports
only a global value, that is, a common threshold for slow or very slow requests across
all business transactions. When slow requests are very frequent, AppDynamics Lite
captures only a limited number of them to keep system overhead low [2].
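The threshold comparison described above can be sketched in a few lines of Python. This is a minimal illustration with a static, global threshold pair. The threshold values, function names, and sample data are assumptions made for the example and do not come from AppDynamics Lite.

```python
from collections import Counter

# Hypothetical static thresholds in milliseconds; the report does not state
# the values that AppDynamics Lite actually uses.
SLOW_MS = 500
VERY_SLOW_MS = 1500


def classify(response_ms, had_error=False):
    """Label one request as error, very slow, slow, or normal."""
    if had_error:
        return "error"
    if response_ms >= VERY_SLOW_MS:
        return "very slow"
    if response_ms >= SLOW_MS:
        return "slow"
    return "normal"


def summarize(requests):
    """Count the requests of one business transaction per label."""
    return Counter(classify(ms, err) for ms, err in requests)


# Made-up samples for an "Add to cart" transaction: (response time in ms, had error)
samples = [(120, False), (640, False), (2100, False), (90, True)]
print(summarize(samples))
```

Because the thresholds are global, the same `classify` function would be applied to every business transaction. A per-transaction dynamic baseline, as the Pro edition is described as offering, would replace the two constants with values learned from each transaction's history.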
6.2.5. Identifying And Diagnosing Stalls
Stalls are highly visible application performance problems, since there is often no
feedback to the end user for an unacceptable amount of time. The most common reasons
for stalls are failing to get a connection from a connection pool because of limited
resources and failing to connect to an external backend because of network issues.
Many other issues can also cause stalls. AppDynamics Lite isolates the application
requests that have stalled and keeps track of the total number of stalls over a
two-hour period [2].
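Keeping a total over a rolling two-hour period can be illustrated with a sliding-window counter. The sketch below is one common way to implement such a window and is not taken from AppDynamics Lite. The class name and the sample timestamps are hypothetical, and events are assumed to be recorded in time order.

```python
from collections import deque

WINDOW_SECONDS = 2 * 60 * 60  # the two-hour period mentioned in the text


class RollingCounter:
    """Counts events whose timestamps fall inside a sliding time window."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.events = deque()  # timestamps in seconds, oldest first

    def record(self, timestamp):
        self.events.append(timestamp)

    def count(self, now):
        # Drop events that have aged out of the window before counting.
        while self.events and self.events[0] <= now - self.window:
            self.events.popleft()
        return len(self.events)


stalls = RollingCounter()
for t in (0, 3600, 7000):      # hypothetical stall times, in seconds
    stalls.record(t)
print(stalls.count(now=7100))  # all three stalls are within the last two hours
print(stalls.count(now=7300))  # the stall at t=0 has aged out
```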
6.2.6. Isolating Root cause
AppDynamics Lite captures diagnostic data about bad user requests and presents it in call
graphs. You can analyze these detailed call graphs to determine the root cause of
problems. AppDynamics Lite also provides snapshots, which are call graphs with a high
level of detail [2].
6.2.7. Setting alerts and email notification
AppDynamics Lite enables the user to set up custom alerts for custom metrics. Email
notifications can be configured to be sent when a metric reaches its threshold.
6.2.8. Identifying and diagnosing errors
Identifying and being able to look at the root causes of errors in real time helps system
developers to resolve them faster, without having to scan through log files. AppDynamics
Lite watches for errors including exceptions, error log entries, and HTTP return codes. In
scenarios where there are many errors, AppDynamics Lite captures a limited number of
them to maintain an acceptable overhead.[2]
6.3. AppDynamics Lite vs AppDynamics Pro
Here are 14 key differences between the two editions.[3]
AppDynamics Lite | AppDynamics Pro
Lite is typically deployed on the user's desktop machine. | Pro requires a dedicated management server that can be managed either as a service by AppDynamics or by the user on-premise.
Lite provides visibility for a single JVM or IIS web server. | Pro provides complete visibility of every JVM, IIS web server, CLR, and tier across your distributed application.
Lite is limited to 30 business transaction types. | Pro is unlimited.
Lite is limited to server visibility. | Pro can monitor the user experience across the browser, network, and server.
Code execution and call stack visibility are limited to a single JVM and IIS web server in Lite. | With Pro, users can follow code execution across every JVM, IIS instance, and CLR while maintaining full business transaction context.
Lite and Pro for Java now both offer real-time monitoring of JMX metrics. | Pro, in addition, allows users to perform historical and long-term trending with baselines over time, and also with other metrics such as host OS, application metrics, and CLR counters for .NET.
Thresholds in Lite are driven by a single static threshold for all business transactions. | In Pro, every business transaction has its own dynamic baseline. This feature ensures that AppDynamics can learn your application's normal behavior and only send alerts when real problems occur.
Lite now provides basic alerting capabilities for simple thresholds on the application, business transactions, and JMX metrics (for Java). | Pro provides more advanced features such as rule-based policies for more granular alerting, along with launch-in-context links for rapid alert triage.
Diagnostic sessions and data are triggered manually by a user in Lite. | With Pro, they are triggered automatically when a threshold breach occurs.
Lite is limited to 2 hours of rolling data. | Pro is unlimited and can hold years of historical data.
NA | Pro also monitors the host OS machine to provide visibility into CPU, memory, disk, and network utilization.
NA | Pro for Java has automatic memory leak detection for resolving leaks in production environments.
NA | Pro offers workflow automation so that proactive action can be taken when needed. For example, AppDynamics enables cloud bursting when more resources need to be provisioned.
Table 6.1 AppDynamics Lite vs AppDynamics Pro
REFERENCES
[1] www.appdynamics.com
[2] http://litedocs.appdynamics.com
[3] http://www.reekya.pl/appd/images/downloads/ReekyaAppDynamics_Lite%20vs.%20Pro.pdf
7. Performance Evaluation
7.1. Deployment Architecture
The test environment consisted of five machines: four Eucalyptus Community Cloud (ECC) [1]
instances and one local machine for the benchmark test. The architecture was deployed in a
distributed manner, with one instance dedicated to the open source HAProxy load balancer, two
ECC instances running Apache Tomcat [2] servers that host the Konakart application files, and
a single instance dedicated to the Konakart MySQL database, which is shared by both Konakart
Java applications hosted in the Tomcat servlet containers. Apache JMeter [3] is used to
generate load on the HAProxy load balancer [4]. JMeter was run on the local machine so that
the load reached the cloud over the public internet, as real-world traffic would.
FIG 7.1 Deployment Architecture
[Figure 7.1 labels: Apache JMeter (local terminal) sends requests to HAPROXY (ECC, euca-173-205-188-102.eucalyptus.ecc.eucalyptus.com:8080/konakart); HAPROXY forwards them to KONAKART-TOMCAT1 and KONAKART-TOMCAT2 (ECC; labels 10.9.40.2:8080/konakart and 10.9.40.1:8080/konakar); both connect to the KONAKART MySql database (ECC, 10.9.40.3:3306)]
The Konakart application communicates with the Konakart MySQL database using the
MySQL Java database connector (JDBC). HAProxy balances the traffic between the
servers using the private IP address and port of each server. Because Konakart is an
e-commerce application, both instances share one database so that transactions remain
consistent. Apache JMeter is configured to generate HTTP requests over a period of
300 seconds. The number of requests is varied until the application breaks down. An
IaaS-level monitoring tool, "collectd" [5], is deployed on all the ECC instances to
collect IaaS-level metrics such as CPU usage, load, disk operations, and memory.
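The report notes later that HAProxy was set to the "round-robin" algorithm. As a rough illustration of what that means for the two Tomcat instances, the Python sketch below rotates requests across two backends and counts each share. It models the idea only and is not HAProxy's implementation. The function name is hypothetical, and the backend addresses are the private host and port pairs shown in the deployment figure.

```python
from collections import Counter
from itertools import cycle

# Private addresses of the two Tomcat instances, as labelled in Fig 7.1.
backends = ["10.9.40.1:8080", "10.9.40.2:8080"]


def round_robin(servers, n_requests):
    """Assign n_requests to servers in strict rotation and count each share."""
    rotation = cycle(servers)
    return Counter(next(rotation) for _ in range(n_requests))


shares = round_robin(backends, 4000)
print(shares)
```

With an even number of requests each backend receives exactly half, which matches the near-equal per-server throughput reported in the results.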
7.2. Test Goals
The goals of the above performance evaluation tests are:
· To test the performance of an e-commerce application on an open source private cloud
infrastructure.
· To compare the PaaS-level application metrics collected by AppDynamics Lite with the
IaaS-level metrics collected by the collectd tool.
· To analyze the performance of the architecture and suggest possible improvements to
it.
7.3. Test Environment
All the Eucalyptus instances have the following configuration:
Instance Type | m3.xlarge
No of Processors | 4
RAM | 2048 MB
Storage | 15 GB
OS | Ubuntu Lucid 10.04
Kernel | Linux
Architecture | x86_64 (64-bit)
Table 7.1 ECC Instance Configuration
7.4. Test Phase 1
Apache JMeter was configured to generate loads of 500, 1000, 1500, 2000, 2500, 3000, 3500,
and 4000 HTTP requests.
7.4.1. PaaS Metric Results (AppDynamics)
No of HTTP Requests | Server | Server Throughput (calls per minute) | Database Throughput (calls per minute) | HAProxy Average Response Time (ms)
500 | Server 1 | 20 | 1333 | 500
500 | Server 2 | 20 | 1334 | 500
1000 | Server 1 | 42 | 1522 | 600
1000 | Server 2 | 43 | 1522 | 600
1500 | Server 1 | 104 | 3076 | 500
1500 | Server 2 | 113 | 3333 | 500
2000 | Server 1 | 145 | 4666 | 500
2000 | Server 2 | 140 | 4505 | 500
2500 | Server 1 | 179 | 5747 | 500
2500 | Server 2 | 169 | 5824 | 500
3000 | Server 1 | 217 | 6967 | 500
3000 | Server 2 | 200 | 6767 | 500
3500 | Server 1 | 270 | 8666 | 500
3500 | Server 2 | 262 | 8404 | 500
4000 | Server 1 | 291 | 9333 | 600
4000 | Server 2 | 291 | 9333 | 600
Table 7.2 Apache JMeter load testing results
Note: Each of the above throughput values is an average over the last 30 minutes.
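One way to read Table 7.2 is to divide the database throughput by the web server throughput for each row. The Python sketch below does this with the table values. It assumes that both columns count calls per minute over the same averaging window, which the report does not state explicitly, and the function name is hypothetical.

```python
# (HTTP requests, server, web calls/min, database calls/min) from Table 7.2
rows = [
    (500, 1, 20, 1333), (500, 2, 20, 1334),
    (1000, 1, 42, 1522), (1000, 2, 43, 1522),
    (1500, 1, 104, 3076), (1500, 2, 113, 3333),
    (2000, 1, 145, 4666), (2000, 2, 140, 4505),
    (2500, 1, 179, 5747), (2500, 2, 169, 5824),
    (3000, 1, 217, 6967), (3000, 2, 200, 6767),
    (3500, 1, 270, 8666), (3500, 2, 262, 8404),
    (4000, 1, 291, 9333), (4000, 2, 291, 9333),
]


def db_calls_per_web_call(web_per_min, db_per_min):
    """Ratio of database throughput to web server throughput."""
    return db_per_min / web_per_min


for requests, server, web, db in rows:
    ratio = db_calls_per_web_call(web, db)
    print(f"{requests:>5} requests, server {server}: {ratio:5.1f} database calls per web call")
```

Under that assumption, every row shows well over one database call per web server call, which is the pattern behind the report's description of Konakart as database intensive.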
Fig 7.2 No of HTTP Requests Vs Web server Throughput
[Plot: x-axis No of HTTP Requests (0 to 5000); y-axis throughput (0 to 350 calls/min); series: Server 1 Throughput, Server 2 Throughput]
Fig 7.3 No of HTTP requests vs Database Server throughput
7.4.1.1. Observations(PaaS Metric results):
· Because the HAProxy load balancing algorithm was set to "round-robin", the load
balancer distributes the traffic evenly between the two web servers: the
throughput of both web servers is almost equal at every load level.
· The traffic was therefore successfully distributed among the web servers.
· The database throughput is much higher than that of either web server, as a
single database is shared by both web servers.
· The load balancer response time was almost constant (500 to 600 ms) throughout
the tests, which indicates that the deployment handled the web traffic efficiently.
· The main shortcoming of AppDynamics Lite is that it aggregates the
application performance statistics over a 2-hour period.
7.4.2. IaaS Metric Results (collectd)
No of HTTP requests | Server | CPU usage (jiffies) | Load
500 | Server 1 | 0.75 | 0.07
500 | Server 2 | 0.92 | 0.08
1000 | Server 1 | 1.53 | 0.14
1000 | Server 2 | 2.05 | 0.18
1500 | Server 1 | 1.47 | 0.15
1500 | Server 2 | 1.63 | 0.14
2000 | Server 1 | 1.54 | 0.21
2000 | Server 2 | 1.98 | 0.23
2500 | Server 1 | 1.78 | 0.22
2500 | Server 2 | 1.75 | 0.27
3000 | Server 1 | 2.08 | 0.24
3000 | Server 2 | 2.06 | 0.31
3500 | Server 1 | 2.46 | 0.22
3500 | Server 2 | 2.44 | 0.34
4000 | Server 1 | 2.8 | 0.28
4000 | Server 2 | 2.78 | 0.3
Table 7.3 Apache JMeter load testing results (IaaS metrics)
Fig 7.4. No of HTTP requests vs average CPU usage(jiffies)
Fig 7.5 No of HTTP requests Vs Average System Load
7.4.2.1. Observations(IaaS Metrics)
· The unit of CPU usage is the jiffy. A jiffy is the duration of one tick of
the system timer interrupt. It is not an absolute unit of time, since its
duration depends on the clock interrupt frequency of the particular hardware
platform [6], so different hardware platforms can have different jiffy values. A
jiffy is proverbially a short amount of time, typically 1/100 of a second. The
collectd CPU plugin [7] collects the amount of time spent by the CPU in
various states, most notably executing user code, executing system code,
waiting for I/O operations, and being idle, expressed in jiffies. Linux systems
provide a cumulative count of the jiffies spent in the user, nice, system, and
idle CPU modes. In most systems, four jiffy values are provided, namely the
minimum, average, maximum, and last value, each calculated over a specific
period of time. In our case, we considered the average value of the user CPU
mode over a one-hour period. The trend of CPU utilization (Fig 7.4) shows that
the number of jiffies increases as the number of HTTP requests increases.
· System load average represents the average number of processes that are in
the running (using the CPU) or runnable (waiting) states. One notable
exception exists: Linux also includes processes in uninterruptible sleep states,
typically waiting for some I/O activity to complete. This can markedly
increase the load average on Linux systems. The load average is calculated as
an exponential moving average of the load number (the number of processes
that are running or runnable). The three numbers returned as the system's load
average represent the one-, five-, and fifteen-minute moving load averages of
the system. For a single-processor machine, a load average of 1 means that, on
average, there is always one process in the running or runnable state. Thus, the
CPU is being utilized 100% of the time and is at capacity. The machines we
selected have 4 CPUs, so the load average corresponding to 100% utilization
is 4. The trend of system load (Fig 7.5, Table 7.3) over an increasing number
of HTTP requests shows that the system load increases with the number of HTTP
requests. The highest load value in Table 7.3 is 0.34, far below the value of 4
that corresponds to 100% CPU utilization on 4 cores. The instances are
therefore heavily over-provisioned for this workload, and a smaller instance
configuration could be used to lower the cost on a public cloud
infrastructure.
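The comparison between the observed load and the 4-core capacity can be written out as a short calculation. The Python sketch below uses the load values from Table 7.3 and the processor count from Table 7.1. The function name is hypothetical. Dividing load by the core count gives only a rough indicator, since the Linux load average also counts processes waiting on I/O.

```python
CORES = 4  # processors per instance, from Table 7.1

# Load values from Table 7.3: HTTP requests -> (Server 1, Server 2)
load = {
    500: (0.07, 0.08), 1000: (0.14, 0.18), 1500: (0.15, 0.14), 2000: (0.21, 0.23),
    2500: (0.22, 0.27), 3000: (0.24, 0.31), 3500: (0.22, 0.34), 4000: (0.28, 0.30),
}


def load_fraction(load_average, cores=CORES):
    """Load average as a fraction of the load that would keep every core busy."""
    return load_average / cores


peak = max(max(pair) for pair in load.values())
print(f"peak load {peak} is {load_fraction(peak):.1%} of the {CORES}-core capacity")
```

Under this reading the busiest web server used less than a tenth of the available capacity, which supports the observation that a smaller instance type would be sufficient.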
7.5. Testing Phase 2
In this phase we closely monitor the performance of the database server, because the two
web server instances share the same database and the throughput of the database server is
higher than that of either web server. We analyze the IaaS-level metrics for the database
server.
No of HTTP requests | Server | CPU Usage | Load
1000 | Server 1 | 2.18 | 0.16
1000 | Server 2 | 2.42 | 0.17
1000 | Database Server | 0.12 | 0.05
4000 | Server 1 | 2.91 | 0.22
4000 | Server 2 | 2.73 | 0.26
4000 | Database Server | 0.81 | 0.09
Table 7.4 Apache JMeter load testing results, Phase 2
· The IaaS metrics in the above table show that the large number of database calls
is barely visible in the database server's performance at the IaaS level. The
actual load on the database server therefore cannot be characterized by IaaS
metrics such as CPU usage and load alone.
7.6. Testing Phase 3
In this phase we closely monitor the MySQL database performance using the collectd MySQL
plugin [8]. The MySQL plugin connects to a MySQL database and issues a SHOW STATUS command
periodically. The command returns the server status variables, many of which are
collected [8]. This plugin helps to obtain the performance statistics of the
KONAKART MySQL database and display them graphically.
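Status variables such as `Com_select` are cumulative counters, so a per-second figure has to be derived from two successive readings. The Python sketch below shows that general derivation. It is not collectd's own code. The function name, the counter readings, and the 10-second sampling interval are all hypothetical.

```python
def per_second_rate(previous, current, interval_seconds):
    """Average rate of a monotonically increasing counter between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    if current < previous:
        # The counter went backwards, e.g. after a server restart; skip this sample.
        return None
    return (current - previous) / interval_seconds


# Hypothetical Com_select readings taken 10 seconds apart.
print(per_second_rate(previous=120_000, current=120_062, interval_seconds=10))
```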
7.6.1. Phase 3 Results:
Fig 7.6 MySql Plugin Metric.
No of HTTP requests | MySQL commands (SELECT statements) per second
500 | 6.24
1000 | 12.16
4000 | 36.58
Table 7.5 No of MySQL commands per second
· Fig 7.6 shows the number of executions per second for various SQL commands.
Because this is a static test, we analyze only the MySQL SELECT statement.
· For 500 requests, the average value of MySQL commands (SELECT) is 6.24, which
means that 6.24 SELECT commands were executed per second. As the test runs for
300 seconds, the approximate number of SELECT requests is 6.24 × 300 = 1872.
· Similarly, for 1000 and 4000 requests, the approximate numbers of SELECT
requests are 3648 and 10,974 respectively.
· This implies that there is a high number of requests on the database server and
that we need to dig deep into the MySQL performance statistics, as the traditional
IaaS metrics do not provide enough information about the actual load on the
database server.
· In our case, we analyzed only one MySQL command, but in a production
environment many MySQL commands are executed simultaneously, which adds
more load to the database server. Hence, auto-scaling the database is a
must in such scenarios.
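The totals quoted above follow from multiplying each per-second rate in Table 7.5 by the 300-second test duration. The Python sketch below reproduces that arithmetic and also divides each total by the number of HTTP requests. That last ratio is an added illustration and assumes the SELECT rate was steady for the whole run. The function and variable names are hypothetical.

```python
TEST_DURATION_S = 300  # JMeter test duration stated in Section 7.1

# HTTP requests -> average SELECT statements per second (Table 7.5)
select_rate = {500: 6.24, 1000: 12.16, 4000: 36.58}


def total_selects(rate_per_second, duration_s=TEST_DURATION_S):
    """Approximate number of SELECT statements issued during one test run."""
    return round(rate_per_second * duration_s)


for requests, rate in select_rate.items():
    total = total_selects(rate)
    print(f"{requests} requests: about {total} SELECTs, {total / requests:.2f} per HTTP request")
```

In each case the estimated number of SELECT statements exceeds the number of HTTP requests, consistent with the report's conclusion that the database handles more transactions than the web tier receives.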
7.7. Comparing IaaS performance metrics with PaaS performance metrics:
· Load:
System load at the IaaS level gives a rough overview of the utilization of the machine.
The system load is defined as the number of runnable tasks in the run queue and is
provided by many operating systems as a one-, five-, or fifteen-minute average [5]. The
PaaS-level Load metric monitored by AppDynamics is the application-level load
and describes the throughput of a single JVM. From the collectd statistics, we can
deduce that the IaaS-level load increases for both web servers as the total number
of HTTP requests increases. For the database server, however, the collectd load
plugin does not give enough information about the actual load: the IaaS-level load
stays low even though the AppDynamics Load metric (throughput) for the database is
high.
· Response Time:
The AppDynamics Lite response time metric corresponds to the time taken by the
web servers to respond to the HAProxy load balancer, whereas the response time of
the HAProxy load balancer (the Apache JMeter metric) is the time taken by the
load balancer to respond to an HTTP request. The latter can be regarded as the sum of
the web server response time and the time taken for a request to travel from the local
JMeter machine to the HAProxy instance.
· CPU utilization:
From the collectd CPU metric, we can conclude that the number of system I/O waits
increases as the number of HTTP requests increases.
REFERENCES
[1] http://www.eucalyptus.com/eucalyptus-cloud/get-started/try/community-cloud
[2] http://tomcat.apache.org/
[3] http://jmeter.apache.org/
[4] http://haproxy.1wt.eu/
[5] https://collectd.org/wiki/index.php/Plugin:Load
[6] http://en.wikipedia.org/wiki/Jiffy_(time)
[7] https://collectd.org/wiki/index.php/Plugin:CPU
[8] https://collectd.org/wiki/index.php/Plugin:MySQL
8. Conclusion and Future Work
8.1. Conclusion
This project presents the problem of evaluating an open source e-commerce framework
deployed as a multi-tier architecture on a cloud infrastructure. More specifically, it
focuses on the challenges faced in improving the performance of applications running on
multi-tenant cloud platforms. It evaluates AppDynamics Lite as a possible PaaS-level
solution for application monitoring in cloud infrastructures.
E-commerce websites have emerged very rapidly in recent years. Alongside major e-commerce
players such as Amazon.com and ebay.com, many small vendors are entering the market
to improve their sales and reach customers in all corners of the world. An e-commerce
website must handle its most critical business transactions and operations efficiently.
Because e-commerce is a service-oriented business that should meet all consumer demand,
monitoring such websites is important. Businesses aim to increase profits and reduce
losses, so they prefer to deploy their websites in multi-tenant architectures, which are
cheap because multiple organizations or vendors share computing resources. As public
web-hosting services and public clouds provide multi-tenant hosting at very affordable
prices, monitoring and auto-scaling have become essential on such public clouds.
AppDynamics provides such a monitoring solution for application-level metrics.
Because AppDynamics Lite retains monitoring data only for a rolling 2-hour period,
long-term data analysis is not possible with it. AppDynamics Pro does not have this
restriction and also offers more advanced features.
From the test results it can be concluded that Konakart is a database-intensive
application, as the number of transactions between the web servers and the database server
is greater than the number of HTTP requests. Under high traffic and load conditions, the
shared database server might break down. Hence, scaling the database server is especially
important for e-commerce frameworks.
8.2. Future Work
Future work on developing an open source e-commerce platform can include evaluating
different multi-tier deployment architectures of Konakart on Eucalyptus. Finding a way to
scale the database servers would be a good step forward for such e-commerce applications.
The e-commerce solution could also be evaluated on other cloud infrastructures such as
Amazon AWS, and the performance of the application on these clouds compared. Also,
evaluating various other open source e-commerce applications such as Magento, Zen Cart,
osCommerce, OpenCart, Spree Commerce, and PrestaShop
prestashop, etc. over various cloud infrastructures can be regarded as future research work