CloudPlatform Deployment
Reference Architecture
For Citrix CloudPlatform Version 3.0.x
© 2012 Citrix Systems, Inc. All rights reserved. Specifications are subject to change without notice. Citrix Systems, Inc., the
Citrix logo, Citrix XenServer, Citrix XenCenter, and Citrix CloudPlatform are trademarks or registered trademarks of Citrix
Systems, Inc. All other brands or products are trademarks or registered trademarks of their respective holders.
Contents

What's In This Guide
Workload-Driven Deployment Process
  Types of Cloud Workloads
  CloudPlatform Supports Both Workload Types
    Traditional Workload
    Cloud-Era Workload
Management Server Cluster Deployment
  What Type of Workload is the Management Server?
  Management Server Cluster Backup and Replication
  Management Server Cluster Hardware
    Primary Management Server Cluster
    Standby Management Server Cluster
  Management Server Cluster Configuration
    Primary Management Server Cluster Configuration
Cloud-Era Availability Zone Deployment
  Overview
    Network Configuration
  Cloud-Era Availability Zone Hardware
    Primary Storage Sizing
    Secondary Storage Sizing
  Cloud-Era Availability Zone Configuration
Traditional Availability Zone Deployment
  Overview
  Traditional Availability Zone Hardware
    Primary Storage Sizing
    Secondary Storage Sizing
  Choice of Hypervisor in Traditional Availability Zone
  Traditional Availability Zone Configuration (for vSphere)
  Traditional Availability Zone Configuration (for XenServer)
Disclaimer: Vendors and products mentioned in this document are provided as examples and should not be taken as
endorsements or indication of vendor certification.
What's In This Guide
This guide is for cloud operators who are planning medium- to large-scale production deployments of Citrix CloudPlatform. It is designed to work in conjunction with the CloudPlatform Installation Guide. This document offers high-level planning and architectural guidance, as opposed to detailed installation procedures, for production deployments. Refer to the CloudPlatform Installation Guide for the detailed steps needed to install and configure Citrix CloudPlatform, and to the CloudPlatform Administration Guide for instructions on how to operate, maintain, and upgrade a CloudPlatform installation.
Citrix CloudPlatform supports a large number of hypervisor, network, and storage configurations. To simplify the planning of large-scale production deployments, this document provides guidance on selecting the proper architecture and configuration according to the target workload the cloud is designed to support.

Before covering the different deployment architecture options, we first establish the foundation of our methodology: workload-driven deployment.
Workload-Driven Deployment Process
Citrix CloudPlatform™ is an open source software platform that pools datacenter resources to build public, private, and
hybrid Infrastructure as a Service (IaaS) clouds. CloudPlatform abstracts the network, storage, and compute nodes that
make up a datacenter and enables them to be delivered as a simple-to-manage, scalable cloud infrastructure. These nodes
or components of a cloud can vary greatly from datacenter to datacenter and cloud to cloud because they are defined by
the unique workloads or applications that they support. With so many options for servers, hypervisors, storage, and networking, it is imperative that cloud operators design with a specific application in mind to ensure the infrastructure meets the scalability and reliability requirements of the application.
The following figure illustrates the steps a cloud operator typically follows to determine the appropriate deployment
architecture for CloudPlatform.
Types of Cloud Workloads
Two distinct types of application workloads have emerged in cloud operators' datacenters.
The first type is a traditional enterprise workload. The majority of existing enterprise applications fall into this category. They include, for example, applications developed by leading enterprise vendors such as Microsoft, Oracle, and SAP. These applications are typically built to run on a single server or on a cluster of front-end and application server nodes backed by a database. Traditional workloads typically rely on technologies such as enterprise middleware clusters and vertically-scaled databases.
Citrix commonly refers to the second type as a Cloud-Era workload. Internet companies such as Amazon, Google, Zynga, and Facebook have long realized that traditional enterprise infrastructure was insufficient to serve the load generated by millions of users. These Internet companies pioneered a new style of application architecture that does not rely on enterprise-grade server clusters, but on a large number of loosely coupled computing and storage nodes. Applications developed this way often utilize technologies such as MySQL sharding, NoSQL databases, and geographic load balancing.

[Figure: the workload-driven deployment process: define target workloads; determine how each application workload will be delivered reliably; develop the deployment architecture; implement the cloud deployment; operate the cloud environment (e.g., monitor, upgrade, patch); the result is an IaaS cloud.]
There are two fundamental differences between traditional workloads and cloud-era workloads.
SCALE: The first difference is scale. Traditional enterprise applications serve tens of thousands of users and hundreds of sessions. Driven by the growth of the Internet and mobile devices, Internet applications serve tens of millions of users. This orders-of-magnitude difference in scale translates to a significant difference in demand for computing infrastructure; as a result, the need to reduce cost and improve efficiency becomes paramount.

RELIABILITY: The difference in scale has an important side effect. Enterprise applications can be designed to run on reliable hardware. Application developers do not expect the underlying enterprise-grade server or storage cluster to fail during the normal course of operation, and sophisticated backup and disaster recovery procedures can be set up to handle the unlikely scenario of hardware failure. Internet scale changed this paradigm: as the amount of hardware grows, it is no longer possible to deliver the same level of enterprise-grade reliability, backup, and disaster recovery at the scale needed to support Internet workloads in a cost-effective and efficient manner.
Traditional vs. Cloud-Era Workload Requirements

                 Traditional Workload        Cloud-Era Workload
Scale            10s of thousands of users   Millions of users
Reliability      99.999% uptime              Assumes failure
Infrastructure   Proprietary                 Commodity
Applications     SAP, Microsoft, Oracle      Web content, web apps, social media
Cloud-era workloads assume that the underlying infrastructure can and will fail. Instead of implementing disaster recovery as an afterthought, multi-site geographic failover must be designed into the application. Once the application expects infrastructure failure, it no longer needs to rely on technologies such as network link aggregation, storage multipathing, VM HA or fault tolerance, or VM live migration. Instead, the application is expected to treat servers and storage as "ephemeral resources": resources that can be used while they are available, but may become unavailable after a short period of use. Some cloud-era applications, such as the Netflix streaming video service, have notably employed a mechanism called "Chaos Monkey" that randomly destroys infrastructure nodes to ensure that the application can continue to function despite infrastructure failure.
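As a simplified illustration of the Chaos Monkey idea described above (this is a sketch, not Netflix's actual tool), a resilience test can periodically pick a random node and terminate it, then verify the application still functions on the survivors. The `terminate` callback here is a hypothetical stand-in for a real cloud API call.

```python
import random

def chaos_monkey_step(nodes, terminate, rng=random):
    """Randomly destroy one infrastructure node from a non-empty fleet.

    'terminate' is a stand-in for a real cloud API call (e.g. stopping
    a VM). The surviving nodes are returned so the caller can verify
    the application still functions without the killed node.
    """
    victim = rng.choice(list(nodes))
    terminate(victim)
    return [n for n in nodes if n != victim]

# Example: a 4-node fleet loses one node per step.
killed = []
survivors = chaos_monkey_step(["vm-1", "vm-2", "vm-3", "vm-4"], killed.append)
```

A real harness would run this in a loop against a staging environment while health checks confirm the service keeps serving traffic.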
Common Cloud Workloads

Traditional Workload Candidates

Communications / Productivity: Outlook, Exchange, or SharePoint
CRM / ERP / Database: Oracle, SAP
Desktop: Desktop-based computing, desktop service and support applications, and desktop management applications

Cloud-Era Workload Candidates

Web Service: Static and dynamic web content, streaming media, RSS, mash-ups, and SMS
Web Applications: Web service-enabled applications, eCommerce, eBusiness, Java application servers
Rich Internet Applications: Videos, online gaming, and mobile apps (Adobe Flex, Flash, AIR, Silverlight, iPhone)
Disaster Recovery: Onsite/offsite backup and recovery, live failover, cloud bursting for scale
HPC: Engineering design and analysis, scientific applications, high performance computing
Collaboration / Social Media: Web 2.0 applications for online sharing and collaboration (blog, CMS, file share, wiki, IM)
Batch Processing: Predictive usage for processing large workloads (data mining, warehousing, analytics, business intelligence)
Development and Test: Software development and test processes and image management
CloudPlatform Supports Both Workload Types
Citrix CloudPlatform is the only product in the industry today that supports both traditional enterprise and Cloud-Era workloads. While Cloud-Era workloads represent an application architecture that will likely become more dominant in the future, the majority of applications that exist today are written as enterprise-style workloads. With CloudPlatform, a cloud operator may design for one style of workload and add support for the other later, or may design to support both styles from the beginning.

The ability to support both styles of workload lies in CloudPlatform's architectural flexibility. Cloud operators can, for example, configure multiple availability zones with the different hypervisor, storage, and networking capabilities required by different types of workloads, meeting the security, compliance, and scalability needs of multiple cloud initiatives.
Traditional Workload
The following figure illustrates how a CloudPlatform Traditional Availability Zone can be constructed to support a traditional enterprise-style workload:
Traditional workloads in the cloud are typically designed with a requirement for high availability and fault tolerance and use
common components of an enterprise datacenter to meet those needs. This starts with an enterprise-grade hypervisor,
such as VMware vSphere or Citrix XenServer, that supports live migration of virtual machines and storage and has built-in
high availability. Storage of virtual machine images leverages high-performance SAN devices. Traditional physical network
infrastructure like firewalls and layer 2 switching are used and VLANs are designed to isolate traffic between servers and
tenants. VPN tunneling provides secure remote access and site-to-site access through existing network edge devices.
Applications are packaged using industry-standard OVF files.
Cloud-Era Workload
The following figure illustrates how a CloudPlatform Cloud-Era Availability Zone can be constructed to support cloud-era
workloads:
The desire for cost savings can easily offset the need for features when designing for a cloud-era workload, making open source and commodity components such as XenServer and KVM a more attractive option. In this workload type, virtual machine images are stored in EBS volumes, and an object store can be used to hold data that must persist through availability zone failures. Because of VLAN scalability limitations, software-defined networks are becoming necessary in cloud-era availability zones; CloudPlatform meets this need by supporting Security Groups in L3 networking. Elastic Load Balancing (ELB) or Global Server Load Balancing (GSLB) is used to redirect user traffic to servers in multiple availability zones. Third-party tools developed for Amazon Web Services to manage applications in this type of environment are readily available and have proven, tested integrations with CloudPlatform.
Management Server Cluster Deployment
The management server deployment is not dependent on the underlying style of cloud workload. A single management server cluster can manage multiple availability zones across multiple datacenters, enabling cloud operators to create different availability zones to handle different workload types as needed. The following figure illustrates how a single cloud can contain both cloud-era and traditional availability zones that are local or geographically dispersed.
What Type of Workload is the Management Server?
CloudPlatform Management Server is designed to run as a traditional enterprise-grade application, or traditional workload. It is a simple, lightweight, and highly efficient application, with the majority of work running inside system VMs (see CloudPlatform Administration Guide – Working with System Virtual Machines) and executed on computing nodes. This design choice was made for two reasons.
First, managing a cloud is not a cloud-scale problem. In CloudPlatform version 3.0.x, each management server node is certified to manage 10,000 computing nodes, a level of scalability sufficient for today's production cloud deployments. As CloudPlatform deployments continue to grow, we expect to tune the management server code so that each individual management server node can scale to many times more computing nodes.

The second reason for designing the management server as an enterprise application is pragmatic. Few people who deploy CloudPlatform will have a cloud-era infrastructure already in place. Without an existing IaaS cloud and third-party management tools like RightScale or enStratus, deploying a cloud-era workload is not an easy task. Building the CloudPlatform Management Server as a cloud-era workload would therefore create a bootstrap problem.
Management Server Cluster Backup and Replication
As a traditional-style enterprise application, the management server cluster is fronted by a load balancer and connects to a shared MySQL database. While the cluster nodes themselves are stateless and can be easily recreated, the MySQL database node should be backed up and replicated to a remote site to ensure continuing operation of the cloud. The following figure illustrates how a standby management server cluster is set up in a remote datacenter.
During the normal course of operation, the primary management server cluster serves all UI and API requests. Individual server failures in the management server cluster are tolerated because the other servers in the cluster take over the load.

To ensure the management server cluster can recover from a MySQL database failure, an identical database machine is set up as the backup MySQL server. All database transactions are replayed in real time on the backup MySQL server in an active-passive setup. If the primary MySQL server fails, the administrator can reconfigure the management server cluster to point to the backup MySQL server.

To ensure the system can recover from the failure of the entire availability zone that contains the primary management server cluster (availability zone 1), a standby management server cluster can be set up in another availability zone. Asynchronous replication is configured between the backup MySQL server in the primary management server cluster and the MySQL server in the standby management server cluster. If availability zone 1 fails, a cloud administrator can bring up the standby management server cluster and update the DNS records to redirect cloud API and UI traffic to it.
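The failover step above, repointing the management servers at the backup MySQL server, amounts to editing the database host in each management server's configuration file and restarting the service. The sketch below illustrates the idea; the property name `db.cloud.host` and the file location (e.g. a `db.properties` file under the management server's configuration directory) are assumptions based on typical CloudPlatform/CloudStack installs, so verify them against your installation.

```python
import re

def repoint_db_host(properties_text: str, new_host: str) -> str:
    """Rewrite the db.cloud.host entry in a db.properties-style file.

    NOTE: the property name 'db.cloud.host' and the file location are
    assumptions; check your CloudPlatform installation before using this.
    """
    return re.sub(r"^(db\.cloud\.host)=.*$",
                  rf"\g<1>={new_host}",
                  properties_text,
                  flags=re.MULTILINE)

# Example: fail over from the primary (10.52.2.142) to the backup (10.52.2.143)
original = "db.cloud.username=cloud\ndb.cloud.host=10.52.2.142\ndb.cloud.port=3306\n"
updated = repoint_db_host(original, "10.52.2.143")
```

In practice this edit would be applied to the configuration file on every management server node, followed by a service restart.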
Management Server Cluster Hardware

Primary Management Server Cluster

Citrix recommends a two-node management server cluster capable of managing a cloud deployment totaling 10,000 computing nodes.

Load Balancer: NetScaler VPX or MPX, based on the number of concurrent active sessions.
Management Server Node 1: Intel or AMD server with at least 2 GHz, 1 socket, 4 cores, 16 GB of memory, and 250 GB of RAID 1 local disk storage.
Management Server Node 2: Intel or AMD server with at least 2 GHz, 1 socket, 4 cores, 16 GB of memory, and 250 GB of RAID 1 local disk storage.
Primary MySQL Server: Intel or AMD server with at least 2 GHz, 1 socket, 4 cores, 16 GB of memory, and 250 GB of RAID 1 local disk storage.
Backup MySQL Server: Intel or AMD server with at least 2 GHz, 1 socket, 4 cores, 16 GB of memory, and 250 GB of RAID 1 local disk storage.

As long as adequate performance is available, it is permissible to run the management servers and MySQL servers as virtual machines, and to run NetScaler VPX as a virtual appliance.

Standby Management Server Cluster

The standby management server cluster is identical to the primary management server cluster with one difference: a backup MySQL server is not required.

Load Balancer: NetScaler VPX or MPX.
Management Server Node 1: Intel or AMD server with at least 2 GHz, 1 socket, 6 cores, 32 GB of memory, and 250 GB of RAID 1 local disk storage.
Management Server Node 2: Intel or AMD server with at least 2 GHz, 1 socket, 6 cores, 32 GB of memory, and 250 GB of RAID 1 local disk storage.
Primary MySQL Server: Intel or AMD server with at least 2 GHz, 1 socket, 6 cores, 32 GB of memory, and 250 GB of RAID 1 local disk storage.
Management Server Cluster Configuration
Primary Management Server Cluster Configuration
The database replication between the primary and standby clusters can be done using MySQL replication with the hot backup option. More information is available at http://www.innodb.com/wp/products/hot-backup/.
CloudPlatform Internal DNS: CPMS-URL (example URL pointing at the CloudPlatform Management Server); management nodes: 10.52.2.148, 10.52.2.149
CloudPlatform Version: CloudPlatform 3.0.x
MySQL Version: MySQL 5.1.61
MySQL Database (Master) IP Address: 10.52.2.142
MySQL Database (Slave) IP Address: 10.52.2.143

Management Server Node Configuration

Number of Servers (VMs) for Management: 2. This is a redundant design for high availability.
Name(s): CPMGSRV01, CPMGSRV02. These are sample names; no naming standard is implied.
IP Address(es): 10.52.2.148, 10.52.2.149. These addresses are specified for reference only and need to be changed to fit the network configuration of the datacenter.
Deployment Hypervisor: XenServer 6.0.2, the latest version of XenServer, tested and entitled with CloudPlatform 3.0.x.
Management Server VM Properties: 4 vCPUs, 16 GB RAM, 1 NIC, 250 GB HDD. The management server is memory-intensive, and having enough RAM ensures performance requirements are met.
Operating System: RHEL 6.2 (64-bit). RHEL is the recommended OS for its available commercial support.
Management Servers – Load Balancing
Load Balancing Used: Yes. Load balancing the management servers is a recommended practice to meet performance requirements.
Load Balancer: NetScaler VPX. Considering the load, number of users, and SSL connections this cloud architecture needs to manage, NetScaler VPX suffices; NetScaler MPX is an option if the load requirement goes beyond what is described in this document.

Load Balancer (NetScaler) Configuration

The CloudPlatform UI is load balanced. CloudPlatform requires that ports 8080 and 8250 be configured on the LB VIP, with persistence/stickiness across multiple sessions.

Source Port   Destination Port   Protocol   Persistence
8080          8080               HTTP       Yes
8250          8250               TCP        Yes
Master/Slave MySQL Configuration
CloudPlatform requires a MySQL database to store configuration information, VM staging data, and events related to every VM (every guest VM started as part of the cloud environment creates an associated event, which is stored in the database). The script provided with the CloudPlatform installation creates two databases, referred to as cloud and cloud_usage, and populates the initial data in each. The CloudPlatform Installation Guide details the scripts used for installing and preparing the databases.

CloudPlatform currently depends on the InnoDB engine in MySQL for foreign key support in both the cloud and cloud_usage databases; therefore, MySQL Cluster cannot be used. The following section describes a master/slave configuration of MySQL.

MySQL replication uses a master/slave topology, so there is no requirement for shared storage. Data is kept consistent between the two servers by asynchronous replication. The replication format used for CloudPlatform is row-based.

The MySQL community edition (GPL) is deployed on two separate virtual servers running Red Hat Enterprise Linux 6.2, with replication (master and slave) configured between them for high availability.
MySQL Database

Number of MySQL Database Instances: 2, in a master/slave configuration.
Virtual Machine Configuration: 2 vCPUs, 16 GB RAM, 250 GB local disk. Use shared storage for DB storage.
High Availability: MySQL master/slave replication. CloudPlatform does not support MySQL clustering; failover is manual.

The following settings are configured in /etc/my.cnf:

innodb_rollback_on_timeout = 1    Roll back the transaction on lock wait timeout.
innodb_lock_wait_timeout = 600    Lock wait timeout, in seconds.
max_connections = 700             Maximum number of MySQL connections; set to 350 * (number of CloudPlatform management nodes).
log-bin = mysql-bin               Enables the binary log and sets its location.
binlog-format = 'ROW'             Row-based binary log format.
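The max_connections rule (350 per management node) and the other recommended values can be captured in a small helper that emits the settings for a given cluster size. This is a sketch that mirrors the values above; treat its output as a starting point to merge into /etc/my.cnf, not a complete MySQL configuration.

```python
def mysqld_settings(num_mgmt_nodes: int) -> dict:
    """Recommended /etc/my.cnf [mysqld] values for a CloudPlatform
    management cluster, per the sizing guidance above."""
    return {
        "innodb_rollback_on_timeout": 1,
        "innodb_lock_wait_timeout": 600,            # seconds
        "max_connections": 350 * num_mgmt_nodes,    # 350 per management node
        "log-bin": "mysql-bin",                     # enable binary logging
        "binlog-format": "ROW",                     # row-based replication
    }

# For the two-node cluster in this reference architecture:
settings = mysqld_settings(2)
```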
Cloud-Era Availability Zone Deployment
Overview
In this section we describe how to design and configure a 3,200-node cloud-era availability zone where all 3,200 nodes reside in the same datacenter. These nodes are divided into 200 racks, or pods, with 16 hosts in each. The number of hosts in each pod is typically a function of the available power. If blade servers are used, 16 hosts constitute a typical blade chassis, and the embedded networking switches would eliminate the need for TOR switches. Each pod also contains an NFS server for primary storage.

The following figure illustrates how the compute hosts and storage servers are interconnected.
Network Configuration
Here is a summary of the networking configuration in the cloud-era availability zone:
1. A pair of NetScaler MPX appliances in an HA configuration is connected directly to the public Internet on one side, and on the other side to the datacenter core switch on an RFC 1918 private network.
2. The datacenter core switch and aggregation switches create 200 pairs of RFC 1918 private IP networks. Each pod consumes one pair: a storage/management network and a guest network.
3. Each host in the pod is connected to both of its pod's RFC 1918 private IP networks: a 10 Gbps network used for storage and management traffic, and a 1 Gbps network used to carry guest VM traffic.
4. There is one NFS server in each pod, connected to the storage/management network via a 10 Gbps Ethernet link.
5. Link aggregation may be used in the datacenter core and aggregation switches. Link aggregation is not used in TOR switches, hosts, or primary storage NFS servers.
6. A high-performance NFS server is directly connected to the datacenter aggregation switch layer and is used as the secondary storage server for this datacenter.
The datacenter core and aggregation switches set up the appropriate network ACLs to ensure that the various networks are properly isolated. The following table details best practices on whether access should be allowed or denied, by source (rows) and destination (columns):

Source \ Destination           Storage/Mgmt Network   Guest Network   Secondary Storage NFS Server   Public Internet
Storage/Mgmt Network           Allowed                Denied          Allowed                        NAT'ed
Guest Network                  Denied                 Allowed         Denied                         NAT'ed
Secondary Storage NFS Server   Allowed                Denied          Allowed                        Denied
Public Internet                Denied                 Denied          Denied                         Allowed
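The isolation rules above can be encoded as a simple lookup table that, for example, a configuration-audit script might use to check switch ACLs against intent. The matrix below transcribes the table verbatim; "nat" means traffic is permitted but NAT'ed.

```python
# Access policy between networks, transcribed from the table above.
# Keys are (source, destination); values are "allow", "deny", or "nat".
NETS = ["storage_mgmt", "guest", "secondary_nfs", "internet"]

ACL = {
    ("storage_mgmt", "storage_mgmt"): "allow",
    ("storage_mgmt", "guest"): "deny",
    ("storage_mgmt", "secondary_nfs"): "allow",
    ("storage_mgmt", "internet"): "nat",
    ("guest", "storage_mgmt"): "deny",
    ("guest", "guest"): "allow",
    ("guest", "secondary_nfs"): "deny",
    ("guest", "internet"): "nat",
    ("secondary_nfs", "storage_mgmt"): "allow",
    ("secondary_nfs", "guest"): "deny",
    ("secondary_nfs", "secondary_nfs"): "allow",
    ("secondary_nfs", "internet"): "deny",
    ("internet", "storage_mgmt"): "deny",
    ("internet", "guest"): "deny",
    ("internet", "secondary_nfs"): "deny",
    ("internet", "internet"): "allow",
}

def access(src: str, dst: str) -> str:
    """Return the policy for traffic from src to dst."""
    return ACL[(src, dst)]
```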
The detailed network and IP address configuration is listed below:

Storage/Management Network: Each host in the pod must have an IP address in the storage/management network. CloudPlatform will also use a small number of private IP addresses for system VMs, so a minimum of a /27 RFC 1918 private IP range must be allocated for each pod. These IP addresses are used exclusively by CloudPlatform. Each pod must have a different address range for storage/management.

Guest Network: The number of guest IP addresses for each pod is determined by the profile of the VMs supported. For example, if VMs on average have 2 GB of memory, allocate 64 VMs per host, or 1,024 VMs per pod. To be safe, allocate a /21 RFC 1918 private IP range for the guest network in each pod, allowing a maximum of 2,048 VMs to be created. Guest network IP ranges in different pods must not overlap. Cloud operators may choose to create site-to-site VPN tunnels that enable VMs in different availability zones to communicate with each other via their private IP addresses; if that is a requirement, guest network IP ranges in different availability zones must not overlap either.

Secondary Storage Server IP: One or more RFC 1918 IP addresses for the NFS server.
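The /27 and /21 allocations above can be sanity-checked with Python's ipaddress module: a /27 provides 32 addresses for hosts plus system VMs, and a /21 provides 2,048 addresses, matching the 2,048-VM ceiling per pod. The supernets chosen below (10.1.0.0/16 for storage/management, 10.8.0.0/13 for guest) are illustrative assumptions, not prescribed by this document.

```python
import ipaddress

def pod_subnets(num_pods: int):
    """Carve non-overlapping per-pod storage/management (/27) and
    guest (/21) ranges from RFC 1918 space.

    The supernets are illustrative choices; any RFC 1918 ranges with
    enough room for num_pods subnets would work.
    """
    mgmt = list(ipaddress.ip_network("10.1.0.0/16").subnets(new_prefix=27))
    guest = list(ipaddress.ip_network("10.8.0.0/13").subnets(new_prefix=21))
    assert len(mgmt) >= num_pods and len(guest) >= num_pods
    return [(mgmt[i], guest[i]) for i in range(num_pods)]

# The 200-pod zone described in this section:
pods = pod_subnets(200)
```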
Cloud-Era Availability Zone Hardware

Load Balancer: NetScaler MPX.
Core Switch and Aggregation Switch: Follow established networking practices.
TOR Switch: 2 per pod (one 10 GbE and one 1 GbE), 24 ports each. More ports are required if using IPMI or iLO to manage the individual hosts.
Computing Node: Intel or AMD server with at least 2 GHz, 2 sockets, 6 cores per socket, 128 GB of memory, and 250 GB of RAID 1 local disk storage.
Primary Storage NFS Server: Sized based on the profiles of VMs the cloud is designed to support. CloudPlatform supports thin provisioning, and primary storage sizing can take advantage of this to reduce the initial storage requirements (see the sizing calculation below).
Secondary Storage NFS Server: Sized according to the number of hosts and VM profiles (see the sizing calculation below).
Primary Storage Sizing
Primary storage sizing is based on the VM Profile. The formula for calculating the primary storage for each pod-specific NFS
storage would be as follows:
R = Average size of the system/root disk.
D = Average size of the Data volume.
N = Average number of Data volumes attached per VM.
V = Total number of VMs per pod.
The size of the primary storage required per pod would be
V * (R + (N*D))
Overprovisioning is supported on NFS storage devices in CloudPlatform and can be used to reduce the initial size
requirement of the primary storage per pod.
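The per-pod formula can be expressed as a small helper. The VM profile values below (20 GB root disk, one 100 GB data volume, 1,024 VMs) are hypothetical and only illustrate the calculation:

```python
def primary_storage_per_pod(r_gb, d_gb, n, v):
    """Return GB of primary storage needed for one pod: V * (R + N*D).

    r_gb -- average size of the system/root disk (GB)
    d_gb -- average size of a data volume (GB)
    n    -- average number of data volumes attached per VM
    v    -- total number of VMs per pod
    """
    return v * (r_gb + n * d_gb)

# Hypothetical VM profile: 20 GB root disk, one 100 GB data volume, 1024 VMs.
print(primary_storage_per_pod(20, 100, 1, 1024))  # 122880 GB (120 TB)
```

With thin provisioning enabled, the physical capacity purchased up front can be well below this logical total.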
Secondary Storage Sizing
For Secondary Storage Sizing, here is a formula to follow:
N = Number of VMs in the Zone.
S = Average Number of Snapshots per VM.
G = Average size of snapshot per VM.
T = Number of Templates in the zone.
I = Number of ISOs in the zone.
Secondary Storage sizing would be
((N * S * G) + (I * Avg Size of ISOs) + (T * Avg size of Templates)) * 1.2
There is a 20% spare capacity built into the formula. The actual size could be further reduced based on the following
factors:
Deduplication in the Storage Array.
Thin Provisioning.
Compression.
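The secondary storage formula can likewise be sketched in code. The zone profile used here (2,048 VMs, 2 snapshots of 10 GB each, 20 templates of 5 GB, 10 ISOs of 4 GB) is a hypothetical example:

```python
def secondary_storage_gb(n, s, g_gb, t, avg_template_gb, i, avg_iso_gb):
    """((N*S*G) + (I*avg ISO size) + (T*avg template size)) * 1.2, in GB.

    n -- number of VMs in the zone       s -- average snapshots per VM
    g_gb -- average snapshot size (GB)   t -- templates in the zone
    i -- ISOs in the zone
    """
    return ((n * s * g_gb) + (i * avg_iso_gb) + (t * avg_template_gb)) * 1.2

# Hypothetical zone: 2048 VMs x 2 snapshots of 10 GB, 20 templates of 5 GB,
# 10 ISOs of 4 GB.
print(secondary_storage_gb(2048, 2, 10, 20, 5, 10, 4))  # 49320.0 GB
```

Note that this example profile lands near the 50 TB per-zone figure used in the configuration tables that follow; snapshot count and size dominate the result.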
Cloud-Era Availability Zone Configuration
We will configure CloudPlatform as follows:
1. Each pod consists of 2 XenServer pools.
2. There are 8 hosts in each pool.
3. Create 2 NFS exports in the primary storage NFS server for each pool.
Availability Zone(s) – 1 (a minimum of two availability zones is always recommended)
ZONE-01
Network Mode Basic (L3 Network Model). This zone has two PODs with two clusters in each POD. The configuration is specified for one cluster and can be replicated for all other clusters in all the PODs.
\ZONE-01 \PODS\ <Z-01-POD01-Xen-CL01-04>
Name of Cluster(s) Z-01-POD01-Xen-CL01 Z-01-POD01-Xen-CL02 Z-01-POD02-Xen-CL03 Z-01-POD02-Xen-CL04
These names for POD and Clusters are specific to implementation.
Number of Hypervisors (compute nodes) per Cluster
8 x XenServer 6.0.X
Storage Infrastructure
Type / Make NetApp FAS3270
Number of Controllers 2 Two controllers for availability.
Primary Protocol NFS
Available Capacity 20TB/pod This is an example. Calculate the capacity requirement for primary and secondary storage using the formulas mentioned in the section above.
Primary Storage (two per cluster)
Z-01-POD01-CL (Replicate this for every cluster)
Availability Zone ZONE-01 Name of the Zone/POD/Clusters should be treated as examples.
Pod Z-01-POD01
Cluster Z-01-POD01-CL01
Protocol NFS
Size 4TB Please refer to the computation in the section above to determine the exact size.
Path / LUN NFS:/PS/Z-01-CL01-PS01/ NFS:/PS/Z-01-CL01-PS02/
This is a sample path.
Secondary Storage
Z-01-SS01
Type / Make NetApp FAS3270
Number of Controllers 2
Primary Protocol NFS
Available Capacity 50TB This is per Zone. Calculate the capacity requirement from the formula mentioned in the section above.
Host Configuration
XenServer Version 6.0.2 Latest version of XenServer
XenServer Edition Advanced
Server Hardware Specifications
HP DL360p Gen8
Networking Configuration
One 1 G NIC and one 10 G NIC, plus 1 NIC for IPMI.
The 1 G NIC is dedicated to the public network and the 10 G NIC to private/storage traffic.
Number of XenServer Hosts (computing nodes)
16 (hosts per pod)
Network Configuration
Distribution (core) Switch
Juniper EX4500
Access Switch Juniper EX4200 (4) 2 per POD
48 10G ports
Traditional Availability Zone Deployment
Overview
In this section we describe how to design and configure a 64-node traditional server virtualization availability zone. The
availability zone consists of 4 pods, each comprising 16 nodes. Unlike the cloud-era setup, where each pod has its own
NFS servers, the entire zone shares a centralized storage server over a SAN. The availability zone is connected to 4 shared
VLANs: public, DMZ, test-dev, and production. In addition, tenants can be allocated isolated VLANs from a pool of zone
VLANs. A VM can be connected to one or more of these networks:
An isolated VLAN NAT’ed to public internet via the virtual router
The DMZ VLAN
The test-dev VLAN
The production VLAN
The following figure illustrates the physical network setup for a traditional availability zone:
Every host is connected to 3 networks:
1. A storage network that connects the host to primary storage. Storage multipath technology should be used to ensure reliability.
2. An untagged Ethernet network used for management and vMotion traffic. NIC bonding should be used to ensure reliability.
3. An Ethernet network used for shared and public VLAN traffic. NIC bonding should be used to ensure reliability. This network is used to carry 4 shared VLANs: public, DMZ, test-dev, and production. It is also used to carry the isolated zone VLANs.
Either a 1 Gbps or a 10 Gbps network can be used, depending on the workload and VM density requirements.
The detailed network and IP address configuration is listed in the following table:
Storage Area Network Apply vendor’s best practices for SAN setup
Management/vMotion Network Each host needs 1 RFC 1918 private IP address. CloudPlatform consumes additional
private IPs for system VMs like CloudPlatform virtual routers. Reserve at least a /22
private IP address range to ensure plenty of private IPs (1,024) are available for
system VMs.
Management/vMotion network IP ranges in different pods must not overlap.
VLAN network Carries tagged VLAN traffic for shared and isolated VLANs.
Secondary Storage Server IP One or more RFC 1918 IP addresses for the NFS server.
Traditional Availability Zone Hardware
Core Switching Fabric Follow established networking practices
TOR Switch 2 per pod, 48 ports each to allow NIC bonding.
Computing node Intel or AMD CPU server with at least 2 GHz, 2 sockets, 6 cores per socket,
128 GB of memory, and 250 GB of RAID 1 local disk storage.
Primary Storage Server Sized based on the profiles of VMs the cloud is designed to support
Secondary Storage NFS Server Sized according to VM profiles
Primary Storage Sizing
Primary storage sizing is based on the VM profile. The formula for calculating the primary
storage uses the following variables:
S = Average size of the system/root disk.
D = Average size of the Data volume.
N = Average number of Data volumes attached per VM.
V = Total number of VMs per pod.
R = Number of pods in the zone.
Since R is the number of pods, the size of the primary storage required for the zone (served by the centralized SAN) would be
R * V * (S + (N*D))
If using tiered storage, which is quite common in traditional Enterprise style workloads, repeat the calculation for each tier.
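As a sketch, the zone-level total can be computed from the per-pod figures above. The disk sizes are hypothetical; the 960 VMs per pod follows from 16 hosts at the 60-VMs-per-server target stated later in this section:

```python
def zone_primary_storage_gb(s_gb, d_gb, n, v, r):
    """R * V * (S + N*D): primary storage (GB) for the whole zone's SAN.

    s_gb -- average size of the system/root disk (GB)
    d_gb -- average size of a data volume (GB)
    n    -- average number of data volumes attached per VM
    v    -- total number of VMs per pod
    r    -- number of pods in the zone
    """
    return r * v * (s_gb + n * d_gb)

# Hypothetical profile: 20 GB root disk, one 100 GB data volume per VM,
# 960 VMs per pod (16 hosts x 60 VMs), 4 pods in the zone.
print(zone_primary_storage_gb(20, 100, 1, 960, 4))  # 460800 GB (450 TB)
```

For tiered storage, run the same calculation once per tier with that tier's VM counts and volume sizes.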
Secondary Storage Sizing
For Secondary Storage Sizing, here is a formula to follow:
N = Number of VMs in the Zone.
S = Average Number of Snapshots per VM.
G = Average size of snapshot per VM.
T = Number of Templates in the zone.
I = Number of ISOs in the zone.
Secondary Storage sizing would be
((N * S * G) + (I * Avg Size of ISOs) + (T * Avg size of Templates)) * 1.2
There is a 20% spare capacity built into the formula. The actual size could be further reduced based on the following
factors:
Deduplication in the Storage Array.
Thin Provisioning.
Compression.
Choice of Hypervisor in Traditional Availability Zone
There are a variety of choices of hypervisors in a traditional availability zone. The following table lists the recommended
configuration for each hypervisor type.
XenServer vSphere
Primary Storage NFS iSCSI or FC
Storage Network Link Aggregation (LACP) Multipathing
Cluster Size 8 8
Traditional Availability Zone Configuration (for vSphere)
Availability Zone 1- ZONE-VMW-01
Name of Zone [ZONE-VMW-01] Sample name for the Zone.
Network Mode Advanced Network with VLANs Advanced Networking is required when
using the VMware ESX hypervisor
with vCenter.
VLAN Type Tagged VLANs
Guest Networks CIDR 10.2.1.0/24 This CIDR is a sample; the
CloudPlatform administrator can
choose one based on their networking
best practices.
Guest VLAN Range 300-1000 These VLANs are allocated for each
account and any isolated/shared
network created apart from a guest
network. You can compute a range
roughly at an average of 3 VLANs per
customer.
Guest networks (VM Traffic) VMware Switch: vSwitch0 The virtual switch names specified are samples and should be changed to fit actual configurations.
Storage network VMware Switch: vSwitch3 The virtual switch names specified are samples and should be changed to fit actual configurations.
Management network (Control Plane Traffic) VMware Switch: vSwitch1 The virtual switch names specified are samples and should be changed to fit actual configurations.
Public network VMware Switch: vSwitch2 The virtual switch names specified are samples and should be changed to fit actual configurations.
POD (Z01-POD01) Replicate this for each POD
Pod Name Z01-POD01 Sample name for POD.
Start Reserved System IPs 10.144.53.201 These are examples; change them to
suit your network configuration.
IPs for the CloudPlatform hosts,
storage, and network devices within
the Pod. For 64 hosts, enough IPs
should be allocated for virtual routers,
secondary storage VMs, and Console Proxy VMs.
End Reserved System IPs 10.144.53.235 These are examples; change them to
suit your network configuration.
IPs for the CloudPlatform hosts,
storage, and network devices within
the Pod.
Number of Clusters 2 Clusters Citrix recommends 8 servers per cluster -- this provides the optimum management-to-performance ratio.
Cluster Name Z01-POD1-VMW-CL01
Z01-POD1-VMW-CL02
Sample names for clusters.
Hypervisor VMware ESXi 5.0 vCenter must use port 443 (default).
Compute Nodes in cluster (replicate for each cluster)
Number of Servers 8 hosts per Cluster. Cisco UCS B230 M1 blades (Guest
VMs)
Make & Model Cisco UCS B230 M1 Blade Servers
CPUs 2 x 6-core Intel CPUs
Memory 128GB RAM 128 GB should be sufficient for most
workloads but can be increased based
on target workload and hypervisor
capacity.
Target Number of VMs 60 per server
Network Hardware
Access switches 2 x Cisco Nexus 5548
Storage Hardware
Shared Hypervisor Storage
(Primary Storage)
Storage System: EMC VNX 7500
Protocol: VMFS
VMFS Datastore: Z1-P1-CL01-
PS01
Citrix recommends a minimum of two
primary storage volumes per cluster.
Use VMFS file system when storage is
connected by iSCSI or FC.
Name for VMFS Datastore mentioned
here is a sample.
Traditional Availability Zone Configuration (for XenServer)
Availability Zone 1- ZONE-XEN-01
Name of Zone [ZONE-XEN-01] Sample name for the zone.
Network Mode Advanced Network with VLANs Advanced Networking is required when
using the Citrix XenServer
hypervisor.
VLAN Type Tagged VLANs
Guest Networks CIDR 10.2.1.0/24 This CIDR is a sample; the
CloudPlatform administrator can
choose one based on their networking
best practices.
Guest VLAN Range 300-1000 These VLANs are allocated for each
account and any isolated/shared
network created apart from a guest
network. You can compute a range
roughly at an average of 3 VLANs per
customer.
Guest networks (VM Traffic) Network Label (XenServer Bridge): cloud-guest These names are examples and should be changed to match the XenServer configuration.
Storage network Network Label (XenServer Bridge): cloud-storage These names are examples and should be changed to match the XenServer configuration.
VM Management network (Control Plane Traffic) Network Label (XenServer Bridge): cloud-mgmt These names are examples and should be changed to match the XenServer configuration.
Public networks Network Label (XenServer Bridge): cloud-pub These names are examples and should be changed to match the XenServer configuration.
POD (Z01-POD01) Replicate this for each POD
Pod Name Z01-POD01 Sample name for the pod.
Start Reserved System IPs 10.144.53.201 These are examples and should be
changed to suit your network
configuration.
XenServer uses link-local addresses
for virtual routers and system VMs.
End Reserved System IPs 10.144.53.210 These are examples and should be
changed to suit your network
configuration.
IPs for the CloudPlatform hosts,
storage, and network devices within
the Pod.
Number of Clusters 2 clusters Citrix recommends 8 servers per cluster -- this provides the optimum management-to-performance ratio.
Cluster Name Z01-POD1-XEN-CL01
Z01-POD1-XEN-CL02
Sample name for clusters.
Hypervisor Citrix XenServer 6.0.x
Compute Nodes in Cluster (replicate for each cluster)
Number of Servers 8 hosts per cluster.
Make & Model HP DL360 G8
CPUs 2 x 6-core Intel CPUs
Memory 128 GB RAM 128 GB should be sufficient for most
workloads but can be increased based
on target workload and hypervisor
capacity.
Target Number of VMs 60 per server
Network Hardware
Access switches 2 x Cisco Nexus 5548
Storage Hardware
Shared Hypervisor Storage
(Primary Storage)
Storage System: NetApp FAS3240AE
Protocol: NFS
NFS Mount 1:/Z1-P1-CL01-PS01/
NFS Mount 2: /Z1-P1-CL01-PS02/
Citrix recommends a minimum of two
primary storage volumes per cluster.
Sample NFS Mount names.