h8886 Prosphere Overview

8/13/2019 h8886 Prosphere Overview http://slidepdf.com/reader/full/h8886-prosphere-overview 1/13

White Paper

PROSPHERE: NEXT GENERATION STORAGE RESOURCE MANAGEMENT

August 2011

Abstract

This white paper introduces ProSphere®, the next generation of Storage Resource Management (SRM) from EMC. The architecture is discussed at a high level with the key concepts and principles of the architecture presented. A focus of the paper is the innovative ProSphere information architecture which enables a more open and flexible SRM platform.



Table of Contents

Executive summary
  Audience

Architectural overview
  Standards-based
  Resource-Oriented
  Event-Driven
  Deploying ProSphere
  Many models: one system
  Federated Information Provides a Global Perspective
  Anything can be a data source
  Information is Searchable
  Information is Integrated
  Information is Secured
  SRM meets Big Data
    Big Data
    Greenplum

Conclusion
  Simplicity, quality and value

References


Executive summary

EMC Ionix™ ControlCenter® has been the market leader for enterprise Storage Resource Management (SRM) for the better part of a decade. During that time, the industry has witnessed an explosion in the quantity and variety of information that SRM applications must gather, process, and analyze. The physical infrastructure of a typical corporate data center has undergone profound change with the spread of heterogeneous environments, virtualization and the commoditization of storage. The benefits promised by new cloud computing models and the adoption of virtualization technologies create a new category of management challenges for corporations of all sizes. It is no wonder that SRM products that were developed in the 90s are struggling to meet the challenges of today's IT organizations.

EMC ProSphere is architected to meet the demands of the modern data center. It is able to meet the "web-scale" challenges of a global enterprise, and has the ability to quickly support new hardware and emerging business models. In short, it is architected to support change.

Audience

This white paper provides a high-level overview of the ProSphere architecture, its principles and some of the technologies leveraged. The intended audience for this document is anyone interested in understanding the ProSphere architecture. This document should be easily understood by end users (e.g. system and storage administrators and architects), system implementers (e.g. solutions architects, EMC engineering), support staff, and EMC partners.

Architectural overview

ProSphere was architected and developed with key principles that address the quality attributes (simplicity, flexibility, maintainability, usability, performance, and scalability) that are required for an enterprise-level management product.

Standards-based

The first of these principles is that the product should be standards-based, meaning the technologies leveraged internally and exposed externally should be standards-based.

The use of standards begins with the standard protocols used to discover a customer’s environment. The primary protocols in use in ProSphere are SMI-S for storage element discovery, SNMP for network devices, and WMI and SSH for host discovery. The standards used in the implementation of ProSphere cover both standard protocols (HTTP, SSL) and commonly deployed architectural principles (e.g. REST).

REST is an architectural style that is the basis for the implementation of the World Wide Web. One reason this style is so widely deployed and successful is that it is relatively simple to understand and implement, which makes the interoperability of networked systems straightforward and robust, the web being a prime example. The REST architectural style is also the basis of many of today’s commonly used APIs (e.g. Google Data Protocol, Microsoft’s Open Data Protocol (OData), Netflix) and has become the communication style used in the majority of software products and platforms developed today.

The primary communication protocol in a RESTful architecture is HTTP, with SSL (i.e. HTTPS) implemented to provide secure communication and authentication between network endpoints. ProSphere components communicate exclusively via HTTP.

The formats used in the exposure of information from ProSphere are also based on standards such as Atom and JSON. The Atom specification has its origin as a better version of RSS, used in content subscriptions on the web. In addition, it is commonly used as the method of communication for blogging applications. Both the Atom Specification and the Atom Publishing Protocol have been adopted by many organizations because of their simplicity and flexibility. The Atom format can easily represent lists of time-based entries, as well as express content represented in plain text, XML and HTML. ProSphere uses this format to expose the majority of its information model, with one of the benefits being the ability to use any web browser to navigate the information model.
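To make the Atom format concrete, the following is a minimal sketch of how a list of resources could be serialized as an Atom feed using only the Python standard library. The feed identifier, resource names, and host name are hypothetical and are not taken from ProSphere; only the Atom namespace and the element names (feed, entry, id, title, updated) come from the Atom specification.

```python
import xml.etree.ElementTree as ET

# Atom namespace URI as defined by the Atom specification (RFC 4287).
ATOM_URI = "http://www.w3.org/2005/Atom"
# Serialize Atom elements without a prefix (xmlns="...").
ET.register_namespace("", ATOM_URI)


def _child(parent, name, text):
    """Append a namespaced Atom child element carrying the given text."""
    element = ET.SubElement(parent, f"{{{ATOM_URI}}}{name}")
    element.text = text
    return element


def build_feed(feed_id, title, updated, resources):
    """Return an Atom feed (as a string) with one entry per (uri, name) pair."""
    feed = ET.Element(f"{{{ATOM_URI}}}feed")
    _child(feed, "id", feed_id)
    _child(feed, "title", title)
    _child(feed, "updated", updated)
    for uri, name in resources:
        entry = ET.SubElement(feed, f"{{{ATOM_URI}}}entry")
        _child(entry, "id", uri)
        _child(entry, "title", name)
        _child(entry, "updated", updated)
    return ET.tostring(feed, encoding="unicode")


# Hypothetical resources; the host name is illustrative only.
feed_xml = build_feed(
    "https://prosphere.example.com/arrays",
    "Storage arrays",
    "2011-08-01T00:00:00Z",
    [
        ("https://prosphere.example.com/arrays/1", "array-1"),
        ("https://prosphere.example.com/arrays/2", "array-2"),
    ],
)
```

Because the result is ordinary XML, any Atom-aware client or browser can read it without product-specific tooling, which is the benefit the paragraph above describes.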

Because standards are leveraged throughout the ProSphere architecture, ProSphere and its customers can take advantage of widely deployed and easily understood tools, which provides immediate value and low-cost integration opportunities. The barriers to adoption are removed, as there are no new APIs, programming languages or protocols to learn and implement. An additional benefit is the short ramp-up time for integration partners and other developers leveraging the ProSphere architecture, as the tools used are likely well-known to them.

Resource-Oriented

All objects managed in ProSphere, from a storage array to a user, are resources that are modeled, discovered, managed, and exposed using the most appropriate technology available. An HTTP URI (Uniform Resource Identifier) uniquely identifies these resources.

There is an important distinction to be made between a Uniform Resource Locator (URL), a Uniform Resource Name (URN) and a URI. A URL is a network locator reference, or address of the resource. A URN, as its name implies, is just the name or identifier of the resource. The term URI covers both of these terms, as both the location given by a URL and the name given by a URN are unique.

HTTP URIs are also used in Semantic Web applications that leverage RDF. An HTTP URI is used to identify an object uniquely, within the context of the application.
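The distinction between the three terms can be illustrated with a small sketch that inspects the scheme of an identifier. The classification rule used here (the "urn" scheme means URN, a short list of network schemes means URL, anything else is simply a URI) is a simplifying assumption for illustration, and the example identifiers are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: schemes treated as network locators.
LOCATOR_SCHEMES = {"http", "https", "ftp"}


def classify(identifier):
    """Label an identifier as 'URN', 'URL', or the more general 'URI'."""
    scheme = urlsplit(identifier).scheme
    if scheme == "urn":
        return "URN"
    if scheme in LOCATOR_SCHEMES:
        return "URL"
    return "URI"


labels = {
    text: classify(text)
    for text in (
        "urn:isbn:0451450523",
        "https://prosphere.example.com/arrays/1",
        "mailto:admin@example.com",
    )
}
```

In every case the identifier is a URI; the label only records whether it additionally names the resource (URN) or locates it on the network (URL).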


Each resource in the ProSphere system has a type that is uniquely identified by a URI. This includes all storage elements (e.g. storage volume, storage port, HBA, switch port), as well as all application infrastructure entities (e.g. log file, service, process). All of the instances of these resource types comprise the ProSphere information model.

Properties, expressed as URIs, are used to associate types and instances of these types. When an instance is discovered (e.g. a storage array), a unique HTTP URI, known as a resource URI, is created to identify the new instance. With this newly minted URI, the resource instance can now be associated with other instances and statements may be made about the resource (e.g. “storageArray1 isLocatedIn Houston”).
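The minting of a resource URI and the statements made about it can be sketched as follows. This is an illustration only: the base URI, the path layout, and the `ResourceRegistry` class are hypothetical and do not describe how ProSphere constructs its URIs.

```python
# Hypothetical base URI; the real URI layout is not given in this paper.
BASE = "https://prosphere.example.com"


class ResourceRegistry:
    """Mints a unique URI per discovered instance and records statements."""

    def __init__(self, base=BASE):
        self.base = base
        self._counters = {}      # per-type instance counters
        self.statements = set()  # (subject, property, object) tuples

    def mint(self, type_name):
        """Create a new resource URI and record the instance's type."""
        number = self._counters.get(type_name, 0) + 1
        self._counters[type_name] = number
        uri = f"{self.base}/{type_name}/{number}"
        self.state(uri, "type", f"{self.base}/type/{type_name}")
        return uri

    def state(self, subject, prop, obj):
        """Record a statement; the property itself is expressed as a URI."""
        self.statements.add((subject, f"{self.base}/property/{prop}", obj))


registry = ResourceRegistry()
array_uri = registry.mint("storageArray")
registry.state(array_uri, "isLocatedIn", "Houston")
```

Once the URI exists, any number of further statements can reference it without changing the registry or the resource type.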

Information about resources is exposed using the standards-based representation (e.g. Atom feeds etc.) most appropriate for the task to be accomplished. For example, a list of storage arrays is easily acquired and then divided into discrete pages, by using Atom Feed pagination. A collection of performance metrics for an entire day is most compactly expressed using JSON or CSV.
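Paged Atom feeds are commonly linked together with a link element whose rel attribute is "next". The sketch below walks such a chain and yields every entry title. It assumes that convention; the page URLs and contents are hypothetical, and the `fetch` function reads from an in-memory dictionary where a real client would issue an HTTPS GET.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def make_page(titles, next_url=None):
    """Build a tiny Atom page for the demo, optionally linking to the next one."""
    link = f'<link rel="next" href="{next_url}"/>' if next_url else ""
    entries = "".join(f"<entry><title>{t}</title></entry>" for t in titles)
    return f'<feed xmlns="http://www.w3.org/2005/Atom">{link}{entries}</feed>'


# Hypothetical pages keyed by URL; stands in for a remote server.
PAGES = {
    "https://prosphere.example.com/arrays?page=1": make_page(
        ["array-1", "array-2"], "https://prosphere.example.com/arrays?page=2"
    ),
    "https://prosphere.example.com/arrays?page=2": make_page(["array-3"]),
}


def fetch(url):
    """Return the feed document for a URL (in-memory stand-in for HTTPS)."""
    return PAGES[url]


def iter_titles(start_url, fetch=fetch):
    """Yield entry titles from every page, following rel="next" links."""
    url = start_url
    while url:
        root = ET.fromstring(fetch(url))
        for entry in root.findall(f"{ATOM}entry"):
            yield entry.findtext(f"{ATOM}title")
        url = None
        for link in root.findall(f"{ATOM}link"):
            if link.get("rel") == "next":
                url = link.get("href")


all_titles = list(iter_titles("https://prosphere.example.com/arrays?page=1"))
```

The client never needs to know how many pages exist; it simply follows links until none remain.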

The consistent use of a resource information model greatly simplifies the development and maintenance of ProSphere, as this leads to a model that is tightly integrated and consistently used throughout the application. This uniformity, in turn, provides greater reliability for the customer, since ProSphere applies the same services against both external and internal models.

The RESTful exposure of resources, with consistent URIs, also simplifies the system for end users, as the API can be understood without requiring prior reading or education.

Event-Driven

An event-driven architecture is an architectural pattern, related to Service Oriented Architecture (SOA). It promotes the coordination of loosely coupled components to provide intelligent data processing and enables highly scalable systems. In an event-driven architecture, messages representing events are communicated point-to-point or across a shared bus. Components use these messages to trigger analysis and other processes, usually emitting additional messages to communicate state changes.

ProSphere uses messages to communicate:

•  The existence of events such as storage element modifications

•  Exceptional state changes (e.g. port down)

•  State changes in processing (e.g. discovery complete)

•  State changes of resources (e.g. entity created, entity updated, entity deleted)

•  State changes of system components

•  Alert creation and modification

By decoupling the message and the processing related to the message, components can be distributed and aggregated to provide high availability and scalability. Because the interface to the component is simply a message, any number of components or threads can be deployed to process these messages concurrently, thus increasing the throughput of the system.
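The idea that any number of workers can consume messages concurrently can be sketched with a shared queue and a pool of threads. This is a generic illustration of the pattern and not ProSphere's messaging layer; the message fields and event type names are hypothetical.

```python
import queue
import threading


def process_events(events, handler, workers=4):
    """Run handler over all events using a pool of worker threads."""
    pending = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            event = pending.get()
            if event is None:  # sentinel: no more work for this worker
                pending.task_done()
                return
            outcome = handler(event)
            with results_lock:
                results.append(outcome)
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for event in events:
        pending.put(event)
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    pending.join()
    for thread in threads:
        thread.join()
    return results


def to_alert(event):
    """Hypothetical handler: emit a follow-on message for each event."""
    return {"type": "alert.created", "source": event["type"],
            "resource": event["resource"]}


emitted = process_events(
    [
        {"type": "port.down", "resource": "port-1"},
        {"type": "discovery.complete", "resource": "array-1"},
    ],
    to_alert,
    workers=2,
)
```

The producers only know the message format, so the number of workers can be raised or lowered without touching them. The order of results is not guaranteed, which is typical of concurrent consumers.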

Complex Event Processing (CEP) services can be layered over this foundation to create a highly scalable and flexible framework to analyze events generated in a data center in near real-time to provide greater insight and clarity into storage-related events. ProSphere uses CEP technology to analyze large volumes of performance metrics against performance thresholds, in order to produce events that ultimately result in the creation of alerts. The event-driven architecture and CEP allow large volumes of events (thousands per second) to be processed. This enables near real-time results that can notify a storage administrator as soon as a problem is detected, regardless of the size of the environment under management.
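A very small example of the kind of rule a CEP layer evaluates is "raise an event when a metric exceeds its threshold for several consecutive samples". The sketch below implements that single rule over a stream of samples. The rule, the metric names, and the threshold values are assumptions chosen for illustration; the paper does not describe the actual rules ProSphere applies.

```python
from collections import defaultdict, deque


def detect_breaches(samples, thresholds, consecutive=3):
    """Return one event per run of `consecutive` over-threshold samples.

    samples:    iterable of (resource, metric, value) tuples in arrival order
    thresholds: mapping of metric name to its upper limit
    """
    recent = defaultdict(lambda: deque(maxlen=consecutive))
    events = []
    for resource, metric, value in samples:
        limit = thresholds.get(metric)
        if limit is None:
            continue  # no rule for this metric
        window = recent[(resource, metric)]
        window.append(value > limit)
        if len(window) == consecutive and all(window):
            events.append({"resource": resource, "metric": metric, "value": value})
            window.clear()  # start counting again after raising the event
    return events


# Hypothetical latency samples (milliseconds) for one port.
stream = [
    ("port-1", "latency_ms", 5),
    ("port-1", "latency_ms", 12),
    ("port-1", "latency_ms", 13),
    ("port-1", "latency_ms", 14),
    ("port-1", "latency_ms", 15),
]
breaches = detect_breaches(stream, {"latency_ms": 10}, consecutive=3)
```

Requiring several consecutive breaches is one common way to avoid alerting on a single noisy sample.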

Figure 1 - ProSphere Architecture

[Figure omitted. Surviving labels: ProSphere vApp, ProSphere vApp]


Deploying ProSphere

ProSphere is distributed as a virtual application (vApp) that can be deployed and managed through industry standard tools provided by VMware. Scaling the system is as simple as deploying additional virtual machines (VMs) and distributing the processing among them. Data is federated among ProSphere VMs using a simple, but secure, mechanism that runs over HTTP using SSL (i.e., HTTPS) and standard TCP ports. Because the federated data distribution uses standard ports, through which firewall access is typically already allowed, the need to manage complex firewall policies is avoided.

The entire system can be updated from a central location using VMware vCenter™ Update Manager, which distributes ProSphere enhancements and patches (packaged as Linux RPMs). VMware vCenter Update Manager scans the state of the ProSphere deployment, as well as guest operating systems, compares them with baselines set by the administrator and then applies updates and patches to enforce compliance with mandated patch standards. An Update Manager web UI gives administrators visibility into the patch status of the entire deployment. This capability dramatically simplifies the patch management process, while helping protect the data center against bugs and security vulnerabilities.

The initial deployment of ProSphere consists of three virtual machines:

•  ProSphere Appliance

•  Historical Database

•  Discovery Engine

The ProSphere Appliance is the foundation of the system. All of ProSphere’s application components and discovered information are contained in this VM.

The Historical Database hosts the EMC Greenplum database, which holds the historical performance data used to populate dashboards and performance charts. This same database instance is extensible and is capable of hosting other historical information.

The Discovery Engine hosts agentless technology used to discover a customer’s environment. The discovery appliances leverage lightweight, standards-based discovery mechanisms that are less intrusive than agent-based discovery systems. Agentless technology enables quicker time to value and reduces the amount of time customers need to spend managing ProSphere. For more details, please refer to the white paper: ProSphere Discovery and Monitoring for the Modern Data Center.


Figure 2 - ProSphere Appliance Components

Many models: one system

One of the great challenges with traditional enterprise applications is their ability to respond to change. One of the root causes of this limitation is a single application model that constrains a system’s ability to scale and to adapt easily and flexibly to innovations in technology, markets, and business goals. ProSphere addresses these challenges by leveraging the power of the Semantic Web. With the Semantic Web, models are essentially directed graphs, where each edge is a verifiable statement in the form of subject, predicate, and object ("host1.somecompany.com" "connectedTo" "Symm1"). The power of this model results from being able to uniquely identify a subject or object (with a URI) and then add arbitrary facts about objects without having to reengineer the model and force a costly upgrade process. As a more pragmatic example, support for additional information models (such as compliance) or for the latest virtualization innovation can be added without having to re-engineer the whole product. Using RDF technology, the system is able to persist billions of statements about a data center, equating to a representation of millions of objects, and leverage that information in a number of ways.
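The subject, predicate, object model can be sketched with a small in-memory store that supports pattern matching. It is an illustration of the modeling idea only and is unrelated to the RDF store ProSphere actually uses; the "compliance" predicate and its value are hypothetical.

```python
class TripleStore:
    """A minimal in-memory set of (subject, predicate, object) statements."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return statements matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


graph = TripleStore()
graph.add("host1.somecompany.com", "connectedTo", "Symm1")
graph.add("host2.somecompany.com", "connectedTo", "Symm1")

# A new kind of fact is added later with no schema change or migration.
graph.add("Symm1", "complianceStatus", "passed")

hosts_on_symm1 = [s for s, _, _ in graph.match(predicate="connectedTo", obj="Symm1")]
```

Adding the compliance fact required no change to the store or to the existing statements, which is the flexibility the paragraph above attributes to the graph model.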

Federated Information Provides a Global Perspective

Another common limitation of traditional enterprise applications is the locking of information into application or data center silos. This makes it difficult to combine data from multiple locations and provide a global picture for executive management.

ProSphere utilizes the inherent flexibility of the Semantic Web to "push" top-level data across each ProSphere vApp to provide a holistic view of the whole enterprise. A more detailed view of lower level objects such as storage volumes is obtainable by following the appropriate URL back to the cluster that originally discovered the information.

The ProSphere Single Sign-On Service provides seamless integration within and between vApp instances.

Anything can be a data source

One key challenge for any enterprise-level product addressing storage resource management is data collection. The ability to discover the equipment in a data center is constrained by the ability to gain access to the resource in the first place. Issues such as credential management, physical access, and organizational boundaries are pervasive and all contribute to make data collection problematic if not impossible.

ProSphere has adopted a flexible and pragmatic approach to this data collection. ProSphere employs agentless discovery appliances, which support data collection using standard protocols such as SMI-S, WMI and SSH for discovery of storage, switches and hosts.

Information is Searchable

The data ProSphere gathers about an environment is indexed, so that it can be searched in a similar way to how one searches the internet using a web search engine. This results in a search capability that is particularly useful for problem solving activities and for navigating the large amounts of data that are gathered in all but the smallest of data center environments. In addition to indexing the data, ProSphere also enriches the gathered data by adding information from additional models, such as metadata, to improve search results.
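Keyword search of this kind is typically built on an inverted index, which maps each term to the resources that mention it. The following sketch shows the idea, including enrichment of a resource with extra descriptive text before indexing. The resource names, descriptions, and enrichment text are hypothetical, and the paper does not state which indexing technology ProSphere uses.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of resource ids whose text contains it."""
    index = defaultdict(set)
    for resource_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(resource_id)
    return index


def search(index, query):
    """Return the resource ids that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits


# Hypothetical discovered data, enriched with location metadata.
discovered = {"array-1": "Symmetrix storage array", "host-1": "Linux host"}
enrichment = {"array-1": "Houston", "host-1": "Houston"}
documents = {rid: f"{text} {enrichment.get(rid, '')}" for rid, text in discovered.items()}

index = build_index(documents)
houston_arrays = search(index, "houston array")
```

Without the enrichment step a search for "Houston" would return nothing, since the location does not appear in the raw discovered text.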

Information is Integrated

One of the challenging aspects of ControlCenter has been the management of a wide range of storage devices from a central location. Although this is desirable, keeping pace in a timely manner with the depth of capability that users need across this large range of devices is hardly sustainable. In ProSphere, this challenge is addressed by providing the ability to launch an element manager seamlessly and in context, with Single Sign-On.


Information is Secured

Security is no longer an optional extra for today’s applications. As the number of security-related incidents rises and the adverse publicity they attract causes lasting damage to an organization, internal security policies and external regulation are becoming increasingly prescriptive.

The first and overriding security principle behind ProSphere is to protect the data that is gathered during operations, and respect the environment into which ProSphere has been deployed. Communications between ProSphere components are secured using industrial-strength cryptography, and any credentials that ProSphere requires for device access are securely stored on the ProSphere Appliance. ProSphere does not allow unauthenticated access to the system and both users and system components are required to identify themselves before accessing system resources.

The second security principle in ProSphere is to follow industry best practices for the secure development and support of ProSphere in the field. ProSphere adheres to the EMC Secure Development Lifecycle program, which lays out a series of security-focused activities for each step in the product lifecycle. The vulnerability assessment program keeps customers informed of the availability of security patches for the product and for third-party components used in EMC products.

SRM meets Big Data

Big Data

Organizations increasingly face the need to maintain large amounts of data for long periods, due to regulatory requirements and an emerging understanding that there is a real advantage in the processing and statistical analysis of streams of real-time data. Given these needs, it is important that products that utilize “web-scale” levels of data use the right persistence and query mechanisms. The ability to scale out horizontally across machine boundaries is fundamental to achieving the levels of performance that are required.

ProSphere meets these requirements using the bigdata® RDF store. This store holds all the discovery and related alerting data. The bigdata RDF store is a horizontally scaled storage and computing fabric that supports optional transactions, with very high concurrency, and very high aggregate IO rates. The bigdata store was designed from the ground up as a distributed database architecture running over clusters of tens to hundreds of machines, but can also run in a high-performance single-server mode, as is done in ProSphere.

Greenplum

The integration of Greenplum technology into ProSphere adds new capabilities for the advanced analysis of performance and storage data. The main difference between Greenplum’s technology and other database software schemes is how data is accessed. In traditional database management systems, like the one used in ControlCenter, different query processing jobs generally share access to the same hard disk drives, which can slow down individual queries. Greenplum’s “shared-nothing” system divides data across multiple servers or segments, each of which has its own connection to a disk drive. That means a single database query can be run against many segments of data simultaneously, which is well suited to advanced analytical operations.

Figure 3 - Greenplum's MPP Shared Nothing Architecture
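The shared-nothing idea can be sketched in a few lines: rows are distributed across segments by a hash of their key, each segment computes a partial result over its own rows, and a coordinator combines the partials. This is a toy model in which threads stand in for separate servers; it does not use or represent Greenplum's actual interfaces, and the row data is hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, segment_count):
    """Assign each (array_id, iops) row to a segment by hashing its key."""
    segments = [[] for _ in range(segment_count)]
    for array_id, iops in rows:
        slot = zlib.crc32(array_id.encode("utf-8")) % segment_count
        segments[slot].append((array_id, iops))
    return segments


def segment_scan(segment):
    """Work done locally by one segment: partial sum and row count."""
    return sum(iops for _, iops in segment), len(segment)


def average_iops(rows, segment_count=4):
    """Coordinator: scan all segments in parallel and merge the partials."""
    segments = distribute(rows, segment_count)
    with ThreadPoolExecutor(max_workers=segment_count) as pool:
        partials = list(pool.map(segment_scan, segments))
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count if count else 0.0


# Hypothetical performance rows.
rows = [("array-1", 100), ("array-2", 200), ("array-3", 300)]
mean_iops = average_iops(rows, segment_count=2)
```

Because each segment touches only its own rows, adding segments spreads the scan work without any segment contending for another's storage.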

Conclusion

Simplicity, quality and value

ProSphere strives to deliver on the wider goals of simplicity, quality, and value. By selecting technologies such as CEP and REST, ProSphere is able to provide a scalable, robust architecture without the overhead of a complex technology stack that sometimes comes with enterprise scale applications. With simplicity comes an increased opportunity to deliver quality on a predictable and repeatable basis. Software that is “as simple as possible, but not simpler” is easier to test, change, and extend.

The technologies used in ProSphere were selected to deliver a product that will provide clear and immediate value to customers. From simplified deployment, to agentless discovery, to a unified UI, ProSphere will reduce the cost of ownership and provide the information that storage administrators need, while enabling IT organizations to prepare for the challenges of an industry undergoing immense change.


References

Adobe Flex - http://www.adobe.com/products/flex/ 

ATOM - http://tools.ietf.org/html/rfc4287 

bigdata - http://www.systap.com/bigdata.htm 

Complex Event Processing - http://en.wikipedia.org/wiki/Complex_Event_Processing  

CSV - http://en.wikipedia.org/wiki/Comma-separated_values 

Event-driven architecture - http://martinfowler.com/eaaDev/EventNarrative.html

FOAF - http://xmlns.com/foaf/spec/

Google gdata - http://code.google.com/apis/gdata/ 

Greenplum - http://greenplum.com/ 

JSON - http://www.json.org/

OData - http://www.odata.org/

RDF - http://www.w3.org/RDF/

Resource Oriented Architecture - http://www.infoq.com/articles/roa-rest-of-rest 

REST - http://www.infoq.com/articles/rest-introduction 

Semantic Web - http://www.w3.org/standards/semanticweb/ 

SSH - http://en.wikipedia.org/wiki/Secure_Shell 

SMI-S - http://www.snia.org/tech_activities/standards/curr_standards/smi 

SNMP - http://en.wikipedia.org/wiki/Simple_Network_Management_Protocol 

SOA - http://www.soaglossary.com/service_oriented_architecture.php 

URI, URL - http://www.w3.org/TR/uri-clarification/ 

VMware Glossary - http://www.vmware.com/pdf/master_glossary.pdf  

VMware vSphere - http://www.vmware.com/products/vsphere/overview.html 

VMware vCenter Update Manager - http://www.vmware.com/products/update-manager/overview.html 

WMI - http://msdn.microsoft.com/en-us/library/aa394582(v=vs.85).aspx 

WS-Management - http://www.dmtf.org/standards/wsman