ASKING THE HARD QUESTIONS: REPORTING IN VIPR SRM
Daniel Stafford, Advisory Systems [email protected] (Words and figures)
Tiffany [email protected] (Illustrations)
2015 EMC Proven Professional Knowledge Sharing 2
Table of Contents
Introduction: What is a Hard Question?
Automating the Answers
About this Article
Basic Search Skills and the Data Model
Metric Search
Property Search
Notes on the ViPR SRM Data Model
Why is Metric and Property Searching a Foundational Skill?
Building a Table with a Simple Expansion
More on Simple Expansions
Adding Related Disks
Adding Physical Hosts and Disk Capacity
The Basics of Time Management
Recipes for Success: Common Time Management Configurations
Adding Disk Capacity
Recipes for Success: Common Complex Expansions
Configuring the Expansion for Physical Hosts
Configuring the Expansion for Virtual Machines
Data Enrichment
Registering a Collector
Configuring a Tag Set
Saving a Tag Set
Checking for Updates
Using Data Enrichment for Application Chargeback
Alerting on Reports
Building the Report
Scheduling the Report
Configuring the Alert
Notes on the Report Data Adapter
The Alerting Definition
Automating the Policy Change
Testing Port Deregistration
Conclusion
Disclaimer: The views, processes or methodologies published in this article are those of the
author. They do not necessarily reflect EMC Corporation’s views, processes or methodologies.
Introduction: What is a Hard Question?
Most enterprise information technology products have some sort of built-in reporting capability.
In storage arrays this might be something like Unisphere® for VMAX® or Isilon® InsightIQ.
VMware vCenter natively shows statistics about a VMware environment. Oracle has Automatic
Workload Repository (AWR) and Oracle Enterprise Manager (OEM).
The common thread among these tools is that they were written with the intent of reporting on a
specific product or system. This initial architectural decision introduces some inherent limitations
into their capabilities. For instance, it may be difficult to ask questions that involve long time
periods. It may be difficult to scale to report on a large number of the target systems in a unified
way. Most importantly, it may not be possible to build a query or report that the developer did
not envision.
This means traditional tools are most often used to answer easy questions. Answering hard
questions with only these tools available often means doing painstaking analysis by hand.
Let’s consider a few examples of this:
Each example below lists an Easy Question, the corresponding Hard Question, and the Value of the
Hard Answer.

Example 1
Easy Question: Draw a graph of the write throughput to LUN 70 on VMAX 1581 over the past day
Hard Question: Draw a graph of the total write throughput to all LUNs associated with critical
apps across VMAXs, XIVs, and NetApp Filers in the Eastside Data Center over the past month
Value of the Hard Answer: Allows network team to right-size the WAN circuit used for replication

Example 2
Easy Question: Which processor had the highest average utilization last week?
Hard Question: Rank the utilization of all processors based on a combination of average,
maximum, and 95th percentile utilization
Value of the Hard Answer: Enable orchestration engine to provision new load based on
performance, leading to higher utilization and less spend

Example 3
Easy Question: Which ports on the SAN fabric are down?
Hard Question: Which ports on the SAN fabric are up but haven’t passed a packet in the past
month?
Value of the Hard Answer: Reclaim SAN ports to avoid new purchase

Example 4
Easy Question: What is the oversubscription ratio on this thin pool?
Hard Question: What is the average filesystem utilization of every host attached to this thick
array?
Value of the Hard Answer: Accurately design a thin storage solution to save money at refresh time

Example 5
Easy Question: Send an alert when a pool reaches 80% full
Hard Question: Trigger a provisioning stop for an array when a combined set of pools reaches
80% or disk utilization is consistently over 70%
Value of the Hard Answer: Provide safeguards to improve application availability

Example 6
Easy Question: How many disks are attached to this host?
Hard Question: Which of those disks are local and which are on arrays? What arrays are they
located on, what are their IDs, and what SAN ports are used? Are there other hosts using the
same array?
Value of the Hard Answer: Perform migrations with less labor and lower risk of error
The common thread among the hard questions is that they are the things actually being asked
by senior resources. This is because their answers have direct, hard-dollar impact on budgets.
Often the only reason for asking the easy questions is as research in service of the hard
questions. This is part of the labor-intensive analysis necessary to answer hard questions.
This investment of labor also means that the number of hard questions that can be answered is
inherently limited: There are always more questions than answers. The ability to get directly to
the hard answers without the labor investment has the potential to change the way an IT
organization operates in fundamental ways.
Automating the Answers
EMC’s ViPR SRM has become a very popular monitoring and reporting package for good
reason. It can collect data from a diverse set of IT infrastructure products, produce thousands of
useful out-of-the-box reports, and scale to meet the needs of the world’s largest data centers.
Many of these out-of-the-box reports are even the sort that would qualify as ‘hard questions’
under the criteria we’ve laid out above.
This is the proposition that leads many enterprises to deploy ViPR SRM. What many
organizations quickly discover is that many of their hard questions can’t be answered by those
thousands of built-in reports. At this point, customization is needed.
Custom reporting in any tool can be intimidating. In this article we will attempt to overcome that
intimidation. Starting from the perspective of a new administrator, we build the foundational
knowledge necessary to create custom reports which address specific business problems. In
the process, a number of recipes are developed which can be used as-is to immediately get
interesting and valuable results not included in an off-the-shelf install.
About this Article
This article is written as a how-to guide in which previously learned skills build on each other. It
can be read by itself, but for maximum benefit it is intended as a sort of lab guide. By following
along in your own ViPR SRM environment as you
read, you will find yourself retaining more and going
beyond the text to solve problems specific to your
own needs.
If you don’t have a ViPR SRM environment in which to try this out, reach out to your EMC or
Partner Systems Engineer. They can provide you with links to online virtual environments that
are meant for learning.
Throughout this article Online Checkpoints are often referenced. These can be found at
https://community.emc.com/people/DannoOfNashville/blog/2015/05/03/knowledge-sharing-2015-vipr-srm-reports.
If you run into trouble following the configuration steps, or simply want to jump right to a usable
report, visit this site and download the appropriate checkpoint. This site will also include any
errata discovered after publication.
Explore: Boxes like this suggest
experiments for you to try on your
own
Explore: Any property can be
used in this search expansion.
Try adding source to the front, or
vstatus anywhere in the list.
Basic Search Skills and the Data Model
Metric Search
Before discussing any of the theory
behind ViPR SRM’s data model, the
most instructive thing can be to first
spend some time exploring it. Let’s
start with a search using the
Advanced Search dialog.
Leave the filter as-is, set to Everything
(*). For now, just enter the following in
the Expansion box:
devtype device parttype part name
The meaning of this will become clear as you click through the search results. The results start
with:
A list of Device Types (devtype) which…
Leads to a list of Devices of that type (device)…
Then to Component Types associated with that
device (parttype)…
On to Components of that type (part), and finally to…
Time series Metrics which are associated with that component (name)
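One way to picture what this expansion does is as a nested group-by over flat records. The Python sketch below is only a conceptual illustration, not ViPR SRM code: the sample records and the expand helper are hypothetical, and only the five property names come from the text above.

```python
from collections import defaultdict

# Hypothetical flat metric records. The property names follow the article;
# the values are invented for illustration.
METRICS = [
    {"devtype": "Array", "device": "VMAX1581", "parttype": "LUN", "part": "0070", "name": "WriteRequests"},
    {"devtype": "Array", "device": "VMAX1581", "parttype": "LUN", "part": "0070", "name": "Capacity"},
    {"devtype": "Array", "device": "VMAX1581", "parttype": "LUN", "part": "0071", "name": "Capacity"},
    {"devtype": "Host", "device": "host01", "parttype": "Disk", "part": "sda", "name": "Capacity"},
]


def expand(records, expansion):
    """Group flat records into nested dicts, one level per expansion property."""
    if not expansion:
        return records  # leaf level: the matching metric records
    prop, rest = expansion[0], expansion[1:]
    groups = defaultdict(list)
    for record in records:
        groups[record[prop]].append(record)
    return {value: expand(group, rest) for value, group in groups.items()}


# The same property order that was typed into the Expansion box.
tree = expand(METRICS, "devtype device parttype part name".split())
print(sorted(tree))                                   # device types
print(sorted(tree["Array"]["VMAX1581"]["LUN"]))       # LUNs on one array
```

Changing the order of the property list changes the shape of the tree while the underlying records stay the same, which is the behavior the search results demonstrate.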
Property Search
The components (devtype, device, parttype, part, name)
which make up the expansion are known as properties.
These particular five properties are special because every
metric (name) in the database has them. However, there
are typically many more properties associated with most
collected metrics. Let’s look at some of those now.
To do this we’ll use the Management of
Database Metrics interface, located in
‘Administration’. (Note: Non-Admin users
should browse to ‘Modules’ instead of
‘Administration’)
Because this interface returns raw
metrics, we’ll add a Filter to ensure the
search is fast and we don’t get too many
results. Filters can be created graphically
by clicking on the box which starts with
Everything. Try refining the filter down to a particular device type, device, and component type.
When you run this query, a matching list of metrics will appear. Clicking on any of these will
display a list of properties attached to the metric.
Note that any of these properties can be used as a Filter
term. They can be accessed with an easy autocomplete
interface by choosing ‘Using a Wizard’. Advanced users
may choose to simply ‘Edit Expression’ and enter the filter
directly. The syntax of these filters is very similar to SQL or
CQL.
Notes on the ViPR SRM Data Model
It is important to note here that the metrics are
stored in an entirely flat, unstructured data
model. When we search in the original example,
the only reason the results appear in a tree-like
structure is because we have requested it with
the expansion (devtype device parttype part name). This is a good starting expansion because
these properties are common to almost every metric. By combining different filters and
expansion patterns we can impose almost any structure that makes sense for a given situation.
Why is Metric and Property Searching a Foundational Skill?
As you read through the rest of this article, references to Metric and Property names may seem
to appear out of nowhere. For example, in the first report (Online Checkpoint #1), we use the
property deviceid, which contains the UUID of a VMware Virtual Machine. In the second (Online
Checkpoint #2), we use the metric ‘Capacity’ to sum up the SAN capacity consumed by each
Virtual Machine.
When these appear, refer back to this Basic Search Skills section. You will always be able to
find them in one of two ways:
Grouping relationships can be found by using the Advanced Search. For example:
o A devtype ‘Array’ will include one or more ‘LUN’ parttypes, and each associated
part will have a metric with name ‘WriteRequests’.
Metadata (properties) can be found using Management of Database Metrics. For
example:
o Every metric associated with a devtype ‘VirtualMachine’ stores the associated
UUID in property deviceid.
o VMware Datastores have their unique World Wide Names (WWNs) stored in
property partsn. Property partsn is also used for the WWNs of Array LUNs,
making partsn a unique key which connects these different data sets.
Explore: Do a search with a Filter/
Expansion set that lets you drill down
to view LUNs by Pool (hint:
poolname). Try the same thing with
Virtual Machines by vSphere Cluster
(hint: cluster)
Building a Table with a Simple Expansion
To handle Jordan’s request, we’ll need to use
Edit Mode. Once in Edit Mode, we’ll want to
add our custom report in the section called
‘My Reports’. As you might guess from the
name, My Reports is a private sandbox. The
reports built here are visible only to you. It’s a
place where experimentation (and mistakes)
are encouraged.
To add a new node, click the ‘New Node’
button. The other buttons along the bar shown are respectively Cut, Copy, Paste, Paste as Link,
and Remove nodes.
Once the node is added, let’s do the following to
populate it.
1. In the Filtering and Expansion tab
a. Give the report a friendly name
b. Set a filter for devices of type
VirtualMachine and source is the
VMware Collector
2. In the Report Configuration tab
a. Change report type to Standard Table
3. In the Report Details tab
a. Add a property column called ‘Virtual Machine’ with the property ‘device’
Once this is done, let’s go back to Browse Mode and take a look at our handiwork.
As you can see from Figure 8, something is amiss. My lab
environment with less than 200 Virtual Machines (VMs)
generated nearly 15,000 lines. What went wrong?
As we learned while using Search, the data in ViPR SRM is
inherently unstructured. This means that unless we impose a
structure, the engine will simply assume we wish to show
one Time-Series metric per table line.
The way we impose this structure is with an expansion. An expansion groups metrics based on
common properties. For example, an expansion on deviceid (which is the VM UUID) groups all
of the metrics with a common deviceid property together on the same row.
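The difference between the unexpanded and expanded table can be sketched in a few lines of Python. This is a conceptual model only: the records, the table_rows helper, and the metric names are hypothetical, while deviceid and device are the properties named in the text.

```python
from collections import defaultdict

# Hypothetical records: three time-series metrics for one VM, two for another.
METRICS = [
    {"deviceid": "uuid-a", "device": "vm-web01", "name": "CPUUtilization"},
    {"deviceid": "uuid-a", "device": "vm-web01", "name": "MemoryUsage"},
    {"deviceid": "uuid-a", "device": "vm-web01", "name": "Capacity"},
    {"deviceid": "uuid-b", "device": "vm-db01", "name": "CPUUtilization"},
    {"deviceid": "uuid-b", "device": "vm-db01", "name": "Capacity"},
]


def table_rows(records, expansion=None):
    """Return one row per metric, or one row per distinct expansion value."""
    if expansion is None:
        # No structure imposed: every time-series metric becomes a table line.
        return [{"Virtual Machine": r["device"]} for r in records]
    groups = defaultdict(list)
    for r in records:
        groups[r[expansion]].append(r)
    # Every metric in a group shares the property, so any member can supply it.
    return [{"Virtual Machine": g[0]["device"]} for g in groups.values()]


unexpanded = table_rows(METRICS)
expanded = table_rows(METRICS, expansion="deviceid")
print(len(unexpanded), "rows without an expansion")
print(len(expanded), "rows with an expansion on deviceid")
```

With these five sample metrics the unexpanded table has five lines and the expanded table has two, which mirrors on a small scale the thousands of lines seen in the lab report.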
To fix the report:
4. Add a Child Node to the report
5. On the Child Node in Filtering and
Expansion
a. Add deviceid as an Expansion
Property
Now when we view the report, it will have just one
Virtual Machine per line.
Since Jordan also wanted to know what particular vSphere host each VM is running on, we’ll
need to know the name of the property with that data. A quick search of Management of
Database Metrics reveals that hypervsr is correct here. Adding this as an additional Property
column just as we did for device will produce the desired table.
More on Simple Expansions
Consider a sample dataset based on the filter devtype==’Array’ & parttype=’LUN’ for a moment.
Suppose that each LUN just has time-series Metrics (name) entries for IOPS and Capacity.
What happens if different expansions are applied?
Of these, only the device expansion would be considered conventionally useful. It is more
common with large datasets to see expansions on combinations of properties, as in Figure 13.
It is also very common to see multiple levels of expansion used for drilling down. When the user
clicks on a table row, the next level down is filtered to just the data associated with the row
(Figure 14).
Explore: Try this yourself. Copy/paste a
report and try changing the expansion. Try to
build a simple drill-down.
Adding Related Disks
When building out report nodes, filters are
additive. A common practice is to start at a top
node with a very wide filter, such as all data
associated with the VMware Collector
(source==’VMWare-Collector’). As the report
drills into details of the data the filter gradually
becomes more restrictive.
Expansions are part of this process. Besides
grouping common metrics, they also introduce a filter based on those common metrics. This can
be seen visually by comparing the report tree in Edit Mode to the report tree in Browse Mode. A
node with a simple expansion may become many nodes, each one a unique data set within that
expansion.
This means by adding a child node underneath the node with the device expansion, we can
filter for components particular to a given device. We will add a child node with a filter for
parttype==’Disk’. The filter scope will change to ‘expansion and selection’ because we will also
add a simple expansion on part to this node.
Exploring the tree in Browse mode will reveal that each VM can now be expanded into a list of
Disks associated with that VM.
Since the VMware Administrator wants to see a
count of Disks, we’ll need to add a Formula
using the Formula tab. In this case we’ll use the
ChildCount formula, which returns a count of
child nodes.
When you create a Formula, the formula result has a scope. This is a fancy way of saying the
result is only visible from certain places. Formula results in ViPR SRM always have a scope
which includes the node on which they were created, plus the parent of that node.
In this case, we’ve created the Formula on the ‘device’ node (because we wish to count its
children), and we want to display it on the ‘VM List with Disk Count’ node, both of which are in
the scope by default. However, this will be an important fact to remember later when we wish to
display a formula result outside its default scope.
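The scope rule can be modeled with a small tree. The sketch below is an assumption-laden illustration rather than the product's implementation: the Node class and its methods are hypothetical, and only the rule itself (a result is visible on its own node and on that node's parent) and the node names come from the text.

```python
class Node:
    """Minimal report-tree node used to illustrate formula scope."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.results = {}  # formula results created on this node
        if parent is not None:
            parent.children.append(self)

    def child_count(self, result_name):
        """Create a ChildCount-style result: the number of child nodes."""
        self.results[result_name] = len(self.children)

    def can_see(self, owner, result_name):
        """A result is in scope on its own node and on that node's parent."""
        in_scope = self is owner or self is owner.parent
        return result_name in owner.results and in_scope


top = Node("My Reports")
report = Node("VM List with Disk Count", parent=top)
vm = Node("device", parent=report)
Node("Hard disk 1", parent=vm)
Node("Hard disk 2", parent=vm)

vm.child_count("Disk Count")
print(vm.results["Disk Count"])              # number of disks under the VM
print(report.can_see(vm, "Disk Count"))      # parent node: in scope
print(top.can_see(vm, "Disk Count"))         # grandparent node: out of scope
```

A node two levels above the formula cannot see the result directly, which is why a pass-through formula is needed when a value has to travel further up the tree.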
Explore: Take a moment to look at the
available formulas.
Once this is done, a Disk Count column will show the number of disks associated with the VM.
Clicking on a VM will show a list of those disks by name.
Adding Physical Hosts and Disk Capacity
There are a number of ways to add Physical Hosts to this table alongside the Virtual Machines.
The simplest would be to expand the filter to include devtype==’Host’. This will work because
the tree structure we’ve used in the report so far is the same: both VMs and Physical Hosts
have associated Disk parttypes.
A slightly more complex way starts the same: we expand the top-level filter to encompass
Physical Hosts. However, we would then restrict the filter on the existing child node (the device
and part nodes above) to apply only to devtype==’VirtualMachine’ and build a second child node
to apply to only devtype==’Host’.
In this example, we will choose the second option. This method can be useful for a number of
reasons:
It allows data sets with different structures to be displayed in a normalized way
o For example, performance statistics on VMDKs appear on VirtualDisk parts,
whereas those same statistics for a Raw Device Mapping (RDM) appear on Disk
parts. Using different nodes for each allows us to pass up these metrics in a
common way.
o This is also important if different parts of the dataset require different expansions
– In this case the VMs need to be expanded on their unique deviceid (VMware
UUID), but the Hosts should be expanded on device, the hostname.
It allows different data to be displayed when the user clicks to drill down.
o For example, drilling down to a VM might display reports about the associated
HyperVisor and Datastore, whereas drilling down to a Physical Host would not
need these reports
It allows us a finer degree of control over how certain items are displayed.
In the screenshot above we’ve added a column which shows the property devtype, which
displays ‘VirtualMachine’ for VMs and ‘Host’ for physical hosts.
Suppose instead we wish to use the labels ‘VM Guest’ and ‘Bare Metal’. To do this, we will use
a new type of formula called a Nop along with the Value-to-String formatter. This sort of
customization can help make reports much more readable.
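Conceptually, the Value-to-String formatter is a lookup from a raw property value to a display label. A minimal Python sketch of that idea follows; the LABELS table and the display_devtype function are hypothetical names, and the fallback to the raw value is an assumption rather than documented product behavior.

```python
# Mapping from raw devtype values to the friendlier labels used in the report.
LABELS = {"VirtualMachine": "VM Guest", "Host": "Bare Metal"}


def display_devtype(devtype):
    """Return the friendly label, falling back to the raw property value."""
    return LABELS.get(devtype, devtype)


for raw in ("VirtualMachine", "Host", "Array"):
    print(raw, "->", display_devtype(raw))
```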
Adding the capacity of all of the disks will involve another new formula, the Spatial Sum. To do
this we need to think again about scope. The capacity should be summed on each VM or Host.
However, the Capacity metric resides a level below this in the tree. This means a Nop will be
required. Start by adding a Nop formula to the lowest child nodes. This Nop will filter for
name==’Capacity’.
When you add the sum formula (math.Spatial.Sum) to the Host and VM nodes, take a moment
to look at the formula configuration. It follows a pattern common among many formula types.
First, the formula requests an input
parameter. This could be a metric
selected from the filtered/expanded data
available (‘Filter on this node’), a Formula
Result, a Property Value, or a set of
Combined Parameters (which can
combine any of the above). For this
example we will choose the result of the Nop formula on the child.
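The interplay of the two formulas can be sketched as follows. This is a conceptual model, not the engine's code: the nop and spatial_sum functions, the record layout, and the capacity values are all hypothetical. The Nop selects the Capacity metric on each child and passes it up unchanged; the sum on the parent then adds those values across children.

```python
def nop(metrics, name):
    """Nop-style pass-through: select one metric by name, unchanged."""
    return [m["value"] for m in metrics if m["name"] == name]


def spatial_sum(children, name):
    """Sum one metric across all child nodes (across space, not time)."""
    return sum(value for child in children for value in nop(child["metrics"], name))


# Hypothetical disks under one host, with capacities in GB.
HOST_DISKS = [
    {"part": "sda", "metrics": [{"name": "Capacity", "value": 100.0},
                                {"name": "IOPS", "value": 42.0}]},
    {"part": "sdb", "metrics": [{"name": "Capacity", "value": 250.0}]},
]

total_capacity = spatial_sum(HOST_DISKS, "Capacity")
print(total_capacity)
```

Metrics other than Capacity on the child nodes are ignored because the Nop filters on the metric name before anything is summed.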
The second configuration group on the sum formula is ‘Settings’. In this case, they will be left
default. However, on many other formulas these settings are central to the desired operation.
The third configuration group consists of the output settings. The only thing we will set in this
case is a name. The other two options are used in other circumstances. ‘Show in Graphs’ allows
the result to be shown in a simple chart on the same node. ‘Default result’ allows that formula
result to be displayed as part of a stack chart or TopN report.
Once the Nop and Spatial Sum formulas are added, we are ready to display the result in the
table. This will be a Value and will use the Formula Result from the Sum. With this column we
will need to do something new: Modify the column’s Time Management settings.
The Basics of Time Management
The Time Management settings on a column are there to solve a problem which exists on every
ViPR SRM table:
There are many different ways that one might wish to summarize a set of time-series data:
Display the average over the period
Display the maximum or minimum over the period
Display the sum of all values in the period (numerical integration)
Display the last value in the period
Some of these operations might be computationally intensive. Imagine asking ViPR SRM to
display the average IOPS on a LUN over the last six months. If it used real-time data collected
every five minutes, this would involve retrieving and averaging over 50,000 values for every
LUN displayed.
To simplify this, the ViPR SRM engine continuously calculates values such as the average,
minimum, maximum, and sum for various periods (each hour, day, and week). Rather than
average all of the real time values, we would average the rolled-up one-week averages. This
produces the same answer with far less computation – 26 values versus 52,560, or about 2000
times more efficient.
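The arithmetic can be checked with a short Python sketch. The series below is synthetic and the variable names are hypothetical. The sketch uses exactly 26 whole weeks (52,416 five-minute samples) rather than the 52,560 quoted above for half a year, and it relies on every week holding the same number of samples; with unevenly sized buckets an average of averages would only approximate the true average.

```python
from statistics import fmean

SAMPLES_PER_WEEK = 7 * 24 * 12   # one sample every five minutes
WEEKS = 26                       # roughly six months

# Synthetic IOPS-like series; the values are arbitrary.
raw = [float((i * 37) % 1000) for i in range(WEEKS * SAMPLES_PER_WEEK)]

# Weekly rollups, standing in for the aggregates the engine maintains.
weekly_avg = [
    fmean(raw[w * SAMPLES_PER_WEEK:(w + 1) * SAMPLES_PER_WEEK])
    for w in range(WEEKS)
]

from_raw = fmean(raw)              # averages every raw value
from_rollups = fmean(weekly_avg)   # averages only 26 values
reduction = len(raw) / len(weekly_avg)

print(len(raw), len(weekly_avg), reduction)
print(abs(from_raw - from_rollups) < 1e-6)
```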
The Time Management settings which can be configured to produce these results are:
Sampling Period – Should the calculation use real time data or one of the statistical
aggregates (one hour, one day, one week)?
Sampling Type – Which sort of statistical aggregate (average, min, max, sum, last, or
count) should be used?
Column Time Range(s) – ‘Inherit from report’ will use the time range from the ‘Report
Configuration’ tab. Optionally, a particular period may be chosen, or multiple columns
can be generated for multiple time ranges.
Recover… – Should all values in the time range be used for calculation, or should we
simply display the last one?
Temporal Aggregation – If ‘Recover…’ was set to All Values, this specifies whether the
set of values should be averaged, summed, or if the min, max, or count should be
displayed.
Time Threshold – If ‘Recover…’ was set to Last, this specifies how far to go back
looking for a last value.
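How the last three settings interact can be sketched as a single reduction function. Everything in this Python example is a hypothetical model: the summarize function, its parameter names, and the sample points are invented, and returning None for a stale last value is an assumption about how an empty cell might be represented.

```python
from datetime import datetime, timedelta

AGGREGATIONS = {
    "average": lambda values: sum(values) / len(values),
    "sum": sum,
    "min": min,
    "max": max,
    "count": len,
}


def summarize(points, recover="all", aggregation="average", threshold=None, now=None):
    """Reduce (timestamp, value) points to the single value shown in a table cell.

    recover="all" applies the temporal aggregation to every value in the range.
    recover="last" returns the newest value unless it is older than threshold.
    """
    if not points:
        return None
    if recover == "last":
        timestamp, value = max(points, key=lambda p: p[0])
        reference = now if now is not None else datetime.now()
        if threshold is not None and reference - timestamp > threshold:
            return None  # newest value is too old to display
        return value
    values = [v for _, v in points]
    return AGGREGATIONS[aggregation](values)


NOW = datetime(2015, 5, 3, 12, 0)
POINTS = [(NOW - timedelta(hours=h), v) for h, v in [(3, 10.0), (2, 30.0), (1, 20.0)]]

print(summarize(POINTS))                                   # average of all values
print(summarize(POINTS, aggregation="max"))                # peak over the period
print(summarize(POINTS, recover="last", now=NOW))          # newest value
print(summarize(POINTS, recover="last", now=NOW,
                threshold=timedelta(minutes=30)))          # stale: nothing shown
```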
Recipes for Success: Common Time Management Configurations
Recipe #1
Useful for: Showing the last value of a
metric, such as with LUN or Pool
Capacities. Also good for metrics which
don’t change much, such as CPU counts.
Recipe #2
Useful for: Showing the average of a
value over a period with minimal Front-end
load. Common when displaying average
IOPS, CPU utilization, or Memory usage.
Recipe #3
Useful for: Showing the peak value over a
period.
Recipe #4
Useful for: Estimating total change over
time. For example, this might be used to
estimate the required size of a
RecoverPoint journal or the space required
for snapshot deltas.
Adding Disk Capacity
To review, the steps to add a Disk Capacity column are:
This will provide an accurate value for each Host and VM. However, as configured it is still
missing some possible points for style. Since all of the values are in gigabytes (GB), some of
the very high and very low values can be difficult to read.
To improve readability, we will use the Scaling feature in the column’s Value Settings. Scaling
can be simple multiplication or division. It can also be unit-aware. For this case, unit auto-
scaling will work perfectly.
With these settings, Capacity will be
scaled to the most appropriate unit
(Figure 24).
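The idea behind unit auto-scaling can be sketched as a loop that moves a value up or down a list of units until it is easy to read. This Python example is an illustration only: the auto_scale function is hypothetical, and the 1024 multiplier and two-decimal formatting are assumptions, since the text does not say which base or precision the product uses.

```python
UNITS = ["KB", "MB", "GB", "TB", "PB"]


def auto_scale(value_gb, base=1024):
    """Format a capacity given in GB using the most readable unit."""
    value = float(value_gb)
    index = UNITS.index("GB")
    # Very large values move up to bigger units.
    while value >= base and index < len(UNITS) - 1:
        value /= base
        index += 1
    # Very small (but non-zero) values move down to smaller units.
    while 0 < value < 1 and index > 0:
        value *= base
        index -= 1
    return f"{value:.2f} {UNITS[index]}"


for capacity in (0.5, 250, 3072, 5 * 1024 * 1024):
    print(capacity, "GB ->", auto_scale(capacity))
```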
Adding LUN and Array Information
Using the skills we’ve already developed, it’s now fairly simple to turn the boring drill-down list
into a table of disk names.
All of the information about the associated LUNs is locked up in a completely different dataset.
Fortunately, one of ViPR SRM’s strengths is making connections between different datasets. To
do this, we’ll need to learn to use a new tool: Complex Expansions.
Complex expansions
extend the metaphor we
explored previously with
simple expansions:
In a complex expansion, a new filter is created to find
the new dataset. The complex expansion itself:
Splits the data based on common properties
(just like a simple expansion)
Connects the split data to parent nodes with matching source data
In the report we’ve built so far, this existing source data (on the lowest child node associated
with a physical host) would be for a particular Disk. Looking in Management of Database
Metrics, we can see this Disk has a property called partsn which is a LUN World Wide Name
(WWN). A complex expansion can use this to find the parttype LUN with the same WWN.
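For readers who think in terms of database joins, the Disk-to-LUN connection can be sketched in Python. This is a conceptual model and not ViPR SRM code: the complex_expansion function, the records, and the WWN strings are invented, while partsn, parttype, and the Disk/LUN relationship come from the text.

```python
from collections import defaultdict

# Hypothetical source data: disks on a physical host. The local disk has no WWN.
DISKS = [
    {"devtype": "Host", "device": "host01", "parttype": "Disk", "part": "sdb", "partsn": "wwn-0070"},
    {"devtype": "Host", "device": "host01", "parttype": "Disk", "part": "sda", "partsn": None},
]

# Hypothetical target data: LUNs found by a fresh filter on the array dataset.
LUNS = [
    {"devtype": "Array", "device": "VMAX1581", "parttype": "LUN", "part": "0070", "partsn": "wwn-0070"},
    {"devtype": "Array", "device": "VMAX1581", "parttype": "LUN", "part": "0071", "partsn": "wwn-0071"},
]


def complex_expansion(parents, targets, source_prop, target_prop):
    """Attach to each parent record the targets whose property value matches."""
    index = defaultdict(list)
    for target in targets:
        if target.get(target_prop):
            index[target[target_prop]].append(target)
    return [(parent, index.get(parent.get(source_prop), [])) for parent in parents]


joined = complex_expansion(DISKS, LUNS, source_prop="partsn", target_prop="partsn")
for disk, luns in joined:
    print(disk["part"], "->", [(lun["device"], lun["part"]) for lun in luns])
```

The local disk ends up with no matching LUN, which is how a report can distinguish local disks from array-backed ones.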
Let’s take a moment to review the steps to build a complex expansion that joins datasets based
on common properties:
In this step we choose a template for the complex expansion. This
limits the future steps to a smaller set of features to simplify the
configuration process.
In this case we are primarily considering ‘Join properties having a
different name’. To open up all of the options, choose ‘Manually
configure the complex expansion’.
Here we choose the source property on the existing data set (which
resides on the parent node) which we want to use to connect to some
other data set.
Here we select a target property which resides on the new data set
we wish to find.
Another common configuration item is Level Up, which removes
previous filter constraints. The most common selection is to level up to
Maximum, which allows you to create a fresh filter on this node.
This step captures any other Complex Expansion modifiers, such as
wildcard or regex matching, or splitting a property on a separator.
Recipes for Success: Common Complex Expansions
This table describes a number of commonly encountered expansions. It is important to note that
source and target are arbitrary. Any of the expansions below can work in both directions.
Note as well that the filters have been simplified. In writing real-world reports it is a best practice
to specify source and vstatus properties. The particular circumstances of your report may also
lead you to add additional restrictions.
Some of the expansions below join on multiple properties. This is to ensure that a unique match
is made. Just as every switch has a Port 1, nearly every array will have a LUN 0.
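Each entry below lists the Connection, the Source Data Filter, the Target Data Filter, and the
Expansion Configuration.

Connection: Host Disk to Array LUN
Source Data Filter: devtype==’Host’ & parttype==’Disk’
Target Data Filter: devtype==’Array’ & parttype==’LUN’
Expansion Configuration: Join partsn to partsn

Connection: VM RDM to Array LUN
Source Data Filter: devtype==’VirtualMachine’ & parttype==’Disk’ & rdmname & partsn
Target Data Filter: devtype==’Array’ & parttype==’LUN’
Expansion Configuration: Join partsn to partsn

Connection: VM Virtual Disk to VMDK File
Source Data Filter: devtype==’VirtualMachine’ & parttype==’Disk’ & part=’HARD DISK%’
Target Data Filter: devtype==’VirtualMachine’ & parttype==’File’
Expansion Configuration: Join partdesc to part

Connection: VMDK File to Datastore
Source Data Filter: devtype==’VirtualMachine’ & parttype==’File’
Target Data Filter: devtype==’Datastore’
Expansion Configuration: Join linkedto to device

Connection: Datastore to Array LUN
Source Data Filter: devtype==’Datastore’
Target Data Filter: devtype==’Array’ & parttype==’LUN’
Expansion Configuration: Join partsn to partsn

Connection: VMAX Storage Group to LUN, Step #1
Source Data Filter: parttype==’Storage Group’
Target Data Filter: parttype==’StorageGroupToLUN’
Expansion Configuration: Join part to sgname; Join device to device

Connection: VMAX Storage Group to LUN, Step #2
Source Data Filter: parttype==’StorageGroupToLUN’
Target Data Filter: parttype==’LUN’
Expansion Configuration: Join lunname to part; Join device to device

Connection: Host HBA to Switch Port
Source Data Filter: (devtype==’Host’ | devtype==’Hypervisor’) & parttype==’Port’
Target Data Filter: devtype==’FabricSwitch’ & iftype==’fibreChannel’
Expansion Configuration: Join partsn to portwwn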
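The reason some expansions join on more than one property can be shown with a small Python sketch. The join function and the sample records are hypothetical. The property names lunname, part, and device are taken from the Storage Group to LUN Step #2 entry, and the scenario is the one described above, where two arrays each have a LUN with the same name.

```python
def join(sources, targets, pairs):
    """Match each source to the targets that agree on every (source, target) property pair."""
    return [
        (s, [t for t in targets if all(s[sp] == t[tp] for sp, tp in pairs)])
        for s in sources
    ]


# Hypothetical StorageGroupToLUN record on one array.
MAPPINGS = [{"device": "VMAX1581", "sgname": "sg_app1", "lunname": "0000"}]

# Two different arrays that both happen to have a LUN named 0000.
LUNS = [
    {"device": "VMAX1581", "part": "0000"},
    {"device": "VMAX2222", "part": "0000"},
]

loose = join(MAPPINGS, LUNS, [("lunname", "part")])
strict = join(MAPPINGS, LUNS, [("lunname", "part"), ("device", "device")])

print(len(loose[0][1]), "matches when joining on LUN name alone")
print(len(strict[0][1]), "match when the array name must also agree")
```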
Explore: What other discovered
components might have connections?
Use Management of Database Metrics
to find their common properties.
Configuring the Expansion for Physical Hosts
Executing these steps to connect to an Array LUN looks something like this:
Now the device and part properties from the LUN can be displayed on the top level table (or the
drill-down into the list of Disks). The device is the name of the storage array, whereas part is the
name of the LUN.
Configuring the Expansion for Virtual Machines
This only solves the Physical Host half of the equation. We still need to make the same
connections for Virtual Machines.
For Disks which are RDMs, this connection can be made the same way as with Disks attached
to Physical Hosts. The RDM has an associated WWN stored in the partsn property which can
be linked to an Array LUN using a Complex Expansion.
For Disks which are VMDKs, the process has a number of additional steps, depicted below.
To get Array and LUN data for Virtual Machines, we’ll need to start by splitting the original
simple expansion on part into two pieces. One will have a filter (on Expansion and Selection) to
capture only RDMs. The other will have a filter (also on Expansion and Selection) to capture
only VMDKs.
On the RDM node, we can copy/paste the complex expansion node from the Physical Host tree.
As stated before, this connection will work exactly the same as with a Physical Host. Copying
and pasting the LUN Properties Nop formulas to make this data available is left as an exercise
for the reader.
To complete the connection on the VMDK node, we will add a set of Complex Expansions
based on the chart above: Disk to File, File to Datastore, Datastore to LUN.
Once the Complex Expansions have been added, the device and part properties can be passed
up the tree using Nop formulas, just as in the previous two examples.
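The three-hop chain for VMDKs can be sketched in Python as a series of lookups, each one following a join from the recipe table (partdesc to part, linkedto to device, partsn to partsn). This is only an illustration of the logic: the helper names follow and lun_for_virtual_disk are hypothetical, and the sample property values are invented.

```python
def follow(record, candidates, source_prop, target_prop):
    """Return the first candidate whose target_prop equals record[source_prop]."""
    wanted = record.get(source_prop)
    if wanted is None:
        return None
    for candidate in candidates:
        if candidate.get(target_prop) == wanted:
            return candidate
    return None


def lun_for_virtual_disk(disk, files, datastores, luns):
    """Walk Disk -> File -> Datastore -> LUN, stopping if any hop fails."""
    vmdk = follow(disk, files, "partdesc", "part")
    datastore = follow(vmdk, datastores, "linkedto", "device") if vmdk else None
    return follow(datastore, luns, "partsn", "partsn") if datastore else None


# Hypothetical records; the values are made up for illustration.
disk = {"devtype": "VirtualMachine", "parttype": "Disk",
        "part": "Hard disk 1", "partdesc": "[DS01] vm01/vm01.vmdk"}
files = [{"devtype": "VirtualMachine", "parttype": "File",
          "part": "[DS01] vm01/vm01.vmdk", "linkedto": "DS01"}]
datastores = [{"devtype": "Datastore", "device": "DS01",
               "partsn": "60000970000195700001"}]
luns = [{"devtype": "Array", "parttype": "LUN", "device": "VMAX-0001",
         "part": "00A1", "partsn": "60000970000195700001"}]

lun = lun_for_virtual_disk(disk, files, datastores, luns)
print(lun["device"], lun["part"])
```

The device and part values returned at the end of the chain are the ones a Nop formula would pass back up the tree.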
Extreme Time Management
Earlier we established some simple recipes for Time Management. These can be scaled up to
accomplish things that would be very difficult in a traditional reporting tool. To demonstrate this,
we will build a report which finds SAN ports which are connected but have not passed any traffic
in the past three months.
The link status of a port is available in the property partstat. Depending on whether the switch
environment is Brocade or Cisco, a connected port will have a partstat value of either ‘online’ or
‘up’, respectively.
Finally, to reduce the number of entries in our table (and eliminate the need for computationally
expensive sorting), we will take advantage of the value filtering feature. This allows us to only
display a table row if the resulting value matches a Boolean expression. This can be found in
the Advanced settings for a table value.
These steps are summarized in Figure 34.
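The logic of this report can be sketched outside the tool in a few lines of Python. This is an illustration under stated assumptions rather than the report definition itself: it assumes the time management step reduces three months of throughput samples to a single peak value per port, and the record layout, the samples field, and the function name idle_connected_ports are hypothetical. Only partstat and its 'online' and 'up' values come from the text.

```python
CONNECTED_STATES = {"online", "up"}  # Brocade and Cisco values, respectively


def idle_connected_ports(ports, threshold=0.0):
    """Keep ports that are connected but whose peak traffic never exceeded threshold."""
    rows = []
    for port in ports:
        if port["partstat"] not in CONNECTED_STATES:
            continue  # not connected, so not interesting for this report
        peak = max(port["samples"], default=0.0)
        if peak <= threshold:  # plays the role of the value filter on the table
            rows.append((port["device"], port["part"], peak))
    return rows


# Hypothetical three-month samples for four switch ports.
ports = [
    {"device": "brocade-sw1", "part": "port-3", "partstat": "online",
     "samples": [0.0, 0.0, 0.0]},
    {"device": "brocade-sw1", "part": "port-4", "partstat": "online",
     "samples": [0.0, 12.5, 3.1]},
    {"device": "cisco-sw2", "part": "fc1/7", "partstat": "up",
     "samples": [0.0, 0.0]},
    {"device": "cisco-sw2", "part": "fc1/8", "partstat": "down",
     "samples": [0.0, 0.0]},
]

idle = idle_connected_ports(ports)
for row in idle:
    print(row)
```

Filtering rows on the computed value, rather than sorting the full table, is what keeps the result short.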
Explore: What other under-utilized
components could we detect through
filtering and time management?
Data Enrichment
Opening up any given metric in Management of Database Metrics reveals a wealth of metadata.
The Data Enrichment process allows us to add custom metadata which is meaningful to a
particular business. Some common uses include tagging discovered components with:
Location
Business application
Business purpose
Installation, lifecycle, or maintenance dates
Business or IT contact
Cost data
Service Catalog Assignments (Gold, Silver, Bronze, etc.)
There are two interfaces in Centralized Management which allow custom metadata tagging.
The older, traditional one is Data Enrichment. This interface is the most flexible. In ViPR SRM
3.5, the Groups Management interface was added. Groups Management is intended to allow
simple, wizard-driven tagging for a set of common use cases. The tags it populates are often
referenced in built-in reports. By contrast, almost any tag added in Data Enrichment requires
some reporting customization.
The Groups Management interface is self-explanatory, especially to a user who has a basic
comfort level with the ViPR SRM data model. For this reason, we’ll focus on Data Enrichment,
which can be more powerful (and tuned more finely) in the hands of an experienced user.
Registering a Collector
The first step in using Data Enrichment is registering a collector module. This can be found by
browsing to ‘Data Enrichment’ in the Centralized Management interface. At the top level of the
tree, a ‘Register a new module’ button is available.
A given Collector host may have many Collector modules – one or more for each type of
infrastructure it is collecting data from. To avoid the management overhead of registering all of
these, a best practice is to register the “Load-Balancer :: DataEnrichment” module. This ensures
that any configured enrichment rules will act on all data which passes through the collector host.
Configuring a Tag Set
Once you register a module, drill into that module to configure Tag Sets with the ‘New Tagging’
button.
Each ‘New Tagging’ which is added consists of a list of keys and properties. These keys and
properties follow a basic template:
When adding a new key, there are a few choices to make:
Column order is important because a tagging ruleset can be imported from an Excel worksheet
or a CSV file. Setting this order correctly tells ViPR SRM what to expect when looking through
the file.
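To see why column order matters, consider a small Python sketch that reads a two-column CSV ruleset and applies it. This is not how ViPR SRM parses the file; it is a hypothetical model in which position 1 holds the key (device) and position 2 holds the property to set (appli). The function names, host names, and application names are invented for illustration.

```python
import csv
import io

COLUMN_ORDER = ["device", "appli"]  # position 1 is the key, position 2 the property


def load_rules(csv_text, column_order=COLUMN_ORDER):
    """Turn each CSV row into a dict, naming columns by their position."""
    rows = csv.reader(io.StringIO(csv_text))
    return [dict(zip(column_order, row)) for row in rows if row]


def enrich(record, rules, key="device", prop="appli"):
    """Apply the first rule whose key exactly equals the record's key value."""
    for rule in rules:
        if rule[key] == record.get(key):
            return {**record, prop: rule[prop]}
    return dict(record)


csv_text = "sqlprod01,Billing\nwebprod01,Storefront\n"

rules = load_rules(csv_text)
print(enrich({"device": "sqlprod01"}, rules))

# With the positions swapped, the same file is misread and nothing matches.
swapped = load_rules(csv_text, ["appli", "device"])
print(enrich({"device": "sqlprod01"}, swapped))
```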
The type of match is also very important. For
example, choosing ‘String’ will allow you to
match exact strings of characters. One of the
most flexible options is ‘Regex’, which is short
for ‘Regular Expressions’. Regex is a language
built for pattern matching, common across many operating systems and computer languages.
Here we’ll explore a few common regex recipes.
Explore: The examples below just scratch the surface of what is possible with regular
expressions. Check the Internet for in-depth tutorials.

Regex               Matches
.*FOO.*             Matches when ‘FOO’ is anywhere in the string, such as ‘FOOBAR’ or ‘EATFOOD’.
^FOO.*              Matches when ‘FOO’ is at the beginning of the string. ‘FOOBAR’ would match
                    but ‘EATFOOD’ would not.
^.{3}FOO.*          Any string where the letters ‘FOO’ are characters 4, 5, and 6. ‘EATFOOD’
                    would match but ‘FOOBAR’ would not.
.*[fF][oO][oO].*    Makes the match case-insensitive. Both ‘foobar’ and ‘FOOBAR’ would match.
.*FOO\d.*           Will match when ‘FOO’ is followed by a number. ‘FOO1BAR’ would match, but
                    ‘FOOBAR’ would not.
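These recipes are easy to try out before committing them to a tagging rule. The sketch below uses Python's re module and assumes the pattern must match the whole string (re.fullmatch); whether ViPR SRM applies the same rule is an assumption, so treat this as a way to check the patterns themselves rather than the product's behavior.

```python
import re

RECIPES = [
    r".*FOO.*",
    r"^FOO.*",
    r"^.{3}FOO.*",
    r".*[fF][oO][oO].*",
    r".*FOO\d.*",
]
SAMPLES = ["FOOBAR", "EATFOOD", "foobar", "FOO1BAR"]


def matches(pattern, value):
    """True when the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


for pattern in RECIPES:
    hits = [value for value in SAMPLES if matches(pattern, value)]
    print(f"{pattern:<20} {hits}")
```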
Configuring a property is much simpler. It is only necessary to set a property name and column
position, and optionally a default value. Here are a few tips to keep in mind when choosing a
property name.
Property names can only be up to eight characters long. Any extra characters will get
truncated. This means homeaddress will become homeaddr.
Property names are case sensitive. HomeAddr is different from homeaddr.
Always check for collisions. If your metric already contains a property called lunname,
overwriting it is likely to have unintended consequences, such as reports not working.
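The three tips above can be turned into a quick pre-flight check. The following Python sketch is a hypothetical helper, not part of ViPR SRM: check_property_name and the set of existing property names are invented, and only the eight-character limit, case sensitivity, and collision risk come from the text.

```python
MAX_LENGTH = 8  # property names longer than this are truncated


def stored_name(requested):
    """Return the name as it would be stored after truncation."""
    return requested[:MAX_LENGTH]


def check_property_name(requested, existing):
    """Return a list of warnings for a proposed enrichment property name."""
    warnings = []
    name = stored_name(requested)
    if name != requested:
        warnings.append(f"'{requested}' will be truncated to '{name}'")
    if name in existing:
        warnings.append(f"'{name}' collides with an existing property")
    elif name.lower() in {prop.lower() for prop in existing}:
        warnings.append(f"'{name}' differs from an existing property only by case")
    return warnings


existing_properties = {"device", "part", "lunname", "homeaddr"}
for candidate in ["homeaddress", "HomeAddr", "lunname", "appli"]:
    print(candidate, "->", check_property_name(candidate, existing_properties))
```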
Saving a Tag Set
When you click Save, the resulting dialog will list all registered modules. This is an opportunity
to apply this tagging configuration to a larger part of the environment. This can be a very
convenient feature. Suppose you want to apply the same Data Enrichment rules to every
VMware Collector: just check each one and choose ‘Update’.
These steps are shown in Figure 38.
Checking for Updates
Once you’ve implemented a set of Data Enrichment rules, it will take some time for them to be
applied. Typically, two things must happen before the new properties will be available:
The collector on which the rules are applied must complete a collection cycle
The property store must be updated on the Frontend host
If there is a need to iterate quickly, each of these can be manually initiated.
You can restart the registered collector to ensure a new cycle starts quickly. This can be done in
the GUI from Centralized Management. Find the appropriate collector-manager on the Collector
host(s) and choose to restart the service. It can also be done by SSHing to the Collector host(s)
and using the manage-modules script. This is typically located at /opt/APG/bin/manage-modules.sh.
To update the property store, browse to the Frontend host in Centralized Management and run
the import-properties task. This can also be done in the terminal using the
/opt/APG/bin/manage-tasks.sh script. This task normally runs on a nightly basis, but can be run at
any time.
Either of the shell scripts referenced will print help text describing their syntax when run with no
arguments.
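Since the exact syntax is best read from the scripts themselves, a small Python sketch can collect that help text from both of them. It relies only on the behavior described above (running a script with no arguments prints its help) and on the typical install paths; the function name show_help is hypothetical, and the sketch simply reports "not found" when run anywhere other than a ViPR SRM host.

```python
import os
import subprocess

SCRIPTS = ["/opt/APG/bin/manage-modules.sh", "/opt/APG/bin/manage-tasks.sh"]


def show_help(script):
    """Run a script with no arguments and return whatever help text it prints."""
    if not os.path.isfile(script):
        return None  # not on a ViPR SRM host, or installed in another location
    result = subprocess.run([script], capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


for script in SCRIPTS:
    text = show_help(script)
    print(script, "->", "not found" if text is None else text)
```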
Once these tasks are complete, the enriched properties can be seen in Management of
Database Metrics.
Fun fact: Running the import-properties
task is known colloquially among ViPR
SRM engineers as “Kicking the property
store”.
Using Data Enrichment for Application Chargeback
At this point, we will assume you have used Data Enrichment to apply a set of appli
(Application) tags to the hosts and virtual machines in your environment.
The report we’ve already developed for displaying Array, LUN, and total Capacity for each host
can be re-used here. We can create a new node and copy/paste the existing report as a child.
By adding a simple expansion on appli to this child node, we can aggregate per-application
capacity using a Sum formula. This is the same process we followed when we aggregated per-
LUN capacity on the nodes which had expansions on device.
We can also use a ChildCount formula on the newly pasted node to get a count of all Hosts and
Virtual Machines associated with an application.
This will make the configuration of an application-focused report very simple:
The result is a straightforward application-focused capacity report. It can be drilled into to view
per-host information.
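The arithmetic behind this report is a group-by on the appli tag: a Sum of capacity and a count of children per application. The Python sketch below models that with invented host records; the function name chargeback, the capacity_gb field, and the "Untagged" bucket are assumptions made for illustration, not ViPR SRM names.

```python
from collections import defaultdict


def chargeback(hosts):
    """Aggregate capacity and host count per application tag."""
    totals = defaultdict(lambda: {"capacity_gb": 0.0, "hosts": 0})
    for host in hosts:
        app = host.get("appli", "Untagged")
        # Equivalent of the Sum formula over each host's LUN capacities
        totals[app]["capacity_gb"] += sum(lun["capacity_gb"] for lun in host["luns"])
        # Equivalent of the ChildCount formula on the pasted node
        totals[app]["hosts"] += 1
    return dict(totals)


# Hypothetical hosts already tagged through Data Enrichment.
hosts = [
    {"device": "sqlprod01", "appli": "Billing",
     "luns": [{"capacity_gb": 500.0}, {"capacity_gb": 250.0}]},
    {"device": "sqlprod02", "appli": "Billing",
     "luns": [{"capacity_gb": 100.0}]},
    {"device": "webprod01", "appli": "Storefront",
     "luns": [{"capacity_gb": 50.0}]},
]

report = chargeback(hosts)
for app, row in sorted(report.items()):
    print(app, row["capacity_gb"], row["hosts"])
```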
Explore: What out-of-the-box reports
could be improved with tags
customized to your business?
Alerting on Reports
The alert engine in ViPR SRM is very flexible. It can generate emails or SNMP traps, make log
entries, and execute arbitrary actions. These can be triggered by several kinds of stimuli:
Incoming alerts from other sources (Alert Consolidation)
Simple analysis of incoming collection data (APG Values Socket Listener)
Results in scheduled reports (APG Report Data)
In this case, the Director of Cloud has requested changes to provisioning policy based on array
performance. It probably makes the most sense to base such a decision on the combination of
multiple metrics rather than a single value. Based on that, we will use the APG Report Data
Adapter.
The rule we will implement is
The formula above (Figure 42) is an artificial ‘score’ describing how busy a VMAX FA (Front-
End Adapter) processor is.
Cloud storage provisioning in this environment is managed by the ViPR Controller. To stop new
provisioning, we will de-register the busy FA port from the vArray it is associated with. This will
not affect capacity which is already allocated, but will prevent the ViPR Controller from using
this FA in the future.
Building the Report
This report will have a structure similar to those we have built in the past. The primary difference
is the addition of the Math.Spatial.Average, Math.Spatial.Max, and Math.Spatial.Percentile
formulas. These allow us to apply concepts such as the Average and Maximum of a set without
using Time Management. The Percentile formula provides a capability that isn’t possible with
normal Time Management rules.
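A spatial formula aggregates across a set of values at one point in time rather than across time for one metric. The Python sketch below shows the three operations on a hypothetical set of readings. The percentile uses the nearest-rank method, which is an assumption: the text does not say how ViPR SRM computes it, so results may differ from the product.

```python
import math


def spatial_average(values):
    """Mean of the set of values."""
    return sum(values) / len(values)


def spatial_max(values):
    """Largest value in the set."""
    return max(values)


def spatial_percentile(values, pct):
    """Nearest-rank percentile: the smallest value covering pct percent of the set."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical readings taken from five child metrics at the same moment.
readings = [120, 340, 410, 95, 280]

print(spatial_average(readings))
print(spatial_max(readings))
print(spatial_percentile(readings, 50))
print(spatial_percentile(readings, 95))
```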
Scheduling the Report
This can be done from the Tools menu. Choose ‘Schedule this Report’. After configuring a
schedule, the most important thing is checking the ‘Local Manager’ box on the ‘Alert’ tab.
In the background, this works by sending the XML output of the report to this folder on the
Alerting Backend host:
/opt/APG/Backends/Alerting-Backend/Default/custom/Adapter/Watch4net Report Data Adapter/APG Report Data
Configuring the Alert
To find the Alerting interface, browse to Administration (for full admin users) or Modules (for
normal users). Alternatively, browse to
http://[frontend-host]:58080/alerting-frontend/
Notes on the Report Data Adapter
First, ensure that the Report Data Adapter is installed. It
should appear in the ‘Adapters’ section of the tree. If it is
not installed, this process can be started by clicking the ‘Create a New Element’ button, which
can be found in the same location as the ‘New Node’ button in Edit Mode.
It is also possible to edit the Report Data Adapter settings here. Temporarily reducing the Time
Check value can speed up troubleshooting, allowing fast iteration when configuring and testing
a new alert.
Explore: Check the report data folder on
the Backend Host to ensure Scheduled
Reports are coming across. Note the format
of the XML output.
Explore: Properties have an eight-
character limit. How would this impact
column names on a report meant to be
parsed by the Alerting Backend?
The name of the Report Data Adapter instance is also significant, as it will be referenced in a
filter. In environments with a wide variety of report-based alerts, multiple adapter instances may
be configured with different names and complementary file masks.
The Alerting Definition
In the ‘Alerting Definitions’ section, a new definition will be created, similar to creating a new
node when in Edit Mode.
Once in the alert configuration section, different blocks can be dragged and dropped to create
the alert logic. A typical alert will consist of at least three blocks:
A Filtered Entry, which defines which data the alert acts on
A Condition such as a Comparator, which checks the filtered value against a logical test
An Action which occurs when the Condition is met (or not met)
The Filtered Entry for this alert will
check for three properties:
adapterName, reportName, and name.
The first and second are
straightforward. The adapterName
property will match the name of the
Report Data adapter. The reportName
property will match the name of the
scheduled report.
The third has some special rules.
When a table report is parsed, each
numerical column is turned into a
metric. The name property of the metric is based on the name of the column, but it is important
to note that the parser removes all spaces. This means that a column named “Utilization Score”
would get a name property of
“UtilizationScore”.
Non-numerical columns are turned into
properties for the numerical columns. For
example, the report scheduled above has
non-numerical columns named “VMAX” and “FA”. This means the numerical metric on the
“Utilization Score” column will get tagged with properties VMAX and FA.
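The parsing rules described here can be modeled in a few lines of Python. This sketch is an illustration of the rules as stated, not the adapter's actual code: the function name parse_row, the row layout, and the adapter and report names are hypothetical. Numerical columns become metrics whose name has its spaces removed, and non-numerical columns become properties on those metrics.

```python
def parse_row(adapter_name, report_name, row):
    """Turn one table row into metrics, one per numerical column."""
    properties = {
        column: value for column, value in row.items()
        if not isinstance(value, (int, float))
    }
    metrics = []
    for column, value in row.items():
        if isinstance(value, (int, float)):
            metrics.append({
                "adapterName": adapter_name,
                "reportName": report_name,
                "name": column.replace(" ", ""),  # the parser removes all spaces
                "value": value,
                **properties,  # non-numerical columns ride along as properties
            })
    return metrics


# Hypothetical row from the scheduled report.
row = {"VMAX": "000195700001", "FA": "FA-7E", "Utilization Score": 431.5}
metrics = parse_row("ReportData", "FA Utilization", row)
print(metrics)
```

A Filtered Entry for this row would therefore test name against "UtilizationScore", not against the column header as displayed.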
Once a Filtered Entry is configured,
we can connect it to a Comparator.
This will be a Constant Comparator
Operation with a ">" operator and a
constant value of “400”.
Finally, we can define an action.
The change in provisioning policy can wait a moment – for now, let’s just configure an email
alert.
Alert actions (such as Emails) will accept certain keywords:
TMST – Timestamp of the alert
VALUE – The value that triggered the alert
PROP.’xxxx’ – Displays property xxxx
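The three blocks of the alert (Filtered Entry, Comparator, Action) and the keyword substitution can be sketched as plain Python functions. This is a model of the logic only. The adapter name, report name, and message template are hypothetical, and the sketch assumes the keywords appear in the message literally as TMST, VALUE, and PROP.'xxxx' with straight quotes.

```python
import re


def filtered_entry(metric, adapter_name, report_name, name):
    """Accept only metrics produced by the named adapter, report, and column."""
    return (metric["adapterName"] == adapter_name
            and metric["reportName"] == report_name
            and metric["name"] == name)


def comparator(value, constant=400):
    """Constant Comparator with a '>' operator."""
    return value > constant


def render(template, metric, timestamp):
    """Replace TMST, VALUE, and PROP.'xxxx' keywords in a single pass."""
    def substitute(match):
        if match.group(0) == "TMST":
            return timestamp
        if match.group(0) == "VALUE":
            return str(metric["value"])
        return str(metric.get(match.group(1), ""))
    return re.sub(r"PROP\.'(\w+)'|TMST|VALUE", substitute, template)


metric = {"adapterName": "ReportData", "reportName": "FA Utilization",
          "name": "UtilizationScore", "value": 431.5,
          "VMAX": "000195700001", "FA": "FA-7E"}
template = "TMST: FA PROP.'FA' on VMAX PROP.'VMAX' scored VALUE"

message = None
if (filtered_entry(metric, "ReportData", "FA Utilization", "UtilizationScore")
        and comparator(metric["value"])):
    message = render(template, metric, "2015-05-03 10:00")
print(message)
```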
Once this alert is configured to send an email in the desired format, we can move on to the
provisioning policy change.
Automating the Policy Change
We’ll be modifying a parameter in ViPR Controller. To do this, the ViPR Controller CLI will need
to be installed on the ViPR SRM Alerting Backend. The exact procedure for doing so can be
found at http://www.emc.com/techpubs/vipr/installing_the_vipr_cli-1.htm.
Once the CLI is installed, we can create an External Process action in the ViPR SRM alert
definition. This action expects a command, a set of command parameters, and any environment
parameters. The command will be run as user ‘apg’, which means the environment variables
required for the ViPR CLI will need to exist for this user as well.
The simplest way to do this is likely to define these
variables in the ‘Environment Parameters’ box when creating the External Process action.
These will be placed on one line, separated by commas. The required variables for executing a
ViPR CLI command are typically:
PATH=/opt/ViPR/cli/bin:$PATH
PYTHONPATH=/opt/ViPR/cli/bin:$PYTHONPATH
ViPR_HOSTNAME=[ViPR Controller FQDN]
ViPR_PORT=4443
We also need to define the command and its arguments. The ultimate command we wish to run
would be
viprcli storageport deregister -name [FA Port] -type vmax -serialnumber [Array Serial]
To configure this in the External
Process action, first we enter the
binary in the ‘Command’ box. This
must include the full path, typically
‘/opt/ViPR/cli/viprcli’.
In the command parameters box, the arguments must be entered one at a time, comma-
separated.
The Array Serial number and FA Port name can
be populated using the PROP keyword. These will
take on the column header names, PROP.’VMAX’
and PROP.’FA’.
Note in the screenshot that ‘:0’ has been appended to the FA property. This is because the port
must be specified to match ViPR Controller’s nomenclature. There is an assumption here that only
the VMAX’s zero ports are in use. If both the zero and one ports are in use, it will be necessary
to create two actions, one to deregister each port.
Explore: Try issuing different commands. For example, the command ‘/bin/date’ with parameters
‘>>,/home/apg/date.txt’ will create a simple log of alert times.
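To make the comma-separated format concrete, the Python sketch below expands a parameter line and an environment line the way the External Process action is described as doing. It is a model for checking your own entries, not the product's parser: the function names are hypothetical, the PROP keyword is assumed to be written with straight quotes, the hostname is a placeholder, and nothing is executed.

```python
import re


def expand(token, metric):
    """Replace PROP.'xxxx' in one parameter with the metric's property value."""
    return re.sub(r"PROP\.'(\w+)'", lambda m: str(metric[m.group(1)]), token)


def build_command(command, parameters, metric):
    """Split the comma-separated parameters and expand each one."""
    return [command] + [expand(token, metric) for token in parameters.split(",")]


def parse_environment(line):
    """Split 'NAME=value,NAME=value' into a dictionary."""
    return dict(item.split("=", 1) for item in line.split(","))


metric = {"FA": "FA-7E", "VMAX": "000195700001"}  # hypothetical report row
parameters = "storageport,deregister,-name,PROP.'FA':0,-type,vmax,-serialnumber,PROP.'VMAX'"

argv = build_command("/opt/ViPR/cli/viprcli", parameters, metric)
env = parse_environment("ViPR_HOSTNAME=vipr.example.com,ViPR_PORT=4443")
print(" ".join(argv))
print(env)
```

Printing the expanded command before wiring it into an alert is a cheap way to confirm that the ‘:0’ suffix lands on the port name and not on the serial number.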
Testing Port Deregistration
It is possible to re-register ports in ViPR Controller after they have been deregistered. To quickly
test that the process is correctly configured end-to-end, try the following:
Add a part filter to the Filtered Entry for a single VMAX Controller
Set the Comparator very low to ensure it will trigger
Manually run the Scheduled Report
Once the action has been observed to fire correctly, re-register the port in ViPR
Controller (Physical Assets / Storage Systems / [Select System] / Storage Ports /
[Register Port])
Note: Between the writing and publication of this paper, the ViPR Controller 2.2 release
implemented a feature similar to what is described above which does not require ViPR SRM.
As such, consider this as an example of what can be done with ViPR SRM by treating it as a
platform rather than a simple monitoring dashboard.
Conclusion
During the course of this article, we’ve started simple and built up to a number of fairly complex
reports. These tutorials are intended to allow an administrator to explore ViPR SRM reporting.
Users that invest time in developing these skills will find themselves much more confident the
next time they encounter a Hard Question.
The skills presented here are only the beginning. ViPR SRM is a deep platform, with capabilities
and intricacies that cannot be plumbed in a single article. Consider this article a first step – a
Hello World app for a new language. As you continue on your journey, consider some of the
additional resources below for learning more.
The official documentation for ViPR SRM can be found online at:
https://community.emc.com/docs/DOC-35810
The EMC Community supporting the broader ViPR portfolio can be found at:
https://community.emc.com/community/products/vipr
This community includes videos, demos, and even posts by users that include the custom
reports they’ve created.
Your humble author maintains a blog which regularly discusses ViPR SRM (and other
enterprise-technology-adjacent topics):
http://eastsidegeek.typepad.com/
The EMC Community post referenced in the introduction will also be updated as needed with
more information related to this document:
https://community.emc.com/people/DannoOfNashville/blog/2015/05/03/knowledge-sharing-2015-vipr-srm-reports
For users who wish to go deeper in a hands-on setting, EMC Education Services offers a ViPR
SRM Advanced Reporting class, as well as classes around the installation, maintenance, and
general use of the environment:
https://education.emc.com/
Finally, I encourage everyone to become actively involved in the online communities above.
Questioning, exploration, sharing, and diversity of opinion make the entire ecosystem stronger.
The more you participate, the more the community will become the one you want to see.
EMC believes the information in this publication is accurate as of its publication date. The
information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO
THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an
applicable software license.
“If this machine gave you the truth immediately, you would not recognize it, because
your heart would not have been purified by the long quest”
– Umberto Eco, Foucault’s Pendulum