As a part of the GUEST BLOGGERS initiative on vXpress, here is the second post by Anand Vaneswaran. In this article he gives us the lowdown on using vRealize Operations custom dashboards to showcase the capacity of Virtual Desktops managed by Horizon View and running on the vSphere platform.
In this post, he takes the concept of capacity management in vROps, most often applied to server infrastructure, and applies it to Virtual Desktops. The result is a 360-degree view into the capacity of your Virtual Desktop Infrastructure using vRealize Operations Manager. Here is what Anand has to say:
In my previous post, I provided instructions on
constructing a high-level “at-a-glance” VDI dashboard in vRealize Operations
for Horizon, one that would aid in troubleshooting scenarios. In the second of
this three-part blog series, I will be talking about constructing a custom
dashboard that will take a holistic view of my vSphere HA clusters that run my
VDI workloads in an effort to understand current capacity. The ultimate
objective is not only to better understand my current capacity, but also to
use these stats to identify trends that help me forecast future capacity. In
this example, I'm going to try to gain information on the following:
· Total number of running hosts
· Total number of running VMs
· VM-LUN densities
· Usable RAM capacity (in an N+1 cluster configuration)
· vCPU to pCPU density (in an N+1 cluster configuration)
· Total disk space used in percentage
You can either follow my lead and recreate
this dashboard step-by-step, or simply use this as a guide and create a
dashboard of your own for the most important capacity metrics you care about.
In my environment, I have five (5) clusters consisting of full-clone VDI
machines and three (3) clusters consisting of linked-clone VDI machines. I have
decided to incorporate eight (8) “Generic Scoreboard” widgets in a two-column
custom dashboard. I’m going to populate each of these “Generic Scoreboard” widgets
with the relevant stats described above.
Once my widgets have been imported, I will
rearrange my dashboard so that full-clone clusters occupy the left side of the
screen and linked-clone clusters occupy the right side. Now,
as part of this exercise I determined that I needed to create super metrics to
calculate the following metrics:
· VM-LUN densities
· Usable RAM capacity (in an N+1 cluster configuration)
· vCPU to pCPU density (in an N+1 cluster configuration)
· Total disk space used in percentage
With that being said, let’s begin! The
first super metric I will create will be called SM – Cluster LUN Density. I’m going to design my super metric with
the following formula:
sum(This Resource:Deployed|Count Distinct VM)/sum(This Resource:Summary|Total Number of Datastores)
In this super metric I will attempt to find
out how many VMs reside in my datastores on average. The objective is to make
sure I’m abiding by the recommended configuration maximums of allowing a
certain number of virtual machines to reside on my VMFS volume.
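To make the arithmetic concrete, here is a quick Python sketch of what this super metric computes. vROps evaluates the super metric itself; this just mirrors the math, and the VM and datastore counts below are made-up illustration values, not from a real cluster:

```python
# Illustrative sketch of the SM - Cluster LUN Density arithmetic.
# Hypothetical numbers only; vROps computes this from live metrics.

def vm_lun_density(running_vms: int, datastores: int) -> float:
    """Average number of VMs residing on each datastore in the cluster."""
    return running_vms / datastores

# Hypothetical cluster: 450 deployed VMs spread across 30 datastores
print(vm_lun_density(450, 30))  # 15.0 VMs per datastore on average
```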
The next super metric I will create is
called SM – Cluster N+1 RAM Usable.
I want to calculate the usable RAM in a cluster in an N+1 configuration. The
formula is as follows:
(((sum(This Resource:Memory|Usable Memory (KB))/sum(This Resource:Summary|Number of Running Hosts))*.80)*(sum(This Resource:Summary|Number of Running Hosts)-1))/1048576
Okay, so clearly there is a lot going on in
this formula. Allow me to try to break it down and explain what is happening
under the hood. I’m calculating this stat for an entire cluster. So what I will
do is take the usable memory metric (installed) under the Cluster Compute
Resource Kind. Then I will divide that number by the total number of running
hosts to give me the average usable
memory per host. But hang on, there are two caveats here that I need to
take into consideration if I want an accurate representation of the true overall
usage in my environment:
1) I don't think I want my hosts running at more than 80 percent capacity when it comes to RAM utilization. I always want to leave a little buffer, so my utilization factor will be 80 percent, or .8.
2) I always want to account for the failure of a single host (in some environments, you might want to factor in the failure of two hosts) in my cluster design so that compute capabilities for running VMs are not compromised in the event of a host failure. I'll want to incorporate this N+1 cluster configuration design in my formula.
So, I will take the result of my overall
usable, or installed, memory (in KB) for the cluster, divide that by the number
of running hosts on said cluster, then multiply that result by the .8
utilization factor to arrive at a number – let’s call it x – this is the amount of real usable memory I have for the
cluster. Next, I’m going to take x,
then multiply the total number of hosts minus 1, which will give me y. This will take into account my N+1
configuration. Finally, I'm going to take y,
still in KB, and divide it by 1,048,576 (1024 × 1024) to convert it to GB and
get my final result, z.
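The breakdown above can be mirrored in a short Python sketch to sanity-check the formula. The host count and memory figures are hypothetical illustration values:

```python
# Illustrative sketch of the SM - Cluster N+1 RAM Usable arithmetic.
# vROps evaluates the super metric itself; this just mirrors the math.

KB_PER_GB = 1024 * 1024      # 1,048,576 KB in one GB
UTILIZATION_FACTOR = 0.8     # keep a 20 percent RAM buffer per host

def n_plus_one_usable_ram_gb(usable_memory_kb: float, running_hosts: int) -> float:
    """Usable cluster RAM in GB after reserving capacity for one host failure."""
    per_host_kb = usable_memory_kb / running_hosts       # average usable KB per host
    x = per_host_kb * UTILIZATION_FACTOR                 # real usable KB per host
    y = x * (running_hosts - 1)                          # N+1: assume one host is lost
    return y / KB_PER_GB                                 # convert KB to GB (z)

# Hypothetical cluster: 8 running hosts, 256 GB of usable RAM each
total_usable_kb = 8 * 256 * KB_PER_GB
print(n_plus_one_usable_ram_gb(total_usable_kb, 8))  # ~1433.6 GB
```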
The next super metric I will create is
called SM – Cluster N+1 vCPU to Core
Ratio. The formula is as follows:
sum(This Resource:Summary|Number of vCPUs on Powered On VMs)/((sum(This Resource:CPU Usage|Provisioned CPU Cores)/sum(This Resource:Summary|Total Number of Hosts))*(sum(This Resource:Summary|Total Number of Hosts)-1))
In this formula, I want to know my vCPU to
physical core ratio. Now, it's great to
know this detail under normal operational circumstances when things are fine
and dandy, but what would happen in the event of a host failure? How would that
affect the vCPU-pCPU ratio? To that end I want to incorporate this condition in
my super metric. My formula will attempt to find out the overall number of
vCPUs on my powered-on VMs and divide that number by my total number of hosts
minus 1 (for N+1), multiplied by the number of physical cores per host.
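Here is the same logic as a Python sketch, again with hypothetical vCPU, core, and host counts purely for illustration:

```python
# Illustrative sketch of the SM - Cluster N+1 vCPU to Core Ratio arithmetic.
# vROps evaluates the super metric itself; this just mirrors the math.

def n_plus_one_vcpu_ratio(total_vcpus: int, total_cores: int, total_hosts: int) -> float:
    """vCPU-to-physical-core ratio if one host in the cluster were lost."""
    cores_per_host = total_cores / total_hosts
    surviving_cores = cores_per_host * (total_hosts - 1)  # N+1: drop one host
    return total_vcpus / surviving_cores

# Hypothetical cluster: 600 vCPUs on powered-on VMs, 8 hosts with 24 cores each
print(round(n_plus_one_vcpu_ratio(600, 8 * 24, 8), 2))  # 3.57 vCPUs per core
```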
The next super metric I will create is
called SM – Cluster HD Percent Used
(Datastore Cluster). This is for my full clone VDI Clusters, which make use
of the datastore clusters feature. The formula is as follows:
(sum(This Resource:Capacity|Used Space (GB))/sum(This Resource:Capacity|Total Capacity (GB)))*100
This formula is fairly self-explanatory.
I’m taking the total space used for that datastore cluster and dividing that by
the total capacity of that datastore cluster. This is going to give me a number
greater than 0 and less than 1, so I’m going to multiply this number by 100 to
give me a percentage output.
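As a quick check, the same percentage math in Python, with illustrative capacity figures:

```python
# Illustrative sketch of the SM - Cluster HD Percent Used arithmetic.
# Hypothetical numbers only; vROps computes this from live metrics.

def hd_percent_used(used_gb: float, total_gb: float) -> float:
    """Datastore-cluster disk usage expressed as a percentage."""
    return used_gb / total_gb * 100

# Hypothetical datastore cluster: 7,500 GB used of 10,000 GB total
print(hd_percent_used(7500, 10000))  # 75.0 percent used
```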
Once I have the super metrics I want, I
want to attach these super metrics to a package called SM – Cluster SuperMetrics.
The next step would be to tie this package
to current Cluster resources as well as Cluster resources that will be
discovered in the future. Navigate to Environment
> Environment Overview > Resource Kinds > Cluster Compute Resource.
Shift-select the resources you want to edit, and click on Edit Resource.
Click the checkbox to enable “Super Metric
Package”, and from the drop-down select SM – Cluster SuperMetrics.
To ensure that this SuperMetric package is
automatically attached to future Clusters that are discovered, navigate to Environment > Configuration > Resource
Kind Defaults. Click on Cluster Compute Resource, and on the right pane
select SM – Cluster SuperMetrics as the Super Metric Package.
Now that we have created our super metrics
and attached the super metric package to the appropriate resources, we are
ready to begin editing our “Generic Scoreboard” widgets. I will tell you how to
edit two widgets (one for a full-clone cluster and one for a linked-clone
cluster) with the appropriate data and show its output. We will then want to replicate
the same procedures to ensure that we are hitting every unique full clone and
linked clone cluster. Here is an example of what the widget for a full-clone
cluster should look like:
And here’s an example of what a widget for
a linked-clone cluster should look like:
Once we replicate the same process and
account for all of our clusters, our end-state dashboard should resemble
something like this:
And we are done. A few takeaways from this
lesson:
· We delved into the concept of super metrics
in this tutorial. Super metrics are awesome resources that allow you to
manipulate metrics and display just the data you want. In our examples we
created some fairly involved formulas, but a very simple example of why a
super metric can be particularly useful is memory. vRealize Operations Manager
displays memory metrics in KB, but how do we get it to display in GB? Super
metrics are your solution here: for example, dividing a KB-based metric such as
Memory|Usable Memory (KB) by 1,048,576 gives you the value in GB.
· Obviously, every environment is configured
differently and therefore behaves differently, so you will want to tailor the
dashboards and widgets according to your environment needs, but at the very
least the above examples can be a good starting point to build your own widgets/dashboards.
In my next tutorial, I will walk through
the steps for creating a high-level “at-a-glance” VDI dashboard that your
operations command center team can monitor. In most organizations, IT issues
are categorized by severity and then assigned to the appropriate
parties by a central team that runs point on issue resolution, coordinating
with different departments. What happens
if a Severity 1 issue happens to afflict your VDI environment? How are these
folks supposed to know what to look for before placing that phone call to you?
This upcoming dashboard will make it very easy. Stay tuned!!
Thanks once again to Anand for sharing his experiences around using vROps to monitor Virtual Desktop Environments. Please leave your comments, thoughts or questions if you have any. To be a guest blogger at vXpress see this page.
SHARE & SPREAD THE KNOWLEDGE :-)