Sunday, December 21, 2014

Part 7 : Recommendation Intelligence in vRealize Operations Manager 6.0!

One of the unique features of VMware's Operations Management solution has been the analytics engine, which has the ability to learn the behaviour of every metric and create a dynamic threshold which reflects its usage patterns. These usage patterns are the key to registering anomalies or abnormal behaviour of a metric, which helps in the proactive detection of an issue that might hit that metric source.

While all this sounds fantastic, one thing which always worried me was the one-track approach of this learning behaviour. In other words, if you have an environment which has issues and you drop vCenter Operations Manager 5.x into that same environment, chances are that you would wait for a period of one or two months before looking into the alerts shown by vCenter Operations Manager. This is usually called a cooling-off or first-time collection period, and we tend to sit back and relax while the analytics engine is crunching the numbers.

A risk with this approach is that you might end up telling vCOps 5.x that some of the issues or bad behaviour are actually normal behaviour in your virtual infrastructure, by not taking any action on the early alerts you get from the system. While data collection over a longer period is good for learning cyclical behaviour, it is important that you iron out all these early alerts from vCOps 5.x to ensure that the system does not learn bad behaviour as usual or normal behaviour. Having said that, I must also admit that doing this as soon as you deploy the product, assuming you don't have much expertise yet, might be a difficult task.



With the release of vRealize Operations Manager, the product engineering group has done a great job of taking this weakness of the earlier versions and making it a strength of the new release. This was done by taking the years of knowledge gained from troubleshooting VMware environments, feeding it into the new release, and churning out RECOMMENDATIONS to solve the issues it detects, in line with recommended best practices. This recommendation engine ensures that you get immediate recommendations about the issues which vROps thinks are not normal, so you can act upon them using either your own judgement or the advice given by the recommendation engine itself.

Now this sounds ABSOLUTELY AWESOME. We all know that the proof of the pudding is in the eating, hence without further ado, let us see what happened as soon as I deployed vRealize Operations 6.0 in my lab and migrated the data from my earlier install of version 5.8.2.

As soon as you launch vROps 6.0, you will be automatically directed to the Recommendations Dashboard. Here is the screenshot from my lab which shows this dashboard.



You will immediately notice that I have a bunch of issues highlighted in my infrastructure. I will click on the message which says "Virtual Machine has Disk I/O write latency problem", and what do I see?


I see all the virtual machines which are experiencing Disk I/O latency. In my case I have a single datastore running all the virtual machines, hence the latency. As my next step, I will click on one of the virtual machines (win2k8temp) to see what recommendations I get for it.



I can immediately see a recommendation coming out of the system stating that I should enable Storage I/O Control to introduce quality of service on the datastore. This will ensure that I can provide the required IOPS to the virtual machines which are important to me. I can also see that the VM has low CPU swap wait, pointing to another symptom which could lead to a performance issue.

If you notice, there is another option to click on which says "Other Recommendations". Let us expand that option to see if we have more recommendations from the tool about the issue.


The other recommendations show some more options which could help you resolve the issue. If you notice, these recommendations are intelligent in nature and are based on recommended best practices. This recommendation engine is smart and, as I said, based on experience. This is truly next generation. I almost forgot to tell you that you can write your own recommendations, and if you are using third-party management packs, they come with their own recommendations for the devices, applications, and so on that you are monitoring through those management packs.

With this, I will close this post. Hopefully you enjoyed the read and will implement some of these learnings in your Operations Manager deployment. I will come back soon with deep dives into other new features.

Till Then....


Share & Spread the Knowledge 




Part 6: vRealize Operations 6.0 - Am I missing the Badge Scores?




With the release of vRealize Operations Manager 6.0, a number of new things have been introduced. While I continue to add content around what's new in my series on this new release, I always wanted to share some major changes in the way data is displayed and should be interpreted in the new release.
If you have worked on vCenter Operations Manager 5.x, you would know about the badge scores displayed with each badge, which represents a particular Super Metric. See the highlighted scores in the screenshot below from the old version:




Now let's look at the same badges on the home screen of vRealize Operations Manager 6.0.



While the shape of the badges remains the same, you will notice that the scores are GONE. Even if you dig further down into the minor badges, you will notice that the scores are gone. While the scores are no longer displayed, the badge colours are still driven by the thresholds. In fact, if you move your mouse pointer over any of the badges, you will see the score pop up. See the screenshot below:



While the scores are not available on the badges on the Recommendations (HOME) dashboard, if you dig deeper into the badges, you will start seeing the score for specific objects in the inventory. See the screenshot below, where I am checking the Anomalies score for a custom group in the vROps inventory and can see the scores right on the badges.



I believe the reason for taking the scores off the Home page is to emphasize visualization over calculation. I like it, since I can still look at the scores when I need to and only worry about colour changes when they breach the thresholds I have defined in the policies of my environment.

In the posts to come, I will talk about more such enhancements and changes to ensure that your learning curve on this new release is fast-paced :-)

Till then...

Share & Spread the Knowledge

Saturday, December 20, 2014

Part 5 - What migrates from vCOps 5.x to vROps 6.0?

I have received questions from a number of people around the migration of data from vCenter Operations Manager 5.x to the latest and greatest vRealize Operations Manager 6.0. I will not reinvent the wheel by writing about the migration process, since my buddy Lior Kamrat has already explained the process in this article.

A quick thing I wanted to add to what Lior has mentioned in his article is the data that gets migrated in the process. Here is the list of things that get migrated:


  • CUSTOM GROUP SETTINGS
  • FSDB DATA
  • CIQMETRICS
  • POLICY
  • NOTIFICATION PLUGIN
  • USER DATA
  • DASHBOARD
  • FSDB ROLLUP
  • FSDB CATCHUP




While most of the things mentioned in the list are self-explanatory, I would like to highlight that while the custom dashboards get migrated in the process, it is possible that you will not see data in your custom dashboards, since the layout of some of the metrics in vROps 6.0 is different from how they were listed in the older version of the product. This is also highlighted in the release notes here.


Although the migration plan is simple, it is recommended that you do a dry run of the migration first to see if you lose anything in the process, and plan to re-create it in the new version.


Share & Spread the Knowledge 


Tuesday, December 9, 2014

vRealize Operations Manager 6.0 Released. Go get it!

I am glad to share that vRealize Operations Manager 6.0 is out today. I have been writing a series about this release of the operations management suite here. The release will now allow me to take you deep into the great new features and the recommended best practices to design, deploy, and use the solution for effective performance management and capacity planning for EVERYTHING in your datacenter.



Here is what the release notes say about the product:-

vRealize Operations Manager 6.0 is the latest release of VMware's integrated operations suite, converging performance, capacity, and configuration management. This release introduces the following enhancements.
  • Scale-Out Deployment Architecture
    This release provides distributed deployment with elastic scale and higher scalability.
  • Unified User Interface
    This release introduces a single user interface to manage vSphere as well as non vSphere domains. You can create powerful, flexible custom dashboards enabling you to bring any information you want into the management console.
  • Licensing Management in 6.0 
    vRealize Operations Manager 6.0 has an independent license-management GUI that provides enhanced administration of license keys specific to vRealize Operations. Customers can now deploy mixed editions in a single vRealize Operations 6.0 instance.
  • Smart Alerts
    Smart alerts combine multiple symptoms to generate a single alert that focuses on the underlying issue with clear recommendations and option to take action for remediation.
  • Enhanced Reporting
    Enhanced reporting provides several out-of-the-box reports with the ability to generate fully customizable reports.
  • Capacity Planning and Project Management Capabilities
    New capacity planning and project management capabilities extend beyond vSphere and across physical and application level metrics. Flexible capacity models can be adjusted to meet different business needs.
  • Custom Policies
    Custom policies can be applied to specific workload types, applications, or clusters, enabling more advanced monitoring of performance, capacity, and configuration standards.
  • Automated Remediation of Problems
    Integrated action and remediation capabilities with the ability to apply actions according to the recommendation for the alerts.
  • User Access Control Management
    Improved user access control, including granular role-based access control.
  • Unified Storage Visibility
    New storage visibility shows the correlation between the application group and the storage infrastructure supporting it, including HBAs, fabrics, and arrays, along with the ability to trace operational issues all the way to storage.

Here are the links to the Product Documentation and Download.


vRealize Operations Manager 6.0 | 9 DEC 2014

Product Download
  • Visit the vRealize Operations Manager 6.0 Documentation Center to learn more about the product

New Features and Release Notes
  • vRealize Operations Manager 6.0 Release Notes

Compatibility and Configuration Limits
  • Hardware, Host Operating System, and Guest Operating System Compatibility Guides
  • VMware Product Interoperability Matrix

vRealize Operations Manager Product Documentation
  • vRealize Operations Manager Virtual Application Installation
  • vRealize Operations Manager Linux and Windows Installation
  • Connecting vRealize Operations Manager to Data Sources
  • Configuring Users and Groups
  • Monitoring Objects in Your Managed Environment
  • Planning the Resources for Your Managed Environment
  • Customizing How vRealize Operations Manager Displays Your Data
  • Customizing How vRealize Operations Manager Monitors Your Environment
  • Maintaining and Expanding vRealize Operations Manager
  • Metric Definitions in vRealize Operations Manager

vRealize Operations for Horizon Documentation
vRealize Operations for Horizon is compatible with VMware vRealize Operations Manager version 6.0.
  • vRealize Operations Manager for Horizon Installation
  • vRealize Operations Manager for Horizon Administration
  • vRealize Operations Manager for Horizon Security

So go ahead, download your copy and start relishing the NextGen operations management solution for your Datacenter (Note: Not only VIRTUAL but PHYSICAL as well) :-)



Share & Spread the Knowledge 




Tuesday, December 2, 2014

Configuring Policies in vCenter / vRealize Operations Manager!

A few days back, I had the opportunity to discuss vCOps policies with my colleagues during an internal enablement event at VMware. I thought it would be useful to others if they could have a look at the deck I presented and get cues from it on how one should configure policies in vCenter Operations Manager 5.x. All of this is also applicable to vRealize Operations Manager 6.0, which was announced at VMworld this year and will be released by VMware in the near future. You can check out my series on vRealize Operations Manager 6.0 here in case you missed those articles.

Let's have a look at configuring vCOps policies now:



Configuring policies in vROps from Sunny Dua

Unfortunately, I do not have a video recording; however, I would be happy to answer any questions you might have in the comments section.

Hope this helps you configure the right policies with vCOps/vROps for your virtual infrastructure.


Share & Spread the Knowledge



Wednesday, November 26, 2014

Creating a vROps Capacity Dashboard for your Virtual Desktop Infrastructure!

As a part of the GUEST BLOGGERS initiative on vXpress, here is the second post by Anand Vaneswaran. In this article he gives us the lowdown on using vRealize custom dashboards to showcase the capacity of Virtual Desktops managed by Horizon View and running on the vSphere platform.

In this post, he has taken the concept of capacity management in vROps, mostly used with server infrastructure, and applied it to Virtual Desktops. This will give you a 360-degree view of the capacity of your Virtual Desktop Infrastructure using vRealize Operations Manager. Here is what Anand has to say:

In my previous post, I provided instructions on constructing a high-level “at-a-glance” VDI dashboard in vRealize Operations for Horizon, one that would aid in troubleshooting scenarios. In the second post of this three-part blog series, I will talk about constructing a custom dashboard that takes a holistic view of the vSphere HA clusters running my VDI workloads, in an effort to understand current capacity. The ultimate objective is to put myself in a better position not only to understand my current capacity, but also to use these stats to identify trends that help me forecast capacity. In this example, I’m going to try to gain information on the following:


·       Total number of running hosts
·       Total number of running VMs
·       VM-LUN densities
·       Usable RAM capacity (in an N+1 cluster configuration)
·       vCPU to pCPU density (in an N+1 cluster configuration)
·       Total disk space used in percentage


You can either follow my lead and recreate this dashboard step-by-step, or simply use this as a guide and create a dashboard of your own for the most important capacity metrics you care about. In my environment, I have five (5) clusters comprising full-clone VDI machines and three (3) clusters comprising linked-clone VDI machines. I have decided to incorporate eight (8) “Generic Scoreboard” widgets in a two-column custom dashboard. I’m going to populate each of these “Generic Scoreboard” widgets with the relevant stats described above.




Once my widgets have been imported, I will rearrange my dashboard so that the left side of the screen shows the full-clone clusters and the right side of the screen shows the linked-clone clusters. Now, as part of this exercise I determined that I needed to create super metrics to calculate the following metrics:

·       VM-LUN densities
·       Usable RAM capacity (in an N+1 cluster configuration)
·       vCPU to pCPU density (in an N+1 cluster configuration)
·       Total disk space used in percentage

With that being said, let’s begin! The first super metric I will create will be called SM – Cluster LUN Density. I’m going to design my super metric with the following formula:

sum(This Resource:Deployed|Count Distinct VM)/sum(This Resource:Summary|Total Number of Datastores)




In this super metric I will attempt to find out how many VMs reside on my datastores on average. The objective is to make sure I’m abiding by the recommended configuration maximums for the number of virtual machines residing on a VMFS volume.
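To make the arithmetic concrete, here is a quick back-of-the-envelope sketch in Python; the figures are made up for illustration and are not taken from my lab:

# Hypothetical illustration of the SM - Cluster LUN Density formula above.
deployed_vms = 240          # Deployed|Count Distinct VM for the cluster (assumed value)
total_datastores = 12       # Summary|Total Number of Datastores (assumed value)

lun_density = deployed_vms / total_datastores      # average VMs per datastore
print(f"Average VMs per datastore: {lun_density:.0f}")   # 20 in this example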

The next super metric I will create is called SM – Cluster N+1 RAM Usable. I want to calculate the usable RAM in a cluster in an N+1 configuration. The formula is as follows:

((sum(This Resource:Memory|Usable Memory (KB))/sum(This Resource:Summary|Number of Running Hosts))*0.8)*(sum(This Resource:Summary|Number of Running Hosts)-1)/1048576




Okay, so clearly there is a lot going on in this formula. Allow me to break it down and explain what is happening under the hood. I’m calculating this stat for an entire cluster, so I will take the usable memory metric (installed) under the Cluster Compute Resource kind. Then I will divide that number by the total number of running hosts to get the average usable memory per host. But hang on, there are two caveats I need to take into consideration if I want an accurate representation of the true overall usage in my environment:

1)     I don’t think I want my hosts running at more than 80 percent capacity when it comes to RAM utilization. I always want to leave a little buffer, so my utilization factor will be 80 percent, or 0.8.
2)     I always want to account for the failure of a single host (in some environments, you might want to factor in the failure of two hosts) in my cluster design, so that compute capabilities for running VMs are not compromised in the event of a host failure. I’ll want to incorporate this N+1 cluster configuration design in my formula.

So, I will take my overall usable, or installed, memory (in KB) for the cluster, divide that by the number of running hosts on said cluster, then multiply that result by the 0.8 utilization factor to arrive at a number – let’s call it x – which is the real usable memory I have per host. Next, I’m going to take x and multiply it by the total number of hosts minus 1, which will give me y; this takes into account my N+1 configuration. Finally, I’m going to take y, still in KB, and divide it by 1024 x 1024 to convert it to GB and get my final result, z.
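Here is the same calculation sketched in Python with made-up numbers (a hypothetical 4-host cluster with 1 TB of installed memory), just to show how x, y, and z fall out of the formula:

# Hypothetical example of the N+1 usable RAM calculation described above.
cluster_usable_kb = 1_073_741_824   # Memory|Usable Memory (KB) for the cluster (assumed: 1 TB)
running_hosts = 4                   # Summary|Number of Running Hosts (assumed)
utilization_factor = 0.8            # leave a 20 percent buffer per host

per_host_kb = cluster_usable_kb / running_hosts    # average usable memory per host
x = per_host_kb * utilization_factor               # real usable memory per host (80 percent)
y = x * (running_hosts - 1)                        # N+1: assume one host is lost
z = y / (1024 * 1024)                              # convert KB to GB

print(f"Usable N+1 RAM for the cluster: {z:.0f} GB")   # roughly 614 GB in this example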

The next super metric I will create is called SM – Cluster N+1 vCPU to Core Ratio. The formula is as follows:

sum(This Resource:Summary|Number of vCPUs on Powered On VMs)/((sum(This Resource:CPU Usage|Provisioned CPU Cores)/sum(This Resource:Summary|Total Number of Hosts))*(sum(This Resource:Summary|Total Number of Hosts)-1))



In this formula, I want to know my vCPU to physical core ratio. Now, it’s great to know this detail under normal operational circumstances when things are fine and dandy, but what would happen in the event of a host failure? How would that affect the vCPU-pCPU ratio? To that end I want to incorporate this condition in my super metric. My formula will take the overall number of vCPUs on my powered-on VMs and divide that by the number of physical cores per host multiplied by my total number of hosts minus 1 (for N+1).
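Again, a short Python sketch with hypothetical numbers may help show what the formula is doing (the values below are assumptions for illustration, not real metric data):

# Hypothetical example of the N+1 vCPU-to-core ratio described above.
total_vcpus = 480          # Summary|Number of vCPUs on Powered On VMs (assumed)
provisioned_cores = 96     # CPU Usage|Provisioned CPU Cores for the whole cluster (assumed)
total_hosts = 4            # Summary|Total Number of Hosts (assumed)

cores_per_host = provisioned_cores / total_hosts       # physical cores per host
n_plus_1_cores = cores_per_host * (total_hosts - 1)    # cores available with one host down
ratio = total_vcpus / n_plus_1_cores                   # vCPU : pCPU ratio in an N+1 scenario

print(f"vCPU to pCPU ratio (N+1): {ratio:.1f} : 1")    # about 6.7 : 1 in this example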

The next super metric I will create is called SM – Cluster HD Percent Used (Datastore Cluster). This is for my full clone VDI Clusters, which make use of the datastore clusters feature. The formula is as follows:

sum(This Resource:Capacity|Used Space (GB))/sum(This Resource:Capacity|Total Capacity (GB))*100



This formula is fairly self-explanatory. I’m taking the total space used for that datastore cluster and dividing that by the total capacity of that datastore cluster. This is going to give me a number greater than 0 and less than 1, so I’m going to multiply this number by 100 to give me a percentage output.

Once I have the super metrics I want, I will attach them to a package called SM – Cluster SuperMetrics.




The next step would be to tie this package to current Cluster resources as well as Cluster resources that will be discovered in the future. Navigate to Environment > Environment Overview > Resource Kinds > Cluster Compute Resource. Shift-select the resources you want to edit, and click on Edit Resource.


Click the checkbox to enable “Super Metric Package”, and from the drop-down select SM – Cluster SuperMetrics.



To ensure that this SuperMetric package is automatically attached to future Clusters that are discovered, navigate to Environment > Configuration > Resource Kind Defaults. Click on Cluster Compute Resource, and on the right pane select SM – Cluster SuperMetrics as the Super Metric Package.




Now that we have created our super metrics and attached the super metric package to the appropriate resources, we are ready to begin editing our “Generic Scoreboard” widgets. I will show you how to edit two widgets (one for a full-clone cluster and one for a linked-clone cluster) with the appropriate data and show their output. We will then replicate the same procedure to make sure we cover every unique full-clone and linked-clone cluster. Here is an example of what the widget for a full-clone cluster should look like:



And here’s an example of what a widget for a linked-clone cluster should look like:




Once we replicate the same process and account for all of our clusters, our end-state dashboard should resemble something like this:



And we are done. A few takeaways from this lesson:

·      We delved into the concept of super metrics in this tutorial. Super metrics are awesome resources that give you the ability to manipulate metrics and display just the data you want. In our examples we created some fairly involved formulas, but a very simple example of why a super metric can be particularly useful is memory: vRealize Operations Manager displays memory metrics in KB, but how do we get it to display GB? Super metrics are your solution here.

·       Obviously, every environment is configured differently and therefore behaves differently, so you will want to tailor the dashboards and widgets to your environment's needs, but at the very least the above examples can be a good starting point for building your own widgets/dashboards.

In my next tutorial, I will walk through the steps for creating a high-level “at-a-glance” VDI dashboard that your operations command center team can monitor. In most organizations, IT issues are categorized on a severity basis and then assigned to the appropriate parties by a central team that runs point on issue resolution by coordinating with different departments. What happens if a Severity 1 issue afflicts your VDI environment? How are these folks supposed to know what to look for before placing that phone call to you? This upcoming dashboard will make it very easy. Stay tuned!!

Thanks once again to Anand for sharing his experiences around using vROps to monitor Virtual Desktop environments. Please leave your comments, thoughts, or questions if you have any. To be a guest blogger on vXpress, see this page.


SHARE & SPREAD THE KNOWLEDGE :-)