Friday, June 30, 2017

Part 7 - vROps 6.6 : Virtual Machine Inventory Summary Report


Welcome back to the What's New series for vRealize Operations Manager 6.6. In the previous parts we talked about the dashboards; with this post, I want to highlight some of the out of the box reports which can help administrators with their day to day activities. A very basic reporting use case is a VI Admin who needs to keep track of all the virtual machines in the environment and regularly share a report on their basic configuration and whereabouts with people within the organization.

With vROps 6.6, you can find a useful out of the box report for this purpose. The report is called :

Virtual Machine Inventory Summary

You can find this report under the Reports section of the new Clarity User Interface. Here are the steps to fetch this report.

1- Log in to your vROps instance with a user with privileges to run reports.

2- Click on Dashboards Menu.

3- On the Navigation Pane click on Reports.

4- Use the search filter to search for a report using the name - "Virtual Machine Inventory Summary".


5- Once you click on the Run Report button highlighted in red, you get a choice to run this report against a particular construct. This could be the vSphere World, a vCenter Server, a particular cluster or even a custom group you want to report on.



6- Click on OK to generate the report.

7- You can view the report in the Generated Reports section and download a PDF version or a CSV file based on how you would like to present this data.



Once you download the report, you will see fields covering all the crucial details about your virtual machines. Here is a quick look at all the fields which are generated in this report:




With this report you can cater to multiple use cases by scheduling these reports for target audiences using the simple report scheduling method in vROps. Some of the basic use cases where this report can be used are:


Use Cases:

a) Keeping an inventory of all or subset of the virtual machines in your environment.

b) Reporting this data to VM or Application Owners.

c) Virtual Machine Life-cycle Management

d) Keeping track of configuration inconsistencies, for example old VMware Tools versions.

e) Tracking OS versions for licensing needs.

and many more.......
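
As a rough illustration of use cases (d) and (e), the CSV export could be post-processed with a short script. This is only a sketch: the column names (`Name`, `Guest OS`, `Tools Version`) and the minimum Tools version are assumptions, so adjust them to match the header row of your own export.

```python
import csv
import io
from collections import Counter

# Assumed column names; check the header row of your own CSV export.
NAME_COL = "Name"
OS_COL = "Guest OS"
TOOLS_COL = "Tools Version"


def parse_version(text):
    """Turn a dotted version string such as '10.1.5' into a tuple of ints."""
    parts = []
    for piece in text.strip().split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def summarize(csv_text, minimum_tools="10.0.0"):
    """Return (VM names with old or unknown Tools, guest OS counts)."""
    minimum = parse_version(minimum_tools)
    outdated = []
    os_counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        os_counts[row[OS_COL]] += 1
        version = parse_version(row[TOOLS_COL])
        # An empty tuple means the version is missing or not numeric.
        if not version or version < minimum:
            outdated.append(row[NAME_COL])
    return outdated, os_counts


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = (
        "Name,Guest OS,Tools Version\n"
        "web-01,Ubuntu Linux (64-bit),10.1.5\n"
        "db-01,Microsoft Windows Server 2012 (64-bit),9.4.10\n"
        "app-01,Ubuntu Linux (64-bit),\n"
    )
    outdated, os_counts = summarize(sample)
    print("VMs to review:", outdated)
    for os_name, count in os_counts.most_common():
        print(f"{count:3d}  {os_name}")
```

Scheduling the report and feeding each CSV through a script like this is one way to turn the inventory into a recurring to-do list for VM owners.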


So go ahead and pull up the inventory summary with vROps 6.6, and use the data to improve your virtual infrastructure and make your day to day work easier.



More to come.. Stay Tuned!!



Wednesday, June 28, 2017

Part 6 - Configuration & Compliance Dashboards in vRealize Operations 6.6.

Welcome to the next post of the series on What's New with vRealize Operations Manager 6.6. In the last few parts of this series, I have been writing about the out of the box dashboards available in vROps 6.6.

In this post I will talk about the last out of the box category, Configuration & Compliance. I have skipped Workload Balance for now as it is more than a dashboard in vRealize Operations 6.6. I will share a series of posts on that topic in the days to come.

Let us focus on the category of Configuration & Compliance. Here is how Configuration & Compliance shows up on the Getting Started Page:




The Configuration and Compliance category caters to the administrators who are responsible for managing configuration drift within a virtual infrastructure. Since most of the issues in a virtual infrastructure are a result of inconsistent configurations, dashboards in this category highlight the inconsistencies at various levels such as Virtual Machines, Hosts, Clusters and Virtual Networks. You can view a list of configuration improvements that help you avoid problems caused by misconfigurations.

Your IT security teams can also measure your environment against the vSphere hardening best practices to ensure that your environment is fully secured and meets all the compliance standards.

Key questions these dashboards help you answer are :

  • Are the vSphere clusters consistently configured for high availability and optimal performance?
  • Are the ESXi hosts consistently configured and available to use?
  • Are the Virtual Machines sized and configured as per recommended best practices?
  • Are virtual switches configured optimally?
  • Is the environment configured in accordance with the vSphere Hardening Guide?


Let us look at each of these dashboards. I will provide a summary of what each dashboard can do for you along with a quick view of it:


Cluster Configuration

The Cluster Configuration Dashboard provides you a quick overview of your vSphere cluster configurations. It highlights the areas which are important to deliver performance and availability to your virtual machines. The dashboard quickly highlights if there are clusters which are not configured for DRS, HA or Admission Control to avoid any resource bottlenecks or availability issues in case of a host failure.

The heatmap on this dashboard quickly identifies hosts where vMotion is not enabled, as this would not allow VMs to move from or to that host. This could cause potential performance issues for the VMs living on that host if the host gets too busy. The dashboard also provides you a quick view of how consistently your clusters are sized and whether the hosts in each of those clusters are consistently configured.

The Cluster Properties view in this dashboard allows you to easily report on all these parameters by simply exporting the data and sharing it with relevant stakeholders within your organization.
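
If you export the cluster properties, the same DRS, HA and Admission Control check can be repeated offline. The snippet below is a minimal sketch of that idea; the `ClusterConfig` fields are hypothetical names chosen for illustration and do not reflect the actual column names of the view.

```python
from dataclasses import dataclass


@dataclass
class ClusterConfig:
    """One row of exported cluster properties (field names are hypothetical)."""
    name: str
    drs_enabled: bool
    ha_enabled: bool
    admission_control_enabled: bool


def find_gaps(clusters):
    """Map each cluster name to the list of features that are switched off."""
    gaps = {}
    for cluster in clusters:
        missing = []
        if not cluster.drs_enabled:
            missing.append("DRS")
        if not cluster.ha_enabled:
            missing.append("HA")
        if not cluster.admission_control_enabled:
            missing.append("Admission Control")
        if missing:
            gaps[cluster.name] = missing
    return gaps


if __name__ == "__main__":
    # Made-up inventory, for illustration only.
    inventory = [
        ClusterConfig("prod-01", True, True, True),
        ClusterConfig("test-01", False, True, False),
    ]
    for name, missing in find_gaps(inventory).items():
        print(f"{name}: not configured for {', '.join(missing)}")
```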





Host Configuration

The Host Configuration dashboard provides you a quick overview of your ESXi host configurations and captures inconsistencies so that you can take corrective actions. Along with configurations, the dashboard measures the ESXi hosts against the vSphere best practices and calls out any deviation which can impact the performance or availability of your virtual infrastructure.

While you can always view this data using the dashboards, the ESXi Configuration view on this page allows you to export this data and share it with the administrators responsible for managing the hosts.



Network Configuration

The Network Configuration dashboard provides a detailed view of virtual switch configuration and utilization. On selecting a virtual switch you can see the list of ESXi hosts, DV port groups and virtual machines which are being served by the selected switch.

You can easily identify any misconfigurations within various network components by reviewing the properties listed in the views within the dashboard. The drill down to the virtual machine levels allows you to track important information such as IP address and MAC address assigned to the virtual machines.

A network administrator can use this dashboard to get visibility into the virtual infrastructure network configuration.



VM Configuration

The Virtual Machine Configuration dashboard focuses on highlighting the key configurations of the virtual machines in your environment. The goal of this dashboard is to help you find inconsistencies of configuration within your virtual machines in order to take quick remediation measures. This helps you safeguard the applications which are hosted on these virtual machines by avoiding potential issues due to misconfigurations. 

Some of the basic issues the dashboard focuses on include identifying VMs running older VMware Tools versions, VMs where VMware Tools is not running, and virtual machines running on large disk snapshots. VMs with such symptoms can lead to potential performance issues, hence it is important to ensure that they do not deviate from the defined standards.

This dashboard is complemented by an out of the box report named "Virtual Machine Inventory Summary" which can be used to report the configurations highlighted on this dashboard for quick remediation.
 




vSphere Hardening Compliance

The vSphere Hardening Compliance dashboard measures your environment against the vSphere Hardening Guide and lists the objects which are non-compliant. You can see the trend of High Risk, Medium Risk and Low Risk violations and see the overall compliance score of your virtual infrastructure.

The dashboard also allows you to drill down into various components to check compliance for your ESXi hosts, Clusters, Port Groups and virtual machines using heatmaps.

Each non-compliant object is listed in the dashboard with recommendations on remediation required to secure your virtual infrastructure.


     

In case you are like me and don't like to READ, you can see the dashboards in action in this video playlist:

See all dashboards in action here.


More to come.. Stay Tuned!!


Monday, June 26, 2017

Part 5 - Performance Troubleshooting Dashboards in vRealize Operations 6.6.

Hope you are enjoying the What's New with vROps 6.6 Series. I am having a great time writing this, since my experience as a user of vROps has completely turned around with this release. In this post, we will continue talking about the rich & use case driven out of the box content available in the form of dashboards.

The Getting Started page in the product acts as an anchor for showcasing all the use cases. The focus of this post is Performance Troubleshooting.

Here is how Performance Troubleshooting shows up on the Getting Started Page:


The Performance Troubleshooting category caters to the administrators responsible for managing the performance & availability of the virtual machines running in the virtual infrastructure. This category runs you through a guided workflow to answer questions which will help you with the troubleshooting process. The dashboards in this category identify and isolate problems that may impact your applications. They provide a line of sight into the full stack to isolate and identify the root cause quickly.


Key questions these dashboards help you answer are :

Is application performance impacted due to virtual infrastructure?

Are noisy neighbors impacting multiple virtual machines and corresponding applications?

Are there active alerts which require action?

Are there any known issues impacting the performance & availability of a vSAN cluster?


Let us look at each of these dashboards. I will provide a summary of what each dashboard can do for you along with a quick view of it:



Troubleshoot a VM

The Troubleshoot a VM dashboard helps a VI Administrator to deal with day to day troubleshooting of issues in a virtual infrastructure. While most of the IT issues in an organization are reported at the application layer, this dashboard provides a guided workflow which can help investigate an ongoing or a suspected issue with virtual machines supporting the impacted applications.

You can easily search for a virtual machine by its name or can sort the list of VMs with active alerts on them to start your troubleshooting process. As soon as you select a VM, you can view its key properties to ensure they are configured as per your virtual infrastructure design. Any deviation from standards could cause potential issues. You can view any known alerts, the workload trend of the VM over the past week and if any of the resources serving the virtual machine has any ongoing issue.

The next step in the troubleshooting process allows you to eliminate the major symptoms which might impact the performance or availability of a virtual machine. You can drill down further into the key metrics to find out if the VM's utilization patterns are abnormal or if it is contending for basic resources such as CPU, Memory or Disk.




Troubleshoot a Cluster

The Troubleshoot a Cluster Dashboard provides you a guided workflow to identify issues and isolate them easily. You can either start with a cluster which happens to have an issue by using the search option or you can simply sort your clusters with the number of active alerts on them.

On selecting a specific cluster you want to work with, you can see a quick summary of the number of hosts participating in that cluster and the VMs being served by them. The dashboard shows you current and past utilization trends that indicate how hard your cluster is working, along with the known problems on the cluster in the form of alerts.

You can easily view the hierarchy of objects related to the cluster and review their status to identify if they are impacted due to the current health of the cluster. You can quickly identify any contention issues by looking at the max and avg. contention faced by the virtual machines on the selected cluster. The dashboard allows you to drill down to specific virtual machines which might be a victim of resource contention and take your next steps in the troubleshooting process to cater to those victims and avoid issues proactively. 



Troubleshoot a Datastore

The Troubleshoot a Datastore dashboard provides an administrator with a guided workflow to quickly identify storage issues and act on them. Based on your troubleshooting style you can either start with a datastore which might be in trouble due to high latency and is showing red on the heatmap, or you can search for a datastore which you have in mind. You can also sort all the datastores by active alerts and start working your way through the datastores with known problems.

On selecting a datastore you see its current capacity and utilization along with a count of VMs served by that datastore. The metric charts help you view historical trends of key storage metrics such as latency, outstanding IOs and throughput.

The dashboard also lists the virtual machines served by the selected datastore and helps you analyze the utilization and performance trends of those virtual machines. If the virtual machines are suffering, the VI administrator can migrate them to other datastores to spread the IO load evenly.



Troubleshoot a Host

Since ESXi servers are the main source of resources for a virtual machine, they become extremely critical when it comes to performance and availability. With the Troubleshoot a Host dashboard, you can either search for a specific host which you have in mind or sort the hosts by active alerts to start your investigation.

As soon as you select a host, you can see its key properties to ensure they are configured as per your virtual infrastructure design. Any deviation from standards could cause potential issues. You can answer some key questions around current and past utilization, workload trends over the last week and whether the virtual machines served by the host are healthy.

Hardware faults with the hosts can be easily surfaced on this dashboard since it lists all the critical events which might affect the availability of the hosts. If you find a host which is running hot, the next logical step in the troubleshooting process would be to find out the villain virtual machines which might be consuming resources from that host. You can find a list of top 10 virtual machines which are demanding CPU and Memory Resources from the identified host.




Troubleshoot vSAN

The Troubleshoot vSAN dashboard is designed to help a vSAN administrator step through a guided workflow to investigate potential issues with each layer of vSAN. The dashboard allows you to start with looking at key properties of your vSAN cluster along with the active alerts on any of the cluster components such as hosts, disk groups or the vSAN datastores.

Once you select a cluster, you can list all the known problems with all the objects which are associated with that cluster. This includes clusters, datastores, disk groups, physical disks and, most importantly, the virtual machines which are being served by the selected vSAN cluster.

The dashboard then drills down into the key utilization and performance metrics and shows you a trend of how the cluster has been used and has performed over the past 24 hours. You can easily go back in time if you are dealing with historical issues. While most of the problems would be surfaced up at the cluster level, a drill down analysis can be done at the host, disk group or down to the physical disk level.

Heatmaps within the dashboard help you answer questions around write buffer usage, cache hit ratio, host configurations and physical issues with capacity and cache disks such as drive wearout, drive temperature and read-write errors.





Troubleshoot with Logs

The Troubleshoot with Logs dashboard can be used when you want to investigate an ongoing issue within your virtual infrastructure using the logs. This dashboard lets you look at predefined views created within Log Insight, which answer common questions through predefined Log Insight queries.

With this dashboard, you can correlate metrics and queries within vRealize Operations Manager on a single pane of glass to troubleshoot issues across applications and infrastructure.





In case you are like me and don't like to READ, you can see the dashboards in action in this video playlist:

See all dashboards in action here.



More to come.. Stay Tuned!!




Thursday, June 22, 2017

Part 4 - Capacity & Utilization Dashboards in vRealize Operations 6.6.

Welcome back to the vROps 6.6 What's New Series. In the last post of this series we took a tour of the dashboards which help you run production IT Operations.

In this post we will move a step forward and cater to the personas in an organization who are responsible for managing existing capacity and planning future capacity for the Software Defined Datacenter. The set of dashboards which can help these roles in an organization are grouped under the "Capacity & Utilization" category:

Here is how Capacity & Utilization shows up on the Getting Started Page:


As mentioned before, the Capacity and Utilization category caters to the teams responsible for tracking the utilization of the provisioned capacity in their virtual infrastructure. The dashboards within this category allow you to take capacity procurement decisions, reduce wastage through reclamation, and track usage trends to avoid performance problems due to capacity shortfalls.

Key questions these dashboards help you answer are :

  • How much capacity do I have, how much is used and what are the usage trends for a specific vCenter, datacenter or cluster?
  • How much Disk, vCPU or Memory can I reclaim from large VMs in my environment to reduce wastage & improve performance?
  • Which clusters have the highest resource demands?
  • Which hosts are being heavily utilized and why?
  • Which datastores are running out of disk space and who are the top consumers?
  • How is the storage capacity & utilization of my vSAN environment along with savings achieved by enabling deduplication and compression?

Let us look at each of these dashboards. I will provide a summary of what each dashboard can do for you along with a quick view of it:


Capacity Overview

The Capacity Overview Dashboard provides you a summary of the total physical capacity available across all your environments being monitored by vRealize Operations Manager. The dashboard provides a summary of CPU, Memory and Storage Capacity provisioned along with the resource reclamation opportunities available in those environments.

Since capacity decisions are mostly tied to logical resource groups, this dashboard allows you to assess capacity and utilization at each resource group level such as vCenter, Datacenter, Custom Datacenter or vSphere Cluster. You can quickly select an object and view its total capacity and used capacity to understand the current capacity situation. Capacity planning requires visibility into historical trends and future forecasts, so the trend views within the dashboard provide this information to predict how soon you will run out of capacity.
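
To make the idea of a trend-based forecast concrete, here is a minimal sketch that fits a straight line to daily usage samples and estimates how many days remain before usage reaches total capacity. This is a simplification for illustration only and is not the forecasting method used by vRealize Operations Manager; the function name and sample numbers are made up.

```python
def days_until_full(used_history, total_capacity):
    """Estimate days until usage reaches capacity using a least-squares line.

    used_history holds one usage sample per day, oldest first.
    Returns None when usage is flat or shrinking.
    """
    n = len(used_history)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(used_history) / n
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in enumerate(used_history)
    )
    variance = sum((x - mean_x) ** 2 for x in range(n))
    slope = covariance / variance
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Day index at which the fitted line crosses total capacity.
    crossing = (total_capacity - intercept) / slope
    # Count from the most recent sample; never report a negative value.
    return max(0.0, crossing - (n - 1))


if __name__ == "__main__":
    # Hypothetical used storage in GB over four consecutive days.
    history = [100, 110, 120, 130]
    print(days_until_full(history, total_capacity=200))
```

A straight line is the simplest possible model; real usage often has weekly cycles or step changes, which is why a purpose-built capacity engine is preferable for actual planning.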

If you plan to report the current capacity situation to others within your organization, you can simply expand the Cluster Capacity Details view on this dashboard and export this as a report for sharing purposes.




Capacity Reclaimable

The Capacity Reclaimable Dashboard provides you a quick view of resource reclamation opportunities within your virtual infrastructure. This dashboard is focused on improving the efficiency of your environment by reducing the wastage of resources. While this wastage is usually caused by idle or powered off virtual machines, another major contributor is oversized virtual machines.

This dashboard allows you to select an environment and quickly view the amount of capacity which can be reclaimed from the environment in form of reclaimable CPU, Memory and Disk Space.

You can start with the view which lists all the virtual machines that are running on old snapshots or are powered off. These VMs give you the opportunity to reclaim storage by deleting the old snapshots on them or by deleting the unwanted virtual machines. You can take these actions right from this view by using the actions framework available within vRealize Operations Manager.

The dashboard provides you recommended best practices around reclaiming CPU and Memory from large virtual machines in your environment. Since large and oversized virtual machines can increase contention between VMs, you can use the phased approach of using aggressive or conservative reclamation techniques to right size your virtual machines.




vSAN Capacity Overview

The vSAN Capacity Overview dashboard provides an overview of vSAN storage capacity along with savings achieved by enabling dedupe and compression across all your vSAN clusters.

The dashboard allows you to answer key questions around capacity management such as total provisioned capacity, current and historical utilization trends and future procurement requirements. You can view things like capacity remaining, time remaining and storage reclamation opportunities to take effective capacity management decisions.

The dashboard also focuses on how vSAN is using the disk capacity by showing you a distribution of utilization amongst vSAN disks. You can view these details either as an aggregate or at individual cluster level.




Datastore Utilization

The Datastore Utilization dashboard is a quick and easy way to identify storage provisioning and utilization patterns in a virtual infrastructure. It is a best practice to have standard datastore sizes to ensure you can easily manage storage in your virtual environments. The heatmap on this dashboard plots each and every datastore monitored by vRealize Operations Manager and groups them by clusters.

The utilization pattern of these datastores is depicted by colors, where grey represents an underutilized datastore, red represents a datastore running out of disk space and green represents an optimally used datastore.
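
The color scheme can be thought of as a simple classification on the used-space ratio. The sketch below shows the idea; the 30% and 85% thresholds are assumptions picked for illustration, not the values the dashboard actually uses.

```python
def datastore_color(used_gb, capacity_gb, low=0.30, high=0.85):
    """Classify a datastore by its used-space ratio.

    The low and high thresholds are hypothetical defaults.
    """
    if capacity_gb <= 0:
        raise ValueError("capacity must be positive")
    ratio = used_gb / capacity_gb
    if ratio >= high:
        return "red"    # running out of disk space
    if ratio < low:
        return "grey"   # underutilized
    return "green"      # optimally used


if __name__ == "__main__":
    # Made-up datastores: name -> (used GB, capacity GB).
    datastores = {"ds-gold-01": (1800, 2000), "ds-bronze-07": (200, 2000)}
    for name, (used, capacity) in datastores.items():
        print(name, datastore_color(used, capacity))
```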

By selecting a datastore, you can see the past utilization trend and forecasted usage. The view within the dashboard lists all the virtual machines running on the selected datastore and gives you the opportunity to reclaim storage used by large virtual machine snapshots or powered off VMs.
You can use the vRealize Operations Manager action framework to quickly reclaim resources by deleting the snapshots or unwanted powered off VMs.





Cluster Utilization

The Cluster Utilization dashboard allows you to identify the vSphere clusters that are being heavily consumed from a CPU, memory, disk, and network perspective. High or unexpected resource usage on clusters may result in performance bottlenecks. Using this dashboard you can quickly identify the clusters which are struggling to keep up with the virtual machine demand.

On selecting a cluster with high CPU, Memory, Disk or Network demand, the dashboard provides you with the list of ESXi hosts that are participating in the given cluster. If you notice imbalance between how the hosts within the selected clusters are being used, you might have an opportunity to balance the hosts by moving virtual machines within the cluster.
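
One simple way to quantify such an imbalance is to compare the busiest and least busy host in the cluster. The following is a minimal sketch under that assumption; the host names and demand percentages are made up, and the dashboard itself may present imbalance differently.

```python
def imbalance(host_demand):
    """Return (busiest host, least busy host, gap in percentage points).

    host_demand maps a host name to its demand as a percentage.
    """
    if not host_demand:
        raise ValueError("no hosts supplied")
    busiest = max(host_demand, key=host_demand.get)
    idlest = min(host_demand, key=host_demand.get)
    return busiest, idlest, host_demand[busiest] - host_demand[idlest]


if __name__ == "__main__":
    # Hypothetical CPU demand (%) per host in one cluster.
    cpu_demand = {"esx-01": 90.0, "esx-02": 35.0, "esx-03": 50.0}
    busiest, idlest, gap = imbalance(cpu_demand)
    print(f"{busiest} is {gap:.0f} points busier than {idlest}")
```

A large gap suggests there is room to move virtual machines from the busiest host to the least busy one before adding capacity.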

In situations where cluster demand has been chronically high, virtual machines should be moved out of these clusters using Workload Balance to avoid potential performance issues. If such patterns are observed on all the clusters in a given environment, it indicates that new capacity might be required to cater to the increase in demand.




Heavy Hitter VMs

The Heavy Hitter VMs dashboard helps you identify virtual machines which are consistently consuming a high amount of resources from your virtual infrastructure. In heavily overprovisioned environments, this might create resource bottlenecks resulting in potential performance issues.
With the use of this dashboard you can easily identify the resource utilization trends of each of your vSphere clusters. Along with the utilization trends, you are also provided with a list of virtual machines within those clusters based on their demand for CPU, Memory, Disk and Network resources. The views also analyze the workload pattern of these VMs over the past week to identify heavy hitter VMs which might be running a sustained heavy workload (measured over a day) or bursty workloads (measured using peak demand).
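
The difference between a sustained and a bursty heavy hitter comes down to looking at the average versus the peak of the same samples. Here is a minimal sketch of that distinction; the 80% and 95% thresholds and the sample values are assumptions for illustration, not the values used by the views.

```python
from statistics import mean


def classify_workload(samples, sustained_threshold=80.0, burst_threshold=95.0):
    """Label one day of demand samples (in percent) for a single VM.

    'sustained' means the daily average is high; 'bursty' means the
    average is moderate but the peak is high. Thresholds are hypothetical.
    """
    if not samples:
        raise ValueError("no samples supplied")
    if mean(samples) >= sustained_threshold:
        return "sustained"
    if max(samples) >= burst_threshold:
        return "bursty"
    return "normal"


if __name__ == "__main__":
    # Made-up CPU demand samples for three VMs.
    print(classify_workload([85, 90, 88, 92]))   # high all day
    print(classify_workload([20, 25, 99, 30]))   # short spike
    print(classify_workload([20, 30, 40, 35]))   # unremarkable
```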

You can export the list of offenders using these views and take appropriate actions to distribute this demand and reduce potential bottlenecks. 




Host Utilization

The Host Utilization dashboard allows you to identify the hosts that are being heavily consumed from a CPU, memory, disk, and network perspective. High or unexpected resource usage on hosts may result in performance bottlenecks. Using this dashboard you can quickly identify the hosts which are struggling to keep up with the virtual machine demand. The dashboard also provides you with the list of top 10 virtual machines to easily identify the source of this unexpected demand and take appropriate actions.

Since the demand for resources fluctuates over time, the dashboard allows you to look at demand patterns over the last 24 hours to identify hosts which might have a chronic history of high demand. In such cases virtual machines should be moved out of these hosts to avoid potential performance issues. If such patterns are observed on all the hosts of a given cluster, it indicates that new capacity might be required to cater to the increase in demand.



VM Utilization

The VM Utilization Dashboard helps the VI Administrator to capture the utilization trends of any virtual machine in their environment. The primary use case is to list down the key properties of a virtual machine and the resource utilization trends for a specific time period and share the same with the VM/Application owners.

The VM/Application owners often want to look at the resource utilization trends for specific time periods where they are expecting high load on applications, for example during batch jobs, backup schedules or load testing. The application owners want to ensure that VMs are not consuming 100% of the provisioned resources during these periods, as that could lead to resource contention within applications and cause performance issues.




In case you are like me and don't like to READ, you can see the dashboards in action in this video playlist:

See all dashboards in action here.


More to come.. Stay Tuned!!


Monday, June 19, 2017

Part 3 - Operations Dashboards in vRealize Operations 6.6.

In my last post I gave you an overview of the new user interface of vRealize Operations 6.6 along with some other important enhancements. Do go through that post to get a context of what we are going to discuss here.

With the introduction of the Getting Started page within dashboards, one of the categories available out of the box is called "Operations".

Here is how Operations shows up on the Getting Started Page:




The Operations category is most suitable for roles within an organization who require a summary of important data points to take quick decisions. This could be a member of a NOC team who wants to quickly identify issues and take actions, or executives who want a quick overview of their environments to keep a track of important KPIs.

Key questions these dashboards help you answer are :
  • What does the infrastructure inventory look like?
  • What is the alert volume trend in the environment?
  • Are virtual machines being served well?
  • Are there hot-spots in the datacenter I need to worry about?
  • What does the vSAN environment look like and are there optimization opportunities by migrating VMs to vSAN?

Let us look at each of these dashboards. I will provide a summary of what each dashboard can do for you along with a quick view of it.

Datastore Usage Overview


The Datastore Usage Dashboard is suitable for a NOC environment. The dashboard provides a quick glimpse of all the virtual machines in your environment using a heatmap. Each virtual machine is represented by a box on the heatmap. Using this dashboard, a NOC administrator can quickly identify virtual machines which are generating high IOPS since the boxes representing the virtual machine are sized by the number of IOPS they are generating.


Along with the storage demand, the color of the boxes represents the latency experienced by these virtual machines from the underlying storage. A NOC administrator can take the next steps in his investigation to find the root cause of this latency and resolve it to avoid potential performance issues. 



Host Usage Overview



The Host Usage Dashboard is suitable for a NOC environment. The dashboard provides a quick glimpse of all the ESXi hosts in your environment using a heatmap. Using this dashboard the NOC administrator can easily find resource bottlenecks in your environment created due to high Memory Demand, Memory Consumption or CPU Demand.


Since the hosts in the heatmap are grouped by clusters, you can easily find out if you have clusters with high CPU or Memory load. It can also help you identify ESXi hosts within a cluster which are not evenly utilized, so that an admin can trigger activities such as workload balance or enable DRS to ensure that hotspots are eliminated.



Operations Overview



The Operations Overview dashboard provides a high level view of the objects which make up your virtual environment. It provides you an aggregate view of virtual machine growth trends across the different datacenters being monitored by vRealize Operations Manager.

The dashboard also provides a list of all your datacenters along with inventory information about how many clusters, hosts and virtual machines you are running in each of your datacenters. By selecting a particular datacenter you can zoom into the areas of availability and performance. The dashboard provides a trend of known issues in each of your datacenters based on the alerts which have triggered in the past.

Along with the overall health of your environment, the dashboard also allows you to zoom in at the Virtual Machine level and list out the top 15 virtual machines in the selected datacenter which might be contending for resources.




Optimize vSAN Deployments



The Optimize vSAN Deployments dashboard is an easy way to devise a migration strategy for moving virtual machines from your existing storage to your newly deployed vSAN storage. The dashboard lets you select your non vSAN datastores which might be struggling to serve the virtual machine IO demand. By selecting the VMs on a given datastore, you can easily identify the historical IO demand and latency trends of a given virtual machine.

You can then find a suitable vSAN datastore which has the space and the performance characteristics to serve the demand of this VM. With a simple move operation within vRealize Operations Manager, you can move the virtual machine from the existing non vSAN datastore to the vSAN datastore.

Once the VM is moved, you can continue to watch the utilization patterns to see how the VM is being served by vSAN.

vSAN Operations Overview


The vSAN Operations Overview Dashboard provides an aggregated view of health and performance of your vSAN clusters. While you can get a holistic view of your vSAN environment and what components make up that environment, you can also see the growth trend of virtual machines which are being served by vSAN.

The goal of this dashboard is to help you understand the utilization and performance patterns for each of your vSAN clusters by simply selecting one from the provided list. vSAN properties such as Hybrid or All Flash, Dedupe & Compression or a Stretched vSAN cluster can be easily tracked through this dashboard.

Along with the current state, the dashboard also provides you a historic view of performance, utilization, growth trends and events related to vSAN.

In case you are like me and don't like to READ, you can see the dashboards in action in this video playlist:

See all dashboards in action here.



More to come.. Stay Tuned!!