Wednesday, October 5, 2016

Did You Know #2 - Leveraging vROps Remote Collectors for Local Adapters!

In this part of the "Did You Know" series, I will talk about a small architectural tip that can improve the performance of your vRealize Operations Manager cluster. It can also save you from up-sizing the cluster from, let's say, medium to large nodes, which at the end of the day saves a ton of CPU and memory.

Did you know that vRealize Operations Manager uses Remote Collectors to collect data from a remote datacenter and send it over to the centralized vROps cluster? The diagram below shows the actual purpose for which the remote collector was introduced in vRealize Operations Manager:

In the above example, we have a vROps cluster in Site A. This cluster consists of 2 or more nodes, each of which has a local collector module. This collector module collects the data from the local data sources, which are also known as adapter instances. Some examples of adapter instances would be the vCenter Adapter, the NSX Adapter, MPSD (Management Pack for Storage Devices), etc.

The nodes of the cluster here have multiple roles to play. They not only collect the data from the data sources, they also have to crunch this data using the analytics engine, calculate dynamic thresholds, run the capacity engine and host all the data through the CASA and Web UI.

On the other hand, in Site B we have a remote collector group with 2 or more remote collectors (in HA mode). Their role is to collect the data from the Site B data sources using the collector framework on each node and send that data over to the centralized cluster in Site A. The Remote Collectors are small form factor vROps appliances which are stateless, and the only role they have is to collect data. Here are a few facts which make them great for playing the role of a collector from a sizing standpoint.

They come in 2 form factors ***:

SMALL: 2 vCPU / 4 GB RAM. A small RC can collect 1,500 objects (an object can be a VM, datastore, ESXi host, LUN, etc.) and up to 600,000 metrics (1 VM usually creates around 250 metrics).

LARGE: 4 vCPU / 16 GB RAM. A large RC can collect 12,000 objects and up to 3,500,000 metrics.

We all know that the main cluster nodes of vROps can also do collection. As per the sizing guidelines, a medium node can collect up to 7,000 objects, while a large cluster node, which is 16 vCPU / 48 GB of RAM, can collect up to 12,000 objects.
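To make the sizing figures above easier to play with, here is a minimal Python sketch that picks the smallest remote collector form factor for a given load. The limits are only the numbers quoted in this post, and the names `CollectorSize`, `REMOTE_COLLECTORS` and `pick_remote_collector` are made up for illustration; they are not part of any vROps API. Applying the 250 metrics per VM estimate to every object type is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CollectorSize:
    name: str
    vcpu: int
    ram_gb: int
    max_objects: int
    max_metrics: int


# Limits as quoted in this post; check the current sizing guidelines before relying on them.
REMOTE_COLLECTORS = [
    CollectorSize("small", 2, 4, 1_500, 600_000),
    CollectorSize("large", 4, 16, 12_000, 3_500_000),
]

METRICS_PER_VM = 250  # rough estimate from the post; other object types may differ


def pick_remote_collector(objects: int, metrics: int) -> Optional[CollectorSize]:
    """Return the smallest form factor that fits both limits, or None if none fits."""
    for size in REMOTE_COLLECTORS:  # listed smallest first
        if objects <= size.max_objects and metrics <= size.max_metrics:
            return size
    return None


# Example: roughly 7,200 objects, assuming ~250 metrics each
choice = pick_remote_collector(7_200, 7_200 * METRICS_PER_VM)
print(choice.name if choice else "split the load across more collectors")
```

A `None` result simply means the load does not fit a single collector of either size as quoted here.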

***Reference VMware KB -


Now imagine a scenario where you have a cluster of 4 medium nodes in Site A. You have a vCenter Adapter instance which has more than 7,000 objects (5,000 VMs, 2,000 datastores, 200 ESXi hosts, etc.). In such a situation, in order to collect data from this vCenter Adapter instance, you would have to up-size the collecting node to a large node. Since vROps cluster nodes have to be symmetrical, you would have to up-size all your cluster nodes to large nodes. In this situation you would have to invest in 32 more vCPUs (8 per cluster node to reach 16 vCPUs) and 64 GB more RAM (16 GB per node to reach 48 GB per cluster node). In most cases this is a huge change, since you would have to ensure you have enough resources in the underlying cluster. In some cases you might also go beyond the NUMA boundary, which we all know has some performance impact from a CPU standpoint.
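The scale-up arithmetic can be checked with a few lines of Python. This is only a sketch: the medium node size (8 vCPU / 32 GB) is inferred from the deltas quoted above, and `upsize_cost` is a hypothetical helper name.

```python
# Node sizes implied by the figures in this post (medium = large minus the quoted deltas).
MEDIUM = {"vcpu": 8, "ram_gb": 32}
LARGE = {"vcpu": 16, "ram_gb": 48}


def upsize_cost(node_count, current, target):
    """Extra resources needed to resize every node, since cluster nodes must be symmetrical."""
    return {key: node_count * (target[key] - current[key]) for key in target}


extra = upsize_cost(4, MEDIUM, LARGE)
print(extra)  # {'vcpu': 32, 'ram_gb': 64}
```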

With all these concerns in place, this is an excellent opportunity to leverage the Remote Collectors in the local Site A as well. While the name says REMOTE, it is not necessary that remote collectors are deployed only on remote sites. They can also be used in a local site to collect data from adapter instances which are large in size. Taking our example into consideration, we would just need 2 large Remote Collectors (2 for high availability, in case one fails) to collect from the Site A vCenter. These 2 appliances will only cost 8 vCPUs and 32 GB of RAM in total. That is a quarter of the extra vCPUs and half of the extra RAM that scaling up the cluster would need. It also takes the collection load off your cluster nodes, so all that CPU and RAM can be used by the other roles on the vROps nodes, which will eventually give better performance.
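Here is a rough Python comparison of the two options, using only the numbers from this example. The scale-up cost (32 vCPU / 64 GB) is taken as given from the scenario above, and the helper names `collector_group_cost` and `savings_pct` are hypothetical.

```python
LARGE_RC = {"vcpu": 4, "ram_gb": 16}          # large remote collector, as quoted in this post
SCALE_UP_EXTRA = {"vcpu": 32, "ram_gb": 64}   # extra cost of resizing 4 medium nodes to large


def collector_group_cost(count, size):
    """Total resources for a group of identical remote collectors."""
    return {key: count * value for key, value in size.items()}


def savings_pct(baseline, alternative):
    """Percentage saved per resource when choosing the alternative over the baseline."""
    return {key: 100 * (baseline[key] - alternative[key]) / baseline[key] for key in baseline}


rc_cost = collector_group_cost(2, LARGE_RC)
print(rc_cost)                               # {'vcpu': 8, 'ram_gb': 32}
print(savings_pct(SCALE_UP_EXTRA, rc_cost))  # {'vcpu': 75.0, 'ram_gb': 50.0}
```

Note that this only compares the additional hardware for each option; the existing 4 medium nodes stay in place either way.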

So here would be the new architecture with Remote Collectors Everywhere!!!

With this model, we get better performance, more scale and lower hardware requirements for large vROps deployments. Another important thing to note is that you can always migrate from Design 1 to Design 2, or from Design 2 to Design 1, without any downtime or data loss. Hence, if you are about to scale the environments being monitored by vROps, this tech tip should be very useful for you.

Hope this helps with day to day datacenter operations using vRealize Operations Manager.

Stay tuned for more goodies!

1 comment:

  1. I'm thinking about re-engineering our vROps environment exactly like this. Allows you to run much smaller nodes and should hopefully create a more responsive console as well.

    We currently have 5 VC's being monitored by 2 massive nodes, I'm thinking of creating a data collector node for each VC and then have 1 management node.