Thursday, February 6, 2014

Part 6 (Final) - Architecting vSphere for Business Critical Apps - A Scoop From my vForum Prezo!

This article sheds light on architecting vSphere infrastructure for running workloads which are critical to your business. With it, I will end the series I have been writing on Architecting vSphere Infrastructure. I thought I would conclude with a topic I am interested in evangelizing in the days to come, as I not only see a huge learning & business opportunity in this area, but also the next major shift which virtualization is going to bring.

Before we begin, here are the links to the first 5 articles in this series in case you are interested:







While creating my vForum presentation (see Part 1 for more details), I realized that a number of people in the crowd would be interested in discussing the virtualization of their business critical workloads, since they have already converted all of their Tier 2 & Tier 3 applications running on physical x86 boxes into virtual machines.

In this article we will look at some of the grey areas which we should be aware of when planning to virtualize business critical applications on a vSphere environment. Let me also highlight that these are general recommendations which may or may not suit every application/workload, but they will help you iron out a lot of the issues which I have seen organizations face while taking on this task.

As always, let us first look at the single slide which I used during my presentation and then we will discuss each point as we move along.



While the above slide is quite self-explanatory, I would like to elaborate on a few of the points which need more explanation, both to put my point across and because of recent changes in the vSphere 5.5 platform.

FOLLOW ASTBUL - This is not technical jargon or an industry standard term. I coined it to ensure that we do not rush into virtualization of business critical workloads. ASTBUL stands for ASSESS, SETUP, TEST, BENCHMARK, UAT & LIVE. When organizations begin with virtualization, they choose the low hanging fruit: workloads which are not only easy to virtualize, but can also live with downtime if it happens. We all know that it is not rocket science to run a P2V conversion tool like VMware Converter to virtualize physical workloads. Similarly, organizations with a decent virtualization footprint usually have a Virtual First policy, which leads to most new workloads being virtualized as well. What does this mean to the C-Level people? Savings, in fact let me say HUGE SAVINGS.

While this was all good for Tier 2 and Tier 3 workloads, such an approach CANNOT be followed for applications/servers which are critical to an organization. What I mean is that you cannot, or rather should not, throw business critical workloads onto a virtual platform which is not architected or ready for them. If you do, you are making sure that all hell will break loose, and your whole idea of agility, savings, mobility etc. with virtualization will be deemed foolish.

With BCA (Business Critical Applications) you need to ensure that you follow the strategy of ASTBUL. You begin with an ASSESSMENT of the workload with tools like VMware Capacity Planner, SETUP a parallel-to-production environment in isolation after inputs from application owners and hardware vendors, TEST the workloads in this environment to the maximum limit to see the Highs & Lows of compute, storage & network performance, BENCHMARK & publish the performance results to the business to get a buy-in, run a User Acceptance Test (UAT) to get a buy-in from the end users of the application, and then finally pull the plug on that EXPENSIVE physical server and make your Virtual Workload LIVE. With this approach you will be successful 99% of the time. I believe in luck, so I will leave the remaining 1% to LUCK :-)


PLAY SMARTLY - The credit for the second point on the slide goes to Michael Webster of Long White Virtual Cloud fame. Michael needs no introduction for people who are active in the Virtualization Community, but for those who are not, he is the first VCDX in New Zealand and one of the finest craftsmen when it comes to Architecting vSphere Environments for running Business Critical Workloads. I would highly recommend you visit his blog to see all the great work he has done around Virtualization. In one of his posts, Michael says that if you are virtualizing applications which are network intensive and need to talk to other VMs in the same VLAN, you should try to place those VMs in an affinity rule. This will ensure that they run on the same ESXi host and are part of the same PortGroup/dvSwitch. In this way their network traffic never leaves the ESXi host and is transmitted at the speed of memory between the VMs. This will not only save your network from choking but will also remove any performance bottlenecks which might result from sharing physical up-links.

There are a number of other tips & tricks which can help you get the best results for your business critical workloads. It is important that you TEST & BENCHMARK them as mentioned in the last point.


DE-MYSTIFYING NUMA - NUMA a.k.a. Non-Uniform Memory Access is another critical area to consider, especially when you are virtualizing BCA. There are a number of articles out there which discuss NUMA and I will not re-invent the wheel here by writing another one. Let's first look at those articles and then I will add my 2 cents by trying to simplify & summarize the concept for you.

To begin with, you should read this article from Wikipedia to understand what NUMA is. Then you should understand what vNUMA is and then read the following 3 articles:-




All the above mentioned articles are amazing as they clearly explain the concept of NUMA, vNUMA & the effects of assigning cores instead of sockets. The bottom line is this :-

  • Ensure that you do not cross the physical NUMA boundaries if possible. In case you need to, GO AHEAD, but ensure that you have all the pre-requisites in place for vNUMA to do the smart scheduling for you.
  • Up to vSphere 5.1, vNUMA won't kick in if you use more than 1 core per socket while allocating vCPUs AND you are crossing the NUMA boundary with your overall allocation. So do not use this setting unless you need to save on CPU licenses for the application and you can take the performance hit.
  • The point above no longer applies with vSphere 5.5, as vNUMA in 5.5 is smart enough to allocate the required number of vCPUs with the best possible combination of cores and sockets as per the architecture of the physical NUMA. Hence even if you goof up during allocation, vSphere will take care of it. For example, if you assign 1 CPU with 16 cores, it will automatically size the machine as 2 CPUs with 8 cores each.
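
To make the NUMA boundary check in the bullets above concrete, here is a minimal sketch in Python. It is only back-of-the-envelope arithmetic, not anything vSphere runs internally. The `Host` class and the function names are hypothetical, and the sketch assumes one NUMA node per physical socket, memory split evenly across nodes, and hyper-threads not counted as cores.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Host:
    """Hypothetical description of an ESXi host (not a vSphere API object)."""
    sockets: int           # physical sockets; assumed one NUMA node per socket
    cores_per_socket: int  # physical cores per socket, hyper-threads not counted
    memory_gb: int         # total RAM, assumed evenly split across NUMA nodes

    @property
    def node_memory_gb(self) -> float:
        return self.memory_gb / self.sockets


def numa_nodes_needed(host: Host, vcpus: int, vm_memory_gb: int) -> int:
    """Smallest number of NUMA nodes that can hold the VM's vCPUs and memory."""
    by_cpu = ceil(vcpus / host.cores_per_socket)
    by_memory = ceil(vm_memory_gb / host.node_memory_gb)
    return max(by_cpu, by_memory)


def suggest_topology(host: Host, vcpus: int, vm_memory_gb: int) -> tuple[int, int]:
    """Return (virtual sockets, cores per socket) aligned to the physical NUMA layout."""
    nodes = numa_nodes_needed(host, vcpus, vm_memory_gb)
    if nodes > host.sockets:
        raise ValueError("VM is larger than the host")
    if vcpus % nodes != 0:
        raise ValueError("vCPU count does not split evenly across NUMA nodes")
    return nodes, vcpus // nodes


if __name__ == "__main__":
    host = Host(sockets=2, cores_per_socket=8, memory_gb=256)
    # 8 vCPUs / 64 GB fits inside a single NUMA node
    print(numa_nodes_needed(host, 8, 64))   # 1
    # 16 vCPUs cross the boundary, so split them as 2 sockets x 8 cores
    print(suggest_topology(host, 16, 64))   # (2, 8)
```

Note that memory can push a VM across the boundary just as easily as vCPUs: on the same host, an 8 vCPU VM with 192 GB of RAM needs two nodes even though its vCPUs would fit in one.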


CRITICAL DOES NOT MEAN COMPLEX - When you speak to people about virtualizing business critical applications, the first thing which comes to their mind is "This has to be a complex architecture". This is where things start to become a little overwhelming, as architects try to make use of each and every BEST PRACTICE and feature available. A classic example would be running OS & Application clustering over and above vSphere HA. You need to go to the business and ask them about the up-time requirement vs the cost of providing that up-time, and you will almost always find that they are okay with an HA restart rather than a complex and expensive clustering technology which is difficult to deploy, manage and troubleshoot. Keep things as simple as possible & you will get the best results from your efforts of running business critical applications in a virtual machine.
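
One way to frame the up-time conversation above is to turn the availability target into a yearly downtime budget and compare it with the outage you would expect from HA restarts. The sketch below is just that arithmetic in Python. The function names are hypothetical, and the failure rate and restart time are made-up inputs which you would replace with your own numbers.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(uptime_percent: float) -> float:
    """Unplanned downtime allowed per year for a given up-time target."""
    return (1 - uptime_percent / 100) * MINUTES_PER_YEAR


def ha_restart_fits(uptime_percent: float,
                    host_failures_per_year: float,
                    restart_minutes: float) -> bool:
    """True if the expected outage from HA restarts stays within the budget."""
    expected_outage = host_failures_per_year * restart_minutes
    return expected_outage <= downtime_budget_minutes(uptime_percent)


if __name__ == "__main__":
    # Assumed inputs: 2 host failures a year, 10 minutes until the app is back
    print(round(downtime_budget_minutes(99.9), 1))  # 525.6
    print(ha_restart_fits(99.9, 2, 10))             # True
    print(ha_restart_fits(99.999, 2, 10))           # False
```

With these assumed numbers a 99.9% target leaves plenty of room for a plain HA restart, and only a much stricter target starts to justify the cost of additional clustering.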


VIRTUALIZING BCA - IT'S NOT ABOUT THE MONEY HONEY - While the prime attraction of virtualization was always the monetary benefit, this mindset has to change with BCA (Business Critical Applications). In simpler terms, you might be saving money by moving your business critical workloads from a UNIX to a LINUX platform and then virtualizing them with vSphere, but what you should NOT do is practice things like OVER-COMMITMENT or HIGH CONSOLIDATION. All the money which you save should be re-invested to ensure that you are successful at virtualizing business critical workloads. This investment should go into running parallel Test Setups, Running Benchmarks, Zero Over-Commitment of CPU & Memory, High Performance Storage platforms etc. Remember, savings are not the only benefit of virtualization. Ease of Upgrade, scaling up of RAM and CPU on the fly, options to take Snapshots before Upgrading, quickly Rolling out new services, options to Clone from templates, easily moving from TEST to UAT to PROD and vice-versa, and Zero-Downtime during hardware maintenance without having to invest in complex clustering solutions are some of the major benefits you get from virtualizing Business Critical Applications, and this is what you should AIM for.
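
As a quick sanity check for the no over-commitment point above, the following Python sketch adds up what the VMs on a host have been allocated and compares it with the physical cores and RAM. The `VM` class and function names are hypothetical, the numbers are invented, and the sketch ignores hypervisor overhead, so treat it as a first-pass filter rather than a sizing tool.

```python
from dataclasses import dataclass


@dataclass
class VM:
    """Hypothetical VM sizing record (not a vSphere API object)."""
    name: str
    vcpus: int
    memory_gb: int


def commitment(host_cores: int, host_memory_gb: int, vms: list[VM]) -> tuple[float, float]:
    """Return (vCPU:pCPU ratio, allocated:physical memory ratio) for one host."""
    cpu_ratio = sum(vm.vcpus for vm in vms) / host_cores
    memory_ratio = sum(vm.memory_gb for vm in vms) / host_memory_gb
    return cpu_ratio, memory_ratio


def is_not_overcommitted(host_cores: int, host_memory_gb: int, vms: list[VM]) -> bool:
    """True when neither CPU nor memory is allocated beyond physical capacity."""
    cpu_ratio, memory_ratio = commitment(host_cores, host_memory_gb, vms)
    return cpu_ratio <= 1.0 and memory_ratio <= 1.0


if __name__ == "__main__":
    vms = [VM("db01", 8, 128), VM("app01", 4, 64)]
    print(commitment(16, 256, vms))            # (0.75, 0.75)
    print(is_not_overcommitted(16, 256, vms))  # True

    # Adding one more 8 vCPU VM pushes the CPU ratio to 1.25
    vms.append(VM("app02", 8, 32))
    print(is_not_overcommitted(16, 256, vms))  # False
```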


I would like to conclude by saying that this is an Art and not a Science, so it's important that you INVOLVE THE EXPERTS. It's a fact that you need more experience than skills to architect a vSphere environment for business critical applications which is stable, secure & highly available.

On that note I will close this article and the entire series. I hope it has helped, and will continue to help, as you architect better for your organization or a customer. If you have any questions, comments or feedback, feel free to use the comments section. I would love to have a constructive debate on this and other topics.


Share & Spread the Knowledge!!


