Thursday, January 31, 2013

ESXi/ESX host becomes Inaccessible/Not-Responding in vCenter after Un-Presenting Storage LUNs

Today, I was responding to an e-mail from a systems engineer in the VMware Partner Community about an issue they recently faced in a customer's VMware infrastructure. It relates to the handling of storage LUNs and datastores on the vSphere 4.x and 5.x platforms.

Here is the issue:-

"We have some unused data stores so we un-mounted those from all ESXi hosts and deleted the LUNs from storage. After that, the deleted datastores goes in a dead path state. One of the hosts in a cluster automatically disconnected from the vCenter. We re-scan HBA and  management services; however the dead paths still appear in the vCenter. We are not able to do vMotion and our HA master agent is also not working. How do we fix this??"

I have been asked similar questions time and again by customers, partners and fellow co-workers, hence I thought I should share this information with the larger community in this blog post.

How do you end up in this situation:-

Once you create a VMFS datastore on a storage LUN, the state of this datastore and the associated LUN is saved in the storage configuration of the ESXi kernel. From then on, it is the responsibility of the VMkernel to ensure that the VMFS datastore and the associated LUN are always available to the ESXi host.

To ensure availability, the VMkernel sends I/O requests (SCSI commands) to each datastore every few seconds and expects a response confirming that the datastore is up and running. This mechanism ensures that transient storage conditions can be resolved and the LUNs/datastores remain available to the virtual machines.

Now coming to the situation mentioned above, where you have unmounted, un-presented and destroyed the LUNs: ESXi has no way of knowing whether you un-presented these LUNs yourself or whether it lost them due to a technical failure. With its default safety behavior, ESXi tries to recover such LUNs and keeps sending requests to bring the missing LUNs back ONLINE. Unfortunately, these LUNs are no longer available, so the hosts' requests are never honored.

Since these requests go stale, they cause the hostd service to hang. Hostd keeps track of all the agent-based services and resources available to ESXi. As soon as the hostd service is in trouble, the vpxa agent, the HA agent and hostd itself start falling apart, causing the host to disconnect from the vCenter Server. No vCenter means no vMotion, and so on.

From the above description, you can see the cascading effect of un-presenting LUNs from ESXi servers without following the proper procedure, which is documented in a VMware KB article. Please note that this issue can surface a day, weeks or even a month after un-presenting the storage LUNs.
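If you want to check whether any of your hosts are carrying dead paths, a quick PowerCLI sketch like the one below will list them (the host name is a placeholder for your environment):-

# List paths in a "Dead" state on a given host
Get-VMHost "esx01.lab.local" | Get-ScsiLun -LunType disk |
    Get-ScsiLunPath | Where-Object { $_.State -eq "Dead" } |
    Select-Object Name, SanID, State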

How to solve this

Unfortunately, you are in a scenario where you have already hit the situation wherein the hostd service is crashing. You would have to reboot all the ESXi servers which are showing the APD (All Paths Down) or PDL (Permanent Device Loss) warning messages in the vmkernel/messages logs.

To read more on this you can refer to the following KB Article - Permanent Device Loss (PDL) and All-Paths-Down (APD) in vSphere 5.0

How to avoid this in the future

To avoid getting into such a situation in the future, please follow the proper procedure mentioned in the KB article – Un-mounting a LUN or Detaching a Datastore/Storage Device from multiple ESXi 5.x hosts
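At a high level, the host-side sequence from that KB is: unmount the VMFS volume, detach the device, and only then un-present the LUN on the array. Here is a minimal sketch using PowerCLI's Get-EsxCli (V2 argument style, available in newer PowerCLI releases); the host name, datastore label and device ID are placeholders:-

$esxcli = Get-EsxCli -VMHost "esx01.lab.local" -V2
# 1. Unmount the VMFS volume (equivalent to: esxcli storage filesystem unmount -l <label>)
$esxcli.storage.filesystem.unmount.Invoke(@{volumelabel = "OldDatastore"})
# 2. Detach the backing device (equivalent to: esxcli storage core device set -d <naa.id> --state=off)
$esxcli.storage.core.device.set.Invoke(@{device = "naa.xxxxxxxxxxxxxxxx"; state = "off"})
# 3. Un-present/delete the LUN on the array, then rescan all adapters
$esxcli.storage.core.adapter.rescan.Invoke(@{all = $true})

Repeat this on every host which sees the LUN before removing it on the storage array.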


Hope this helps you understand the basic concepts of how storage LUNs need to be treated on an ESXi server, so that storage operations do not affect the way your VMware infrastructure performs.

Sunday, January 27, 2013

Managing a VMware vCenter Server running on a Virtual Machine!!

Just wanted to share a couple of pointers which came up during a vSphere design review for a customer, in reference to running VMware vCenter physical vs. virtual. I have written about this topic before in one of my posts (VMware vCenter Server - Physical vs. Virtual).

During my discussions there were arguments that tracking the vCenter virtual machine in a big environment, and getting onto it for troubleshooting when the vCenter Server service or the VM is down, can be a little time consuming.

Therefore, some organizations prefer a physical vCenter so that they have more control and a single point to look at and troubleshoot in case of issues. I would say this has more to do with the comfort and mindset of the admin: the application managing the virtual environment is itself not virtual, and is isolated from the virtual infrastructure.

I would not say that these points are invalid, since no one would like to search for their vCenter VM during a vCenter outage. If you have not planned the initial placement of the vCenter VM, you might end up logging on to each ESXi server directly via the vSphere Client to search for it. This can be a cumbersome and time-consuming process, and it can affect services such as VMware View or vCloud Director for a longer duration during vCenter downtime, given that you do not use vCenter Heartbeat in your infrastructure.

There are a couple of things which every vSphere Design with a Virtual vCenter should consider:-

a) Separate Management Cluster - In slightly bigger setups, where you might end up having multiple clusters of ESXi servers and multiple management virtual machines, such as storage management appliances, vCloud Director or SRM machines, you should have a separate management cluster of 2 to 3 ESXi servers (sized as per your requirements). This is where you place your vCenter Server as well: isolated from your production environment, and easy to track and troubleshoot if the vCenter Server fails due to any issue.

b) DRS rules for vCenter VM - You may or may not have the liberty of creating a separate management cluster. However, it is absolutely recommended to use DRS rules to control the placement of your vCenter virtual machine.

You should use a "Virtual Machines to Hosts" DRS rule to place the vCenter VM on the FIRST host of the FIRST cluster in your vCenter. This ensures that your vCenter Server always runs on the same ESXi server, and only if that ESXi server fails does the VM power on on another host in the cluster via vSphere HA. With this method you have only one ESXi server to look at when your vCenter Server is acting up, and you can trace the VM easily.

This is how you can achieve this:-

1- Right-click the first cluster in your vCenter Server inventory (assuming the vCenter VM is part of this cluster).
2- Click on Edit Settings.
3- Under DRS > Click Rules > Add.
4- Click the DRS Groups Manager tab.
5- Click Add under Host DRS Groups to create a new Host DRS Group containing the first host of the cluster.
6- Click Add under Virtual Machine DRS Groups to create a Virtual Machine DRS Group for the vCenter VM
7- Click the Rule tab and, from the Type drop-down menu, select Virtual Machines to Hosts.
8- Select the Virtual Machine DRS Group and the Host DRS Group which you created in the previous steps, and you are done.

After saving this setting, the vCenter VM will automatically migrate to the host which you selected using vMotion and will stay there, making it easy and simple for you to locate in case of vCenter downtime.
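If you prefer to script this instead of clicking through the GUI, newer PowerCLI releases (6.5 and later) expose cmdlets for DRS groups and VM-to-host rules. A hedged sketch, with the cluster, host and VM names as placeholders:-

$cluster = Get-Cluster "Cluster01"
# Create a VM group for the vCenter VM and a host group for the first host
New-DrsClusterGroup -Name "dg-vCenterVM" -Cluster $cluster -VM (Get-VM "vcenter01")
New-DrsClusterGroup -Name "dg-FirstHost" -Cluster $cluster -VMHost (Get-VMHost "esx01.lab.local")
# "ShouldRunOn" is the soft "Should run on hosts in group" rule, so vSphere HA
# can still restart the VM elsewhere if the preferred host fails
New-DrsVMHostRule -Name "vcenter01 on esx01" -Cluster $cluster -VMGroup "dg-vCenterVM" -VMHostGroup "dg-FirstHost" -Type "ShouldRunOn"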

Just for a recap here are the settings:-

DRS Groups Manager:-
- Virtual Machine Group Name: <vCenter VM Name>
- Virtual Machine Group Member: <vCenter VM>
- Host DRS Group Name: <First ESXi Hostname in Cluster>
- Host DRS Group Member: <First ESXi Host in Cluster>

Rule:-
- Name: <vCenter VM Name> on <ESXi Hostname>
- Type: Virtual Machines to Hosts
- Cluster VM Group: <Virtual Machine Group Name>
- Rule: Should run on hosts in group
- Cluster Host Group: <Host DRS Group Name>



Last but not the least, you need to ensure that the virtual machine restart priority of the vCenter Server VM is set to the highest priority, so that the vCenter VM comes back up as soon as possible after an HA event.
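If you would rather set this from PowerCLI than from the GUI, a one-liner along these lines should do it (the VM name is a placeholder):-

# Give the vCenter VM the highest HA restart priority in its cluster
Set-VM -VM (Get-VM "vcenter01") -HARestartPriority High -Confirm:$false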

Note:- Another important point which I would like to put across here is a best practice which I read about in the book "VMware vSphere 5.0 Clustering Technical Deepdive" by Duncan Epping and Frank Denneman (an awesome and a must-read).

Duncan and Frank make a valid point in their book:-

"Although HA is configured by vCenter and exchanges virtual machine state information with HA, vCenter is not involved when HA responds to failure. It is comforting to know that in case of a host failure containing the virtualized vCenter Server, HA takes care of the failure and restarts the vCenter Server on another host, including all other configured virtual machines from that failed host.

There is a corner case scenario with regards to vCenter failure: if the ESXi hosts are so called “stateless hosts” and Distributed vSwitches are used for the management network, virtual machine restarts will not be attempted until vCenter is restarted. For stateless environments, vCenter and Auto Deploy availability is key as the ESXi hosts literally depend on them."

Hence, it is important to ensure that vCenter comes back up with high priority after an HA event; this gets the management network going in the case of a Distributed vSwitch and allows Auto Deploy to work. However, with vSphere 5.1 you do have the option to boot the ESXi server from a backup copy of the ESXi image, which you can save on a local drive if one is available on the server.

Hopefully the above pointers are good enough for you to go ahead and virtualize your vCenter while still keeping control over the application.


Sunday, January 20, 2013

Running Scripts to Automate SRM Recovery Plans using PowerShell / PowerCLI!


In my last post, I took a deep dive into how to run commands within a recovery plan which call out scripts on the recovered virtual machines. I would highly recommend that you read that article before reading this one. Here is the link - Using scripts to automate VMware Site Recovery Manager workflows & recovery steps!

Now it is time to take this one level up and see how SRM can integrate with other powerful tools such as PowerShell and PowerCLI to create a centralized script repository on the SRM server. This repository can contain scripts which are executed by the PowerShell scripting engine, with the help of PowerCLI installed on the SRM server.

I think a better approach is to take a scenario which needs this kind of scripting, so I will showcase the step-by-step configuration which helps us solve the problem through scripting.


Situation:-

An organization is using vSphere Replication to replicate virtual machines from Site A (the protected site) to Site B (the recovery site). They use SRM as the automation engine to perform recoveries and tests.

There are a few virtual machines protected at Site A running an application which generates a lot of garbage logs. These logs sit inside the OS disks of the VMs; hence this data is also replicated to the DR site by vSphere Replication.

Since this data is not required at the recovery site, we are simply wasting replication bandwidth sending these logs, which in some cases can run into gigabytes per virtual machine. The organization wants to use the inter-site network efficiently and remove any replication overhead which is not required at the recovery site.

Solution:-

The solution is to keep these logs on a separate log partition which sits on a different VMDK. vSphere Replication gives us the granularity to choose which VMDKs of a virtual machine we want to replicate.

Let’s say the VM which we want to replicate without the logs is called “vmsunny” and has 2 virtual disks:-

Disk 1 – vmsunny.vmdk
Disk 2 – vmsunny_1.vmdk

Disk 1 has the OS, application and data, while Disk 2 has the logs, which are of no use at the DR site.

In this case, we will replicate Disk 2 only once, to create a copy at the DR site; then, when we re-configure vSphere Replication for this VM, we will exclude Disk 2 so that no replication bandwidth is spent on it from then on.

Now we will use the SRM scripting capabilities to add Disk 2 to the DR virtual machine whenever we run a test recovery or a disaster recovery. We need to script this task so that it happens automatically within the recovery workflow.

How to Implement

Here are the logical steps which I would follow in this case:-

a) Replicate the ‘vmsunny’ VM from Site A to Site B, replicating both Disk 1 and Disk 2.

b) Once both disks are created and the initial replication is complete, pause the replication, go to the Site B datastore where Disk 1 and Disk 2 are located, and move Disk 2 out of the VM folder. (This ensures that Disk 2 is preserved when you stop replicating it in step (c).)

c) Now go back to vSphere Replication and configure replication for vmsunny once again. This time, DO NOT replicate Disk 2, i.e. vmsunny_1.vmdk. This removes the disk from replication from now on.

d) Remove this disk from the protection group as well by using the Detach option.

e) Now go back to the location where you moved Disk 2 at the DR site, and move it back into the VM folder it came from in step (b).

f) Resume the replication which you paused in step (b).

Now we configure the recovery plan for the ‘vmsunny’ VM and add a command step which calls a script on the SRM server. This script adds Disk 2 to the VM at the DR site as soon as a test recovery or disaster recovery is run for that recovery plan. It is a pre-power-on script: it adds the disk, and then the VM powers on.

Preparing the SRM server for running scripts & Creating the Recovery Plan

Before you can call scripts from the SRM server, you need to install PowerShell and PowerCLI on the SRM server. This allows the SRM server to run PowerShell scripts which connect to the vCenter Server and add a disk to a virtual machine.

Please follow the steps in the following article written by Alan Renouf:-

Back to Basics: Part 1 – Installing PowerCLI


This will help you prepare your SRM server for running PowerCLI scripts, which will call the following script kept in a folder on the SRM server:-

++++++++++++++++++START OF SCRIPT++++++++++++++++++
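# This script loads the PowerCLI snap-in, connects to the recovery-site
# vCenter, and attaches the pre-seeded log VMDK to the recovered VM.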
Add-PSSnapin VMware.VimAutomation.Core
Connect-VIServer -Server <vCenter Server Name> -WarningAction SilentlyContinue
New-HardDisk -VM "Virtual Machine Name" -Persistence IndependentPersistent -DiskPath "PATH TO  VMDK AT DR SITE"
++++++++++++++++++END OF SCRIPT++++++++++++++++++

Here is an example:-
Add-PSSnapin VMware.VimAutomation.Core
Connect-VIServer -Server mylabvcenter.lab.com -WarningAction SilentlyContinue
New-HardDisk -VM "vmsunny" -Persistence IndependentPersistent -DiskPath "[EVA DISK1] vmsunny/vmsunny_1.vmdk"

This script is placed in a folder on the SRM server; let’s say it is located in the c:\scripts folder. To call it, we add it to the SRM recovery plan as a pre-power-on command step, as shown below:-



Let us assume that the script we need to call is named sunnyvm.ps1 and is located in C:\scripts. We call it from the command step as shown below:-
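In the command step, you invoke PowerShell with its full path, followed by the script to run; the command typically looks like this (the exact PowerShell path may vary with your Windows version):-

c:\windows\system32\windowspowershell\v1.0\powershell.exe -file c:\scripts\sunnyvm.ps1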





Here we are calling the PowerShell executable, which in turn runs sunnyvm.ps1 located on the SRM server.

sunnyvm.ps1 contains the script stated before:-

Add-PSSnapin VMware.VimAutomation.Core
Connect-VIServer -Server mylabvcenter.lab.com -WarningAction SilentlyContinue
New-HardDisk -VM "vmsunny" -Persistence IndependentPersistent -DiskPath "[EVA DISK1] vmsunny/vmsunny_1.vmdk"


This adds vmsunny_1.vmdk to vmsunny, and the VM then boots up. All the logging can now happen on this disk, whether the VM is recovered at the DR site or powered on in test recovery mode.

Using this engine, you can do other powerful things in your SRM environment, such as re-sizing vCPU or memory allocations, adding or removing disks, or any other automation task which would make your recovery automatic and error free. Please feel free to share this article and your thoughts using the comment window.



Wednesday, January 16, 2013

Changing the SSH Port on the ESXi server for Cyber-Ark Integration!!

In one of my recent implementations, I got a request from a client to change the default SSH port on the ESXi server from port 22 to port 63022.

This was a requirement since they have a password management system from Cyber-Ark which stores and resets the root and other user passwords on the ESXi servers for security reasons. Cyber-Ark works with any Unix or Linux operating system over SSH, and since ESXi also uses SSH for remote access, we had to integrate Cyber-Ark with the ESXi server the same way; however, the integration expects SSH to be on port 63022.

Let's see how I went about changing the SSH port to 63022 and made it consistent across ESXi reboots.

We would need to update this configuration in 2 locations for this to work:-

a) /etc/vmware/firewall/ - In this location we have to place a new firewall rule for the SSH port, which we define manually. This is done by creating an XML file which is saved in this location. Here are the contents of the XML file:-

<ConfigRoot>
  <service>
    <id>SSH 63022</id>
    <rule id='0000'>
      <direction>inbound</direction>
      <protocol>tcp</protocol>
      <porttype>dst</porttype>
      <port>63022</port>
    </rule>
    <enabled>true</enabled>
    <required>false</required>
  </service>
</ConfigRoot>

For ease, we will call this file ssh63022.xml.

We need to refresh the firewall policies after placing this file in the given location on the ESXi server. Here is the command we will be using:-


~ # esxcli network firewall refresh
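You can confirm that the new ruleset has been loaded with the following command (the ruleset is listed under the id we gave it in the XML file):-

~ # esxcli network firewall ruleset list | grep -i 63022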

b) /etc/services - The second change is to define the SSH port as 63022 instead of 22 in the services file. For this, copy the file from its default location to a SAN datastore and then edit it with the new port information. Here is how you can do it:-

# cp /etc/services /vmfs/volumes/EMC-SANLUN-01/ssh

I have created a folder named ssh on my SAN datastore EMC-SANLUN-01, and I am copying the services file to this EMC SAN VMFS datastore, which is visible to all the hosts in the cluster.

Now let's check that the file is there:-

~ # cd /vmfs/volumes/EMC-SANLUN-01/ssh
/vmfs/volumes/50f5e6fd-6fa36a6c-8339-000c29c4df2b/ssh # ls -ltrh
-rw-r--r-T    1 root     root        20.3k Jan 16 00:16 services

Now that we have a copy of the services file, let's edit it to change the SSH port. Run the following command:-

/vmfs/volumes/50f5e6fd-6fa36a6c-8339-000c29c4df2b/ssh # vi services

Locate the ssh port setting; the default entries look similar to this:-
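ssh                 22/tcp       # SSH Remote Login Protocol
ssh                 22/udp       # SSH Remote Login Protocol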


Now edit the file and change port 22 to 63022, so that the entries read:-
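ssh                 63022/tcp    # SSH Remote Login Protocol
ssh                 63022/udp    # SSH Remote Login Protocol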


Save the changes to this file and run the following command to replace the original file with the modified copy:-

~ # cp /vmfs/volumes/EMC-SANLUN-01/ssh/services /etc/services

This will change the default ssh port from 22 to 63022.
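For the change to take effect immediately, restart the inetd daemon so that it re-reads the services file (the same command we will use in the start-up script below):-

~ # kill -HUP `cat /var/run/inetd.pid`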

Now, to make this consistent across reboots, these 2 steps have to be performed every time the ESXi server boots. It is not practical to run them manually, so a better way is to automate them using the rc.local file, which can run simple scripts on the ESXi server during start-up.

Similar to the services file, copy the ssh63022.xml which we created in STEP A to the /vmfs/volumes/EMC-SANLUN-01/ssh location as well. You can use the Datastore Browser in the vSphere Client or a utility such as WinSCP.

Now that you have both files on a shared datastore, update the rc.local file to copy them to their respective locations every time the server reboots. You would need to make the following entry in the rc.local file:-

Note - rc.local is located in the /etc directory. (On ESXi 5.1 and later, the equivalent start-up script is /etc/rc.local.d/local.sh.)

Edit the file and update it with the following script:-

#Copy the new firewall rule from vmfs place holder to file system
cp /vmfs/volumes/EMC-SANLUN-01/ssh/ssh63022.xml /etc/vmware/firewall/
#refresh firewall rules
esxcli network firewall refresh
#Copy the modified services file from vmfs place holder to file system
cp /vmfs/volumes/EMC-SANLUN-01/ssh/services /etc/services
#Restart inetd to get the changes
kill -HUP `cat /var/run/inetd.pid`


To open the file for editing, run:-

~ # vi /etc/rc.local


Lastly, save the file and reboot the ESXi host. The SSH port will now be set to 63022, and you can easily integrate with Cyber-Ark.

Hope this helps you make changes to the default ESXi ports for 3rd-party software integration when needed.




Monday, January 14, 2013

Dividing Bandwidth of a 10Gb CNA Adapter for ESXi Networking and Storage!!


In most of my recent projects, customers are moving towards 10Gb converged adapters to achieve the benefits of consolidating network and storage, especially on blade server architectures.

I am writing this post to provide guidelines on how you can divide a 10Gb CNA card in your ESXi server to meet all the network and storage requirements. Before that, let’s have a look at what a 10Gb CNA is and which brands are available in the market for this technology.

A CNA card, a.k.a. "Converged Network Adapter", is an I/O card in an x86 server that combines the functionality of a host bus adapter (HBA) with a network interface controller (NIC). In other words, it "converges" access to a storage area network and a general-purpose computer network. As simple as it sounds, it makes things simple in the datacenter as well: instead of running cables down from each NIC card, FC HBA or iSCSI card, you can use a single cable for all these tasks, because the CNA is converged and can carry all this traffic on a single physical interface.

There are a number of manufacturers of such cards, who either manufacture the cards themselves or re-brand them with their logo and custom firmware. Here are a few examples:-

- Cisco
- Dell
- HP
- Q-Logic
- Emulex
- IBM etc..

So as a customer you have a number of choices, and it is important that you choose what fits your existing infrastructure, or the new hardware if it is a greenfield site.

Let's say you bought a CNA which gives you 4 virtual ports per physical port; let’s see how we can divide the bandwidth of each physical port amongst the virtual ports for both storage and network communication.

On the physical card, the bandwidth can be divided as shown in the figure below:-



Here, the CNA card has 2 physical ports, each with 10Gb of bandwidth. I have further divided each physical port into 3 network cards and 1 FC HBA. Hence, I will have a total of 6 network cards and 2 FC HBAs per CNA card. If you like the concept of No Single Point of Failure (SPOF) and can afford another card, then you would end up having 12 NICs and 4 FC HBA ports per blade server.
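As an illustration only (the exact split is a sizing exercise for your own workloads; the values below are assumed, not a mandate), a 10Gb physical port could be carved up along these lines:-

- vNIC 1 - 2Gb - Management + vMotion
- vNIC 2 - 3Gb - Virtual machine traffic
- vNIC 3 - 1Gb - Fault tolerance / backup traffic
- vHBA 1 - 4Gb - FC storage traffic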

Isn't that cool?? A blade server with so many NICs. This can be used on rack servers as well, where it will also reduce the back-end cabling!

Now, a last look at how I would use these NICs and FC ports. The diagram below shows how I would configure the networking on my ESXi server to get the best possible configuration out of the available hardware resources.



The diagram above clearly shows how we have divided this bandwidth amongst all the required port groups. If you have 2 such cards, you will have high resiliency in your design, and the number of ports doubles up, providing better performance as well.

Remember, you are free to adjust the bandwidth of the virtual NICs and virtual FC HBAs based on how much you want for your port groups. The bandwidths I have mentioned above are a guideline and should fit most requirements.

Hope this helps you design the network and storage with a 10Gb adapter without issues.

**************************************************

Update - 3rd April - Have a look at this new article - Dividing Bandwidth of a 10 GB CNA Adapter for ESXi Networking and Storage using Network I/O Control (NIOC) - which talks about using Network I/O Control to do the network segregation.