Today was SRM day at the office, so I thought, let's gather all the discussions into an article that could help others as well. I will use a question and answer format to make it easier for readers of this article...
Note: This is primarily focused on how VMware Site Recovery Manager, together with vSphere Replication, helps customers build a BCP/DR solution using some amazing and easy-to-use features of this VMware product.
----------------------------------------------------------------------------------------------------------------------------------
Question - Will SRM 5 work with different hardware at the DC and DR sites? As I understand it, VMs cold-boot during an SRM failover, so there shouldn't be a hardware compatibility issue.
Answer - You can have different models of x86 server hardware at the two sites, as the vSphere virtualization layer makes it seamless for the virtual machines to boot up at the DR site and later fail back as well.
Identical hardware comes into question when you implement storage array based replication for SRM to use. In this case, your storage arrays should be identical for the replication technology to successfully replicate LUNs from the Protected Site to the Recovery Site and back.
However, if you are using vSphere Replication, you do not have to worry about that, as SRM then uses host based replication instead of storage array based replication.
----------------------------------------------------------------------------------------------------------------------------------
Question - If the total replication data size is around 20 TB, how can we achieve this using host based replication? That is a huge amount of data.
Answer - That's a great question, and probably a challenge any organization would face, whether it uses storage array based replication or vSphere Replication. If you are using vSphere Replication, there are 2 ways to solve this issue:
a) Use the full replication option, which means you start replication on each VM and let it finish before you start creating and configuring your recovery plans in the SRM interface. For a total of 20 TB of data, this could take days (depending on the available bandwidth, distance and latency) to complete. I do not recommend this method to customers using vSphere Replication, as they have an easier way of making these images available at the Recovery (DR) Site.
b) Let's talk about the easier way now. Instead of replicating all the data over the wire, we can use the physical couriering method, which saves time and bandwidth. It is as simple as taking a backup / clone of the virtual machines that need to be protected at the Protected Site and dumping this copy on a USB drive / tape, or on multiple drives / tapes in your case. Create an MD5 checksum of these images to ensure consistency (optional step) and ship them to the Recovery Site. Now you can seed these images on the target LUNs. Once that is done, configure replication at the Protected Site and point it to the seeds, as vSphere Replication gives you that option during configuration. vSphere Replication will then work its magic and replicate only the delta changes to the Recovery Site, which is a minimal amount of data compared to the first approach.
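To make the two options above a little more concrete, here is a minimal Python sketch. It is not part of SRM or vSphere Replication. The first function is simple arithmetic for how long a full replication might take, and the remaining functions cover the optional MD5 step for the couriered copies. All function names, the link speed and the efficiency factor are assumptions chosen for illustration.

```python
import hashlib
from pathlib import Path


def estimate_transfer_days(data_tb, link_mbps, efficiency=0.8):
    """Rough time, in days, to push data_tb terabytes over a WAN link.

    data_tb is in decimal terabytes (10**12 bytes). efficiency is an
    assumed fraction of the link that replication can actually use.
    """
    bits_to_send = data_tb * 1e12 * 8
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return bits_to_send / usable_bits_per_second / 86400


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to directory) to its MD5 digest."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(directory, manifest):
    """Return the names of files that are missing or whose checksum differs."""
    current = build_manifest(directory)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)
```

With an assumed 100 Mbit/s link at 80% usable capacity, `estimate_transfer_days(20, 100)` comes to roughly 23 days for 20 TB, which is why couriering the seed copies is attractive. You would run `build_manifest` on the copies before shipping them and `find_mismatches` at the Recovery Site before seeding. An empty list means the copies arrived intact.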
----------------------------------------------------------------------------------------------------------------------------------
Question - Suppose we replicate 10 VMs to the DR site and 5 go down at the DC. Do we have to bring up the DR VMs manually, or does SRM do that automatically?
Answer - SRM primarily provides a framework for designing your BCP/DR environment, so you would use it when you either see a disaster coming or a disaster has already destroyed your primary site (Protected Site). It allows you to do either a planned migration (if you know that a disaster is coming) or a site failover (if the damage is already done). Hence, it is not a VM level recovery solution but a site level recovery solution. However, in your case, you can create 2 recovery plans with 5 VMs each and execute only one recovery plan instead of failing over the entire site. You can even have a recovery plan for each VM and execute only the one you want to recover at the DR site in case of a disaster. That said, I would revisit my statement here and recommend that we first look at high availability solutions such as VMware HA, FT or third party clustering solutions for recovering from VM failures. Only if it is impossible to recover the VMs at their own site should we look at SRM as a site failure solution.
----------------------------------------------------------------------------------------------------------------------------------
Question - Suppose a customer has a VM in which we do some configuration, such as setting the IP address of the DB server so that the VM stays in sync with it. If that VM goes down, it will come up at DR and try to contact the DB server, which is still in the DC. In that case, how can we use SRM effectively?
Answer - SRM gives you the automation to recover virtual machines at the Recovery Site, but it is important that you provide all the components required by the applications running on those virtual machines. Taking the database example, you should either have the database available with the same IP address at the DR site or, if the database has a different IP address there, change that setting in the ODBC connection after the VM powers on. You can script that change using VBScript or do it manually. If you script it, the SRM recovery plan workflow can accommodate that script and execute it for you. For the IP addresses of the virtual machines themselves, we can use guest IP customization, which makes the IP changes on the fly using APIs on Windows. You can use bulk IP customization if you want to change the IPs of a number of virtual machines in one go, and this can all be configured while setting up SRM for the first time. Regarding the DB, you can either virtualize it so that SRM can replicate it and make it available at the Recovery Site, or you can use DB replication technologies to have the database available at the Recovery Site.
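As an illustration of the scripted approach, here is a minimal sketch of a post power-on script that repoints an application at the DR database. The answer above mentions VBScript and a Windows ODBC connection. This sketch instead uses Python and assumes the connection settings live in an INI-style file, purely to keep the example self-contained. The function name, the DSN name `AppDB`, the `Server` key and the addresses are all hypothetical. On Windows, ODBC DSNs are usually stored in the registry, so a real script would look different.

```python
import configparser


def repoint_dsn(ini_path, dsn_name, new_server, key="Server"):
    """Point one DSN entry in an INI-style file at a new database address.

    Returns the previous address so the change can be logged or reverted.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read(ini_path)
    if dsn_name not in parser:
        raise KeyError(f"DSN {dsn_name!r} not found in {ini_path}")
    old_server = parser[dsn_name].get(key)
    parser[dsn_name][key] = new_server
    with open(ini_path, "w") as handle:
        parser.write(handle)
    return old_server


# Hypothetical usage after the VM powers on at the Recovery Site:
# repoint_dsn("/etc/odbc.ini", "AppDB", "10.2.1.50")
```

Returning the old address makes it easy to log what was changed, and to put it back when you fail back to the Protected Site.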
----------------------------------------------------------------------------------------------------------------------------------
These features really differentiate vSphere Replication from other host based replication technologies and help customers implement DR that they were unable to do before.
Alright, I know this might open a "can of worms" and lead to more questions. If that's the case, feel free to use the comment field and we can have some more discussions around this topic.
In addition to this, I would highly recommend the following links from the VMware Technical Marketing teams, who have done a great job of deep diving into the vSphere Replication technology and discussing what goes on "behind the scenes" of this feature: