Channel: VMware Communities : Discussion List - Availability: HA & FT
Viewing all 845 articles

FT Testing


I just enabled Fault Tolerance on one of my servers. I had to reformat the drives as Thick Provision Eager Zeroed and change the CPU/MMU setting to Intel VT-x for MMU, but after that it turned on successfully.

 

When I tried testing it today by shutting off the primary (protected) machine, it did not switch over to the secondary.

 

When I right-click the primary, go to Fault Tolerance, and choose Test Failover, that seems to work.

 

My question is: shouldn't I be able to shut down the primary to simulate a failure? I tried shutting down gracefully first, and then I tried a hard shutdown by powering it off; neither of these triggered a failover.

 

I was going to try disconnecting the NIC, but when I do I get the error "Hot plug of device '0' is not supported for this virtual machine." followed by "Reconfigure virtual machine: Unsupported virtual machine configuration for Fault Tolerance. View details..." and it won't let me disconnect.


What happens to a VM when the sole uplink port on a vSwitch goes down?


Hi all,

 

I have a question regarding HA of a VM when its network is lost on the host it is running on.

 

In this scenario there is a 3-node ESXi 5.1 cluster.  Each host has a vSwitch with a single uplink port to a dedicated network (there are also other vSwitches in use for other networks, with multiple uplinks).  HA and DRS are enabled on the cluster.

What happens to VMs running on the vSwitch with the single uplink port if this uplink goes down?  Do they vMotion to another host which has the same network, or do they stay where they are, just disconnected from the network?

Is there any way (other than multiple uplinks) to protect this VM from permanent network outage?

 

Thanks.

HA test: some VMs are disconnected


Hi,

 

I ran a test to check HA on my 2-host cluster by powering off one of the hosts.

Some VMs got restarted on the second host, but 5 of them now show up as disconnected on the failed host :-(

The message I got in the Events log was "Not enough resources to failover ... vSphere HA will retry when resources become available". I got the same message for the VMs that eventually restarted on the second host.

I'm running ESXi 5.5.0.

 

Any idea ?

Thanks !

Eric

vCenter Disconnects from Master HA Agent


After a recent vCenter migration we are seeing the following events in our cluster Tasks & Events:

 

vCenter Server is connected to a master HA agent running on host ...

vCenter Server is disconnected from a master HA agent running on host ...

 

These two entries always appear together at just about the same time every 5 minutes.  The server in question for each cluster is the master node.

 

vCenter Version / Build: 5.0.0 623373

Cluster Version / Build: ESX v4.1.0 (702113) / ESXi v5.0.0 (623860)

 

The clusters all appear to be functioning normally and the events are just info / warning - however it would be nice to get to the bottom of them.  We recently migrated our vCenter instance to a new VM on a different subnet.  We did go through and ensure that all of the hosts were properly disconnected / re-connected after the IP change.

 

I have started looking through the local logs on the master nodes in question and have found one entry that appears to align with the timing of the above events:

 

vpxa.log:

 

2012-06-12T10:04:25.374Z [FFCBFAC0 error 'SoapAdapter.HTTPService'] HTTP Transaction failed on stream TCP(local=127.0.0.1:0, peer=127.0.0.1:61618) with error N7Vmacore15SystemExceptionE(Connection reset by peer)

 

I have read through some KBs regarding issues with DNS servers, etc. I have confirmed that all DNS servers are reachable (this looks like the localhost address anyway... not sure what DNS would do with that).

 

Anyone see this? -- More importantly, anyone resolve this!?

After HA Event, VMs are shut down


We have been experiencing an ongoing issue where VMs are not failed over to a new host during an HA event. Here are our current cluster settings:

 

Admission Control is set to Disable

VM restart priority is set to Medium

Host Isolation response is set to Leave powered on

 

The hosts are 4.1, within a 5.0 vCenter. We do have enough failover capacity for a single host failure. The problem we have is that if a host with 5 VMs fails (hardware or otherwise), 3 of the VMs will fail over to a new host without issue. The remaining 2 VMs do not fail over, and are left in a powered-off state until they are manually powered back on.

 

Not sure if it is relevant or not, but one key difference between the VMs that do and do not fail over: the ones that do not fail over are configured to use RDMs.

 

Is there any reason why some VMs would fail over fine, while others will not? Is there anything we can check? Thank you in advance.

vSphere Application Monitor GuestSDK with kdump


Hi,

 

 

 

I'm trying to run vSphere Application Monitor GuestSDK with kdump.

It seems that the "das.iostatsInterval" parameter (default value: 120s) doesn't work well in this case.

If "das.iostatsInterval" is in effect, vSphere HA should wait until kdump completes.

Is this the expected behavior?
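For reference, my understanding of how "das.iostatsInterval" interacts with VM/App monitoring, as a pure-Python sketch (a simplification; the actual decision logic is internal to the HA agent, and the numbers below are hypothetical):

```python
# Sketch: HA VM/App monitoring should only reset a VM when heartbeats have
# stopped AND no guest I/O was observed within das.iostatsInterval.  A kdump
# in progress generates disk I/O, which is what should hold off the reset.

def ha_would_reset(seconds_since_heartbeat, monitor_interval,
                   seconds_since_guest_io, iostats_interval):
    """Return True if HA would consider the guest failed and reset it."""
    heartbeat_lost = seconds_since_heartbeat > monitor_interval
    recent_io = seconds_since_guest_io <= iostats_interval
    return heartbeat_lost and not recent_io

# Heartbeat stopped 40s ago (monitor interval 30s), but kdump wrote to disk
# 60s ago with das.iostatsInterval=120s -> no reset expected:
print(ha_would_reset(40, 30, 60, 120))   # False
# Same heartbeat loss, but no I/O for 200s -> reset expected:
print(ha_would_reset(40, 30, 200, 120))  # True
```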

 

 

 

Here is what I tried;

I changed vSphere HA's application monitor interval from 30s (default) to 3s temporarily.

 

 

 

[root@node01 ~]# ls -l /opt/GuestSDK

total 20

drwxr-xr-x 4 201 201 4096 Aug 18 02:50 bin

drwxr-xr-x 3 201 201 4096 Aug 18 02:50 docs

drwxr-xr-x 2 201 201 4096 Aug 18 02:50 include

drwxr-xr-x 4 201 201 4096 Aug 18 02:50 lib

drwxr-xr-x 3 201 201 4096 Aug 18 02:50 vmGuestLibJava

 

[root@node01 ~]# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/GuestSDK/lib/lib64/

[root@node01 ~]# export BIN=/opt/GuestSDK/bin/bin64

[root@node01 ~]# $BIN/vmware-appmonitor enable

[root@node01 ~]# $BIN/vmware-appmonitor getAppStatus

green

 

 

 

Call vmware-appmonitor repeatedly;

[root@node01 ~]# while true; do $BIN/vmware-appmonitor markActive; sleep 1; done &

[1] 2429

 

 

 

[root@node01 ~]# date

Fri Jan 10 18:08:56 JST 2014

 

 

[root@node01 ~]# echo 1 > /proc/sys/kernel/sysrq

[root@node01 ~]# echo c > /proc/sysrq-trigger

 

 

 

... rebooting VM ...

 

 

 

[root@node01 ~]# ls -lh /var/crash/127.0.0.1-2014-01-10-18\:09\:08/

total 63M

-rw-r--r-- 1 root root 78K Jan 10 18:09 vmcore-dmesg.txt

-rw------- 1 root root 99M Jan 10 18:09 vmcore-incomplete <--- incomplete 

Is FT for multi vCPUs coming soon?


Is anyone aware whether VMware plans to support multi-vCPU FT in the near future?  (vSphere 5.5? If so, what is the release date?)  If not, does anyone know of a software plugin that can provide multi-vCPU Fault Tolerance within vCenter / vSphere?

Checking HA/DRS enabled


How do we check that the HA/DRS configuration is enabled? I am looking for some sort of file where this information is available.
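As far as I know there is no flat file intended for checking this; the authoritative source is the cluster object in vCenter's API. A minimal pure-Python sketch (the pyVmomi property paths in the comments are what I believe apply to vSphere 5.x; verify them before relying on this):

```python
# Sketch: report whether HA and DRS are enabled on a cluster.
# With pyVmomi the flags are believed to live at:
#   cluster.configuration.dasConfig.enabled   # vSphere HA
#   cluster.configuration.drsConfig.enabled   # DRS
# e.g. after SmartConnect(...) and locating a vim.ClusterComputeResource:
#   print(cluster_status(cluster.name,
#                        cluster.configuration.dasConfig.enabled,
#                        cluster.configuration.drsConfig.enabled))

def cluster_status(name, ha_enabled, drs_enabled):
    """Format a cluster's HA/DRS state as a one-line summary."""
    return "%s: HA=%s DRS=%s" % (
        name,
        "enabled" if ha_enabled else "disabled",
        "enabled" if drs_enabled else "disabled",
    )

print(cluster_status("Prod-Cluster", True, False))
# Prod-Cluster: HA=enabled DRS=disabled
```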


When is event 'VM restarted on alternate host' generated?


I am trying to setup an alarm which would be triggered when event 'VM restarted on alternate host' is generated.

 

My expectation is that this event would occur when a VM fails over to the alternate host in an HA setup. However, this is not the case.

 

I have tried the following -

 

1. Rebooted one of the hosts in my 2-host HA cluster. The VMs fail over to the alternate host; however, the above event is not generated.

2. Tried to simulate a VM failure (by stopping the heartbeat, kernel panic, etc.), but this results in HA just resetting the VM without a failover.

 

Is my understanding correct that the event is generated when a VM fails over to an alternate host?

How do I generate this event?
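For the filtering side, a pure-Python sketch, assuming the restart entry is the ExtendedEvent whose eventTypeId is "com.vmware.vc.ha.VmRestartedByHAEvent" (check the exact id against your own Events list before building the alarm on it):

```python
# Sketch: pick out HA-restart events from a list of event records.
# In vCenter 5.x the "vSphere HA restarted virtual machine" entry is an
# ExtendedEvent; its eventTypeId is believed to be the constant below.
# With pyVmomi you would pull events via si.content.eventManager with an
# EventFilterSpec, then apply a filter like this one.

HA_RESTART_ID = "com.vmware.vc.ha.VmRestartedByHAEvent"

def ha_restart_events(events):
    """Return only the events whose type id marks an HA restart."""
    return [e for e in events if e.get("eventTypeId") == HA_RESTART_ID]

sample = [
    {"eventTypeId": HA_RESTART_ID, "vm": "web01"},
    {"eventTypeId": "vim.event.VmPoweredOnEvent", "vm": "db01"},
]
print([e["vm"] for e in ha_restart_events(sample)])  # ['web01']
```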

Configuring HA help


Hello, I'm relatively new to VMware but hoping that someone can help me or point me in the right direction for the information I need to configure HA for selected VMs.

 

In our current setup we have two relatively small VMware hosts that mainly use local storage for the VMs. These two hosts are on the same LAN but in different comms rooms, separated by a dark-fibre link.

 

We have recently added a vCenter server with the intention of using vMotion to move servers between the two hosts, and vSphere Replication for the more important VMs. But we understand that for this to work the vCenter server has to always be available.

 

To overcome this we have connected the two VMware hosts via FC to some SAN storage that we have available in each of the comms rooms, and presented it as storage to the two hosts. So, for example, in comms room A we have VM Host A connected to SAN A, and in comms room B we have VM Host B connected to SAN B. The vCenter server VM is currently hosted on VM Host A and SAN A. This VM is replicated to SAN B at the hardware level by the SAN. While SAN B is attached to Host B, the vCenter VM data will be read-only until we tell the SAN controller which copy is the primary. Does vSphere interact with the SAN to do this automatically, and where is this configured? Is it part of the HA configuration?

 

Basically we want to use this small amount of SAN storage to keep our vCenter server always available if we were to lose one of the Hosts or one of the Comms rooms. Is this something that someone has configured and could point me in the right direction?

 

I have added the two hosts into a vCenter cluster (although I haven't enabled DRS or HA yet), but for the time being we only want the vCenter server VM to be protected by HA. Is this possible, and how?

 

We are using vSphere 5.1 U1 and vCenter Server 5.5.

 

Thanks a lot for any help.

Why don't VMs auto power on when shutting down ESXi 5.5 via the console (F2) in an HA cluster?


Dear all,

I have a problem when testing HA in vCenter. Details:

1. When I shut down ESXi from vCenter, the VMs automatically move to another host and are powered on automatically.

2. But when I shut down ESXi from the console (press F2 to log in, then F12 to shut down the host), the VMs move to another host but are not powered on automatically.

How do I configure the VMs to power on automatically when the host is shut down from the console (F2/F12)?

Thanks and best regards!

FT not working due to CPU compatibility and EVC mode


Hi guys,

 

1) Can you explain why FT doesn't work even though EVC mode is running? I thought EVC ensured CPU compatibility for FT.

 

I have two servers with NFS and vSphere 5.5 in my lab. One server has an

Intel Core i7-4770 Haswell 3.4GHz LGA 1150 84W

and the second has an

Intel Xeon E3-1275 V2 Ivy Bridge 3.5GHz (3.9GHz Turbo) 8MB L3 Cache LGA 1155 77W

 

EVC is running in "Sandy Bridge" mode.

 

FT is enabled; the primary is on the i7 and the secondary is on the E3. When I try to start the VM, this error is issued:

 

Cannot power on the Fault tolerance Secondary VM for virtual machine.

- 192.168.1.200 The CPU pair is not compatible for fault tolerance:

CPU model does not match

See KB 1008027.

- 192.168.1.201 Virtual machines in the same Fault Tolerance pair cannot be on the same host.

 

 

2) What do you think if I change the first CPU (the i7) to an Intel Xeon E3-1230 V3 Haswell 3.3GHz LGA 1150 80W quad-core server processor?

 

Should it work?

 

 

Thank you,

 

Tomas

Anti-Affinity Rules with vSphere 5.1


We are researching using anti-affinity rules to ensure the same host is not running both VMs of an N+1 pair of the same application. We found some contradictory references (VMware KB: Affinity or anti-affinity DRS rules are not applied during a virtual machine power-on vs http://frankdenneman.nl/2012/02/06/sdrs-anti-affinity-rule-types-and-ha-interoperability/) indicating that anti-affinity rules are not applied when HA powers a VM back on, and that the DRS anti-affinity rule only kicks in once the VM is later moved to a new host. We are looking for a best-practice recommendation on whether we need 2 clusters, or whether we can use 1 cluster with anti-affinity rules to ensure we don't have the N+1 virtual machines of the same application running on the same host. Does someone have best practices or field experience to share on this topic?
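Since (per the references in this thread) HA can power a VM back on without consulting DRS anti-affinity rules, a periodic placement check can at least flag violations until DRS corrects them. A minimal sketch (host and VM names are made up for illustration):

```python
# Sketch: detect when more than one VM from the same N+1 application pair
# has landed on a single host, i.e. when a DRS anti-affinity rule would
# currently be violated.

def antiaffinity_violations(host_to_vms, rule_vms):
    """Return {host: offending VMs} for hosts running >1 VM from the group."""
    group = set(rule_vms)
    return {host: sorted(group & set(vms))
            for host, vms in host_to_vms.items()
            if len(group & set(vms)) > 1}

placement = {
    "esx01": ["app-a1", "app-a2", "db01"],  # both halves of the pair!
    "esx02": ["web01"],
}
print(antiaffinity_violations(placement, ["app-a1", "app-a2"]))
# {'esx01': ['app-a1', 'app-a2']}
```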

 

Thanks,

Frank

The vSphere HA availability of this host has changed to Unreachable


We recently deployed a new vSphere 5.0 U1 HA-enabled cluster.  I noticed that a couple of times, on an otherwise healthy cluster, after taking a host out of maintenance mode the last HA event for the host says "The vSphere HA availability of this host has changed to Unreachable" instead of the host becoming a slave.  The fix has been to disable and re-enable HA on the cluster.  Trying to figure out why this occasionally happens.  Known issue?

 

As a side note: since this does not produce an alert and can easily be missed, I wanted to set up a custom alarm for it.  I set up the custom alarm using "The vSphere HA host availability state changed", but I'm not sure exactly how to filter it so I only see the "unreachable" events.

Storage DRS - can it be used for fault tolerant storage?


I am using vSphere 5 and have implemented HA and vMotion on a two-node test cluster comprised of Dual-Xeon blades. On one of these blades (Blade A), I have a VM configured as a virtual storage node which uses multiple drives on the blade, implements ZFS and exports the volume over NFS to both ESXi instances. This volume is where my VMs reside. The issue, of course, is that if the Blade that I am using for storage goes down, I will be in a pickle. I am wondering if I can get some fault tolerance for my storage if I implement a second virtual storage node in a VM running on Blade B, use its drives as a second ZFS volume exported via NFS and mounted on both Blades, and then implement Storage DRS on both these NFS volumes. Everything I've read about Storage DRS suggests that it only places VMs optimally on clustered volumes based on performance criteria. Is there a way to use it to implement redundancy and hence, fault tolerance?


Fault Tolerance with Multiple NICs


Apologize if this has been asked before.

 

I am setting up a vSphere 5.5 environment and I have enough 1Gb NICs per host to dedicate two NICs per host to Fault Tolerance.  I was wondering about the best way to set this up for maximum network throughput.  Do I set it up like multi-NIC vMotion, as described here: VMware KB: Multiple-NIC vMotion in vSphere 5 (basically using two port groups like FT-1 and FT-2)?  Or do I have to do it like multi-NIC iSCSI, with separate vSwitches, each with one of the port groups?

 

Thanks in advance.

What scenarios are you/customers setting the VM restart priority to "Disabled"?


The HA dev team would like to better understand why customers use a VM restart priority of "Disabled".

 

Some specific questions:

  1. In what scenarios are you/customers setting the VM restart priority to "Disabled"?
  2. In these scenario(s), is VM/App monitoring ever enabled? Note, VM/App monitoring is turned off by default
  3. If you/customers are setting restart-priority=disabled to keep a VM on a specific host, could a required VM-to-host rule be used instead? This would allow HA to restart the VM when the host comes back up after a failure.

 

BACKGROUND:

What is the "VM Restart Priority" setting? In a vSphere HA cluster, if a host fails and its VMs need to be restarted, the restart order can be controlled using the "VM Restart Priority" setting. The values for this setting are Disabled, Low, Medium (the default), and High. If you select Disabled, vSphere HA is disabled for the virtual machine, which means that it is not restarted on other ESXi hosts if its host fails. The Disabled setting is ignored by the vSphere HA VM/Application monitoring feature, since this feature protects virtual machines against operating-system-level failures and not virtual machine failures.

 

Reference:

http://pubs.vmware.com/vsphere-55/index.jsp#com.vmware.vsphere.avail.doc/GUID-FA8B166D-A5F5-47D3-840E-68996507A95B.html
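The restart-order behaviour described above can be sketched as a small pure-Python function (a simplification; real vSphere HA also applies retries and admission-control checks, and the VM names below are hypothetical):

```python
# Sketch: order VMs for restart after a host failure.  High restarts before
# Medium before Low, and "Disabled" VMs are skipped entirely.

PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}

def restart_queue(vms):
    """vms: list of (name, priority) pairs; returns names in restart order."""
    eligible = [(n, p.lower()) for n, p in vms if p.lower() != "disabled"]
    return [n for n, p in sorted(eligible,
                                 key=lambda np: PRIORITY_ORDER[np[1]])]

vms = [("vc01", "High"), ("file01", "Disabled"),
       ("db01", "Medium"), ("test01", "Low")]
print(restart_queue(vms))  # ['vc01', 'db01', 'test01']
```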

configured failover capacity n/a


Hello,

 

My HA cluster setup doesn't seem to work.

Failover of the VMs is not working.

 

I noticed "configured failover capacity n/a" on the Summary tab of the cluster.
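For context, the "configured failover capacity" figure comes from HA's slot-based admission control. A simplified sketch of the arithmetic, with hypothetical numbers (memory slots only; the real algorithm also computes CPU slots and takes the more restrictive result):

```python
# Sketch of the slot-based "host failures tolerated" arithmetic behind the
# failover capacity figure.  Simplified: memory slots only.

def failover_capacity(host_mem_mb, slot_mem_mb, powered_on_vms):
    """How many host failures the cluster could tolerate, slot-policy style."""
    slots_per_host = [h // slot_mem_mb for h in host_mem_mb]
    total = sum(slots_per_host)
    # Remove the largest hosts first (worst case) while demand still fits.
    remaining, tolerated = total, 0
    for s in sorted(slots_per_host, reverse=True):
        if remaining - s >= powered_on_vms:
            remaining -= s
            tolerated += 1
    return tolerated

# Two 64 GB hosts, 1 GB memory slots, 40 powered-on VMs:
print(failover_capacity([65536, 65536], 1024, 40))  # 1
```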

 

I have been searching across the internet but can't find a solution.

 

Can somebody help me, please?

 

I am testing this out for my boss.

We are currently working with Linux (DRBD and KVM) to virtualize.

But we want to change to a better solution.

 

Thanks !

 

John L.

Potential failover solutions


Hi,

fairly new to VMware, so bear with me... Hope this is the right forum...

 

We have a site remote from the main datacentres; it has one host (ESXi 5.5) with 3 VMs. We'll call this site H.

 

One of the data centres is a Hyper-V environment (can I say that on here? /joke!). We'll call this site L.

The other data centre is going to have all its kit replaced at some stage in 2014. We'll call this site I.

 

I would like to have some form of failover in the event of a hardware failure, automatically would be great.

 

 

Site connectivity is not ideal...

 

Site H is connected to Site I by a 10Mb MPLS connection that is becoming congested.

Site H is connected to Site L on a 100MB P2P connection, that is operating nicely.

 

 

The servers are

1. Domain controller - file shares, DHCP, DNS, printing, AV distribution

2. SQL production

3. SQL beta/development/training environments

 

Any thoughts or directions to research will be appreciated!

Getting vSphere HA errors, please help


We have 4 ESXi hosts

I have ensured that they can all ping each other successfully via vSwitch0 (which carries the management + vMotion traffic).

 

I am seeing the following errors:

 

"The vSphere HA agent on this host cannot reach some of the management network addresses of other hosts, and HA may not be able to restart VMs if a host failure occurs: server.domain.local; 192.168.x.x"

 

"The vSphere HA availability state of this host has changed to Slave"

 

 

"vSphere HA detected that this host is in a different network partition than the master to which vCenter Server is connected"

 

Any help is appreciated.
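For reference, when hosts' management networks span different subnets, additional HA isolation addresses are commonly set as cluster advanced options. The option names below are the standard vSphere 5.x ones; the addresses are placeholders for your own pingable gateways:

```
das.isolationaddress0 = 192.168.x.1     # reachable address on the first management network
das.isolationaddress1 = 192.168.y.1     # reachable address on the second network
das.usedefaultisolationaddress = false  # don't rely on the default gateway alone
```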
