HA error

For the past few weeks we have been getting a weird HA error.

There are errors on the Events tab:
Ha agent on XXXESX1 in cluster XXX in cluster XXX in datacenterXXX has an error.

After that, the following messages appear:
Insufficient resources to satisfy HA failover on cluster XXX in datacenterXXX
Ha agent on XXXESX2 in cluster XXX in cluster XXX in datacenterXXX has an error.
Unable to contact a primary HA agent in cluster XXX in datacenterXXX

After this, the virtual machines show as disconnected (they keep running, though).

An ssh console connection to the ESX host gives an error:
resource temporary unavailable

It also seems to happen at fairly regular intervals.
At least, the time at which the first ESX server reports the error is roughly the same each time (around 4:50 am).
There are roughly 4-5 days between occurrences.

Currently, the only fix we have found is to reboot both ESX servers.

Our setup is one VirtualCenter server (version 2.5) and two ESX servers (version 3.5).
We currently have only a few virtual machines running.

I'm happy to provide any additional information.
Thanks for all your help.

RE: HA error

Do you have the required entries in the hosts file on both ESX servers? Also, what happens if you disable and re-enable HA?

"Insert funny comment in here!"

RE: HA error

Host entries are as follows:
ESX1 has: localhost.localdomain localhost esx1.domain.com esx1

ESX2 has basically the same, but with ESX2 information.

We tried reconfiguring HA on the host. It produces the following events:
1. Ha is being disabled on ESX2 in cluster XXX in datacenter XXX.
2. Error detected on ESX2 in XXX : cmd remove failed
3. Unable to contact a primary HA agent in cluster XXX in XXX


RE: HA error

Stop HA on both ESX servers, then re-enable it on both. I had a similar problem. Also make sure each server can ping the other by name.

RE: HA error

Just a follow-up...
We called VMware support, and it seems the problem was caused by Pegasus.
According to VMware, Pegasus is used for the ESX Health Status feature. It seems that Pegasus kept launching new processes on the ESX servers. Stopping the pegasus service and disabling it from autostarting allowed us to restart HA, and it seems to be working OK for the moment.

The reason for the odd Pegasus behavior is currently unknown, and they are looking into it. If and when it gets resolved, I'll post the results here.
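
For anyone who wants to check for the same symptom: a service that keeps spawning processes would fit the ssh "resource temporary unavailable" error, since that message is what a failed fork typically reports once a process limit is reached. Below is a rough Python 3 sketch (not something VMware gave us) that counts running processes per command name and lists any command above a threshold. The function names and the threshold are made up for illustration, and the ESX service console may not have a recent enough Python to run it as written.

```python
import subprocess
from collections import Counter


def count_commands(ps_output):
    """Count processes per command name.

    Expects one command name per line, as printed by `ps -e -o comm=`.
    """
    names = (line.strip() for line in ps_output.splitlines())
    return Counter(name for name in names if name)


def commands_over(ps_output, threshold):
    """Return {command: count} for commands with at least `threshold` processes."""
    counts = count_commands(ps_output)
    return {name: n for name, n in counts.items() if n >= threshold}


def current_ps_output():
    """Run ps on the local machine and return its output as text."""
    result = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `commands_over(current_ps_output(), 50)` returns the commands with 50 or more running instances. The value 50 is an arbitrary example, so pick a threshold that makes sense for the host.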

RE: HA error

Can your ESX servers ping each other by both short name and long name (i.e. by esx1 and by esx1.domain.com)?

VMware recommends that if you are using the hosts file for name resolution, you include both the short name and the long name (as you have done), and that each hosts file has entries for all ESX servers, not just the local one.
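
If it helps, here is a minimal Python 3 sketch of that check: it tries to resolve the short and long name of every host and reports the ones that fail. The function name `check_names` is only illustrative, and the host names and domain are the placeholders used in this thread. It only tells you about name resolution on the machine it runs on, so the result is meaningful only from a machine that uses the same hosts file or DNS as the ESX servers.

```python
import socket


def check_names(hosts, domain, resolver=socket.gethostbyname):
    """Resolve the short and long name of each host.

    Returns {name: ip_address}, with None for names that do not resolve.
    `resolver` can be swapped out, which makes the function easy to test.
    """
    results = {}
    for host in hosts:
        for name in (host, f"{host}.{domain}"):
            try:
                results[name] = resolver(name)
            except OSError:  # socket.gaierror is a subclass of OSError
                results[name] = None
    return results
```

For example, `check_names(["esx1", "esx2"], "domain.com")` returns a dictionary with four entries. Any entry whose value is `None` is a name that needs a hosts file or DNS entry.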

"Insert funny comment in here!"
