Sunday, April 8, 2012

Remove the “host currently has no management network redundancy” warning from your whitebox HA enabled ESX cluster

When you have finished building your ESX cluster from so-called whitebox machines, you might see a warning at the cluster level telling you that the management network has no redundancy. This is probably correct, because whitebox hosts usually don't have two NICs for the management network.

To get rid of this irritating warning message, do the following (a scripted alternative is sketched after the list).
  1. Go to the properties of your cluster
  2. Select HA from the left pane
  3. Click the ‘Advanced Options’ button
  4. Fill in the first column of the first row by double-clicking it and typing the option name 'das.ignoreRedundantNetWarning'
  5. Fill in the second column of the same row by double-clicking it and typing the value 'True'
  6. Close the Advanced Options window
  7. Now deselect the option ‘Enable HA’ and press OK
  8. HA will be disabled; this will take some time
  9. Go back to the options and select ‘Enable HA’ and press OK
  10. HA will be enabled and the warning will be gone
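
If you have more than one cluster to fix, the same steps can be scripted against the vSphere API. Below is a minimal pyVmomi sketch of one way to do it: the vCenter address, credentials and cluster name are placeholders, error handling is left out, and the existing advanced options are re-read first because a reconfigure replaces the option list as a whole.

    # Minimal pyVmomi sketch of the manual steps above. Placeholders: vCenter
    # address, credentials and cluster name. Error handling omitted.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVim.task import WaitForTask
    from pyVmomi import vim

    si = SmartConnect(host='vcenter.example.com', user='administrator',
                      pwd='secret',
                      sslContext=ssl._create_unverified_context())  # lab only
    content = si.RetrieveContent()

    # Find the cluster by name.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == 'WhiteboxCluster')

    def reconfigure(das_config):
        spec = vim.cluster.ConfigSpecEx(dasConfig=das_config)
        WaitForTask(cluster.ReconfigureComputeResource_Task(spec=spec, modify=True))

    # Steps 4-5: add the advanced option. The option array is replaced as a
    # whole on reconfigure, so keep whatever is already there.
    opts = list(cluster.configurationEx.dasConfig.option or [])
    opts.append(vim.option.OptionValue(key='das.ignoreRedundantNetWarning',
                                       value='true'))
    reconfigure(vim.cluster.DasConfigInfo(option=opts))

    # Steps 7-10: disable HA, then enable it again so the warning is re-evaluated.
    reconfigure(vim.cluster.DasConfigInfo(enabled=False))
    reconfigure(vim.cluster.DasConfigInfo(enabled=True))

    Disconnect(si)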

 

Friday, April 6, 2012

VMware HA

VMware HA: The number of heartbeat datastores for host is 1, which is less than required: 2

 

das.ignoreinsufficienthbdatastore – Disables configuration issues created if the host does not have sufficient heartbeat datastores for vSphere HA. Default value is false.
To suppress the message, add this advanced setting under the "vSphere HA" Advanced Options section and set its value to true. If you prefer to script it, see the sketch below.
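
The same append-and-reconfigure pattern from the pyVmomi sketch in the post above works here too. Assuming a `cluster` object (and the same imports) obtained as shown there, this prints the options that are already set and then adds the new one:

    # Assumes `cluster` (a vim.ClusterComputeResource), vim and WaitForTask
    # from the sketch in the post above.
    for o in cluster.configurationEx.dasConfig.option or []:
        print(o.key, '=', o.value)          # show what is already configured

    opts = list(cluster.configurationEx.dasConfig.option or [])
    opts.append(vim.option.OptionValue(key='das.ignoreinsufficienthbdatastore',
                                       value='true'))
    spec = vim.cluster.ConfigSpecEx(dasConfig=vim.cluster.DasConfigInfo(option=opts))
    WaitForTask(cluster.ReconfigureComputeResource_Task(spec=spec, modify=True))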



Supported vSphere 5.0 Advanced Options

There are several advanced configuration options that one can use to modify the operation of vSphere HA 5.0.  The following lists the supported options that relate to vSphere HA 5.0.  As the use of these options can significantly impact the operation of vSphere HA and hence the availability protection provided, it is highly recommended that users fully understand the use and ramifications of using these options.
A KB Article will be released soon for reference.
 HA Advanced Options
The following advanced options can be configured on a per-cluster basis through the use of the HA Advanced Options section of the user interface.  In some cases, the use of an option requires you to reconfigure vSphere HA on all hosts before the option takes effect.

das.isolationAddressX
Default: not set. Requires HA reconfiguration: yes.
Sets the address to ping to determine if a host is isolated from the network. This address is pinged only when heartbeats are not received from any other host in the cluster. If not specified, the default gateway of the management network is used. This default gateway has to be a reliable address that is available, so that the host can determine if it is isolated from the network. You can specify multiple isolation addresses (up to 10) for the cluster: das.isolationaddressX, where X = 1-10. Typically you should specify one per management network. Specifying too many addresses makes isolation detection take too long.

das.allowNetworkX
Default: not set. Requires HA reconfiguration: yes.
This option is only recommended for ESXi 3.5 hosts. To control the network selection for ESXi 4.0 and more recent hosts, use the UI to specify the port groups that are to be used for management. Port group names of management networks to use for HA communication. X should be replaced by 0-9. If not used, all appropriate networks will be used.

das.useDefaultIsolationAddress
Valid values: true/false. Requires HA reconfiguration: yes.
By default, vSphere HA uses the default gateway of the console network as an isolation address. This attribute specifies whether or not this default is used.

das.isolationShutdownTimeout
Default: 300. Requires HA reconfiguration: no.
The period of time in seconds the system waits for a virtual machine to shut down before powering it off. This only applies if the host's isolation response is Shut down VM.

das.maxvmrestartcount
Default: 5. Requires HA reconfiguration: no.
Defines the maximum number of times an HA master agent will try to restart a VM after a failure before giving up and reporting it was unable to restart the VM.

das.maxftvmrestartcount
Default: 5. Requires HA reconfiguration: no.
Defines the maximum number of times an HA master agent will try to start the secondary VM of a vSphere Fault Tolerance VM pair before giving up and reporting it could not.

das.ignoreRedundantNetWarning
Default: false. Valid values: true/false. Requires HA reconfiguration: no.
Suppresses the host config issue about lack of redundant management networks on a host in an HA enabled cluster.

das.vmMemoryMinMB
Default: 0. Requires HA reconfiguration: no.
Defines the default memory resource value assigned to a virtual machine if its memory reservation is not specified or zero. This is used for the Host Failures Cluster Tolerates admission control policy.

das.vmCpuMinMHz
Default: 256. Requires HA reconfiguration: no.
Defines the default CPU resource value assigned to a virtual machine if its CPU reservation is not specified or zero. This is used for the Host Failures Cluster Tolerates admission control policy. If no value is specified, the default is 256 MHz.

das.slotCpuInMHz
Default: not set.
Defines the maximum bound on the CPU slot size. If this option is used, the slot size is the smaller of this value or the maximum CPU reservation of any powered-on virtual machine in the cluster.

das.slotMemInMB
Default: not set.
Defines the maximum bound on the memory slot size. If this option is used, the slot size is the smaller of this value or the maximum memory reservation plus memory overhead of any powered-on virtual machine in the cluster.

das.includeFTcomplianceChecks
Default: true. Valid values: true/false. Requires HA reconfiguration: no.
Controls whether vSphere Fault Tolerance compliance checks should be run as part of the cluster compliance checks. Set this option to false to avoid cluster compliance failures when Fault Tolerance is not being used in a cluster.

das.maxFtVmsPerHost
Default: 4. Valid values: 0 means no limit. Requires HA reconfiguration: no.
Defines the maximum number of vSphere Fault Tolerance primary or secondary VMs that can be placed on a host during normal operation. When a value greater than zero is defined, attempts to power on more than the specified number of FT VMs on the same host will fail. Further, vSphere DRS, if enabled, won't exceed this limit. However, vSphere DRS won't correct any violations of the limit, and vSphere HA will ignore the limit when responding to a failure.

das.ignoreInsufficientHbDatastore
Default: false. Valid values: true/false. Requires HA reconfiguration: no.
Suppresses the host config issue that the number of heartbeat datastores is less than das.heartbeatDsPerHost.

das.heartbeatDsPerHost
Default: 2. Valid values: 2-5. Requires HA reconfiguration: yes.
Defines the number of required heartbeat datastores per host. vCenter Server will attempt to choose the specified number and, if it cannot, will report a configuration issue on the host. This issue can be suppressed using the das.ignoreInsufficientHbDatastore option.
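
To make the slot-related options above a little more concrete, here is a back-of-the-envelope Python illustration (not VMware code, and it ignores per-VM memory overhead) of how das.vmCpuMinMHz, das.vmMemoryMinMB, das.slotCpuInMHz and das.slotMemInMB interact when the Host Failures Cluster Tolerates policy computes a slot size:

    # Rough illustration only; the real admission control logic also adds
    # per-VM memory overhead and considers only powered-on VMs.
    def slot_size(cpu_res_mhz, mem_res_mb,
                  vm_cpu_min_mhz=256, vm_mem_min_mb=0,
                  slot_cpu_in_mhz=None, slot_mem_in_mb=None):
        # VMs without a reservation count as das.vmCpuMinMHz / das.vmMemoryMinMB.
        cpu = max(r or vm_cpu_min_mhz for r in cpu_res_mhz)
        mem = max(r or vm_mem_min_mb for r in mem_res_mb)
        # das.slotCpuInMHz / das.slotMemInMB put an upper bound on the slot size.
        if slot_cpu_in_mhz:
            cpu = min(cpu, slot_cpu_in_mhz)
        if slot_mem_in_mb:
            mem = min(mem, slot_mem_in_mb)
        return cpu, mem

    # Three VMs with CPU reservations of 2000/0/500 MHz and memory reservations
    # of 4096/0/1024 MB, with das.slotCpuInMHz capping the CPU slot at 1000 MHz:
    print(slot_size([2000, 0, 500], [4096, 0, 1024], slot_cpu_in_mhz=1000))
    # -> (1000, 4096)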

HA Agent (FDM) Configuration Options

The following options are set on a per-host basis by editing the fdm.cfg file on the host. Alternatively, they can be set on a per-cluster basis through the vSphere Client by prepending 'das.config.' to the option name (a short sketch of this follows the table). Use of any of these options requires a restart for them to take effect.

Cluster Manager

fdm.deadIcmpPingInterval
Default: 10.
ICMP pings are used to determine whether a slave host is network accessible when the FDM on that host is not connected to the master. This parameter controls the interval (expressed in seconds) between pings.

fdm.icmpPingTimeout
Default: 5.
Defines the time to wait in seconds for an ICMP ping reply before assuming the host being pinged is not network accessible.

fdm.hostTimeout
Default: 10.
Controls how long a master FDM waits in seconds for a slave FDM to respond to a heartbeat before declaring the slave host not connected and initiating the workflow to determine whether the host is dead, isolated, or partitioned.

fdm.stateLogInterval
Default: 600.
Frequency in seconds at which to log cluster state.

fdm.nodeGoodness
Default: 0.
When a master election is held, the FDMs exchange a goodness value, and the FDM with the largest goodness value is elected master. Ties are broken using the host IDs assigned by vCenter Server. This parameter can be used to override the computed goodness value for a given FDM. To force a specific host to be elected master each time an election is held and the host is active, set this option to a large positive value. This option should not be specified on a per-cluster basis.

Inventory Manager

fdm.ft.cleanupTimeout
Default: 900.
When a vSphere Fault Tolerance VM is powered on by vCenter Server, vCenter Server informs the HA master agent that it is doing so. This option controls how many seconds the HA master agent waits for the power-on of the secondary VM to succeed. If the power-on takes longer than this time (most likely because vCenter Server has lost contact with the host or has failed), the master agent will attempt to power on the secondary VM.

fdm.storageVmotionCleanupTimeout
Default: 900.
When a Storage vMotion is done in an HA enabled cluster using pre-5.0 hosts and the home datastore of the VM is being moved, HA may interpret the completion of the Storage vMotion as a failure and may attempt to restart the source VM. To avoid this issue, the HA master agent waits the specified number of seconds for a Storage vMotion to complete. When the Storage vMotion completes or the timer expires, the master will assess whether a failure occurred.

Policy Manager

fdm.policy.unknownStateMonitorPeriod
Default: 10.
Defines the number of seconds the HA master agent waits after it detects that a VM has failed before it attempts to restart the VM.

FDM Service

fdm.event.maxMasterEvents
Default: 1000.
Defines the maximum number of events cached by the master.

fdm.event.maxSlaveEvents
Default: 600.
Defines the maximum number of events cached by a slave.
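
As a small illustration of the per-cluster route, the sketch below (same assumptions as the pyVmomi examples earlier on this page: a `cluster` object plus the vim and WaitForTask imports) sets fdm.hostTimeout as a cluster advanced option by prefixing it with 'das.config.'; the value of 20 is just an example.

    # Example only: raise fdm.hostTimeout to 20 seconds for the whole cluster
    # via the 'das.config.' prefix; HA must be reconfigured to pick it up.
    opts = list(cluster.configurationEx.dasConfig.option or [])
    opts.append(vim.option.OptionValue(key='das.config.fdm.hostTimeout', value='20'))
    spec = vim.cluster.ConfigSpecEx(dasConfig=vim.cluster.DasConfigInfo(option=opts))
    WaitForTask(cluster.ReconfigureComputeResource_Task(spec=spec, modify=True))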

VPXD Configuration Options
The following options configure the behavior of vpxd. They are set by editing the vpxd.cfg file, which affects all clusters in the inventory of vCenter Server (a sample vpxd.cfg fragment is sketched after the table). All of these options require a restart of vpxd in order for them to take effect.
vpxd.das.reportNoMasterSec
Default: 120.
How long to wait, in seconds, before appending a cluster config issue reporting that vCenter Server was unable to locate the HA master agent for the corresponding cluster.

vpxd.das.sendProtectListIntervalSec
Default: 60.
Minimum time (in seconds) between consecutive calls by vCenter Server to the HA master agent to request that it protect a new VM.

vpxd.das.aamMemoryLimit
Default: 100.
Memory limit in MB for the AAM resource pool (used for FDM).

vpxd.das.electionWaitTimeSec
Default: 120.
When configuring HA on a host, how long to wait, in seconds, after sending the host list to a new host for the FDM to become configured (change to master or slave state).

vpxd.das.heartbeatPanicMaxTimeout
Default: 60.
Defines the value HA uses (in seconds) when configuring the host's Misc.HeartbeatPanicTimeout advanced option.

vpxd.das.slotMemMinMB
Default: 0.
Default value in MB to use for the memory reservation if no user value is set on any VM. Used to compute the slot size for HA admission control.

vpxd.das.slotCpuMinMHz
Default: 32.
Default value in MHz to use for the CPU reservation if no user value is set on any VM. Used to compute the slot size for HA admission control.
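
vpxd.cfg is an XML file, and as far as I know the dotted option names above map to nested elements under its <config> root, so a fragment along the following lines (the values are just the defaults from the table) should set two of them. Treat this as a sketch: back up vpxd.cfg before editing, merge the elements into the existing file rather than replacing it, and restart vpxd afterwards.

    <config>
      <vpxd>
        <das>
          <electionWaitTimeSec>120</electionWaitTimeSec>
          <slotCpuMinMHz>32</slotCpuMinMHz>
        </das>
      </vpxd>
    </config>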

Tuesday, April 3, 2012

Automatic NTP Update at Boot Time


Resetting the clock at boot time with ntpdate

The system clock is initialized from the hardware clock by the /etc/init.d/boot.clock script. (At least this is true in SuSE 8.1 and later; when booting Red Hat Linux 6.x, the system clock is initialized by /etc/rc.d/rc.sysinit instead.) Unfortunately, when this runs, the network has not yet been fully initialized, so it is not possible to query servers. Instead, it is better to run ntpdate from the /etc/init.d/boot.local script (called /etc/rc.d/rc.local in older versions), which is run as the very last thing when booting up (in run levels 2, 3, and 5, which is appropriate).

Setting the system clock twice like this may leave it off by a bit during boot, but the amount by which it is off can be limited for the next boot by resetting the hardware clock after initializing the system clock from the NTP servers. You can do all of this by adding the following two lines to the end of your /etc/init.d/boot.local script (after replacing the server names):
    ntpdate -sb server1 [server2 ...]
    hwclock --systohc
The next time you boot, the hardware clock will only be off by the amount of drift between boots. (You should first check to see that you have hwclock on your system.)

The "-b" option forces the system clock to be set in one jump, rather than attempting to slew it gradually, and is recommended by the ntpdate documentation page when booting.
Note that this double clock setting procedure is essentially equivalent to what the standard boot-time startup scripts for ntpd do, so making the clock jump back and forth at boot time can't be all that bad. (It can make the log files harder to decipher, though.)
Unfortunately, the system boot scripts are very vendor-dependent, so this recipe may not work for your configuration. If there is an /etc/init.d/rc.local or /etc/init.d/boot.local script, it probably works the same way; otherwise, you will need to figure out something different for your flavor of Linux/*BSD/etc.
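
Putting the pieces together, a commented fragment for the end of /etc/init.d/boot.local (or rc.local, depending on your distribution) might look like the following; the server names are placeholders, the hwclock path may differ on your system, and the existence check follows the advice above.

    # --- appended to /etc/init.d/boot.local (rc.local on Red Hat-style systems) ---
    # -b steps the clock in one jump (recommended at boot), -s sends output to syslog.
    ntpdate -sb ntp1.example.com ntp2.example.com

    # If hwclock is present, copy the corrected system time back to the hardware
    # clock so the next boot starts off closer to the right time.
    if [ -x /sbin/hwclock ]; then
        /sbin/hwclock --systohc
    fi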