Intel® NetStructure™ ZT 7102 Chassis Management Module
CMM Redundancy, Synchronization and Failover
42 Technical Product Specification
Note: The frequency of the ping to the first trap address can vary from one second to approximately 20
seconds.
2. Unhealthy Ethernet Switch:
A condition is asserted and a health score is computed if the active CMM’s corresponding
Ethernet switch is not healthy or not present. The switch health is determined by the state of
the HEALTHY# hardware signal coming from the Ethernet switch. Refer to the chassis
specification to see which switch corresponds to the CMM. If the switches for both CMMs are
unhealthy or not present in the chassis, a failover can still occur based on other failover
conditions, depending on the CMM health scores.
3. Critical events on the active CMM:
A condition occurs if the active CMM has critical events for any of the CMM sensors (not
chassis or blade sensors). Critical events are events associated with crossing an upper or lower
nonrecoverable threshold of a sensor. If both CMMs have critical CMM events, then the
numbers of major and minor CMM events are examined to decide if a failover should occur. The
major event counts are compared first; if they are equal, the minor event counts are compared.
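The tie-breaking order described in item 3 can be sketched as follows. This is an illustration only, not the CMM firmware logic: the names CmmEventCounts and standby_preferred are hypothetical, and the sketch assumes that fewer events means healthier, that the standby is preferred when only the active CMM has critical events, and that an exact tie leaves the active CMM in place. The specification does not state any of these details, and the actual decision is driven by the CMM health scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CmmEventCounts:
    """Counts of events on a CMM's own sensors (not chassis or blade sensors)."""
    critical: int
    major: int
    minor: int


def standby_preferred(active: CmmEventCounts, standby: CmmEventCounts) -> bool:
    """Return True if the standby CMM looks healthier under the assumed rule."""
    if active.critical == 0:
        # No critical CMM events on the active CMM: this condition is not asserted.
        return False
    if standby.critical == 0:
        # Assumption: only the active CMM has critical events, so prefer the standby.
        return True
    # Both CMMs have critical events: compare major counts, then minor counts.
    if active.major != standby.major:
        return standby.major < active.major
    # Assumption: an exact tie keeps the current active CMM.
    return standby.minor < active.minor
```

For example, if both CMMs have critical events and the active CMM has two major events while the standby has one, the sketch prefers the standby regardless of the minor event counts.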
4.4.4 Scenarios That Fail Over to an Equally Healthy CMM
The following conditions will cause a failover only if the health score of the standby CMM is equal
to that of the active CMM:
1. The ejector latch on the active CMM is opened.
2. A manual failover is executed on the active CMM.
4.4.5 Failover Timing
Times required to detect different possible failover conditions and perform data synchronization
vary. For example, detecting network connection loss can take up to approximately 20 seconds.
Complete synchronization typically takes 7 to 30 seconds, assuming both CMMs are fully
booted and a healthy Ethernet network connection and IPMB connection exist between the two
CMMs. Synchronization with a newly inserted CMM can take two minutes, since a newly inserted
CMM needs that time to boot and initialize.
Once the CMM data is initially synchronized, failover happens instantaneously at the hardware
level. However, the CMM software requires some time to initialize various components following
a failover. Software-based remote management applications accessing the CMM will need to
reconnect to the newly active CMM. The newly active CMM may respond with unexpected errors
while initializing.
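A remote management application can tolerate the post-failover initialization window by retrying its reconnect. The following is a minimal sketch, not part of the CMM software: call_with_retry is a hypothetical helper, the attempt count and delay are arbitrary illustrative values, and the operation passed in stands for whatever connect-and-query call the application already uses.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on any exception with a fixed delay."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:
            # The newly active CMM may still be initializing; remember the error.
            last_error = error
            if attempt < attempts:
                sleep(delay_s)
    # Every attempt failed: surface the most recent error to the caller.
    raise last_error
```

The sleep parameter is injectable so the helper can be exercised without real delays. In practice the delay and attempt count would be tuned to the initialization time observed on the target chassis.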
4.4.6 Manual Failover
The following command can be issued to the active CMM to cause a failover manually to the
standby CMM:
cmmset -l cmm -d failover -v 1
A manual failover can only be initiated on the active CMM. A failover will only occur if the
standby CMM is at least as healthy as the active CMM. Once the command executes, the former
standby CMM immediately becomes the active CMM.
If the failover cannot occur, the CLI indicates the reason, and a SEL event is recorded.
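For scripted use, the manual failover command can be wrapped as sketched below. This is a hedged example: only the cmmset command line itself comes from this section. The function names, the injectable runner, and the assumption that the script runs in a shell on the active CMM are illustrative. The specification does not define the meaning of the exit status, so the sketch returns it together with the CLI text without interpreting either.

```python
import subprocess
from typing import Callable, List, Tuple

# Command taken verbatim from this section.
FAILOVER_COMMAND: List[str] = ["cmmset", "-l", "cmm", "-d", "failover", "-v", "1"]


def _run(argv: List[str]) -> Tuple[int, str]:
    """Default runner: execute argv locally and capture its output."""
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.returncode, result.stdout + result.stderr


def request_manual_failover(
    runner: Callable[[List[str]], Tuple[int, str]] = _run,
) -> Tuple[int, str]:
    """Issue the manual failover command and return (exit status, CLI text)."""
    # Pass a copy so a runner cannot modify the shared command list.
    return runner(list(FAILOVER_COMMAND))
```

The caller should read the returned text for the reason reported by the CLI when the failover does not occur. A different runner can be supplied to send the same command over whatever remote session the application already has.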