Post credited to:
The Tekmart Support Team.
When an EqualLogic Controller Fails
The procedures may vary when a control module (member) is removed or fails depending on the number and type of control modules, the network cabling configuration, and the cache mode settings. Please see the Hardware Maintenance manual for your array model for detailed information about array-specific control module and network failover behavior. Use this guide at your own risk. SPS Pros is not responsible for faults, failures, data loss, etc. as a result of following this guide.
In a dual control module array, if the secondary control module is removed or fails, the remaining control module may enter write-through mode depending on the cache mode policies.
In a dual control module array, if the active control module fails, the secondary control module automatically takes over and becomes active. If the control module that takes over has a network connection, the failover will be transparent to applications; however, iSCSI initiators must reconnect to the group IP address.
If the only functioning control module fails, the array will be inaccessible from the network and data loss is possible.
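The three failure cases above can be summarized in a small lookup. This is only an illustrative sketch of the behavior described in this post, not anything the array itself runs; the names `FailedModule` and `expected_outcome` are made up for the example, and actual behavior depends on your array model, cabling, and cache policies.

```python
from enum import Enum


class FailedModule(Enum):
    SECONDARY = "secondary"  # secondary module in a dual control module array
    ACTIVE = "active"        # active module in a dual control module array
    ONLY = "only"            # the only functioning control module


def expected_outcome(failed: FailedModule) -> dict:
    """Return the outcome this post describes for each failure case."""
    if failed is FailedModule.SECONDARY:
        return {
            "array_reachable": True,
            "failover": False,
            "note": "remaining module may enter write-through mode, "
                    "depending on cache mode policies",
        }
    if failed is FailedModule.ACTIVE:
        # Assumes the module that takes over is cabled to the network.
        return {
            "array_reachable": True,
            "failover": True,
            "note": "iSCSI initiators must reconnect to the group IP address",
        }
    # No functioning control module is left.
    return {
        "array_reachable": False,
        "failover": False,
        "note": "data loss is possible",
    }
```

For example, `expected_outcome(FailedModule.ONLY)["array_reachable"]` is `False`, which is the case to avoid by replacing a failed module promptly.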
If a control module fails, replace it as soon as possible with the same control module type.
Caution: Do not mix control module types in an array.
For information about replacing a control module, see the Hardware Maintenance manual for your array model or contact your array support provider.
Do not remove a failed control module until you have a replacement.
NOTE: For proper cooling, do not leave a control module slot empty. If an array will operate for a long time with only one control module, you must install a blank control module in the empty slot. You can order a blank control module from your PS Series array service provider. If you remove the active control module, there will be a short interruption as failover to the secondary control module occurs.
How to Tell if Dell EqualLogic Controller has Failed
You can identify a failed control module by:
- LEDs – A failed control module may show the ACT LED unlit (no color), the ERR LED red, and the PWR LED off
- Messages – A message on the LCD panel (located behind the bezel), on the console, in the event log, or in the Group Manager GUI Alarms panel describes a control module failure
- Group Manager GUI and CLI Output – The Member Controllers window, or the output of the CLI command “member select <member name> show controllers”, shows the control module status as “not installed”.
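If you record these observations by hand (for example in a support ticket), the checks above can be combined as in the sketch below. The function name `failure_indicators` and its string arguments are hypothetical; nothing here reads the LEDs or queries the array, it only encodes the LED pattern and the “not installed” status described in this post.

```python
def failure_indicators(act_led: str, err_led: str, pwr_led: str,
                       reported_status: str) -> list:
    """Collect the reasons, per this post, to suspect a failed control module.

    All arguments are values typed in by the person inspecting the array.
    """
    reasons = []
    leds = (act_led.strip().lower(), err_led.strip().lower(),
            pwr_led.strip().lower())
    # Pattern from the post: ACT unlit, ERR red, PWR off.
    if leds == ("off", "red", "off"):
        reasons.append("LED pattern matches a failed module")
    # Status shown in the Member Controllers window or CLI output.
    if reported_status.strip().lower() == "not installed":
        reasons.append("status reported as 'not installed'")
    return reasons
```

An empty list means neither indicator was observed; it does not prove the module is healthy, so still check the LCD panel, console, event log, and Alarms panel for messages.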
If a controller/control module has failed you must replace it.
How to Replace a Dell EqualLogic Controller
- Follow safety and/or ESD protocols.
- Make sure the faulty controller is the SECONDARY controller. If not, you must fail over the array (see below).
- With the faulty controller in the secondary position, disconnect all of its cables, noting where each one was connected.
- Remove the controller by operating latches. The array should continue to function on the active controller.
- Remove the flash card from the faulty controller. (Depending on the model, this may be a compact flash card or a micro SD card.)
- Insert the flash card into the replacement controller.
- Correctly orient and insert the replacement controller and ensure it is properly seated.
- Reconnect cables to the replacement controller.
- Check LEDs and GUI to ensure that the replacement controller has come online.
NOTE: If the array reports a critical hardware error after the controller is replaced, but no hardware issues are showing, remove the new controller following the steps above, wait 60 seconds, and then reseat it. This should clear the error.
If two control modules are installed but only one appears in the GUI or CLI, the control module may not be properly installed. Reseat the control module. If both control modules still do not appear in the GUI or CLI, they may not be running the same firmware. Contact your array support provider.
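The key ordering rule in the replacement procedure above is that the faulty controller must be in the secondary position before it is pulled. A minimal sketch of that rule as a checklist generator follows; `replacement_plan` is a hypothetical helper that only builds a list of steps from this post and performs no action on the array.

```python
def replacement_plan(faulty_is_active: bool) -> list:
    """Return the replacement steps from this post, in order.

    If the faulty controller is currently active, a failover step is
    inserted before any cables are disconnected.
    """
    steps = [
        "Follow safety and ESD protocols",
        "Disconnect all cables from the faulty controller, noting locations",
        "Remove the faulty controller using its latches",
        "Move the flash card to the replacement controller",
        "Insert and seat the replacement controller",
        "Reconnect the cables",
        "Check LEDs and GUI to confirm the replacement is online",
    ]
    if faulty_is_active:
        # The faulty controller must be secondary before it is removed.
        steps.insert(1, "Fail over the array so the faulty controller "
                        "becomes secondary")
    return steps
```

Calling `replacement_plan(True)` gives eight steps with the failover second; `replacement_plan(False)` gives the seven steps listed above unchanged.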
How to Fail Over Dell EqualLogic Controller Array
To fail over from one control module to the other, you must restart the active control module. This forces the secondary control module to take over and become active, and the previously active control module becomes the secondary.
- In the GUI (Group Manager) click on “Members” and select the array in question.
- On the tabs on the right side of the GUI, click on “Maintenance”.
- Locate and click on the “Restart” button.
- You will be prompted for the grpadmin password.
NOTE: You will see alerts such as “Unable to communicate to the other control module, active fail over cannot occur”. This should resolve after a minute or two once the restart has completed.