Hard drives are the lifeblood of your business, whether in your computers and laptops or within your network infrastructure through servers, SANs, RAID arrays, and more.
No matter how well designed or sturdy a hard drive may be, all hard drives will eventually fail. Sometimes a drive will show symptoms of impending failure, giving users time to back up their data and search for a replacement.
Signs of Hard Drive Failure Include:
Abnormal heat output
Whirring, clicking, or other sounds
Other times, hard drives fail without warning – and that total failure can result in the loss of all data on that particular drive. The data recovery process can be expensive and time-consuming, can cost you business, and may ultimately fail to recover hundreds of rands’ worth of digital media; thousands of rands’ worth of customer records, financial records, processes, and training documents; or more.
Considerations When Buying a Used Hard Drive
If your business is using legacy equipment, replacement parts may have been declared end-of-life (EOL) by the original manufacturer, such as EMC, IBM, Dell, EqualLogic, or Sun. When this happens, if you do not have any spares on hand, the used/refurbished market is your best bet for finding a replacement.
If you do not have any contacts in the used market, you may be tempted to turn to eBay. Many reputable used companies sell on eBay, but many parts listed on eBay are sold by liquidation companies who do not have the means to test the equipment they acquire, and possess little knowledge about what they’re listing outside of information presented on the label itself.
The dangers of buying hard drives on eBay include:
1. The item may not function.
2. The item may be listed incorrectly.
3. The hard drive may not have been wiped.
4. The seller may be overseas, or may care little about returning the item or troubleshooting problems.
5. Stock may be limited, so the seller may not be able to replace faulty equipment.
6. You risk further downtime dealing with slow shipping or incorrect/faulty products.
How to Buy Used Enterprise Equipment Online
If you’re going to buy used hard drives or other replacement IT equipment, it’s pragmatic to buy from a professional and reputable used IT equipment company. Not only can they test equipment and likely have a quality control program in place, they will also have a DOA return policy and offer strong customer service.
1. Do a Google (or other search engine) search using the part numbers on your failed hard drive. Hint: Using the manufacturer part number may provide the most accurate search results.
2. Look for professional retail websites that offer secure online purchasing (look for the HTTPS in the URL, a shield, or badges from Trustwave, Verisign, etc.).
3. Make sure the product listing has an “Add to Cart” or “Buy It Now” button – not just a request-a-quote form!
4. If you’re in a pinch, look for sites that offer same-day or overnight shipping. Be sure to read their shipping and return policies.
5. Look for sites with reviews of the used hard drive you will be buying.
6. Avoid sites that offer “instant quotes” or want you to call for pricing – these can leave you waiting days for responses and force you to compare prices and options from multiple companies. You can also end up on unwanted mailing lists through data mining.
Conclusion on Buying Used Enterprise Hard Drives Online
Buying one or more previously used hard drives can be a quick and inexpensive way to bring your system back to an operational state. If you’re buying from a trustworthy, knowledgeable business, buying online can be fast, rewarding, and cost-efficient.
As your EMC CLARiiON, VNX, and AX series grow older, sourcing the exact part number replacements for hard drives can get harder and harder. This guide aims to educate you on how to determine the part number and see compatible part numbers for your system.
Determining EMC Hard Drive Part Numbers
There is a good chance that there are many part numbers listed on a single drive pulled from an EMC array. The generic EMC model number does not appear on the drive itself (e.g. EMC CX-SA07-010 1TB SATA Hard Drive). The disk part number (PN) appears on a label on the front of the disk carrier. This is a nine-digit Top Level Assembly (TLA) part number, like PN 005123456. Several TLA part numbers can fall under the same EMC model number.
Example: Your failed hard drive has TLA part number 005048797 and your replacement has TLA part number 005049070. Both correspond to the same EMC hard drive model number, CX-SA07-010, and are hot-swappable.
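The relationship above lends itself to a simple lookup. Here is a minimal sketch in Python; the two TLA entries come from the example above, and the table and function names are ours – a real table would be built from EMC’s own compatibility data.

```python
# Several Top Level Assembly (TLA) part numbers can map to one EMC model
# number. The two TLAs below come from the example above.
TLA_TO_MODEL = {
    "005048797": "CX-SA07-010",
    "005049070": "CX-SA07-010",
}

def are_hot_swappable(tla_a, tla_b):
    """Two drives are interchangeable if their TLAs share a model number."""
    model_a = TLA_TO_MODEL.get(tla_a)
    return model_a is not None and model_a == TLA_TO_MODEL.get(tla_b)

print(are_hot_swappable("005048797", "005049070"))  # True
```

The key point is that you should never match drives on TLA alone – two different TLAs can still be valid replacements for each other if they resolve to the same model number.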
Finding TLA Part Number in Navisphere
Follow these steps to find the TLA part number for a drive in a CLARiiON array:
Open Navisphere by typing in the storage processor’s IP address in a web browser.
Open array with the fault. This is usually indicated by a red “F.”
Open the Bus and Enclosure with the fault.
Right-click the disk above or below the disk with the fault and select Properties. The TLA part number should be listed at the bottom.
Follow these steps to check and retrieve necessary information for single disk failure:
Check the current status:
1. Log in to Navisphere manager, right-click CLARiiON name and select “Faults.”
2. Confirm that the drive x_x_x is the only faulty drive that is showing as “Removed.”
3. Expand “LUN Folder” and expand “Unowned LUNs.” Make sure no user LUN is unowned. (It’s normal to see hot spares in the unowned LUNs section.)
Get the TLA Part number of the faulty disk:
1. Right-click SP A or SP B, select “View Events”, and click “Yes” to continue.
2. Click “Filter” in the new window, uncheck “Warning” and “Information,” and click “OK.”
3. Locate Event code “0x7127897c” and Description “Disk(Bus x Enclosure x Disk x) failed,” and double-click to open it.
4. Record the TLA part number in the description field. It is a 9-digit number starting with “005.”
5. Refer to the following format:
Only one disk failure
No unowned LUNs
Disk Slot: x_x_x
Disk P/N: 005xxxxxx
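Since the TLA part number is always a nine-digit value beginning with “005”, it can be pulled out of an event description with a simple pattern match. A sketch, assuming a description string modelled on the event format above (the exact wording of real events may differ):

```python
import re

# A TLA part number is a nine-digit value beginning with "005".
TLA_PATTERN = re.compile(r"\b005\d{6}\b")

def extract_tla(event_description):
    """Pull the TLA part number out of a Navisphere event description."""
    match = TLA_PATTERN.search(event_description)
    return match.group(0) if match else None

# Sample text modelled on the event format described above.
desc = "Disk(Bus 0 Enclosure 1 Disk 4) failed. Part Number 005048797"
print(extract_tla(desc))  # 005048797
```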
Decoding EMC Model Part Numbers
The first two characters in the EMC model part number indicate the product line these drives are for:
CX – CX series
AX – AX series
VX/V2/VS/V3/V4 – VNX series
The next four characters indicate drive type and disk speed (RPM) or, in the case of some Fibre Channel drives, data rate (Gb/s) and disk speed (RPM):
2G10 – 2Gb/s FC 10K
2G15 – 2Gb/s FC 15K
2G72 – 2Gb/s FC 7.2K
2S10 – 2.5″ SAS 10K
4G10 – 4Gb/s FC 10K
4G15 – 4Gb/s FC 15K
AF04 – 4Gb/s FC SSD
AT05 – ATA/SATA 5.4K
AT07 – ATA/SATA 7.2K
FC04 – 4Gb/s FC
LP05 – Low Power FC 5.4K
SA07 – SATA 7.2K
S207 – SATA 7.2K
SS07 – SATA 7.2K
SS15 – SAS 15K
PS15 – VNX SAS 15K
VS07 – VNX SAS 7.2K
VS10 – VNX SAS 10K
VS15 – VNX SAS 15K
The last digits in an EMC part number indicate storage capacity.
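Putting the three segments together, a model number such as CX-SA07-010 can be decoded programmatically. This is an illustrative sketch only – the dictionaries hold just a few entries from the tables above, and the function name is ours:

```python
# Illustrative decoder for EMC model part numbers (e.g. "CX-SA07-010"),
# based on the prefix and drive-type tables above. Only a handful of type
# codes are included; extend the dicts from the full lists as needed.
PRODUCT_PREFIXES = {
    "CX": "CX series", "AX": "AX series",
    "VX": "VNX series", "V2": "VNX series", "VS": "VNX series",
    "V3": "VNX series", "V4": "VNX series",
}
DRIVE_TYPES = {
    "SA07": "SATA 7.2K",
    "AT07": "ATA/SATA 7.2K",
    "4G15": "4Gb/s FC 15K",
    "VS10": "VNX SAS 10K",
}

def decode_model(model):
    prefix, drive_type, capacity = model.split("-")
    return {
        "product": PRODUCT_PREFIXES.get(prefix, "unknown"),
        "drive": DRIVE_TYPES.get(drive_type, "unknown"),
        "capacity_code": capacity,  # the last digits indicate capacity
    }

print(decode_model("CX-SA07-010"))
# {'product': 'CX series', 'drive': 'SATA 7.2K', 'capacity_code': '010'}
```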
If you install a 2 Gb legacy disk in a disk-array enclosure (DAE) on a 4 Gb bus, you cannot use the disk in a RAID group or thin pool until you change the bus speed to 2 Gb. You can change the bus speed with the Backend Bus Speed Reset Wizard, which is available from the Service option on the Navisphere Manager Tools menu. The speed reset operation reboots the storage processors.
EMC Hard Drive DAE Compatibility
Here are some general rules for EMC hard drive compatibility within the same DAE:
You can mix 2 Gb/s and 4 Gb/s disks in a single DAE, but the maximum speed will be 2 Gb/s for buses connected to a DAE containing both models of disk.
CX-AT and CX-SA model disks cannot co-exist with other disk models in the same DAE.
FLARE (Fibre Logic Array Runtime Environment) is a customized Windows-based operating system that runs EMC CLARiiON and VNX storage systems. To function properly, disks in an EMC CLARiiON system require that each storage processor run a minimum revision of the FLARE Operating Environment.
The EMC Celerra products use a different version, called DART. EMC Symmetrix / DMX use the Enginuity Code.
FLARE Code version information is as follows:
(This explanation is limited to the CX, CX3, and CX4 platforms.)
Generation 1: CX200, CX400, CX600
Generation 2: CX300, CX500, CX700 including iSCSI models
Generation 3: CX3-10, CX3-20, CX3-40, CX3-80
Generation 4: CX4-120, CX4-240, CX4-480, CX4-960 (last three digits are the number of drives it can support)
The FLARE code consists of five dot-separated fields, as in the following examples:
1.14.600.5.022 (32 Bit)
2.16.700.5.031 (32 Bit)
2.24.700.5.031 (32 Bit)
3.26.020.5.011 (32 Bit)
4.28.480.5.010 (64 Bit)
First Digit – indicates the generation of machine this code level can be installed on. The 1st and 2nd generations of machines should both be able to use standard 2nd-generation code levels. CX3 code levels have a 3 in front, and so forth. These numbers will always increase as new generations of CLARiiON machines are added.
Second Set of Digits – the release number, which is very important and determines the features available in the CLARiiON FLARE Operating Environment. These numbers always increase, with 28 being the latest FLARE code version.
Third Set of Digits – the next 3 digits are the model number of the CLARiiON, e.g. CX600, CX700, CX3-20, or CX4-480. These digits vary with the model number.
Fourth Digit – the meaning of the 5 here is unknown; it has been carried over from previous FLARE releases. It’s believed to have been an internal code used at Data General indicating that this is a FLARE release.
Fifth Set of Digits – the last 3 digits are the patch level of the FLARE environment, i.e. the last known compilation of the code for that FLARE version.
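The five fields described above can be pulled apart with a simple split. A minimal sketch (the field names are our own labels, not official EMC terminology):

```python
# Split a FLARE code string into its five dot-separated fields, following
# the breakdown above.
def parse_flare_code(code):
    generation, release, model, constant, patch = code.split(".")
    return {
        "generation": int(generation),  # CLARiiON generation (e.g. CX4 -> 4)
        "release": int(release),        # FLARE release number
        "model": model,                 # array model digits (e.g. 480 for CX4-480)
        "constant": constant,           # the historical "5" from Data General days
        "patch": patch,                 # patch level of this FLARE version
    }

print(parse_flare_code("4.28.480.5.010"))
# {'generation': 4, 'release': 28, 'model': '480', 'constant': '5', 'patch': '010'}
```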
Using Navisphere to Determine FLARE Revision
In Navisphere manager the FLARE OE revision appears on the Software tab of the Storage System Properties dialog box for the system.
Log in to EMC Navisphere – https://ip of the array
Under Storage Management, right-click the storage controller, and select Properties.
Then, select the Software tab to view the FLARE-Operating-Environment revision number.
If you’re using the CLI, use navicli or naviseccli to enter the command “navicli ndu -list -isactive” and get a list of all active software on your array.
If this revision is lower than the minimum FLARE OE revision required for the disk, you must upgrade FLARE OE on the storage system before installing the disk. EMC recommends upgrading FLARE OE with the CLARiiON Software Assistant in the Navisphere Service Taskbar (NST) or by using the Navisphere Secure Command Line Interface (CLI).
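One way to script that minimum-revision check is to compare the dot-separated fields numerically. This is a hedged sketch only: it assumes both code strings come from the same array model, since the third field is a model code rather than part of any revision ordering.

```python
# Compare an installed FLARE OE revision against a required minimum by
# ordering the dot-separated fields numerically. Assumes both strings
# come from the same array model (the third field is a model code).
def revision_tuple(code):
    return tuple(int(part) for part in code.split("."))

def meets_minimum(installed, required):
    return revision_tuple(installed) >= revision_tuple(required)

print(meets_minimum("4.28.480.5.010", "4.28.480.5.005"))  # True
print(meets_minimum("3.26.020.5.011", "4.28.480.5.010"))  # False
```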
The following is a series of simple documents aimed at providing visual guidance on the process required to upgrade the FLARE code on an EMC CLARiiON CX4 array.
These documents assume that:
A version of FLARE 30 is already installed on the array (although the steps for FLARE 29 and below are not significantly different)
You have access to EMC’s Unisphere Service Manager (USM) on a Windows platform
You have downloaded copies of the FLARE code, Recovery Image, and Utility Partition Image
These documents are not intended as a replacement for EMC’s official procedure guides, and should only be used in conjunction with documentation available from Powerlink (http://powerlink.emc.com). You may also refer to http://www.emc.com/cx4support for customized documentation.
The typical warnings apply when upgrading firmware on an array, particularly if it holds valuable data. Always perform a backup first.
Process for Upgrading FLARE Code on an EMC CX4 Array
There are 4 basic steps to complete an upgrade:
Installing/upgrading Unisphere Service Manager (covered in part 1)
Preparing for Installation (covered in part 2)
Upgrading FLARE code (covered in part 3)
Upgrading the Recovery Image (covered in part 4) and Upgrading the Utility Partition (covered in part 4).
Part 1: Installing Unisphere Service Manager on an EMC CX4
In the past, there was a software installation wizard that could be launched directly from the CLARiiON, but now the requirement is to use EMC’s Unisphere Service Manager (USM).
This is a client-side application that can be used to perform basic maintenance tasks, including FLARE code upgrades.
Note: This application was known as Navisphere Service Taskbar (NST) prior to the release of FLARE Release 30.
The first step is to install/upgrade USM on the host that will be used to upgrade the array.
Figure 1.1 – Launch USM executable file
Figure 1.2 – Prepare to install
If an earlier version of USM is detected by the installer, it will give you the option to upgrade; it is recommended to click Yes.
Figure 1.3 – Upgrade option
Read through the introduction and click Next.
Figure 1.4 – Introduction
Read through the License Agreement, select “I accept the terms of the License Agreement,” and click Next to continue.
Figure 1.5 – License Agreement
The next step involves the selection of the repository location. This is where USM will store diagnostic data (SP Collects) and items such as FLARE code downloaded from EMC. Use the default options unless there is a particular requirement to store it in a different location.
Figure 1.6 – Choose Repository Location
Click Next to view the Pre-Installation Summary.
Figure 1.7 – Pre-Installation Summary
After verifying the information, click the Install button. The old version of USM will be uninstalled.
Figure 1.8 – Uninstalling older version
Figure 1.9 – Installing Java Runtime Environment
Figure 1.10 – Installing Merge Module
Figure 1.11 – Installing Uninstall Option
Figure 1.12 – Installation Complete
The Installation process will complete, giving you the option to Launch the USM. USM can also be launched from Start>All Programs>EMC>Unisphere>Unisphere Service Manager>Unisphere Service Manager.
Figure 1.13- Launching Unisphere Service Manager through the Start menu
USM will launch with a default Login screen. Click on the Login button.
Figure 1.14 – Default Login Screen
A dialog box will appear requesting details of the array to connect to. This can be a hostname or IP address. Once entered, click Connect to continue.
Figure 1.15 – Enter hostname or IP address
At this point you’ll then be asked for credentials. You’ll also have the option to select whether you wish to use LDAP authentication, and whether the account you’re using is Local or Global.
Figure 1.16 – Enter credentials
This example uses a global, generic account on a test lab array. Once you’ve entered the appropriate credentials, click Login and USM will login to the array.
Part 2: Preparing for Installation
Once you have successfully logged in to USM, you’ll be presented with the following screen:
Figure 2.1 – USM – System
Click on the Software tab to access System Software, Disk Firmware, and Downloads. Click on System Software.
Figure 2.2 – USM Software Tab
There are three choices under System Software: Prepare for Installation, Install Software, and Download and Install Hot Fix. Click on Prepare for Installation.
Figure 2.3 – USM – Software – System Software
A dialog box should pop up with an installation wizard.
Figure 2.4 – Prepare for Installation Wizard
As the software has already been downloaded locally, select “Verify storage environment only” and click Next.
Figure 2.5 – Verify storage environment only
Verify once again that you are logged in to the correct array. Click Next.
Figure 2.6 – Storage System Credentials
Select the software for Installation. Click Browse.
Figure 2.7 – Software Selection
Select the CX4-Bundle that has been downloaded and click Open.
Figure 2.8 – Select CX4 Bundle
The software is then unpacked and transferred to the array.
Figure 2.9 – Unpacking Software
Figure 2.10 – Transferring Software
Once the software has been transferred successfully, click Next.
Figure 2.11 – File has been transferred successfully
IMPORTANT: A manual check is required for a number of conditions (some of which may be obscure); it is very important to read this step carefully.
Click Next to continue.
Figure 2.12 – Manual Check for Other Conditions
USM then checks that the hosts attached to the array are capable of failover.
Figure 2.13 – Server Readiness for Software Update
Sometimes, even though the servers are fine, USM won’t be able to accurately gauge their readiness. Fortunately, there is an Override option (with the standard warning) available to select.
Figure 2.14 – Override HA Status for all servers warning
Click Next to proceed.
Figure 2.15 – Override HA Status for all servers
Next is a diagnostic information step. It is generally recommended to let USM collect the information again. Select “Collect the diagnostic information again” and click Next. This will take a few minutes to complete.
Figure 2.16 – Collect Diagnostic Information
Figure 2.17 – Diagnostic Information Collection
Once this step is complete, click Next.
Figure 2.18 – Diagnostic Information Successfully Saved
There are a number of rules that are then checked. If any of the checks fail, there will be an option to fix it. This may take several minutes.
NOTE: If it can’t be fixed by USM, speak with your service provider about a fix before the code upgrade is completed.
Figure 2.19 – Rule Checks
Once completed, if there are only warnings, you can proceed with the installation, assuming the warnings are acceptable. You may click each warning to see more information about it. Click Next to proceed.
Figure 2.20 – Rule Check Complete
Figure 2.21 – Rule Check Warning Information
At this point it is necessary to select the NDU delay. It’s suggested you leave this at the default unless there is a good reason to change it. Click Next to continue.
Figure 2.22 – Non-Disruptive Upgrade Delay
Click Finish to complete.
Figure 2.23 – Finished
Part 3: Installing FLARE Code on an EMC CX-4 Array
Once the installation preparation process is completed, click on Install Software (Step 2).
Figure 3.1 – Install Software
As the Prepare for Installation step has already been completed, there is an option to perform an Express Install.
Figure 3.2 – Welcome to the Install Software Wizard
In this example, the Custom Install will be demonstrated to show all the steps that go into upgrading the code. The Express Install will be demonstrated further on. Click Next to proceed with the Custom Install.
Figure 3.3 – Custom Install
During the previous phase, the software was pre-staged for deployment. Confirm that it is the correct version and click on Next. At this point you could also choose to change the software you’re installing.
Figure 3.4 – Pre-Staged Packages
Once again, you’ll need to confirm the HA status of the servers attached to the array.
Figure 3.5 – Server readiness for software update
IMPORTANT: Availability issues with these servers must be addressed before continuing.
Figure 3.6 – Override HA status for all servers warning
Once any availability issues are addressed, select to override the warning and click Next to continue with the installation.
Figure 3.7 – Override HA status for all servers
USM will check the repository for diagnostic information.
Figure 3.8 – Diagnostic Information Step – Gathering information
If little time has passed between the pre-installation step and the software upgrade, it should be safe to use the existing diagnostic information.
Figure 3.9 – Diagnostic Information Step
Select “Use the existing diagnostic information” and click Next to continue.
Figure 3.10 – Diagnostic Information Step – Use existing information
USM then performs a series of Rules Checks to ensure that everything is in order for the installation to proceed.
Figure 3.11 – Rule Checks
Note: In this example, a scheduled activity will be interrupted by the NDU. While this isn’t a problem for the lab in the example, one should be mindful of such interruptions in a production scenario. If the results of the Rules Checks are satisfactory and there are no Errors, click Next to proceed to the next step.
Figure 3.12 – Rule Checks Warnings
USM then checks that processor utilization on the array is acceptable. This matters because, for a period of time while the storage processors reboot, all of the array’s workload will be hosted by one SP. If CPU utilization is already high, it might be wise to reschedule. This check can be overridden, but doing so is not recommended on production arrays.
Figure 3.13 – Processor Utilization Check
In this example, the lab array is essentially idle, so click Next to continue.
Figure 3.14 – Acceptable Processor Utilization
Set the NDU delay here.
Note: The default is 360 seconds, and it is strongly recommended that this setting is not changed unless there is good reason to.
Figure 3.15 – Non-Disruptive Upgrade Delay
On the next screen you’ll be notified that the ESRS IP Client (also known as CLARalert) will be disabled until the upgrade is complete to prevent false positive alerts to EMC’s triage staff when SPs reboot.
Figure 3.16 – ESRS IP Client Notification
USM is ready to go, so click Next to continue.
Figure 3.17 – Confirmation
This step will take a while to complete.
Figure 3.18 – Software Maintenance Status
Click on Show Steps to see the steps.
Figure 3.19 – Software Maintenance Status – Show Steps
Figure 3.20 – Software Maintenance Status – Further Progress
At this point it’s rebooting the secondary SP (SP B). The upgrade always reboots SP B first, in case there’s a problem.
Figure 3.21 – Software Maintenance Status – Further Progress
The following shows completion with all green check marks.
Figure 3.22 – Software Maintenance Status – Complete
At this point USM offers to commit the FLARE code on the array. While the array is running new code at this stage, if the software is not committed, it is possible to roll the software back.
Note: While the array can still service I/O while FLARE is not committed, you can’t do potentially useful things such as bind LUNs, RAID Groups, and Storage Pools. Unless there’s an obvious problem, it’s recommended that you commit the package. Before you commit the FLARE code, you should check whether the LCC firmware has been successfully upgraded. On a small array with a few DAEs, this won’t take too long. On a larger array (a CX4-960 with 64 DAEs, for example) it might take a little longer.
If there are no storage configuration tasks on the horizon, leave it a day or two and make sure there are no obvious problems. If, however, the upgrade to Release 30.524 was done to support 3TB drives on the array, and those drives need to go into service the moment the code is committed, you might not be able to wait that long.
Figure 3.23 – Post-install Tasks
In USM, go to the Diagnostics section. Under the Tools section on the right-hand side of USM, select the “LCC and Power Supply Firmware Update Status” option.
Figure 3.24 – USM – Diagnostics – Diagnostics
This screen provides information on the status of LCC Firmware (FRUMON) updates that are kicked off by installing new versions of FLARE. Not every version of FLARE has new LCC firmware, but it’s always a good idea to check.
Figure 3.25 – LCC Firmware (FRUMON) Status
Click on “Show details” to see LCC revisions. Depending on the number and type of DAEs, the time it takes to complete this operation varies greatly and can be considerable.
Figure 3.26 – LCC Firmware (FRUMON) Status
Once the LCC firmware update is complete and everything is working as expected, you can commit the FLARE code. To do this, click the Run button in the Post-install Tasks screen. A warning about write cache will appear; this should not be problematic if the commit is done during a “quiet” period on the array. This process will take a while to complete.
Figure 3.27 – Commit Packages – Confirm
Figure 3.28 – Commit Packages – Progress
Once it’s complete, there will be a green tick in the Details column, and you may click Next to continue.
Figure 3.29 – Commit Packages – Commit successful
The Finish screen provides information on the completed activities and confirms completion. If your array is registered for support with EMC or a third-party support provider, you can automatically notify them of the upgrade at this point.
Figure 3.30 – Finish
Part 4: Installing the Recovery Image and Utility Partition on an EMC CX4
Installing a new Recovery Image and Utility Partition is much easier and less time-consuming than upgrading FLARE.
Installing the Recovery Image on an EMC CX4
To start, click on Install Software. Click Next to continue.
Note: As you have done the Prepare for Installation steps, you won’t have the option to do an Express Install.
Figure 4.1 – Custom Install
This is basically the same process that you followed for installing the CX4 Bundle; the only difference is selecting different packages. Click Browse to locate the Recovery Image software.
Figure 4.2 – Software Selection
Figure 4.3 – Select Recovery Image
Once the file has been transferred, click on Next to continue.
Figure 4.4 – File has been transferred
The installation of the Recovery Image and the Utility Partition doesn’t involve any reboots of the SPs, so you’ll find there aren’t quite as many steps to go through. In fact, just verify the correct version of the Image and click Next to continue.
Figure 4.5 – Express Install Information Verification
USM loads up the Recovery Image in 12 steps.
Figure 4.6 – Express Install Progress
This might take a while.
Figure 4.7 – Express Install Progress
Once done, click Next to finish.
Figure 4.8 – Complete
At the Finish screen you’ll get confirmation from USM that the software was installed successfully.
Figure 4.9 – Finish
Installing the Utility Partition
Installing the Utility Partition is roughly the same process as installing the Recovery Image. Select the appropriate version of the software to install.
Figure 4.10 – Select Utility Partition
Verify the correct software and click Next.
Figure 4.11 – Express Install Information Verification
This step shouldn’t take quite as long.
Figure 4.12 – Complete
Once it’s done, you’ll get confirmation from USM. Click on Next to finish and complete the process.
Figure 4.13 – Finish
At this point, if you’re running Unisphere agents on your hosts, you might look at updating them.
How to fix the Bad Battery error message with EqualLogic systems
Dell / EqualLogic never created field replaceable units (FRUs) for the controller cache batteries used in different arrays, so there is no easy replacement. The common solution is to replace the entire controller with one that has a battery that hasn’t failed yet.
Figure 1. EqualLogic PS4100, PS6100 Battery Status Failed
Unfortunately, purchasing a used replacement EqualLogic controller doesn’t always buy you much time, since you’re replacing a failed unit with another aging unit. The majority of EqualLogic systems needing controller replacements are old and out of service, so new controllers haven’t been available for quite some time. The further down the road we get, the more likely it is that a controller swapped in because of a bad battery message will itself fail within 6 months of replacement.
Dell EqualLogic PS4100, PS6100 Series Controller Battery Replacement
EqualLogic Controller Battery Logic 101
You wouldn’t replace your smoke alarm battery with a 9-volt from an old smoke alarm sitting around in a pile of defunct smoke alarms, so why would you replace a failed controller battery with another old dying one?
If it’s just a battery, can’t I replace it myself? Why do I need to buy an entire controller?
The answer here is fairly simple; take one of these controllers apart and find the battery. In the case of the EqualLogic PS4100 and PS6100 series arrays, they are not “batteries” in the normal sense of the word, and again, we are the only ones refurbishing them.
Extended Warranty for EqualLogic Controllers
We provide a standard 90-day warranty, at a minimum, on everything we sell, but for a few items we sell that have batteries in them, we require that the bad units being replaced be shipped back to us. We provide a pre-paid shipping label, so there’s no cost to you. To sweeten the incentive for taking the simple step of putting the failed unit in the box your replacement arrived in and applying our label, we will upgrade your 90-day warranty to a full one-year warranty upon receiving the failed unit back at our warehouse in Alberton, RSA.
The procedures may vary when a control module (member) is removed or fails depending on the number and type of control modules, the network cabling configuration, and the cache mode settings. Please see the Hardware Maintenance manual for your array model for detailed information about array-specific control module and network failover behavior. Use this guide at your own risk. SPS Pros is not responsible for faults, failures, data loss, etc. as a result of following this guide.
In a dual control module array, if the secondary control module is removed or fails, the remaining control module may enter write-through mode depending on the cache mode policies.
In a dual control module array, if the active control module fails, the secondary control module automatically takes over and becomes active. If there is a network connection to the newly active control module, the failover will be transparent to applications; however, iSCSI initiators must reconnect to the group IP address.
If the only functioning control module fails, the array will be inaccessible from the network and data loss is possible.
If a control module fails, replace it as soon as possible with the same control module type.
Caution: Do not mix control module types in an array.
For information about replacing a control module, see the Hardware Maintenance manual for your array model or contact your array support provider.
Do not remove a failed control module until you have a replacement.
NOTE: For proper cooling, do not leave a control module slot empty. If an array will operate for a long time with only one control module, you must install a blank control module in the empty slot. You can order a blank control module from your PS Series array service provider. If you remove the active control module, there will be a short interruption as failover to the secondary control module occurs.
How to Tell if Dell EqualLogic Controller has Failed
You can identify a failed control module by:
LEDs – a failed control module may show the ACT LED unlit, the ERR LED red, and the PWR LED off
Messages – A message on the LCD panel (located behind the bezel), on the console, in the event log, or in the Group Manager GUI Alarms panel describes a control module failure
Group Manager GUI and CLI Output – The Member Controllers window in the GUI, or the output of the “member select show controllers” CLI command, shows the control module status as “not installed”.
If a controller/control module has failed, you must replace it.
How to Replace a Dell EqualLogic Controller
Follow safety and/or ESD protocols.
Make sure the faulty controller is the SECONDARY controller. If not, you must fail over the array (see below).
With the faulted controller in the secondary position, disconnect all of the cables (noting their locations).
Remove the controller by operating its latches. The array should continue to function on the active controller.
Remove the flash card from the faulty controller. (This may be a CompactFlash card or microSD card.)
Insert the flash card into the replacement controller.
Correctly orient and insert the replacement controller and ensure it is properly seated.
Reconnect cables to the replacement controller.
Check LEDs and GUI to ensure that the replacement controller has come online.
NOTE: After replacing the controller, if the array reports a critical hardware error with no hardware issues showing, remove the new controller following the steps above, wait 60 seconds, and reseat it. This should clear the error.
If two control modules are installed but only one appears in the GUI or CLI, the control module may not be properly installed. Re-install the control module. If both control modules still do not appear in the GUI or CLI, they may not be running the same firmware. Contact your array support provider.
How to Fail Over Dell EqualLogic Controller Array
To fail over from one control module to the other, you must restart the active control module. This forces the secondary controller to take over, and the previously active controller becomes the new secondary.
In the GUI (Group Manager) click on “Members” and select the array in question.
On the tabs on the right side of the GUI, click on “Maintenance”.
Locate and click on the “Restart” button.
You will be prompted for the grpadmin password.
NOTE: You may see alerts such as "Unable to communicate with the other control module; active failover cannot occur." These should resolve a minute or two after the restart completes.
IT teams have been steadily moving their workloads to the network’s edge to process data closer to its source. Many of these workloads run in virtual environments, but some IT professionals question whether it makes sense to virtualize an edge computing server at all.
The exact meaning of edge computing and how it’s implemented outside of the data center is still up for debate. Some think of edge computing in terms of intelligent devices. Others envision intermediary gateways that process client data. Still others imagine micro data centers that service the needs of remote workers or satellite offices.
Despite the disparity in perspectives, however, they all share a common characteristic: Client-generated data is processed at the network’s periphery as close to its source as possible.
Edge computing is much different than using a centralized data center. For example, administrators usually manage edge computing servers remotely, often using tools with intermittent network access. In addition, edge sites typically have space and power restrictions that make it difficult to add capacity to an existing system or to significantly modify the architecture. In some cases, an edge site might require specialized hardware or need to connect to other edge sites.
A number of factors push organizations to the edge, particularly mobile computing and IoT, which generate massive amounts of data. The mega data center can no longer meet the demands of these technologies, which results in an increase in data latencies and network bottlenecks.
At the same time, emerging technologies are making edge computing more practical and even more cost-effective than traditional approaches, as it addresses the limitations of the centralized model.
The edge computing server and virtualization
To keep edge processing as efficient as possible, some teams run containers or serverless architectures on bare metal to avoid the overhead that comes with hypervisors and VMs.
In some cases, this might be a good approach, but even in an edge environment, virtualization has benefits — flexibility, security, maintenance and resource utilization, to name a few. Virtualization will likely remain an important component in many edge scenarios, at least for intermediary gateways or micro data centers. Even if applications run in containers, they can still be hosted in VMs.
Researchers view the VM as an essential component of edge computing and believe that admins can use VM overlays to enable more rapid provisioning and to move workloads between servers. But researchers are not the only ones focused on bringing virtualization to the edge.
For example, Wind River’s open source project StarlingX makes components of its Titanium Cloud portfolio available through the OpenStack Foundation. One of the goals of the project is to address common requirements for virtualizing an edge computing server. The code already includes a virtual infrastructure manager (VIM), as well as VIM helper components.
VMware is also committed to edge computing and is working on ways to virtualize compute resources throughout the entire data flow, including edge environments. For example, VMware offers hyper-converged infrastructure software powered by vSAN. Shops can use the software to support edge scenarios through VMware vSphere and the VMware Pulse IoT Center. This provides a system that includes secure, enterprise-grade IoT device management and monitoring.
Other vendors are also moving toward the edge, and virtualization is playing a key role. Although edge computing doesn’t necessarily imply virtualization, it by no means rules it out and, in fact, often embraces it.
Administration on the edge
Along with edge computing come a number of challenges for admins trying to manage virtual environments. The lack of industry standards governing edge computing only adds to the complexities.
As computing resources move out of the data center and into the network’s periphery, asset and application management are becoming increasingly difficult, especially because much of it is carried out remotely. Admins must come up with ways to deploy these systems, perform ongoing maintenance and monitor the infrastructures and applications for performance issues and trouble spots, and address such issues as fault tolerance and disaster recovery.
If an IT team manages only one edge environment, they should be able to maintain it without too much difficulty. But if the team must manage multiple edge environments and each one serves different functions and is configured differently, the difficulties grow exponentially. For example, some systems might run VMs, some might run containers and some might do both. The systems might also operate on different hardware, use different APIs and protocols, and execute different applications and services.
Admins must be able to coordinate all these environments, yet allow them to operate independently. Edge computing is an industry in its infancy, and network-wide management capabilities have yet to catch up.
But management isn’t the only challenge. An edge computing server often has resource constraints, which can make it difficult to change the physical structure or accommodate fluctuating workloads. These challenges go beyond such capabilities as VM migration.
In addition, admins might have to contend with interoperability issues between the source devices and edge systems, as well as between multiple edge systems. This is made all the more difficult by the different configurations and lack of industry standards.
One of the biggest challenges that admins face is ensuring that all sensitive data is secure and privacy is protected. Edge computing’s distributed nature increases the number of attack vectors, which makes the entire network more vulnerable to attack, and the different configurations increase the risks.
For example, one system might run containers in VMs and the other on bare metal, resulting in disparity in the methods IT uses to control security. The distributed nature can also make it more difficult to address compliance and regulatory issues. And, given the monitoring challenges that come with edge computing, the risks of undetected intrusions are even greater.
Containers enable organizations to expand beyond the standard server in ways that traditional technologies cannot.
With containers, you can bundle a piece of software within a complete file system that contains everything it needs to run: code, runtime, system tools, system libraries and so on. When you deploy an application or service this way, it will always run the same, regardless of its environment.
If you want to containerize a service or an app, you’ll need to get up to speed with Docker, one of the most popular container tools. Here are some guidelines to install Docker on Ubuntu 16.04 servers and fulfill Docker’s potential.
What to know before you install Docker on Ubuntu 16.04
Before you install Docker on Ubuntu 16.04, update the apt utility — a package manager that includes the apt command — and upgrade the server. If apt upgrades the kernel, you may need to reboot; if so, do it when the server can be down for a brief period. Note that Docker can only be installed on 64-bit architectures with a minimum kernel of 3.10.
To update and upgrade, enter the following commands:
sudo apt-get update
sudo apt-get upgrade
Once the update/upgrade is complete, you can install Docker with a single command:
sudo apt-get install -y docker.io
When the install completes, start the Docker engine with the command:
sudo systemctl start docker
Finally, enable Docker to run at boot with the command:
sudo systemctl enable docker
Running Docker as a standard user
Out of the box, you can only use Docker as the root user or by way of sudo. Because both of those options can be considered a security risk, it’s best to enable a standard user. To do that, you must add the user to the docker group. Let’s say we’re going to add the user olivia to the docker group so that she can work with the tool. To do this, issue the following command:
sudo gpasswd -a olivia docker
Restart the Docker service with the command:
sudo systemctl restart docker
Once Olivia logs out and logs back in again, she can use Docker.
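To confirm the group change took effect, the user can list their groups after logging back in; this generic sketch assumes a POSIX shell and the standard id utility:

```shell
# Print the current user's group names; "docker" should be among them
id -nG

# A scripted check: succeeds either way, but reports whether the group is active
if id -nG | tr ' ' '\n' | grep -qx docker; then
  echo "docker group membership active"
else
  echo "log out and back in, then re-check"
fi
```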
Before we get into the commands to work with Docker, you’ll need to understand some of its terminology.
Image: a frozen snapshot of live containers. Images are generally pulled from the Docker Hub, but you can create your own images. Images are read-only.
Container: an active, stateful instance of an image that is read-write.
Registry: a repository for Docker Images.
In a nutshell, you pull images from a registry and run containers from those images.
Let’s say you want to run a Debian Linux container so you can test or develop a piece of software. To pull down the Debian image, you should search the registry first. Issue the command docker search debian. The results of that search (Figure A) are important.
The first two listings are marked as “official.” To be safe, always pull official images. Pull down that debian image with the command:
docker pull debian
When the image pull is complete, Docker will report that the image debian:latest has been downloaded. To make sure it’s there, use the command:
docker images
You are now ready to create the debian container with the command:
docker run -i -t debian /bin/bash
The above command will run the debian container. It keeps STDIN (standard input) open with the -i option, allocates a pseudo-tty with the -t option, and places you in a Bash prompt so you can work. When you see the Bash prompt change, you’ll know that the command succeeded.
You can work within your container and then exit the container with the command exit.
After you install Docker on Ubuntu 16.04, let’s say you want to develop an image to be used later. When you exit a running container, you will lose all of your changes. If that happens, you cannot commit your changes to a new image. To commit those changes, you first need to run your container in the background (detached), by adding the -d option:
docker run -dit debian
When you run the container like this, you can’t make any changes because you won’t be within the container. To gain access to the container’s shell, issue the command:
docker exec -i -t HASH /bin/bash
HASH is created after running the image in the background — it will be a long string of characters.
Now, you should be inside the running container. Make your changes and then exit the running container with the command exit. If you re-enter the container, your changes should still be present.
To commit your changes to a new image, issue the command:
docker commit HASH NAME
HASH is the hash for our running container and NAME is the name you’ll give the new image.
If you now issue the command docker images, your newly created image will be listed alongside the image you pulled from the Docker Hub registry.
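Putting the steps above together, the whole commit workflow might look like the sketch below; the container hash and the image name my-debian-dev are placeholders, and a running Docker daemon is assumed:

```
docker run -dit debian              # start a detached container; prints its ID
docker exec -i -t <HASH> /bin/bash  # open a shell inside the running container
# ...make changes inside the container, then type exit...
docker commit <HASH> my-debian-dev  # save the container's state as a new image
docker images                       # the new image now appears in the list
```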
Linux administrators cannot live by the graphical user interface alone. That’s why we’ve compiled useful Linux commands into this convenient guide.
By learning how to use a few simple tools, command-line cowards can become scripting commandos and get the most out of Linux by executing kernel and shell commands.
alias The alias command is a way to run a command or a series of Unix commands using a shorter name than the one usually associated with them.
apt-get The apt-get tool automatically updates a Debian machine and installs Debian packages/programs.
AWK, Gawk AWK is a programming language tool used to manipulate text. The AWK utility resembles the shell programming language in many areas, but AWK’s syntax is very much its own. Gawk is the GNU Project’s version of the AWK programming language.
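For instance, AWK makes short work of column-based arithmetic. This sketch sums the second field of whitespace-separated input (the names and numbers are made-up sample data):

```shell
# Sum the second column; END runs after all input lines are processed
printf 'alice 30\nbob 25\n' | awk '{ total += $2 } END { print total }'
# prints 55
```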
bzip2 A portable, fast, open source program that compresses and decompresses files at a high rate, but that does not archive them.
cat A Unix/Linux command that can read, modify or concatenate text files. The cat command also displays file contents.
cd The cd command changes the current directory in Linux and can conveniently toggle between directories. The Linux cd command is similar to the CD and CHDIR commands in MS-DOS.
chmod The chmod command changes the permissions of one or more files. Only the file owner or a privileged user can change the access mode.
chown The chown prompt changes file or group ownership. It gives admins the option to change ownership of all the objects within a directory tree, as well as the ability to view information on the objects processed.
cmp The cmp utility compares two files of any type and writes the results to the standard output. By default, cmp is silent if the files are the same. If they differ, cmp reports the byte and line number where the first difference occurred.
comm Admins use comm to compare lines common to file1 and file2. The output is in three columns; from left to right: lines unique to file1, lines unique to file2 and lines common in both files.
cp The cp command copies files and directories. Copies can be made simultaneously to another directory even if the copy is under a different name.
cpio The cpio command copies files into or out of a cpio or tar archive. A tar archive is a file that contains other files, plus information about them, such as their file name, owner, timestamps and access permissions. The archive can be another file on the disk, a magnetic tape or a pipe. It also has three operating modes: copy-out, copy-in and copy-pass. It is also a more efficient alternative to tar.
CRON CRON is a Linux system process that executes a program at a preset time. To use a CRON script, admins must prepare a text file that describes the program and when they want CRON to execute it. Then, the crontab program loads the text file and executes the program at the specified time.
cURL Admins use cURL to transfer a URL. It is useful for determining if an application can reach another service and how healthy the service is.
declare The declare command states variables, gives them attributes or modifies the properties of variables.
df This command displays the amount of disk space available on the file system containing each file name argument. With no file name, the df command shows the available space on all the currently mounted file systems.
echo Use echo to repeat a string variable to standard output.
enable The enable command stops or starts printers and classes.
env The env command runs a program in a modified environment or displays the current environment and its variables.
eval The eval command concatenates its arguments into a single command, executes it, and returns that command’s exit status.
exec This function replaces the parent process with any subsequently typed command. The exec command treats its arguments as the specification of one or more subprocesses to execute.
exit The exit command terminates a script and returns a value to the parent script.
expect The expect command talks to other interactive programs via a script and waits for a response, often from any string that matches a given pattern.
export The export command marks a shell variable or function for export to the environment of subsequently executed commands, making it available to child processes.
find The find command searches the directory tree to locate particular groups of files that meet specified conditions, including -name, -type, -exec, -size, -mtime and -user.
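As a quick illustration of find, this sketch builds a small throwaway directory tree (the /tmp/find-demo path is arbitrary) and then locates only the .log files in it:

```shell
# Create a sample tree with two .log files and one .txt file
mkdir -p /tmp/find-demo/sub
touch /tmp/find-demo/a.log /tmp/find-demo/sub/b.log /tmp/find-demo/notes.txt

# Match regular files (-type f) whose names end in .log (-name)
find /tmp/find-demo -type f -name '*.log' | sort
```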
for, while The for and while commands execute or loop items repeatedly as long as certain conditions are met.
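A minimal for loop looks like this; the item names are arbitrary sample values:

```shell
# Iterate over a fixed list, running the body once per item
for name in alpha beta gamma; do
  echo "item: $name"
done
```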
free With the free command, admins can see the total amount of free and used physical memory and swap space in the system, as well as the buffers and cache used by the kernel.
gawk See AWK.
grep The grep command searches files for a given character string or pattern and can replace the string with another. This is one method of searching for files within Linux.
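For example, grep can count matching lines with -c, ignoring case with -i; the log lines here are made-up sample input:

```shell
# Count the lines that contain "error", case-insensitively
printf 'ok\nERROR: disk full\nerror: retry\n' | grep -ci 'error'
# prints 2
```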
gzip This is the GNU Project’s open source program for file compression that compresses webpages on the server end for decompression in the browser. This is popular for streaming media compression and can simultaneously concatenate and compress several streams.
history The history function shows all the commands used since the start of the current session.
ifconfig The ifconfig command configures kernel-resident network interfaces at boot time. It is usually only needed when debugging or during system tuning.
ifup With ifup, admins can configure a network interface and enable a network connection.
ifdown The ifdown command shuts down a network interface and disables a network connection.
iptables The iptables command allows or blocks traffic on a Linux host and can prevent certain applications from receiving or transmitting a request.
kill With kill signals, admins can send a specific signal to a process. It is most often used to safely shut down processes or applications.
less The less command lets an admin scroll through configuration and error log files, displaying text files one screen at a time with backward or forward navigation available.
locate The locate command reads one or more databases and writes file names to match certain output patterns.
lft The lft (layer four traceroute) command determines connection routes and provides information to debug connections or locate a system. It displays the route that packets take to a destination.
ln The ln command creates a new name for a file using hard linking, which allows multiple users to share one file.
ls The ls command lists files and directories within the current working directory, which allows admins to see when configuration files were last edited.
lsof Admins use lsof to list all the open files. They can add -u to find the number of open files by username.
lsmod The lsmod command displays a module’s status within the kernel, which helps troubleshoot server function issues.
man The man command allows admins to format and display the user manual that’s built into Linux distributions, which documents commands and other system aspects.
more Similar to less, more pages through text one screen at a time, but has limitations on file navigation.
mount This command mounts file systems on servers. It also lists the current file systems and their mount locations, which is useful to locate a defunct drive or install a new one.
mkdir Linux mkdir generates a new directory with a name path.
neat A GNOME GUI tool that allows admins to specify the information needed to set up a network card.
netconfig/netcfg Admins can use netconfig to configure a network, enable network products and display a series of screens that ask for configuration information.
netstat This command provides information and statistics about protocols in use and current TCP/IP network connections. It is a helpful forensic tool for figuring out which processes and programs are active on a computer and are involved in network communications.
nslookup A user can enter a host name and find the corresponding IP address with nslookup. It can also help find the host name.
od The od command dumps binary files in octal — or hex/binary — format to standard output.
passwd Admins use passwd to update a user’s current password.
ping The ping command verifies that a particular IP address exists and can accept requests. It can test connectivity and determine response time, as well as ensure an operating user’s host computer is working.
ps Admins use ps to report the statuses of current processes in a system.
read The read command interprets lines of text from standard input and assigns values of each field in the input line to shell variables for further processing.
rsync This command syncs data from one disk or file to another across a network connection. It is similar to rcp, but has more options.
screen The GNU screen utility is a terminal multiplexor where a user can use a single terminal window to run multiple terminal applications or windows.
sdiff Admins use sdiff to compare two files and produce a side-by-side listing indicating lines that are dissimilar. The command then merges the files and outputs the results to the outfile.
sed The sed utility is a stream editor that filters text in a pipeline, distinguishing it from other editors. It takes text input, performs operations on it and outputs the modified text. This command is typically used to extract part of a file using pattern matching or to substitute multiple occurrences of a string within a file.
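A typical sed substitution replaces the first match on each line; the text here is arbitrary sample input:

```shell
# s/PATTERN/REPLACEMENT/ substitutes the first occurrence per line
echo 'hello world' | sed 's/world/linux/'
# prints: hello linux
```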
service This command is the quickest way to start or stop a service, such as networking.
shutdown The shutdown command turns off the computer and can be combined with variables such as -h for halt after shutdown or -r for reboot after shutdown.
slocate Like locate, slocate, or secure locate, provides a way to index and quickly search for files, but it can also securely store file permissions and ownership to hide information from unauthorized users.
sort This command sorts lines of text alphabetically or numerically according to the fields. Users can input multiple sort keys.
sudo The sudo command lets a system admin give certain users the ability to run some — or all — commands at the root level and logs all the commands and arguments.
SSH SSH is a command interface for secure remote computer access and is used by network admins to remotely control servers.
tar The tar command lets users create archives from a number of specified files or to extract files from a specific archive.
tail The tail command displays the last few lines of a file. This is particularly helpful for troubleshooting code because admins don’t often need all the possible logs to determine code errors.
top The top command displays the tasks on the system that take up the most resources. It can sort tasks by CPU usage, memory usage and runtime.
touch Admins can create a blank file within Linux with the touch command.
tr This command translates or deletes characters from a text stream. It writes to a standard output, but it does not accept file names as arguments — it only accepts input from standard input.
traceroute The traceroute function determines and records a route through the internet between two computers and is useful for troubleshooting network/router issues. If the domain does not work or is not available, admins can use traceroute to track the IP.
uniq With uniq, admins can compare adjacent lines in a file and remove or identify any duplicate lines.
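Because uniq only collapses adjacent duplicates, it is usually paired with sort; the letters below are arbitrary sample input:

```shell
# Sort first so duplicates become adjacent, then collapse them
printf 'b\na\nb\na\n' | sort | uniq
# prints: a, then b
```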
vi The vi environment is a text editor that allows a user to control the system with just the keyboard instead of both mouse selections and keystrokes.
vmstat The vmstat command snapshots everything in a system and reports information on such items as processes, memory, paging and CPU activity. This is a good method for admins to use to determine where issues/slowdown may occur in a system.
wget This is a network utility that retrieves web files that support HTTP, HTTPS and FTP protocols. The wget command works non-interactively in the background when a user is logged off. It can create local versions of remote websites and recreate original site directories.
while See for.
whoami The whoami command prints or writes the user login associated with the current user ID to the standard output.
xargs Admins use xargs to read, build and execute arguments from standard input. Each input is separated by blanks.
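As a sketch of xargs in action, this example (with arbitrary /tmp file names) turns a list of paths on standard input into arguments for rm:

```shell
# Create two throwaway files, then delete them via xargs
touch /tmp/x1.tmp /tmp/x2.tmp
printf '/tmp/x1.tmp\n/tmp/x2.tmp\n' | xargs rm -f
```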