Well into the cloud era, a significant number of enterprises still have trepidations about moving mission-critical applications and services to the public cloud, preferring to forgo cloud migration risks by keeping apps ensconced within their own data centers.
Heading the list of reservations for corporate IT shops is the lack of visibility, transparency and accountability of public cloud services, according to respondents to Uptime Institute's 2019 Annual Global Data Center Survey.
Some 52% of the nearly 1,100 respondents, a group that included IT managers, data center owners and operators, suppliers, designers and consultants, said they do not place their mission-critical workloads in public clouds and have no plans to. Another 14% said they have placed such workloads in the public cloud and are quite happy with their respective cloud services.
Of the remaining 34%, 12% have placed their services in the public cloud but complain about the lack of visibility. The remaining 22% said they will keep their most important workloads on premises but will consider moving to the cloud if they have adequate visibility.
Cloud migration risks tip the balance
Chris Brown, Uptime Institute’s chief technology officer, said he was a bit surprised that 52% of respondents were reluctant to venture into the public cloud, but a closer look at some of the reasons for that reluctance brought a better understanding.
“Among that 52% (of respondents), there are workloads that just aren’t tailored or good fits for the cloud,” Brown said. “Also, there are a fair number of older applications that have technical issues with adapting to the cloud and there is a lot of rearchitecting associated with it, or they don’t have the budget for it,” he said.
Of the 34% who have gone to the public cloud or are considering it, it comes down to a matter of trust, according to Brown. For the most part, respondents in this group realize the benefits cloud can bring, but they have difficulty summoning up enough faith that service providers will live up to the uptime promised in their service-level agreements (SLAs).
These concerns over cloud migration risks appear justified. The number of data center outages this year matched last year's number for the same period, although this year more managers reported that outages rippled across multiple data centers. Just over a third of respondents reported that outages, which typically were traced to an infrastructure problem, had a measurable impact on their business. About 10% said their most recent outages resulted in over $1 million in direct and indirect costs.
Brown added that part of the problem is many users don’t understand enough about how the cloud is structured or how their cloud availability zones are designed.
“If users see the cloud as just a black box in the sky, they can only trust their provider to give them what they need when they need it,” Brown said. “And if they have outages, they have to hope their SLAs will make them whole.”
While there is plenty of data available showing how reliable most cloud service providers are, users read about highly publicized outages that have occurred over the past few years from providers such as AWS, Google and Microsoft. Compounding that issue is the basic conservative nature of data center managers.
“From my experience, the data center industry always ventures into something very gingerly,” Brown said.
Yet another factor that holds some users back is the fear of cloud lock-in and the associated expense when they want to switch service providers.
“Everyone deals with a lot of data because storage is so cheap and every IT strategy seems to be based around data,” Brown said. “But when it comes time to pull your data out of the cloud, it can cost you a fortune.”
Cloud vendors meet hesitant users halfway
Some analysts and consultants aren’t surprised at the number of corporate users still skittish about cloud migration risks. One analyst points to “cloud-down” moves from the likes of AWS, Microsoft and Google over the past year or two that offer users the option to run their applications either in the cloud or on premises.
“AWS announced Outposts last year because they want to get more into larger enterprises,” said Judith Hurwitz, president of Hurwitz and Associates, an analyst firm in Needham, Mass. “These accounts say to AWS, ‘We like your offerings, but we really want to keep them behind the firewall.’ This is how products like Outposts, [Google’s] Anthos and [Microsoft’s] Azure Stack came to be,” she said.
While some other analysts understand the reluctance of many data centers to move to the cloud, they also believe it makes sense for them to be bolder and take advantage of the benefits the cloud offers now rather than wait.
“There are some workloads that shouldn’t go to the cloud,” said Dana Gardner, principal analyst with Interarbor Solutions LLC in Gilford, N.H. “But to have these legacy platforms and the associated RDBs (relational databases) sitting around collecting dust just to support a handful of aging apps doesn’t seem to work.”
Capacity demand in the enterprise continues to grow, according to the survey, along with cloud and colocation data centers, with workloads running across a range of platforms. And while on-premises data center capacity is growing in absolute terms, it is shrinking as a percentage of the total capacity needed.
The high-capacity point for hard disk drives officially hit 16 TB this week with Seagate Technology’s product launch that targets hyperscale, cloud and NAS customers with rapidly expanding storage requirements.
Seagate brought out a helium-sealed, 7,200 rpm Exos 16 TB HDD for hyperscale data centers and IronWolf and IronWolf Pro 16 TB HDDs for high-capacity NAS use cases in SMBs.
Earlier this year, Toshiba forecast that its 7,200 rpm helium-based MG08 Series 16 TB HDD would become available midyear, although the company has yet to confirm a ship date. Western Digital is expected to ship 16 TB HDDs in 2019 based on conventional magnetic recording (CMR) technology.
Lowering total cost
With SSDs taking over performance use cases, HDDs are largely deployed in systems focused on capacity. Using the highest available capacity is especially important to cloud and enterprise customers with explosively growing volumes of data, as they try to minimize their storage footprint and lower costs. Helium-sealed HDDs help because they enable manufacturers to use thinner platters to pack in more data per HDD and require less power than air-filled drives.
“Time to market is extremely critical given that customers — including hyperscale/cloud customers — have limited resources available to qualify new HDD products,” John Rydning, a research vice president at IDC, noted via email.
Rydning said hyperscale/cloud customers would be first to use the 16 TB HDDs because they have the architecture and software stack to deploy them without diminishing overall system performance. The highest capacity HDDs have lower IOPS per terabyte, he noted.
Sinan Sahin, a principal product manager at Seagate, said the vendor has shipped more than 20,000 test units of its 3.5-inch 16 TB HDDs to hyperscale customers such as Tencent and Google and NAS vendors such as QNAP Systems and Synology.
Toshiba began shipping 16 TB HDDs to customers for qualification slightly after Seagate, and Western Digital has yet to do so, according to Rydning, who tracks the HDD market.
“Cloud customers generally will migrate to the highest available capacity, especially if there is a two- to three-quarter gap before the next capacity is qualified and ramped up in volume,” John Chen, a vice president at Trendfocus, wrote in an email.
Horse race for shift to 16 TB
Chen expects 14 TB CMR HDDs to ramp up in volume in the second half of this year at hyperscale companies. “And it is essentially a horse race between the three suppliers to determine if the transition to 16 TB can be pulled in earlier than the second quarter of 2020,” he added.
Seagate’s Exos schedule shows how the timeline could play out. The 7,200 rpm nearline 12 TB HDD was Seagate’s highest-selling enterprise product in the first quarter. Seagate launched its 14 TB Exos HDDs late last year but made them available this spring to only a limited set of customers because Exos X16 development was running ahead of schedule, according to Sahin.
“We wanted to make sure that we did not have the two products in the channel at the same time,” Sahin said.
Seagate CEO Dave Mosley said during a recent earnings call that he expects the 16 TB HDDs to ramp to high volume this year and to become Seagate's highest revenue producer by next spring.
List pricing for Seagate’s 6 Gbps SATA-based Exos X16 HDD is $629. The IronWolf 16 TB HDD lists at $609.99, and the IronWolf Pro, which offers a higher sustained data rate, is $664.99.
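At these list prices, the cost-per-terabyte math is straightforward. The snippet below is purely illustrative, using only the figures quoted above:

```python
# Illustrative cost-per-terabyte comparison using the list prices quoted
# in the article. The helper function is hypothetical, not vendor tooling.

def price_per_tb(list_price: float, capacity_tb: int) -> float:
    """Return the list price per terabyte, rounded to the cent."""
    return round(list_price / capacity_tb, 2)

drives = {
    "Exos X16": (629.00, 16),
    "IronWolf 16TB": (609.99, 16),
    "IronWolf Pro 16TB": (664.99, 16),
}

for name, (price, tb) in drives.items():
    print(f"{name}: ${price_per_tb(price, tb)}/TB")  # e.g. Exos X16: $39.31/TB
```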
Seagate’s new Exos X16, IronWolf and IronWolf Pro 16 TB HDDs use a nine-platter design to boost areal density. Chen said other manufacturers will also use a nine-disk design — and potentially even more platters in the future — for enterprise capacity-optimized nearline HDDs.
But CMR HDDs aren’t the only option for hyperscalers seeking high-capacity storage. Seagate, Toshiba and Western Digital are also working on new HDDs that use shingled magnetic recording (SMR) technology, with tracks that overlap like the shingles on a roof to increase areal density.
SMR HDD use is typically restricted to workloads that write data sequentially, such as video surveillance and the internet of things. CMR drives write data randomly across the entire disk. SMR adoption has been low because users generally have to make host-side adjustments to use the HDDs without a performance hit. But industry initiatives could start to make it easier for customers to deploy SMR HDDs in the future.
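The host-side adjustment can be pictured as honoring a per-zone write pointer that only advances sequentially. The sketch below is a simplified, hypothetical model (class and method names are invented) of why random writes need special handling on host-managed SMR:

```python
# Minimal sketch of why host-managed SMR needs host-side changes: each zone
# only accepts writes at its current write pointer (sequentially), whereas a
# CMR drive accepts writes at any LBA. This model is illustrative only.

class SMRZone:
    def __init__(self, start_lba: int, size: int):
        self.start = start_lba
        self.size = size
        self.write_pointer = start_lba  # next writable LBA in this zone

    def write(self, lba: int, blocks: int) -> bool:
        """Accept a write only if it begins exactly at the write pointer."""
        if lba != self.write_pointer or lba + blocks > self.start + self.size:
            return False  # a CMR drive would accept this random write
        self.write_pointer += blocks
        return True

    def reset(self):
        """Rewinding the pointer is the only way to rewrite a zone."""
        self.write_pointer = self.start

zone = SMRZone(start_lba=0, size=1024)
print(zone.write(0, 128))    # sequential write at the pointer: True
print(zone.write(512, 128))  # random write past the pointer: False
```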
The highest capacity SMR HDD today is 15 TB. Western Digital began shipping qualification samples of its Ultrastar DC HC620 host-managed SMR HDD last October. Seagate has also sampled an enterprise SMR-based 15 TB HDD, but it hasn’t launched it commercially, according to Sahin. He said Seagate plans to make available a 17 TB SMR HDD, based on the CMR-based Exos X16, later this year. Toshiba did not respond to requests for comment on its SMR HDD plans.
Even higher HDD capacities could hit the market when manufacturers start to ship drives that use heat-assisted magnetic recording (HAMR) and microwave-assisted magnetic recording (MAMR) technologies. Sahin said Seagate expects to make available HAMR-based 20 TB HDDs in late 2020. Toshiba hasn’t specified its roadmap but outlined plans to use MAMR and explore the use of HAMR technology.
Western Digital plans to introduce “energy-assisted” 16 TB CMR HDDs and 18 TB SMR HDDs later this year, according to Mike Cordano, the company’s president and COO. Cordano claimed during the company’s most recent earnings call that the new energy-assisted HDDs would contain fewer disks and heads than competitors’ options. Western Digital late last year had said that its MAMR-based 16 TB HDD would have eight platters.
IDC’s 2018 market statistics for 2.5-inch and 3.5-inch capacity-optimized HDDs showed Seagate in the lead with 47.8% of the unit shipments. Western Digital was next at 22.4% and Toshiba trailed at 9.8%. IDC’s overall HDD unit shipment statistics for 2018 had Seagate in the lead at 40.0%, Western Digital second at 37.2% and Toshiba at 22.8%.
All three vendors make available a wide range of client and enterprise HDDs, including mission-critical enterprise drives that spin at 10,000 rpm and 15,000 rpm.
Blade servers have become a staple in almost every data center. The typical “blade” is a stripped-down modular server that saves space by concentrating processing power and memory on each blade, while forgoing much of the traditional storage and I/O functionality typical of rack and standalone server systems. Their small size and relatively low cost make blades ideal for situations that require high physical server density, such as distributing a workload across multiple web servers.
But high density also creates new concerns that prospective adopters should weigh before making a purchase decision. This guide outlines the most important criteria that should be examined when purchasing blade servers, reviews a blade server’s internal and external hardware, and discusses basic blade server management expectations.
Internal blade server characteristics
Form factor. Although blade server size varies from manufacturer to manufacturer, blade servers are characterized as full height or half height. The height aspect refers to how much space a blade server occupies within a chassis.
Unlike a rackmount server, which is entirely self-contained, blade servers lack certain key components, such as cooling fans and power supplies. These missing components, which contribute to a blade server’s small size and lower cost, are instead contained in a dedicated blade server chassis. The chassis is a modular unit that contains blade servers and other modules. In addition to the servers, a blade server chassis might contain modular power supplies, storage modules, cooling modules (i.e., fans) and management modules.
Blade chassis design is proprietary and often specific to a provider’s modules. As such, you cannot install a Hewlett-Packard (HP) Co. server in a Dell Inc. chassis, or vice versa. Furthermore, blade server chassis won’t necessarily accommodate all blade server models that a manufacturer offers. Dell’s M1000e chassis, for example, accommodates only Dell M series blade servers. But third-party vendors sometimes offer modules that are designed to fit another vendor’s chassis. For example, Cisco Systems Inc. makes networking hardware for HP and Dell blades.
Historically, blades’ high-density design posed overheating concerns, and they could be power hogs. With such high density, a fully used chassis consumes a lot of power and produces a significant amount of heat. While there is little danger of newer blade servers overheating (assuming that sufficient cooling modules are used), proper rack design and arrangement are still necessary to prevent escalating temperatures. Organizations with multiple blade server chassis should design data centers to use hot-row/cold-row architecture, as is typical with rack servers.
Processor support. As organizations ponder a blade server purchase, they need to consider a server’s processing capabilities. Nearly all of today’s blade servers offer multiple processor sockets. Given a blade server’s small form factor, each server can usually accommodate only two to four sockets.
Most blade servers on the market use Intel Xeon processors, although the Super Micro SBA-7142G-T4 uses Advanced Micro Devices (AMD) Inc.’s Opteron 6100 series processors. In either case, blade servers rarely offer less than four cores per socket, and most blade server CPUs have six to eight cores per socket. The Opteron 6100 series processors used by Super Micro offer up to 12 cores per socket.
If you require additional processing power, consider blade modules that can work cooperatively, such as the SGI Altix 450. This class of blades can distribute workloads across multiple nodes. By doing so, the SGI Altix 450 offers up to 38 processor sockets and up to 76 cores when two-core processors are installed.
Memory support. As you ponder a blade server purchase, consider how well the server can host virtual machines (VMs). In the past, blade servers were often overlooked as host servers, because they were marketed as commodity hardware rather than high-end hardware capable of sustaining a virtual data center. Today, blade server technology has caught up with data center requirements, and hosting VMs on blade servers is a realistic option. Because server virtualization is so memory-intensive, organizations typically try to purchase servers that support an enormous amount of memory.
Even with its small form factor, it is rare to find a blade server that offers less than 32 GB of memory. Many of the blade servers on the market support hundreds of gigabytes of memory, with servers like the Fujitsu Primergy BX960 S1 and the Dell PowerEdge M910 topping out at 512 GB.
As important as it is for a blade server to have sufficient memory, there are other aspects of the server’s memory that are worth considering. For example, it is a good idea to look for servers that support error-correcting code (ECC) memory. ECC memory is supported on some, but not all, blade servers. The advantage to using this type of memory is that it can correct single-bit memory errors, and it can detect double-bit memory errors.
Drive support. Given their smaller size, blade servers have limited internal storage. Almost all the blade servers on the market allow for up to two 2.5-inch hard drives. While a server’s operating system (OS) can use these drives, they aren’t intended to store large amounts of data.
If a blade server requires access to additional storage, there are a few different options available. One option is to install storage modules within the server’s chassis. Storage modules, which are sometimes referred to as storage blades or expansion blades, can provide a blade server with additional storage. A storage module can usually accommodate six 2.5-inch SAS drives and typically includes its own storage controller. The disadvantages to using a storage module are that it consumes chassis space and the total amount of storage it provides is still limited.
Organizations that need to maximize chassis space for processing (or provide blade servers with more storage than can be achieved through storage modules) typically deploy external storage, such as network-attached storage (NAS) or a storage area network (SAN). Blade servers can accept Fibre Channel mezzanine cards, which can link a blade server to a SAN. In fact, blade servers can even boot from a SAN, rendering internal storage unnecessary.
If you do use internal storage or a storage module, verify that the server supports hot-swappable drives so that you can replace drives without taking the server offline. Even though hot-swappable drives are standard features among rackmount servers, many blade servers do not support hot-swappable drives.
Expansion slots. While traditional rackmount servers support the use of PCI Express (PCIe) and PCI eXtended (PCI-X) expansion cards, most blade servers cannot accommodate these devices. Instead, blade servers offer expansion slots that accommodate mezzanine cards, which are PCI based. Mezzanine card slots, which are sometimes referred to as fabrics, are referred to by letter, where the first slot is A, the second slot is B and so on.
Mezzanine slots are lettered this way because each slot position connects to a shared chassis interconnect, which requires consistent slot use across blades. If in one server, you install a Fibre Channel card in slot A, for example, every other server in the chassis is affected by that decision. You could install a Fibre Channel card into slot A on your other servers or leave slot A empty, but you cannot mix and match. You cannot, for example, place a Fibre Channel card in slot A on one server and use slot A to accommodate an Ethernet card on another server. You can, however, put a Fibre Channel card in slot A and an Ethernet card in slot B — as long as you do the same on all other servers in the chassis (or, alternatively, leave all slots empty).
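This consistency rule lends itself to a simple validation pass. The following is an illustrative sketch with a hypothetical data layout (one dict mapping slot letter to card type per blade), not any vendor's actual tooling:

```python
# Illustrative check of the slot-consistency rule described above: within a
# chassis, each lettered mezzanine slot must carry the same card type (or be
# empty) on every blade. Data structures and names are hypothetical.

def validate_chassis(blades: list[dict]) -> list[str]:
    """Return the slots that violate the one-card-type-per-slot rule."""
    slot_types: dict[str, str] = {}
    violations = []
    for blade in blades:
        for slot, card in blade.items():
            if card is None:
                continue  # leaving a slot empty is always allowed
            if slot_types.setdefault(slot, card) != card:
                violations.append(slot)
    return violations

ok_chassis = [
    {"A": "fibre_channel", "B": "ethernet"},
    {"A": "fibre_channel", "B": None},  # empty slot B is fine
]
bad_chassis = [
    {"A": "fibre_channel"},
    {"A": "ethernet"},                  # mixes card types in slot A
]

print(validate_chassis(ok_chassis))   # []
print(validate_chassis(bad_chassis))  # ['A']
```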
External blade server characteristics
Power. Blade servers do not contain a power supply. Instead, the power supply is a modular unit that mounts in the chassis. Unlike a traditional power supply, a blade chassis power supply often requires multiple power cords, which connect to multiple 20 ampere utility feeds. This ensures that no single power feed is overloaded, and in some cases provides redundancy.
Another common design provides for multiple power supplies. For example, the HP BladeSystem C3000 enclosure supports the simultaneous use of up to eight different power supplies, which can power eight different blade servers.
Network connectivity. Blade servers almost always include integrated Gigabit Ethernet network interface cards (NICs). However, some servers, such as the Fujitsu Primergy BX960 S1, offer 10 Gigabit Ethernet NICs instead. Unlike a rackmount server, you cannot simply plug a network cable into a blade server’s NIC; the chassis design makes it impossible to do so. Instead, NIC ports are mapped to interface modules, which provide connectivity on the back of the chassis. Notably, a server’s two NIC ports are almost always routed to different interface modules for the sake of redundancy. Additional NIC ports can be added through the use of mezzanine cards.
User interface ports. The interface ports for managing blade servers are almost always built into the server chassis. Each chassis typically contains a traditional built-in keyboard, video and mouse (KVM) switch, although connecting to blade servers through an IP-based KVM may also be an option. In addition, the chassis almost always contains a DVD drive that can be used for installing software to individual blade servers. Some blade servers, such as the HP ProLiant BL280c G6, contain an internal USB port and an SD card slot, which are intended for use with hardware dongles.
Controls and indicators. Individual blade servers tend to be very limited in terms of controls and indicators. For example, the Fujitsu Primergy BX960 S1 only offers an on-off switch and an ID button. This same server has LED indicators for power, system status, LAN connection, identification and CSS.
Often the blade chassis contains additional controls and indicators. For example, some HP chassis include a built-in LCD panel that allows the administrator to perform various configuration and diagnostic tasks, such as performing firmware updates. The precise number and purpose of each control or indicator will vary by manufacturer and blade chassis design.
Blade server management features
Given that blade servers tend to be used in high-density environments, management capabilities are central. Blade servers should offer diagnostic and management capabilities at both the hardware and the software level.
Hardware-based management features. Hardware-level monitoring capabilities exist so that administrators can monitor server health regardless of the OS running on the server. The Intelligent Platform Management Interface (IPMI) is one of the most common such standards and is used by the Dell PowerEdge M910 and the Super Micro SBA-7142G-T4.
IPMI uses a dedicated low-bandwidth network port to communicate a server’s status to IPMI-compliant management software. Because IPMI works at the hardware level, the server can communicate its status regardless of the applications that run on the server. In fact, because IPMI works independently of the main processor, it works even if a server isn’t turned on. The IPMI hardware can do its job as long as a server is connected to a power source.
Blade servers that support IPMI 2.0 almost always include a dedicated network port within the server's chassis that can be used for IPMI-based management. Typically, a single IPMI port services all servers within a chassis; unlike rack servers, individual blades don't each need their own management port.
Blade servers can get away with sharing an IPMI port because of the types of management that IPMI-compliant management software can perform. Such software (running on a PC) is used to monitor things like temperature, voltage and fan speed. Some server manufacturers even include IPMI sensors that are designed to detect someone opening the server’s case. As previously mentioned, blade servers do not have their own fans or power supplies. Cooling and power units are chassis-level components.
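As a concrete illustration, the `ipmitool sensor` command emits pipe-delimited rows (sensor name, reading, units, status, thresholds) that monitoring scripts commonly parse. The sample output below is invented; real sensor names and readings vary by chassis:

```python
# Sketch of consuming IPMI health data: parse the pipe-delimited output of
# `ipmitool sensor`. SAMPLE is invented for illustration; actual sensor
# names, readings and columns depend on the chassis and BMC firmware.

SAMPLE = """\
Ambient Temp     | 24.000     | degrees C  | ok
Fan 1            | 5640.000   | RPM        | ok
PSU 1 Voltage    | 12.100     | Volts      | ok
"""

def parse_sensors(output: str) -> dict[str, tuple[float, str, str]]:
    """Map each sensor name to (reading, units, status)."""
    sensors = {}
    for line in output.strip().splitlines():
        name, reading, units, status = [f.strip() for f in line.split("|")][:4]
        sensors[name] = (float(reading), units, status)
    return sensors

readings = parse_sensors(SAMPLE)
print(readings["Fan 1"])  # (5640.0, 'RPM', 'ok')
```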
Software-based management features. Although most servers offer hardware-level management capabilities, each server manufacturer also provides its own management software, sometimes at an extra cost. Dell, for example, has the management application OpenManage, while HP provides a management console known as the HP Systems Insight Manager (SIM). Hardware management tools tend to be diagnostic in nature, while software-based tools also provide configuration capabilities. You might, for example, use a software management tool to configure a server’s storage array. As a general rule, hardware management is fairly standardized.
Multiple vendors support IPMI and baseboard management controller (BMC), which is another hardware management standard. Some servers, such as the Dell PowerEdge M910, support both standards. Management software, on the other hand, is vendor-specific. You can’t, for example, use HP SIM to manage a Dell server. But you can use a vendor’s management software to manage different server lines from that vendor. For example, Dell OpenManage works with Dell’s M series blade servers, but you can also use it to manage Dell rack servers such as the PowerEdge R715.
Because of the proliferation of management software, server management can get complicated in large data centers. As such, some organizations try to use servers from a single manufacturer to ease the management burden. In other cases, it might be possible to adopt a third-party management tool that can support heterogeneous hardware, though the gain in heterogeneity often comes at a cost of management granularity. It’s important to review each management option carefully and select a tool that provides the desired balance of support and detail.
ABOUT THE AUTHOR: Brien M. Posey has received Microsoft’s Most Valuable Professional award six times for his work with Windows Server, IIS, file systems/storage and Exchange Server. He has served as CIO for a nationwide chain of hospitals and healthcare facilities and was once a network administrator for Fort Knox.
A hyper-converged infrastructure based on VMware virtualization technologies uses VMware’s vSAN to provide software-defined storage to the HCI cluster. VMware supports several types of vSAN clusters, including the stretched cluster.
Stretched clusters let administrators implement an HCI that spans two physical locations. An IT team can use a stretched cluster as part of its disaster recovery strategy or to manage planned downtime to ensure the cluster remains available and no data is lost.
In this article, we dig into the stretched cluster concept to get a better sense of what it is and how it works. But first, let’s delve a little deeper into VMware vSAN and the different types of clusters VMware’s HCI platform supports.
The vSAN cluster
An HCI provides a tightly integrated environment for delivering virtualized compute and storage resources and, to a growing degree, virtualized network resources. It’s typically made up of x86 hardware that’s optimized to support specific workloads. HCIs are known for being easier to implement and administer than traditional systems, while reducing capital and operational expenditures, when used for appropriate workloads. Administrators can centrally manage the infrastructure as a single, unified platform.
Some HCIs, such as the Dell EMC VxRail, are built on VMware virtualization technologies, including vSAN and the vSphere hypervisor. VMware has embedded vSAN directly into the hypervisor, resulting in deep integration with the entire VMware software stack.
An HCI based on vSAN is made up of multiple server nodes that form an integrated cluster, with each node having its own direct-attached storage (DAS). The vSphere hypervisor is also installed on each node, making it possible for vSAN to aggregate the cluster's DAS devices to create a single storage pool shared by all hosts in the cluster.
VMware supports three types of clusters. The first is the standard cluster, located in a single physical site with a minimum of three nodes and maximum of 64. VMware also supports a two-node cluster for smaller implementations, but it requires a witness host to serve as a tiebreaker if the connection is lost between the two nodes.
The third type of cluster VMware vSAN supports is the stretched cluster.
The vSAN stretched cluster
A stretched cluster spans two physically separate sites and, like a two-node cluster, requires a witness host to serve as a tiebreaker. The cluster must include at least two hosts, one for each site, but it will support as many as 30 hosts across the two sites.
When VMware first introduced the stretched cluster, vSAN required that hosts be evenly distributed across the two sites. As of version 6.6, vSAN supports asymmetrical configurations that allow one site to contain more hosts than the other. However, the two sites combined are still limited to 30 hosts.
Because the vSAN cluster is fully integrated into vSphere, it can be deployed and managed just like any other cluster. The cluster provides load balancing across sites and can offer a higher level of availability than a single site. Data is replicated between the sites to avoid a single point of failure. If one site goes offline, the vSphere High Availability (HA) feature launches the virtual machines (VMs) on the other site, with minimal downtime and no data loss.
A stretched cluster is made up of three fault domains: two data sites and one witness host. The term fault domain originated in earlier vSAN versions, where it described VM distribution zones that support cross-rack fault tolerance: if the VMs on one rack became unavailable, they could be made available on the other rack (fault domain).
A stretched cluster works much the same way, with each site in its own fault domain. One data site is designated as the preferred site (or preferred fault domain) and the other is designated as the secondary site. The preferred site is the one that remains active if communication is lost between the two sites. Storage on the secondary site is then considered to be down and the components absent.
The witness host is a dedicated ESXi host — physical server or virtual appliance — that resides at a third site. It stores only cluster-specific metadata and doesn’t participate in the HCI storage operations, nor does it store or run any VMs. Its sole purpose is to serve as a witness to the cluster, primarily acting as a tiebreaker when network connectivity between the two sites is lost.
During normal operations, both sites are active in a stretched cluster, with each maintaining a full copy of the VM data and the witness host maintaining VM object metadata specific to the two sites. In this way, if one site fails, the other can take over and continue operations, with little disruption to services. When the cluster is fully operational, the two sites and the witness host are in constant communication to ensure the cluster is fully operational and ready to switch over to a single site should disaster occur.
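The tiebreaker behavior can be modeled loosely as a majority vote among the three fault domains. The sketch below captures only the site-level intuition with invented names; real vSAN quorum is evaluated per object:

```python
# Simplified model of the stretched-cluster tiebreaker described above: of
# the three fault domains (preferred site, secondary site, witness), a
# partition stays active only if it holds a majority and contains a data
# site. Names and logic are a hypothetical sketch, not vSAN internals.

def surviving_site(reachable):
    """Given the fault domains in one partition, pick the active data site."""
    if len(reachable) < 2:
        return None  # no majority: halt rather than risk split-brain
    if "preferred" in reachable:
        return "preferred"
    if "secondary" in reachable:
        return "secondary"
    return None  # the witness alone holds no data and cannot run VMs

# Inter-site link fails but the preferred site still sees the witness:
# vSAN sides with the preferred site; secondary-site storage is marked down.
print(surviving_site({"preferred", "witness"}))  # preferred
print(surviving_site({"secondary", "witness"}))  # secondary
print(surviving_site({"witness"}))               # None
```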
The HCI-VMware mix
Administrators can use VMware vCenter Server to deploy and manage a vSAN stretched cluster, including the witness host. With vCenter, they can carry out tasks such as changing a site designation from secondary to preferred or configuring a different ESXi host as the witness host. Implementing and managing a stretched cluster is much like setting up a basic cluster, except you must have the necessary infrastructure in place to support two locations.
For organizations already committed to HCIs based on VMware technologies, the stretched cluster could prove a useful tool as part of their DR strategies or planned maintenance routines. For those not committed to VMware but considering HCI, the stretched cluster could provide the incentive to go the VMware route.
Why were several major online advertising firms selling traffic from compromised WordPress sites to threat actors operating some of the most dangerous exploit kits around?
That was the question at the heart of a 2018 report from Check Point Research detailing the inner workings of an extensive malvertising campaign it calls “Master134,” which implicated several online advertising companies. According to the report, titled “A Malvertising Campaign of Secrets and Lies,” a threat actor or group had compromised more than 10,000 vulnerable WordPress sites through a remote code execution vulnerability in an older version of the content management system.
Malvertising is a common, persistent problem for the information security industry, thanks to the pervasiveness of digital ads on the internet. Threat actors have become adept at exploiting vulnerable technology and lax oversight in the online ad ecosystem, which allows them to use ads as a delivery mechanism for malware. As a result, many security experts recommend using ad blockers to protect endpoints from malvertising threats.
But Master134 was not a typical malvertising campaign.
A tangled web of redirects
Rather than using banner ads as a vector for malware infection, threat actors relied on a different component of the digital advertising ecosystem: web traffic redirection. In addition to serving digital ads, many ad networks buy and sell traffic, which is then redirected and used to generate impressions on publishers’ ads. These traffic purchases are made through what’s known as real-time bidding (RTB) platforms, and they are ostensibly marketed as legitimate or “real” users, though experts say a number of nefarious techniques are used to artificially boost impressions and commit ad fraud. These techniques include the use of bots, traffic hijacking and malicious redirection codes.
Threat actors never cease to look for new techniques to spread their attack campaigns, and do not hesitate to utilize legitimate means to do so.
Check Point Research’s report, ‘A Malvertising Campaign of Secrets and Lies’
According to Check Point Research, part of Check Point Software Technologies, Master134 was an unusually complex operation involving multiple ad networks, RTB platforms and traffic redirection stages. Instead of routing the hijacked WordPress traffic to malicious ads, the threat actors redirected the traffic intended for those sites to a remote server located in Ukraine with the IP address “126.96.36.199,” hence the name Master134. (Check Point said a second, smaller source of traffic to the Master134 server was a PUP that redirected traffic intended for victims’ homepages.)
Then, the Master134 campaign redirected the WordPress traffic to domains owned by a company known as Adsterra, a Cyprus-based online ad network. Acting as a legitimate publisher, Master134 sold the WordPress traffic through Adsterra’s network to other online ad companies, namely ExoClick, EvoLeads, AdventureFeeds and AdKernel.
From there, the redirected WordPress traffic was resold a second time to threat actors operating some of the most well-known malicious sites and campaigns in recent memory, including HookAds, Seamless and Fobos. The traffic was redirected a third and final time to “some of the exploit kit land’s biggest players,” according to Check Point’s report, including the RIG and Magnitude EKs.
The researchers further noted that all of the Master134 traffic ended up in the hands of threat actors and was never purchased by legitimate advertisers. That, according to Check Point, indicated “an extensive collaboration between several malicious parties” and a “manipulation of the entire online advertising supply chain,” rather than a series of coincidences.
Why would threat actors and ad networks engage in such a complex scheme? Lotem Finkelsteen, Check Point’s threat intelligence analysis team leader and one of the contributors to the Master134 report, said the malvertising campaign was a mutually beneficial arrangement. The ad companies generate revenue off the hijacked WordPress traffic by reselling it. The Master134 threat actors, knowing the ad companies have little to no incentive to inspect the traffic, use the ad network platforms as a distribution system to match potential victims with different exploit kits and malicious domains.
“In short, it seems threat actors seeking traffic for their campaigns simply buy ad space from Master134 via several ad-networks and, in turn, Master134 indirectly sells traffic/victims to these campaigns via malvertising,” Check Point researchers wrote.
Check Point’s report was also a damning indictment of the online ad industry. “Indeed, threat actors never cease to look for new techniques to spread their attack campaigns, and do not hesitate to utilize legitimate means to do so,” the report stated. “However, when legitimate online advertising companies are found at the heart of a scheme, connecting threat actors and enabling the distribution of malicious content worldwide, we can’t help but wonder — is the online advertising industry responsible for the public’s safety?”
Other security vendors have noted that malvertising and adware schemes are evolving and becoming increasingly concerning for enterprises. Malwarebytes’ “Cybercrime Tactics and Techniques” report for Q3 2018, for example, noted that adware detections increased 15% for businesses while dropping 19% for consumers. In addition, the report noted a rise in new techniques such as adware masquerading as legitimate applications and browser extensions for ad blockers and privacy tools, among other things.
The malvertising Catch-22
The situation has left both online ad networks and security vendors in a never-ending game of whack-a-mole. Ad companies frequently find themselves scrutinized by security vendors such as Check Point in reports on malvertising campaigns. The ad companies typically deny any knowledge or direct involvement in the malicious activity while removing the offending advertisements and publishers from their networks. However, many of those same ad networks inevitably end up in later vendor reports with different threat actors and malware, issuing familiar denials and assurances.
Meanwhile, security vendors are left in a bind: If they ban the ad networks’ servers and domains in their antimalware or network security products, they effectively block all ads coming from repeat offenders, not just the malicious ones, which hurts legitimate publishers as well as the entire digital advertising ecosystem. But if vendors don’t institute such bans, they’re left smacking down each new campaign and issuing sternly worded criticisms to the ad networks.
That familiar cycle was on display with Master134; following Check Point’s publication of the report on July 30, three of the online ad companies — Adsterra, ExoClick and AdKernel — pushed back on the Check Point report and adamantly denied they were involved in the Master134 scheme (EvoLeads and AdventureFeeds did not comment publicly on the Master134 report). The companies claimed they are leading online advertising and traffic generation companies and were not directly involved in any illegitimate or malicious activity.
Check Point revised the report on August 1 and removed all references to one of the companies, New York-based AdKernel LLC, which had argued the report contained false information. Check Point’s original report incorrectly attributed two key redirection domains — xml.bikinisgroup.com and xml.junnify.com — to the online ad company. As a result, several media outlets, including SearchSecurity, revised or updated their articles on Master134 to clarify or completely remove references to AdKernel.
But questions about the Master134 campaign remained. Who was behind the bikinisgroup and junnify domains? What was AdKernel’s role in the matter? And most importantly: How were threat actors able to coordinate substantial amounts of hijacked WordPress traffic through several different networks and layers of the online ad ecosystem and ensure that it always ended up on a select group of exploit kit sites?
A seven-month investigation into the campaign revealed patterns of suspicious activity and questionable conduct among several ad networks, including AdKernel. SearchSecurity also found information that implicates other online advertising companies, demonstrating how persistent and pervasive malvertising threats are in the internet ecosystem.
Open source and information security applications go together like peanut butter and jelly.
The transparency provided by open source in infosec applications — what they monitor and how they work — is especially important for packet sniffer and intrusion detection systems (IDSes) that monitor network traffic. It may also help explain the long-running dominance of Snort, the champion of open source enterprise network intrusion detection since 1998.
The transparency enabled by an open source license means anyone can examine the source code to see the detection methods used by packet sniffers to monitor and filter network traffic, from the OS level up to the application layer.
One problem with open source projects is that when leadership changes — or when ownership of a project moves from individuals to corporations — the projects don’t always continue to be fully free to use, or support for the open source version of the project may take a back seat to a commercial version.
For example, consider Snort, first released as an open source project in 1998. Creator Martin Roesch started Sourcefire in 2001 in a move to monetize the popular IDS. But, in the years running up to Cisco’s 2013 purchase of Sourcefire, the concern was that the company might allow the pursuit of profit to undermine development and support of the open source project. For example, Sourcefire sold a fully featured commercial version of Snort, complete with vendor support and immediate updates, a practice that has bedeviled other open source projects, as users often find the commercial entity gives the open source project short shrift to maximize profits.
Cisco has taken a different approach to the project, however. While the networking giant incorporates Snort technology in its Next-Generation Intrusion Prevention System (IPS) and Next-Generation Firewall products, Cisco “embraces the open source model and is committed to the GPL [GNU General Public License].” Cisco releases back to the open source project any feature or fixes to Snort technology incorporated in its commercial products.
What is an IDS and why is it important?
IDSes monitor network traffic and issue alerts when potentially malicious traffic is detected. An IDS is designed to act as a packet sniffer, a system able to monitor all packets sent on the organization’s network, and it applies a variety of techniques to spot traffic that may be part of an attack. IDSes identify suspicious network traffic using the following detection methods:
Network traffic signatures identify malicious traffic based on the protocols used, the source of the packets, the destination of the packet or some combination of these and other factors.
Blocked lists of known malicious IP addresses enable the IDS to detect packets with an IP address identified as a potential threat.
Anomalous network behavior patterns, similar to signatures, use information from threat intelligence feeds or authentication systems to identify network traffic that may be part of an attack.
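The first two detection methods above can be illustrated with a short Python sketch. The packet fields, blocklist entries and signature entries here are hypothetical, and a real IDS operates on raw packets rather than parsed dictionaries.

```python
# Toy illustration of two IDS detection methods: an IP blocklist and
# simple protocol/port "signatures". All data below is made up.

BLOCKLIST = {"203.0.113.9"}          # known-bad IPs (documentation range)
SIGNATURES = [
    ("tcp", 4444),                   # port commonly used by reverse shells
    ("udp", 1900),                   # SSDP, often abused for amplification
]

def inspect(packet):
    """Return a list of alert strings for one parsed packet (a dict)."""
    alerts = []
    # Blocklist check: either endpoint matching a known-bad IP is flagged.
    if packet["src"] in BLOCKLIST or packet["dst"] in BLOCKLIST:
        alerts.append("blocklisted address")
    # Signature check: protocol/destination-port combinations of interest.
    if (packet["proto"], packet["dport"]) in SIGNATURES:
        alerts.append(f"signature match: {packet['proto']}/{packet['dport']}")
    return alerts
```

A production IDS layers many more factors into its signatures (payload content, packet ordering, connection state), but the matching logic follows this same pattern.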
IDSes can be host- or network-based. In a host-based IDS, software sensors are installed on endpoint hosts in order to monitor all inbound and outbound traffic, while, in a network-based IDS, the functionality is deployed in one or more servers that have connectivity to as many of the organization’s internal networks as possible.
The intrusion detection function is an important part of a defense-in-depth strategy for network security that combines active listening, strong authentication and authorization systems, perimeter defenses and integration of security systems.
Snort, long the leader among enterprise network intrusion detection and intrusion prevention tools, is well-positioned to continue its reign with continued development from the open source community and the ongoing support of its corporate parent, Cisco.
In general terms, Snort offers three fundamental functions:
Snort can be used as a packet sniffer, like tcpdump or Wireshark, by setting the host’s network interface into promiscuous mode to monitor all traffic on the local network interface and writing that traffic to the console.
Snort can log packets by writing the desired network traffic to a disk file.
Snort’s most important function is to operate as a full-featured network intrusion prevention system, by applying rules to the network traffic being monitored and issuing alerts when specific types of questionable activity are detected on the network.
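The third mode is driven by a rules file. A minimal rule in standard Snort syntax looks like the following; the message, content match and SID are made up for illustration:

```
alert tcp $EXTERNAL_NET any -> $HOME_NET 80 (msg:"Example - possible path traversal"; content:"/etc/passwd"; nocase; sid:1000001; rev:1;)
```

This rule tells Snort to raise an alert whenever inbound TCP traffic to port 80 contains the string /etc/passwd, case-insensitively; SIDs of 1000000 and above are conventionally reserved for locally written rules.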
Unlike Snort, which is a self-contained application, Security Onion is a complete Linux distribution that packages a toolbox of open source applications — including Snort — that are useful for network monitoring and intrusion detection, as well as other security functions, like log management. In addition to Snort, Security Onion includes other top intrusion detection tools, like Suricata, Zeek IDS and Wazuh.
Infosec professionals can install Security Onion on a desktop to turn it into a network security monitoring workstation or install the Security Onion distribution on endpoint systems and virtual environments to turn them into security sensors for distributed network intrusion monitors.
The Wazuh project offers enterprises a security monitoring application capable of threat detection, integrity monitoring, incident response and compliance. While it may be seen as a newcomer, Wazuh was forked from the venerable OSSEC project in 2015, and it has replaced OSSEC in many cases, such as in the Security Onion distribution.
Running as a host-based IDS, Wazuh uses both signatures and anomaly detection to identify network intrusions, as well as software misuse. It also can be used to collect, analyze and correlate network traffic data for use in compliance management and for incident response. Wazuh can be deployed in on-premises networks, as well as in cloud or hybrid computing environments.
First released in beta in 2009, Suricata has a respectable history as a Snort alternative. The platform shares architectural similarities with Snort. For example, it relies on signatures like Snort, and in many cases, it can even use the VRT Snort rules that Snort itself uses.
Like Snort, Suricata features IDS and IPS functionality, as well as support for monitoring high volumes of network traffic, automatic protocol detection, a scripting language and support for industry standard output formats. In addition, Suricata provides an engine for enterprise network security monitoring ecosystems.
The name may be unfamiliar, but the Zeek network security monitor is another mature open source IDS. The network analysis framework formerly known as Bro was renamed Zeek in 2018 to avoid negative associations with the old name, but the project is still as influential as ever.
More than a simple IDS/IPS, Zeek is a network analysis framework. While the primary focus is on network security monitoring, Zeek also offers more general network traffic analysis functionality.
Specifically, Zeek incorporates many protocol analyzers and is capable of tracking application layer state, which makes it ideal for flagging malicious or other harmful network traffic. It also offers a scripting language to enable greater flexibility and more powerful security.
Those of us working in security like to think our efforts are all we need to find vulnerabilities, contain threats and minimize business risks.
I had this mindset early on in my security career. The thought was: Go through the motions; do x, y and z; and that will serve as a solid security foundation. I quickly learned the world doesn’t work that way; action doesn’t necessarily translate into results.
Certain efforts contribute to a security program in positive ways, while others burn through time, money and effort with no return. Yet, as it relates to application security, all is not lost. You can take steps as part of your program that can yield near-immediate payoffs, boost your security efforts and minimize your business risks.
It’s easy to look at application security testing as a science: a binary set of methodologies, tests and tools that delivers what you need when executed on a periodic basis. The problem is that it isn’t true.
Without going into all the details required to run a strong application security program, let’s look at some of the common shortcomings of application security testing and discuss what you should and shouldn’t do as you move forward and improve. The following issues rank among the biggest application security challenges.
Application security is often lumped into network security. This means application security testing is often part of more general vulnerability and penetration testing. As a result, application security doesn’t get the detailed attention it deserves.
Simply running vulnerability scans with traditional tools isn’t going to get you where you need to be. Organizations need to run dedicated web vulnerability scanners, such as WebInspect and Netsparker; proxy tools, such as Burp Suite and the OWASP Zed Attack Proxy; and web browser plugins. This level of testing is necessary to uncover critical web vulnerabilities that would otherwise be overlooked.
This issue is easy to resolve by getting all the right people involved and ensuring your testing efforts are properly scoped.
Web applications aside, mobile apps are often overlooked. I’m not sure why mobile app security is sometimes ignored. Mobile apps have been around for years and often serve as a core component of a business’s online presence.
Faulty assumptions about mobile app security abound, however, among them the belief that mobile apps offer only a limited attack surface because of their finite functionality, or that the apps themselves are secure because they have been previously vetted by developers or app stores. This perspective is shortsighted, to say the least, and it can come back to haunt developers, security teams and businesses as a whole.
Abandoning web testing because sites and applications are hosted by a third party. This is similar to mobile apps not being properly vetted. If you’re not doing the testing, somebody needs to, and it had better be the company doing the hosting or management because, I can assure you, no one else is other than the criminal hackers continually trying to find flaws in your environment. The bad guys probably aren’t going to tell you what they’ve uncovered until they have you backed into a corner, if ever.
Don’t let bystander apathy drive your application security testing. Be accountable or hold someone else accountable and review the work.
Companies that decline to perform authenticated application testing. It may be difficult to test every possible user role, but you really need to examine all the aspects of your application eventually.
In the application security testing I conduct, I often see multiple user roles with no critical flaws. But when I test one or two more roles, big vulnerabilities like SQL injection surface. An oversight like this — simply because you didn’t have the time or the budget to test everything — will likely prove indefensible. You need to think about how you’re going to respond when the going gets rough with an incident or breach. Better yet, think about how you’re going to prevent an oversight from facilitating application risks in the first place.
If you want to find and eliminate the blind spots in your application security testing, you must take a hard look at your program. A wise person once said, “Is this as good as you’re going to get, or are you going to get any better?” Look at your application security testing program through this lens. Bring in an unbiased outsider if you need to.
You’re probably working in the security field because it has great payoffs — both tangible and intangible. Things change daily, and there’s always something new to discover and learn. Whether you work for an employer or you’re out on your own, if you’re going to get better and see positive, long-term results with application security, you have to be willing to see what you’re doing with a critical eye and assume there’s room for improvement. Odds are, there is.
When it comes to understanding how all the elements of a computer network connect and interact, it’s certainly true that a picture — or in this case, a network diagram — is worth a thousand words.
A visual representation of a network makes it a lot easier to understand not only the physical topology of the network, with its routers, devices, hubs, firewalls and so on, but also the logical topology of VPNs, subnets and routing protocols that control how traffic flows through the network.
Maintaining visibility across infrastructures and applications is vital to ensure data and resources are correctly monitored and secured. However, research conducted by Dimensional Research and sponsored by Virtual Instruments showed that most enterprises lack the tools necessary to provide complete visibility for triage or daily management. This is a real concern, as poor infrastructure visibility can lead to a loss of control over the network and can enable attackers to remain hidden.
Infrastructure as code, the management of an IT infrastructure with machine-readable scripts or definition files, is one way to mitigate the security risks associated with human error while enabling the rapid creation of stable and consistent but complex environments. However, it’s vital for you to ensure that the resulting network infrastructures are indeed correctly connected and protected and do not drift from the intended configuration.
Infrastructure as code tools
Tools such as Cloudcraft and Lucidchart can automatically create AWS architecture diagrams showing the live health and status of each component, as well as its current configuration and cost. Because the physical and logical topology of the network is generated directly from the operational AWS configuration, not from what a network engineer thinks the infrastructure as code scripts have created, the diagram is a true representation of the network, one that can be reviewed and audited.
There are similar tools for engineers using Microsoft Azure, such as Service Map and Cloudockit.
Once a network generated using infrastructure as code tools has been audited and its configuration has been secured, it’s important to monitor it for any configuration changes. Unmanaged configuration changes can occur when engineers or developers make direct changes to network resources or their properties in an out-of-band fix without updating the infrastructure as code template or script. The correct process is to make all the changes by updating the infrastructure as code template to ensure all the current and future environments are configured in exactly the same way.
AWS offers a drift detection feature that can detect out-of-band changes to an entire environment or to a particular resource so it can be brought back into compliance. Amazon Virtual Private Cloud Flow Logs is another feature that can be used to ensure an AWS environment is correctly and securely configured.
This tool captures information about the IP traffic going to and from network interfaces, which can be used for troubleshooting and as a security tool to provide visibility into network traffic to detect anomalous activities such as rejected connection requests or unusual levels of data transfer. Microsoft’s Azure Stack and tools such as AuditWolf provide similar functionality to monitor Azure cloud resources.
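As a sketch of how flow log data supports this kind of monitoring, the following Python snippet parses records in the default version-2 VPC Flow Log format and pulls out source addresses with rejected connection attempts. The field layout follows the default format; the record contents are fabricated for illustration.

```python
# Parse VPC Flow Log records (default version-2, space-separated format)
# and surface sources of rejected connection requests -- one of the
# anomalous activities mentioned in the text.

FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()

def parse_record(line):
    """Map one whitespace-separated flow log line onto named fields."""
    return dict(zip(FIELDS, line.split()))

def rejected_sources(lines):
    """Return the set of source addresses that had REJECT records."""
    return {r["srcaddr"] for r in map(parse_record, lines)
            if r["action"] == "REJECT"}
```

In practice these records would come from CloudWatch Logs or S3 rather than plain strings, but the analysis step is the same: filter on the action field, then aggregate by address or port.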
Security fundamentals don’t change when resources and data are moved to the cloud, but visibility into the network in which they exist does. Any organization with a limited understanding of how its cloud environment is actually connected and secured, or that has poor levels of monitoring, will leave its data vulnerable to attack.
The tools and controls exist to ensure network engineers and developers can enjoy the benefits of infrastructure as code without compromising security. Like all security controls, though, you need to understand them and use them on a daily basis for them to be effective.
Hard drives are the lifeblood of your business, whether in your computers and laptops or within your network infrastructure through servers, SANs, RAID arrays and more.
No matter how well designed or sturdy a hard drive may be, all hard drives eventually fail. Sometimes a drive will show symptoms of an impending failure, allowing users time to back up their data and search for a replacement.
Signs of Hard Drive Failure Include:
Abnormal heat output
Whirring, clicking, or other sounds
Other times, hard drives will fail without warning, and that total failure can result in the loss of all data on that particular drive. The data recovery process can be expensive and time-consuming, can result in lost business, and ultimately may not succeed in recovering hundreds of rands’ worth of digital media, thousands of rands’ worth of customer records, or financial records, processes and training documents.
Considerations When Buying a Used Hard Drive
If your business is using legacy equipment, replacement parts may have been declared end of life (EOL) by the original manufacturer, such as EMC, IBM, Dell, Equallogic or Sun. When this happens, if you do not have any spares on hand, the used/refurbished market is your best bet for finding a replacement.
If you do not have any contacts in the used market, you may be tempted to turn to eBay. Many reputable used companies sell on eBay, but many parts listed on eBay are sold by liquidation companies who do not have the means to test the equipment they acquire, and possess little knowledge about what they’re listing outside of information presented on the label itself.
The dangers of buying hard drives on eBay include:
1. The item may not function.
2. The item may be listed incorrectly.
3. The hard drive may not have been wiped.
4. The seller may be overseas, or may care little about returning the item or troubleshooting problems.
5. Stock may be limited, and the seller may not be able to replace the equipment.
6. You risk further downtime dealing with slow shipping or incorrect or faulty products.
How to Buy Used Enterprise Equipment Online
If you’re going to buy used hard drives or other failed IT equipment, it’s pragmatic to buy from a professional, reputable used IT equipment company. Not only can such a company test equipment and likely have a quality control program in place, it will also have a dead-on-arrival (DOA) return policy and offer solid customer service.
1. Do a Google (or other search engine) search using the part numbers on your failed hard drive. Hint: Using the manufacturer part number may provide the most accurate search results.
2. Look for professional retail websites that offer secure online purchasing (look for HTTPS in the URL, a shield, or badges from Trustwave, Verisign, etc.).
3. Make sure the product listing has an “Add to Cart” or “Buy It Now” button, not just a request-a-quote form.
4. If you’re in a pinch, look for sites that offer same-day or overnight shipping. Be sure to read their shipping and return policies.
5. Look for sites with reviews of the used hard drive you will be buying.
6. Avoid sites that offer “instant quotes” or want you to call for pricing; these can leave you waiting days for responses and force you to compare prices and options from multiple companies. You can also end up on unwanted mailing lists through data mining.
Conclusion on Buying Used Enterprise Hard Drives Online
Buying one or more previously used hard drives can be a quick and inexpensive way to bring your system back to an operational state. If you’re buying from a trustworthy, knowledgeable business, buying online can be fast, rewarding and cost-efficient.
It may be necessary to restart the Navisphere management server on an EMC CLARiiON CX, CX3 or CX4 array if any of the problems below are present:
A Fatal Event icon (a red letter “F” in a circle) is displayed for some physical element of the array, but Navisphere CLI reports no faults.
A host displays a “U” icon even after the host is rebooted.
Navisphere User Interface (UI) displays faults that Navisphere CLI does not show, or the two report different faults.
An unmanaged Storage Processor (SP) still has owned LUNs.
Navisphere User Interface (UI) hangs or freezes.
Navisphere User Interface (UI) is displaying faults but when the faults option is clicked it shows the array is operating normally.
A fault appears on the primary array, but all indications show that the array is operating normally.
The Management Servers could not be contacted.
Clicking Fault icon returns “array is operating normally” message.
CX series array does not recognize the new DAE from Navisphere Manager.
A fault appears after replacing a standby power supply.
Note: The procedure must be performed on both Storage Processors in order to be effective.
Open a new browser window.
In the address bar, type http://xxx.xxx.xxx.xxx/setup, where xxx.xxx.xxx.xxx is the IP address of the Storage Processor (SP).
When the screen has loaded, type in the Username and Password used to access Navisphere User Interface (UI).
Once logged in, click the “Restart Management Server” button.
Once the page has loaded, click “Yes”, and then click “Submit.”