
Companies Push The Limits Of Virtualization

New software, hardware, and networking systems let IT managers stung by the recession go further with server consolidation.

InformationWeek Green - May 4, 2009
Download the entire InformationWeek "green" issue,
our May 4 magazine distributed solely in PDF form.
(Registration required.)
We will plant a tree
for each of the first 5,000 downloads.

Illustration by Sek Leung

When it comes to virtualizing servers, few companies can match Accenture's outsourcing arm. The unit hosts customers such as an airline reservation site, as well as several high-volume trading and insurance systems. Accenture's service-level agreements with these customers impose thousands of dollars of penalties for each minute of downtime.

With money like that at stake, you'd think IT architect Eric Ulmer would be conservative when it comes to virtualizing his Minneapolis, London, Sydney, and Cologne data centers. Not so. While most other companies find their servers are maxed out at 10 or 12 virtual machines per server, he's designed ones that run 30 VMs each. And Ulmer's not stopping there. He'll move to 60 VMs per server as soon as he's completely confident that VMware's new vSphere 4 virtualization management software is up to the task.

Ulmer admits, though, that the big numbers make him nervous. "With virtualization, a lot of customers go down if one server fails," he says. "If five customers are down, we don't have enough phone lines to take all their complaints."

But it's not stopping him. His testing shows that VMware's ESX Server can run 60 virtual machines and possibly more. His group once assigned all the VMs intended for two servers to one by mistake. Ulmer was surprised to find the server humming along running 100 VMs without a noticeable performance hit.

Ulmer and his team are among an elite group of data center pioneers who right now are testing the limits of server virtualization, pushing for the next tier of performance even as most companies are just getting comfortable with the technology. They're increasing the number of VMs per server in order to save electricity, capital costs, and even labor when the right management tools are in place. In this deep recession, many IT managers would like to go further with server consolidation. It's these pioneers who are discovering the limits to how many virtual machines can practically be loaded onto any one server, and what problems to watch for as each additional VM stresses the overall system.

Ulmer and others who are pushing for more performance are getting plenty of encouragement from vendors, whose next generation of virtualization software and servers appear to be converging toward a big jump in productivity and capacity.

Next-Gen Advantage
VMware has just launched vSphere 4, an upgrade to its data center operating system that shows how vendors are trying to push the state of the art in virtualization. With upgrades available this month and next, VMware claims companies can get a 30% productivity gain using existing servers--so for companies running 10 VMs today, 13 should be within reach, says Bogomil Balkansky, VP of product marketing.

Couple that with server makers that are launching their next-generation machines based on Intel's Nehalem, or Xeon 5500, chip, and you're talking major virtualization advances. The 5500 is really the first chip to escape from the personal computing bias of the original x86 chips. It has a memory controller built onto the chip instead of off-loaded to a separate dedicated chip, reducing latencies encountered as a VM's operating system manages the memory that its application is using. The 5500 also has more built-in virtualization support for the hypervisor. It's much more of a multithreaded server processor capable of juggling many assignments across its four cores, making it better equipped to run multiple VMs.

IBM and Hewlett-Packard each say they're seeing gains of just over 60% in benchmark tests of their new Xeon 5500 servers in virtualized environments, compared with previous generations. That means a hardware shift could let the typical 10 VM-per-server company bump up to 16. Taken together, the VMware software upgrade and the new server designs could very well let companies double the number of virtual machines they run per server.
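Taken at face value, those vendor numbers compound. A quick back-of-envelope sketch shows how the 30% software gain and the roughly 60% hardware gain stack--treating the two gains as independent and multiplicative is our assumption here, not a published benchmark:

```python
# Back-of-envelope projection of VMs per server, assuming the quoted
# vendor gains stack multiplicatively (an assumption, not a benchmark).

def projected_vms(baseline_vms, software_gain, hardware_gain):
    """Scale a baseline VM count by two independent fractional gains."""
    return int(baseline_vms * (1 + software_gain) * (1 + hardware_gain))

print(projected_vms(10, 0.30, 0.00))  # vSphere upgrade alone -> 13
print(projected_vms(10, 0.00, 0.60))  # Xeon 5500 refresh alone -> 16
print(projected_vms(10, 0.30, 0.60))  # both together -> 20
```

Starting from 10 VMs per server, the combined figure lands right at the "double" the article describes.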

Unlocking Virtualization
Facing IT and business realities.
Doubling isn't good enough for Cisco, a newcomer to blade servers. Its upcoming Unified Computing System has built-in I/O devices that promise to juggle network and storage traffic and, when necessary, open additional pipes. Cisco hasn't published benchmark results for its new blade--which it expects to start offering later this year--but claims it will quadruple the number of VMs customers can run on existing hardware.

We'll see if Cisco delivers on that claim in the real world. But if it can, going from 10 to 40 VMs per server would put more IT managers in Ulmer's world of extreme virtualization. It also would introduce more of them to the many problems he's dealing with.

The Limits
Ulmer encounters three severe limits on how many VMs a server can run: CPU, memory, and I/O. To maximize the number of virtual machines you're running, his advice is to first maximize your servers' CPU count and memory.

The Sun Microsystems Sun Fire X-4600 servers he uses, designed by former Sun chief server architect Andy Bechtolsheim, max out what the Advanced Micro Devices Opteron chip can do. Twenty-eight servers are loaded with 128 GB of memory, a big number for their vintage; six have 256 GB of memory, going far beyond most of the current generation. In their next generation of Xeon 5500 servers, IBM and HP plan to put a maximum of 128 GB and 144 GB of memory, respectively, on their servers.

Ulmer's in the process of upgrading his 128-GB server memories to 256 GB. "Memory is the weak link. If you suffer memory depletion, it's the endgame for VMware's hypervisor" as it slows to a crawl, he says.

Ulmer's four- and eight-way Sun Fires are each equipped with dual- or quad-core Opteron CPUs; that's 16 or 32 cores per server. One of the few ways to make use of all those CPU cycles is by hosting multiple virtual machines. At 30 VMs per server, Ulmer's discovered that he's still only using 20% of available CPU cycles.
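Ulmer's pattern--ample spare CPU but memory as the binding constraint--can be sketched with a rough capacity check. The per-VM figures below (8 GB of memory and a quarter of a core per VM, plus a few gigabytes reserved for the hypervisor) are illustrative assumptions, not Accenture's actual sizing:

```python
# Rough check of which resource caps VM count first on one server.
# Per-VM memory, per-VM CPU share, and hypervisor overhead are assumed
# figures for illustration, not measured values.

def max_vms(server_mem_gb, cores, vm_mem_gb, vm_cores, hypervisor_mem_gb=4):
    """Return (VM ceiling, limiting resource) for one server."""
    by_memory = (server_mem_gb - hypervisor_mem_gb) // vm_mem_gb
    by_cpu = int(cores / vm_cores)
    if by_memory <= by_cpu:
        return by_memory, "memory"
    return by_cpu, "cpu"

# A 256-GB, 32-core server: memory runs out long before CPU does.
print(max_vms(256, 32, 8, 0.25))  # -> (31, 'memory')
```

Under these assumptions, a 256-GB server exhausts memory at around 31 VMs while the 32 cores could nominally host 128, which is consistent with Ulmer's observation that memory, not CPU, is the weak link.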

He's also bought specialized I/O hardware from a startup, Xsigo, which early on saw I/O as a potential bottleneck in virtualization. Xsigo puts converged network adapters on the server to move network and storage traffic coming from the VMs off the server and into an I/O Director, a hardware device that splits traffic up and sends it to its correct destination using 10-Gbps Ethernet pipes.

Most virtualization users let the hypervisor handle VM networking traffic, and that's a big constraint. VMware customers rely on ESX Server's vSwitch, software in the hypervisor that routes network traffic, and it has a much greater impact on CPU resources than dedicated hardware does. When network traffic appears, the hypervisor stops what it's doing, clears application instructions and data from the chip pipelines and buffers, and lets the vSwitch decide where to send the traffic. Frequent packet flows to other virtual machines, network routers, and storage mean frequent interruptions of VM processing and slower operation. Ulmer believes off-loading network traffic from the hypervisor is one of the keys to increasing the number of VMs a server can run.

Because there's so much switching intelligence inside the Xsigo I/O Director, Ulmer uses just two cables from his heavily virtualized servers to the I/O device. Without the hardware device, he says, he'd end up with "a spider's den" of cables. "There'd be so many cables, I believe we'd run into human error," he says.

The I/O Challenge
If Ulmer's right and I/O is the next bottleneck holding back the number of VMs that can run on a server, then Cisco may have stolen a march on IBM, HP, Dell, and Sun as it brings converged network traffic to the virtualized server. In effect, with its Unified Computing System, Cisco is promising to do in Cisco servers and 10 Gigabit Ethernet devices what the Sun servers and Xsigo do with their own combination of hardware.

IBM and HP dispute that Cisco has gained an edge. There's no significant advantage to Cisco's approach, says Gary Thome, director of strategy and architecture for HP's blade group. HP doesn't see the data center as "a network with servers hanging off the end," he says, taking a swipe at Cisco's network orientation.

4 Areas To Watch When You're Adding VMs
1. Management Tools
Make sure you can see all the VMs that are running and find any sleeping ones.
2. Policy Migration
You must have a way to allow security, privacy, and compliance policies to follow your VMs when they move from one server to another.
3. Server Resources
Intel and AMD are adding cores to their latest chips. Make sure your server has enough memory to take advantage of all the CPU power and don't let I/O cause any slowdowns.
4. Standards Issues
The Fibre Channel over Ethernet standard will ease I/O problems. But until it's available later this year, you'll be buying into a vendor's proprietary interpretation of what it will look like.

HP countered last month with its Matrix Orchestration Environment, describing it as a unified management interface for its HP BladeSystem Matrix that will use the Xeon 5500 chips. HP will virtualize I/O and consolidate network devices with its Virtual Connect Flex-10 Ethernet and 8-Gb Fibre Channel devices that come with the blade chassis. This will let customers consolidate their existing Ethernet infrastructures, Thome says, adding that "it's the first time customers can get a converged system without a rip-and-replace strategy."

IBM will announce its own blade architecture upgrade later this year and should be able to provide a converged I/O blade without requiring customers to use nonstandard networking devices. It will do no good to multiply the number of CPU cycles if virtual machines sit idle as the hypervisor laboriously processes Ethernet packets. The goal of any high-powered blade platform is "to build a balanced system," says Rob Sauerwalt, strategic director of marketing at IBM.

For its part, Cisco has worked with VMware to produce VN-Link, a proprietary virtual network link protocol that's been built in firmware in Cisco's UCS 6100 Series Fabric Interconnect or in switching hardware outside the blade. The 6100 Series has the management intelligence to work with VMware's vNetwork Distributed Switch, so a cluster of hypervisors can feed undifferentiated VM network traffic through the distributed switch to the converged network adapters on Cisco's blades. The adapters feed the traffic to the Cisco Fabric Interconnect, where another pre-standard protocol--Cisco's implementation of 10-Gbps Fibre Channel over Ethernet--routes it to storage or data network devices.

Essentially, Cisco has virtualized I/O outside the hypervisor. It's created high-speed Fibre Channel and Ethernet channels that can be shuffled around to meet the needs of high I/O traffic VMs rather than assigning each VM a fixed resource. Cisco's servers should be able to deal with higher volumes of network and storage traffic and VM communications with less impact on core performance. Ultimately, that can help put more VMs on a blade.

Where's The Ceiling?
VMware hasn't said how many VMs companies will be able to run on ESX Server under vSphere 4, but its figures suggest it might average around 40, depending on the nature of the workloads.

There are risks, however, particularly on the management side. VMware's vSphere will need to discover, monitor, and track virtual machines as they're commissioned, provisioned, and decommissioned. If it can't do so effectively, VMs could disappear from view but still be alive and running in the software infrastructure, possibly offering intruders a path into the system. VMs that are supposed to run only alongside other highly secured VMs could end up inadvertently moved next to less secure ones.

Potential For Savings--Or Failure
Accenture's Ulmer knows the potential for savings from increased virtualization is great if the system can be managed effectively. When he reached 15 VMs per server, he achieved enough savings to make new deployments a wash, he says--deploying the next 15 VMs cost nothing in terms of virtualization software expense.

And by the time Ulmer got to 30 VMs per server, the cost savings not only paid for the software and hardware to virtualize the infrastructure, but also for the added hosting servers and network fabric of Accenture's outsourcing center. Accenture was able to speed up time to deployment and reduce labor costs for server management, Ulmer says. There's also an energy savings that he hasn't been able to calculate. Rapid payback has allowed Accenture to focus more of its spending on the most efficient hardware, letting it retire physical servers that are just 18 months old but less efficient, Ulmer says.

Eric Ulmer, IT architect, Accenture
Ulmer's bosses want more.
These gains have grabbed some attention at the company--and only ramped up the pressure on Ulmer and his team to show even more efficiency gains and savings. The next step is moving to 60 VMs per server. That will come only after VMware delivers all the pieces of vSphere, something it promises by the end of the second quarter, and Ulmer is convinced the software is reliable.

Any failures in the stress testing mean Accenture will stick to 30 VMs. Ulmer's very mindful of those stiff SLA penalties, and the fact that Accenture's outsourcing business would take a hit if he experiments with high VM numbers, only to trip over some unexpected bottleneck.

"Everyone is watching us," he says. "We don't want to fail."


Continue to the sidebar:
Cisco Seeks An Edge In The Blade Market

