Monday, February 05, 2007

Predictions for the future of low-latency computing: it's not where you think it is (Part 3)

Where is the future of general-purpose computer networks going? What could 10Gb Ethernet to the desktop enable? Good questions. For the last three years, I have carried a laptop with a 1Gb connection. Only twice in that time has it connected at above 100Mb. The truth is, most LAN switches are still 100Mb, and 100Mb is more than enough bandwidth to support both computing and VoIP phones. An MPEG-4 HDTV signal requires only about 4Mb/sec. You can put a lot on a 100Mb connection.

So back to 10Gb to the desktop. I recently built a new PC. I chose an Nvidia GeForce 7600 GS-based fanless video card. It is based on an older video chip, but leverages semiconductor process shrinks to deliver a much lower-power solution. Even so, this card easily surpasses the performance of a state-of-the-art, high-end UNIX workstation graphics card of four years ago. The cards for RISC/UNIX workstations of that era were PCI-based, as those workstations did not have the Intel AGP slot. A 64-bit, 66MHz PCI slot provided about 500MB/sec of bandwidth. The idea behind these cards was that all graphics processing was done on the card, which had very high-bandwidth memory; only commands and small amounts of data were passed through the 500MB/sec PCI bus. Hold that thought.
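As a quick sanity check on those numbers (the arithmetic is mine, and the figures are nominal): a 64-bit slot moves 8 bytes per clock, so 64-bit/66MHz PCI works out to roughly 533MB/sec, while 10Gb Ethernet's raw payload rate is about 1250MB/sec.

```c
#include <stdio.h>

int main(void)
{
    double pci_64_66 = 8 * 66.66e6;   /* 8 bytes per clock at 66MHz */
    double tenge     = 10e9 / 8;      /* 10Gb/sec on the wire, in bytes */

    printf("PCI 64/66: %.0f MB/sec\n", pci_64_66 / 1e6);  /* ~533  */
    printf("10GbE    : %.0f MB/sec\n", tenge / 1e6);      /* ~1250 */
    return 0;
}
```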

At about the same time, there was interest in creating high-end visualization solutions using these high-performance PCI graphics cards in small servers interconnected with a low-latency network such as Myrinet. The idea was that these “graphics grids” could replace high-end visualization solutions from SGI and Evans & Sutherland.

So, what would happen if the 1000MB/sec 10Gb Ethernet replaced the PCI bus? Now, take the low-power graphics card, and add a TOE enabled NIC, and I have a high-performance networked display, without the need for an entire computer to support it. But what about Ethernet's latency? Those previous visualization clusters used Myrinet for latency as well as bandwidth. That is where iWARP comes in. One can run a protocol over a low-latency RDMA connection. This could fundamentally change computing, as 10Gb Ethernet replaces the system bus, and the graphics card becomes an add-on device to the display. Add a keyboard and mouse interface, and you now have an engineering thin-client, perfectly suited for a virtualized desktop running in a VM on a larger server.
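To make that concrete, here is a minimal sketch of the kind of one-sided transfer iWARP enables, written against the OpenFabrics verbs API that iWARP adapters are expected to expose: push a block of pixels straight into a remote display's memory with a single RDMA WRITE. Everything here is illustrative. The push_frame helper is hypothetical, and queue-pair setup plus the out-of-band exchange of the remote address and rkey are omitted.

```c
#include <stdint.h>
#include <string.h>
#include <infiniband/verbs.h>

/* Assumes a connected reliable QP, a registered local frame buffer
 * (lkey from ibv_reg_mr), and a remote_addr/rkey pair already learned
 * from the peer. */
int push_frame(struct ibv_qp *qp, struct ibv_cq *cq,
               void *frame, uint32_t len, uint32_t lkey,
               uint64_t remote_addr, uint32_t rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)frame,  /* local pixel data            */
        .length = len,
        .lkey   = lkey,
    };
    struct ibv_send_wr wr, *bad_wr;

    memset(&wr, 0, sizeof(wr));
    wr.opcode              = IBV_WR_RDMA_WRITE; /* one-sided write  */
    wr.sg_list             = &sge;
    wr.num_sge             = 1;
    wr.send_flags          = IBV_SEND_SIGNALED; /* request a completion */
    wr.wr.rdma.remote_addr = remote_addr;       /* peer's frame memory  */
    wr.wr.rdma.rkey        = rkey;   /* plays the role of iWARP's STag  */

    if (ibv_post_send(qp, &wr, &bad_wr))
        return -1;

    /* Busy-poll the completion queue until the write finishes. */
    struct ibv_wc wc;
    int n;
    do {
        n = ibv_poll_cq(cq, 1, &wc);
    } while (n == 0);
    if (n < 0)
        return -1;
    return wc.status == IBV_WC_SUCCESS ? 0 : -1;
}
```

The point of the sketch is what is missing: no server process on the display, no kernel TCP stack in the data path. The NIC places the pixels itself.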

Speaking of iWARP, the reduction in latency iWARP offers will be more important than the increase in bandwidth 10Gb Ethernet offers. You read it here first: Low-latency Ethernet will have a far greater impact than higher-bandwidth Ethernet. Why? Simple. Lower latency offers more potential for innovation than more bandwidth does. There are plenty of options for bandwidth today, such as EtherChannel for IP and 4Gb Fibre Channel for storage.
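A back-of-envelope illustration of why (all figures here are my own illustrative assumptions, not measurements): for the small messages that dominate cluster traffic, the fixed per-message latency swamps the transmission time, so cutting latency pays far more than multiplying bandwidth.

```c
#include <stdio.h>

int main(void)
{
    double msg = 512.0;                 /* a small cluster message, bytes */

    /* Assumed, illustrative figures: */
    double gige_bw  = 125e6;            /* 1GbE payload rate, bytes/sec   */
    double gige_lat = 50e-6;            /* kernel TCP one-way latency     */
    double tenge_bw  = 1250e6;          /* 10GbE payload rate, bytes/sec  */
    double iwarp_lat = 10e-6;           /* assumed iWARP RDMA latency     */

    printf("1GbE + kernel TCP: %.1f us\n",
           (gige_lat  + msg / gige_bw)  * 1e6);   /* ~54.1 us */
    printf("10GbE + iWARP    : %.1f us\n",
           (iwarp_lat + msg / tenge_bw) * 1e6);   /* ~10.4 us */
    return 0;
}
```

With these assumed numbers, the tenfold bandwidth jump alone would shave under 4 microseconds off a 512-byte exchange; cutting the software latency is worth an order of magnitude more.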

A key trend has been happening in computing over the last few years. Some call it commoditization. Some call it the “trend to free”. Operating systems “went to free” with the emergence of Linux. Some have said all software is “going to free” with the emergence of open source. Others have spoken of “free bandwidth”. Certainly, one can look at Web 2.0 as an example of what happens to Internet sites when it is assumed everyone has a broadband connection. The emergence of AMD's Opteron and Intel's EM64T 64-bit extensions to the x86 architecture means 64-bit memory addressing is now “free” with the purchase of an x86 system, and no longer requires an expensive RISC/UNIX platform. And with the emergence of Xen as a standard option for major Linux distributions, and multiple free options from VMware (VMware Player, VMware Server), virtualization is “going to free”.

What happens when something like this becomes “free”? New innovation is enabled at a level above the free layer. And that is why low-latency Ethernet will be so empowering to innovation.

Low-latency computing has always been a very high-cost technology. For decades, it has been limited to the realm of supercomputers, mainframes, and their logical follow-ons, HPC clusters and high-end RISC/UNIX systems. As a result, most of the battle against latency has occurred in software. The application clustering enabled by 1Gb Ethernet required proprietary software to manage state among many cluster nodes. Replication, caching, and specialized protocols were required to make it all work. In fact, in the clustered Java appserver space, the clustering technology became a key differentiator. But the truth is, if a ubiquitous low-latency interconnect and intelligent operating-system clustering had been available at the time, much less work would have been required of the ISV. Simply put, if BEA had been building a clustered Java appserver for a DEC VAXcluster, the job would have been much easier, and they would have come to market much faster.
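For flavor, here is a toy of the kind of plumbing an appserver vendor had to build itself: pushing a session-state update to every cluster peer over ordinary kernel TCP, paying connection and copy overhead on each update. The peer addresses, port, and wire format are all invented for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Send one serialized session update to a peer over TCP. */
static int replicate(const char *peer_ip, const void *state, size_t len)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in peer = { 0 };
    peer.sin_family = AF_INET;
    peer.sin_port   = htons(9000);               /* made-up port */
    inet_pton(AF_INET, peer_ip, &peer.sin_addr);

    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        close(fd);
        return -1;
    }
    ssize_t n = write(fd, state, len);  /* full kernel TCP path per update */
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}

int main(void)
{
    const char *peers[] = { "10.0.0.2", "10.0.0.3" };  /* hypothetical */
    const char session[] = "user=42;cart=3";           /* toy wire format */

    for (size_t i = 0; i < sizeof(peers) / sizeof(peers[0]); i++)
        if (replicate(peers[i], session, sizeof(session)) < 0)
            fprintf(stderr, "replication to %s failed\n", peers[i]);
    return 0;
}
```

Multiply this by failure detection, ordering, and cache invalidation, and it is clear why clustering logic became a differentiator; a low-latency interconnect and OS-level clustering would have absorbed much of that work.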

One can look at ISVs currently developing on InfiniBand as early adopters of low-latency computing. The basic 80/20 rule would suggest that for every player investing in IB, there are four who could benefit but are not. Or it could be a 90/10 rule: it is not hard to believe that for every ISV currently trying to gain a competitive advantage with IB, there are nine others who don't feel it is yet worth the effort to pursue high-performance, low-latency networking as an enabler. This is much like early ISV support of Linux: some felt it was a good fit for their product, others waited for more market acceptance and maturity.

So when low-latency networking becomes free (that is, when all x86 servers come with iWARP-ready, on-board 10Gb Ethernet with TOEs), operating system and application developers will have new assumptions about cluster latency. It could open up initiatives for true single-system-image clustering in Linux, and perhaps even Windows. Applications which previously were not clusterable may become so, which may disrupt existing applications. The promise of grid/utility computing becomes much more viable with a unified fabric. Blade server backplanes will probably be RDMA-Ethernet-based. Perhaps a shared-storage clustered database alternative to Oracle RAC will emerge. Fundamental changes in real-time computing domains, such as electronics and data capture, are very likely. Radical changes to client computing are certainly possible, with thin clients offering far more potential than before. Basically, every form of computing which was weird or expensive because it required highly specialized, high-performance interconnects will be commoditized.

This is my prediction of what 10Gb iWARP Ethernet will enable: single-system-image clustering will emerge as the de facto form of clustering, global filesystems will emerge as the de facto server filesystems, and "grid computing" (shared-resource clustering) will become the normal method of deploying multiple servers. My guess for a target date for this becoming the norm in computing is around 2015.

Fortunately, we have an opportunity to examine in real time what happens when a high-end computing technology becomes commoditized. The technology offering this opportunity is cheap, high-performance 3D graphics. Once an industry unto itself, then an optional feature of an expensive, high-performance workstation, 3D hardware is now standard equipment in an ordinary desktop PC; only now is an x86 desktop operating system being released (Windows Vista) which requires a 3D graphics card. 3D displays are officially commoditized, and they are assumed to be there. Watch what happens in the graphical user interface space over the next few years. It will be a good benchmark for the innovation which occurs around a technology that has been commoditized.

Part 1 | Part 2
