I'm doing a presentation at Shmoocon this weekend on scalability. I've now realized that it's an 8-hour presentation I'm trying to compress into 50 minutes, so I'm throwing huge gobs of stuff out. One of the things I want to discuss is the history of scalability. In particular, I want to go back to 1996. That was the year everything changed.
Back then, dot-coms were buying up Solaris SPARC and SGI MIPS servers as fast as they could. That's because everyone knew that "Wintel" personal computers were toys that couldn't keep up with large problems.
Then, in 1996, Intel shipped the "Pentium Pro" processor (aka the P6). In addition, Microsoft shipped WinNT 4.0. The combination was faster and more scalable than any competing RISC/UNIX combination. It was also a heck of a lot cheaper.
What made the Pentium Pro different was that it was a radically new design. Intel completely discarded the design of the old Pentium. By translating x86 instructions into internal RISC-like "micro-ops", it got rid of most of the problems of CISC. At the same time, it had numerous architectural improvements that were years ahead of RISC processors in things like super-scalar out-of-order execution and caching. The consequence was that the Pentium Pro was clearly faster than all competing RISC processors on pretty much every benchmark.
In much the same way, Windows NT was a completely new operating system design. What we call "Windows" was just a backwards compatibility layer, like WINE is on Linux. This new operating system had many futuristic features, like multi-core capabilities, multi-threading, and "IO completion ports". Moreover, Microsoft's web server software that used these capabilities, IIS 4.0, came with the operating system.
(Linux also added SMP support in 1996, but with things like the big kernel lock, it was far behind Windows in actually being useful. The scalable epoll wasn't added until 2002.)
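For readers who haven't used scalable polling, here is a minimal sketch of the idea in Python. It is an illustration only, not anything from the talk: it uses the standard `selectors` module, which picks epoll on Linux and kqueue on the BSDs, and the function name `ready_payloads` and its parameters are made up for this example. The point is that you register many sockets once, and each wait returns only the ones that actually have data, rather than making you scan all of them.

```python
import selectors
import socket


def ready_payloads(n_pairs=4, active=(1, 3)):
    """Register n_pairs sockets, write to a few of them, and return
    {index: payload} for the sockets the selector reports as readable.
    (Hypothetical helper, for illustration only.)"""
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS
    pairs = [socket.socketpair() for _ in range(n_pairs)]
    try:
        # Register every reader once; the index rides along as user data.
        for idx, (reader, _writer) in enumerate(pairs):
            reader.setblocking(False)
            sel.register(reader, selectors.EVENT_READ, data=idx)

        # Only some of the connections become active.
        for idx in active:
            pairs[idx][1].sendall(f"hello {idx}".encode())

        # Each select() call returns only the ready sockets.
        results = {}
        while len(results) < len(active):
            events = sel.select(timeout=1.0)
            if not events:
                break  # timed out; give up rather than spin
            for key, _mask in events:
                results[key.data] = key.fileobj.recv(1024).decode()
        return results
    finally:
        sel.close()
        for reader, writer in pairs:
            reader.close()
            writer.close()


if __name__ == "__main__":
    print(ready_payloads())
```

With four registered sockets and two of them written to, only those two come back from the wait. A real server would keep thousands of mostly idle connections registered the same way.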
I mention this because it shows how powerless hard numbers can be. Unix people thought of Windows and Intel in terms of Windows 95 and the old Pentium processor. This blinded them to the new reality of WinNT and the Pentium Pro, which were complete ground-up redesigns unrelated to their predecessors in anything but name (and backwards compatibility). The Windows people were unhappy as well. The Pentium Pro was designed for 32-bit software, and ran the older 16-bit Windows software poorly. Likewise, WinNT wasn't fully backwards compatible with old Windows, especially with games.
Thus, the groundbreaking event of the Pentium Pro plus WinNT 4.0 went largely unnoticed. The performance was astronomical and the price cheap, yet nobody cared. Dot-coms continued to invest in hugely expensive but underperforming hardware like Solaris SPARC.
The lesson here is about future history. Looking back, it's obvious why Intel won the competition against RISC, but it wasn't obvious back in 1996. Likewise, the superiority of SMP, threading, and scalable polling looks obvious, but it wasn't so back in 1996.
That's what my presentation is about: future-obvious ideas that are presently-obscured. Operating systems like Linux need a fast-path around the kernel for data-plane processing. This is obvious to engineers working on the bleeding edge, but it's still a bit obscure for the mainstream.
5 comments:
Master/slave multicore is superior to SMP because it allows real-time applications. Say you are doing an 8-core flight simulator... having 7 cores finish during one refresh and then waiting on the 8th just doesn't work.
SparrowOS master/slave allows running one app twice as fast, rather than running two apps at once like SMP.
"Linux need a fast-path around the kernel for data-plane processing."
For speed Linux should be avoided in favour of FreeBSD anyways.
This is why we have things like kqueue, netmap (avoiding the OS entirely for networking), etc.
I haven't played with "netmap", but it's exactly what I'm talking about. Are there any performance benchmarks at 10 Gbps?
Actually, Pentium Pro dates back to 1995. NT4 shipped in 1996. IIS 2.0 shipped with the original NT4. IIS 4.0 did not come until the NT4 Option Pack was released (after NT4 SP3).
Talking about future-obvious ideas that are presently-obscured, it's often even more fun when they are past-buried. Take single-level storage, which has its roots in MULTICS and System/38 and was recently implemented as the fastest key-value store, LMDB: http://symas.com/mdb/