Friday, November 6, 2009

A Lucky Career Start

Updated on 12.4.2009 with a better diagram of the system's architecture

I have two more bugs in mind for my "bug hunt" series but before I can post about them I need to provide a little background information about my first real industry job.

After graduate school, I had the rare opportunity to help build a large-scale, enterprise-class ccNUMA server. At the time, 1996, ccNUMA architectures were leading-edge technology and ours was one of the first to utilize the Intel platform.

The system's architecture was based on "building block" nodes, each containing four Pentium Pro processors, 4GB of main memory, and 12 PCI slots. Up to eight nodes could be connected together using a high-speed interconnect to create a system with 32 processors, 32GB of main memory, and 96 PCI slots. The system ran a proprietary version of UNIX carefully tuned to work efficiently on its specific architecture. At its maximum size the system consumed two 19-inch racks - big iron indeed.
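As a quick sanity check on those numbers, here is a minimal Python sketch of the scaling arithmetic. The `Node` class and `system_totals` function are names I made up for illustration; only the per-node figures and the eight-node limit come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One "building block" node, using the per-node figures quoted above."""
    processors: int = 4
    memory_gb: int = 4
    pci_slots: int = 12


MAX_NODES = 8  # the interconnect joined up to eight nodes


def system_totals(node_count, node=Node()):
    """Return (processors, memory in GB, PCI slots) for a system of node_count nodes."""
    if not 1 <= node_count <= MAX_NODES:
        raise ValueError(f"node_count must be between 1 and {MAX_NODES}")
    return (
        node_count * node.processors,
        node_count * node.memory_gb,
        node_count * node.pci_slots,
    )


if __name__ == "__main__":
    for count in (1, 2, 4, 8):
        cpus, mem, slots = system_totals(count)
        print(f"{count} node(s): {cpus} CPUs, {mem}GB, {slots} PCI slots")
```

For the maximum configuration, `system_totals(8)` gives the 32 processors, 32GB, and 96 PCI slots mentioned above.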

The high-speed interconnect was based on the IEEE Scalable Coherent Interface (SCI) standard and was responsible for creating a single, cache-coherent memory space out of the combined node resources. Through the interconnect, any processor could access the memory on any of the nodes - albeit at variable latency based on their relative locations in the system. In addition to the processor caches, each node contained a large capacity "L3" cache to store data fetched from other nodes to avoid, as much as possible, the significantly greater "remote" access latency. The SCI interconnect inter-operated with the processor and "L3" caches to ensure that accesses to cached data were handled correctly. Together, these capabilities are what made the system a cache-coherent non-uniform memory access (ccNUMA) architecture. SCI was a ring-based protocol and our system used two counter-rotating rings to halve the average node-to-node latency, double the bandwidth, and provide some measure of redundancy (not fault-tolerance, just reboot level redundancy if a ring failed).
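To see why a second, counter-rotating ring roughly halves the average node-to-node latency, here is a small Python sketch that counts one-way hops between nodes. It is a toy model under my own assumptions, not how the real hardware routed packets: it treats hop count as a stand-in for latency, assumes a packet always takes whichever ring gives the shorter path, and ignores everything else (queuing, response traffic, cache hits). The function names are made up for illustration.

```python
def hops(src, dst, nodes, dual_ring):
    """One-way hop count from src to dst on a ring of `nodes` nodes.

    With a single ring, traffic can only travel in one direction.
    With two counter-rotating rings, this toy model assumes the packet
    takes whichever direction is shorter.
    """
    forward = (dst - src) % nodes
    if not dual_ring:
        return forward
    return min(forward, nodes - forward)


def average_hops(nodes, dual_ring):
    """Average hop count from one node to every other node.

    The ring is symmetric, so measuring from node 0 is enough.
    """
    total = sum(hops(0, dst, nodes, dual_ring) for dst in range(1, nodes))
    return total / (nodes - 1)


if __name__ == "__main__":
    single = average_hops(8, dual_ring=False)
    dual = average_hops(8, dual_ring=True)
    print(f"single ring: {single:.2f} hops on average")
    print(f"dual rings:  {dual:.2f} hops on average")
```

For eight nodes the single ring averages 4 hops (28/7) and the dual rings average about 2.29 hops (16/7), which is in the neighborhood of the "halve the latency" figure. The worst case drops from 7 hops to 4.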

The ccNUMA subsystem consisted of four custom ASICs, two designed in-house and the others jointly developed with a partner company. One of those ASICs was dedicated to maintaining the SCI coherency protocol and used a microsequencer and microcode combination to provide a programmable, high-performance coherency engine. The microsequencer utilized a VLIW-like instruction set to maximize parallel operation and minimize the control store's size.

I joined the project just after its start and my initial responsibilities were to become an SCI expert, be able to program the coherency microsequencer, and optimize the SCI coherency protocol implementation to match our machine's specific architecture. After completing those tasks ahead of schedule, I went on to do a wide variety of things including writing verification scripts for the coherency ASIC, implementing the BIOS responsible for initializing the SCI subsystem, serving as a systems engineer for the coherency subsystem, root-causing complicated bugs in all four ASICs, and writing a variety of programs to automatically identify failure root causes from simulation, emulation, and production logs. In the two subsequent follow-on projects, my responsibilities continued to grow and culminated in an architect role for the third generation system.

As a computer-obsessed young engineer just starting a career, this job was a dream come true. On a daily basis, I had direct interaction with ASICs, logic board design, exotic computer architectures, complicated caching protocols, low-level coding, and advanced operating systems. Stuff that I had only read about in textbooks was now literally at my fingertips to be studied, understood, and played with.

It was the people that I got to work with, though, that really made this job special. Everyone on the project shared an all-hands-on-deck, add-value-where-you-can, succeed-at-all-costs attitude that created an invigorating environment to work in. My principal mentor was a MicroKid from The Soul of a New Machine from whom I learned a great, great deal (and still do!). Thanks to his guidance and support I got to "touch" nearly every part of the system and felt encouraged to continuously take on new challenges. To this day I still recall the feeling of endless possibilities and excitement of knowing that my skills were improving daily.

Truly a lucky way to start my career.

UPDATE

To clarify this post, I decided to add a simplified diagram of the system's architecture. The actual architecture was more complicated (for example, some of the ASICs depicted were actually multiple devices), but the diagram should sufficiently convey the overall design.