Today is the 10-year anniversary of the SQL Slammer worm.
Since I have a unique history with this, I thought I’d write up something.
Historically, most worms were named by anti-virus (AV) companies, since AV was often the first to detect new malware. In this case, the name “Slammer” came from the intrusion detection company “Internet Security Systems”, or “ISS” (now a division of IBM). That’s because Slammer wasn’t the sort of thing AV was designed to detect. For one thing, it never existed as a file on the disk.
Worms were the sort of thing intrusion detection systems (IDS) were designed to detect. Whereas AV scanned the disk looking for signs of malware, IDS scanned network traffic looking for signs of hacker activity. Slammer used a buffer overflow exploit to propagate, just as hackers use buffer overflow exploits to break into computers.
The thing about Slammer is that virtually all IDSs failed at detecting it. If they did anything, they triggered on “UDP flood” or “UDP port scan”. Early traffic on the NANOG mailing list described it as a new DDoS (Distributed Denial of Service) attack. That’s because high levels of UDP traffic had been seen for previous DDoS attacks, but never before with a worm.
However, ISS’s product could detect Slammer for what it was: a worm exploiting a buffer overflow. The exact name of the event was “SQL_SSRP_StackBo”.
The reason we could detect this was because ISS used a “vulnerability signature”. An attack is composed of two parts. The first is the bug, weakness, or “vulnerability” that can be leveraged by the hacker to gain access to the system. The second is a tool, software, or “exploit” that uses the vulnerability. Historically, IDS focused on detecting known exploits. When hackers created new, unknown exploits for known vulnerabilities, IDS vendors had to update their products with new “signatures” to recognize the new exploits. ISS’s technology was different. By writing signatures based on the vulnerability, we could detect any unknown exploit targeting that vulnerability.
A common type of vulnerability is the buffer overflow, where hackers send too much data in a protocol field or “buffer”, thus “overflowing” it. The way to detect this class of vulnerability is to measure the length of the buffer, and trigger when it exceeds a threshold.
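To make that concrete, here is a minimal sketch of such a length check against the protocol Slammer targeted: the SQL Server Resolution Protocol (SSRP) on UDP port 1434, where a request beginning with the byte 0x04 carries an instance name that overflows a fixed-size stack buffer when it is too long. The 96-byte threshold, the function names, and the sample payload are illustrative assumptions, not the values from the actual product.

/* Minimal sketch of a length-based "vulnerability signature", assuming the
 * payload is an SSRP request on UDP/1434. The 0x04 opcode is the request
 * type that carries an instance name; the threshold is an assumption. */
#include <stddef.h>
#include <stdio.h>

#define SSRP_OPEN_INSTANCE 0x04   /* request type carrying an instance name  */
#define MAX_INSTANCE_NAME  96     /* assumed safe length for the name field  */

/* Returns 1 if the payload looks like an attempt to overflow the
 * instance-name buffer, regardless of which exploit generated it. */
static int ssrp_stack_bo(const unsigned char *payload, size_t len)
{
    if (len == 0 || payload[0] != SSRP_OPEN_INSTANCE)
        return 0;                          /* not the vulnerable request type */
    return (len - 1) > MAX_INSTANCE_NAME;  /* name field longer than its buffer */
}

int main(void)
{
    /* Roughly the size of Slammer's UDP payload, starting with 0x04. */
    unsigned char slammer_like[376] = { 0x04, 0x01, 0x01, 0x01 /* ... */ };

    if (ssrp_stack_bo(slammer_like, sizeof(slammer_like)))
        printf("SQL_SSRP_StackBo\n");
    return 0;
}

Note that nothing in this check depends on the bytes of any particular exploit; any packet that overflows the field triggers it, which is the whole point of signing the vulnerability rather than the exploit.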
In order to measure the length of buffers, we first had to find the buffers. Historically, IDS treated network payloads as just a string of undifferentiated bytes. This meant they couldn’t figure out where buffers started and stopped, and hence could not measure their lengths. For our IDS, we did what’s known as a “protocol decode”, finding the start/stop of every field/buffer in the payload.
Since a typical payload contains hundreds of fields, this sounds slow. However, we used a technology called “state machines” to process these fields. State machines are extremely fast, faster even than the pattern matching used in traditional IDS.
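As a toy illustration of the approach (a sketch, not the actual BlackICE decoder), the same check can be expressed as a per-byte state machine that tracks which field it is in and how long that field has grown, so it triggers the moment the threshold is exceeded without ever buffering the payload. The states, threshold, and test packet below are assumptions for illustration.

#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_INSTANCE_NAME 96          /* same assumed threshold as above */

enum state { S_OPCODE, S_NAME, S_DONE, S_ALERT };

struct decoder {
    enum state state;
    size_t     name_len;              /* running length of the current field */
};

/* Feed the decoder one byte of payload; returns the new state. */
static enum state decode_byte(struct decoder *d, unsigned char c)
{
    switch (d->state) {
    case S_OPCODE:
        d->state = (c == 0x04) ? S_NAME : S_DONE;
        break;
    case S_NAME:
        if (c == '\0')
            d->state = S_DONE;        /* field ended within bounds */
        else if (++d->name_len > MAX_INSTANCE_NAME)
            d->state = S_ALERT;       /* field longer than its buffer */
        break;
    default:
        break;                        /* S_DONE and S_ALERT are terminal */
    }
    return d->state;
}

int main(void)
{
    unsigned char pkt[376];           /* roughly the size of Slammer's payload */
    struct decoder d = { S_OPCODE, 0 };

    memset(pkt, 0x01, sizeof(pkt));   /* stand-in for the worm's code bytes */
    pkt[0] = 0x04;                    /* the vulnerable request type */

    for (size_t i = 0; i < sizeof(pkt); i++) {
        if (decode_byte(&d, pkt[i]) == S_ALERT) {
            printf("SQL_SSRP_StackBo at offset %zu\n", i);
            break;
        }
    }
    return 0;
}

Because the decoder holds only a tiny amount of state per connection and touches each byte exactly once, it runs at wire speed, which is why this style of decode can be faster than pattern matching.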
Today, most IDSs use these protocol state machines and have the ability to measure the length of buffers and trigger on them. But back in 2003, this was unique to ISS’s technology.
The consequence was that ISS could detect the new, unknown exploit the worm used against a known vulnerability, whereas other IDSs couldn’t.
But there is more to the story than just detection. Another interesting feature was that ISS IDS sensors could survive the onslaught.
What was interesting about Slammer is the way it created a flood of UDP packets. A single infected machine could fill a gigabit Ethernet link with roughly 300,000 packets/second. At the time, most IDS products could not handle packets at that rate. The problem was that most IDSs were written on top of traditional operating systems (Windows, Linux, etc.). Those operating systems could not handle packets at that rate, and would lock up in their interrupt handlers.
ISS’s technology was unique because we wrote our own network drivers that bypassed the operating system. Thus, even though the sensors technically ran on Linux, packets never went through the Linux network stack. Whereas other products were crushed under the load of 300,000 packets/second, ISS could handle about 2 million packets/second.
At the time there was some drama in the industry caused by Gartner, which claimed that IDS products needed to be based on “hardware” (custom ASIC chips) instead of “software” for this very reason: software running on traditional operating systems could not keep up with traffic. The ISS technology straddled both worlds. It was purely “software”, but because its custom drivers bypassed the operating system, it was in fact faster than any competing “hardware” product.
Months later, a large government customer invited some IDS people, including myself (Dug Song, Marty Roesch, Greg Shipley, and others I can't recall off the top of my head), to debate the Gartner analyst over his claim that "no IDS could handle more than 500 Mbps". Also present were the customer's own IT people running my product, who described how it handled the Slammer worm at 800 Mbps on their network, the only problem being a slight delay in events reaching the console, whereas everything else in their network fell over under the onslaught. Despite this firsthand evidence, the analyst still claimed afterward that no software IDS like mine could keep up with traffic above 500 Mbps.
Today, things have changed. Drivers bypassing the operating system kernel are now common. There is an open-source project that does this called PF_RING, and a closed product from Intel called DPDK. Using this software and a normal server (a single-socket, 8-core Sandy Bridge at 2 GHz), Intel has benchmarked their system forwarding traffic at a rate of 80 million packets/second. This is in fact faster than any “hardware” system anywhere close to the same price, form factor, or power consumption. Thus, today in 2013, this extreme level of performance is normal, but back in 2003 it was unusual.
These two features, having a vulnerability signature and keeping up with the traffic, are still not the whole story. Each packet would trigger an event. Therefore, at the 300,000 packets/second generated by Slammer, a typical IDS would try to log 300,000 events/second. The event-handling system can’t handle that rate, and even if it could, it would quickly fill up the database.
The ISS technology had a unique feature called the “coalescer”. It would automatically combine multiple identical events into a single event. Each event that went through the system would carry two timestamps, for the first and last original events, and a count of how many original events had been combined into it.
The coalescer didn’t just reduce “identical” events but also “similar” events. It would look for patterns like the same attack sweeping across many machines – just like a worm. It would then combine these and change the target from a single IP address to a range of IP addresses.
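Here is a minimal sketch of how such coalescing might work. It is not the ISS implementation; the single-slot table, the signature-plus-source key, and the sample addresses are all assumptions made for illustration.

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

struct event {
    const char *signature;        /* e.g. "SQL_SSRP_StackBo" */
    uint32_t    src_ip;
    uint32_t    dst_ip;
    time_t      timestamp;
};

struct coalesced {
    const char *signature;
    uint32_t    src_ip;
    uint32_t    dst_lo, dst_hi;   /* target widened to a range of addresses */
    time_t      first, last;      /* timestamps of first and last original events */
    uint64_t    count;            /* how many original events were merged */
    int         in_use;
};

/* A single slot keeps the sketch short; a real sensor would hash many keys. */
static struct coalesced slot;

static void coalesce(const struct event *ev)
{
    if (slot.in_use && slot.src_ip == ev->src_ip &&
        strcmp(slot.signature, ev->signature) == 0) {
        if (ev->dst_ip < slot.dst_lo) slot.dst_lo = ev->dst_ip;
        if (ev->dst_ip > slot.dst_hi) slot.dst_hi = ev->dst_ip;
        slot.last = ev->timestamp;
        slot.count++;
        return;
    }
    /* A different attack or source starts a fresh coalesced record. */
    slot = (struct coalesced){ ev->signature, ev->src_ip, ev->dst_ip, ev->dst_ip,
                               ev->timestamp, ev->timestamp, 1, 1 };
}

int main(void)
{
    time_t now = time(NULL);

    /* Simulate one second of Slammer: one source sweeping 300,000 targets. */
    for (uint32_t i = 0; i < 300000; i++) {
        struct event ev = { "SQL_SSRP_StackBo", 0x0A000001u,
                            0xC0A80000u + (i % 65536), now };
        coalesce(&ev);
    }

    printf("%s from %#x: count=%llu, targets %#x..%#x\n",
           slot.signature, (unsigned)slot.src_ip,
           (unsigned long long)slot.count,
           (unsigned)slot.dst_lo, (unsigned)slot.dst_hi);
    return 0;
}

The key point is that the count and the first/last timestamps preserve the information content of the flood while reducing hundreds of thousands of records to one.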
The consequence was that the 300,000 Slammer events/second were automatically reduced to a rate of about 500 events/second. The system could handle this smaller rate, and it didn’t fill up the database. Moreover, the management console did its own “coalescing”, so only a single “SQL_SSRP_StackBo” event showed up on the console, with a rapidly increasing “count” field.
Today, as far as I can tell, no other IDS has a feature similar to the “coalescer”, and they all suffer from event overloads. The best they do for known worms like Slammer is trigger on the first event each minute and then discard additional events, thus losing a huge amount of information. In contrast, the ISS coalescer keeps a 100% accurate count of the number of SQL Slammer packets it detected. If you sent exactly 1.0 billion SQL Slammer packets, you’d see a count of 1.0 billion appear on the console, even though the number of events in the database would be considerably smaller.
The reason the ISS technology performed so well with Slammer was because I designed it to do so back in 1998.
Back in 1998 I left McAfee (a.k.a. Network Associates) to create my own company specifically to target network worms. I believed (and turned out to be right) that worms were coming that would devastate the Internet. Therefore, I designed an anti-worm technology from scratch. This technology is now known as the “intrusion prevention system” or “IPS”. The way it worked was to forward traffic like a router/firewall/bridge while filtering out malicious traffic from hackers/worms.
I called my company “Network ICE” and called my product “BlackICE”.
BlackICE ran in three modes. The first was the traditional IDS, which we called “BlackICE Sentry”. The second was inline IPS mode, which we called “BlackICE Guard”. The third was IPS mode on the desktop, which we called “BlackICE Defender”.
My company was bought by ISS in 2001, after which the product was rebranded as “RealSecure”. ISS itself was bought by IBM in 2006, and the product is now branded “Proventia”.
The reason I write this retrospective of Slammer is because of the frustration I had 10 years ago. My product dealt with Slammer extremely well, as I’d designed it to five years previously. It had a “vulnerability signature” that could detect it, it could keep up with the packet load, and it didn’t fall over from the event load. No competing product could say the same. Yet, EVERY competing security vendor came out with marketing blurbs extolling the virtues of their product with regard to Slammer. This taught me an important lesson: it’s pointless trying to be clever and outwit the hacker; all you really need to succeed in this industry is good marketing that outwits the customers.