Ten Years Later, Rethinking Microsoft’s Vuln Ratings

Author: Paul Roberts

December 15, 2010 4:47 pm

Microsoft’s vulnerability Severity Rating System is closing in on its tenth birthday. While the security landscape has been transformed during that time, the Ratings have endured. But do they still work? Threatpost asked prominent vulnerability researchers to give us their opinion. You may be surprised at what they had to say.

Microsoft ended 2010 in style on Tuesday, using its final monthly “Patch Tuesday” release to fix some 40 vulnerabilities in Microsoft Windows, Office, Internet Explorer, SharePoint and Exchange. As is the case after every Patch Tuesday, the release will set off a scramble within enterprises large and small to sort out which patches are the most important to apply, and then to install them on affected systems. As they’ve done for almost a decade, IT pros will rely on Microsoft’s Severity Rating System for software vulnerabilities to help answer those questions. But are the labels that Microsoft puts on its security bulletins – ratings that declare patches “critical,” “important,” “moderate” or “low” – accurate? And does Microsoft’s definition of what makes a security hole “severe” still reflect the security threats facing organizations?

With the Severity Rating System entering its tenth year, Threatpost is taking a fresh look at the way Microsoft ranks security holes in its products. We talked to experts at Microsoft’s Security Response Center and interviewed some of the top vulnerability researchers and security experts in the industry to get their thoughts on the strengths and weaknesses of Microsoft’s decade-old Severity Rating System. Among other things, we wanted to know whether the system still works, or whether changes are needed to the way Microsoft rates security vulnerabilities in its products. We distilled those interviews into five high-level suggestions for improving the Microsoft Severity Rating System. Here they are:

Recommendation 1: Ditch “Outdated Assumptions”

Recommendation 2: Make Exploitability Count

Recommendation 3: Get with the (CVSS) Program

Recommendation 4: Embrace Complexity

Recommendation 5: Last, Do No Harm

Ditch “Outdated Assumptions”

The experts we consulted about the current state of Microsoft’s Severity Rating System mostly had good things to say. But all agreed that the system is due for renovation and that there was room for improvement to the definitions that describe the severity of vulnerabilities.

First and foremost, the experts we consulted largely agreed that Microsoft’s Severity Ratings are based on “outdated” assumptions about threats and attacks.

“The threat space has changed greatly in the last eight years,” wrote Chris Wysopal, CTO at Veracode and a noted expert on application security. First, attacks shifted away from worms and viruses – noisy and intrusive threats that were easy to isolate. This, Wysopal notes, reflected a shift in the motivation of malware authors from “proving a point” to “monetizing vulnerable systems.” “If you are monetizing vulnerable systems you don’t want to do anything to call attention to yourself, especially with the network traffic a worm generates.”

The second big change has been in the kinds of attacks aimed at Microsoft’s customers. Within the past couple of years, those have shifted from a focus on core Windows components and vulnerable networks to what Wysopal described as a “target rich environment” in client software such as Adobe’s Acrobat Reader and Web browsers like Internet Explorer and Firefox.

Alas, these important changes aren’t reflected in Microsoft’s Severity Rating System which, reflecting the time at which it was created, still considers “Critical” holes as those that, when exploited, “could allow the propagation of an Internet worm without user action.” “Important” vulnerabilities – the next level down – could be any vulnerability “whose exploitation could result in compromise of the confidentiality, integrity, or availability of user’s data, or of the integrity or availability of processing resources.”

The problem isn’t so much that worms don’t matter anymore. It’s that they’re no longer as big a concern as they were in 2002, and they are by no means the only concern.

Charlie Miller, Principal Analyst at consulting firm Independent Security Evaluators, observed that few companies or consumers directly connect their computers to the Internet these days, which has reduced the impact of worms. Even lowly home routers are remarkably effective at stopping worms, even though they weren’t designed for that purpose, Miller said.

Add it all up and it means that worms – while still a threat – don’t eclipse other dangers in the way they might have eight or ten years ago.

“Worms and widespread malware are indeed a health problem for the Internet ecosystem,” wrote Wysopal of Veracode. “They are like epidemics that should be treated (as such) even if they are rare. But since they are rare they shouldn’t take up the whole top ranking of the severity rating system anymore.”

Cesar Cerrudo, CEO of Argeniss Information Security and Software, agreed that Microsoft’s definition of a “Critical” vulnerability no longer jibes with the threats and attacks that organizations are contending with. “For most ‘critical’ vulnerabilities right now, the exploitation consists of a user visiting a Web page or opening a (Microsoft) Office document, etc….There aren’t Internet worms being propagated,” Cerrudo wrote.

Broadening the concept of severity to pull in more factors than just the ability to create worms is key to modernizing the Severity Rating System, our experts agreed.

Make Exploitability Count

How might the idea of Severity be broadened? One idea all our experts agreed on was the need to make the ease with which a vulnerability can be exploited a core element of the severity rating. While “wormability” matters, the likelihood that a vulnerability might produce a self-replicating threat is just one factor, and it ranks behind the ease with which any hole might be used – “exploited” – to take control of a remote system.

Cerrudo, who has discovered prominent holes in Microsoft products including Windows and SQL Server, said the increasing use of exploits for previously unknown (or “zero day”) vulnerabilities makes exploitability more relevant now than it was ten years ago, but that Microsoft’s existing Severity Rating System fails to sufficiently account for it.

“Critical for me should be any vulnerability that can be exploited remotely without needing user name and password and with minimal user interaction,” he wrote. “Important should be used for vulnerabilities that require local access or remote access with a user name and password, or that require complex user interaction or specific configurations to work.”
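
Cerrudo’s two proposed definitions can be read as a simple decision rule. The Python sketch below is one possible encoding of that rule, offered only as an illustration: the class name, field names and function name are hypothetical, and nothing here reflects actual Microsoft or Argeniss tooling. Cerrudo’s quote does not redefine “Moderate” or “Low,” so the sketch covers only the top two tiers.

```python
from dataclasses import dataclass


@dataclass
class Vuln:
    # All field names are hypothetical, chosen for this sketch only.
    remote: bool               # exploitable remotely (not local-only)
    needs_credentials: bool    # attacker needs a user name and password
    complex_interaction: bool  # requires complex user interaction
    special_config: bool       # works only on specific configurations


def cerrudo_rating(v: Vuln) -> str:
    """Apply Cerrudo's proposed 'Critical' vs. 'Important' definitions."""
    # Critical: remote, no credentials, minimal user interaction,
    # and no dependence on a specific configuration.
    if (v.remote and not v.needs_credentials
            and not v.complex_interaction and not v.special_config):
        return "Critical"
    # Everything else described in the quote falls under Important:
    # local access, remote access with credentials, complex user
    # interaction, or specific configurations.
    return "Important"


# A drive-by browser bug: remote, no login, user only visits a page.
print(cerrudo_rating(Vuln(True, False, False, False)))
# A local privilege escalation.
print(cerrudo_rating(Vuln(False, False, False, False)))
```

Under this reading, a hole triggered by visiting a Web page or opening a document would rate “Critical” whether or not it could ever fuel a worm.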

Miller agrees. “I think it’s important to …make a distinction between those vulnerabilities that are known to have an exploit ‘in the wild’ and those that are just known vulnerabilities,” he wrote.

That might sound like common sense. But, as it stands, Microsoft tracks vulnerability severity and exploitability separately, according to Jerry Bryan, Group Manager for Response Communications at Microsoft.

Severity is a measure of what Microsoft calls the “maximum impact” of a vulnerability. That’s defined as “the highest impact vulnerability may have.” Bryan described “maximum impact” as a way of making sure that Microsoft errs on the conservative side with its Severity Ratings. For example, Bryan explained in an email, if a vulnerability could be used for remote code execution, but might also cause a denial of service, Microsoft would consider the more severe case (remote code execution) when assigning a severity rating (in this case, a critical rating).

Information on how easy it is for an attacker to use (or exploit) security vulnerabilities, however, is tracked in what Microsoft calls the “Exploitability Index,” a separate rating system introduced in 2008 in response to customer requests for more context around vulnerabilities. The Exploitability Index measures the ease with which a vulnerability could be exploited, on a scale from 1 (“consistent exploit code likely”) to 3 (“functioning exploit code unlikely”).

“Together, these two pieces of information help organizations prioritize the deployment of Microsoft updates for their environment by indicating the risk of an issue being exploited and the impact if the exploit is successful,” Bryan said in a statement.

Microsoft combines the two elements in its “deployment priority” chart with each patch release, which it describes as a ranking “based on severity rating, exploitability index rating, available mitigation and workarounds and range of affected products.”

Of course, how “easy” something is to exploit is a subjective measure that’s based, in part, on the researcher’s level of skill and expertise, Miller and others acknowledge. Other vulnerability scoring systems, notably the Common Vulnerability Scoring System (CVSS), which is maintained by the Forum of Incident Response and Security Teams (FIRST), blend exploitability data with other factors, including environmental considerations and “impact” – the way the vulnerability might expose sensitive data or be used to compromise or disable key systems. While not exact, the elements of CVSS ratings are simple enough to allow the creation of tools for generating consistent severity scores as well as impact and exploitability subscores.

That’s not a bad model for Microsoft, says Miller. Overall ratings that take exploitability into account could be paired with sub-ratings that indicate whether a vulnerability requires user interaction to trigger and whether a public exploit is circulating or, barring a public exploit, whether “mad skillz” would be required to create one or “my grandma could write the exploit.” That would help shade an individual vulnerability in a way that would be useful for IT pros, Miller said.

Get with the (CVSS) Program

One problem with Microsoft’s Severity Rating System is that the system itself is proprietary and doesn’t play nice with vendor-neutral industry standards. Bulletins might link out to independent vulnerability tracking services like MITRE’s Common Vulnerabilities and Exposures List, but don’t necessarily track to them.

At the same time, IT professionals have a wealth of information from the company on any vulnerability, but have to cobble it together from different sources, including advance bulletin notifications, security advisories, the MSRC blog, Microsoft’s exploitability index, and so on.

That complexity is enough that popular understanding of the Severity Ratings has drifted from their actual definitions. After almost a decade, Cerrudo argues, the Ratings have become little more than “rules of thumb” internalized by harried IT administrators rather than pinpoint-accurate declarations about the potential harm that could result from a specific hole going unpatched.

“I think what Microsoft has accomplished over the years (is) people associating a ‘Critical’ severity rating with ‘very dangerous,’ ‘patch should be applied ASAP,’ (and) ‘I’m going to be hacked if I don’t patch,'” Cerrudo wrote. However, lower ratings tend to fall off the map.

“Important severity level means: ‘I can patch later’, ‘There are more critical things I should care about,’” Cerrudo wrote. Moderate and Low vulnerabilities, he said, often are interpreted as ‘We will patch someday’ or ‘Not dangerous at all.’

That becomes problematic when “Important” vulnerabilities are understood to be holes that lack urgency, even though a close reading would show that they could facilitate remote compromise of vulnerable systems.

As an example of just this problem, Wysopal points to MS10-088, a recent Microsoft bulletin covering two PowerPoint vulnerabilities. Microsoft assigned an “Important” severity rating to the bulletin, despite the fact that the holes affected a common component of the widely used Office suite and that each of the vulnerabilities would allow an attacker to “take complete control of an affected system…” installing their own programs or user accounts and modifying, copying or deleting data. Both could also be leveraged in Web based attacks by “a Web site that contains a Web page that is used to exploit this vulnerability” or “Web sites and Web sites that accept or host user-provided content… (and) contain specially crafted content that could exploit this vulnerability.”

CVSS, on the other hand, rates the MS10-088 vulnerabilities “HIGH” with a score of 9.3 out of 10, with an exploitability subscore of 8.6 out of 10 and an impact subscore of 10 out of 10 contributing to the overall rating. In that case, as in others, Wysopal argues, Microsoft’s bias towards “wormability” dilutes the severity of vulnerabilities that are the most commonly used tools for compromising individuals and organizations today: those sent via the Web and aimed at client side software.
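
To show how those three numbers relate, here is a minimal Python sketch of the published CVSS version 2 base equations. The article does not give the metric vector behind the MS10-088 scores, so the vector used below (network access, medium access complexity, no authentication, complete impact to confidentiality, integrity and availability) is an assumption, chosen because it is consistent with the reported figures. The function name is hypothetical.

```python
# Metric weights from the CVSS version 2 base equations.
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}
CIA_IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}


def cvss2_base(av, ac, au, c, i, a):
    """Return (base, impact, exploitability), each rounded to one decimal."""
    impact = 10.41 * (
        1 - (1 - CIA_IMPACT[c]) * (1 - CIA_IMPACT[i]) * (1 - CIA_IMPACT[a])
    )
    exploitability = (
        20 * ACCESS_VECTOR[av] * ACCESS_COMPLEXITY[ac] * AUTHENTICATION[au]
    )
    # f(Impact) zeroes the base score when there is no impact at all.
    f_impact = 0.0 if impact == 0 else 1.176
    base = (0.6 * impact + 0.4 * exploitability - 1.5) * f_impact
    return round(base, 1), round(impact, 1), round(exploitability, 1)


# Assumed vector (not stated in the article): AV:N/AC:M/Au:N/C:C/I:C/A:C
print(cvss2_base("N", "M", "N", "C", "C", "C"))  # (9.3, 10.0, 8.6)
```

Note that nothing in the calculation asks whether the flaw is wormable. A client-side hole that needs a user to open a file loses only a little on access complexity and still lands near the top of the scale.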

Wysopal suggests that Microsoft ditch its proprietary severity rating system altogether in favor of the CVSS system, which he calls a “more realistic and common standard” that has also been adopted as the basis for NIST’s National Vulnerability Database (NVD).

Embrace Complexity

Ultimately, any kind of rating system will oversimplify the complex issues that surround vulnerabilities, but researchers say that’s OK. In fact, Microsoft needs to embrace complexity if it is to reform its existing and flawed Severity Rating System.

“If I were writing a rating system, it would be much more refined,” wrote Miller. “As someone who finds bugs for a living, I have to assess the ease of exploitation of bugs all the time, and it is difficult, and subjective, but I have to do it in order to move on.”

Microsoft could help by providing more context about exploitability in its bulletins, which would help shade different vulnerabilities and also support the ratings that are given.

“Consider the difference between an off-by-one error that allows you to write a null one byte past a buffer compared with a bug that allows you to write an arbitrary amount of arbitrary data past a buffer,” Miller offered, as an example. “I think it’d be really interesting if they tried to make a scale from 1-10 based on this criteria,” he said.

Jeremiah Grossman, CTO and founder of application security firm WhiteHat Security, thinks that Microsoft’s Severity Ratings need to embrace complexity, even as they strive to simplify information about vulnerabilities.

“There are several variables, of varying degrees of real importance, worth considering in a final actionable ‘severity’ formula,” he writes. Among them: system severity, exploit difficulty, wormability, local or remote accessibility, industry knowledge and pervasiveness. While there might not be an easy or consistent way to weight all those different factors, Grossman believes that it’s worth trying, especially if Microsoft can mine its knowledge of exploits and use it to help rank a decade of vulnerability data. “They might create a new formula, apply it to old and patched issues, and see if the resulting severity matches what would be reasonable accurate (in light of passed exploits),” Grossman suggested.

Last: Do No Harm

Despite their criticisms of elements of Microsoft’s Severity Rating System, there was broad agreement among the experts we contacted that Microsoft should keep some kind of rating system in place – even if it decides not to fix its flaws.

While the Severity Rating System may be more a reflection of the security landscape at the time it was created than an accurate measure of severity in today’s threat environment, it’s still much better than what preceded it: a ponderous, erratic and overly technical security bulletin system that focused on technical descriptions of vulnerabilities and patches, without providing much in the way of guidance to IT practitioners.

After close to a decade, IT staff have developed a familiarity with and understanding of the rating system, building patch processes around the company’s monthly Patch Tuesday bulk release and prioritizing activities based on the Severity Ratings of those patches.

“I think one of the successes is the predictable patching schedule,” wrote Grossman of WhiteHat. “From the statistics I read on actual breaches, what’s most important in patch management is not necessarily the speed, but the coverage. Predictable schedules, I think, encourage more broad patching coverage.”

In short: ten years later, the Severity Rating System, even in its current incarnation, accomplishes its core objective: communicating information about the relative seriousness of security vulnerabilities and the need to patch to a broad audience, our experts agreed. Most said that they’d be loath to throw it out altogether.

“I think the way they do it is meaningful in the way it is intended,” wrote Miller. “The rating system is designed for enterprise customers to help them decide how quickly they need to patch.”

Cerrudo of Argeniss agreed. “The Severity Rating System is how Microsoft communicates to people about what vulnerabilities should be patched ASAP or not,” he wrote.

The question, these experts say, is whether Microsoft is willing, in the next decade, to commit as many resources to secure development and improved communication about product vulnerabilities as it has committed in the past decade.

Some, like Grossman, worry that the company’s attention may be wandering now that software security isn’t perceived as much of a liability for Microsoft.

“This is no longer the era of (Microsoft) trying to look more secure than Linux,” wrote Grossman, who has blogged on the topic. “Microsoft won the security PR war long ago. Right or wrong, Adobe and Apple are the new whipping boys now.”

And that may be a lost opportunity for the community, as a whole, he said.

“We’ve learned a great deal as an industry since 2002. A modern system should take much more into consideration.”
