A Case Against CVSS: Vulnerability Management Done Wrong
Brief
The Common Vulnerability Scoring System (CVSS) describes computer vulnerabilities and their severity should they be exploited, and is commonly used to gauge the risk that a single vulnerability introduces into an environment. With no justification for its underlying formula, inconsistencies in its specification document, and no correlation to vulnerabilities exploited in the wild, it cannot provide a meaningful metric for describing a vulnerability’s severity.
CVE & CVSS Background
CVSS is a ubiquitous information security standard used to gauge the severity of Common Vulnerabilities and Exposures (CVEs) on a 0 (Low) to 10 (Critical) scale, with the first iteration, CVSS v1, introduced in 2005. There were other vulnerability scoring systems at the time, but unlike the others, CVSS was vendor agnostic and sponsored by the National Institute of Standards and Technology (NIST). CVSS v1 was replaced with CVSS v2 in 2007, and all existing CVEs’ scores were backfilled with CVSS v2 scores. The current version, CVSS v3.1, released in 2019, shares the same rating scheme as CVSS v3.0, released in 2015, with only a few differences in features. The standard is currently maintained by the CVSS Special Interest Group (CVSS-SIG). Since CVSS v3.0 and CVSS v3.1 have identical scoring formulas, “CVSS v3” will be used to refer to both of them.
The centrally managed repository of CVEs is known as the National Vulnerability Database (NVD),[4] which NIST has curated since 2000. The NVD experienced many growing pains in its first few years, as a centralized, vendor-agnostic database for cataloging computer vulnerabilities was a novel idea. Currently, the NVD contains over 150,000 CVEs and is growing rapidly with the proliferation of software, vulnerability testing capabilities, and CVE reporting infrastructure, as shown below in figure M. Both CVSS and CVE are standards contained under the umbrella of the Security Content Automation Protocol (SCAP), a series of systems created to help manage and document security content.
CVSS is a Risk Score Without Accountability
CVSS has largely been used as a risk score. It requires little thought, and is less expensive than performing a full risk assessment, to say that a CVE with a CVSS v3 score of 9.0 needs priority patching over one with a 7.0, without considering the context surrounding the vulnerability. The CVSS-SIG recognized this issue, and after many complaints, including those outlined in the 2018 paper Towards Improving CVSS, released the specification documents for CVSS v3.1 in June 2019. The title of section 2.1 in the CVSS v3.1 User Guide plainly states that “CVSS Measures Severity, not Risk”. This is intended to further separate CVSS from its perception as a risk score, but it is contradicted by CVSS v3.1’s own specification document. CVSS measures risk by incorporating threat into its scoring through the Exploit Code Maturity (ECM) metric in the Temporal Metric Group. Potential values of the ECM are Not Defined, High, Functional, Proof-of-Concept, and Unproven. Naturally, the ECM of a CVE relates directly to the threat of exploitation of a vulnerability, as a measure of exploitability is a measure of threat. If the ECM is High, the vulnerability is much more likely to be successfully exploited than if the ECM is merely Functional. This is analogous to someone with a gun versus a bow and arrow: the more developed weapon is the greater threat. With the ECM, CVSS takes both threat and vulnerability into its formula, the two criteria for measuring risk in cyber security. This is inappropriate if CVSS is meant to be solely a vulnerability score.
As an aside, it stands to reason that if exploit code has been developed, then attack complexity should be decreased. While exploitation could still be technically complex, exploit code automates this process, making the exploitation itself trivial.
Validity of CVSS Weightings
CVSS v3 is made up of a variety of metrics with a wide range of weightings in its formula, yet there are no publicly available documents providing justification for the weighting of each constituent vulnerability metric in the equation, nor are the equations used to derive those weightings made available. For example, in the CVSS Exploitability Metric Group, the most severe level for Attack Complexity (AC) is Low (L), with a corresponding formula coefficient of 0.77, while the most severe value for Attack Vector (AV) is Network (N), with a formula coefficient of 0.85. I am unable to determine from the publicly available information why AC/L is weighted less than AV/N in the formula for the Exploitability Score. This was left up to the subjective opinions of the CVSS v3 authors; an alternative group of security professionals could have weighted these two metrics differently. Until the CVSS-SIG is able to present the underlying formulas producing the weightings for each CVSS v3 metric, the system remains completely unjustified.
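The Exploitability sub-score in question can be computed directly from the constants published in the CVSS v3.1 specification. The sketch below uses the documented metric values (Privileges Required values shown are for an Unchanged Scope) and makes the arbitrariness concrete: the 0.77 and 0.85 coefficients are bare constants with no derivation behind them.

```python
# Exploitability sub-score per the CVSS v3.1 specification:
#   Exploitability = 8.22 x AV x AC x PR x UI
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.20}  # Attack Vector
AC = {"L": 0.77, "H": 0.44}                        # Attack Complexity
PR = {"N": 0.85, "L": 0.62, "H": 0.27}             # Privileges Required (Scope: Unchanged)
UI = {"N": 0.85, "R": 0.62}                        # User Interaction

def exploitability(av: str, ac: str, pr: str, ui: str) -> float:
    """Exploitability sub-score for a CVSS v3 vector fragment."""
    return 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]

# The most severe AC value (L, 0.77) weighs less than the most
# severe AV value (N, 0.85) -- the ordering the spec never justifies.
print(round(exploitability("N", "L", "N", "N"), 2))  # 3.89
```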
General Context Issues
The issues regarding CVSS’s inability to account for a vulnerability’s context are well known and widely documented in academic papers, forums, and blogs.[10][6][11][12][18] The Environmental Score is the component of CVSS v3 meant to grapple with this issue, but it lacks the granularity required to be effective. The Environmental Score allows you to create a score based on the requirements you may have for a system’s confidentiality, integrity, or availability. The metrics and values for each of these scores can be found in table E.[3] Changing the value of any of these variables only moves the score up or down one increment, which is simply not enough for a large and diverse environment. To illustrate: in a pharmaceutical company, the availability value for a system managing assembly-line manufacturing would be the same as for a web server, High. Both require high availability, but in a business-value comparison the manufacturing system’s requirement would be far higher in actuality than the web server’s. Even if you do use the Environmental Score in this instance, you would have to supplement it with a compensatory control, such as GxP. A scale would be much more useful than the current four-value nominal system, only two values of which modify the score.
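The four-value nominal system can be seen directly in the requirement multipliers from the CVSS v3.1 specification. As the sketch below shows, Medium and Not Defined are both no-ops (a multiplier of 1.0), so only two of the four values actually change the modified impact calculation:

```python
# Confidentiality/Integrity/Availability Requirement multipliers,
# as published in the CVSS v3.1 specification:
REQUIREMENT = {
    "X": 1.0,  # Not Defined
    "L": 0.5,  # Low
    "M": 1.0,  # Medium -- identical to Not Defined
    "H": 1.5,  # High
}

# Only two of the four nominal values modify the score at all.
effective = {k for k, v in REQUIREMENT.items() if v != 1.0}
print(sorted(effective))  # ['H', 'L']
```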
Quantification of the Confidentiality Impact in the Impact Metrics is also an issue for CVSS. The Confidentiality Impact is defined roughly as the amount of data that a vulnerability gives you access to in its context, but this approach is incapable of accounting for the value of the data that may be leaked. Take CVE-2014-3566, aka POODLE: it is a MITM vulnerability with a CVSS v3 score of 3.4, which allows the attacker to decrypt a small amount of ciphertext traveling over SSL 3.0. The revealed information could be benign or the CEO’s SSN; either way, CVSS is unable to account for it. An argument could be made that this is where the Environmental Score is meant to be applied, but it would not be useful for three reasons. One, when POODLE was disclosed, SSL 3.0 was ubiquitous across nearly every networked machine; it would be impractical to apply the Environmental Score to all of them. Two, it is generally not easy to predict what kind of data is being transported over a system, which is what is needed to determine the Confidentiality Requirement. Three, the score is already so low that, even if the Confidentiality Requirement is set to High, the CVSS v3 Environmental Score still only comes out to 4.2, just barely a medium-severity vulnerability according to table A. Examples like this, where the greater context of a vulnerability cannot be taken into account, make CVSS prone to false negatives and positives.
Does CVSS really measure severity?
CVSS scores are grouped into increasing levels of severity by the CVSS-SIG in table A.[3] Vulnerabilities that are more severe should, in principle, be more attractive candidates for developing exploitation and weaponization mechanisms. If this is the case, we should see a linear correlation between the CVSS v3 score of a CVE and the likelihood of the CVE having been presently or historically weaponized. A weaponized vulnerability is defined as a vulnerability that has been exploited with malicious intent in the wild.
This hypothesis will be tested using a sample of CVEs with CVSS v3 scores where the weaponization status can be reliably determined. A CVE’s weaponization status is decided by information available in the Qualys Knowledge Base (QKB) entry associated with each CVE. Qualys Inc., a company focused on vulnerability management solutions, maintains exploitation information on all of the CVEs it tracks in the QKB. Of the 152,159 CVEs captured in the NVD circa October 2020, 70,140, or 54%, are assigned CVSS v3 scores. Of those, 28,779, or approximately 41%, are tracked by the QKB. While Qualys was unable to comment on what criteria a CVE must meet to be tracked, the similarity in characteristics between our two data sets, shown below in table X., would suggest that our sample from the QKB is representative of a significant portion of the total CVEs with CVSS v3 scores from the NVD, if not the whole NVD.
CVE weaponization status is not explicitly defined in the QKB, so it has been derived from the Threat Intel Values (TIVs) associated with each CVE. A CVE is confirmed to have been weaponized if the TIV Exploit_Kit, Malware, or Active_Attacks is associated with it. While there are TIVs that indicate the development of PoC exploit code, this does not necessarily mean that the code was used with malicious intent. The distribution of weaponization values across floored CVSS v3 values in the sample dataset is shown in diagram A. and table C. Values are floored for data visualization only and are not floored during statistical tests. Linear correlation tests between a vulnerability’s weaponization status, its CVSS v3 score, and elements of its CVSS v3 vector string are shown in table D.
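The derivation rule described above can be sketched as a simple set test. Note this is an illustrative reconstruction, not the real QKB schema: only the three TIV names come from the methodology described here, and any other TIV strings shown are hypothetical.

```python
# Sketch of the weaponization rule: a CVE counts as weaponized if
# any of these Threat Intel Values (TIVs) is attached to it.
WEAPONIZED_TIVS = {"Exploit_Kit", "Malware", "Active_Attacks"}

def is_weaponized(tivs) -> bool:
    """True if any TIV indicates malicious exploitation in the wild."""
    return bool(WEAPONIZED_TIVS & set(tivs))

# PoC-style indicators alone (hypothetical TIV names below) do not
# count as weaponized -- exploit code may never be used maliciously.
print(is_weaponized(["Proof_Of_Concept"]))              # False
print(is_weaponized(["Malware", "Proof_Of_Concept"]))   # True
```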
As diagram A. and table C. indicate, there is no linear correlation between a CVE’s CVSS v3 score and its weaponization status. In fact, it may be the case that CVEs with CVSS v3 scores of 7.0 are actually the most severe on average, if we measure severity by actual exploitation. This data also shows that any security practitioner wanting to protect their environment should not use CVSS as the basis for prioritizing device patching, regardless of context concerns.
Because only a fraction of outstanding patches can be applied, given the resource-constrained nature in which most IT organizations operate, resolving the most severe vulnerabilities according to CVSS v3 would not substantially improve their security posture.
Conclusion
CVSS v3 is laden with problems. There is no clear reasoning given for how the system was devised; it is riddled with logical inconsistencies; it can only partially account for the context of a vulnerability; and it is an empirically poor means of representing a vulnerability’s true severity. It is unreasonable for NIST and PCI DSS to recommend and mandate that CVSS be used for the purposes of vulnerability management.[8][9] CVSS is an artifact of an emerging discipline, and despite being on effectively its third version, is inadequate for effective patch management. With its fourth iteration in active development, the standard will need radical changes, as well as efficacy tests, to ensure transparency and viability.[20]
Sources
[1]Campagna, Rich. “5 Reasons to Stop Using CVSS Scores to Measure Risk.” Balbix, 12 Sept. 2020, www.balbix.com/blog/5-reasons-to-stop-using-cvss-scores-to-measure-risk/.
[2]“CVSS v2 History.” FIRST, www.first.org/cvss/v2/history.
[3]“CVSS v3.1 Specification Document.” FIRST, www.first.org/cvss/v3.1/specification-document.
[4]National Vulnerability Database, NIST, 12 Dec. 2020, nvd.nist.gov/.
[5]Avner, Gabriel. “The National Vulnerability Database Explained.” WhiteSource, 15 Sept. 2020, resources.whitesourcesoftware.com/blog-whitesource/the-national-vulnerability-database-explained.
[6]Spring, Jonathan et al. Towards Improving CVSS. 14 Dec. 2018, resources.sei.cmu.edu/library/asset-view.cfm?assetid=538368.
[7]Qualys Knowledge Base, Qualys Inc., 12 Dec. 2020, www.qualys.com/. A huge thanks to Qualys for allowing me to use their data for my research.
[8]“Payment Card Industry (PCI) Data Security Standard.” PCI Security Standards, July 2006, www.pcisecuritystandards.org/.
[9]Souppaya, Murugiah, and Karen Scarfone. “NIST Special Publication 800-40 Revision 3: Guide to Enterprise Patch Management Technologies.” Nvlpubs, July 2013, nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-40r3.pdf.
[10]Allodi, Luca. The Effect of Security Education and Expertise on Security Assessments: the Case of Software Vulnerabilities. ArXiv, 2018.
[11]Klinedinst, Dan J. CVSS and the Internet of Things, 2 Sept. 2015, insights.sei.cmu.edu/cert/2015/09/cvss-and-the-internet-of-things.html.
[12]Prajapati, Yenifer. “A Review of the Common Vulnerability Scoring System.” Medium, Critical Stack, 10 Sept. 2018, medium.com/critical-stack/a-review-of-the-common-vulnerability-scoring-system-2c7d266eda28.
[13]Möller, Bodo, et al. “This POODLE Bites: Exploiting the SSL 3.0 Fallback.” Google, 2014.
[14]“Common Vulnerability Scoring System Version 3.0 Calculator.” FIRST, www.first.org/cvss/calculator/3.0.
[15]“CVSS v3.0 User Guide.” FIRST, www.first.org/cvss/v3.0/user-guide.
[16]“Alert (TA14-290A).” Cybersecurity and Infrastructure Security Agency CISA, us-cert.cisa.gov/ncas/alerts/TA14-290A.
[17]“Microsoft Exploitability Index.” Microsoft, June 2001, www.microsoft.com/en-us/msrc/exploitability-index.
[18]Amit, Yair. “Symantec Mobile Threat Defense: Spotlight on Modern Endpoint Vulnerability Management.” Symantec Blogs, symantec-enterprise-blogs.security.com/blogs/product-insights/symantec-mobile-threat-defense-spotlight-modern-endpoint-vulnerability-management.
[19]Saunders, Harry. “FIRST Announces Availability of New Common Vulnerability Scoring System (CVSS) Release.” FIRST, 10 June 2015, www.first.org/newsroom/releases/20150610.
[20]Manion, Art, and Jake Kouns. “Vulnerability Prioritization and Disclosure — The Right Security.” Risk Based Security, 15 Dec. 2020, www.riskbasedsecurity.com/2020/12/15/vulnerability-prioritization-and-disclosure-the-right-security/.