Release 2018-11-06
2018-11-06 Updated data
2018-01-30 Updated data
2017-11-08 Updated data. Sorted JSON keys so future updates should diff more cleanly in git commit logs.
2017-03-30 Updated data. Some more CVSS scores backfilled.
2016-11-03 Updated data. Backfilled "Not Defined" values for many vulnerabilities' CVSS scores.
2016-06-24 Changed search URL, sponsorship, added GitHub migration, no change to data
2016-03-30 Updated data
2014-12-08 Updated data, note significant number of nearly identical reports associated with automated Android SSL testing
2014-05-22 Updated data
2013-11-20 Updated data
2013-04-03 Updated data, added field definition for [DateCreated], fixed 8-bit characters in section 8
2012-07-10 Initial release
This data archive contains nearly all of the non-sensitive vulnerability data collected by CERT, from the inception of the vulnerability notes database (approximately May 1998) to the date the archive was prepared, as noted above in the Change Log.
Since roughly 2004, the United States Department of Homeland Security (DHS) United States Computer Emergency Readiness Team (US-CERT) and other government sponsors have funded the vulnerability analysis and coordination work that includes this vulnerability data and the publication of Vulnerability Notes.
This data is incomplete. All records (reports) should have an ID, title, and creation date. Only some (~6%) of the reports have been analyzed, coordinated, written up, and published as Vulnerability Notes.
Most of the reports are in a preliminary state, with blank or default field values. Few fields are consistently entered across the entire data set. It is generally inappropriate from an analysis perspective to draw conclusions from incomplete and inconsistent data. You have been warned.
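One way to act on this warning is to measure how often each field is actually populated before relying on it. The following is a minimal sketch, not part of the CERT tools. It assumes each report has already been loaded from JSON into a Python dict, and the names `is_blank` and `fill_rates` are hypothetical helpers introduced here for illustration.

```python
from collections import Counter


def is_blank(value):
    """Treat None, empty containers, and whitespace-only strings as blank."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, dict)):
        return len(value) == 0
    return False


def fill_rates(reports):
    """Return {field name: fraction of reports with a non-blank value}.

    A field that is missing from a report counts as blank for that report.
    """
    reports = list(reports)
    if not reports:
        return {}
    filled = Counter()
    for report in reports:
        for name, value in report.items():
            # Adding 0 still registers the field name, so all-blank fields
            # show up in the result with a rate of 0.0.
            filled[name] += 0 if is_blank(value) else 1
    return {name: count / len(reports) for name, count in filled.items()}
```

Note that default values such as "0" or "--" count as populated in this sketch, so a high fill rate does not by itself mean the field was analyzed.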
There are two sets of data, vulnerability reports and vendor records. A published Vulnerability Note is made up of one vulnerability report and one or more vendor records.
In this document, field names are enclosed in square brackets, like this: [FieldName].
Vulnerability reports describe information about a reported vulnerability. A report may contain 0 or more vulnerabilities. CERT typically attempts to have one vulnerability report per vulnerability, but this isn't always a practical level of abstraction. [VulnerabilityCount] is the number of vulnerabilities (per CERT's definition) in a vulnerability report.
Vulnerability reports have 0 or more associated vendor records. Vendor records describe vendor information related to a vulnerability report. A record is typically created when CERT notifies a vendor about a vulnerability, reasonably believes the vendor may be affected, becomes aware of information about the vendor related to the vulnerability, or otherwise feels that there is relevant information about the vendor related to the vulnerability.
See https://github.com/CERTCC-Vulnerability-Analysis/Vulnerability-Data-Archive-Tools for some Python tools to get you started with using this data.
Did you find something interesting in the data? Did you come up with some cool way of slicing it or remixing it and you want to share? You can tweet us @certcc. Or send mail to [email protected].
If you find a problem with the data or the tools, please create an issue report in the appropriate repository.
Please be aware that we offer no formal support. However, we may respond to questions and feedback sent to [email protected] with the tag VCALL-119 in the subject.
The data in this repository has been transformed from the data found at http://www.cert.org/download/vul_data_archive/. That data set contains archived raw exports of the CERT Vulnerability Notes Database. The Vulnerability Notes Database is a Lotus Notes application, and the raw JSON and XML exports in the original archive can be difficult to work with. In this repository we've converted the JSON data to more conventional key-value pairs to make it easier to use.
The directory structure is as follows:
- `./data/` contains the entire data set.
- Below that, `./data/0/` through `./data/99/` contain subdirectories for individual vulnerability reports. A vulnerability report can be found by taking the VU#NNNNN ID modulo 100 (i.e., the last two digits of the VU#, so VU#123456 would be in `./data/56/`).
- Individual vulnerability report directories (e.g., `./data/56/vu_123456/`) should contain exactly one `vu_*.json` file and 0 or more `vendor_*.json` files.
  - `vu_*.json` files contain data as described in Field Definitions for Vulnerability Reports.
  - `vendor_*.json` files contain data as described in Field Definitions for Vendor Records.
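
The layout above can be navigated with a few lines of Python. This is a minimal sketch rather than the official tooling. The function names `report_dir` and `load_report` are hypothetical, and the code assumes that neither the bucket directory nor the `vu_` directory name is zero-padded, which the description suggests but does not state outright.

```python
import json
from pathlib import Path


def report_dir(data_root, vu_id):
    """Return the directory expected to hold a report.

    Example: VU#123456 -> <data_root>/56/vu_123456
    """
    number = int(str(vu_id).upper().replace("VU#", ""))
    # The bucket is the ID modulo 100 (assumed not zero-padded).
    return Path(data_root) / str(number % 100) / f"vu_{number}"


def load_report(data_root, vu_id):
    """Load the single vu_*.json file and any vendor_*.json files."""
    directory = report_dir(data_root, vu_id)
    vu_files = sorted(directory.glob("vu_*.json"))
    if len(vu_files) != 1:
        raise FileNotFoundError(f"expected exactly one vu_*.json in {directory}")
    report = json.loads(vu_files[0].read_text(encoding="utf-8"))
    vendors = [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(directory.glob("vendor_*.json"))
    ]
    return report, vendors
```

For example, `load_report("./data", "VU#123456")` would read from `./data/56/vu_123456/` and return the report dict together with a possibly empty list of vendor record dicts.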
In the Lotus Notes application from which this data was exported, some fields are initially created as text and the field type is updated only once data has been entered. For example, a datetime field that is blank may appear as a text field.
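
In practice this means a consumer should expect a blank string where a datetime would otherwise be. The sketch below shows one defensive way to read such a field. It assumes populated values are ISO 8601 strings, which this document does not confirm, so check the format against the actual data first. `parse_optional_datetime` is a hypothetical helper name.

```python
from datetime import datetime


def parse_optional_datetime(value):
    """Return a datetime, or None when the field is blank or missing.

    Assumption: populated values are ISO 8601 strings that
    datetime.fromisoformat() accepts. A different format raises ValueError.
    """
    if value is None or not str(value).strip():
        return None
    return datetime.fromisoformat(str(value).strip())
```

Returning `None` for blank values keeps "never entered" distinct from any real date, which matters for fields such as [DateFirstPublished].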
Descriptions for fields included in published Vulnerability Notes are also available here.
Some of the fields in this archive are not included in published Vulnerability Notes.
Field Name | Data Type | Description |
---|---|---|
[ID] | text | Vulnerability report ID. The format is VU#nnnnnn (six digits), with some older records having fewer than six digits. This is the associated vulnerability report for the vendor record. |
[IDNumber] | text | The numerical portion of the ID (nnnnnn, six or sometimes fewer digits). |
[Title] | text | Title of the vulnerability report. May be HTML-escaped. |
[Keywords] | textlist | List of keywords. These become HTML meta keywords in a published Vulnerability Note. |
[Overview] | text | Overview/summary of the vulnerability report. |
[Description] | text | A detailed description of the vulnerability report. |
[Impact] | text | The impact of the vulnerability. |
[Resolution] | text | A solution to the vulnerability, typically a complete solution, such as a patch or update. |
[Workarounds] | text | Workarounds or mitigations for the vulnerability, typically something less than a complete solution, but still effective at mitigating the vulnerability. |
[SystemsAffectedPreamble] | text | A preamble to the list of vendors. |
[ThanksAndCredit] | text | Acknowledgement, credit, and possibly thanks to the person(s) or organization(s) who discovered or reported the vulnerability or who contributed to the coordination or analysis effort or information used in the Vulnerability Note. |
[Author] | text | This field is set to the name of the analyst who is first assigned the vulnerability report. When a report is published as a Vulnerability Note, this field is the author. |
[References] | text or textlist | URLs to reference information about the vulnerability report. |
[CVEIDs] | text or textlist | List of one or more related CVE IDs. |
[CERTAdvisory] | text or textlist | References to one or more CERT Advisories. |
[US-CERTTechnicalAlert] | text or textlist | References to one or more US-CERT Technical Alerts. |
[VulnerabilityCount] | number | The number of unique vulnerabilities in a vulnerability report. |
[DateCreated] | datetime | Date the vulnerability report was created. This closely corresponds to the date CERT was first aware of the vulnerability. |
[DatePublic] | datetime | Date the vulnerability is known to be public. This may be the date a Vulnerability Note is published. This field may only contain date information, not time. |
[DateFirstPublished] | text or datetime | Date the Vulnerability Note was first published. This field is a text type if it is blank, datetime when it is populated. |
[DateLastUpdated] | datetime | Date the vulnerability report was last updated. |
[Revision] | number | Number of times the vulnerability report was revised, with "1" being the initial creation of the report. |
Note on fields beginning with VRDA_D1_ | text (integers) | "VRDA_D1_" in the field name indicates that the field is used in the first round of decision support (triage, surface analysis, or D1) to determine how to handle the vulnerability report. Although the "VRDA_D1" component fields are text type, they can be safely treated as numbers (integers). "VRDA_D1" fields were added to the database in 2007. More information about VRDA (Vulnerability Response Decision Assistance) is available here: http://search.cert.org/search?q=vrda |
[VRDA_D1_DirectReport] | text | If this field is set to "1" then the vulnerability was directly reported to or found by CERT. If this field is "0" then the vulnerability was not a direct report or internal discovery. If this field is blank then the report may or may not have been a direct report or internal discovery. |
[VRDA_D1_Population] | text | This field answers the question "What is the population of affected systems?" [VRDA_D1_Population] maps to [CVSS_TargetDistribution]. "1" - Low "2" - Low-Medium "3" - Medium-High "4" - High (e.g., Microsoft Windows, Adobe Flash, TCP, DNS, core UNIX/Linux) |
[VRDA_D1_Impact] | text | This field answers the question "What is the impact of the vulnerability?" "1" - Low (e.g., nuisance DoS/resource consumption, limited information disclosure) "2" - Low-Medium "3" - Medium-High "4" - High (e.g., execute arbitrary code with elevated privileges, take control of target) |
Notes on fields beginning with CAM_ | text (integer) | "CAM_" in the field name stands for "CERT Advisory Metric." More information is available here: http://www.kb.cert.org/vuls/html/fieldhelp#metric If all of the "CAM_" fields in a vulnerability report are "0" then it is most likely that the vulnerability report has not been analyzed beyond initial creation and surface analysis. Although the "CAM" component fields are text type, they can be safely treated as numbers (integers from 0-20). The calculated "CAM" fields are number type. |
[CAM_WidelyKnown] | text | This field answers the question "Is information about the vulnerability widely available or known?" |
[CAM_Exploitation] | text | This field answers the question "Is the vulnerability being exploited?" |
[CAM_InternetInfrastructure] | text | This field answers the question "Is internet infrastructure at risk because of this vulnerability?" |
[CAM_Population] | text | This field answers the question "How many systems on the internet are at risk from this vulnerability?" |
[CAM_Impact] | text | This field answers the question "What is the impact of exploiting the vulnerability?" |
[CAM_EaseOfExploitation] | text | This field answers the question "How easy is it to exploit the vulnerability?" |
[CAM_AttackerAccessRequired] | text | This field answers the question "What preconditions does an attacker require to exploit the vulnerability?" |
[CAM_ScoreCurrent] | number | Calculated CERT Advisory Metric score, decimal number from 0-180. |
[CAM_ScoreCurrentWidelyKnown] | number | Calculated CERT Advisory Metric score with [CAM_WidelyKnown] set to 20. |
[CAM_ScoreCurrentWidelyKnownExploited] | number | Calculated CERT Advisory Metric score with [CAM_WidelyKnown] and [CAM_Exploitation] both set to 20. |
[IPProtocol] | text | IP protocol information related to the vulnerability, e.g., 80/tcp, 161/udp. |
Notes on fields beginning with CVSS_ | text | "CVSS_" in the field name stands for "Common Vulnerability Scoring System." More information is available here: http://www.first.org/cvss/cvss-guide and in Vulnerability Severity Using CVSS. Vulnerability Reports started including CVSS v2 metrics in March 2012. Older reports will have empty CVSS_ values. Reports that have not been scored will have default CVSS_ values, including "--" for some fields. |
[CVSS_AccessVector] | text | See http://www.first.org/cvss/cvss-guide#i2.1.1 |
[CVSS_AccessComplexity] | text | See http://www.first.org/cvss/cvss-guide#i2.1.2 |
[CVSS_Authenication] | text | See http://www.first.org/cvss/cvss-guide#i2.1.3 |
[CVSS_ConfidentialityImpact] | text | See http://www.first.org/cvss/cvss-guide#i2.1.4 |
[CVSS_IntegrityImpact] | text | See http://www.first.org/cvss/cvss-guide#i2.1.5 |
[CVSS_AvailabilityImpact] | text | See http://www.first.org/cvss/cvss-guide#i2.1.6 |
[CVSS_Exploitability] | text | See http://www.first.org/cvss/cvss-guide#i2.2.1 |
[CVSS_RemediationLevel] | text | See http://www.first.org/cvss/cvss-guide#i2.2.2 |
[CVSS_ReportConfidence] | text | See http://www.first.org/cvss/cvss-guide#i2.2.3 |
[CVSS_CollateralDamagePotential] | text | See http://www.first.org/cvss/cvss-guide#i2.3.1 |
[CVSS_TargetDistribution] | text | See http://www.first.org/cvss/cvss-guide#i2.3.2 [CVSS_TargetDistribution] maps to |
[CVSS_SecurityRequirementsCR] | text | See http://www.first.org/cvss/cvss-guide#i2.3.3 |
[CVSS_SecurityRequirementsIR] | text | See http://www.first.org/cvss/cvss-guide#i2.3.3 |
[CVSS_SecurityRequirementsAR] | text | See http://www.first.org/cvss/cvss-guide#i2.3.3 |
[CVSS_BaseScore] | text | See http://www.first.org/cvss/cvss-guide#i3.2.1 |
[CVSS_BaseVector] | text | See http://www.first.org/cvss/cvss-guide#i2.4 |
[CVSS_TemporalScore] | text | See http://www.first.org/cvss/cvss-guide#i3.2.2 |
[CVSS_TemporalVector] | text | See http://www.first.org/cvss/cvss-guide#i2.4 |
[CVSS_EnvironmentalScore] | text | See http://www.first.org/cvss/cvss-guide#i3.2.3 |
[CVSS_EnvironmentalVector] | text | See http://www.first.org/cvss/cvss-guide#i2.4 |
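
Several of the definitions above call for light normalization before analysis: "text or textlist" fields may hold either a single string or a list, and the VRDA_D1_ and CAM_ component fields are integers stored as text. The following is a minimal sketch of that normalization. It assumes the JSON keys are the field names without the square brackets, and all helper names (`as_list`, `as_int`, `looks_unanalyzed`) are hypothetical. The all-zero check covers only the seven CAM_ component fields, not the calculated scores.

```python
def as_list(value):
    """Normalize a 'text or textlist' field to a list of non-blank strings."""
    if value is None:
        return []
    items = [value] if isinstance(value, str) else list(value)
    return [item.strip() for item in items if isinstance(item, str) and item.strip()]


def as_int(value):
    """Convert a text-typed integer field to int, or None when blank or invalid."""
    text = str(value).strip() if value is not None else ""
    try:
        return int(text)
    except ValueError:
        return None


# Labels for [VRDA_D1_Population] and [VRDA_D1_Impact], per the table above.
VRDA_LEVELS = {1: "Low", 2: "Low-Medium", 3: "Medium-High", 4: "High"}

CAM_COMPONENTS = (
    "CAM_WidelyKnown",
    "CAM_Exploitation",
    "CAM_InternetInfrastructure",
    "CAM_Population",
    "CAM_Impact",
    "CAM_EaseOfExploitation",
    "CAM_AttackerAccessRequired",
)


def looks_unanalyzed(report):
    """True when every CAM_ component field is present and equal to 0."""
    return all(as_int(report.get(name)) == 0 for name in CAM_COMPONENTS)
```

A report for which `looks_unanalyzed` returns `True` has most likely not been analyzed beyond initial creation and surface analysis, so its other scoring fields should be read with caution.
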
Field Name | Data Type | Description |
---|---|---|
[ID] | text | Vulnerability report ID. The format is VU#nnnnnn (six digits), with some older records having fewer than six digits. This is the associated vulnerability report for the vendor record. |
[VendorRecordID] | text | Lotus Notes unique ID for a vendor record |
[Vendor] | text | Name of the vendor. |
[Status] | text | Status of the vendor with regard to the vulnerability. By default, the status is "Unknown." If we believe a vendor is affected, the status is "Affected" or "Vulnerable". If we believe a vendor is not affected, the status is "Not Affected" or "Not Vulnerable". |
[VendorStatement] | text | A statement about the vulnerability by the vendor. The statement in this field is authenticated -- usually the text has been cryptographically signed by the vendor or verified by CERT out-of-band. |
[VendorInformation] | text | Information about the vendor related to the vulnerability. This is more loosely vetted information, for example something available on the vendor's web site. |
[VendorReferences] | text or textlist | URLs to vendor reference information about the vulnerability. |
[Addendum] | text | Additional comments or rebuttal from CERT to |
[DateNotified] | datetime | Date the vendor was notified by CERT. Notification implies at least that CERT sent email to a last-known good contact at the vendor. |
[DateResponded] | datetime | Date the vendor responded. |
[DateLastUpdated] | datetime | Date the vendor record was last updated. |
[Revision] | number | Number of times the vendor record was revised, with "1" being the initial creation of the record. |
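
Because [Status] uses two spellings for each outcome ("Affected" or "Vulnerable", "Not Affected" or "Not Vulnerable"), it can help to collapse them before counting. The sketch below is one way to do that. It assumes vendor records are dicts keyed by field name without brackets, compares case-insensitively as a precaution, and treats anything unrecognized or blank as unknown. The helper names are hypothetical.

```python
from collections import Counter

AFFECTED = {"affected", "vulnerable"}
NOT_AFFECTED = {"not affected", "not vulnerable"}


def status_group(vendor_record):
    """Map a vendor record's Status to 'affected', 'not affected', or 'unknown'."""
    status = str(vendor_record.get("Status") or "").strip().lower()
    if status in AFFECTED:
        return "affected"
    if status in NOT_AFFECTED:
        return "not affected"
    return "unknown"


def summarize_statuses(vendor_records):
    """Count the vendor records of one report by status group."""
    return Counter(status_group(record) for record in vendor_records)
```

Applied to the vendor records of a single report, `summarize_statuses` gives a quick view of how many vendors are believed affected versus still unknown.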