NIST NVD: The National Vulnerability Database Behind CVE Scoring and Cybersecurity Compliance

November 23, 2026 · AI Analytics

Tags: NIST, NVD, Cybersecurity, CVE, Federal Data

Every piece of software running in a federal agency, a hospital network, a bank, or a consumer device carries a history of publicly disclosed security flaws. The infrastructure that catalogs those flaws, scores their severity, classifies their type, and maps them to affected products is the NIST National Vulnerability Database—the US government's authoritative repository of standards-based vulnerability management data and the backbone of virtually every compliance framework in American cybersecurity.

What NVD Is

The National Vulnerability Database (NVD) is maintained by the National Institute of Standards and Technology (NIST), an agency of the US Department of Commerce. NVD launched in 2005 as the successor to NIST's earlier ICAT vulnerability index, which dates to 1999. It functions as an enrichment layer built on top of the Common Vulnerabilities and Exposures (CVE) system, which the MITRE Corporation operates at CVE.org with funding from the Cybersecurity and Infrastructure Security Agency (CISA). As of 2024, NVD contains more than 250,000 CVE entries spanning commercial software, open-source libraries, operating systems, hardware firmware, industrial control systems, and network devices.

The division of labor between MITRE and NIST is architecturally important. MITRE assigns and maintains CVE identifiers—unique labels for publicly disclosed vulnerabilities—but the CVE record itself contains only minimal metadata: a description, a list of references, and the identifier. NIST's NVD takes each CVE and enriches it with a standardized set of additional data: a CVSS severity score, a CWE weakness classification, a CPE (Common Platform Enumeration) list of affected software and hardware versions, additional reference links, and remediation guidance where available. This enrichment transforms a simple vulnerability label into a structured record that compliance frameworks, patch management tools, and vulnerability scanners can consume programmatically.

NVD is the data source that feeds the vulnerability signature databases used by Qualys, Tenable Nessus, Rapid7 InsightVM, and essentially every commercial vulnerability management platform. When a scanner reports that a host is running software with a known Critical vulnerability, it is comparing installed software versions against CPE data derived from NVD. When a compliance officer reports that a system has unpatched High-severity vulnerabilities, the severity thresholds are defined by CVSS scores from NVD. NVD is infrastructure so ubiquitous in commercial security tooling that its presence is rarely noticed—until, as occurred in 2024, NIST falls behind on enriching new CVEs.

The CVE Assignment System

CVE identifiers follow the format CVE-YYYY-NNNN, where YYYY is the year of assignment and NNNN is a sequence number of at least four digits. Since the 2014 syntax change the sequence number has had no fixed width beyond that minimum, which allows arbitrarily large annual counts. CVEs are assigned by CVE Numbering Authorities (CNAs), organizations authorized to assign CVE IDs within their defined scope. As of 2024, there are more than 400 CNAs worldwide.
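The identifier syntax described above can be checked mechanically. The following is a minimal sketch, assuming the post-2014 rule of a four-digit year followed by a sequence number of four or more digits; the names `CVE_ID_PATTERN` and `parse_cve_id` are illustrative and not part of any official tooling.

```python
import re

# "CVE-", a four-digit year, then a sequence number of four or more digits.
CVE_ID_PATTERN = re.compile(r"CVE-(\d{4})-(\d{4,})")


def parse_cve_id(cve_id: str) -> tuple[int, int]:
    """Return (year, sequence) for a well-formed CVE ID, else raise ValueError."""
    match = CVE_ID_PATTERN.fullmatch(cve_id.strip().upper())
    if match is None:
        raise ValueError(f"not a valid CVE ID: {cve_id!r}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    print(parse_cve_id("CVE-2021-44228"))
    print(parse_cve_id("CVE-2014-0160"))
```

Validating IDs before sending them to an API or joining them against another dataset avoids silent mismatches caused by stray whitespace or lowercase prefixes.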

The largest CNAs are major software vendors that assign CVEs for vulnerabilities discovered in their own products: Microsoft, Google (covering Android, Chrome, and Google Cloud), Apple, Red Hat, Oracle, and Cisco each assign hundreds to thousands of CVEs per year. MITRE serves as a Top-Level Root in the CNA hierarchy and as a CNA of Last Resort, assigning CVEs for vulnerabilities that fall outside any other CNA's scope, primarily open-source projects without a dedicated CNA. CISA is also a Top-Level Root and acts as a CNA of Last Resort for industrial control systems and medical devices. Bug bounty programs, security research firms, and national CERTs in other countries also hold CNA status for their respective domains.

The disclosure model for a CVE determines how and when it enters the public record. Coordinated disclosure—sometimes called responsible disclosure—involves the researcher notifying the affected vendor before public release, giving the vendor time to develop a patch. Google Project Zero popularized the 90-day coordinated disclosure window, after which Project Zero publishes details regardless of patch availability. The 90-day standard has since been widely adopted. Full or immediate disclosure bypasses vendor notification and publishes vulnerability details immediately, a practice controversial in the security community but sometimes employed when researchers believe vendors will not respond or when a vulnerability is already being actively exploited. In 2023, approximately 28,000 new CVEs were assigned—a record at the time, reflecting both the expanding attack surface of modern software and growing investment in vulnerability research.

The 2024 NVD backlog crisis exposed a structural weakness in the CVE-to-NVD enrichment pipeline. Beginning in February 2024, NIST significantly slowed its enrichment of new CVEs with CVSS scores and CPE data, citing resource constraints and a change in its support structure. By May 2024, more than 10,000 CVEs published in 2024 had not yet received CVSS scores from NIST—meaning they appeared in NVD without the severity data that compliance frameworks and vulnerability scanners depend on. Federal agencies operating under FedRAMP and FISMA requirements that mandate patching of High and Critical vulnerabilities found themselves unable to programmatically classify new CVEs that had not yet been enriched. The crisis prompted CISA to work with NIST on a supplemental enrichment program, and a number of commercial vendors began providing their own CVSS scores for unenriched CVEs as a stopgap.

CVSS Scoring

The Common Vulnerability Scoring System (CVSS) is the universal standard for expressing the severity of security vulnerabilities as a numeric score from 0.0 to 10.0. CVSS v3.1 is the current dominant version used in NVD enrichment, though CVSS v4.0 was published by FIRST (Forum of Incident Response and Security Teams) in late 2023 and adoption is gradually increasing. CVSS scores consist of three groups: the Base Score, which reflects the intrinsic characteristics of the vulnerability; the Temporal Score, which adjusts for current exploit availability and remediation status; and the Environmental Score, which adjusts for the specific deployment context of the organization scoring it. NVD publishes Base Scores; Temporal and Environmental scores are typically calculated by the consuming organization or tool.

The CVSS v3.1 Base Score is calculated from eight metrics: four exploitability metrics, Scope, and three impact metrics. Attack Vector (AV) describes how the vulnerability is exploited: Network (remotely exploitable, highest severity contribution), Adjacent (requires access to the same physical or logical network), Local (requires local access to the system), or Physical (requires physical access to the device). Attack Complexity (AC) captures whether the attacker needs to meet special conditions beyond controlling the attack vector: Low means the attack can be carried out reliably at will; High means specific conditions must exist that the attacker cannot control. Privileges Required (PR) indicates the level of authentication needed before exploitation: None, Low (standard user), or High (administrator). User Interaction (UI) indicates whether a human victim must perform an action: None or Required. Scope (S) captures whether the vulnerability can affect components beyond the vulnerable component itself: Unchanged or Changed. Impact is measured by the remaining three metrics, Confidentiality (C), Integrity (I), and Availability (A), each rated None, Low, or High.

The resulting score ranges map to qualitative severity labels used throughout compliance frameworks. None is 0.0. Low is 0.1 to 3.9. Medium is 4.0 to 6.9. High is 7.0 to 8.9. Critical is 9.0 to 10.0. The canonical CVSS 10.0 vector is Attack Vector Network, Attack Complexity Low, Privileges Required None, User Interaction None, Scope Changed, and High impact on all three CIA dimensions. Because the formula caps the result at 10.0, a few other Scope Changed vectors with slightly lower impact also reach the maximum. Log4Shell (CVE-2021-44228) received a CVSS 10.0 score with exactly the canonical vector: it was remotely exploitable over the network without authentication, required no victim interaction, affected all CIA dimensions, and its Scope was Changed because the exploit allowed the attacker to execute code in the context of the server rather than merely the vulnerable component.
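To make the scoring arithmetic concrete, here is a minimal sketch of the v3.1 base score calculation, using the metric weights and rounding rule as I understand them from the FIRST specification. The function names (`base_score`, `severity`, `roundup`) are illustrative, and the sketch does no input validation; treat it as a study aid rather than a replacement for an official calculator.

```python
import math

# Metric weights as published in the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}  # PR weighs more when Scope changes


def roundup(value: float) -> float:
    """Round up to one decimal place using the spec's integer-based rule."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a vector like 'CVSS:3.1/AV:N/AC:L/...'."""
    m = dict(part.split(":") for part in vector.split("/")[1:])
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))


def severity(score: float) -> str:
    """Map a numeric score to its qualitative severity label."""
    if score >= 9.0:
        return "Critical"
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    return "Low" if score > 0.0 else "None"


if __name__ == "__main__":
    log4shell = "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H"
    print(base_score(log4shell), severity(base_score(log4shell)))
```

The Scope Changed branch multiplies the combined sub-scores by 1.08 before capping at 10, which is why a changed-scope vector can saturate the scale.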

The Temporal Score modifies the Base Score based on three metrics that change over time. Exploit Code Maturity reflects whether working exploit code is publicly available: Unproven, Proof-of-Concept, Functional, or High (weaponized). Remediation Level reflects vendor response: Official Fix, Temporary Fix, Workaround, or Unavailable. Report Confidence reflects how well the vulnerability's existence and technical details have been corroborated: Unknown, Reasonable, or Confirmed. A vulnerability with a Functional exploit and no official patch will have a Temporal Score close to its Base Score, while a patched vulnerability with no public exploit will see its Temporal Score reduced, reflecting the lower operational risk. The Environmental Score allows organizations to further adjust for their specific context: a vulnerability in software a given organization does not run contributes nothing to its risk posture, and the Environmental Score allows setting Modified Impact metrics to None for non-applicable components.
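As a worked illustration of the temporal adjustment, the v3.1 specification defines the Temporal Score as the Base Score multiplied by a weight for Exploit Code Maturity (E), Remediation Level (RL), and a third metric, Report Confidence (RC), then rounded up to one decimal. The sketch below assumes the published weights; the function name `temporal_score` is illustrative, and "X" stands for Not Defined.

```python
import math

# Temporal metric weights from the CVSS v3.1 specification ("X" = Not Defined).
E = {"X": 1.0, "H": 1.0, "F": 0.97, "P": 0.94, "U": 0.91}   # Exploit Code Maturity
RL = {"X": 1.0, "U": 1.0, "W": 0.97, "T": 0.96, "O": 0.95}  # Remediation Level
RC = {"X": 1.0, "C": 1.0, "R": 0.96, "U": 0.92}             # Report Confidence


def roundup(value: float) -> float:
    """Round up to one decimal place using the spec's integer-based rule."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def temporal_score(base: float, e: str = "X", rl: str = "X", rc: str = "X") -> float:
    """Scale a base score by the three temporal weights."""
    return roundup(base * E[e] * RL[rl] * RC[rc])


if __name__ == "__main__":
    # Functional exploit, official fix available, confirmed report.
    print(temporal_score(9.8, e="F", rl="O", rc="C"))
    # Unproven exploit, official fix, unknown confidence.
    print(temporal_score(9.8, e="U", rl="O", rc="U"))
```

Every weight is at most 1.0, so a Temporal Score can only lower a Base Score or leave it unchanged.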

CISA KEV: Known Exploited Vulnerabilities

The CISA Known Exploited Vulnerabilities (KEV) catalog is the single most operationally actionable vulnerability dataset produced by the US government. Where CVSS scores express theoretical severity, the KEV catalog expresses confirmed operational reality: every entry in KEV represents a vulnerability that CISA has evidence is being actively exploited in the wild. As of 2024, the catalog contains more than 1,000 entries drawn from the full NVD CVE corpus.

KEV grew out of two federal actions. Executive Order 14028, signed by President Biden in May 2021 following the SolarWinds and Microsoft Exchange compromises, directed a broad overhaul of federal cybersecurity practice. CISA Binding Operational Directive 22-01 (BOD 22-01), issued in November 2021, then established the catalog and made remediation mandatory for all federal civilian executive branch agencies: agencies must remediate each KEV entry by the due date listed in the catalog, which the directive set at two weeks by default and at six months for vulnerabilities with CVE IDs assigned before 2021. BOD 22-01 does not apply to the Department of Defense or intelligence community, which operate under separate directives, but it covers all civilian .gov systems.

The operational significance of KEV is that it dramatically narrows the prioritization problem. Of the approximately 250,000 CVEs in NVD, roughly 60,000 have a High or Critical CVSS score. No organization can realistically patch all of them within days of disclosure. KEV's 1,000+ confirmed-exploited entries represent the vulnerabilities that adversaries have actually weaponized—the ones that federal threat intelligence confirms are being used in intrusion campaigns, ransomware attacks, or targeted exploitation. Research by cybersecurity firms has consistently shown that the majority of ransomware intrusions exploit a small set of CVEs, many of which appear in KEV, and that organizations patching KEV entries on the prescribed schedule significantly reduce their exposure to prevalent threat actor TTPs (tactics, techniques, and procedures). CISA publishes the KEV catalog as a machine-readable JSON feed that can be queried programmatically and cross-referenced against NVD records.
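The prioritization idea can be sketched in a few lines: put KEV-listed findings first, ordered by remediation due date, and fall back to CVSS for everything else. This is a minimal sketch that assumes the KEV feed's `vulnerabilities`, `cveID`, and `dueDate` fields; the `findings` structure is a hypothetical scanner output, and the due dates in the sample catalog are illustrative values rather than copies from the live feed.

```python
from datetime import date


def prioritize(findings: list[dict], kev_catalog: dict) -> list[dict]:
    """Order findings: KEV entries first (earliest due date), then by CVSS."""
    kev_due = {
        entry["cveID"]: date.fromisoformat(entry["dueDate"])
        for entry in kev_catalog.get("vulnerabilities", [])
    }

    def sort_key(finding: dict):
        due = kev_due.get(finding["cve_id"])
        # False sorts before True, so KEV-listed findings come first.
        return (due is None, due or date.max, -finding["cvss"])

    return sorted(findings, key=sort_key)


if __name__ == "__main__":
    # Illustrative stand-in for the parsed KEV JSON feed.
    sample_kev = {
        "vulnerabilities": [
            {"cveID": "CVE-2021-44228", "dueDate": "2021-12-24"},
            {"cveID": "CVE-2023-44487", "dueDate": "2023-10-31"},
        ]
    }
    # Hypothetical scanner findings.
    sample_findings = [
        {"cve_id": "CVE-2014-0160", "cvss": 7.5},
        {"cve_id": "CVE-2023-44487", "cvss": 7.5},
        {"cve_id": "CVE-2021-44228", "cvss": 10.0},
    ]
    for finding in prioritize(sample_findings, sample_kev):
        print(finding["cve_id"], finding["cvss"])
```

Sorting on confirmed exploitation before severity reflects the argument above: a KEV entry with a moderate score is usually a more urgent fix than an unexploited Critical.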

CWE Classification and CPE Product Mapping

NVD enriches each CVE with a Common Weakness Enumeration (CWE) classification that identifies the underlying software security weakness class—the type of coding defect that created the vulnerability. Where CVSS describes how severe the vulnerability is, CWE describes what kind of vulnerability it is. The CWE taxonomy, maintained by MITRE at cwe.mitre.org, contains more than 900 weakness categories organized in a hierarchical tree from abstract weakness classes (CWE-697: Incorrect Comparison) to concrete variant weaknesses (CWE-486: Comparison of Classes by Name).

The most prevalent CWEs in NVD's Critical and High CVEs reveal where software security investment has the greatest return. CWE-787 (Out-of-Bounds Write) is consistently among the most frequent CWEs in Critical vulnerabilities and is the dominant memory safety weakness in C and C++ codebases. CWE-79 (Improper Neutralization of Input During Web Page Generation, commonly called Cross-Site Scripting or XSS) and CWE-89 (SQL Injection) dominate in web application vulnerabilities. CWE-20 (Improper Input Validation) is a broad category that encompasses many injection-type weaknesses. CWE-476 (NULL Pointer Dereference), CWE-416 (Use After Free), and CWE-125 (Out-of-Bounds Read) are the other major memory safety CWEs that the NSA and CISA have highlighted in their joint guidance promoting memory-safe programming languages.

The NSA/CISA “software memory safety” campaign—a series of joint advisories issued beginning in 2022—explicitly used NVD CWE data to demonstrate that memory safety vulnerabilities (CWE-787, CWE-416, CWE-125, CWE-476, CWE-190, CWE-415) constitute a disproportionate share of high-severity CVEs and are nearly exclusive to C and C++ codebases. The advisories recommended that software vendors and federal agencies migrate new code development to memory-safe languages (Rust, Go, Python, Java, C#, Swift) to structurally reduce the incidence of these weakness classes. The 2024 White House Office of the National Cyber Director (ONCD) report “Back to the Building Blocks” extended this recommendation, explicitly citing NVD CWE statistics to argue that memory-unsafe languages produce a systemic vulnerability tax paid by the entire software ecosystem.

CPE (Common Platform Enumeration) is the standardized naming scheme NVD uses to identify which software products and versions are affected by a given CVE. A CPE string encodes vendor, product name, and version in a URI-like format: cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:* identifies Log4j version 2.14.1 by Apache as an application. Vulnerability scanners compare installed software CPEs against NVD's cpeMatch lists to identify affected hosts. CPE matching is the mechanism by which a scanner knows that a server running Apache Log4j 2.14.1 is affected by CVE-2021-44228 and a server running Log4j 2.17.1 (which includes the patch) is not.
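The matching step can be illustrated with a small parser. This is a minimal sketch, assuming a CPE 2.3 formatted string with 13 colon-separated fields and no escaped colons, and assuming purely numeric dotted versions; the version range used in the example is illustrative and is not NVD's actual configuration for CVE-2021-44228. The names `Cpe`, `parse_cpe`, and `in_range` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpe:
    part: str      # "a" application, "o" operating system, "h" hardware
    vendor: str
    product: str
    version: str


def parse_cpe(cpe: str) -> Cpe:
    """Split a CPE 2.3 formatted string into its leading components."""
    fields = cpe.split(":")  # assumption: no escaped colons inside fields
    if len(fields) != 13 or fields[:2] != ["cpe", "2.3"]:
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return Cpe(part=fields[2], vendor=fields[3], product=fields[4], version=fields[5])


def version_tuple(version: str) -> tuple[int, ...]:
    """Convert '2.14.1' to (2, 14, 1) so versions compare numerically."""
    return tuple(int(piece) for piece in version.split("."))


def in_range(version: str, start_including: str, end_excluding: str) -> bool:
    """True if start_including <= version < end_excluding."""
    return version_tuple(start_including) <= version_tuple(version) < version_tuple(end_excluding)


if __name__ == "__main__":
    installed = parse_cpe("cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*")
    print(installed)
    # Illustrative affected range, not taken from the NVD record.
    print(in_range(installed.version, "2.0", "2.15.0"))
    print(in_range("2.17.1", "2.0", "2.15.0"))
```

Comparing versions as integer tuples avoids the string-comparison trap where "2.9" sorts after "2.14".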

High-Profile CVEs

Several CVEs have achieved prominence beyond the security community because of their scale of impact, the novelty of the exploit class, or the downstream damage from exploitation. These landmark vulnerabilities illustrate how CVE records, CVSS scores, and NVD metadata translate into real-world consequences.

CVE-2021-44228 (Log4Shell) is the highest-profile vulnerability of the 2020s. Disclosed in December 2021, it affected Apache Log4j2, the Java logging library used in hundreds of millions of Java applications globally, from enterprise software to game servers. The vulnerability exploited Log4j's JNDI (Java Naming and Directory Interface) lookup feature, which could be triggered by a crafted log message to cause the server to connect to an attacker-controlled LDAP server and execute arbitrary code. The CVSS score was 10.0 Critical—the maximum possible. Exploitation was weaponized within hours of public disclosure, with mass scanning for vulnerable systems beginning within days. CISA added Log4Shell to KEV immediately and issued an emergency directive. The severity was compounded by the difficulty of finding every affected system, since Log4j is often bundled as a transitive dependency several layers deep in software supply chains, meaning systems were affected without their operators knowing Log4j was present at all.

CVE-2017-0144 (EternalBlue) is the vulnerability behind the WannaCry and NotPetya catastrophes of 2017. It exploited a flaw in the Windows SMBv1 (Server Message Block) protocol to achieve unauthenticated remote code execution on Windows systems. Its widely quoted score of 9.3 is a CVSS v2 base score, which falls in v2's High band because v2 has no Critical tier. The exploit was developed by the NSA's Tailored Access Operations unit and leaked publicly by a group calling itself the Shadow Brokers in April 2017, weeks after Microsoft had issued a patch (MS17-010). WannaCry, deployed by North Korean state actors in May 2017, used EternalBlue to self-propagate across networks, encrypting files and demanding ransom, ultimately affecting the UK National Health Service, FedEx, Telefonica, and hundreds of thousands of other systems. NotPetya, deployed by Russian military intelligence (GRU) weeks later, used EternalBlue as a propagation mechanism for what was effectively a destructive cyberattack against Ukrainian critical infrastructure that spilled globally, causing an estimated $10 billion in damage and making it the most destructive cyberattack in recorded history.

CVE-2014-0160 (Heartbleed) remains one of the most consequential vulnerabilities in internet history despite its CVSS score of 7.5 High, a score that seems modest relative to its impact. It exploited a buffer over-read in OpenSSL's implementation of the TLS heartbeat extension, allowing an attacker to read up to 64 kilobytes of server process memory per request without authentication and without leaving any log trace. Because roughly 17 percent of trusted HTTPS servers were running a vulnerable OpenSSL build at the time of disclosure, Heartbleed exposed server private keys, session tokens, and user credentials across a substantial fraction of the encrypted web. The lower CVSS score relative to Log4Shell reflects that Heartbleed is a data-exposure vulnerability rather than remote code execution (it cannot by itself write to or execute code on the affected system), but its silent exploitability and the sensitivity of the data exposed made it uniquely dangerous.

CVE-2023-44487 (HTTP/2 Rapid Reset Attack) is notable as a CVE assigned to a protocol-level denial-of-service technique rather than an implementation defect in a single product. Disclosed in October 2023, it exploited a design characteristic of the HTTP/2 protocol's stream cancellation feature to generate request floods of unprecedented scale: Cloudflare reported an attack peaking at 201 million requests per second, approximately three times its previous record, and Google reported a peak of 398 million requests per second. Cloudflare, Google, and AWS all observed and mitigated attacks using this technique before coordinated disclosure. Mitigation required updates to HTTP/2 server implementations across the ecosystem, and the vulnerability illustrated how protocol-level design decisions can create exploitable amplification primitives at internet scale.

Compliance Applications

NVD CVSS scores are the quantitative foundation of vulnerability management requirements in every major US federal and commercial compliance framework. Understanding where and how these frameworks cite NVD is essential for interpreting their requirements operationally.

NIST SP 800-53 (Security and Privacy Controls for Information Systems and Organizations) defines the control catalog used by all federal information systems under FISMA (Federal Information Security Modernization Act). Control RA-5 (Vulnerability Monitoring and Scanning) requires agencies to scan for vulnerabilities and remediate them in accordance with risk assessments, which in practice means using CVSS scores to establish remediation priority and timelines. FedRAMP (Federal Risk and Authorization Management Program), which authorizes cloud service offerings for federal use, requires cloud providers to remediate Critical and High vulnerabilities within 30 days of discovery and Medium vulnerabilities within 90 days—a timeline that assumes CVSS-based severity classification derived from NVD.

PCI DSS (Payment Card Industry Data Security Standard) requires quarterly vulnerability scans of cardholder data environments. For external scans performed by an Approved Scanning Vendor (ASV), any vulnerability with a CVSS base score of 4.0 or higher causes the scan to fail, so mandatory remediation reaches well below the High threshold; internal scans must resolve high-risk and critical vulnerabilities. PCI DSS v4.0, which replaced v3.2.1 in 2024, tightened these requirements and added explicit requirements for targeted risk analysis to address lower-ranked vulnerabilities found by internal scans. ASVs accredited by the PCI Security Standards Council use CVSS scores, typically those published in NVD, as their severity basis. SOC 2 auditors reviewing a vendor's vulnerability management program similarly reference CVSS thresholds as the industry-standard severity metric.

Vendor risk assessment programs at large organizations increasingly use the NVD API to perform automated checks on software used in their supply chain. A procurement workflow might query the NVD API for all Critical CVEs affecting a vendor's software stack, flag those that appear in CISA KEV as immediate remediation requirements, and produce a risk score for vendor software based on the density and recency of High and Critical unpatched CVEs. This use of NVD as a continuous vendor intelligence feed rather than a periodic scan result reflects the shift from point-in-time vulnerability assessments to continuous monitoring that NIST SP 800-137 (Information Security Continuous Monitoring) describes.

NVD API

The NVD REST API v2.0, documented at nvd.nist.gov/developers/vulnerabilities, provides programmatic access to the full NVD CVE corpus. The primary endpoint is /rest/json/cves/2.0, which returns paginated JSON results with rich metadata for each matching CVE record. The API is free to use without registration, though obtaining a free API key raises the rate limit from 5 requests per 30 seconds to 50 requests per 30 seconds. API keys are available at nvd.nist.gov/developers/request-an-api-key and are issued automatically. For bulk downloads or high-frequency monitoring pipelines, the API key is effectively required.

The API supports a rich set of query parameters. Date range filtering uses pubStartDate and pubEndDate (accepting ISO 8601 format with millisecond precision), which must be supplied together. Severity filtering uses cvssV3Severity with values CRITICAL, HIGH, MEDIUM, or LOW. CWE filtering uses cweId to retrieve all CVEs classified under a specific weakness. CPE filtering uses cpeName to retrieve all CVEs associated with a specific product and version, or virtualMatchString to match a broader CPE pattern; the cpeMatchString parameter from API 1.0 was not carried into 2.0. This is the mechanism vulnerability scanners use to find CVEs for installed components. The hasKev parameter, passed without a value, filters results to only CVEs that appear in the CISA KEV catalog. Keyword search is available via keywordSearch. Pagination uses startIndex and resultsPerPage, with a maximum page size of 2,000 results.

Each CVE record in the API response includes the CVE ID and description, the CVSS metrics (cvssMetricV31 for CVSS v3.1 scores, cvssMetricV40 for v4.0 where available), the weaknesses array containing CWE classifications, the configurations array containing CPE match strings for affected products, and a references array of external links including vendor advisories, patch documentation, and exploit proof-of-concept references. For KEV entries the API also exposes the cisaExploitAdd field, indicating the date CISA added the vulnerability to its catalog, alongside cisaActionDue for the remediation due date.

A key operational constraint is the NVD API's 120-day maximum date range window. Queries spanning more than 120 days must be split into multiple requests covering sub-ranges and then merged client-side. For full-year analysis, splitting by quarter is a reliable approach, since no quarter exceeds 92 days. The API does not pace requests for you: a client that pages too quickly simply exceeds the rate limit and has requests rejected. Spacing requests about 0.6 seconds apart with a key (50 per 30 seconds) or 6 seconds apart without one (5 per 30 seconds) keeps a bulk retrieval inside the limit, and NIST's own guidance recommends pausing for several seconds between requests.
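Splitting an arbitrary range into compliant windows is a small exercise in date arithmetic. The sketch below is one way to do it; `split_date_range` and `nvd_timestamp` are hypothetical helper names. Adjacent windows share a boundary instant, so a client merging results should deduplicate by CVE ID.

```python
from datetime import datetime, timedelta

MAX_WINDOW = timedelta(days=120)  # NVD's maximum span for a date-range query


def split_date_range(
    start: datetime, end: datetime, max_window: timedelta = MAX_WINDOW
) -> list[tuple[datetime, datetime]]:
    """Split [start, end] into consecutive windows no longer than max_window."""
    if end <= start:
        raise ValueError("end must be after start")
    windows = []
    cursor = start
    while cursor < end:
        window_end = min(cursor + max_window, end)
        windows.append((cursor, window_end))
        cursor = window_end
    return windows


def nvd_timestamp(moment: datetime) -> str:
    """Format a datetime in the millisecond ISO 8601 form used in this article."""
    return moment.strftime("%Y-%m-%dT%H:%M:%S.000")


if __name__ == "__main__":
    for window_start, window_end in split_date_range(datetime(2024, 1, 1), datetime(2025, 1, 1)):
        print(nvd_timestamp(window_start), "->", nvd_timestamp(window_end))
```

Fixed 120-day windows need one request series fewer than monthly chunks would, while quarterly splitting (used in the script below) keeps the boundaries easier to reason about.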

Python: Critical CVE Analysis for 2024

The following script uses the NVD REST API and the CISA KEV JSON feed to retrieve all Critical-severity CVEs published in 2024, parse each record for its CVSS score, CWE classification, and affected CPE products, and compute three analyses: weekly publication volume throughout 2024 to identify disclosure spikes, the top-10 CWE weakness categories among Critical CVEs, and the percentage of 2024 Critical CVEs that appear in the CISA KEV catalog as confirmed-exploited vulnerabilities. The script handles the 120-day NVD API date window by splitting 2024 into quarters and merges results into a single pandas DataFrame for analysis.

import time
from collections import Counter

import pandas as pd
import requests

# ---------------------------------------------------------------------------
# NIST NVD API: Critical CVEs published in 2024: weekly counts, top CWEs,
# and CISA KEV overlap.
#
# NVD REST API v2.0:
#   Base: https://services.nvd.nist.gov/rest/json/cves/2.0
#   Auth: optional API key via header "apiKey" for higher rate limit
#         With key:    50 requests / 30 seconds
#         Without key:  5 requests / 30 seconds
#   Pagination: resultsPerPage (max 2000), startIndex
#   Date filter: pubStartDate / pubEndDate (ISO 8601: YYYY-MM-DDThh:mm:ss.sss)
#   Severity filter: cvssV3Severity=CRITICAL
#
# CISA KEV JSON feed:
#   https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json
# ---------------------------------------------------------------------------

NVD_BASE = "https://services.nvd.nist.gov/rest/json/cves/2.0"
KEV_URL  = "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"

# Optional: set your NVD API key here for higher rate limits
# Register free at https://nvd.nist.gov/developers/request-an-api-key
API_KEY = None  # or "your-api-key-here"

HEADERS = {"apiKey": API_KEY} if API_KEY else {}

# Pause between NVD requests so bulk retrieval stays under the rate limit:
# 0.6 s per request with a key (50 / 30 s), 6 s without one (5 / 30 s).
REQUEST_DELAY = 0.6 if API_KEY else 6.0


def fetch_nvd_page(
    pub_start: str,
    pub_end: str,
    severity: str = "CRITICAL",
    start_index: int = 0,
    results_per_page: int = 2000,
) -> dict:
    """Fetch one page of NVD CVE results with date and severity filters."""
    params = {
        "pubStartDate": pub_start,
        "pubEndDate":   pub_end,
        "cvssV3Severity": severity,
        "startIndex":   start_index,
        "resultsPerPage": results_per_page,
    }
    resp = requests.get(NVD_BASE, params=params, headers=HEADERS, timeout=60)
    resp.raise_for_status()
    return resp.json()


def fetch_all_critical_cves_2024() -> list[dict]:
    """
    Retrieve all Critical-severity CVEs published in calendar year 2024.
    NVD limits date range queries to 120 days; we split 2024 into quarters.
    """
    quarters = [
        ("2024-01-01T00:00:00.000", "2024-03-31T23:59:59.999"),
        ("2024-04-01T00:00:00.000", "2024-06-30T23:59:59.999"),
        ("2024-07-01T00:00:00.000", "2024-09-30T23:59:59.999"),
        ("2024-10-01T00:00:00.000", "2024-12-31T23:59:59.999"),
    ]
    all_vulns = []
    for start, end in quarters:
        start_index = 0
        while True:
            data = fetch_nvd_page(start, end, start_index=start_index)
            time.sleep(REQUEST_DELAY)  # pace requests to respect the rate limit
            vulns = data.get("vulnerabilities", [])
            all_vulns.extend(vulns)
            total = data.get("totalResults", 0)
            start_index += len(vulns)
            # Stop on the last page, or on an empty page so that a short
            # response cannot cause an infinite loop.
            if not vulns or start_index >= total:
                break
        print(f"  Fetched quarter {start[:10]} to {end[:10]}: {len(all_vulns)} cumulative")
    return all_vulns


def parse_cve_record(vuln: dict) -> dict | None:
    """
    Extract key fields from a raw NVD CVE JSON record.
    Returns dict with id, published, cvss_score, cwe_id, cpe_products list.
    """
    cve = vuln.get("cve", {})
    cve_id = cve.get("id", "")
    published = cve.get("published", "")[:10]   # YYYY-MM-DD

    # CVSS v3.1 base score (prefer v3.1 over v3.0)
    cvss_score = None
    for metric_key in ("cvssMetricV31", "cvssMetricV30"):
        metrics = cve.get("metrics", {}).get(metric_key, [])
        if metrics:
            cvss_score = metrics[0].get("cvssData", {}).get("baseScore")
            break

    # CWE classification (first weakness listed)
    cwe_id = None
    weaknesses = cve.get("weaknesses", [])
    for weakness in weaknesses:
        for desc in weakness.get("description", []):
            if desc.get("lang") == "en" and desc.get("value", "").startswith("CWE-"):
                cwe_id = desc["value"]
                break
        if cwe_id:
            break

    # Affected CPE products. In the API 2.0 layout each configuration holds a
    # list of nodes, and each node holds the cpeMatch entries.
    cpe_products = []
    for config in cve.get("configurations", []):
        for node in config.get("nodes", []):
            for cpe_match in node.get("cpeMatch", []):
                if cpe_match.get("vulnerable"):
                    cpe_products.append(cpe_match.get("criteria", ""))

    return {
        "cve_id":       cve_id,
        "published":    published,
        "cvss_score":   cvss_score,
        "cwe_id":       cwe_id,
        "cpe_products": cpe_products,
    }


# ---------------------------------------------------------------------------
# Step 1: Load CISA KEV catalog
# ---------------------------------------------------------------------------

print("Fetching CISA KEV catalog...")
kev_resp = requests.get(KEV_URL, timeout=30)
kev_resp.raise_for_status()
kev_data = kev_resp.json()
kev_cve_ids = {v["cveID"] for v in kev_data.get("vulnerabilities", [])}
print(f"CISA KEV catalog: {len(kev_cve_ids)} entries total")

# ---------------------------------------------------------------------------
# Step 2: Fetch all Critical CVEs published in 2024
# ---------------------------------------------------------------------------

print("\nFetching Critical CVEs published in 2024 from NVD (4 quarters)...")
raw_vulns = fetch_all_critical_cves_2024()

# Parse records
records = []
for vuln in raw_vulns:
    parsed = parse_cve_record(vuln)
    if parsed:
        records.append(parsed)

df = pd.DataFrame(records)
df["published"] = pd.to_datetime(df["published"])
print(f"\nTotal Critical CVEs parsed: {len(df)}")

# ---------------------------------------------------------------------------
# Step 3: Weekly counts of Critical CVEs throughout 2024
# ---------------------------------------------------------------------------

df["week"] = df["published"].dt.to_period("W")
weekly = df.groupby("week").size().rename("critical_cves")
weekly_df = weekly.reset_index()
weekly_df["week_str"] = weekly_df["week"].astype(str)

print("\n--- Weekly Critical CVE Publication Counts, 2024 ---")
print(f"Mean CVEs per week:    {weekly.mean():.1f}")
print(f"Median CVEs per week:  {weekly.median():.1f}")
print(f"Peak week:             {weekly.idxmax()} ({weekly.max()} CVEs)")
print(f"Lowest week:           {weekly.idxmin()} ({weekly.min()} CVEs)")
print()

# Top 5 highest-volume weeks
top5_weeks = weekly_df.nlargest(5, "critical_cves")[["week_str", "critical_cves"]]
print("Top 5 weeks by Critical CVE volume:")
print(top5_weeks.to_string(index=False))

# ---------------------------------------------------------------------------
# Step 4: Top-10 CWE types among Critical CVEs in 2024
# ---------------------------------------------------------------------------

cwe_counts = Counter(df["cwe_id"].dropna().tolist())
top10_cwe = cwe_counts.most_common(10)

# Human-readable CWE names for common entries
CWE_NAMES = {
    "CWE-787": "Out-of-Bounds Write",
    "CWE-89":  "SQL Injection",
    "CWE-79":  "Cross-Site Scripting (XSS)",
    "CWE-20":  "Improper Input Validation",
    "CWE-125": "Out-of-Bounds Read",
    "CWE-416": "Use After Free",
    "CWE-22":  "Path Traversal",
    "CWE-78":  "OS Command Injection",
    "CWE-476": "NULL Pointer Dereference",
    "CWE-190": "Integer Overflow",
    "CWE-94":  "Code Injection",
    "CWE-502": "Deserialization of Untrusted Data",
    "CWE-77":  "Command Injection",
    "CWE-269": "Improper Privilege Management",
    "CWE-306": "Missing Authentication for Critical Function",
}

print("\n--- Top 10 CWE Classifications in Critical CVEs, 2024 ---")
for cwe_id, count in top10_cwe:
    name = CWE_NAMES.get(cwe_id, "See cwe.mitre.org")
    pct  = 100 * count / len(df)   # share of all Critical CVEs, with or without a CWE
    print(f"  {cwe_id:<10} {name:<40} {count:>4} ({pct:.1f}% of Critical CVEs)")

no_cwe = df["cwe_id"].isna().sum()
print(f"  (No CWE assigned: {no_cwe} CVEs, {100*no_cwe/len(df):.1f}%)")

# ---------------------------------------------------------------------------
# Step 5: CISA KEV overlap with 2024 Critical CVEs
# ---------------------------------------------------------------------------

df["in_kev"] = df["cve_id"].isin(kev_cve_ids)
kev_in_2024_critical = df["in_kev"].sum()
kev_pct = 100 * kev_in_2024_critical / len(df)

print(f"\n--- CISA KEV Overlap with 2024 Critical CVEs ---")
print(f"Critical CVEs in 2024:          {len(df)}")
print(f"Of those, in CISA KEV:          {kev_in_2024_critical} ({kev_pct:.1f}%)")
print(f"Not yet confirmed exploited:    {len(df) - kev_in_2024_critical} ({100-kev_pct:.1f}%)")

# KEV entries by CWE
kev_df = df[df["in_kev"]]
kev_cwe = Counter(kev_df["cwe_id"].dropna().tolist())
print("\nTop CWEs in KEV-confirmed Critical CVEs (2024):")
for cwe_id, count in kev_cwe.most_common(5):
    name = CWE_NAMES.get(cwe_id, "See cwe.mitre.org")
    print(f"  {cwe_id:<10} {name:<40} {count}")

# CVSS score distribution for KEV vs non-KEV
print(f"\nMean CVSS score (KEV-confirmed):     {kev_df['cvss_score'].mean():.2f}")
print(f"Mean CVSS score (not in KEV):         {df[~df['in_kev']]['cvss_score'].mean():.2f}")

Running this analysis against the 2024 NVD dataset will typically show several hundred to over a thousand Critical CVEs published each quarter, with occasional weekly spikes corresponding to coordinated disclosure batches from major vendors (Microsoft Patch Tuesday, Oracle Critical Patch Update, and Adobe Security Bulletins each generate sharp weekly spikes). Because the script filters on cvssV3Severity, any CVE that carried no CVSS v3 score in NVD at query time is not counted, which matters for 2024 given the enrichment backlog described earlier. The CWE distribution will be dominated by memory safety weaknesses (CWE-787 Out-of-Bounds Write, CWE-416 Use After Free) and injection weaknesses (CWE-89 SQL Injection, CWE-78 OS Command Injection); the memory safety share is consistent with the patterns that NSA and CISA highlight in their memory-safe programming advisories.

The KEV overlap, meaning the fraction of 2024 Critical CVEs confirmed as actively exploited, will typically be in the low single-digit percentage range. That appears small but represents the highest-priority remediation targets: a Critical CVE in KEV means an adversary is already using it operationally, and federal agencies must remediate it by the due date CISA sets in the catalog. The final comparison of mean CVSS scores for KEV-confirmed versus non-KEV entries may show KEV entries clustering toward the upper end of the Critical range, but since every record in this dataset already scores 9.0 or higher, the difference is likely to be small and should be interpreted cautiously.
