
How Voidly measures DNS-layer censorship: dual-resolver design, interference classification, and false positive mitigations

August 28, 2025 · 12 min read · AI Analytics

Censorship · Voidly · DNS · Methodology

DNS is where internet censorship almost always starts. Before a browser can make an HTTP request, before a TLS handshake can be negotiated, before a single byte of application data flows — the client has to look up the domain name. For a government or ISP that wants to block access to a site, intercepting that lookup is the cheapest possible intervention: DNS queries use UDP, ISP recursive resolvers are centralized chokepoints that every subscriber on the network uses by default, and the ISP never needs to inspect HTTPS traffic or maintain a stateful TCP interception proxy. A forged NXDOMAIN response is three lines of BIND configuration.

DNS censorship is also the most detectable form of interference. The evidence — a wrong IP address, an NXDOMAIN for a domain that resolves fine from a neutral server, or simply no response — is unambiguous in ways that TCP RST injection and HTTP block pages are not. A TCP reset could be a transient network error. An HTTP 403 could be a server-side access control. A DNS NXDOMAIN for a domain that resolves correctly from 8.8.8.8 is almost always censorship or a seriously misconfigured resolver.

This article goes deep on the DNS measurement layer inside Voidly: how we structure the dual-resolver comparison, the four interference types we classify, the false positive sources that are genuinely tricky to handle, and the diagnostic queries we run over DoH and DoT when the ISP resolver shows interference. The dataset schema for all DNS fields is documented in the field-by-field schema reference; this article is about the measurement design and the reasoning behind it.

DNS as the first censorship layer

The economics of DNS-based blocking favor the censor. Most ISP networks serve DNS via a small number of recursive resolvers — typically one or two per POP, shared among hundreds of thousands of subscribers. Censoring at DNS requires no per-connection infrastructure, no deep packet inspection hardware, and no modification to the transport layer. The ISP simply adds a zone override: for domain X, return NXDOMAIN or return IP Y.

The geopolitical reality matches. Turkey's BTK issues blocking orders that require ISPs to update DNS responses within 24 hours; compliance is monitored by the regulator. Pakistan's PTA sends similar directives. Iran's Ministry of ICT operates the national filtering system as a centralized gateway that all ISP traffic passes through, and that gateway handles DNS interception centrally rather than pushing configuration to individual ISPs. Russia's TSPU (“sovereign internet”) system intercepts DNS at the IXP level via deep packet inspection boxes, but the result at the application layer is the same: a subscriber trying to resolve a blocked domain gets a forged answer.

China's Great Firewall is the canonical example at scale. The GFW injects false DNS responses for tens of thousands of blocked domains, returning addresses that lead nowhere: 127.0.0.1, 8.7.198.45, or addresses in the 243.185.187.0/24 range, which sits in reserved space (240.0.0.0/4) and is not announced to the global routing table. These injections happen on the network path rather than on the subscriber's host or inside any single resolver, so every subscriber in China receives the same kind of forged answer, whichever Chinese ISP resolver they use.

The four DNS interference types

Voidly classifies DNS interference into four types. Each represents a distinct mechanism and carries different false-positive risk, which is why the classifier treats them separately rather than collapsing them into a single “DNS blocked” label.

NXDOMAIN injection

The ISP resolver returns NXDOMAIN (DNS response code 3: “name does not exist”) for a domain that resolves successfully from a neutral resolver. This is conceptually simple and mechanically easy to implement — it requires no IP address infrastructure on the censor's side, just a zone override that returns NXDOMAIN for the blocked domain. The subscriber's browser gets a name resolution failure and stops there.

NXDOMAIN injection is dominant in Turkey and Pakistan. Turkey's BTK-ordered blocks have historically used NXDOMAIN as the primary mechanism for major carriers (Türk Telekom, Vodafone Turkey, Turkcell), though some carriers have shifted to IP-redirect methods in recent years to allow them to serve the mandatory legal notice page. Pakistan's PTA blocks are also predominantly NXDOMAIN across all major ISPs.

False positive risk is low but not zero: if a domain has genuinely expired or been delisted from its registrar, it will return NXDOMAIN from all resolvers. The Voidly comparison mitigates this by checking NXDOMAIN disagreement (probe gets NXDOMAIN, control gets NOERROR) rather than treating any NXDOMAIN as interference.

IP spoofing / DNS injection

The ISP resolver returns NOERROR but with one or more IP addresses that do not actually serve the blocked domain. The returned IPs typically fall into three categories: (1) a local ISP redirect page server that displays a legal notice, (2) a government-operated warning page server, or (3) bogon addresses — IPs in ranges that are either reserved (RFC 1918, loopback) or not globally routed, which cause the connection to simply fail at the TCP layer.

The GFW is the most documented example of category 3. For domains blocked in China, the GFW injects responses with addresses like 127.0.0.1 (loopback, connection immediately refused), 8.7.198.45 (a well-known GFW injection IP, not routed to anything reachable from outside China), and addresses in the 243.185.187.0/24 range (not globally announced). OONI and the academic DNS measurement community have documented these injection addresses extensively since at least 2012.
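The three example addresses behave differently under a plain range check, which is why a curated list of injection IPs is needed alongside a bogon test. A minimal sketch using only Python's standard ipaddress module; the helper name describe_injected_ip is illustrative and not part of Voidly's code:

```python
from ipaddress import ip_address

def describe_injected_ip(ip: str) -> str:
    """Label an address using only standard-library range checks."""
    addr = ip_address(ip)
    if addr.is_loopback:
        return "loopback"
    if addr.is_reserved:
        return "reserved"        # e.g. 240.0.0.0/4, which contains 243.185.187.0/24
    if addr.is_private:
        return "private"
    return "looks-routable"      # a range check alone cannot flag this address

for ip in ("127.0.0.1", "243.185.187.39", "8.7.198.45"):
    print(ip, describe_injected_ip(ip))
```

The range checks catch 127.0.0.1 and 243.185.187.39, but 8.7.198.45 sits in ordinary unicast space and can only be recognized by matching it against previously observed injection addresses.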

Category 1 (ISP redirect page) is common in Germany for adult content blocks, in India for court-ordered blocks (where some ISPs return the IP of a carrier landing page), and in several Southeast Asian countries. The Voidly classifier checks whether the returned IP belongs to a known block-page ASN as part of the interference typing.

Empty answer (NOERROR with no records)

The resolver returns NOERROR but the answer section is empty: no A records, no AAAA records, no CNAME chain. The DNS protocol allows this: NOERROR with an empty answer (often called NODATA) is the correct response for a name that exists but has no records of the queried type. In practice this legitimately occurs for domains that have A records but no AAAA records when queried with QTYPE=AAAA.

As a censorship mechanism, NOERROR with an empty answer is less common but appears in some Iran configurations: the ISP resolver acknowledges the query without returning a usable answer, which prevents the subscriber from connecting without the resolver technically lying about whether the domain exists. The “domain exists” answer may be intended to make the interference less obvious to automated detection tools that only check RCODE.

False positive risk is higher here than for NXDOMAIN injection. Misconfigured domains and domains in DNS propagation can produce empty answers legitimately. Voidly treats empty answers as a weaker signal than NXDOMAIN disagreement and requires corroborating evidence (TCP failure, HTTP anomaly, or other probes from the same country seeing the same pattern) before classifying as interference.
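The difference between NXDOMAIN and an empty NOERROR answer is visible in the 12-byte DNS header alone, which is also why a tool that reads only RCODE misses the latter. A minimal sketch of a header-only classifier, assuming the raw response bytes are already in hand; the function name and return labels are illustrative:

```python
import struct

def classify_header(response: bytes) -> str:
    """Classify a raw DNS response using only RCODE and ANCOUNT from the header."""
    if len(response) < 12:
        raise ValueError("DNS header is 12 bytes; response is too short")
    _id, flags, _qdcount, ancount, _nscount, _arcount = struct.unpack(
        ">HHHHHH", response[:12]
    )
    rcode = flags & 0x000F          # low four bits of the flags word
    if rcode == 3:
        return "nxdomain"
    if rcode == 2:
        return "servfail"
    if rcode == 0 and ancount == 0:
        return "empty_answer"       # NOERROR with nothing in the answer section
    if rcode == 0:
        return "answer"
    return f"rcode_{rcode}"
```

A checker that looked only at RCODE would treat the "empty_answer" and "answer" cases identically, since both carry RCODE 0.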

Timeout and SERVFAIL

The resolver drops the UDP query without responding (timeout after the probe's 2-second deadline), or returns RCODE 2 (SERVFAIL, “the server failed to complete the request”). Timeout- and SERVFAIL-based interference is common in Iran, where some IRGC-adjacent filtering configurations simply drop queries for blocked domains rather than returning a forged answer. It also appears in some Myanmar ISP configurations.

Timeout-based interference has the highest false positive rate of the four types: a resolver under load, a transient network failure, or a UDP packet loss on a poor connection all produce the same observable — no response within 2 seconds. Voidly mitigates this by retrying with TCP transport (DNS over TCP) on timeout, requiring at least three probes from the same country/ASN to see the same timeout pattern within a 24-hour window before flagging it as interference, and requiring that the control resolver successfully returned a response for the same domain.
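The corroboration rule can be sketched as a sliding-window count of distinct probes. This is an illustration under stated assumptions rather than Voidly's implementation: the TimeoutObservation record and its field names are hypothetical, and "same country/ASN" is read here as matching both values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class TimeoutObservation:          # hypothetical record shape
    probe_id: str
    country: str
    asn: int
    domain: str
    seen_at: datetime
    control_answered: bool         # control resolver returned a response

def timeout_corroborated(
    observations: list[TimeoutObservation],
    country: str,
    asn: int,
    domain: str,
    min_probes: int = 3,
    window: timedelta = timedelta(hours=24),
) -> bool:
    """True if enough distinct probes saw the timeout inside one window."""
    relevant = sorted(
        (o for o in observations
         if o.country == country and o.asn == asn
         and o.domain == domain and o.control_answered),
        key=lambda o: o.seen_at,
    )
    # Slide a window that starts at each observation in turn
    for i, first in enumerate(relevant):
        probes = {o.probe_id for o in relevant[i:]
                  if o.seen_at - first.seen_at <= window}
        if len(probes) >= min_probes:
            return True
    return False
```

Counting distinct probe IDs rather than raw observations keeps one probe on a flaky connection from corroborating itself.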

The dual-resolver measurement design

Every Voidly DNS measurement queries two resolvers: the probe's local ISP resolver and a neutral control resolver from a hardcoded list. The comparison between the two results is what distinguishes interference from legitimate network state.

The local ISP resolver is discovered at probe startup using the OS-level resolver configuration (/etc/resolv.conf on Linux, the Network framework on macOS, or GetNetworkParams() on Windows). The probe records the resolver IP at the start of each measurement cycle and checks whether it has changed between cycles. A resolver change mid-cycle causes the current cycle to be discarded and re-run from the new resolver.
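On Linux, the discovery step amounts to reading the nameserver lines from /etc/resolv.conf. A minimal sketch of that parsing, assuming the file contents have already been read into a string; the function name is illustrative and the macOS and Windows paths are not covered:

```python
from ipaddress import ip_address

def parse_resolv_conf(text: str) -> list[str]:
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in text.splitlines():
        # Drop trailing comments introduced by '#' or ';'
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            try:
                # Strip an IPv6 zone ID (fe80::1%eth0) before validating
                ip_address(fields[1].split("%", 1)[0])
            except ValueError:
                continue            # skip entries that are not IP addresses
            servers.append(fields[1])
    return servers
```

One caveat: on hosts running systemd-resolved the file typically lists the local stub 127.0.0.53, so a probe would need to look further to find the upstream ISP resolver.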

The neutral control resolvers are hardcoded: 8.8.8.8 (Google Public DNS), 1.1.1.1 (Cloudflare), and 9.9.9.9 (Quad9). The probe rotates through these across measurement cycles rather than using the same control resolver for every query. This rotation prevents a measurement artifact where a single control resolver with a temporary outage or ECS configuration difference causes a systematic false-positive pattern. Each DNS result records which control resolver was used.
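One simple way to rotate is to index the list by measurement cycle number. The source does not say how the rotation is scheduled, so the round-robin order below is an assumption, and control_for_cycle is a hypothetical helper:

```python
CONTROL_RESOLVERS = ("8.8.8.8", "1.1.1.1", "9.9.9.9")

def control_for_cycle(cycle: int) -> str:
    """Pick the control resolver for a given measurement cycle (round-robin)."""
    return CONTROL_RESOLVERS[cycle % len(CONTROL_RESOLVERS)]

# Successive cycles use Google, Cloudflare, Quad9, then wrap around
print([control_for_cycle(n) for n in range(4)])
```

A deterministic schedule like this makes it easy to check afterwards whether an anomaly tracks one control resolver rather than the ISP resolver.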

The Rust struct that holds a single DNS measurement result:

use std::net::IpAddr;

struct DnsResult {
    resolver: IpAddr,
    resolver_type: ResolverType,  // ISP | Control
    query_domain: String,
    response_code: u16,           // RCODE: 0=NOERROR, 2=SERVFAIL, 3=NXDOMAIN, 5=REFUSED
    answer_ips: Vec<IpAddr>,
    answer_ttls: Vec<u32>,
    cname_chain: Vec<String>,
    response_ms: f64,
    truncated: bool,
    qname_minimization: bool,     // probe detected QNAME minimization support
}

The resolver_type enum is recorded so that downstream analysis can always identify which resolver produced which result. The cname_chain field captures the full CNAME resolution chain, which matters for CDN-served domains where the final IP often comes through a CNAME to an Akamai or Cloudflare edge domain rather than being a direct A record. A censored CNAME chain — one that terminates earlier than expected, or at a different point than the control chain — is itself a distinguishable interference pattern.
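Comparing two CNAME chains comes down to finding the first hop where they disagree. A minimal sketch, assuming chains are lists of hostnames in resolution order; the function name is illustrative:

```python
from typing import Optional

def cname_divergence(probe_chain: list[str], control_chain: list[str]) -> Optional[int]:
    """Return the index of the first differing hop, or None if the chains match."""
    def norm(name: str) -> str:
        # DNS names are case-insensitive; a trailing dot is only notation
        return name.rstrip(".").lower()

    probe = [norm(n) for n in probe_chain]
    control = [norm(n) for n in control_chain]
    for i, (a, b) in enumerate(zip(probe, control)):
        if a != b:
            return i
    if len(probe) != len(control):
        return min(len(probe), len(control))   # one chain stops early
    return None
```

An index equal to the length of the probe chain means the probe's chain terminated earlier than the control's. Legitimate CDN steering can also produce different final hops, so a divergence is a lead for the classifier rather than a verdict.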

qname_minimization records whether the ISP resolver uses QNAME minimization (RFC 7816, since obsoleted by RFC 9156), which sends only the minimal name necessary to each authoritative server during recursive resolution rather than the full query name. This affects how certain ECS-aware authoritative servers respond and is needed context when evaluating whether a difference between probe and control IPs is due to topology or interference.

False positive sources and mitigations

The hardest part of DNS censorship measurement is not detecting interference when it exists — it is avoiding false positives from legitimate DNS behavior that makes probe and control results legitimately differ. Four sources cause the most problems.

CDN geofencing via EDNS Client Subnet

Akamai, Fastly, CloudFront, and other CDN providers use EDNS Client Subnet (ECS, RFC 7871) to return topologically optimal IP addresses. When a query arrives at the CDN's authoritative DNS server with an ECS option containing the subscriber's /24, the authoritative server returns the nearest PoP's IP. A probe in Istanbul and the Voidly control server in Frankfurt will receive different IPs for the same CDN-hosted domain — by design, not by interference.

Voidly handles this with an ASN-level check before flagging IP differences. If both the probe's returned IPs and the control's returned IPs belong to the same CDN's ASN space, the difference is classified as expected CDN behavior rather than interference:

from ipaddress import ip_address
from typing import List

# AS numbers for major CDNs, kept as a flat dict of ASN -> CDN name
CDN_ASNS: dict[int, str] = {
    20940: 'Akamai',
    16625: 'Akamai',
    54113: 'Fastly',
    16509: 'Amazon CloudFront',
    14618: 'Amazon CloudFront',
    13335: 'Cloudflare',
    209242: 'Cloudflare',
    15169: 'Google',
    396982: 'Google',
}

def is_cdn_expected_difference(
    probe_ips: List[str],
    control_ips: List[str],
    probe_asn: int,
) -> bool:
    """
    Return True if a difference in DNS responses between probe and control
    is explained by CDN geographic load balancing.

    Probe IPs and control IPs must both map to the same CDN provider for
    this to pass. If the probe got a CDN IP and the control got a non-CDN
    IP, that is not expected CDN behavior and should still be flagged.
    """
    def asns_for(ips: List[str]) -> set[int]:
        # resolve_asn() is defined elsewhere; it queries the local GeoIP/ASN
        # database and returns None for addresses it cannot map
        asns = {resolve_asn(ip_address(ip)) for ip in ips}
        asns.discard(None)
        return asns

    probe_asns   = asns_for(probe_ips)
    control_asns = asns_for(control_ips)

    probe_cdn_names   = {CDN_ASNS[a] for a in probe_asns if a in CDN_ASNS}
    control_cdn_names = {CDN_ASNS[a] for a in control_asns if a in CDN_ASNS}

    if not probe_cdn_names or not control_cdn_names:
        return False

    # Both sides resolve to the same CDN provider
    return bool(probe_cdn_names & control_cdn_names)

An important edge case: if a probe receives a CDN IP (ASN 20940, Akamai) but the control resolves to the origin ASN — a non-CDN IP — that is not expected CDN behavior and the difference is still flagged. This catches the case where the ISP has redirected the probe to Akamai's error page ASN rather than the domain's actual CDN distribution.

EDNS client subnet propagation delay

Even when CDN geofencing is not in play, authoritative servers can return different answers depending on where a query appears to come from: the client subnet carried in the ECS option when the recursive resolver sends one, or the resolver's own source address when it does not. The probe's ISP resolver and the control resolver at 8.8.8.8 therefore present different network locations to the authoritative server. Authoritative servers with global anycast infrastructure (Azure CDN, Akamai) may return different IPs for a Frankfurt-sourced query versus a California-sourced one.

Voidly detects whether ECS is in use by inspecting the DNS response options for EDNS option code 8 (ECS). When both the probe resolver and the control resolver are returning ECS-annotated responses, IP differences are weighted less heavily in the comparison — they are still recorded, but the ASN-level check (rather than the IP-level check) becomes the primary signal. The ASN for the returned IPs is what matters: same ASN across probe and control means the same CDN node cluster is being reached, just different anycast addresses within that cluster.
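Detecting the ECS option means walking the option list in the OPT record's RDATA and looking for code 8. A minimal sketch following the RFC 7871 layout (family, source prefix length, scope prefix length, truncated address), assuming the OPT RDATA has already been extracted from the response; the function name is illustrative:

```python
import struct
from ipaddress import IPv4Network, IPv6Network
from typing import Optional

ECS_OPTION_CODE = 8  # EDNS Client Subnet (RFC 7871)
_FAMILIES = {1: (IPv4Network, 4), 2: (IPv6Network, 16)}

def find_ecs_subnet(opt_rdata: bytes) -> Optional[str]:
    """Return the ECS subnet (e.g. '203.0.113.0/24') found in OPT RDATA, or None."""
    offset = 0
    while offset + 4 <= len(opt_rdata):
        # Each EDNS option is: code (2 bytes), length (2 bytes), data
        code, length = struct.unpack_from(">HH", opt_rdata, offset)
        data = opt_rdata[offset + 4 : offset + 4 + length]
        offset += 4 + length
        if code != ECS_OPTION_CODE or len(data) < 4:
            continue
        family, source_len, _scope_len = struct.unpack_from(">HBB", data)
        if family not in _FAMILIES:
            return None
        net_cls, size = _FAMILIES[family]
        # The address is truncated to the prefix length; pad it back out
        packed = data[4:].ljust(size, b"\x00")[:size]
        return str(net_cls((packed, source_len), strict=False))
    return None
```

A None result for both probe and control responses suggests neither path used ECS, in which case the IP-level comparison carries more weight.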

Round-robin DNS and load balancing

High-traffic domains — google.com, cloudflare.com, facebook.com — deliberately return different A records for successive queries from the same resolver via DNS round-robin, and different resolvers see different subsets of the anycast pool. A probe querying 8.8.8.8 and the control querying 8.8.8.8 might get different IPs for google.com even in the absence of any censorship, because Google's authoritative DNS rotates through hundreds of addresses.

The mitigation is ASN-level rather than IP-level comparison for known high-traffic domains. If probe IPs and control IPs share at least one ASN, the measurement is treated as consistent even if no individual IP appears in both answer sets. The ASN check catches censorship that redirects to an ISP-owned or government-owned IP (which will not share ASN with Google or Cloudflare) while ignoring legitimate round-robin variation within the same provider's address space.

NXDOMAIN for new and transitional domains

A domain that was registered less than 90 days ago may be in DNS propagation — its authoritative nameserver records may not have propagated to all resolvers, causing some resolvers to return NXDOMAIN not because the domain doesn't exist but because their cache hasn't been updated. Similarly, a domain undergoing nameserver migration may return NXDOMAIN transiently from some resolvers but not others.

Voidly checks domain registration age from a WHOIS cache that is updated weekly. For domains with a registration age under 90 days, NXDOMAIN disagreement is classified as dns_indeterminate rather than nxdomain_injection. The 90-day threshold is conservative (most propagation issues resolve within 48 hours), but it ensures that a legitimately new domain that happens to be on the test list doesn't generate false censorship signals.
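The age rule reduces to a single date comparison. A minimal sketch, assuming the registration date has already been looked up; the function name is hypothetical, while the two labels are the ones named above:

```python
from datetime import date

NEW_DOMAIN_DAYS = 90  # threshold from the text

def label_nxdomain_disagreement(registered_on: date, measured_on: date) -> str:
    """Label an NXDOMAIN disagreement according to the domain's age."""
    age_days = (measured_on - registered_on).days
    if age_days < NEW_DOMAIN_DAYS:
        return "dns_indeterminate"
    return "nxdomain_injection"

print(label_nxdomain_disagreement(date(2025, 7, 1), date(2025, 8, 28)))
```

How a domain with no registration record should be labeled is not specified in the text and is left out of the sketch.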

The comparison algorithm

The comparison function takes the probe's ISP resolver result and the control resolver result and produces a structured DnsComparison object that the anomaly classifier consumes. Every field in this object represents a specific hypothesis about what the DNS results mean.

from dataclasses import dataclass
from ipaddress import ip_address
from typing import Literal, Optional

# resolve_asn() and is_cdn_expected_difference() come from the CDN check above

# Known injection IPs by country — see next section for full contents
KNOWN_INJECTION_IPS: dict[str, set[str]] = {
    'CN': {
        '127.0.0.1', '8.7.198.45', '37.61.54.158',
        '243.185.187.39', '243.185.187.30', '59.24.3.173',
        '4.36.66.178', '159.106.121.75', '202.106.1.2',
        '211.94.66.147', '78.16.49.15', '1.1.1.31',
        '61.131.208.210', '61.131.208.211', '66.249.90.104',
        '69.63.176.13', '128.121.126.139', '192.67.198.6',
    },
    'IR': {
        '10.10.34.34',    # IRGC redirect server (RFC1918, only visible inside IR)
        '10.10.34.35',
        '10.10.34.36',
    },
    'TR': {
        '195.175.254.2',  # Old BTK redirect, pre-2022
        '195.175.254.3',
    },
    'PK': {
        '202.83.24.24',   # PTA redirect page server
    },
    'RU': {
        # Roskomnadzor TSPU redirect IPs — varies by ISP, sampled set
        '193.187.126.101',
        '193.187.126.100',
    },
}

def is_bogon(ip: str) -> bool:
    """Return True if ip is RFC1918, loopback, link-local, or otherwise reserved."""
    try:
        addr = ip_address(ip)
        return (
            addr.is_private
            or addr.is_loopback
            or addr.is_link_local
            or addr.is_reserved
            or addr.is_multicast
            or addr.is_unspecified
        )
    except ValueError:
        return False

@dataclass
class DnsResult:
    resolver: str
    resolver_type: str          # 'ISP' | 'Control' | 'DoH'
    query_domain: str
    response_code: int          # 0=NOERROR, 3=NXDOMAIN, 2=SERVFAIL, 5=REFUSED
    answer_ips: list[str]
    answer_ttls: list[int]
    cname_chain: list[str]
    response_ms: float
    truncated: bool
    timed_out: bool
    qname_minimization: bool
    dnssec_valid: bool | None
    dnssec_bogus: bool | None
    dnssec_indeterminate: bool | None

@dataclass
class DnsComparison:
    # IP-level agreement signals
    ip_in_control_set: bool       # any probe IP appears in control answer set
    asn_match: bool               # at least one probe IP shares an ASN with a control IP
    cdn_expected_diff: bool       # difference explained by CDN geofencing
    # RCODE disagreement signals
    nxdomain_disagree: bool       # probe got NXDOMAIN, control got NOERROR
    empty_answer_disagree: bool   # probe got empty answer, control got records
    # Injection signals
    bogon_injection: bool         # probe IPs include RFC1918/loopback/reserved
    known_injection_ip: bool      # probe IP appears in KNOWN_INJECTION_IPS
    known_injection_country: Optional[str]  # which country's entry matched
    # Classified interference type
    interference_type: Literal[
        'nxdomain',
        'ip_spoofing',
        'empty_answer',
        'timeout',
        'none',
    ]
    # Confidence modifier (0.0–1.0, used by classifier)
    confidence: float

def compare_dns_results(
    probe: DnsResult,
    control: DnsResult,
    probe_cc: str,
    probe_asn: int,
) -> DnsComparison:
    """
    Compare ISP resolver result against control resolver result and classify
    DNS interference type. probe_cc is the ISO 3166-1 alpha-2 country code
    of the probe; probe_asn is the probe's AS number.
    """
    # Timeout takes precedence — if the ISP resolver didn't respond, classify
    # immediately without attempting IP-level comparison.
    if probe.timed_out and not control.timed_out:
        return DnsComparison(
            ip_in_control_set=False,
            asn_match=False,
            cdn_expected_diff=False,
            nxdomain_disagree=False,
            empty_answer_disagree=False,
            bogon_injection=False,
            known_injection_ip=False,
            known_injection_country=None,
            interference_type='timeout',
            confidence=0.55,   # lower confidence — timeout is noisy
        )

    # RCODE 3: NXDOMAIN from probe but NOERROR from control
    nxdomain_disagree = (
        probe.response_code == 3 and control.response_code == 0
    )

    # Empty answer: NOERROR from probe with no answer IPs, but control returned records
    empty_answer_disagree = (
        probe.response_code == 0
        and len(probe.answer_ips) == 0
        and control.response_code == 0
        and len(control.answer_ips) > 0
    )

    # IP-level signals (only meaningful when both sides have IPs)
    probe_ip_set = set(probe.answer_ips)
    control_ip_set = set(control.answer_ips)

    ip_in_control_set = bool(probe_ip_set & control_ip_set)

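    # ASN-level match using MaxMind GeoLite2. IPs with no ASN (resolve_asn
    # returned None, as for bogons) are dropped so None never counts as a match.
    probe_asns   = {resolve_asn(ip_address(ip)) for ip in probe_ip_set   if ip} - {None}
    control_asns = {resolve_asn(ip_address(ip)) for ip in control_ip_set if ip} - {None}
    asn_match = bool(probe_asns & control_asns)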

    cdn_expected = is_cdn_expected_difference(
        list(probe_ip_set),
        list(control_ip_set),
        probe_asn,
    )

    # Bogon injection: probe returned RFC1918, loopback, or reserved IPs
    bogon_injection = any(is_bogon(ip) for ip in probe.answer_ips)

    # Known injection IP database check
    country_injections = KNOWN_INJECTION_IPS.get(probe_cc, set())
    known_injection_ip = bool(probe_ip_set & country_injections)
    known_injection_country = probe_cc if known_injection_ip else None

    # Classify interference type
    if nxdomain_disagree:
        interference_type = 'nxdomain'
        confidence = 0.90
    elif bogon_injection or known_injection_ip:
        interference_type = 'ip_spoofing'
        confidence = 0.95 if known_injection_ip else 0.85
    elif (
        not ip_in_control_set
        and not asn_match
        and not cdn_expected
        and len(probe.answer_ips) > 0
        and len(control.answer_ips) > 0
    ):
        # IPs differ, not explained by CDN or load balancing
        interference_type = 'ip_spoofing'
        confidence = 0.70   # lower confidence — needs corroboration
    elif empty_answer_disagree:
        interference_type = 'empty_answer'
        confidence = 0.60   # weakest signal, needs corroboration
    else:
        interference_type = 'none'
        confidence = 0.0

    return DnsComparison(
        ip_in_control_set=ip_in_control_set,
        asn_match=asn_match,
        cdn_expected_diff=cdn_expected,
        nxdomain_disagree=nxdomain_disagree,
        empty_answer_disagree=empty_answer_disagree,
        bogon_injection=bogon_injection,
        known_injection_ip=known_injection_ip,
        known_injection_country=known_injection_country,
        interference_type=interference_type,
        confidence=confidence,
    )

The confidence values are starting points for the classifier, not final outputs. The classifier combines the DNS comparison confidence with TCP, TLS, and HTTP signals, and with the cross-probe corroboration score, to produce the final classifier_confidence field in the dataset. A DNS comparison with confidence=0.70 that is accompanied by a TCP connection failure and no HTTP response reaches a final confidence well above 0.90 after the multi-layer combination.
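The combination rule itself is not described here. To give a sense of how a 0.70 DNS signal can end up above 0.90, the sketch below uses a noisy-OR combination, which is an assumption chosen for illustration and not Voidly's documented method; the TCP and HTTP confidence values are invented for the example.

```python
from math import prod

def combine_noisy_or(confidences: list[float]) -> float:
    """Combine per-layer confidences, treating each as independent evidence."""
    for c in confidences:
        if not 0.0 <= c <= 1.0:
            raise ValueError(f"confidence out of range: {c}")
    # Probability that at least one layer's signal is genuine
    return 1.0 - prod(1.0 - c for c in confidences)

dns, tcp, http = 0.70, 0.60, 0.50   # tcp and http values are hypothetical
print(round(combine_noisy_or([dns, tcp, http]), 2))
```

Noisy-OR treats the layers as independent, which overstates confidence when one failure causes the others, for example when a spoofed IP is itself the reason the TCP connection fails.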

The known injection IP database

The KNOWN_INJECTION_IPS dictionary is curated from OONI confirmed censorship events, academic DNS measurement research (notably the Great Firewall Research papers from Princeton and UMass), and Voidly's own probe measurement history. It is updated monthly as part of the Voidly analyst review cycle, primarily adding new Russian ISP redirect IPs (which vary by carrier and change more frequently than China's or Turkey's).

The current table of known injection IPs by country:

Country     IP count    Sample IPs                         Notes
---------   --------    --------------------------------   ----------------------------
China (CN)     18       127.0.0.1, 8.7.198.45,            GFW injection pool. 243.x.x.x
                        37.61.54.158, 243.185.187.39,      range not globally routed.
                        243.185.187.30, 59.24.3.173,       Documented since 2012.
                        4.36.66.178, 159.106.121.75,
                        202.106.1.2, 211.94.66.147,
                        78.16.49.15, 1.1.1.31,
                        61.131.208.210, 61.131.208.211,
                        66.249.90.104, 69.63.176.13,
                        128.121.126.139, 192.67.198.6

Iran (IR)       3       10.10.34.34, 10.10.34.35,         IRGC redirect server, RFC1918
                        10.10.34.36                        — only reachable inside IR.
                                                           Returns Ministry warning page.

Turkey (TR)     2       195.175.254.2, 195.175.254.3      Pre-2022 BTK redirect.
                                                           Most carriers switched to
                                                           NXDOMAIN + separate block page
                                                           served over HTTP.

Pakistan (PK)   1       202.83.24.24                      PTA redirect page server.
                                                           Seen across multiple major ISPs.

Russia (RU)    var.     193.187.126.100,                   TSPU redirect IPs — varies by
                        193.187.126.101                    ISP. Roskomnadzor maintains a
                        (+ per-ISP IPs added monthly)      central pool; ISPs implement
                                                           separately.

The Iran entries deserve specific note. The 10.10.34.x addresses are RFC 1918 private addresses and are not routable on the public internet. They are reachable only within Iran's ISP networks, where the IRGC-operated redirect infrastructure serves the Ministry of Culture warning page. A probe outside Iran would never see these IPs returned for a domain; only probes with Iranian ISP resolvers do. This makes them strong injection signals when they appear: a public website that resolves to a routable address from the control resolver has no legitimate reason to resolve to an RFC 1918 address from the ISP resolver.

China's injection pool has been studied extensively. Addresses like 8.7.198.45 appear in GFW injection responses but are not BGP-announced globally: a traceroute to 8.7.198.45 from outside China times out. From inside China, the GFW's forged response arrives before any legitimate answer can come back, so the injected answer is what the probe sees regardless of which server the query was addressed to.

DNSSEC and its limits as a detection mechanism

DNSSEC-signed domains make IP spoofing theoretically detectable at the protocol level: a validating resolver that receives a response with a forged A record will fail DNSSEC validation because the answer does not match the RRSIG. In theory, DNS injection is impossible against DNSSEC-enabled domains with validating resolvers. In practice, this fails in every censored environment for three reasons.

First, the vast majority of blocked domains are not DNSSEC-signed. DNSSEC deployment remains sparse: as of 2025, only a small minority of .com domains are signed, and the domains most likely to be blocked (news sites, social media, VPNs) are disproportionately not signed. DNSSEC helps for the set of signed domains that are also censored, which is small.

Second, ISP resolvers in censored environments disable DNSSEC validation before returning injected responses. The resolver that the probe queries is configured to forward forged answers without performing the signature check that would cause it to return SERVFAIL instead. There is no protocol mechanism that allows the stub resolver (the probe) to force a recursive resolver to perform DNSSEC validation — the probe can only check whether the returned response has a valid signature itself, which the probe does independently of the control resolver comparison.

Third, the GFW and similar inline injection systems operate below the resolver level — they inject answers into the UDP stream before the resolver sees them. Even if the ISP's resolver had DNSSEC validation enabled, the injected packet would arrive first and the resolver would return the spoofed answer before it received the legitimate response with the RRSIG.

Voidly records three DNSSEC fields per DNS result:

dnssec_valid: bool | None           # True if probe validated DNSSEC chain
                                    # None if domain is not DNSSEC-signed
dnssec_bogus: bool | None           # True if probe received signed response
                                    # that failed validation (injection detected)
                                    # None if not signed
dnssec_indeterminate: bool | None   # True if the chain cannot be fully validated
                                    # (e.g., DS record missing at parent zone)

dnssec_bogus = True is a near-certain censorship signal: the domain is signed, the probe received a response claiming to be signed, and the signature did not validate. The only legitimate causes (key rollover during a very short validation window, or a nameserver returning a stale cached response during an emergency key compromise recovery) are both rare and short-lived. A dnssec_bogus result that persists across multiple probes and multiple query times is injection. In practice, we see this rarely because most injection systems return unsigned responses, which produce dnssec_indeterminate rather than dnssec_bogus.

DoH and DoT as diagnostic queries

When the ISP resolver shows interference (any interference_type other than none), the Voidly probe runs a secondary diagnostic query using DNS-over-HTTPS and DNS-over-TLS. These encrypted transport protocols bypass the ISP's DNS infrastructure entirely: DoH sends the DNS query inside an HTTPS request to a well-known DoH resolver, and DoT wraps the DNS query in TLS on port 853.
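Inside the TLS session, DoT carries ordinary DNS messages with the same two-byte length prefix that DNS over TCP uses (RFC 7858 reuses the RFC 1035 TCP framing). A minimal sketch of that framing, without the TLS connection; the function names are illustrative:

```python
import struct

def frame_for_dot(message: bytes) -> bytes:
    """Prefix a DNS message with its length as a 16-bit big-endian integer."""
    if len(message) > 0xFFFF:
        raise ValueError("DNS message too large for a two-byte length prefix")
    return struct.pack(">H", len(message)) + message

def unframe_from_dot(stream: bytes) -> bytes:
    """Extract the first length-prefixed DNS message from received bytes."""
    if len(stream) < 2:
        raise ValueError("need at least two bytes for the length prefix")
    (length,) = struct.unpack_from(">H", stream)
    body = stream[2 : 2 + length]
    if len(body) < length:
        raise ValueError("incomplete message; read more bytes from the socket")
    return body
```

The same framing applies to the plain DNS-over-TCP retry used for timeouts, so one helper can serve both transports.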

The diagnostic value is in the differentiation it provides:

If the ISP resolver shows interference but DoH and DoT resolve the domain correctly, the DNS tampering is isolated to the ISP's DNS infrastructure. The censor has not blocked DNS-over-HTTPS or DNS-over-TLS, so the domain is genuinely blocked only at the legacy DNS layer. This is the common case in Turkey and Pakistan.

If the ISP resolver shows interference and DoH also fails (DoH returns NXDOMAIN or times out), it could mean the domain genuinely does not exist, or that DoH is itself being blocked. The probe checks for DoH blocking separately by first attempting a DoH connection to cloudflare-dns.com on port 443 with a control domain (example.com). If the DoH connection itself fails, the probe records doh_blocked: true and the DoH result for the target domain is treated as inconclusive. This matters in China, where DoH to Cloudflare is blocked, and in Iran, where DoH is intermittently blocked depending on the ISP.
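The two cases above can be written as a small decision function. This is a sketch of the reasoning only: the function name and the returned labels are hypothetical, and only doh_blocked corresponds to a field named in the text.

```python
def interpret_diagnostic(
    isp_interference: bool,
    doh_blocked: bool,
    encrypted_resolves: bool,
) -> str:
    """Interpret the DoH/DoT diagnostic alongside the ISP resolver result."""
    if not isp_interference:
        return "not_run"               # diagnostics only run after interference
    if doh_blocked:
        return "inconclusive"          # control-domain DoH query failed too
    if encrypted_resolves:
        return "isp_dns_only"          # tampering isolated to the ISP resolver
    return "domain_may_not_exist"      # encrypted path also fails to resolve

print(interpret_diagnostic(True, False, True))
```

Checking doh_blocked before the resolution result matters: a failed DoH lookup says nothing about the target domain if the DoH endpoint itself is unreachable.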

import asyncio
import httpx
import struct

async def probe_doh(
    domain: str,
    server: str = 'cloudflare-dns.com',
    timeout_s: float = 5.0,
) -> DnsResult:
    """
    Query domain via DNS-over-HTTPS (RFC 8484 binary wire format, application/dns-message).
    Returns a DnsResult with resolver_type='DoH' and resolver set to the server URL.

    server: hostname of the DoH resolver (e.g. 'cloudflare-dns.com', 'dns.google')
    """
    url = f'https://{server}/dns-query'

    # Build a minimal DNS query packet: QTYPE=A, QCLASS=IN
    query_id   = 0        # RFC 8484 recommends ID 0 so identical queries stay HTTP-cache friendly
    flags      = 0x0100   # RD (recursion desired) set
    qdcount    = 1
    header     = struct.pack('>HHHHHH', query_id, flags, qdcount, 0, 0, 0)

    # Encode domain name in wire format: each label is prefixed with its length,
    # and the name is terminated by the zero-length root label
    labels = b''.join(
        bytes([len(part)]) + part.encode()
        for part in domain.rstrip('.').split('.')
    ) + b'\x00'
    question = labels + struct.pack('>HH', 1, 1)  # QTYPE=A, QCLASS=IN
    dns_wire  = header + question

    start = asyncio.get_event_loop().time()
    try:
        async with httpx.AsyncClient(http2=True, timeout=timeout_s) as client:
            resp = await client.post(
                url,
                content=dns_wire,
                headers={
                    'Content-Type': 'application/dns-message',
                    'Accept': 'application/dns-message',
                },
            )
        elapsed = (asyncio.get_event_loop().time() - start) * 1000.0
        answer_ips = _parse_a_records(resp.content)
        return DnsResult(
            resolver=url,
            resolver_type='DoH',
            query_domain=domain,
            response_code=_parse_rcode(resp.content),
            answer_ips=answer_ips,
            answer_ttls=_parse_ttls(resp.content),
            cname_chain=[],
            response_ms=elapsed,
            truncated=False,
            timed_out=False,
            qname_minimization=False,
            dnssec_valid=None,
            dnssec_bogus=None,
            dnssec_indeterminate=None,
        )
    except (httpx.TimeoutException, httpx.ConnectError):
        return DnsResult(
            resolver=url,
            resolver_type='DoH',
            query_domain=domain,
            response_code=2,      # SERVFAIL as proxy for connection failure
            answer_ips=[],
            answer_ttls=[],
            cname_chain=[],
            response_ms=(asyncio.get_event_loop().time() - start) * 1000.0,
            truncated=False,
            timed_out=True,
            qname_minimization=False,
            dnssec_valid=None,
            dnssec_bogus=None,
            dnssec_indeterminate=None,
        )

The DoT probe is structured identically but establishes a TLS connection to the resolver on port 853 (9.9.9.9:853 for Quad9) and sends the DNS wire format query over that TLS stream, prefixed with the two-byte length field used by DNS over TCP, rather than wrapping it in HTTPS. The choice of Quad9 for DoT and Cloudflare for DoH is deliberate: using different providers reduces the risk of both being blocked simultaneously in the same network.
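For comparison with the DoH listing, here is a minimal synchronous sketch of the DoT transport step using only the standard library. It is not the probe's implementation: the function names are hypothetical, and the TLS server name dns.quad9.net is an assumption about how the certificate is verified. The one structural difference from DoH is the framing, where each DNS message is preceded by its length as a two-byte big-endian integer.

```python
import socket
import ssl
import struct


def frame_for_dot(dns_wire: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte big-endian length (DNS-over-TCP framing)."""
    if len(dns_wire) > 0xFFFF:
        raise ValueError("DNS message too large for 2-byte length framing")
    return struct.pack(">H", len(dns_wire)) + dns_wire


def _recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes; a TLS stream may deliver the response in pieces."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed before the full response arrived")
        buf += chunk
    return buf


def query_dot(
    dns_wire: bytes,
    server_ip: str = "9.9.9.9",
    server_name: str = "dns.quad9.net",   # assumed TLS name for certificate checks
    timeout_s: float = 5.0,
) -> bytes:
    """Send one framed query over TLS on port 853 and return the raw DNS response."""
    ctx = ssl.create_default_context()
    with socket.create_connection((server_ip, 853), timeout=timeout_s) as raw:
        with ctx.wrap_socket(raw, server_hostname=server_name) as tls:
            tls.sendall(frame_for_dot(dns_wire))
            (length,) = struct.unpack(">H", _recv_exact(tls, 2))
            return _recv_exact(tls, length)
```

The bytes returned by query_dot have the same wire format as a DoH response body, so the same response parsers can be reused on both paths.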

DNS measurement at probe scale

The Voidly network currently runs 37+ probes across multiple countries, each probe measuring 80 domains. For each domain, the probe sends up to five DNS queries: one to the ISP resolver (QTYPE=A), one to the ISP resolver (QTYPE=AAAA), one to the control resolver (QTYPE=A), one DoH diagnostic (when interference is detected), and one DoT diagnostic (when interference is detected). Under normal operation without triggered diagnostics: 37 probes × 80 domains × 3 DNS queries = 8,880 DNS queries per network-wide cycle. With diagnostics firing on 15% of measurements, the two extra queries per flagged measurement add 888 (2,960 × 0.15 × 2), for roughly 9,770 DNS queries per cycle. The ceiling, if diagnostics fired on every measurement, is 37 × 80 × 5 = 14,800.
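The query-volume arithmetic is easy to check in a few lines. The function below is a hypothetical helper written for this article's numbers (37 probes, 80 domains, 3 baseline queries, 2 diagnostic queries); it assumes diagnostics fire independently per measurement at the given rate.

```python
def queries_per_cycle(
    probes: int,
    domains: int,
    base_queries: int = 3,          # ISP A, ISP AAAA, control A
    diagnostic_queries: int = 2,    # DoH + DoT, only when interference is detected
    diagnostic_rate: float = 0.0,   # fraction of measurements that trigger diagnostics
) -> int:
    measurements = probes * domains
    base = measurements * base_queries
    extra = round(measurements * diagnostic_rate * diagnostic_queries)
    return base + extra


print(queries_per_cycle(37, 80))                        # 8880
print(queries_per_cycle(37, 80, diagnostic_rate=0.15))  # 9768
print(queries_per_cycle(37, 80, diagnostic_rate=1.0))   # 14800
```

At a 15% diagnostic rate the total is 9,768; 14,800 is the ceiling reached only when every measurement triggers both diagnostics.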

DNS responses are fast (median 45ms, p99 280ms), which means per-probe latency is not the constraint — the constraint is keeping all queries concurrent to complete a full measurement cycle in under 60 seconds. The probe uses a Tokio async runtime with a JoinSet to run all DNS queries for a measurement cycle concurrently:

use tokio::task::JoinSet;
use std::net::IpAddr;
use hickory_resolver::TokioAsyncResolver;

pub async fn run_dns_measurement_cycle(
    domains: &[String],
    isp_resolver: IpAddr,
    control_resolver: IpAddr,
) -> Vec<DnsResult> {
    let mut set: JoinSet<Vec<DnsResult>> = JoinSet::new();

    for domain in domains {
        let domain   = domain.clone();
        let isp_res  = isp_resolver;
        let ctrl_res = control_resolver;

        set.spawn(async move {
            // Both queries run concurrently within each domain's task
            let (isp_result, ctrl_result) = tokio::join!(
                query_resolver(&domain, isp_res,  ResolverType::ISP),
                query_resolver(&domain, ctrl_res, ResolverType::Control),
            );
            vec![isp_result, ctrl_result]
        });
    }

    let mut all_results: Vec<DnsResult> = Vec::with_capacity(domains.len() * 2);
    while let Some(task_result) = set.join_next().await {
        match task_result {
            Ok(results) => all_results.extend(results),
            Err(e) => {
                // JoinError: task panicked or was cancelled — log and continue
                tracing::warn!("DNS task failed: {e}");
            }
        }
    }
    all_results
}

async fn query_resolver(
    domain: &str,
    resolver_ip: IpAddr,
    resolver_type: ResolverType,
) -> DnsResult {
    use hickory_resolver::config::*;
    use hickory_resolver::proto::rr::RecordType;
    use std::time::Instant;

    let mut config = ResolverConfig::new();
    config.add_name_server(NameServerConfig {
        socket_addr: std::net::SocketAddr::new(resolver_ip, 53),
        protocol: Protocol::Udp,
        tls_dns_name: None,
        trust_negative_responses: true,
        bind_addr: None,
    });
    let opts = ResolverOpts {
        timeout: std::time::Duration::from_millis(2000),
        attempts: 1,
        validate: false,   // DNSSEC validation handled separately
        ..Default::default()
    };

    let resolver = TokioAsyncResolver::tokio(config, opts);
    let start = Instant::now();

    match resolver.lookup(domain, RecordType::A).await {
        Ok(lookup) => {
            let answer_ips: Vec<IpAddr> = lookup
                .iter()
                .filter_map(|r| r.as_a().map(|a| IpAddr::V4(a.0)))
                .collect();
            let answer_ttls: Vec<u32> = lookup
                .record_iter()
                .map(|r| r.ttl())
                .collect();
            DnsResult {
                resolver: resolver_ip,
                resolver_type,
                query_domain: domain.to_string(),
                response_code: 0,   // NOERROR
                answer_ips,
                answer_ttls,
                cname_chain: vec![],
                response_ms: start.elapsed().as_secs_f64() * 1000.0,
                truncated: false,
                timed_out: false,
                qname_minimization: false,
            }
        }
        Err(e) => {
            let timed_out = e.kind().to_string().contains("timed out");
            let rcode = if timed_out { 0 } else { extract_rcode(&e) };
            DnsResult {
                resolver: resolver_ip,
                resolver_type,
                query_domain: domain.to_string(),
                response_code: rcode,
                answer_ips: vec![],
                answer_ttls: vec![],
                cname_chain: vec![],
                response_ms: start.elapsed().as_secs_f64() * 1000.0,
                truncated: false,
                timed_out,
                qname_minimization: false,
            }
        }
    }
}

The JoinSet pattern ensures that a slow resolver for one domain does not block queries for other domains. Each query gets its own resolver instance and 2-second timeout. At the p99 latency of 280ms, a full measurement cycle for 80 domains typically completes in under 4 seconds on a connection with modest packet loss. When ISP resolvers are timing out at scale, which is itself a censorship signal, every ISP query runs to the 2-second timeout. Run sequentially, that would take up to 160 seconds (80 domains × 2 seconds); with the concurrent JoinSet approach the timeouts overlap and the full cycle still completes in under 10 seconds.
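The effect of overlapping timeouts can be illustrated with a small asyncio simulation. This is an illustration in Python, not the Rust probe code: fake_query stands in for a resolver that never answers, and the timeout is scaled down from 2 seconds to 50 ms so the example runs quickly.

```python
import asyncio
import time


async def fake_query(delay_s: float, timeout_s: float) -> str:
    """Simulate one DNS query that answers after delay_s, or times out."""
    try:
        await asyncio.wait_for(asyncio.sleep(delay_s), timeout=timeout_s)
        return "answered"
    except asyncio.TimeoutError:
        return "timed_out"


async def run_cycle(delays, timeout_s: float):
    """Run all simulated queries concurrently, as the JoinSet does per domain."""
    start = time.monotonic()
    results = await asyncio.gather(*(fake_query(d, timeout_s) for d in delays))
    return results, time.monotonic() - start


def sequential_worst_case(n_domains: int, timeout_s: float) -> float:
    """Upper bound if every query waited for the previous one to time out."""
    return n_domains * timeout_s


# 80 domains whose resolver never answers, with a scaled-down 50 ms timeout.
results, elapsed = asyncio.run(run_cycle([10.0] * 80, timeout_s=0.05))
print(results.count("timed_out"), "timeouts in", round(elapsed, 3), "s")
print("sequential bound at 2 s per domain:", sequential_worst_case(80, 2.0), "s")
```

Because all 80 waits overlap, the concurrent cycle takes roughly one timeout period rather than eighty of them.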

The raw DNS results from each cycle are collected into a MeasurementBatch, serialized to CBOR, and uploaded to the Voidly control server over a QUIC connection. The control server runs the Python comparison algorithm against the control measurements it collected independently, produces the DnsComparison objects, and writes the final structured records to TimescaleDB. The round-trip from a probe completing a DNS measurement to the record appearing in the database is typically under 8 seconds.

For the full measurement dataset schema where all DNS fields are documented: The Voidly measurement dataset: field-by-field schema reference →

For how DNS anomalies combine with TCP, TLS, and HTTP signals in the anomaly classifier: The Voidly anomaly classifier: five interference classes and why we optimize for recall →

For the HTTP layer of censorship detection — block pages, body hashing, and body length ratio: How Voidly measures HTTP and HTTPS censorship: the full protocol lifecycle from DNS through TLS to body comparison →

For how Voidly's control server compares DNS results across ISP vs neutral resolvers: The Voidly control server: how we tell censorship from a bad network →