
Voidly's real-time event pipeline: from measurement anomaly to journalist alert in under 8 minutes

December 5, 2025 · 8 min read · AI Analytics

Censorship · Voidly · Infrastructure · Real-time systems

When a probe reports a DNS block in Tehran, there is a clock running. A journalist chasing a story before access disappears entirely does not need a batch job that runs at 2 AM. From the moment a probe flags an anomaly to the moment an alert lands in a journalist's inbox, every second of unnecessary latency is a second closer to the story window closing. Our design target was sub-10-minute probe-to-alert latency. In practice, for events with real-time OONI and IODA corroboration, we are consistently at 7–8 minutes end to end.

This post walks through the entire pipeline: how probe measurements are scored inline before entering the event queue, how the corroboration worker queries OONI and IODA concurrently, how confidence thresholds determine what gets published at which tier, how the alert-fatigue guard filters out noise without delaying genuine incidents, and how CensoredPlanet fits into the picture as a retroactive — not real-time — data source.

The 5-minute probe cycle

Every Voidly probe runs its measurement cycle every 5 minutes against a test list of roughly 80 domains. The list is country-specific for the top tier (news, political opposition, LGBTQ+ resources, VPN provider landing pages) and supplemented by the global Citizen Lab test list for coverage breadth. The cycle is staggered across probes in the same country so we are never waiting for an entire country's probes to finish before acting on any of them.

Results are sent immediately to the collector via the WireGuard tunnel rather than batched. This is a deliberate architectural choice: batching probe results at the probe before transmission would add anywhere from 30 seconds to 5 minutes of latency depending on where in the cycle a measurement falls. The tunnel overhead is negligible — a single measurement serializes to under 2 KB.

The collector performs first-pass anomaly scoring inline on each arriving measurement. The scoring function runs in under 50 ms per measurement. If a domain shows a DNS miss and a TLS failure in the same measurement, it is immediately queued as a candidate event — we do not wait for the next batch cycle to catch it. The key insight is that multi-signal agreement within a single measurement (DNS + TLS) is already a meaningful signal, even before cross-probe or cross-source corroboration.

The collector also processes all probes arriving from the same country together in a rolling window. If 3 of 4 probes in the same country flag the same domain within a 2-minute window, the composite confidence score jumps sharply — the cross-probe agreement is weighted more heavily than any single measurement's anomaly score.
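The rolling cross-probe window can be sketched roughly as follows. This is a minimal illustration, not the collector's actual code: the `CrossProbeWindow` class, the `has_cross_probe_agreement` helper, and the 0.75 ratio (taken from the "3 of 4 probes" example) are all assumptions, and it assumes flags arrive roughly in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field

WINDOW_SECONDS = 120  # the 2-minute rolling window from the text


@dataclass
class CrossProbeWindow:
    """Tracks which probes flagged each (country, domain) pair recently."""
    # (country, domain) -> deque of (timestamp, probe_id), oldest first
    _flags: dict = field(default_factory=lambda: defaultdict(deque))

    def record(self, country: str, domain: str, probe_id: str, ts: float) -> int:
        """Record a flag and return how many distinct probes are in the window."""
        flags = self._flags[(country, domain)]
        flags.append((ts, probe_id))
        # Evict flags older than the window, relative to the newest arrival.
        while flags and ts - flags[0][0] > WINDOW_SECONDS:
            flags.popleft()
        return len({pid for _, pid in flags})


def has_cross_probe_agreement(distinct_flagging: int, probes_in_country: int) -> bool:
    # Hypothetical rule: at least three quarters of the country's probes agree.
    return probes_in_country > 0 and distinct_flagging / probes_in_country >= 0.75
```

How much the composite score rises once agreement is reached is left out here, since the weighting itself is not specified in this post.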

async def on_measurement_received(measurement: Measurement) -> None:
    # First-pass scoring runs inline on every arriving measurement (<50 ms).
    score = score_anomaly(measurement)
    # The score object carries a numeric value and an anomaly type,
    # so the threshold comparison is made on score.value.
    if score.value >= CANDIDATE_THRESHOLD:  # 0.35 default
        await event_queue.put(CandidateEvent(
            country=measurement.country,
            domain=measurement.domain,
            anomaly_type=score.type,
            initial_confidence=score.value,
            probe_count=1,
            timestamp=measurement.timestamp,
        ))

The candidate threshold of 0.35 is intentionally lower than the publication threshold. We want the queue to catch everything plausibly interesting and let the corroboration step do the filtering. Events that arrive at the queue but fail corroboration are discarded silently after 30 minutes — they do not appear in the public dataset at any tier.
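The 30-minute discard rule might look something like the sweep below. The `PendingCandidate` type and `sweep_expired` function are hypothetical names introduced for illustration; only the 30-minute lifetime comes from the text.

```python
from dataclasses import dataclass

CANDIDATE_TTL_SECONDS = 30 * 60  # uncorroborated candidates live for 30 minutes


@dataclass
class PendingCandidate:
    country: str
    domain: str
    queued_at: float          # Unix seconds when the candidate entered the queue
    corroborated: bool = False


def sweep_expired(pending: list[PendingCandidate], now: float) -> list[PendingCandidate]:
    """Return the candidates to keep; expired uncorroborated ones are dropped silently."""
    return [
        c for c in pending
        if c.corroborated or now - c.queued_at < CANDIDATE_TTL_SECONDS
    ]
```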

The event queue and corroboration worker

A dedicated corroboration worker consumes events from the queue. The moment it dequeues a candidate, it fires two asynchronous HTTP requests in parallel: one to the OONI Explorer real-time API and one to the IODA real-time signal API. We use asyncio.gather so neither request blocks the other: the total wait time is that of the slower response, not the sum of the two.

The OONI query searches for recent measurements from the same country and domain that were flagged anomalous:

GET https://api.ooni.io/api/v1/measurements?since=T-15m&probe_cc=IR&domain=example.com&anomaly=true

The 15-minute lookback window is wide enough to catch OONI measurements that may have been collected slightly before our probe fired, but narrow enough to exclude stale data from a different measurement context. OONI's real-time API reflects measurements within a few minutes of collection, so a 15-minute window reliably captures contemporaneous observations.
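As a rough sketch, the query string could be assembled like this. The `T-15m` in the URL above is shorthand for an absolute timestamp; the `build_ooni_query` helper and the exact timestamp format used here are assumptions for illustration, so check the OONI API documentation before relying on them.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

OONI_MEASUREMENTS_URL = "https://api.ooni.io/api/v1/measurements"


def build_ooni_query(country: str, domain: str, now: datetime,
                     lookback: timedelta = timedelta(minutes=15)) -> str:
    """Build the lookback query URL; performs no network I/O."""
    since = (now - lookback).astimezone(timezone.utc)
    params = {
        # Assumed format: second-precision UTC timestamp without an offset suffix.
        "since": since.strftime("%Y-%m-%dT%H:%M:%S"),
        "probe_cc": country,
        "domain": domain,
        "anomaly": "true",
    }
    return f"{OONI_MEASUREMENTS_URL}?{urlencode(params)}"
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the lookback window reproducible when an event is re-evaluated later.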

The IODA query pulls the BGP and active probing signal for the country over the last 30 minutes:

GET https://api.ioda.inetintel.cc.gatech.edu/v2/signals/events?from=T-30m&country=IR

IODA's value in the real-time context is primarily in distinguishing DNS/application-layer blocking from BGP-level routing changes. If IODA shows no prefix withdrawals and no significant drop in active probing reachability, we are almost certainly looking at targeted application-layer blocking rather than a full infrastructure shutdown. That distinction matters for the confidence score and for the content of any journalist alert.
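The distinction drawn above can be expressed as a small decision rule. Everything in this sketch is illustrative: the `IodaSnapshot` fields are hypothetical and do not mirror the real IODA response schema, and the 0.20 cutoff for a "significant" reachability drop is an invented placeholder, not a Voidly setting.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    APPLICATION_LAYER = "application_layer"  # targeted DNS/TLS/HTTP blocking
    INFRASTRUCTURE = "infrastructure"        # routing-level disruption


@dataclass
class IodaSnapshot:
    prefix_withdrawals: int    # hypothetical: withdrawn prefixes in the last 30 minutes
    reachability_drop: float   # hypothetical: fractional drop in active probing, 0.0 to 1.0


SIGNIFICANT_DROP = 0.20  # placeholder threshold, assumed for this example


def classify_scope(snapshot: IodaSnapshot) -> Scope:
    """No withdrawals and no significant drop suggests application-layer blocking."""
    if snapshot.prefix_withdrawals == 0 and snapshot.reachability_drop < SIGNIFICANT_DROP:
        return Scope.APPLICATION_LAYER
    return Scope.INFRASTRUCTURE
```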

Both APIs respond in under 2 seconds under normal load conditions. We set a 5-second timeout on each request; if either times out, we proceed with the confidence score computed from whatever data we have, with a flag indicating the timeout. CensoredPlanet is explicitly not queried in this real-time step — CP publishes daily batch exports, not a live API, so it cannot contribute to real-time corroboration. Its role comes later, in the retroactive pass.

async def corroborate(event: CandidateEvent) -> CorroboratedEvent:
    ooni_task = asyncio.create_task(query_ooni_realtime(event))
    ioda_task  = asyncio.create_task(query_ioda_realtime(event))
    ooni_result, ioda_result = await asyncio.gather(ooni_task, ioda_task)

    confidence = compute_composite_confidence(
        event.initial_confidence,
        ooni_corroboration=ooni_result.corroborates,
        ioda_corroboration=ioda_result.corroborates,
    )
    return CorroboratedEvent(event, confidence, ooni_result, ioda_result)
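The listing above omits the 5-second timeout handling. One way to add it, sketched under the assumption that each query returns an object with a `corroborates` attribute, is to wrap each coroutine in `asyncio.wait_for` and substitute a flagged empty result on timeout. `SourceResult`, `with_timeout`, and `query_both` are hypothetical names.

```python
import asyncio
from dataclasses import dataclass

SOURCE_TIMEOUT_SECONDS = 5.0  # per-request timeout from the text


@dataclass
class SourceResult:
    corroborates: bool = False
    timed_out: bool = False


async def with_timeout(coro, timeout: float = SOURCE_TIMEOUT_SECONDS) -> SourceResult:
    """Await one source query; on timeout, return an empty result flagged as timed out."""
    try:
        return await asyncio.wait_for(coro, timeout)
    except asyncio.TimeoutError:
        return SourceResult(timed_out=True)


async def query_both(ooni_coro, ioda_coro, timeout: float = SOURCE_TIMEOUT_SECONDS):
    """Run both queries concurrently; a slow source never blocks the other."""
    ooni, ioda = await asyncio.gather(
        with_timeout(ooni_coro, timeout),
        with_timeout(ioda_coro, timeout),
    )
    return ooni, ioda
```

Because each wrapper swallows its own timeout, `asyncio.gather` always returns two results, and the confidence calculation can treat a timed-out source as "no corroboration" while keeping the flag for later review.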

The compute_composite_confidence function applies the independence-weighted combination described in the confidence tier documentation. A Voidly anomaly score of 0.62 that receives OONI corroboration (independence weight 0.80) ends up with a composite confidence of around 0.88 — enough to reach Verified Incident tier in a single corroboration pass.

Confidence thresholds and publication

After corroboration completes, the event is routed to one of three publication buckets based on its composite confidence score. Events below 0.40 composite confidence go into an internal “Observed” bucket — they inform the 7-day forecast model as weak signals but are not published to the public dataset. We do not surface these to journalists because we cannot stand behind them without corroboration.

Events with composite confidence between 0.40 and 0.75 are published as Corroborated. These are real events — not noise — but they have not yet been independently confirmed by multiple external sources. A Corroborated event might reflect a block that OONI has seen but IODA has not yet responded to, or a block in a country where OONI coverage is sparse and only Voidly probes have fired.

Events with composite confidence at or above 0.75 are published as Verified Incidents. These represent three or more independent corroborations by our weighting scheme. A Verified Incident is citable and appears in the dashboard incident counter. The journalist alert system evaluates alert conditions only on Verified Incidents — Corroborated events do not trigger outbound alerts.
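The three buckets reduce to a pair of threshold comparisons. A minimal sketch, with the `Tier` enum and `assign_tier` function introduced here for illustration; the 0.40 and 0.75 boundaries are the ones stated above.

```python
from enum import Enum

CORROBORATED_FLOOR = 0.40
VERIFIED_FLOOR = 0.75


class Tier(Enum):
    OBSERVED = "observed"          # internal only, feeds the forecast model
    CORROBORATED = "corroborated"  # published, does not trigger alerts
    VERIFIED = "verified"          # published, citable, eligible for alerts


def assign_tier(composite_confidence: float) -> Tier:
    """Map a composite confidence score to its publication tier."""
    if composite_confidence >= VERIFIED_FLOOR:
        return Tier.VERIFIED
    if composite_confidence >= CORROBORATED_FLOOR:
        return Tier.CORROBORATED
    return Tier.OBSERVED
```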

The publication step is a single atomic operation: write to the public dataset API, mark the incident on the dashboard, and evaluate alert conditions. The end-to-end latency from the start of a block to Corroborated publication is typically 6–8 minutes: 4–5 minutes until the probe's next measurement of the affected domain reaches the collector (a block that begins mid-cycle is not observed until the following pass), and under 2 minutes for the corroboration round trip. A concrete example, with timestamps in minutes:seconds counted from the moment the probe detects the block:

T+0:00  Probe detects DNS block for bbc.com
T+0:04  Measurement received at collector (4s tunnel RTT)
T+0:04  Inline scoring: anomaly score 0.62, queued
T+0:04  Two probes on different ASNs report same anomaly → confidence → 0.71
T+0:05  Corroboration worker fires OONI + IODA queries
T+0:07  OONI returns 12 matching measurements (confidence → 0.88)
T+0:07  IODA returns no prefix withdrawal (BGP normal, not a full shutdown)
T+0:07  Final confidence: 0.88 — VERIFIED INCIDENT
T+0:08  Published to dataset API, marked on dashboard
T+0:08  Alert evaluation: threshold crossed, fatigue guard checked

The 4-second figure for tunnel RTT is a median — it can be higher for probes in countries with high-latency egress links. For probes where the WireGuard tunnel RTT exceeds 10 seconds we log a quality warning; above 30 seconds the probe is flagged for investigation. High-latency probes still contribute to confidence scoring but are weighted slightly lower in the cross-probe aggregation.
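The two RTT cutoffs can be captured in a small classifier. The function name and the returned labels are assumptions made for this sketch; the 10-second and 30-second limits come from the paragraph above. The reduced aggregation weight for slow probes is not shown because its value is not given here.

```python
RTT_WARN_SECONDS = 10.0         # above this, log a quality warning
RTT_INVESTIGATE_SECONDS = 30.0  # above this, flag the probe for investigation


def classify_tunnel_rtt(rtt_seconds: float) -> str:
    """Return 'ok', 'warn', or 'investigate' for a measured WireGuard tunnel RTT."""
    if rtt_seconds > RTT_INVESTIGATE_SECONDS:
        return "investigate"
    if rtt_seconds > RTT_WARN_SECONDS:
        return "warn"
    return "ok"
```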

Alert fatigue and the two-window guard

Early versions of the alert system sent a notification every time a Verified Incident threshold was crossed. Within two weeks of launch, our journalist subscribers were complaining about false-alarm fatigue. The problem was not false positives in the traditional sense — the events were real — but the system was alerting on transient blocks that self-resolved within a single measurement window before the journalist could even open the email. A 10-minute block of BBC News in a country where it is normally accessible is interesting; it is not worth interrupting someone's workflow at 3 AM.

The current design fires an alert only when the confidence threshold is crossed in at least 2 of the last 3 evaluation windows. Each 5-minute probe cycle constitutes one evaluation window. In practice, this means an event must hold above-threshold confidence in two of three consecutive cycles, so the two crossings fall 5 to 10 minutes apart, before an alert goes out. A block that self-resolves within a single window never satisfies this condition.

There is one explicit bypass: BGP full-prefix withdrawal at the country level skips the two-window guard entirely. A country-level shutdown — the kind where autonomous systems stop advertising routes for entire national IP ranges — is urgent enough that we do not buffer it. In that scenario, a 10-minute delay in alerting is the difference between a journalist being able to document the shutdown in progress and arriving after the story is over.

def should_alert(event: VerifiedEvent) -> bool:
    if event.anomaly_type == AnomalyType.BGP_FULL_SHUTDOWN:
        return True  # bypass fatigue guard for full outages

    recent_windows = event_history.get_windows(event.country, event.domain, n=3)
    threshold_crossings = sum(1 for w in recent_windows if w.confidence >= ALERT_THRESHOLD)
    return threshold_crossings >= 2  # two of last three windows

The ALERT_THRESHOLD is set to 0.75 — the same floor as Verified Incident publication. We considered using a higher threshold (0.85+) for alerts specifically, but the two-window guard already filters out most of the transient noise, and raising the alert threshold would miss genuine events in countries with lower OONI coverage where composite confidence rarely reaches 0.85 even for real blocks.

Alert subscribers can configure per-country and per-domain thresholds to tune the sensitivity for their specific coverage area. A journalist covering Iran specifically can lower their personal alert threshold to 0.65 and accept more noise in exchange for earlier notification; a general news desk monitoring global connectivity can leave it at 0.75.
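Per-subscriber thresholds could be resolved along these lines. The `SubscriberPrefs` structure is hypothetical, and the precedence order (domain override first, then country, then the 0.75 default) is an assumption, since the text does not say which setting wins when both are present.

```python
from dataclasses import dataclass, field

DEFAULT_ALERT_THRESHOLD = 0.75


@dataclass
class SubscriberPrefs:
    country_thresholds: dict[str, float] = field(default_factory=dict)
    domain_thresholds: dict[str, float] = field(default_factory=dict)

    def threshold_for(self, country: str, domain: str) -> float:
        # Assumed precedence: domain override, then country override, then default.
        if domain in self.domain_thresholds:
            return self.domain_thresholds[domain]
        return self.country_thresholds.get(country, DEFAULT_ALERT_THRESHOLD)


def should_notify(prefs: SubscriberPrefs, country: str, domain: str,
                  confidence: float) -> bool:
    """True when the event's confidence meets this subscriber's own threshold."""
    return confidence >= prefs.threshold_for(country, domain)
```

For example, a journalist covering Iran would set `country_thresholds={"IR": 0.65}` and receive alerts for Iranian events that a default subscriber would not see until they reached 0.75.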

The CensoredPlanet retroactive pass

CensoredPlanet is one of the most comprehensive censorship measurement projects in existence — but it does not offer a real-time API. CP publishes daily batch exports of its scan results, typically available by early morning UTC for the previous day's measurements. Because of this architecture, CP cannot contribute to the initial real-time corroboration step described above.

Every night at 02:00 UTC we run a retroactive pass over the CP exports that became available that day. For every event published in the last 48 hours with initial confidence below 0.90, we query whether CP's measurements confirm or contradict the event. The 48-hour lookback window accounts for CP's publication lag: an event that happened at 23:00 on day N may not have CP coverage until the export lands on day N+2.
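Selecting the events for the nightly pass is a filter on age and initial confidence. A minimal sketch, with the `PublishedEvent` type and `select_for_cp_pass` function assumed for illustration; the 48-hour window and 0.90 ceiling are from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

CP_LOOKBACK = timedelta(hours=48)
CP_CONFIDENCE_CEILING = 0.90


@dataclass
class PublishedEvent:
    incident_id: str
    published_at: datetime       # timezone-aware UTC timestamp
    initial_confidence: float


def select_for_cp_pass(events: list[PublishedEvent], now: datetime) -> list[PublishedEvent]:
    """Events from the last 48 hours whose initial confidence is below 0.90."""
    cutoff = now - CP_LOOKBACK
    return [
        e for e in events
        if e.published_at >= cutoff and e.initial_confidence < CP_CONFIDENCE_CEILING
    ]
```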

CP confirmation can upgrade an event from Corroborated to Verified if it crosses the 0.75 composite threshold after the CP weight is applied. More commonly, CP provides additional interference-type metadata that the real-time step cannot capture. CP's scanning methodology is particularly good at detecting HTTP-layer blocking — cases where a DNS resolution succeeds and a TCP connection is established, but the HTTP response is tampered (a block page injection, a redirect, or a content-length mismatch). Our probes catch DNS and TLS failures reliably, but HTTP-layer blocking can slip through if the block page returns a 200 status code.

CP data also serves as a quality control layer for our own measurements. If CP disagrees with 5 or more Voidly events in the same country on the same day — events where Voidly reported a block and CP found nothing, or vice versa — we flag the batch for manual review. This most commonly surfaces when a probe's WireGuard tunnel was experiencing packet loss during the relevant measurement window, producing false positives that CP's vantage points do not replicate.
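The disagreement check amounts to counting mismatches per country and day. This sketch assumes each Voidly event has already been paired with CP's verdict for the same target; the `Comparison` record and `batches_needing_review` function are hypothetical names.

```python
from collections import Counter
from typing import Iterable, NamedTuple

DISAGREEMENT_LIMIT = 5  # five or more mismatches flags the batch


class Comparison(NamedTuple):
    country: str
    day: str              # e.g. "2025-12-05"
    voidly_blocked: bool
    cp_blocked: bool


def batches_needing_review(comparisons: Iterable[Comparison]) -> set[tuple[str, str]]:
    """Return the (country, day) batches where Voidly and CP disagree 5+ times."""
    disagreements = Counter(
        (c.country, c.day) for c in comparisons if c.voidly_blocked != c.cp_blocked
    )
    return {batch for batch, n in disagreements.items() if n >= DISAGREEMENT_LIMIT}
```

Counting mismatches in both directions matches the description above, where either a Voidly-only or a CP-only block counts as a disagreement.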

Because CP is retroactive, the public dataset shows a cp_checked boolean field and a cp_checked_at timestamp on each event. Researchers downloading the dataset can filter to events where cp_checked = true if they want the highest-confidence subset, accepting that this excludes events from the last 24–48 hours. For real-time monitoring, the OONI + IODA corroboration is what matters; CP is a quality enrichment layer, not a real-time signal.

For the vantage network that generates the measurements entering this pipeline: Voidly probe vantage selection: ASN diversity, operator safety, and reaching hard-to-measure countries →

How probe measurements are classified before entering this pipeline: The Voidly anomaly classifier: five interference classes and why we optimize for recall →

How the confidence score is calculated across sources: Cross-source censorship verification: reconciling OONI, CensoredPlanet, and IODA →

How these verified events become training data for the 7-day forecast: Seven-day internet shutdown forecasting: how Voidly predicts connectivity outages →

How thousands of raw measurements deduplicate into the single incident_id the pipeline publishes: Incident clustering and deduplication: how Voidly avoids counting the same censorship event twice →

For the inference API that sits in the hot path of this pipeline — ONNX Runtime serving, feature extraction in 5ms, and the full p50/p99 latency breakdown: Voidly's real-time inference API: classifying censorship measurements at 50ms →

For the complete probe-side execution path that produces the measurements this pipeline processes — DNS, TCP, TLS, HTTP phases, ProbeResult assembly, and upload batching: Voidly probe run lifecycle: from scheduled task to classifier input →

For how the scheduler decides which domains feed into this pipeline — OONI category-code priorities, anomaly-driven boosts, ±15% jitter, per-country task budgets, and the urgent injection path that fires when the pipeline detects an anomaly: The Voidly measurement scheduler: how we decide which domains to probe and when →