Detecting Election Anomalies Using Statistical Methods
During the 2024 election cycle, we monitored 47 races across 23 states. We detected 12 statistical anomalies worth investigating. This is how the system works.
What We're Looking For
Not fraud detection—that's politically loaded and requires ground truth we don't have. We're looking for anomalies: patterns that deviate significantly from expected behavior.
Types of anomalies:
- Unexpected turnout patterns (spikes in specific precincts)
- Ballot rejection rates outside historical norms
- Vote totals that don't match registration trends
- Timing anomalies (results reported faster/slower than typical)
- Down-ballot inconsistencies (president vs local races)
We don't conclude anything from anomalies alone. We flag them for human investigation.
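For illustration, a flag is just a structured record queued for human review. A minimal sketch of what the `flag_anomaly` calls in the snippets below might produce (field names and the example numbers here are illustrative, not our actual schema):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AnomalyFlag:
    # Which test fired, e.g. "benford_violation"
    kind: str
    # Geographic unit the flag applies to (hypothetical identifier)
    precinct_id: str
    # Test-specific details: statistics, p-values, rates
    details: dict = field(default_factory=dict)
    # When the flag was raised
    flagged_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    # Filled in later by a human reviewer
    resolution: str = "unreviewed"

flags = []

def flag_anomaly(kind, precinct_id="unknown", **details):
    """Append a flag to the review queue (here, just a list)."""
    flags.append(AnomalyFlag(kind, precinct_id, details))

# Example with made-up statistics
flag_anomaly("benford_violation", chi2=21.7, p_value=0.009)
```

Everything downstream of a flag is manual: an analyst looks at the precinct, the county's procedures, and the news before anything gets reported.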
Data Sources
We pull from:
- State election websites (official results, updated hourly)
- Voter registration databases (where publicly available)
- Census data (demographics, used for modeling expected turnout)
- Historical election results (2016, 2018, 2020, 2022)
- Social media sentiment (from our OSINT pipeline)
Total: ~890M data points across 47 races. Stored in PostgreSQL with PostGIS for geographic queries.
Benford's Law Analysis
First-digit distribution test. In many naturally occurring datasets, the leading digit "1" appears ~30.1% of the time, "2" ~17.6%, and so on down to "9" at ~4.6%. Fabricated numbers often fail to match this distribution. One caveat: Benford's Law only holds well when values span several orders of magnitude, which precinct-level vote totals sometimes don't, so we treat this test as one weak signal among several.
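The expected frequencies come from P(d) = log10(1 + 1/d). A quick sketch reproducing the percentages used in the test:

```python
import math

def benford_prob(d: int) -> float:
    # Probability that the leading digit of a Benford-distributed number is d
    return math.log10(1 + 1 / d)

expected_pct = [round(100 * benford_prob(d), 1) for d in range(1, 10)]
print(expected_pct)  # [30.1, 17.6, 12.5, 9.7, 7.9, 6.7, 5.8, 5.1, 4.6]
```

The nine probabilities sum to exactly 1, since the product of (d+1)/d for d = 1..9 telescopes to 10.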
We apply Benford's Law to precinct-level vote totals. Code:
```python
import numpy as np
from scipy import stats

def benford_test(numbers):
    """
    Test first-digit distribution against Benford's Law.
    Returns the chi-squared statistic and p-value.
    """
    first_digits = [int(str(n)[0]) for n in numbers if n > 0]
    # Expected first-digit percentages under Benford's Law
    expected_pct = [30.1, 17.6, 12.5, 9.7, 7.9, 6.7, 5.8, 5.1, 4.6]
    # Observed counts per digit
    observed = [first_digits.count(d) for d in range(1, 10)]
    # Scale expected percentages to counts so both arguments to
    # chisquare are on the same scale (their totals must match)
    expected = [p / 100 * len(first_digits) for p in expected_pct]
    # Chi-squared goodness-of-fit test
    chi2, p_value = stats.chisquare(observed, f_exp=expected)
    return chi2, p_value

# Example: test vote totals for candidate X
vote_totals = get_precinct_votes(candidate_id)
chi2, p = benford_test(vote_totals)
if p < 0.01:  # Significant deviation
    flag_anomaly("benford_violation", chi2=chi2, p_value=p)
```

This caught 3 anomalies in 2024. All turned out to be data entry errors (trailing zeros added), not fraud. Still worth flagging.
Turnout Modeling
We build a regression model for each precinct predicting expected turnout based on:
- Historical turnout (previous 3 elections)
- Demographics (age, income, education from census)
- Registration changes (new voters vs dropped registrations)
- Early voting numbers (where available)
- Social media sentiment (weak signal but helps at the margin)
Model: Gradient boosted trees (XGBoost). Trained on 2016-2022 data, tested on 2024.
```python
import numpy as np
import xgboost as xgb

def precinct_features(p):
    # Feature vector for one precinct
    return [
        p.historical_turnout_mean,
        p.historical_turnout_std,
        p.median_age,
        p.median_income,
        p.pct_college_educated,
        p.registration_delta,
        p.early_vote_count,
        p.social_sentiment_score,
    ]

X_train = np.array([precinct_features(p) for p in train_precincts])  # 2016-2022
X_test = np.array([precinct_features(p) for p in test_precincts])    # 2024

# Train model
model = xgb.XGBRegressor(n_estimators=100, max_depth=6)
model.fit(X_train, y_train)

# Predict expected turnout
predicted = model.predict(X_test)
actual = get_actual_turnout()

# Flag outliers (>2 standard deviations from the mean residual)
residuals = actual - predicted
z_scores = (residuals - residuals.mean()) / residuals.std()
anomalies = [(i, z) for i, z in enumerate(z_scores) if abs(z) > 2]
```

Model accuracy: RMSE of 3.2 percentage points, i.e., predictions are typically within about 3 points of actual turnout. Good enough to detect meaningful deviations.
Ballot Rejection Analysis
Mail-in ballots get rejected for signature mismatch, missing info, late arrival, etc. Rejection rates vary by state (0.5% to 5% is typical).
We track rejection rates by:
- County
- Demographic group (where data available)
- Time period (early vs election day)
Then compare to historical baselines. Significant increases = anomaly.
Example from 2024: County X had 8.2% rejection rate vs 2.1% historical average. Investigation revealed new signature verification contractor with stricter standards. Not nefarious, but worth documenting.
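A comparison like the County X one can be made precise by treating rejections as binomial draws against the historical rate. A sketch using an exact binomial test (the counts below are illustrative, not County X's actual ballot numbers, and the helper name is ours):

```python
from scipy.stats import binomtest

def rejection_rate_anomaly(rejected, total, historical_rate, alpha=0.01):
    """
    Test whether a county's mail-ballot rejection rate is significantly
    above its historical baseline, modeling rejections as
    Binomial(total, historical_rate). Returns (is_anomaly, p_value).
    """
    result = binomtest(rejected, total, p=historical_rate, alternative="greater")
    return result.pvalue < alpha, result.pvalue

# Illustrative numbers in the spirit of the County X example:
# 8.2% observed vs a 2.1% historical baseline
is_anomaly, p = rejection_rate_anomaly(rejected=820, total=10_000,
                                       historical_rate=0.021)
```

With a gap that large the p-value is effectively zero; a county at 2.2% against the same 2.1% baseline would not be flagged. In practice the statistical test is only the trigger; the explanation (new contractor, new equipment, new rules) comes from investigation.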
Down-Ballot Inconsistencies
Most voters either vote straight-ticket or drop off down-ballot at fairly stable rates. What's rare is a sharp change: a precinct where far more people than usual vote for president but skip the Senate race, or split their ticket across parties.
We look for precincts where this happens at unusual rates:
```python
def analyze_downballot(precinct_data):
    """
    Check for unusual split-ticket or undervote patterns.
    """
    president_votes = precinct_data['president_total']
    senate_votes = precinct_data['senate_total']
    if president_votes == 0:
        return  # No top-of-ticket votes; nothing to compare

    # Undervote rate: voted for president but not senate
    undervote_rate = (president_votes - senate_votes) / president_votes

    # Compare to this precinct's historical average
    historical_avg = get_historical_undervote_rate(precinct_data['id'])
    if abs(undervote_rate - historical_avg) > 0.10:  # >10pp difference
        flag_anomaly("downballot_inconsistency",
                     current=undervote_rate,
                     historical=historical_avg)
```

This caught 2 anomalies in 2024. Both were in precincts with controversial local races (sheriff, DA) that drove split-ticket voting. Makes sense in context.
Time-Series Analysis
Vote counts get reported over time (as precincts finish counting). We model the expected reporting curve based on historical data.
Deviations indicate:
- Technical issues (delays in reporting system)
- Procedural changes (county changed counting workflow)
- Potential manipulation (unlikely but worth checking)
We use ARIMA models to predict vote totals at each time step:
```python
from statsmodels.tsa.arima.model import ARIMA

# Historical vote-reporting time series for this county
historical = get_reporting_timeseries(county_id, election_type)

# Fit ARIMA model
model = ARIMA(historical, order=(2, 1, 2))
model_fit = model.fit()

# Predict expected vote count at the next time step
# (forecast returns an array; take the single forecasted value)
predicted = model_fit.forecast(steps=1)[0]
actual = get_current_vote_count()

# Flag deviations beyond 2 standard deviations of the in-sample residuals
if abs(actual - predicted) > 2 * model_fit.resid.std():
    flag_anomaly("reporting_deviation")
```

This is noisy—lots of legitimate reasons for reporting delays. But combined with other signals, it helps identify precincts worth investigating.
Results from 2024
Of 12 flagged anomalies:
- 3 were data entry errors (fixed by county officials)
- 4 were procedural changes (new voting equipment, changed workflows)
- 2 were demographic shifts (unexpected population changes)
- 2 were legitimate unusual circumstances (weather events, local controversies)
- 1 remains unexplained (likely noise, not worth pursuing)
Zero confirmed instances of fraud or manipulation. That's expected—election fraud at scale is extremely difficult in modern US elections.
What This System Isn't
This is not a fraud detection system. It's an anomaly detection system. Most anomalies are boring (data errors, procedural changes). That's fine. The point is to surface things worth looking at.
It's also not a substitute for traditional election security (paper ballots, audits, chain of custody). This is a supplementary tool for researchers and journalists.
Data collection scripts are not published (to avoid gaming by bad actors). Core statistical methods are documented above; reference implementations are available on request via the contact channel below.
Questions about methodology: contact@ai-analytics.org