
📊 Data Sources & Methodology

Understanding how VoteValue works and where our data comes from

Real Data. The VoteValue Index is computed from real election data: congressional and Senate results from MIT Election Lab (1976–2024), state legislative results from MEDSL (2022) and Princeton/Klarner historical archives (1968–2018), district boundaries from U.S. Census TIGER/Line, demographic data from Census ACS, and compactness metrics computed from official shapefiles. VoteValue scores may not be 100% accurate. Do not use VoteValue as your sole source for electoral decisions.

Data freshness

District boundaries: U.S. Census TIGER/Line · 118th Congress (2023–2024 cycle)
Demographics: ACS 5-year · 2023 release (pools 2019–2023)
Federal election results: MIT Election Data and Science Lab · 2024 general election
State legislative results: Ballotpedia + secretaries of state · Most recent state cycle
Precinct geometry: VEST 2020 · 2020 cycle
Compactness metrics: Computed in-house from TIGER/Line · Recomputed on each shapefile refresh

Data Overview

VoteValue uses authoritative data sources to provide comprehensive voting district information with 100% real precinct data coverage across all 50 states plus DC.

States & DC Covered: 51
Real Precincts (VEST 2020): 163,904
2020 Presidential Votes: 156M
Vote Conservation: 100%
States with Real Data: 51/51

✅ Real Data Achievement

January 2026: VoteValue now has complete nationwide coverage using VEST 2020 precinct data from Harvard Dataverse. All 51 states (50 states + DC) have real, certified 2020 Presidential election results at the precinct level with true precinct boundaries and 100% vote conservation.

163,904 precincts with 156,144,718 actual votes from official state election authorities.

⚠️ Important Notes:
  • Precinct Data: 100% real from VEST 2020 (certified state election results)
  • District Metrics: Some compactness and demographic metrics may use estimates where official data is unavailable
  • Representative Info: Candidate information should be verified independently
  • Due Diligence: Always verify voting procedures with official sources

VoteValue Index (VVI) Methodology

The VoteValue Index quantifies voting power on a 0-100 scale by analyzing five key factors that determine how much impact your individual vote has in shaping electoral outcomes.

Competitiveness 35%

How close recent elections have been, based on margin closeness, historical volatility, and party-flip frequency. Data from MIT Election Lab, MEDSL, and Princeton/Klarner historical archives.

Mobilization Potential 20%

Room for increased turnout to shift outcomes. Based on turnout gap vs. national median and whether the race is contested. Data from election results and Census ACS voting-eligible population.

Electoral Leverage 15%

How much one vote can swing the result. Based on absolute vote differential and district population size. Smaller districts with closer margins give each voter more structural weight.

District Integrity 20%

How fairly the district is drawn. Combines geometric compactness (Polsby-Popper, Reock, Convex Hull from Census TIGER shapefiles) with partisan efficiency gap analysis (Stephanopoulos & McGhee 2015).

Race Significance 10%

How much this race matters for shifting chamber control. Based on chamber partisan margin (from OpenStates), contestedness, and incumbency vulnerability.
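The efficiency gap named under District Integrity can be illustrated with a short sketch. It follows the standard wasted-votes definition from Stephanopoulos & McGhee: every vote for the losing party is wasted, and so is every winning vote beyond half of the district total. The function name, the input layout, the sign convention (positive means Democrats waste more votes), and the tie handling are assumptions made for illustration. How VoteValue turns this quantity into a subscore is not stated here.

```python
def efficiency_gap(districts: list[tuple[float, float]]) -> float:
    """Return (wasted D votes - wasted R votes) / total two-party votes.

    `districts` holds one (dem_votes, rep_votes) pair per district.
    Sign convention is an assumption: positive means Democrats waste more.
    """
    wasted_dem = 0.0
    wasted_rep = 0.0
    total = 0.0
    for dem, rep in districts:
        district_total = dem + rep
        total += district_total
        threshold = district_total / 2.0  # votes needed to reach 50%
        if dem == rep:
            # Assumption: an exact tie contributes no net wasted-vote difference.
            continue
        if dem > rep:
            wasted_dem += dem - threshold  # surplus winning votes
            wasted_rep += rep              # all losing votes
        else:
            wasted_rep += rep - threshold
            wasted_dem += dem
    if total == 0:
        raise ValueError("no votes in any district")
    return (wasted_dem - wasted_rep) / total


# Illustrative three-district plan (made-up numbers).
plan = [(70, 30), (40, 60), (40, 60)]
gap = efficiency_gap(plan)
print(f"{gap:.3f}")
```

In this made-up plan Democrats waste 100 votes and Republicans waste 50 out of 300 cast, so the gap is 50/300.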

Scoring Formula

The VVI is a linear weighted sum of the five subscores, each normalized to [0, 1]:

VVI = 100 × (0.35 × competitiveness + 0.20 × mobilization + 0.15 × leverage + 0.20 × integrity + 0.10 × significance)

No nonlinear scaling or score inflation is applied. A score of 72 means the weighted average of normalized metrics is 0.72.
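As a minimal sketch of the formula above, the weighted sum can be written in a few lines of Python. The weights are the ones stated on this page. The function name, the dictionary layout, and the example subscores are hypothetical and only illustrate the arithmetic.

```python
# Weights as stated in the methodology; they sum to 1.0.
WEIGHTS = {
    "competitiveness": 0.35,
    "mobilization": 0.20,
    "leverage": 0.15,
    "integrity": 0.20,
    "significance": 0.10,
}


def vote_value_index(subscores: dict[str, float]) -> float:
    """Return the VVI on a 0-100 scale from subscores normalized to [0, 1]."""
    for name in WEIGHTS:
        value = subscores[name]  # raises KeyError if a subscore is missing
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return 100.0 * sum(WEIGHTS[name] * subscores[name] for name in WEIGHTS)


# Made-up subscores for illustration.
example = {
    "competitiveness": 0.80,
    "mobilization": 0.50,
    "leverage": 0.40,
    "integrity": 0.90,
    "significance": 0.60,
}
print(round(vote_value_index(example), 1))  # 68.0
```

Here the weighted terms are 0.28 + 0.10 + 0.06 + 0.18 + 0.06 = 0.68, which gives a VVI of 68.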

District Boundary Data

Primary Source: U.S. Census Bureau TIGER/Line

All district boundaries are sourced from the U.S. Census Bureau's TIGER/Line Shapefiles, the official geographic database for U.S. political boundaries.

Data Type | Source | Year | Update Frequency
Congressional Districts | Census TIGER/Line (CD) | 2024 (119th Congress) | After redistricting
State Senate Districts | Census TIGER/Line (SLDU) | 2024 | After redistricting
State House Districts | Census TIGER/Line (SLDL) | 2024 | After redistricting
Voting Precincts (VTDs) | Census TIGER/Line (VTD) | 2020 | Decennial Census

πŸ—ΊοΈ Boundary Processing

Raw shapefiles are converted to GeoJSON format and simplified for web performance (tolerance: 0.0005Β°) while preserving district boundary accuracy. Coordinates use WGS84 (EPSG:4326).

Precinct-Level Partisan Data

Primary Source: VEST 2020 (100% Coverage)

All 51 states (50 states + DC) now have real, certified 2020 Presidential election results at the precinct level from the VEST (Voting and Election Science Team) dataset hosted by Harvard Dataverse.

πŸ—³οΈ What is VEST?

VEST (Voting and Election Science Team) compiles official, certified election results from state election authorities and matches them to true precinct boundaries. This is the gold standard for precinct-level election data.

Data Quality: 100% vote conservation (source data, no estimation or distribution), true precinct boundaries (not VTD approximations), official results verified by state election authorities.

Geographic Level | Coverage | Data Source | Data Type
Precincts | 51/51 States | VEST 2020 (Harvard Dataverse) | Real Votes - 100% Conservation
↳ Precinct Count | 163,904 precincts | Certified state election results | 156,144,718 votes
↳ Fields Included | dem_votes, rep_votes, total_votes, margin, dem_pct, rep_pct, turnout
↳ Metadata | data_source: "VEST 2020", is_estimated: false, election_year: 2020

Data Cascade Priority

The precinct service uses the following priority cascade when loading data for a state:

🔄 Precinct Data Loading Priority

1. VEST 2020 (Primary) → Used for all 51 states ✅

2. NY Times Upshot 2020 (Fallback) → Not needed, VEST covers all states

3. State-specific sources (Fallback) → Not needed, VEST covers all states

4. OpenPrecincts (Fallback) → Not needed, VEST covers all states

5. Census VTD boundaries only (Last resort) → Geometry only, no partisan data
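The cascade above amounts to trying each source in order and keeping the first one that returns data. The sketch below shows that pattern only. The function `load_precincts`, the loader signature, and the stand-in loaders are hypothetical and are not taken from the VoteValue codebase.

```python
from typing import Callable, Optional

# A loader takes a state code and returns precinct records, or None if it has none.
Loader = Callable[[str], Optional[list[dict]]]


def load_precincts(state: str, loaders: list[tuple[str, Loader]]) -> tuple[str, list[dict]]:
    """Try each source in priority order and return the first that has data."""
    for source_name, loader in loaders:
        records = loader(state)
        if records:  # skips both None and an empty list
            return source_name, records
    raise LookupError(f"no precinct data available for {state}")


# Hypothetical stand-in loaders, for illustration only.
def load_vest(state: str) -> Optional[list[dict]]:
    return [{"precinct": "001", "dem_votes": 120, "rep_votes": 80}]


def load_vtd_geometry(state: str) -> Optional[list[dict]]:
    return [{"precinct": "001"}]  # geometry only, no partisan fields


CASCADE = [("VEST 2020", load_vest), ("Census VTD", load_vtd_geometry)]
source, records = load_precincts("OH", CASCADE)
print(source)
```

Keeping the priority order in a plain list means a new fallback can be added or reordered without touching the loading logic.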

States Covered by VEST 2020

All 51 states: Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, District of Columbia, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin, Wyoming

Partisan Margin Calculation

πŸ“ Formula

Margin = (Democratic % - Republican %) of two-party vote

Positive values indicate Democratic lean (D+X), negative values indicate Republican lean (R+X).

Example: A precinct with 60% Democratic and 40% Republican has a margin of +20 (D+20).
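A small sketch of the margin calculation, working from raw vote counts and using only the two-party total as the denominator. The function names and the D+X / R+X formatting helper are hypothetical.

```python
def partisan_margin(dem_votes: float, rep_votes: float) -> float:
    """Two-party margin in percentage points; positive means Democratic lean."""
    two_party = dem_votes + rep_votes
    if two_party <= 0:
        raise ValueError("no two-party votes recorded")
    return 100.0 * (dem_votes - rep_votes) / two_party


def format_lean(margin: float) -> str:
    """Render a margin as D+X or R+X, rounded to whole points."""
    if margin > 0:
        return f"D+{margin:.0f}"
    if margin < 0:
        return f"R+{-margin:.0f}"
    return "Even"


# The example from the text: 60% Democratic, 40% Republican.
margin = partisan_margin(600, 400)
print(margin, format_lean(margin))  # 20.0 D+20
```

Because third-party votes are left out of the denominator, a precinct with 580 Democratic, 390 Republican, and 30 other votes is scored on 970 two-party votes rather than 1,000.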

Derived Geographic Levels

Using the real VEST precinct data as a foundation, we can derive partisan data for finer and coarser geographic levels through spatial analysis.

Census Blocks: Population-Weighted Distribution

Census blocks are the finest geographic unit (~11 million blocks nationwide), but the Census Bureau does not collect vote data at this level. We distribute VEST precinct votes to census blocks using population-weighted spatial overlay:

🔬 Distribution Algorithm

Step 1: Spatially overlay precinct boundaries with census block boundaries to identify intersections

Step 2: For each precinct, sum the 2020 Census population of all intersecting blocks

Step 3: Distribute votes proportionally by population:

block.dem_votes = precinct.dem_votes × (block.POP20 / precinct_total_pop)

block.rep_votes = precinct.rep_votes × (block.POP20 / precinct_total_pop)

Validation: Vote conservation within 0.0-2.0% of precinct totals
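Steps 2 and 3 can be sketched as follows, assuming the spatial overlay in Step 1 has already assigned each block to a single precinct. The function name, the `block_id` key, and the zero-population behavior are assumptions. `POP20`, `dem_votes`, `rep_votes`, and `is_estimated` follow the names used on this page.

```python
def distribute_precinct_votes(precinct: dict, blocks: list[dict]) -> list[dict]:
    """Split one precinct's votes across its blocks in proportion to POP20."""
    precinct_total_pop = sum(block["POP20"] for block in blocks)
    if precinct_total_pop == 0:
        # Assumption: the page does not say how zero-population precincts are
        # handled, so this sketch refuses rather than guessing.
        raise ValueError("precinct has no populated blocks")
    allocated = []
    for block in blocks:
        share = block["POP20"] / precinct_total_pop
        allocated.append({
            "block_id": block["block_id"],
            "dem_votes": precinct["dem_votes"] * share,
            "rep_votes": precinct["rep_votes"] * share,
            "is_estimated": True,  # derived, not an actual block-level count
        })
    return allocated


# Made-up precinct with three blocks, one of them unpopulated.
precinct = {"dem_votes": 300, "rep_votes": 200}
blocks = [
    {"block_id": "b1", "POP20": 50},
    {"block_id": "b2", "POP20": 150},
    {"block_id": "b3", "POP20": 0},
]
allocated = distribute_precinct_votes(precinct, blocks)

# Conservation check: block totals should add back up to the precinct totals.
dem_total = sum(b["dem_votes"] for b in allocated)
rep_total = sum(b["rep_votes"] for b in allocated)
print(dem_total, rep_total)
```

Because the shares sum to 1 within a precinct, this simplified version conserves votes up to floating-point error. Blocks split between precincts in a real overlay are one way the totals can drift.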

⚠️ Important: Census block data is derived/estimated from real precinct votes, not actual block-level vote counts (which don't exist). This provides finer spatial resolution for analysis but should not be interpreted as actual census block voting records.

Availability: Census block data can be generated for any state with VEST precinct data (currently available for 13 states, can be expanded to all 51).

Metadata: is_estimated: true, data_source: "VEST 2020 (population-weighted distribution)", distribution_method: "population"

Congressional & State Legislative Districts: Area-Weighted Aggregation

District-level partisan data is calculated by aggregating VEST precinct votes that fall within each district boundary using area-weighted spatial overlay:

πŸ—³οΈ Aggregation Algorithm

Step 1: For each district, identify all VEST precincts that intersect the district boundary

Step 2: Calculate the overlap area ratio for each intersecting precinct:

area_ratio = (precinct ∩ district area) / total_precinct_area

Step 3: Aggregate votes using area weights:

district.dem_votes = Σ (precinct.dem_votes × area_ratio)

district.rep_votes = Σ (precinct.rep_votes × area_ratio)

Step 4: Calculate margin and percentages from aggregated totals

Validation: Vote conservation within 0.01% for all states
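Steps 2 through 4 can be sketched with the overlap areas already computed. In practice those areas would come from a GIS overlay of the precinct and district polygons; here they are supplied as plain numbers so the example stays self-contained. The function name and the `overlap_area` / `precinct_area` keys are hypothetical.

```python
def aggregate_district(overlaps: list[dict]) -> dict:
    """Sum area-weighted precinct votes for one district and compute its margin."""
    dem = 0.0
    rep = 0.0
    for item in overlaps:
        # Share of the precinct's area that falls inside the district.
        area_ratio = item["overlap_area"] / item["precinct_area"]
        dem += item["dem_votes"] * area_ratio
        rep += item["rep_votes"] * area_ratio
    two_party = dem + rep
    margin = 100.0 * (dem - rep) / two_party if two_party else 0.0
    return {"dem_votes": dem, "rep_votes": rep, "margin": margin}


# Made-up district: precinct A lies fully inside, precinct B is half inside.
overlaps = [
    {"dem_votes": 100, "rep_votes": 50, "overlap_area": 3.0, "precinct_area": 3.0},
    {"dem_votes": 80, "rep_votes": 120, "overlap_area": 2.0, "precinct_area": 4.0},
]
district = aggregate_district(overlaps)
print(district)
```

Precinct A contributes all of its votes and precinct B contributes half, giving 140 Democratic and 110 Republican votes and a margin of D+12. Area weighting assumes voters are spread evenly across a split precinct, which is the main source of error in this step.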

Availability: District partisan data can be generated for any state with VEST precinct data (currently available for 13 states, can be expanded to all 51).

Metadata: is_estimated: false (real votes, spatially aggregated), data_source: "VEST 2020 (precinct aggregation)", vote_source_method: "area_weighted", coverage_pct

Data Processing Status

Level | States Processed | Processing Status
Precincts (VEST 2020) | 51/51 | Complete - All States
Census Blocks (derived) | 13/51 | Partial - Can expand to all 51
Congressional Districts (derived) | 13/51 | Partial - Can expand to all 51
State Senate (derived) | 13/51 | Partial - Can expand to all 51
State House (derived) | 13/51 | Partial - Can expand to all 51

Compactness Score (Polsby-Popper)

Compactness measures how efficiently shaped a district is. Irregular, sprawling shapes may indicate gerrymandering.

Formula

Compactness = 4π × Area / Perimeter²

Score ranges from 0 to 1 (displayed as 0-100%), where 100% is a perfect circle. Lower scores may indicate gerrymandering.
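A minimal sketch of the Polsby-Popper calculation for a single polygon ring, using the shoelace formula for area. It assumes the vertices are in a planar (projected) coordinate system, since area and perimeter computed directly on longitude/latitude degrees are distorted. The function names and test shapes are hypothetical, and holes and multi-part districts are not handled.

```python
import math


def polygon_area_perimeter(ring: list[tuple[float, float]]) -> tuple[float, float]:
    """Return (area, perimeter) of a simple polygon given as (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1  # shoelace term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter


def polsby_popper(ring: list[tuple[float, float]]) -> float:
    """Compactness = 4π × Area / Perimeter², between 0 and 1."""
    area, perimeter = polygon_area_perimeter(ring)
    if perimeter == 0:
        raise ValueError("degenerate polygon")
    return 4.0 * math.pi * area / perimeter ** 2


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
strip = [(0, 0), (10, 0), (10, 1), (0, 1)]
print(round(polsby_popper(square), 3), round(polsby_popper(strip), 3))
```

A unit square scores π/4 (about 79%), while a 10-by-1 strip scores about 26%, which shows how elongation alone lowers the score.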

Score Range | Rating | Interpretation
≥ 40% | Compact | Regular, efficient shape
25-40% | Moderate | Some irregularity, often follows natural features
15-25% | Irregular | Unusual shape, may warrant scrutiny
< 15% | Potentially Gerrymandered | Highly irregular, likely intentionally drawn
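The rating bands above map directly onto a small lookup. The table does not say which band a score of exactly 25% or 15% belongs to, so this sketch assumes each lower bound is inclusive. The function name is hypothetical.

```python
def compactness_rating(score_pct: float) -> str:
    """Map a Polsby-Popper score (0-100%) to the rating bands in the table."""
    if not 0.0 <= score_pct <= 100.0:
        raise ValueError("score must be between 0 and 100")
    # Assumption: lower bounds are inclusive (25.0 is Moderate, 15.0 is Irregular).
    if score_pct >= 40.0:
        return "Compact"
    if score_pct >= 25.0:
        return "Moderate"
    if score_pct >= 15.0:
        return "Irregular"
    return "Potentially Gerrymandered"


for score in (78.5, 26.0, 15.0, 9.2):
    print(score, compactness_rating(score))
```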

External Data Sources

Primary Data Sources (Used in Production)

Supplementary Sources

Academic Research & Methodology

Alternative/Fallback Sources (Not Currently Used)

These sources were evaluated but are not currently used in production since VEST 2020 provides complete coverage:

Full Citations (BibTeX)

A machine-readable BibTeX file with complete citations for all datasets is available in the repository at CITATIONS.bib. If you use VoteValue or its data outputs in academic work, please cite the original data providers.

Data Freshness & Updates

Data Type | Current Version | Last Updated | Next Expected Update
VEST Precinct Partisan Data | 2020 Presidential | January 2026 (51/51 states) | When VEST 2024 dataset released
↳ Data Source | Certified state election results from November 2020 - 163,904 precincts, 156M votes
Congressional District Boundaries | 119th Congress (2024) | January 2024 | After 2030 Census redistricting
State Legislative Boundaries | 2023 redistricting | January 2024 | After 2030 Census redistricting
Precinct Boundaries (VEST) | 2020 (true precincts) | November 2020 | When VEST 2024 released
Census Block Boundaries | 2020 Census (TABBLOCK20) | 2020 | 2030 Census
Census Block Partisan Data (derived) | 13 states processed | January 2026 | Can expand to all 51 states
District Partisan Data (derived) | 13 states processed | January 2026 | Can expand to all 51 states

📅 2024 Presidential Data

Future Update: VEST typically releases new precinct data 6-12 months after an election. The VEST 2024 dataset is expected in late 2025 or early 2026. When released, VoteValue will update to include 2024 Presidential election results.

Page last updated: April 4, 2026

Major Update: Achieved 100% nationwide precinct data coverage (51/51 states) using VEST 2020 dataset

Data compiled from authoritative public sources
