Data Sources & Methodology

Real Data

The VoteValue Index is computed from real election data: congressional and Senate results from MIT Election Lab (1976–2024), state legislative results from MEDSL (2022) and the Princeton/Klarner historical archives (1968–2018), district boundaries from U.S. Census TIGER/Line, demographic data from Census ACS, and compactness metrics computed from official shapefiles. VoteValue scores may contain errors. Do not use VoteValue as your sole source for electoral decisions.

Data freshness

District boundaries: U.S. Census TIGER/Line · 119th Congress (2025–2026 cycle)

Demographics: ACS 5-year · 2023 release (pools 2019–2023)

Federal election results: MIT Election Data and Science Lab · 2024 general election

State legislative results: Ballotpedia + secretaries of state · Most recent state cycle

Precinct geometry: VEST 2020 · 2020 cycle

Compactness metrics: Computed in-house from TIGER/Line · Recomputed on each shapefile refresh

Data Overview

VoteValue draws on authoritative public data sources to provide voting district information, with real precinct data covering all 50 states plus DC.

States & DC Covered: 51 (50 states + DC)

Real Precincts (VEST 2020): 163,904

2020 Presidential Votes: 156M

Vote Conservation: 100%

States with Real Data: 51/51

✅ Real Data Achievement

January 2026: VoteValue now has complete nationwide coverage using VEST 2020 precinct data from Harvard Dataverse. All 51 states (50 states + DC) have real, certified 2020 Presidential election results at the precinct level with true precinct boundaries and 100% vote conservation.

163,904 precincts with 156,144,718 actual votes from official state election authorities.

⚠️ Important Notes:

Precinct Data: 100% real from VEST 2020 (certified state election results)
District Metrics: Some compactness and demographic metrics may use estimates where official data is unavailable
Representative Info: Candidate information should be verified independently
Due Diligence: Always verify voting procedures with official sources

VoteValue Index (VVI) Methodology

The VoteValue Index quantifies voting power on a 0-100 scale by analyzing five key factors that determine how much impact your individual vote has in shaping electoral outcomes.

Competitiveness 35%

How close recent elections have been, based on margin closeness, historical volatility, and party-flip frequency. Data from MIT Election Lab, MEDSL, and Princeton/Klarner historical archives.

Mobilization Potential 20%

Room for increased turnout to shift outcomes. Based on turnout gap vs. national median and whether the race is contested. Data from election results and Census ACS voting-eligible population.

Electoral Leverage 15%

How much one vote can swing the result. Based on absolute vote differential and district population size. Smaller districts with closer margins give each voter more structural weight.

District Integrity 20%

How fairly the district is drawn. Combines geometric compactness (Polsby-Popper, Reock, Convex Hull from Census TIGER shapefiles) with partisan efficiency gap analysis (Stephanopoulos & McGhee 2015).

Race Significance 10%

How much this race matters for shifting chamber control. Based on chamber partisan margin (from OpenStates), contestedness, and incumbency vulnerability.

Scoring Formula

The VVI is a linear weighted sum of the five subscores, each normalized to [0, 1]:

VVI = 100 × (0.35 × competitiveness + 0.20 × mobilization + 0.15 × leverage + 0.20 × integrity + 0.10 × significance)

No nonlinear scaling or score inflation is applied. A score of 72 means the weighted average of normalized metrics is 0.72.
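The formula above can be sketched in a few lines of Python. This is an illustrative sketch, not VoteValue's production code: the function name `vote_value_index`, the dictionary keys, and the example subscores are assumptions chosen for the illustration. Only the weights and the linear form come from the formula itself.

```python
# Weights as stated in the Scoring Formula; they sum to 1.0.
WEIGHTS = {
    "competitiveness": 0.35,
    "mobilization": 0.20,
    "leverage": 0.15,
    "integrity": 0.20,
    "significance": 0.10,
}


def vote_value_index(subscores: dict[str, float]) -> float:
    """Return the VVI (0-100) as a linear weighted sum of normalized subscores."""
    for name in WEIGHTS:
        value = subscores[name]  # raises KeyError if a subscore is missing
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be normalized to [0, 1], got {value}")
    return 100.0 * sum(WEIGHTS[name] * subscores[name] for name in WEIGHTS)


# Hypothetical subscores for one district (illustrative values only).
example = {
    "competitiveness": 0.80,
    "mobilization": 0.60,
    "leverage": 0.50,
    "integrity": 0.70,
    "significance": 0.40,
}
print(round(vote_value_index(example), 1))  # 65.5
```

Because the sum is linear, raising any single subscore by 0.10 raises the VVI by exactly ten times that factor's weight, for example 3.5 points for Competitiveness.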

District Boundary Data

Primary Source: U.S. Census Bureau TIGER/Line

All district boundaries are sourced from the U.S. Census Bureau's TIGER/Line Shapefiles, the official geographic database for U.S. political boundaries.

Data Type	Source	Year	Update Frequency
Congressional Districts	Census TIGER/Line (CD)	2024 (119th Congress)	After redistricting
State Senate Districts	Census TIGER/Line (SLDU)	2024	After redistricting
State House Districts	Census TIGER/Line (SLDL)	2024	After redistricting
Voting Precincts (VTDs)	Census TIGER/Line (VTD)	2020	Decennial Census

🗺️ Boundary Processing

Raw shapefiles are converted to GeoJSON format and simplified for web performance (tolerance: 0.0005°) while preserving district boundary accuracy. Coordinates use WGS84 (EPSG:4326).
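To give a sense of what a 0.0005° tolerance means, here is a minimal sketch of Douglas-Peucker-style line simplification in plain Python. The choice of algorithm is an assumption: the page does not say which simplification method VoteValue uses. The function names `simplify` and `_point_segment_distance` are hypothetical.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b, in coordinate units (degrees here)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance=0.0005):
    """Drop vertices that lie within `tolerance` of the simplified line.

    Assumed algorithm (Douglas-Peucker style); recursive, so very long
    lines may need an iterative variant.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # avoid duplicating the shared vertex


# A 0.0001 degree wiggle is removed; a 0.001 degree bend is kept.
print(simplify([(0.0, 0.0), (0.001, 0.0001), (0.002, 0.0)]))
print(simplify([(0.0, 0.0), (0.001, 0.001), (0.002, 0.0)]))
```

At mid-latitudes 0.0005° is on the order of tens of meters, so the simplification removes small wiggles while keeping bends that are visible at typical web-map zoom levels.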

Precinct-Level Partisan Data

Primary Source: VEST 2020 (100% Coverage)

All 51 states (50 states + DC) now have real, certified 2020 Presidential election results at the precinct level from the VEST (Voting and Election Science Team) dataset hosted by Harvard Dataverse.

🗳️ What is VEST?

VEST (Voting and Election Science Team) compiles official, certified election results from state election authorities and matches them to true precinct boundaries. This is the gold standard for precinct-level election data.

Data Quality: 100% vote conservation (source data, no estimation or distribution), true precinct boundaries (not VTD approximations), official results verified by state election authorities.

Geographic Level	Coverage	Data Source	Data Type
Precincts	51/51 States	VEST 2020 (Harvard Dataverse)	Real Votes - 100% Conservation
↳ Precinct Count	163,904 precincts	Certified state election results	156,144,718 votes
↳ Fields Included	dem_votes, rep_votes, total_votes, margin, dem_pct, rep_pct, turnout
↳ Metadata	`data_source: "VEST 2020"`, `is_estimated: false`, `election_year: 2020`

Data Cascade Priority

The precinct service uses the following priority cascade when loading data for a state:

🔄 Precinct Data Loading Priority

1. VEST 2020 (Primary) → Used for all 51 states ✅

2. NY Times Upshot 2020 (Fallback) → Not needed, VEST covers all states

3. State-specific sources (Fallback) → Not needed, VEST covers all states

4. OpenPrecincts (Fallback) → Not needed, VEST covers all states

5. Census VTD boundaries only (Last resort) → Geometry only, no partisan data
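The priority cascade above amounts to "try each source in order and take the first one that returns data". A minimal sketch under that assumption follows. The loader functions are hypothetical stubs, not VoteValue's actual precinct service, and the record fields are illustrative.

```python
from typing import Callable, Optional

Records = list[dict]
Loader = Callable[[str], Optional[Records]]


def load_precincts(state: str, cascade: list[tuple[str, Loader]]) -> tuple[str, Records]:
    """Try each source in priority order and return (source_name, records)."""
    for source_name, loader in cascade:
        records = loader(state)
        if records:  # the first source with data wins
            return source_name, records
    raise LookupError(f"no precinct data available for {state}")


# Hypothetical stub loaders standing in for real data access.
def load_vest_2020(state: str) -> Optional[Records]:
    return [{"precinct_id": f"{state}-001", "dem_votes": 120, "rep_votes": 80}]


def load_nothing(state: str) -> Optional[Records]:
    return None


def load_vtd_geometry(state: str) -> Optional[Records]:
    # Last resort: geometry only, no partisan fields.
    return [{"precinct_id": f"{state}-VTD-001"}]


CASCADE = [
    ("VEST 2020", load_vest_2020),
    ("NY Times Upshot 2020", load_nothing),
    ("State-specific sources", load_nothing),
    ("OpenPrecincts", load_nothing),
    ("Census VTD boundaries only", load_vtd_geometry),
]

source, records = load_precincts("PA", CASCADE)
print(source, len(records))  # VEST 2020 1
```

Since VEST 2020 covers every state, the first loader always succeeds in practice and the fallbacks are never reached.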

States Covered by VEST 2020

All 51 states: Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, District of Columbia, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin, Wyoming

Partisan Margin Calculation

📐 Formula

Margin = (Democratic % - Republican %) of two-party vote

Positive values indicate Democratic lean (D+X), negative values indicate Republican lean (R+X).

Example: A precinct with 60% Democratic and 40% Republican has a margin of +20 (D+20).
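As a worked illustration of the formula, the sketch below computes the two-party margin from raw vote counts and formats it as a D+X / R+X label. The function names and the "EVEN" label for an exact tie are assumptions made for this example.

```python
def two_party_margin(dem_votes: int, rep_votes: int) -> float:
    """Democratic % minus Republican % of the two-party vote, in percentage points."""
    two_party_total = dem_votes + rep_votes
    if two_party_total == 0:
        raise ValueError("no two-party votes cast")
    dem_pct = 100.0 * dem_votes / two_party_total
    rep_pct = 100.0 * rep_votes / two_party_total
    return dem_pct - rep_pct


def format_lean(margin: float) -> str:
    """Positive margins are Democratic leans, negative margins Republican leans."""
    if margin > 0:
        return f"D+{margin:.0f}"
    if margin < 0:
        return f"R+{-margin:.0f}"
    return "EVEN"  # label for an exact tie is an assumption


margin = two_party_margin(600, 400)
print(margin, format_lean(margin))  # 20.0 D+20
```

Third-party votes are excluded from the denominator, so a precinct split 54% / 36% / 10% has a two-party margin of D+20, not D+18.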

Derived Geographic Levels

Using the real VEST precinct data as a foundation, we can derive partisan data for finer and coarser geographic levels through spatial analysis.

Census Blocks: Population-Weighted Distribution

Census blocks are the finest geographic unit (roughly 8 million blocks nationwide in the 2020 Census), but election results are not reported at this level. We distribute VEST precinct votes to census blocks using population-weighted spatial overlay:

🔬 Distribution Algorithm

Step 1: Spatially overlay precinct boundaries with census block boundaries to identify intersections

Step 2: For each precinct, sum the 2020 Census population of all intersecting blocks

Step 3: Distribute votes proportionally by population:

block.dem_votes = precinct.dem_votes × (block.POP20 / precinct_total_pop)

block.rep_votes = precinct.rep_votes × (block.POP20 / precinct_total_pop)

Validation: Vote conservation within 0.0-2.0% of precinct totals
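Steps 2 and 3 can be sketched as follows, assuming the spatial overlay in Step 1 has already produced the list of blocks belonging to a precinct. The function name, the record layout, and the choice to raise an error for a zero-population precinct are assumptions of this sketch; only the proportional formula and the `POP20` field come from the algorithm above.

```python
def distribute_votes(precinct: dict, blocks: list[dict]) -> list[dict]:
    """Split a precinct's votes across its blocks in proportion to POP20."""
    precinct_total_pop = sum(block["POP20"] for block in blocks)
    if precinct_total_pop == 0:
        # Handling of zero-population precincts is an assumption of this sketch.
        raise ValueError("precinct has no population to weight by")
    distributed = []
    for block in blocks:
        share = block["POP20"] / precinct_total_pop
        distributed.append({
            "block_id": block["block_id"],
            "dem_votes": precinct["dem_votes"] * share,
            "rep_votes": precinct["rep_votes"] * share,
            "is_estimated": True,  # derived values, not real block-level counts
        })
    return distributed


# Illustrative inputs only.
precinct = {"dem_votes": 300, "rep_votes": 200}
blocks = [
    {"block_id": "A", "POP20": 50},
    {"block_id": "B", "POP20": 30},
    {"block_id": "C", "POP20": 20},
]
result = distribute_votes(precinct, blocks)

# Vote conservation check: block totals should match the precinct totals.
dem_total = sum(row["dem_votes"] for row in result)
print(round(dem_total, 6))  # 300.0
```

With exact population shares the block totals reproduce the precinct totals up to floating-point error. Larger deviations, such as the 0.0-2.0% band quoted above, would come from steps this sketch omits, for example rounding or blocks that straddle precinct boundaries.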

⚠️ Important: Census block data is derived/estimated from real precinct votes, not actual block-level vote counts (which don't exist). This provides finer spatial resolution for analysis but should not be interpreted as actual census block voting records.

Availability: Census block data can be generated for any state with VEST precinct data (currently available for 13 states, can be expanded to all 51).

Metadata: is_estimated: true, data_source: "VEST 2020 (population-weighted distribution)", distribution_method: "population"

Congressional & State Legislative Districts: Area-Weighted Aggregation

District-level partisan data is calculated by aggregating VEST precinct votes that fall within each district boundary using area-weighted spatial overlay:

🗳️ Aggregation Algorithm

Step 1: For each district, identify all VEST precincts that intersect the district boundary

Step 2: Calculate the overlap area ratio for each intersecting precinct:

area_ratio = (precinct ∩ district area) / total_precinct_area

Step 3: Aggregate votes using area weights:

district.dem_votes = Σ (precinct.dem_votes × area_ratio)

district.rep_votes = Σ (precinct.rep_votes × area_ratio)

Step 4: Calculate margin and percentages from aggregated totals

Validation: Vote conservation within 0.01% for all states
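Steps 2 through 4 can be sketched as below, assuming the intersection areas from Step 1 have already been computed by a GIS overlay. The function name, the input record layout, and the example numbers are hypothetical. Areas only need to be in consistent units, because only their ratio is used.

```python
def aggregate_district(overlaps: list[dict]) -> dict:
    """Aggregate precinct votes into one district using area weights.

    Each input record describes one precinct that intersects the district:
    its votes, its total area, and the area of its overlap with the district.
    """
    dem_votes = 0.0
    rep_votes = 0.0
    for item in overlaps:
        area_ratio = item["overlap_area"] / item["precinct_area"]
        dem_votes += item["dem_votes"] * area_ratio
        rep_votes += item["rep_votes"] * area_ratio
    two_party = dem_votes + rep_votes
    # Two-party margin in percentage points; 0.0 if no votes were aggregated.
    margin = 100.0 * (dem_votes - rep_votes) / two_party if two_party else 0.0
    return {"dem_votes": dem_votes, "rep_votes": rep_votes, "margin": margin}


# Illustrative inputs: one precinct fully inside, one with 25% of its area inside.
overlaps = [
    {"dem_votes": 100, "rep_votes": 50, "precinct_area": 2.0, "overlap_area": 2.0},
    {"dem_votes": 80, "rep_votes": 120, "precinct_area": 4.0, "overlap_area": 1.0},
]
district = aggregate_district(overlaps)
print(district)  # {'dem_votes': 120.0, 'rep_votes': 80.0, 'margin': 20.0}
```

Area weighting treats voters as evenly spread within each precinct, so it is exact for precincts that fall wholly inside one district and an approximation for precincts split by a district line.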

Availability: District partisan data can be generated for any state with VEST precinct data (currently available for 13 states, can be expanded to all 51).

Metadata: is_estimated: false (real votes, spatially aggregated), data_source: "VEST 2020 (precinct aggregation)", vote_source_method: "area_weighted", coverage_pct

Data Processing Status

Level	States Processed	Processing Status
Precincts (VEST 2020)	51/51	Complete - All States
Census Blocks (derived)	13/51	Partial - Can expand to all 51
Congressional Districts (derived)	13/51	Partial - Can expand to all 51
State Senate (derived)	13/51	Partial - Can expand to all 51
State House (derived)	13/51	Partial - Can expand to all 51

Compactness Score (Polsby-Popper)

Compactness measures how efficiently shaped a district is. Irregular, sprawling shapes may indicate gerrymandering.

Formula

Compactness = 4π × Area / Perimeter²

Score ranges from 0 to 1 (displayed as 0-100%), where 100% is a perfect circle. Lower scores may indicate gerrymandering.

Score Range	Rating	Interpretation
≥ 40%	Compact	Regular, efficient shape
25-40%	Moderate	Some irregularity, often follows natural features
15-25%	Irregular	Unusual shape, may warrant scrutiny
< 15%	Potentially Gerrymandered	Highly irregular, likely intentionally drawn
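The score and rating bands above can be illustrated with a small planar-geometry sketch. It assumes the polygon is a single ring of (x, y) vertices in a projected coordinate system with equal units on both axes. Computing area and perimeter directly on WGS84 degrees would distort the result, and the page does not say which projection VoteValue uses. Function names are hypothetical.

```python
import math


def polsby_popper(ring: list[tuple[float, float]]) -> float:
    """Return 4*pi*Area / Perimeter^2 for a simple polygon ring (planar coordinates)."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1  # shoelace formula term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(twice_area) / 2.0
    return 4.0 * math.pi * area / perimeter ** 2


def rating(score: float) -> str:
    """Map a 0-1 score to the rating bands in the table above."""
    pct = 100.0 * score
    if pct >= 40:
        return "Compact"
    if pct >= 25:
        return "Moderate"
    if pct >= 15:
        return "Irregular"
    return "Potentially Gerrymandered"


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
strip = [(0, 0), (20, 0), (20, 1), (0, 1)]  # a 20-by-1 rectangle
print(round(polsby_popper(square), 3), rating(polsby_popper(square)))  # 0.785 Compact
print(round(polsby_popper(strip), 3), rating(polsby_popper(strip)))    # 0.142 Potentially Gerrymandered
```

Even a perfect square scores only about 79%, and a long thin but perfectly regular rectangle falls in the lowest band, which is one reason low scores are a signal for scrutiny and not proof of gerrymandering.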

External Data Sources

Primary Data Sources (Used in Production)

VEST 2020 (Harvard Dataverse)
Primary source for precinct-level 2020 Presidential election results. Covers all 51 states (50 states + DC) with certified results from state election authorities. Maintained by the Voting and Election Science Team.
MIT Election Lab. U.S. House 1976–2024 (Harvard Dataverse)
Congressional election returns spanning nearly five decades. Used for competitiveness scoring, margin history, volatility, and party-flip frequency in the VVI Competitiveness metric.
MIT Election Lab. U.S. Senate 1976–2020 (Harvard Dataverse)
Statewide Senate election returns. Used for Senate race competitiveness and historical margin trends.
MIT Election Lab. County Presidential Returns 2000–2024 (Harvard Dataverse)
County-level presidential election returns. Used for geographic partisan context and turnout analysis.
MIT Election Lab. U.S. Senate County-Level Results 2022 (Harvard Dataverse)
2022 midterm Senate results at county level.
MIT Election Lab. U.S. Senate Precinct-Level Returns 2022 (Harvard Dataverse)
Precinct-level 2022 Senate results.
Klarner. State Legislative Election Returns 1967–2016 (Harvard Dataverse)
Historical state senate election results used for margin volatility and competitiveness trends in the VVI.
Princeton Gerrymandering Project. State House Returns 1968–2018
Historical state house election results. Used alongside Klarner data for multi-decade competitiveness analysis at the state legislative level.
MEDSL 2022 Precinct-Level Election Results
2022 midterm results aggregated at precinct and district level. Supplements Princeton/Klarner archives with the most recent state legislative election cycle.
U.S. Census Bureau TIGER/Line Shapefiles (2024)
Official geographic boundaries for the 119th Congress: congressional districts (CD119), state senate (SLDU), and state house (SLDL). Also 2020 Voting Tabulation Districts (VTD). Used for district geometry, point-in-polygon lookups, and compactness metrics.
2020 Census PL 94-171 Redistricting Files
Census block population data used for population-weighted distribution of precinct votes to census blocks.
Census ACS 5-Year Estimates (2019–2023)
Population, education, and income estimates by geography. Used for voting-eligible population calculations.
U.S. Elections Project. Voter Turnout (VEP)
State-level voting-eligible population by year (Michael P. McDonald). Used for turnout elasticity in the VVI Mobilization Potential metric.
Open States API v3
State legislative representative data, district assignments, and chamber composition (party seat counts). Used for representative lookups and the VVI Race Significance metric.
Google Maps Platform
Geocoding and place data for address-to-coordinates conversion.

Supplementary Sources

Ballotpedia. Supplemental state legislative election results when MEDSL/Princeton/Klarner data does not cover a particular district-year
NCSL. Authoritative state legislature chamber seat counts for computing chamber partisan margin
Census Geocoder. Supplemental reverse geocoding for district lookups
Nominatim (OpenStreetMap). Fallback geocoding service

Academic Research & Methodology

Polsby, Daniel D. and Robert D. Popper. "The Third Criterion: Compactness as a Procedural Safeguard Against Partisan Gerrymandering." Yale Law & Policy Review 9(2): 301–353 (1991). Polsby-Popper compactness metric
Reock, Ernest C. "A Note: Measuring Compactness as a Requirement of Legislative Apportionment." Midwest Journal of Political Science 5(1): 70–74 (1961). Reock compactness metric (minimum bounding circle)
Stephanopoulos, Nicholas O. and Eric M. McGhee. "Partisan Gerrymandering and the Efficiency Gap." University of Chicago Law Review 82(2): 831–900 (2015). Efficiency gap metric
Redistricting Data Hub. Redistricting research and advocacy
Princeton Gerrymandering Project. Gerrymandering metrics and analysis
PlanScore. Redistricting plan evaluation

Alternative/Fallback Sources (Not Currently Used)

These sources were evaluated but are not currently used in production since VEST 2020 provides complete coverage:

NY Times Upshot 2020 Presidential Precinct Map. Alternative precinct-level data (VEST preferred for quality and coverage)
OpenPrecincts. Crowdsourced precinct boundaries (VEST preferred for official certification)
Federal Election Commission. Federal election results at state/CD level (VEST provides finer granularity)

Full Citations (BibTeX)

A machine-readable BibTeX file with complete citations for all datasets is available in the repository at CITATIONS.bib. If you use VoteValue or its data outputs in academic work, please cite the original data providers.

Data Freshness & Updates

Data Type	Current Version	Last Updated	Next Expected Update
VEST Precinct Partisan Data	2020 Presidential	January 2026 (51/51 states)	When VEST 2024 dataset released
↳ Data Source	Certified state election results from November 2020 - 163,904 precincts, 156M votes
Congressional District Boundaries	119th Congress (2024)	January 2024	After 2030 Census redistricting
State Legislative Boundaries	2023 redistricting	January 2024	After 2030 Census redistricting
Precinct Boundaries (VEST)	2020 (true precincts)	November 2020	When VEST 2024 released
Census Block Boundaries	2020 Census (TABBLOCK20)	2020	2030 Census
Census Block Partisan Data (derived)	13 states processed	January 2026	Can expand to all 51 states
District Partisan Data (derived)	13 states processed	January 2026	Can expand to all 51 states

📅 2024 Presidential Data

Future Update: VEST typically releases new precinct data 6-12 months after an election. The VEST 2024 dataset is expected in late 2025 or early 2026. When released, VoteValue will update to include 2024 Presidential election results.

Page last updated: April 4, 2026

Major Update: Achieved 100% nationwide precinct data coverage (51/51 states) using VEST 2020 dataset

Data compiled from authoritative public sources
