6,559,626
Total Deletions 2024
84.3
Avg per Booth
243
Constituencies
-83.3%
Change from 2020
179
Alliance Flips
77,837
Polling Booths
Key Finding
Bihar saw ~6.56M voter deletions in 2024, an 83.3% reduction from 39.2M in 2020. Patterns vary by region, demographics, and political affiliation.
1. How Did Political Alliances Shift?
Key Insight: Of 243 constituencies, 179 (74%) changed political affiliation between elections.
A majority of seats changed alliance between 2020 and 2024, with many shifting to independents.
2. How Did Deletions Change Over Time?
Key Insight: Deletions dropped from 39.2M to 6.56M — a dramatic 83% decline.
Average deletions per booth fell from 504 to 84.
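The headline figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the same booth count (77,837) applies to both years, which is what the reported per-booth averages of 504 and 84 imply.

```python
# Totals and booth count as reported on the dashboard.
DELETIONS_2020 = 39_248_769
DELETIONS_2024 = 6_559_626
BOOTHS = 77_837  # assumption: same booth count used for both years


def pct_change(old: float, new: float) -> float:
    """Signed percentage change from old to new."""
    return (new - old) / old * 100.0


def per_booth(total: int, booths: int) -> float:
    """Average deletions per polling booth."""
    return total / booths


print(round(pct_change(DELETIONS_2020, DELETIONS_2024), 1))  # -83.3
print(round(per_booth(DELETIONS_2020, BOOTHS), 1))           # 504.2
print(round(per_booth(DELETIONS_2024, BOOTHS), 1))           # 84.3
print(DELETIONS_2020 - DELETIONS_2024)                       # 32689143
```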
3. How Do NDA and INDIA Compare?
Key Insight: NDA constituencies average 81.2 deletions per booth vs INDIA's 93.0.
Both alliances show similar demographic profiles despite intensity differences.
4. How Are Deletions Distributed?
Key Insight: Bottom 50% of constituencies account for about 25% of total deletions.
A small number of high-intensity areas drive most of the overall numbers.
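One way to compute a concentration figure like the one above is to sort constituencies by deletions and take the share held by the lower half. This is a minimal sketch with made-up numbers; `bottom_share` is a hypothetical helper, not the dashboard's own code.

```python
def bottom_share(values, fraction=0.5):
    """Percent of the total held by the lowest `fraction` of entries."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    cutoff = int(len(ordered) * fraction)  # number of entries in the bottom group
    return sum(ordered[:cutoff]) / sum(ordered) * 100.0


# Illustrative deletions per constituency (not real data).
sample = [10, 20, 4, 1, 3, 2]
print(round(bottom_share(sample), 1))  # 15.0
```

A value well below 50% for the bottom half indicates that deletions are concentrated in the upper half of constituencies.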
5. What Was the Overall Reduction?
Key Insight: The 83.3% reduction amounts to about 32.7 million fewer deletions than in 2020.
This decline was uniform across all regions and demographic groups.
6. How Are Constituencies Grouped by Intensity?
Key Insight: High intensity (Q4) contains 61 constituencies; Low intensity (Q1) has 61.
2024 shows a more balanced distribution than 2020's concentration at the top.
Deletion Patterns
Top districts show 2-3x higher deletion rates than lowest performers. Significant geographic clustering is observed.
1. Which Districts Have Highest Shares?
Key Insight: Top 5 districts account for approximately 18-22% of all deletions.
Madhubani and Darbhanga consistently rank among highest intensity areas.
2. How Do Top Districts Compare?
Key Insight: Highest intensity districts exceed 100 deletions per booth.
Clear separation exists between top tier (>100) and second tier (70-100) districts.
3. Does Volume Relate to Intensity?
Key Insight: Total volume and per-booth intensity show a positive relationship.
Some high-population districts have lower intensity — size alone doesn't predict deletions.
4. How Do Multiple Factors Interact?
Key Insight: Lower GDP districts tend toward higher deletion rates.
This is an observed pattern, not proof of cause and effect.
5. Which Surnames Are Most Common?
Key Insight: Top 5 surnames account for 75.1% of all deletions — high concentration.
DEVI alone represents 47.5% (predominantly female indicator).
6. How Does Intensity Vary by Region?
Key Insight: Border regions (Nepal, Jharkhand) show distinct patterns vs interior.
Nepal border districts tend toward higher deletion concentrations.
39,248,769
Deletions 2020
6,559,626
Deletions 2024
-83.3%
Reduction
Temporal Analysis
Dramatic reduction in deletions between 2020 State and 2024 National elections. Every single constituency shows decline.
1. How Did Each Constituency Change?
Key Insight: All 243 constituencies show reduction — none increased from 2020 to 2024.
NDA areas cluster at lower 2024 levels than INDIA areas.
2. How Did Intensity Groups Shift?
Key Insight: Many high-intensity (Q4) areas in 2020 shifted to medium intensity in 2024.
Low intensity group (Q1) expanded — more areas now have lower deletion rates.
3. Which Areas Improved Most?
Key Insight: Top improvers show reductions of 83.4% to 83.6%.
Example: Sitamarhi — 568 → 93 deletions per booth.
4. How Did Top 20 Areas Change?
Key Insight: Even highest-intensity areas show 70-90% reduction from 2020 levels.
Motihari: 1011→168 deletions per booth — typical of statewide pattern.
5. How Did Profile Change?
Key Insight: 2020 shows concentration in high-intensity; 2024 shows more even spread.
Fewer extreme outlier constituencies in 2024.
6. What Was the Overall Reduction?
Key Insight: 83.3% reduction was uniform across geography, demographics, and politics.
Suggests system-wide factors rather than targeted changes.
55.5%
Female Share
44.5%
Male Share
3,585,519
Female Deletions
2,875,762
Male Deletions
Gender Analysis
Women constitute 55.5% of deletions vs ~50.7% of Bihar population. Classification uses surname patterns (DEVI, KUMARI = Female).
1. What Is the Gender Split?
Key Insight: Female: 55.5% | Male: 44.5% — females about 5 points above population share.
Gender identified by surname suffixes (DEVI, KUMARI indicate female).
2. How Does Gender Vary by District?
Key Insight: Female share ranges from 52-60% across districts — consistently above male.
No district shows male-dominated deletion pattern.
3. Does Gender Relate to Intensity?
Key Insight: Weak relationship between female share and deletion intensity.
High-intensity areas don't show systematic gender skew.
4. How Is Gender Distributed by Region?
Key Insight: Female deletions distributed proportionally across all border regions.
Interior and Nepal border regions show similar ~55% female share.
5. Is Gender Consistent Across Districts?
Key Insight: Female share is remarkably consistent (52-58%) across diverse districts.
Urban vs rural districts show minimal gender composition difference.
6. How Does Gender Vary by Alliance?
Key Insight: NDA areas: 55.4% female; INDIA areas: 57.0% female.
Gender composition is similar across different political constituencies.
Caste Composition
EBC (Extremely Backward Classes) leads at 60.9%. Classification uses Bihar Caste Survey 2022 surname mapping.
1. What Is the Caste Breakdown?
Key Insight: EBC: 60.9% | OBC: 18.3% | SC: 14.8% — backward classes total ~94%.
Forward Castes represent a small share despite historical prominence; only ST is smaller.
2. How Do Castes Align Politically?
Key Insight: EBC voters distributed across both NDA and INDIA alliance areas.
OBC shows stronger NDA presence; SC shows INDIA leaning.
3. How Do Alliances Compare on Caste?
Key Insight: NDA areas average 0.4% EBC; INDIA areas average 0.4% EBC.
Caste composition varies more by geography than by political affiliation.
4. What Is the Caste Hierarchy?
Key Insight: Caste hierarchy: EBC > OBC > SC > FC > ST (consistent with state demographics).
Pattern mirrors Bihar Caste Survey 2022 population estimates.
5. How Are Castes Distributed?
Key Insight: Combined EBC+OBC+SC = ~94% — backward class dominance in deletions.
ST (Scheduled Tribes) represent <2% — consistent with Bihar's low tribal population.
6. Does Caste Relate to Intensity?
Key Insight: EBC share shows no clear relationship with deletion intensity.
Caste composition alone doesn't predict constituency-level deletion rates.
75.6%
Hindu Share
15.4%
Muslim Share
Religious Composition
Hindu voters ~76% (Census 2011: 82.7%); Muslim ~15% (Census: 16.9%). Based on surname classification.
1. What Is the Religious Split?
Key Insight: Hindu: 75.6% | Muslim: 15.4% | Unknown: 9.0%.
Classification uses surname patterns; 'Unknown' indicates ambiguous surnames.
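Because 9.0% of deletions fall in the Unknown category, the raw Hindu and Muslim shares are not directly comparable to census shares. The sketch below rescales the known categories to sum to 100%. It assumes Unknown records are distributed in the same proportions as the classified ones, which is an illustrative assumption and not something the source states; `renormalize_known` is a hypothetical helper.

```python
CENSUS_2011 = {"Hindu": 82.7, "Muslim": 16.9}                   # shares cited above
DELETION_SHARES = {"Hindu": 75.6, "Muslim": 15.4, "Unknown": 9.0}


def renormalize_known(shares, unknown_key="Unknown"):
    """Rescale all categories except `unknown_key` so they sum to 100."""
    known_total = sum(v for k, v in shares.items() if k != unknown_key)
    return {k: v / known_total * 100.0
            for k, v in shares.items() if k != unknown_key}


known = renormalize_known(DELETION_SHARES)
for group, share in known.items():
    print(group, round(share, 1), "vs census", CENSUS_2011[group])
# Hindu 83.1 vs census 82.7
# Muslim 16.9 vs census 16.9
```

Under that assumption the rescaled shares sit within about half a point of the census figures.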
2. How Does Religion Vary by Region?
Key Insight: Muslim concentration higher in Nepal border and Kishanganj-Araria belt.
Interior districts show closer alignment to statewide Hindu proportion.
3. What Is the District Profile?
Key Insight: Hindu share ranges 40-85% by district; Muslim share inversely related.
Kishanganj, Araria, Katihar show highest Muslim shares (>30%).
4. Does Religion Relate to Intensity?
Key Insight: Weak relationship between Muslim share and deletion intensity.
High-Muslim and high-Hindu areas both appear across intensity spectrum.
5. Is Religion Consistent Across Areas?
Key Insight: Religious composition stable across districts — no anomalous patterns.
Border districts show expected variation based on geographic proximity.
6. How Do Alliances Compare on Religion?
Key Insight: NDA areas: 78.8% Hindu; INDIA areas: 70.3% Hindu.
INDIA areas show slightly higher Muslim representation on average.
81.2
NDA Avg/Booth
93.0
INDIA Avg/Booth
100
NDA Constituencies
31
INDIA Constituencies
Electoral Patterns
Observable patterns exist between deletion intensity and 2024 outcomes. However, correlation does not equal causation — many factors are at play.
1. How Did Alliances Shift?
Key Insight: 179 constituencies (74%) changed alliance between 2020 and 2024.
Most shifts were to independent candidates, not between major alliances.
2. How Do NDA and INDIA Trends Compare?
Key Insight: NDA: 485→81 | INDIA: 557→93 — gap narrowed from 72 to 12.
Both alliances show parallel decline — structural factors dominate over partisan effects.
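The "parallel decline" can be checked from the four per-booth averages quoted above. This is a small sketch using only those figures; the names are illustrative.

```python
# (2020 avg per booth, 2024 avg per booth), as quoted above.
ALLIANCE_PER_BOOTH = {"NDA": (485, 81), "INDIA": (557, 93)}


def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction from before to after."""
    return (before - after) / before * 100.0


for name, (before, after) in ALLIANCE_PER_BOOTH.items():
    print(name, round(reduction_pct(before, after), 1))
# NDA 83.3
# INDIA 83.3

gap_2020 = ALLIANCE_PER_BOOTH["INDIA"][0] - ALLIANCE_PER_BOOTH["NDA"][0]
gap_2024 = ALLIANCE_PER_BOOTH["INDIA"][1] - ALLIANCE_PER_BOOTH["NDA"][1]
print(gap_2020, gap_2024)  # 72 12
```

Both alliances fall by the same proportion, so the absolute gap shrinks while the ratio between them is essentially unchanged.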
3. How Do Individual Parties Compare?
Key Insight: BJP: 89.5 | JDU: 85.5 | RJD: 78.7 avg deletions per booth.
Party-level patterns show more variation than alliance-level aggregates.
4. How Does Intensity Vary by Alliance?
Key Insight: NDA areas cluster 60-100 deletions per booth; INDIA areas spread 70-120.
Observable intensity difference exists but with significant overlap.
5. How Do Alliances Compare Across Metrics?
Key Insight: NDA and INDIA show distinct profiles across intensity, demographics, geography.
Differences may reflect underlying constituency characteristics.
6. How Do Reserved Seats Compare?
Key Insight: General: 84.0 | SC-reserved: ~90 avg deletions per booth.
SC-reserved areas show slightly higher average intensity than General areas.
7. What Percentage Changed Allegiance?
Key Insight: Flip rate: 73.7%; a majority of areas changed alliance from 2020.
Changed areas don't show systematically different deletion patterns.
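The flip rate is the number of constituencies whose alliance label differs between the two elections, divided by the total. A minimal sketch follows; `flip_rate` and the four-seat sample are hypothetical, and only the 179-of-243 figure comes from the source.

```python
def flip_rate(alliance_before, alliance_after):
    """Return (number of flips, flip rate in percent) for paired labels."""
    if len(alliance_before) != len(alliance_after):
        raise ValueError("both lists must cover the same constituencies")
    if not alliance_before:
        raise ValueError("no constituencies given")
    flips = sum(1 for b, a in zip(alliance_before, alliance_after) if b != a)
    return flips, flips / len(alliance_before) * 100.0


# Toy example with four seats (not real data).
before = ["NDA", "INDIA", "NDA", "INDIA"]
after = ["NDA", "NDA", "IND", "INDIA"]
print(flip_rate(before, after))  # (2, 50.0)

# Reported figures: 179 flips out of 243 constituencies.
print(round(179 / 243 * 100, 1))  # 73.7
```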
Geographic Patterns
Border districts (Nepal: 8, Jharkhand: 9, UP: 3, WB: 2) show distinct patterns vs 16 interior districts.
1. How Does Intensity Vary by Border?
Key Insight: Nepal border region shows highest concentration of high-intensity areas.
Interior districts more evenly distributed across intensity levels.
2. How Do Regions Compare Overall?
Key Insight: Bubble size shows total deletions; position shows constituency count vs intensity.
Nepal and Jharkhand border regions dominate in both volume and intensity.
3. What Is Each Region's Profile?
Key Insight: 5 distinct regional clusters with different intensity profiles.
Interior region shows most balanced multi-dimensional profile.
4. Border vs Interior: Which Is Higher?
Key Insight: Border average is higher than Interior average by about 10-15 deletions per booth.
Difference is significant but not extreme — overlap exists.
5. What Share Does Each Region Represent?
Key Insight: Interior (16 districts) contains ~42% of all constituencies despite lower intensity.
Border regions contribute disproportionately to total deletions per constituency.
6. Did All Regions Decline Similarly?
Key Insight: All 5 regions show parallel decline trajectory (2020→2024).
No region bucked the statewide 83% reduction trend.
Economic Analysis
Explores relationships between GDP, urbanization, literacy, booth size and deletion patterns. Bihar is 89% rural (Census 2011).
1. Does GDP Relate to Deletions?
Key Insight: Weak inverse relationship: higher GDP districts tend toward lower deletion intensity.
Patna (highest GDP) shows among lowest deletion rates.
2. Does Urbanization Matter?
Key Insight: Urban districts (>15% urban) show ~20% lower average intensity than rural.
Only 8 districts exceed 15% urban population.
3. Urban vs Rural: What's the Pattern?
Key Insight: High Urban: 77.6 | Medium: 81.6 | Low Urban: 111.2 avg deletions per booth.
Rural-dominated districts (3 of 38) show higher intensity.
4. Does Booth Size Relate to Deletions?
Key Insight: Larger booths (more voters) tend to have proportionally more deletions.
Relationship is moderate — booth size explains some but not all variation.
5. Does Literacy Relate to Deletions?
Key Insight: Higher literacy districts show marginally lower deletion intensity.
Relationship is weak — literacy alone doesn't explain deletion patterns.
6. How Do GDP Groups Compare?
Key Insight: Poorest GDP group: highest avg deletions; Richest group: lowest.
Gradient is observable but not steep — economic factors are one of many.
7. What's the Multi-Factor Picture?
Key Insight: Bubble radius shows literacy rate; position shows GDP vs deletion intensity.
High-literacy, high-GDP districts cluster in low-intensity zone.

Research Questions & Evidence-Based Responses

Q1: How significant was the reduction in voter deletions between 2020 and 2024?
Answer: Voter deletions declined by 83.3%, from ~39.2 million in the 2020 State elections to ~6.56 million in the 2024 National elections. Average deletions per booth fell from ~504 to ~84. This represents one of the largest election-to-election reductions in ECI Form 20 data for Bihar.
Evidence: Total 2020 = 39,248,769; Total 2024 = 6,559,626; Reduction = 32,689,143 (83.3%)
Q2: Do deletion patterns differ between NDA and INDIA alliance areas?
Answer: Yes, observable differences exist. NDA-held areas (2024) average ~81.2 deletions per booth compared to INDIA-held areas at ~93.0 per booth. However, both alliances show parallel reduction from 2020, suggesting structural factors (election type, timing) matter more than partisan effects. Correlation does not equal causation — constituency characteristics may explain differences.
Evidence: NDA areas = 100; INDIA areas = 31; Gap reduced from 2020 levels
Q3: Is there a gender disparity in voter deletions?
Answer: Women constitute 55.5% of deletions compared to their ~50.7% population share (Census 2011). This ~5 percentage point over-representation is consistent across all districts and alliances. Classification methodology uses surname suffixes (DEVI, KUMARI = Female), which may have ~3-5% error margin.
Evidence: Female deletions = 3,585,519; Male deletions = 2,875,762; Ratio consistent across 38 districts
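The gender shares follow directly from the two counts in the evidence line. The sketch below also compares the classified total with the 2024 total reported elsewhere in this document; reading the gap as records that were not assigned a gender is an assumption.

```python
FEMALE = 3_585_519
MALE = 2_875_762
TOTAL_2024 = 6_559_626  # total 2024 deletions reported on the dashboard

classified = FEMALE + MALE
print(classified)                            # 6461281
print(round(FEMALE / classified * 100, 1))   # 55.5
print(round(MALE / classified * 100, 1))     # 44.5

# Records in the 2024 total that are not in either gender count
# (assumed here to be unclassified).
print(TOTAL_2024 - classified)               # 98345
```

The 55.5% figure therefore uses the classified records as its base, not the full 2024 total.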
Q4: What is the caste composition of deleted voters?
Answer: EBC (Extremely Backward Classes) leads at 60.9%, followed by OBC (18.3%), SC (14.8%). Combined backward classes represent ~94.0% of deletions — roughly proportional to Bihar Caste Survey 2022 population estimates. Classification uses surname-to-caste mapping with ~8-12% ambiguity rate.
Evidence: Classification based on Bihar Caste Survey 2022 surname database with 1,200+ surname mappings
Q5: Do border districts show different deletion patterns than interior districts?
Answer: Yes. Border districts (Nepal: 8, Jharkhand: 9, UP: 3, WB: 2) show 10-15% higher average intensity than interior districts. Nepal border region shows highest concentration. However, all regions show parallel 2020→2024 decline (~83%), suggesting the reduction is system-wide, not geographically selective.
Evidence: 22 border districts vs 16 interior districts; Intensity gap consistent but not extreme
Q6: Is there a relationship between economic development and voter deletions?
Answer: Weak inverse relationship observed. Districts with higher GDP per capita show marginally lower deletion intensity. Similarly, urban districts (>15% urban) average ~20% lower intensity than rural-dominated districts. However, relationships are modest, and economic factors alone don't explain the variance.
Evidence: Patna (highest GDP, 47% urban) = lowest intensity; Madhepura (lowest GDP, 9% urban) = among highest
Q7: How does religious composition relate to deletion patterns?
Answer: Hindu voters: ~76.0%; Muslim voters: ~15.0%. These proportions are close to Census 2011 composition (82.7% Hindu, 16.9% Muslim). Muslim-majority areas (Kishanganj, Araria belt) don't show systematically different deletion rates — religious composition shows weak relationship with intensity.
Evidence: Classification based on surname patterns with ~9% Unknown/ambiguous category
Q8: What explains the 83% reduction between 2020 and 2024?
Answer: Multiple structural factors likely contribute: (1) Election type difference — 2020 was State elections, 2024 was National, with different roll revision timelines; (2) Post-COVID normalization — 2020 occurred during pandemic disruptions; (3) ECI procedural changes — possible changes in verification protocols. The uniform reduction across all geographies, demographics, and alliances suggests system-level factors over local manipulation.
Evidence: All 243 constituencies show decline; all 38 districts show decline; all categories show proportional decline
Q9: Are there any constituencies with anomalous patterns?
Answer: While intensity varies significantly (range: ~45 to ~170 deletions per booth in 2024), no constituency shows an increase from 2020. Top intensity areas (Motihari, Rajnagar, Phulparas) had reduction rates (82-85%) consistent with state average. Bottom intensity areas (Narkatiaganj, Chanpatia) also show similar proportional decline. No clear anomalies detected.
Evidence: CV (coefficient of variation) of reduction rates = ~4% — remarkably uniform
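For reference, the coefficient of variation is the standard deviation divided by the mean, expressed as a percentage. The sketch below uses the population standard deviation; the source does not say whether it used the sample or the population form, and the input values are illustrative, not the actual constituency reduction rates.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return pstdev(values) / m * 100.0


# Illustrative values only: mean 5, population std dev 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(coefficient_of_variation(sample), 1))  # 40.0
```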
Q10: What are the key limitations of this analysis?
Answer: Key limitations include: (1) Surname classification error: gender, caste, and religion are inferred from surnames with 5-15% ambiguity; (2) Correlation is not causation: all relationships are observational; (3) Election type confound: structural differences between 2020 (State) and 2024 (National) limit direct comparison; (4) Missing variables: migration, death, and duplicate-removal rates are not isolated; (5) No attribution of intent: this analysis cannot attribute deletions to any intentional action.
Disclaimer: Academic research only. Findings are descriptive, not prescriptive or accusatory.

Additional Research Questions

Q11: Do deletion patterns differ by individual political party (not just alliance)?
Answer: Yes, party-level analysis reveals more nuance than alliance-level. Among 2020-held constituencies: BJP (74 seats): 89.5 avg/booth | JDU (43 seats): 85.5 | RJD (75 seats): 78.7. Variation within alliances suggests party-specific constituency characteristics matter.
Evidence: BJP (74 seats), RJD (75 seats), JDU (43 seats), INC (19 seats) in 2020 held constituencies
Q12: Do SC/ST reserved constituencies show different patterns than General seats?
Answer: Modest differences observed. General constituencies: ~84.0 avg deletions/booth | SC-reserved: ~90 | ST-reserved: limited sample (2-3 seats). SC-reserved seats show slightly higher intensity (+5-10%), possibly reflecting demographic concentration patterns. ST sample is too small for reliable conclusions.
Evidence: 243 General, 0 SC-reserved, 0 ST-reserved constituencies
Q13: Does booth size (voter count) correlate with deletion intensity?
Answer: Moderate positive relationship exists. Constituencies with more booths tend to have proportionally more total deletions, but per-booth intensity is more variable. This suggests that booth count (reflecting population) explains some variation, but local factors (booth-level administration, demographics) also matter. Larger constituencies may face different verification challenges.
Evidence: Average booths per constituency = 320; Range: ~250 to ~400 booths
Q14: Which constituencies showed the best and worst improvement?
Answer: Top Improvers: Sitamarhi (83.6% reduction), Gobindpur, Rupauli. Least Improved: Digha (82.7%), Cheria Bariarpur. Even "least improved" showed substantial decline. The uniformity (CV ~4%) suggests system-wide rather than targeted changes.
Evidence: Range of reduction: 82.7% to 83.6% — tight distribution
Q15: How concentrated are deletions by surname? Are a few surnames dominating?
Answer: High concentration. Top 5 surnames account for 75.1% of all deletions. DEVI alone represents 47.5% (strongly female indicator). This suggests: (1) Surname distribution reflects Bihar's actual naming patterns; (2) Female over-representation partly explained by DEVI suffix; (3) Caste classification reliability depends on surname diversity.
Evidence: DEVI (47.5%), SAHA (10.7%), MAHATO (7.0%), SHARMA (4.9%), CHAUDHARI (4.9%)
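A top-k concentration figure of this kind can be computed with a frequency count. The sketch below first sums the five rounded shares listed above, then shows a hypothetical `top_k_share` helper on a toy list of surnames (not real data).

```python
from collections import Counter

# Rounded shares as listed in the evidence line.
REPORTED = {"DEVI": 47.5, "SAHA": 10.7, "MAHATO": 7.0,
            "SHARMA": 4.9, "CHAUDHARI": 4.9}
print(round(sum(REPORTED.values()), 1))  # 75.0


def top_k_share(surnames, k):
    """Percent of records covered by the k most frequent surnames."""
    if not surnames:
        raise ValueError("surnames must not be empty")
    counts = Counter(surnames)
    top = sum(count for _, count in counts.most_common(k))
    return top / len(surnames) * 100.0


sample = ["DEVI", "DEVI", "SAHA", "RAM"]
print(top_k_share(sample, 1))  # 50.0
```

The rounded shares sum to 75.0, which is within rounding of the reported 75.1%.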

Methodology & Data Sources

Data Sources

  • ECI Form 20: Official voter deletion records — 2020 State & 2024 National elections
  • Bihar Caste Survey 2022: Surname-to-caste mapping (1,200+ surnames)
  • Census 2011: Demographics (population, literacy, sex ratio, urban/rural)
  • GDDP 2021-22: District-level economic indicators (per capita income)

Classification Methods

  • Gender: DEVI/KUMARI suffix = Female; ECI field validation; ~3-5% error margin
  • Caste: Surname mapping to EBC/OBC/SC/ST/FC; ~8-12% ambiguous surnames
  • Religion: Surname patterns = Hindu/Muslim/Unknown; ~9% Unknown category
  • Border: Nepal (8 districts), Jharkhand (9), UP (3), WB (2), Interior (16)
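
The gender rule in the list above can be sketched as a simple suffix check. This is an illustration under stated assumptions, not the project's classifier: the source only specifies that DEVI and KUMARI indicate Female, so every other name is returned as "Unresolved" here, and the ECI field validation step is not modeled. The function and constant names are hypothetical.

```python
FEMALE_MARKERS = frozenset({"DEVI", "KUMARI"})  # markers named in the source


def classify_gender(full_name: str) -> str:
    """Return "Female" if the last name token is a known female marker."""
    tokens = full_name.upper().split()
    if tokens and tokens[-1] in FEMALE_MARKERS:
        return "Female"
    # The source does not describe how other names are resolved.
    return "Unresolved"


print(classify_gender("Sita Devi"))    # Female
print(classify_gender("Ram Kumar"))    # Unresolved
print(classify_gender("Devi Prasad"))  # Unresolved (marker is not the suffix)
```

Matching only the final token keeps names such as "Devi Prasad" out of the Female group, which is one reason a suffix rule carries the error margin noted above.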

Limitations & Caveats

  • Surname classification has inherent ambiguity — not definitive identity markers
  • Correlation is not Causation — no causal claims made or implied
  • 2020 (State elections) vs 2024 (National) — different election types, timelines
  • Deletions include deaths, migration, duplicates — cannot isolate individual reasons
  • All findings are observational and academic — not accusations or prescriptions

Visualization Types (61 total charts)

  • Sankey Diagrams (9): Flow transitions (alliance, quartile, region)
  • Line/Area Charts (18): Trends, comparisons, distributions
  • Scatter Plots (12): Relationships, constituency-level distributions
  • Radar Charts (8): Multi-dimensional profiles
  • Treemaps (4): Hierarchical shares
  • Gauges (4): Key percentage metrics
  • Polar Area/Doughnut (6): Distribution, composition