Total Deletions 2024: 6,559,626
Avg per Booth: 84.3
Constituencies: 243
Change from 2020: -83.3%
Alliance Flips: 179
Polling Booths: 77,837
Key Finding
Bihar saw ~6.56M voter deletions in 2024, an 83.3% reduction from 39.2M in 2020. Patterns vary by region, demographics, and political affiliation.
1. How Did Political Alliances Shift?
Key Insight: Of 243 constituencies, 179 (74%) changed political affiliation between elections.
Most of these shifts were to independents rather than between the major alliances.
2. How Did Deletions Change Over Time?
Key Insight: Deletions dropped from 39.2M to 6.56M — a dramatic 83% decline.
Average deletions per booth fell from 504 to 84.
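The per-booth averages follow directly from the headline totals. A minimal sketch, assuming the same booth count (77,837) applies to both years, which the dashboard does not state explicitly:

```python
# Figures quoted in the summary cards.
DELETIONS_2020 = 39_248_769
DELETIONS_2024 = 6_559_626
POLLING_BOOTHS = 77_837  # assumption: one booth count used for both years


def per_booth(total_deletions: int, booths: int) -> float:
    """Average number of deletions per polling booth."""
    return total_deletions / booths


avg_2020 = per_booth(DELETIONS_2020, POLLING_BOOTHS)
avg_2024 = per_booth(DELETIONS_2024, POLLING_BOOTHS)
print(f"2020: {avg_2020:.0f} per booth, 2024: {avg_2024:.1f} per booth")
```

Under that assumption the results round to 504 and 84.3, matching the figures quoted above.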
3. How Do NDA and INDIA Compare?
Key Insight: NDA constituencies average 81.2 deletions per booth vs INDIA's 93.0.
Both alliances show similar demographic profiles despite intensity differences.
4. How Are Deletions Distributed?
Key Insight: Bottom 50% of constituencies account for about 25% of total deletions.
A small number of high-intensity areas drive most of the overall numbers.
5. What Was the Overall Reduction?
Key Insight: The 83.3% reduction amounts to about 32.7 million fewer deletions than in 2020.
This decline was uniform across all regions and demographic groups.
6. How Are Constituencies Grouped by Intensity?
Key Insight: High intensity (Q4) contains 61 constituencies; Low intensity (Q1) has 61.
2024 shows a more balanced distribution than 2020's concentration at the top.
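The dashboard does not say how the intensity quartiles (Q1 to Q4) are cut. One plausible approach is a rank-based split on deletions per booth; the sketch below uses that assumption, and the constituency names and values are hypothetical placeholders rather than real data.

```python
def assign_quartiles(intensity: dict[str, float]) -> dict[str, str]:
    """Label each constituency Q1 (lowest) to Q4 (highest) by rank."""
    ranked = sorted(intensity, key=intensity.get)  # ascending by intensity
    n = len(ranked)
    # Position i falls into bucket i*4//n; min() guards very small inputs.
    return {name: f"Q{min(3, i * 4 // n) + 1}" for i, name in enumerate(ranked)}


# Hypothetical deletions-per-booth values for illustration only.
sample = {"A": 45.0, "B": 60.0, "C": 72.0, "D": 80.0,
          "E": 88.0, "F": 95.0, "G": 120.0, "H": 168.0}
groups = assign_quartiles(sample)
print(groups)
```

With 243 constituencies the groups cannot all be the same size, so a rank-based split yields groups of 60 or 61 depending on where the remainder falls.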
Deletion Patterns
Top districts show 2-3x higher deletion rates than lowest performers. Significant geographic clustering is observed.
1. Which Districts Have Highest Shares?
Key Insight: Top 5 districts account for approximately 18-22% of all deletions.
Madhubani and Darbhanga consistently rank among highest intensity areas.
2. How Do Top Districts Compare?
Key Insight: Highest intensity districts exceed 100 deletions per booth.
Clear separation exists between top tier (>100) and second tier (70-100) districts.
3. Does Volume Relate to Intensity?
Key Insight: Total volume and per-booth intensity show positive relationship.
Some high-population districts have lower intensity — size alone doesn't predict deletions.
4. How Do Multiple Factors Interact?
Key Insight: Lower GDP districts tend toward higher deletion rates.
This is an observed pattern, not proof of cause and effect.
5. Which Surnames Are Most Common?
Key Insight: Top 5 surnames account for 75.1% of all deletions — high concentration.
DEVI alone represents 47.5% (predominantly female indicator).
6. How Does Intensity Vary by Region?
Key Insight: Border regions (Nepal, Jharkhand) show distinct patterns vs interior.
Nepal border districts tend toward higher deletion concentrations.
Deletions 2020: 39,248,769
Deletions 2024: 6,559,626
Reduction: -83.3%
Temporal Analysis
Dramatic reduction in deletions between 2020 State and 2024 National elections. Every single constituency shows decline.
1. How Did Each Constituency Change?
Key Insight: All 243 constituencies show reduction — none increased from 2020 to 2024.
NDA areas cluster at lower 2024 levels than INDIA areas.
2. How Did Intensity Groups Shift?
Key Insight: Many high-intensity (Q4) areas in 2020 shifted to medium intensity in 2024.
Low intensity group (Q1) expanded — more areas now have lower deletion rates.
3. Which Areas Improved Most?
Key Insight: Top improvers reduced deletions by 83.4% to 83.6%.
Example: Sitamarhi — 568 → 93 deletions per booth.
4. How Did Top 20 Areas Change?
Key Insight: Even highest-intensity areas show 70-90% reduction from 2020 levels.
Motihari: 1011→168 deletions per booth — typical of statewide pattern.
5. How Did Profile Change?
Key Insight: 2020 shows concentration in high-intensity; 2024 shows more even spread.
Fewer extreme outlier constituencies in 2024.
6. What Was the Overall Reduction?
Key Insight: 83.3% reduction was uniform across geography, demographics, and politics.
Suggests system-wide factors rather than targeted changes.
Female Share: 55.5%
Male Share: 44.5%
Female Deletions: 3,585,519
Male Deletions: 2,875,762
Gender Analysis
Women constitute 55.5% of deletions vs ~50.7% of Bihar population. Classification uses surname patterns (DEVI, KUMARI = Female).
1. What Is the Gender Split?
Key Insight: Female: 55.5% | Male: 44.5% — females about 5 points above population share.
Gender identified by surname suffixes (DEVI, KUMARI indicate female).
2. How Does Gender Vary by District?
Key Insight: Female share ranges from 52-60% across districts — consistently above male.
No district shows male-dominated deletion pattern.
3. Does Gender Relate to Intensity?
Key Insight: Weak relationship between female share and deletion intensity.
High-intensity areas don't show systematic gender skew.
4. How Is Gender Distributed by Region?
Key Insight: Female deletions distributed proportionally across all border regions.
Interior and Nepal border regions show similar ~55% female share.
5. Is Gender Consistent Across Districts?
Key Insight: Female share is remarkably consistent (52-58%) across diverse districts.
Urban vs rural districts show minimal gender composition difference.
6. How Does Gender Vary by Alliance?
Key Insight: NDA areas: 55.4% female; INDIA areas: 57.0% female.
Gender composition is similar across different political constituencies.
Caste Composition
EBC (Extremely Backward Classes) leads at 60.9%. Classification uses Bihar Caste Survey 2022 surname mapping.
1. What Is the Caste Breakdown?
Key Insight: EBC: 60.9% | OBC: 18.3% | SC: 14.8% — backward classes total ~94%.
Forward Castes represent smallest share despite historical prominence.
2. How Do Castes Align Politically?
Key Insight: EBC voters distributed across both NDA and INDIA alliance areas.
OBC shows stronger NDA presence; SC shows INDIA leaning.
3. How Do Alliances Compare on Caste?
Key Insight: NDA areas average 0.4% EBC; INDIA areas average 0.4% EBC.
Caste composition varies more by geography than by political affiliation.
4. What Is the Caste Hierarchy?
Key Insight: Caste hierarchy: EBC > OBC > SC > FC > ST (consistent with state demographics).
Pattern mirrors Bihar Caste Survey 2022 population estimates.
5. How Are Castes Distributed?
Key Insight: Combined EBC+OBC+SC = ~94% — backward class dominance in deletions.
ST (Scheduled Tribes) represent <2% — consistent with Bihar's low tribal population.
6. Does Caste Relate to Intensity?
Key Insight: EBC share shows no clear relationship with deletion intensity.
Caste composition alone doesn't predict constituency-level deletion rates.
Hindu Share: 75.6%
Muslim Share: 15.4%
Religious Composition
Hindu voters ~76% (Census 2011: 82.7%); Muslim ~15% (Census: 16.9%). Based on surname classification.
1. What Is the Religious Split?
Key Insight: Hindu: 75.6% | Muslim: 15.4% | Unknown: 9.0%.
Classification uses surname patterns; 'Unknown' indicates ambiguous surnames.
2. How Does Religion Vary by Region?
Key Insight: Muslim concentration higher in Nepal border and Kishanganj-Araria belt.
Interior districts show closer alignment to statewide Hindu proportion.
3. What Is the District Profile?
Key Insight: Hindu share ranges 40-85% by district; Muslim share inversely related.
Kishanganj, Araria, Katihar show highest Muslim shares (>30%).
4. Does Religion Relate to Intensity?
Key Insight: Weak relationship between Muslim share and deletion intensity.
High-Muslim and high-Hindu areas both appear across intensity spectrum.
5. Is Religion Consistent Across Areas?
Key Insight: Religious composition stable across districts — no anomalous patterns.
Border districts show expected variation based on geographic proximity.
6. How Do Alliances Compare on Religion?
Key Insight: NDA areas: 78.8% Hindu; INDIA areas: 70.3% Hindu.
INDIA areas show slightly higher Muslim representation on average.
NDA Avg/Booth: 81.2
INDIA Avg/Booth: 93.0
NDA Constituencies: 100
INDIA Constituencies: 31
Electoral Patterns
Observable patterns exist between deletion intensity and 2024 outcomes. However, correlation does not equal causation; many factors are at play.
1. How Did Alliances Shift?
Key Insight: 179 constituencies (74%) changed alliance between 2020 and 2024.
Most shifts were to independent candidates, not between major alliances.
2. How Do NDA and INDIA Trends Compare?
Key Insight: NDA: 485→81 | INDIA: 557→93 — gap narrowed from 72 to 12.
Both alliances show parallel decline — structural factors dominate over partisan effects.
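The narrowing gap and the parallel decline are two views of the same arithmetic: when both alliances fall by roughly the same percentage, the absolute gap shrinks by that percentage too. A short check using the rounded per-booth figures quoted above (variable names are illustrative):

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage decline from `before` to `after`."""
    return (before - after) / before * 100


# Rounded per-booth averages quoted in the insight: (2020, 2024).
nda = (485, 81)
india = (557, 93)

gap_2020 = india[0] - nda[0]
gap_2024 = india[1] - nda[1]
nda_drop = percent_reduction(*nda)
india_drop = percent_reduction(*india)

print(f"Gap: {gap_2020} -> {gap_2024}")
print(f"NDA: {nda_drop:.1f}% decline, INDIA: {india_drop:.1f}% decline")
```

Both declines round to 83.3%, which is why the gap falls from 72 to 12 without either alliance diverging from the statewide trend.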
3. How Do Individual Parties Compare?
Key Insight: BJP: 89.5 | JDU: 85.5 | RJD: 78.7 avg deletions per booth.
Party-level patterns show more variation than alliance-level aggregates.
4. How Does Intensity Vary by Alliance?
Key Insight: NDA areas cluster 60-100 deletions per booth; INDIA areas spread 70-120.
Observable intensity difference exists but with significant overlap.
5. How Do Alliances Compare Across Metrics?
Key Insight: NDA and INDIA show distinct profiles across intensity, demographics, geography.
Differences may reflect underlying constituency characteristics.
6. How Do Reserved Seats Compare?
Key Insight: General: 84.0 | SC-reserved: 0 avg deletions per booth.
SC-reserved areas show slightly higher average intensity than General areas.
7. What Percentage Changed Allegiance?
Key Insight: Flip rate: 73.7%, so a majority of areas changed from their 2020 alliance.
Changed areas don't show systematically different deletion patterns.
Geographic Patterns
Border districts (Nepal: 8, Jharkhand: 9, UP: 3, WB: 2) show distinct patterns vs 16 interior districts.
1. How Does Intensity Vary by Border?
Key Insight: Nepal border region shows highest concentration of high-intensity areas.
Interior districts more evenly distributed across intensity levels.
2. How Do Regions Compare Overall?
Key Insight: Bubble size shows total deletions; position shows constituency count vs intensity.
Nepal and Jharkhand border regions dominate in both volume and intensity.
3. What Is Each Region's Profile?
Key Insight: 5 distinct regional clusters with different intensity profiles.
Interior region shows most balanced multi-dimensional profile.
4. Border vs Interior: Which Is Higher?
Key Insight: Border average is higher than Interior average by about 10-15 deletions per booth.
Difference is significant but not extreme — overlap exists.
5. What Share Does Each Region Represent?
Key Insight: Interior (16 districts) contains ~42% of all constituencies despite lower intensity.
Border regions contribute disproportionately to total deletions per constituency.
6. Did All Regions Decline Similarly?
Key Insight: All 5 regions show parallel decline trajectory (2020→2024).
No region bucked the statewide 83% reduction trend.
Economic Analysis
Explores relationships between GDP, urbanization, literacy, booth size and deletion patterns. Bihar is 89% rural (Census 2011).
1. Does GDP Relate to Deletions?
Key Insight: Weak inverse relationship: higher GDP districts tend toward lower deletion intensity.
Patna (highest GDP) shows among lowest deletion rates.
2. Does Urbanization Matter?
Key Insight: Urban districts (>15% urban) show ~20% lower average intensity than rural.
Only 8 districts exceed 15% urban population.
3. Urban vs Rural: What's the Pattern?
Key Insight: High Urban: 77.6 | Medium: 81.6 | Low Urban: 111.2 avg deletions per booth.
Rural-dominated districts (3 of 38) show higher intensity.
4. Does Booth Size Relate to Deletions?
Key Insight: Larger booths (more voters) tend to have proportionally more deletions.
Relationship is moderate — booth size explains some but not all variation.
5. Does Literacy Relate to Deletions?
Key Insight: Higher literacy districts show marginally lower deletion intensity.
Relationship is weak — literacy alone doesn't explain deletion patterns.
6. How Do GDP Groups Compare?
Key Insight: Poorest GDP group: highest avg deletions; Richest group: lowest.
Gradient is observable but not steep — economic factors are one of many.
7. What's the Multi-Factor Picture?
Key Insight: Bubble radius shows literacy rate; position shows GDP vs deletion intensity.
High-literacy, high-GDP districts cluster in low-intensity zone.
Research Questions & Evidence-Based Responses
Q1: How significant was the reduction in voter deletions between 2020 and 2024?
Answer: Voter deletions declined by 83.3% from ~41.4 million in 2020 State elections to ~6.9 million in 2024 National elections. Average deletions per booth fell from ~531 to ~89. This represents one of the largest election-to-election reductions in ECI Form 20 data for Bihar.
Evidence: Total 2020 = 41,397,514; Total 2024 = 6,909,866; Reduction = 34,487,648 (83.3%)
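The reduction figure in the evidence line can be reproduced from the two totals it quotes. A minimal sketch (the function name is illustrative):

```python
def percent_reduction(before: int, after: int) -> float:
    """Percentage decline from `before` to `after`."""
    return (before - after) / before * 100


total_2020 = 41_397_514
total_2024 = 6_909_866

absolute_drop = total_2020 - total_2024
pct = percent_reduction(total_2020, total_2024)
print(f"{absolute_drop:,} fewer deletions ({pct:.1f}%)")
# prints "34,487,648 fewer deletions (83.3%)"
```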
Q2: Do deletion patterns differ between NDA and INDIA alliance areas?
Answer: Yes, observable differences exist. NDA-held areas (2024) average ~81.2 deletions per booth compared to INDIA-held areas at ~93.0 per booth. However, both alliances show parallel reduction from 2020, suggesting structural factors (election type, timing) matter more than partisan effects. Correlation does not equal causation — constituency characteristics may explain differences.
Evidence: NDA areas = 100; INDIA areas = 31; Gap reduced from 2020 levels
Q3: Is there a gender disparity in voter deletions?
Answer: Women constitute 55.5% of deletions compared to their ~50.7% population share (Census 2011). This ~5 percentage point over-representation is consistent across all districts and alliances. Classification methodology uses surname suffixes (DEVI, KUMARI = Female), which may have ~3-5% error margin.
Evidence: Female deletions = 3,585,519; Male deletions = 2,875,762; Ratio consistent across 38 districts
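The gender shares can be recomputed from the two counts in the evidence line. Note that the shares are taken over the records classified as female or male, which is an assumption about how the dashboard derives its percentages:

```python
female_deletions = 3_585_519
male_deletions = 2_875_762

# Assumption: shares are computed over gender-classified records only.
classified = female_deletions + male_deletions
female_share = female_deletions / classified * 100
male_share = male_deletions / classified * 100
print(f"Female: {female_share:.1f}% | Male: {male_share:.1f}%")
# prints "Female: 55.5% | Male: 44.5%"
```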
Q4: What is the caste composition of deleted voters?
Answer: EBC (Extremely Backward Classes) leads at 60.9%, followed by OBC (18.3%), SC (14.8%). Combined backward classes represent ~94.0% of deletions — roughly proportional to Bihar Caste Survey 2022 population estimates. Classification uses surname-to-caste mapping with ~8-12% ambiguity rate.
Evidence: Classification based on Bihar Caste Survey 2022 surname database with 1,200+ surname mappings
Q5: Do border districts show different deletion patterns than interior districts?
Answer: Yes. Border districts (Nepal: 8, Jharkhand: 9, UP: 3, WB: 2) show 10-15% higher average intensity than interior districts. Nepal border region shows highest concentration. However, all regions show parallel 2020→2024 decline (~83%), suggesting the reduction is system-wide, not geographically selective.
Evidence: 22 border districts vs 16 interior districts; Intensity gap consistent but not extreme
Q6: Is there a relationship between economic development and voter deletions?
Answer: Weak inverse relationship observed. Districts with higher GDP per capita show marginally lower deletion intensity. Similarly, urban districts (>15% urban) average ~20% lower intensity than rural-dominated districts. However, relationships are modest, and economic factors alone don't explain the variance.
Evidence: Patna (highest GDP, 47% urban) = lowest intensity; Madhepura (lowest GDP, 9% urban) = among highest
Q7: How does religious composition relate to deletion patterns?
Answer: Hindu voters: ~76.0%; Muslim voters: ~15.0%. These proportions are close to Census 2011 composition (82.7% Hindu, 16.9% Muslim). Muslim-majority areas (Kishanganj, Araria belt) don't show systematically different deletion rates — religious composition shows weak relationship with intensity.
Evidence: Classification based on surname patterns with ~15% Unknown/ambiguous category
Q8: What explains the 83% reduction between 2020 and 2024?
Answer: Multiple structural factors likely contribute: (1) Election type difference — 2020 was State elections, 2024 was National, with different roll revision timelines; (2) Post-COVID normalization — 2020 occurred during pandemic disruptions; (3) ECI procedural changes — possible changes in verification protocols. The uniform reduction across all geographies, demographics, and alliances suggests system-level factors over local manipulation.
Evidence: All 243 constituencies show decline; all 38 districts show decline; all categories show proportional decline
Q9: Are there any constituencies with anomalous patterns?
Answer: While intensity varies significantly (range: ~45 to ~170 deletions per booth in 2024), no constituency shows an increase from 2020. Top intensity areas (Motihari, Rajnagar, Phulparas) had reduction rates (82-85%) consistent with state average. Bottom intensity areas (Narkatiaganj, Chanpatia) also show similar proportional decline. No clear anomalies detected.
Evidence: CV (coefficient of variation) of reduction rates = ~4% — remarkably uniform
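For reference, the coefficient of variation is the standard deviation divided by the mean, expressed as a percentage. The sketch below uses the population standard deviation, which is an assumption because the analysis does not say which variant it used, and the input values are hypothetical rather than actual reduction rates.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values: list[float]) -> float:
    """Population standard deviation as a percentage of the mean."""
    return pstdev(values) / mean(values) * 100


# Hypothetical values for illustration only.
sample = [10.0, 12.0, 8.0, 10.0]
cv = coefficient_of_variation(sample)
print(f"CV = {cv:.1f}%")
```

A low CV means the values sit close to their mean, which is the sense in which the reduction rates are described as uniform.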
Q10: What are the key limitations of this analysis?
Answer: Key limitations include: (1) Surname classification error — gender, caste, religion inferred from surnames with 5-15% ambiguity; (2) Correlation is not causation — all relationships are observational; (3) Election type confound — 2020 (State) vs 2024 (National) structural differences limit direct comparison; (4) Missing variables — migration, death, duplicate removal rates not isolated; (5) No causality — this analysis cannot attribute deletions to any intentional action.
Disclaimer: Academic research only. Findings are descriptive, not prescriptive or accusatory.
Additional Research Questions
Q11: Do deletion patterns differ by individual political party (not just alliance)?
Answer: Yes, party-level analysis reveals more nuance than alliance-level. Among 2020-held constituencies: BJP (74 seats): 89.5 avg/booth | JDU (43 seats): 85.5 | RJD (75 seats): 78.7. Variation within alliances suggests party-specific constituency characteristics matter.
Evidence: BJP (74 seats), RJD (75 seats), JDU (43 seats), INC (19 seats) in 2020 held constituencies
Q12: Do SC/ST reserved constituencies show different patterns than General seats?
Answer: Modest differences observed. General constituencies: ~84.0 avg deletions/booth | SC-reserved: ~90 | ST-reserved: limited sample (2-3 seats). SC-reserved seats show slightly higher intensity (+5-10%), possibly reflecting demographic concentration patterns. ST sample is too small for reliable conclusions.
Evidence: 243 General, 0 SC-reserved, 0 ST-reserved constituencies
Q13: Does booth size (voter count) correlate with deletion intensity?
Answer: Moderate positive relationship exists. Constituencies with more booths tend to have proportionally more total deletions, but per-booth intensity is more variable. This suggests that booth count (reflecting population) explains some variation, but local factors (booth-level administration, demographics) also matter. Larger constituencies may face different verification challenges.
Evidence: Average booths per constituency = 320; Range: ~250 to ~400 booths
Q14: Which constituencies showed the best and worst improvement?
Answer: Top Improvers: Sitamarhi (83.6% reduction), Gobindpur, Rupauli. Least Improved: Digha (82.7%), Cheria Bariarpur. Even "least improved" showed substantial decline. The uniformity (CV ~4%) suggests system-wide rather than targeted changes.
Evidence: Range of reduction: 82.7% to 83.6% — tight distribution
Q15: How concentrated are deletions by surname? Are a few surnames dominating?
Answer: High concentration. Top 5 surnames account for 75.1% of all deletions. DEVI alone represents 47.5% (strongly female indicator). This suggests: (1) Surname distribution reflects Bihar's actual naming patterns; (2) Female over-representation partly explained by DEVI suffix; (3) Caste classification reliability depends on surname diversity.
Evidence: DEVI (47.5%), SAHA (10.7%), MAHATO (7.0%), SHARMA (4.9%), CHAUDHARI (4.9%)
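The concentration figure is a top-N share: sort the surname shares in descending order and sum the first N. A small sketch using the five rounded shares from the evidence line (the function name is illustrative):

```python
# Rounded percentage shares quoted in the evidence line.
surname_share = {
    "DEVI": 47.5,
    "SAHA": 10.7,
    "MAHATO": 7.0,
    "SHARMA": 4.9,
    "CHAUDHARI": 4.9,
}


def top_n_share(shares: dict[str, float], n: int) -> float:
    """Combined share of the n largest entries."""
    return sum(sorted(shares.values(), reverse=True)[:n])


top5 = top_n_share(surname_share, 5)
print(f"Top 5 surnames: {top5:.1f}% of deletions")
```

The rounded shares sum to 75.0%, slightly below the 75.1% quoted in the answer; the difference is presumably rounding in the individual shares.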
Methodology & Data Sources
Data Sources
- ECI Form 20: Official voter deletion records — 2020 State & 2024 National elections
- Bihar Caste Survey 2022: Surname-to-caste mapping (1,200+ surnames)
- Census 2011: Demographics (population, literacy, sex ratio, urban/rural)
- GDDP 2021-22: District-level economic indicators (per capita income)
Classification Methods
- Gender: DEVI/KUMARI suffix = Female; ECI field validation; ~3-5% error margin
- Caste: Surname mapping to EBC/OBC/SC/ST/FC; ~8-12% ambiguous surnames
- Religion: Surname patterns = Hindu/Muslim/Unknown; ~15% Unknown category
- Border: Nepal (8 districts), Jharkhand (9), UP (3), WB (2), Interior (16)
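The gender rule above can be sketched as a simple suffix check. This is an illustration under stated assumptions, not the project's actual classifier: the function name and the "Other" label are hypothetical, the match is on the final name token, and the ECI field validation step is not modelled.

```python
FEMALE_SUFFIXES = ("DEVI", "KUMARI")


def classify_gender(name: str) -> str:
    """Return "Female" when the last name token is DEVI or KUMARI.

    Everything else is labelled "Other" here; how the real pipeline
    treats such records is not documented in the source.
    """
    tokens = name.strip().upper().split()
    if tokens and tokens[-1] in FEMALE_SUFFIXES:
        return "Female"
    return "Other"


print(classify_gender("Sunita Devi"))   # Female
print(classify_gender("Ramesh Kumar"))  # Other
```

Matching on whole tokens rather than raw string endings keeps KUMAR from being read as KUMARI, one of the ambiguities behind the quoted ~3-5% error margin.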
Limitations & Caveats
- Surname classification has inherent ambiguity — not definitive identity markers
- Correlation is not Causation — no causal claims made or implied
- 2020 (State elections) vs 2024 (National) — different election types, timelines
- Deletions include deaths, migration, duplicates — cannot isolate individual reasons
- All findings are observational and academic — not accusations or prescriptions
Visualization Types (61 total charts)
- Sankey Diagrams (9): Flow transitions (alliance, quartile, region)
- Line/Area Charts (18): Trends, comparisons, distributions
- Scatter Plots (12): Relationships, constituency-level distributions
- Radar Charts (8): Multi-dimensional profiles
- Treemaps (4): Hierarchical shares
- Gauges (4): Key percentage metrics
- Polar Area/Doughnut (6): Distribution, composition