Pakistan Crimes Analysis (2000–2024)
Comprehensive national crime trend analysis with socioeconomic correlations
Overview
This project examines 25 years of crime data in Pakistan (2000–2024) and explores how crime rates correlate with key socioeconomic indicators. Through statistical analysis and data visualization, I investigate whether literacy, economic development, unemployment, poverty, and urbanization have measurable relationships with national crime patterns.
The goal is to provide evidence-based insights that can inform policy decisions, resource allocation, and crime prevention strategies.
Dataset
The dataset integrates crime statistics with socioeconomic indicators spanning 25 years.
What's Included
Crime Data (9 categories tracked):
Murder
Attempt to Murder
Kidnapping / Abduction
Dacoity
Robbery
Burglary
Cattle Theft
Other Theft
Others (miscellaneous offenses)
Socioeconomic Variables:
Population (Census data + projections)
National literacy rate
GDP per capita (USD)
Unemployment rate
Poverty rate
Urbanization rate
Data Composition
| Period | Records | Source | Quality |
|---|---|---|---|
| 2012–2017 | 54 | Pakistan Bureau of Statistics | ✓ Verified official data |
| 2000–2011, 2018–2024 | 171 | Statistical projection | Modeled using OLS regression |
Total: 225 records (25 years × 9 crime types)
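The record counts in the table follow from one row per year and crime type. A minimal sketch of that arithmetic, assuming the tag strings "Verified" and "Projected" (the actual labels in the dataset may differ):

```python
# Crime categories as listed in this README.
CRIME_TYPES = [
    "Murder", "Attempt to Murder", "Kidnapping / Abduction", "Dacoity",
    "Robbery", "Burglary", "Cattle Theft", "Other Theft", "Others",
]
VERIFIED_YEARS = range(2012, 2018)  # 2012-2017 inclusive


def source_tag(year: int) -> str:
    """Label a year as verified or projected (label strings are assumed)."""
    return "Verified" if year in VERIFIED_YEARS else "Projected"


# One record per (year, crime type) pair for 2000-2024.
grid = [(year, crime, source_tag(year))
        for year in range(2000, 2025)
        for crime in CRIME_TYPES]

verified = sum(1 for _, _, tag in grid if tag == "Verified")
projected = sum(1 for _, _, tag in grid if tag == "Projected")
print(len(grid), verified, projected)  # 225 54 171
```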
Key Findings
Based on analysis of verified data (2012–2017):
Crime & Socioeconomic Correlations
Literacy Rate
Correlation with crime rate: r = −0.633
Interpretation: Higher literacy is associated with lower crime rates
Significance: p = 0.178 (not statistically significant at α = 0.05; n = 6 annual observations)
Economic Development
Correlation with crime rate: r = −0.468
Interpretation: Higher GDP per capita correlates with reduced crime
Significance: p = 0.349 (not statistically significant at α = 0.05)
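For readers who want to check the reported significance values, the two-tailed p-value of a Pearson correlation can be recovered from r and the sample size through the t statistic t = r·sqrt((n − 2) / (1 − r²)). The sketch below uses only the standard library and integrates the t density numerically. It assumes n = 6 (one observation per verified year) and is not necessarily how the values above were produced; the function names are illustrative.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(r: float, n: int, steps: int = 2000) -> float:
    """Two-tailed p-value for a Pearson r from n paired observations."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule (steps must be even) for the density integral over [0, t].
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    # The density is symmetric, so both tails together are 1 - 2 * area.
    return 1 - 2 * area


for label, r in [("literacy", -0.633), ("GDP per capita", -0.468)]:
    print(f"{label}: r = {r}, p = {two_tailed_p(r, n=6):.3f}")
```

With n = 6 this gives p of roughly 0.177 and 0.349, in line with the values reported above; the small gap for literacy is consistent with r having been rounded to three decimals.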
Long-term Trends
Total crimes (2000–2024, verified plus projected): 16.3 million incidents
Population growth: +72.5% (142M → 245M)
Literacy improvement: +14.5 percentage points (45% → 59.5%)
Crime rate trend: Declining despite population growth
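The figures above mix two kinds of change: relative growth (population) and a difference in percentage points (literacy). The short sketch below reproduces both and shows how a per-capita rate can fall while raw counts rise. The per-100,000 unit and the crime counts in the last two lines are illustrative assumptions, not values from the dataset.

```python
def percent_change(start: float, end: float) -> float:
    """Relative change from start to end, in percent."""
    return (end - start) / start * 100


def rate_per_100k(crimes: float, population: float) -> float:
    """Crimes per 100,000 residents."""
    return crimes / population * 100_000


# Population figures from this README (millions).
print(f"Population growth: {percent_change(142, 245):.1f}%")  # 72.5%

# Literacy change is a difference of two percentages: percentage points.
print(f"Literacy change: {59.5 - 45:.1f} percentage points")  # 14.5

# Hypothetical counts: more crimes in absolute terms, yet a lower rate.
early = rate_per_100k(600_000, 142_000_000)
late = rate_per_100k(700_000, 245_000_000)
print(f"{early:.1f} -> {late:.1f} per 100,000")
```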
Crime Distribution
- Most common: "Others" category (79.6% of all crimes)
- Violent crimes: Murder + Attempt to Murder = 4.3% of total
- Property crimes: Burglary + Robbery + Theft = 10.4% of total
Visualizations
Comprehensive Dashboard
This dashboard presents:
- Key Statistics: Total crimes, average crime rate, population growth, literacy gains
- Time Series: Crime trends by category (2000–2024)
- Correlation Heatmap: Relationships between crime and socioeconomic factors
- Scatter Plots: Crime vs. literacy and crime vs. economic development
- Crime Distribution: Breakdown by offense type
- Contextual Trends: Evolution of literacy, unemployment, and poverty
Methodology
Data Collection
Crime data comes from the Pakistan Bureau of Statistics (PBS), the official government source for national statistics. Socioeconomic indicators were compiled from:
- Population: Pakistan Census 1998, 2017, 2023 + PBS demographic projections
- Literacy: Pakistan Economic Survey (annual publications 2000–2024), UNESCO Institute for Statistics, Pakistan Social & Living Standards Measurement (PSLM) surveys
- Economic Data: World Bank Open Data, State Bank of Pakistan annual reports
- Poverty & Development: World Bank poverty estimates, UNDP Human Development Reports
Statistical Approach
For the verified period (2012–2017):
- Direct extraction from PBS Crime Statistics Portal
- No estimation or modeling required
For extended coverage (2000–2011, 2018–2024):
- Calculated per-capita crime rates from verified data
- Fitted linear regression models for each crime type
- Applied trends to historical and future population estimates
- Added realistic variance (±5%) to avoid artificial smoothness
- Clearly tagged all projections in Data_Source column
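The projection steps above could be sketched as follows for a single crime type. This is an illustration under stated assumptions rather than the project's actual code: rates are expressed per 100,000 residents, the ±5% variance is drawn uniformly, negative trend values are clamped to zero, a fixed seed is used, and the tag strings, function names, and input numbers are all made up.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def project_counts(years, rates, population, noise=0.05, seed=0):
    """Build yearly counts for every year in `population`.

    years/rates: verified years and crimes per 100,000 for one crime type.
    population: {year: residents} covering the full period.
    """
    rng = random.Random(seed)  # fixed seed keeps the projection reproducible
    intercept, slope = fit_line(years, rates)
    verified = dict(zip(years, rates))
    rows = []
    for year, pop in sorted(population.items()):
        if year in verified:
            rate, tag = verified[year], "Verified"
        else:
            rate = max(intercept + slope * year, 0.0)  # no negative rates
            rate *= 1 + rng.uniform(-noise, noise)     # +/-5% jitter (assumed uniform)
            tag = "Projected"
        rows.append({"Year": year,
                     "Count": round(rate * pop / 100_000),
                     "Data_Source": tag})
    return rows


# Made-up illustrative inputs, not values from the dataset.
years = [2012, 2013, 2014, 2015, 2016, 2017]
rates = [8.0, 7.8, 7.5, 7.4, 7.1, 6.9]
population = {y: 142_000_000 + 4_000_000 * (y - 2000) for y in range(2000, 2025)}

rows = project_counts(years, rates, population)
print(rows[0], rows[12], rows[-1])
```

Verified years pass through unchanged, so only rows tagged as projected carry the fitted trend and the added variance.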
Correlation Analysis:
- Pearson correlation coefficients calculated
- Two-tailed significance tests performed
- Limited to verified data only (2012–2017) for robustness
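As a rough illustration of restricting the correlation to verified rows, the sketch below filters on a `Data_Source` tag before computing Pearson's r by hand. The column names other than `Data_Source`, the tag strings, and every number are invented for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical yearly rows; the values are invented for illustration.
rows = [
    {"Year": 2011, "Data_Source": "Projected", "literacy": 57.0, "crime_rate": 331.0},
    {"Year": 2012, "Data_Source": "Verified", "literacy": 57.5, "crime_rate": 328.0},
    {"Year": 2013, "Data_Source": "Verified", "literacy": 58.0, "crime_rate": 330.0},
    {"Year": 2014, "Data_Source": "Verified", "literacy": 58.2, "crime_rate": 321.0},
    {"Year": 2015, "Data_Source": "Verified", "literacy": 58.6, "crime_rate": 324.0},
    {"Year": 2016, "Data_Source": "Verified", "literacy": 58.9, "crime_rate": 315.0},
    {"Year": 2017, "Data_Source": "Verified", "literacy": 59.1, "crime_rate": 318.0},
    {"Year": 2018, "Data_Source": "Projected", "literacy": 59.3, "crime_rate": 312.0},
]

# Keep verified rows only, so modeled values cannot drive the correlation.
verified = [row for row in rows if row["Data_Source"] == "Verified"]
r = pearson_r([row["literacy"] for row in verified],
              [row["crime_rate"] for row in verified])
print(f"n = {len(verified)}, r = {r:.3f}")
```

Excluding projected rows matters here because they are generated from a fitted linear trend, which would otherwise inflate any correlation with other trending variables.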
Data Integrity
All projections are:
- Based on established statistical methods
- Transparent and reproducible
- Clearly documented in the dataset
- Conservative (using linear trends, not complex models)
- Python 3.12 – Data analysis and visualization
- Pandas – Data manipulation and cleaning
- NumPy – Numerical computations
- SciPy – Statistical testing (correlation, regression)
- Matplotlib – Visualization framework
- Seaborn – Statistical data visualization
- Jupyter Notebook – Interactive analysis environment
- Microsoft Excel / LibreOffice Calc – Multi-sheet data workbook
All data is from official, publicly accessible sources:
- Pakistan Bureau of Statistics (PBS)
  - Crime Statistics Portal: https://social.data.gov.pk/
  - Pakistan Census 1998, 2017, 2023
- World Bank Open Data
  - GDP, poverty, urbanization: https://data.worldbank.org/
- Pakistan Economic Survey
  - Annual reports (2000–2024): https://www.finance.gov.pk/
- UNESCO Institute for Statistics
  - Literacy data: http://uis.unesco.org/
- State Bank of Pakistan
  - Economic indicators: https://www.sbp.org.pk/
- Fully verified: 2012–2017 (PBS Crime Portal)
- Projected: 2000–2011, 2018–2024 (statistical extrapolation clearly documented)
This analysis has important constraints to keep in mind:
- Projected Data: Crime figures for 2000–2011 and 2018–2024 are statistical estimates, not official records. They assume linear trends, which may not capture non-linear changes.
- Reporting Accuracy: Crime statistics depend on reporting rates. Underreporting (especially of certain crimes) may affect accuracy.
- National Level Only: Provincial or district-level breakdowns are not available for the full 25-year period. Regional variations are not captured.
- Correlation vs. Causation: Statistical correlations do not prove causation. Multiple factors influence crime beyond those analyzed here.
- COVID-19 Impact: The 2020–2021 period may show unusual patterns due to pandemic-related lockdowns and reporting changes.
- "Others" Category: This broad category (79.6% of crimes) includes many offense types, limiting granular analysis.
Potential extensions of this analysis:
- Incorporate provincial-level data (2012–2017) for regional comparisons
- Add time-series forecasting models (ARIMA, Prophet)
- Machine learning classification: predict crime category based on socioeconomic factors
- Panel regression analysis with fixed effects
- Integrate additional variables: education spending, police per capita, judicial efficiency
- Interactive dashboard using Plotly/Dash for real-time exploration
- Comparison with neighboring countries (India, Bangladesh, Afghanistan)
Contributions are welcome! See Contributing below.
Contributing
I welcome suggestions, corrections, and contributions to improve this analysis.
- Fork this repository
- Create a feature branch: `git checkout -b feature/your-feature-name`
- Commit your changes: `git commit -m "Add your message"`
- Push to the branch: `git push origin feature/your-feature-name`
- Open a Pull Request
If you find errors or have questions:
- Open an issue on GitHub
- Provide detailed information (what you found, what you expected, screenshots if relevant)
This project is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).
You are free to:
- ✓ Share — copy and redistribute the material
- ✓ Adapt — remix, transform, and build upon the material for any purpose, even commercially
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made
- ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license
See LICENSE file for full details.
If you use this dataset or analysis in academic work, please cite:
Rabbani, M. (2024). Pakistan Crimes Analysis (2000–2024): National trends with socioeconomic correlations. GitHub repository. https://github.com/Rabbani-bot/pakistan-crimes-analysis
Mohammad Rabbani
Data Analyst & Research Consultant
- LinkedIn: www.linkedin.com/in/rabbani-analyst
- Kaggle: https://www.kaggle.com/rabbani0123
- Email: mrabbani777tcd@gmail.com
- Location: Mingora, Khyber Pakhtunkhwa, Pakistan
I am always interested in discussing data analysis, criminology research, development economics, or potential collaboration opportunities. Feel free to reach out!
⭐ If you find this project useful, please consider starring it on GitHub! It helps others discover this work.
Last updated: February 2026
