Predicting the dominant religion of a country using flag design features and geopolitical attributes with EDA and deep learning.
This project explores the relationship between a country's flag design and its dominant religion, leveraging exploratory data analysis (EDA) and neural network models. Using both visual (image-based) and numerical (CSV-based) data, we implement and evaluate models like custom CNNs, MobileNetV2, and multi-layer perceptrons to classify religion across countries.
The core idea is to evaluate whether visual elements of national flags can be used to predict a country's dominant religion.
- Conduct EDA on geopolitical and flag-related features.
- Build ML models that classify religion based on:
  - Flag images (custom CNN, MobileNetV2)
  - Geopolitical numeric data (multi-layer neural network)
- Compare model performance and draw conclusions.
The dataset includes 177 countries with 30+ features, such as:
- Geopolitical attributes: continent, population, area, language, religion
- Flag design features: number of colors, dominant color, presence of symbols (crosses, stars, crescents, suns, circles)
A custom dataset of flag images (from 1986) is compiled and aligned with the CSV metadata.
- Removed irrelevant files and images
- Mapped image names to countries
- Standardized categories and labels
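
As an illustration of the image-to-country mapping step, here is a minimal sketch. It assumes each image file is named after its country and that names differ only in case and punctuation; the function names and the sample file names are hypothetical, not taken from the project code.

```python
from pathlib import Path


def normalize(name):
    """Lowercase a name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def map_images_to_countries(image_names, country_names):
    """Match image file names to country names from the CSV.

    Returns (matched, unmatched): a dict of image name -> country name,
    and a list of image names that had no matching country.
    """
    lookup = {normalize(country): country for country in country_names}
    matched, unmatched = {}, []
    for image in image_names:
        key = normalize(Path(image).stem)  # drop the file extension
        if key in lookup:
            matched[image] = lookup[key]
        else:
            unmatched.append(image)
    return matched, unmatched


# Hypothetical file and country names, for illustration only.
matched, unmatched = map_images_to_countries(
    ["Sri_Lanka.png", "france.PNG", "notes.txt"],
    ["Sri-Lanka", "France"],
)
print(matched)
print(unmatched)
```

Files that end up in `unmatched` are candidates for removal or manual renaming.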
- Distribution of colors, symbols, and religion types
- Correlation matrices (e.g., between symbols and religion)
- Comparative visualizations across continents and religions
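
One way to look at the link between symbols and religion is to compute, for each religion group, the share of flags that show a given symbol. The sketch below uses pandas on a tiny made-up table; the column names (`religion`, `crosses`, `crescent`) are assumptions and may not match the project's CSV.

```python
import pandas as pd


def symbol_rate_by_religion(df, symbol_cols, religion_col="religion"):
    """Share of flags in each religion group that show each symbol.

    Symbol columns are treated as counts; any value above 0 means
    the symbol is present on the flag.
    """
    present = df[symbol_cols].gt(0)
    return present.groupby(df[religion_col]).mean()


# Toy data for illustration only, not real dataset rows.
toy = pd.DataFrame(
    {
        "religion": ["Christian", "Christian", "Muslim", "Muslim"],
        "crosses": [1, 2, 0, 0],
        "crescent": [0, 0, 1, 1],
    }
)
rates = symbol_rate_by_religion(toy, ["crosses", "crescent"])
print(rates)
```

The resulting table can be passed to `seaborn.heatmap` for a visual summary.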
| Model Type | Input | Description |
|---|---|---|
| Custom CNN | Flag Images | Designed from scratch to classify religion |
| MobileNetV2 | Flag Images | Pre-trained model adapted to our dataset |
| MLP | CSV Data | Neural network trained on numerical features |
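
For the pre-trained variant, a common transfer-learning setup freezes the MobileNetV2 base and trains only a small classification head. The following is a sketch of that general pattern, not the project's actual architecture: the 96x96 input size, the dropout rate, and the function name are assumptions, and `num_classes` must match however the religion labels were encoded.

```python
import tensorflow as tf


def build_flag_classifier(num_classes, image_size=(96, 96), weights="imagenet"):
    """Frozen MobileNetV2 base with a new softmax head (illustrative sketch)."""
    input_shape = tuple(image_size) + (3,)
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape,
        include_top=False,  # drop the ImageNet classification layer
        weights=weights,    # pass None to skip downloading pre-trained weights
    )
    base.trainable = False  # only the new head is trained

    inputs = tf.keras.Input(shape=input_shape)
    # MobileNetV2 expects pixel values scaled from [0, 255] to [-1, 1].
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)
    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer class labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_flag_classifier(num_classes=8)` would download the ImageNet weights on first use; the value 8 is only an example.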
- Red and white are the most frequent flag colors globally.
- Flags of Christian-majority countries often feature crosses.
- Flags of Muslim-majority countries frequently feature the color green and crescent symbols.
| Model | Accuracy (Global) | Accuracy (Africa) |
|---|---|---|
| Custom CNN | 50% | 80% |
| MobileNetV2 | 20% | 20% |
| MLP (CSV) | 56% (peak) | - |
⚠️ Note: The low accuracy is likely due to the limited dataset size and the visual diversity among flags.
- Flag design does show some correlation with religion, but it's not conclusive.
- Dividing countries by region can improve model accuracy: the custom CNN rose from 50% globally to 80% on Africa, while MobileNetV2 stayed at 20%.
- Complex geopolitical and historical factors limit prediction reliability.
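
A per-region comparison like the one in the results table can be computed with a small helper that groups predictions by continent. This is a minimal sketch on made-up labels; the function name and the sample values are hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for predictions split by a grouping key."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Toy labels for illustration only, not project results.
y_true = ["Muslim", "Christian", "Muslim", "Christian"]
y_pred = ["Muslim", "Christian", "Christian", "Christian"]
continents = ["Africa", "Africa", "Asia", "Europe"]

per_region = accuracy_by_group(y_true, y_pred, continents)
print(per_region)
```

With only a handful of countries per region, such per-group figures are noisy and should be read with caution.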
- Use higher-resolution and modern flag images
- Expand dataset for better generalization
- Add contextual geopolitical data
- Fine-tune and ensemble models
- Python 3.8+
- Libraries: `tensorflow`, `keras`, `pandas`, `numpy`, `matplotlib`, `seaborn`, `scikit-learn`
You have two options for running the project:
- Simply upload the main code file and the `datasets/` folder (containing `flag_images/` and `flags_csv/`) to your Colab session.
- Then, run the cells in the provided notebooks for EDA and data preprocessing.
- You can also train models (CNN, MobileNetV2, MLP) by running the corresponding code cells in the notebook.
- Clone the GitHub repository:

      git clone https://github.com/your-username/religious-classification.git
      cd religious-classification

- Make sure your environment is set up (e.g., with `requirements.txt` or using `conda`/`venv`).
- Run the notebook for EDA and preprocessing.
- For model training, run the script cells in the provided notebooks.
- Place your datasets in the following structure:
religious-classification/
├── datasets/
│ ├── flag_images/
│ └── flags_csv/
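
Before running the notebooks, the layout above can be checked with a few lines of Python. This is an optional sketch, not part of the project; the helper name is hypothetical and only the two folder names come from the structure shown.

```python
from pathlib import Path

# Folders the notebooks expect, relative to the project root.
EXPECTED_DIRS = ("datasets/flag_images", "datasets/flags_csv")


def missing_dataset_dirs(project_root="."):
    """Return the expected dataset folders that do not exist yet."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


missing = missing_dataset_dirs(".")
if missing:
    print("Missing folders:", ", ".join(missing))
else:
    print("Dataset layout looks complete.")
```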
- UCI Flags Dataset
- Flag Images: Flaglog 1986
- TensorFlow & Keras Documentation
- MobileNetV2 Paper