This project analyzes data from the Divar platform (an online classified-ads company in Iran). It covers Exploratory Data Analysis (EDA), statistical analysis, a recommender system, and price/rent prediction. The main goal is to use machine learning techniques to better understand the data and to build predictive models.
The dataset used in this project is not included in the repository because of its size (approximately 1 million records). It has 64 columns:
- 20 numerical columns
- 44 categorical columns
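As a rough illustration of the numerical/categorical split described above, the sketch below separates columns by dtype with pandas. The tiny DataFrame and its column names are hypothetical stand-ins, not the real Divar schema; on the actual data the two counts would be expected to come out as 20 and 44 only if the columns are loaded with matching dtypes.

```python
import pandas as pd

# Tiny stand-in frame: these column names and values are hypothetical,
# NOT the real Divar schema or data.
df = pd.DataFrame({
    "price": [1_200_000_000, 950_000_000, 2_100_000_000],
    "building_size": [85.0, 60.0, 120.0],
    "city": ["Tehran", "Shiraz", "Tehran"],
    "has_parking": ["yes", "no", "yes"],
})

# Numeric dtypes (int, float) count as numerical; everything else as categorical.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

print(len(numeric_cols), "numerical columns:", numeric_cols)
print(len(categorical_cols), "categorical columns:", categorical_cols)
```

Note that a dtype-based split treats integer-coded categories as numerical, so a manual review of the column list may still be needed.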
The divar_project repository is divided into 5 main sections in src/:
- EDA (Exploratory Data Analysis): Exploratory analysis of the data to understand distributions, relationships, and patterns.
- Statistical_analysis: Advanced statistical analyses such as statistical tests and statistical modeling.
- recommender_system: Implementation of a recommender system for suggesting products or ads.
- prediction_price and prediction_rent: Predictive models for purchase price and rent. These two sections share one folder.
Tools and libraries used:
- Data Processing: pandas, numpy, scipy
- Visualization: matplotlib, seaborn, plotly, geopandas
- Machine Learning: sklearn, scipy
- Algorithms and Models: k-means, DBSCAN, LightGBM, Random Forest Regressor
Requirements:
- Python 3.11
- Install the required libraries via `pip install -r requirements.txt` (the requirements.txt file should be available in the repository).
To set up and run the project:
- Clone the repository: `git clone https://github.com/username/divar_project.git`
- Navigate to the project directory: `cd divar_project`
- Create a virtual environment (optional): `python -m venv env`
- Install libraries: `pip install -r requirements.txt`
- For each section, run the corresponding scripts (e.g., for EDA: `python eda/main.py`).
- The original data is not uploaded to the repository due to its size. Please download the data from the relevant source and place it in the `data/` folder.
- For questions or collaboration, use Issues or Pull Requests.
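Since the data has to be downloaded and placed in `data/` by hand, a small pre-flight check before running any script may save some confusion. The sketch below is only an assumption about how such a check could look: the helper name `find_data_files` and the file patterns (`*.csv`, `*.parquet`) are hypothetical, because the dataset's file format is not stated here.

```python
from pathlib import Path


def find_data_files(data_dir="data", patterns=("*.csv", "*.parquet")):
    """Return the sorted data files found in data_dir.

    The patterns are an assumption; adjust them to the real dataset format.
    Raises FileNotFoundError if the folder itself is missing.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(
            f"Expected a '{root}' folder; download the dataset and place it there."
        )
    files = []
    for pattern in patterns:
        # Path.glob only searches the top level of the folder for these patterns.
        files.extend(root.glob(pattern))
    return sorted(files)
```

Calling `find_data_files()` from the project root returns an empty list when the folder exists but holds no matching files, which a script can treat as a signal to stop early with a clear message.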
This project is released under the MIT License.