Property Recommendation System

A machine learning-based system for recommending comparable properties for real estate appraisals. The system compares property characteristics, location data, and text descriptions to find and rank similar properties, and it explains each recommendation.

Features

Advanced Feature Engineering: Combines property characteristics, location data, and textual descriptions
Multiple ML Models: Support for Random Forest, Gradient Boosting, SVM, and ensemble methods
Explainable AI: Detailed explanations for each recommendation using feature importance
Comprehensive Evaluation: Multiple metrics including precision, recall, and similarity scores
Real-time Predictions: Generate recommendations for any property in the dataset

Project Structure

Property-Recommendation-System/
├── src/
│   ├── data/
│   │   ├── data_loader.py           # Data loading and preprocessing
│   │   └── feature_engineering.py   # Feature extraction and transformation
│   ├── models/
│   │   └── comp_recommender.py      # ML model implementations
│   ├── evaluation/
│   │   └── evaluator.py             # Model evaluation and metrics
│   ├── utils/
│   │   └── explainability.py        # Explanation generation
│   └── config/
│       └── config.py                # Configuration settings
├── models/                          # Trained model files
├── results/                         # Evaluation results and recommendations
├── explanations/                    # Generated explanations
├── train.py                         # Model training script
├── predict.py                       # Prediction and recommendation script
└── requirements.txt                 # Python dependencies

Installation

Clone the repository:

git clone <repository-url>
cd Property-Recommendation-System

Install dependencies:

pip install -r requirements.txt

Ensure you have the dataset file appraisals_dataset.json in the project root directory.

Usage

Training Models

Train a model using the training script:

# Train a Random Forest model
python3 train.py --data-path appraisals_dataset.json --model-types random_forest --n-comps 3

# Train multiple models
python3 train.py --data-path appraisals_dataset.json --model-types random_forest gradient_boosting --n-comps 5

# Train with ensemble method
python3 train.py --data-path appraisals_dataset.json --model-types random_forest gradient_boosting --use-ensemble

Training Options:

--data-path: Path to the dataset file (default: appraisals_dataset.json)
--model-types: Types of models to train (random_forest, gradient_boosting, svm)
--n-comps: Number of comparable properties to recommend (default: 3)
--use-ensemble: Use ensemble of multiple models
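
The training options above map naturally onto an argparse parser. The following is a minimal sketch, not the actual contents of train.py: the function name build_train_parser and the default value for --model-types are assumptions made for illustration, while the flag names and the other defaults come from the list above.

```python
import argparse


def build_train_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented training options."""
    parser = argparse.ArgumentParser(
        description="Train comparable-property recommendation models"
    )
    parser.add_argument(
        "--data-path",
        default="appraisals_dataset.json",
        help="Path to the dataset file",
    )
    parser.add_argument(
        "--model-types",
        nargs="+",  # accepts one or more space-separated model names
        choices=["random_forest", "gradient_boosting", "svm"],
        default=["random_forest"],  # assumed default, not documented
        help="Types of models to train",
    )
    parser.add_argument(
        "--n-comps",
        type=int,
        default=3,
        help="Number of comparable properties to recommend",
    )
    parser.add_argument(
        "--use-ensemble",
        action="store_true",  # flag is False unless passed
        help="Use an ensemble of the trained models",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_train_parser().parse_args(
    ["--model-types", "random_forest", "svm", "--n-comps", "5"]
)
print(args.model_types, args.n_comps, args.use_ensemble)
```

Using nargs="+" is what lets a single --model-types flag take several model names, as in the multi-model command shown earlier.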

Generating Recommendations

Generate recommendations for a specific property:

# Basic recommendation
python3 predict.py --subject-id 4762597 --model-path models/random_forest_model.pkl

# With custom output path
python3 predict.py --subject-id 4762597 --model-path models/random_forest_model.pkl --output-path results/my_recommendations.json

# More recommendations with LLM explanations
python3 predict.py --subject-id 4762597 --model-path models/random_forest_model.pkl --n-comps 5 --use-llm

Prediction Options:

--subject-id: ID of the property to generate recommendations for (required)
--model-path: Path to the trained model file
--data-path: Path to the dataset file (default: appraisals_dataset.json)
--n-comps: Number of recommendations to generate (default: 3)
--use-llm: Use LLM for enhanced explanations
--output-path: Path to save recommendations JSON file

Example Output

The system generates detailed recommendations with explanations:

===== PROPERTY RECOMMENDATIONS =====

SUBJECT PROPERTY:
ID: 4762597
Address: 142-950 Oakview Ave Kingston ON K7M 6W8
Structure Type: Townhouse
Year Built: 1976.0
GLA: 1044.0
Bedrooms: 3
Bathrooms: 1:1

RECOMMENDED COMPARABLE PROPERTIES:

1. 311 Janette St
   Similarity Score: 0.2840
   Structure Type: Freehold Townhouse
   Year Built: None
   GLA: 1500.0
   Bedrooms: 3.0
   Price: 585000.0

   Key Factors:
   - Heating forced air: 6.82 (positive)
   - Structure type freehold townhouse: 9.10 (positive)
   - Basement features: -0.06 (negative)

Available Subject IDs

To list the first five subject IDs in your dataset:

python3 -c "import json; data = json.load(open('appraisals_dataset.json', 'r')); print('Available IDs:', [appraisal['orderID'] for appraisal in data['appraisals'][:5]])"
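
For repeated use, the same lookup reads more easily as a small script. This sketch relies only on what the one-liner shows (a top-level "appraisals" list whose entries carry an "orderID" field); the function name list_subject_ids and its limit parameter are illustrative.

```python
import json
from pathlib import Path


def list_subject_ids(data_path, limit=5):
    """Return up to `limit` orderID values from the dataset's "appraisals" list.

    Pass limit=None to return every ID.
    """
    # The context manager closes the file even if JSON parsing fails.
    with Path(data_path).open("r", encoding="utf-8") as f:
        data = json.load(f)
    return [appraisal["orderID"] for appraisal in data["appraisals"][:limit]]


# Only run the demo when the dataset is present in the working directory.
if Path("appraisals_dataset.json").exists():
    print("Available IDs:", list_subject_ids("appraisals_dataset.json"))
```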

Model Performance

The system has been evaluated on multiple metrics:

Precision & Recall: Share of recommended comparables that are relevant, and share of relevant comparables that are recommended
Similarity Scores: Quantitative similarity between properties
Feature Importance: Understanding which factors drive recommendations
Coverage: Percentage of properties that can receive recommendations

Results are saved in the results/ directory with detailed evaluation reports.
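
To make the precision and recall metrics concrete, here is a minimal sketch of how they could be computed for one subject property. It assumes the relevant set is the list of comparables an appraiser actually chose; the function name and the example IDs are illustrative and are not taken from evaluator.py.

```python
def precision_recall_f1(recommended, relevant):
    """Compute precision, recall, and F1 for one list of recommendations.

    recommended: IDs returned by the model (assumed to be distinct).
    relevant: IDs of the comparables considered correct.
    """
    recommended = list(recommended)
    relevant = set(relevant)
    hits = sum(1 for item in recommended if item in relevant)

    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Three recommendations, two of which appear among four relevant comps.
p, r, f1 = precision_recall_f1(["a", "b", "c"], ["a", "c", "d", "e"])
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

With --n-comps 3, precision is measured over the three returned properties, so raising --n-comps tends to trade precision for recall.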

Configuration

Key configuration options in src/config/config.py:

Model parameters (n_estimators, max_depth, etc.)
Feature engineering settings
Evaluation metrics
File paths and directories
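
As an illustration of how such settings might be grouped, the sketch below uses frozen dataclasses. It is not the actual contents of src/config/config.py: the class names and field names are hypothetical, and the n_estimators and max_depth values are placeholders. The directory names and the default of three comparables come from this README.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class ModelParams:
    """Hypothetical tree-model hyperparameters (placeholder values)."""
    n_estimators: int = 100
    max_depth: Optional[int] = None  # None means no depth limit in scikit-learn


@dataclass(frozen=True)
class Config:
    """Hypothetical settings container; paths follow the project layout."""
    data_path: Path = Path("appraisals_dataset.json")
    models_dir: Path = Path("models")
    results_dir: Path = Path("results")
    explanations_dir: Path = Path("explanations")
    n_comps: int = 3
    model_params: ModelParams = field(default_factory=ModelParams)


config = Config()
print(config.models_dir / "random_forest_model.pkl")
```

Freezing the dataclasses keeps settings from being changed by accident at runtime; a different value is supplied by constructing a new Config.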

Advanced Features

Feature Engineering

The system automatically extracts and engineers features including:

Property characteristics (bedrooms, bathrooms, GLA, etc.)
Location features (coordinates, municipality, etc.)
Text features from property descriptions (TF-IDF)
Categorical encodings
Numerical scaling and normalization
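
The feature types listed above can be combined in a single scikit-learn ColumnTransformer. This is a sketch under stated assumptions rather than the code in feature_engineering.py: the column names and the toy rows are illustrative and do not reflect the dataset's actual schema.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy rows with illustrative column names (not the real dataset schema).
properties = pd.DataFrame(
    {
        "gla": [1044.0, 1500.0, 1320.0],
        "bedrooms": [3, 3, 4],
        "structure_type": ["Townhouse", "Freehold Townhouse", "Detached"],
        "description": [
            "bright end unit townhouse",
            "renovated townhouse near park",
            "detached home with finished basement",
        ],
    }
)

preprocessor = ColumnTransformer(
    [
        # Numerical scaling: zero mean, unit variance per column.
        ("numeric", StandardScaler(), ["gla", "bedrooms"]),
        # Categorical encoding: unseen categories become all-zero rows.
        ("categorical", OneHotEncoder(handle_unknown="ignore"), ["structure_type"]),
        # Text features: the column is passed as a string (not a list)
        # because TfidfVectorizer expects a 1-D sequence of documents.
        ("text", TfidfVectorizer(), "description"),
    ]
)

features = preprocessor.fit_transform(properties)
print(features.shape)  # (number of properties, number of engineered features)
```

Fitting the transformer once during training and reusing it at prediction time keeps the feature columns consistent between the two steps.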

Model Explanations

Each recommendation includes:

Similarity score
Top contributing features
Feature impact analysis
Detailed property comparisons
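
The "Key Factors" lines in the example output can be produced from a mapping of feature names to signed contribution values. The sketch below is one possible formatter, not the code in explainability.py; the function name is hypothetical, and ranking by absolute contribution is an assumption (the sample output shown earlier is not ordered that way).

```python
def format_key_factors(contributions, top_n=3):
    """Format the top_n feature contributions as 'Key Factors' lines.

    contributions: dict mapping a feature name to a signed score.
    Features are ranked by absolute value (an assumption of this sketch).
    """
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    lines = []
    for name, value in ranked[:top_n]:
        direction = "positive" if value >= 0 else "negative"
        lines.append(f"- {name}: {value:.2f} ({direction})")
    return lines


# Values taken from the example output in this README.
factors = {
    "Heating forced air": 6.82,
    "Structure type freehold townhouse": 9.10,
    "Basement features": -0.06,
}
for line in format_key_factors(factors):
    print(line)
```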

Evaluation Metrics

Comprehensive evaluation using:

Classification metrics (precision, recall, F1)
Ranking metrics (MAP, NDCG)
Similarity metrics
Feature importance analysis
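
The ranking metrics named above reward placing relevant comparables near the top of the list. The following sketch shows the standard binary-relevance definitions of average precision (averaged over subjects to obtain MAP) and NDCG; the function names and example IDs are illustrative, and the evaluator's actual implementation may differ.

```python
import math


def average_precision(ranked, relevant):
    """Average precision for one ranked list with binary relevance."""
    relevant = set(relevant)
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevants):
    """MAP: mean of average precision over several subject properties."""
    scores = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores) if scores else 0.0


def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary gains and a log2 position discount."""
    relevant = set(relevant)
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


ranked_ids = ["a", "b", "c"]
relevant_ids = ["a", "c"]
print(round(average_precision(ranked_ids, relevant_ids), 4))
print(round(ndcg_at_k(ranked_ids, relevant_ids, k=3), 4))
```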

Troubleshooting

Common Issues:

Subject ID not found: Ensure the subject ID exists in your dataset
Model not found: Train a model first using train.py
Feature engineer not found: Retrain the model to save the feature engineer
JSON serialization errors: The system automatically handles timestamp conversions
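
The "Model not found" and "Feature engineer not found" issues can be caught early by validating the saved file before predicting. The sketch below assumes, purely for illustration, that the model file is a joblib-serialized dict with "model" and "feature_engineer" keys; the real layout written by train.py may differ, and the function name load_bundle is hypothetical.

```python
import os

import joblib

# Hypothetical bundle layout; adjust to whatever train.py actually saves.
REQUIRED_KEYS = ("model", "feature_engineer")


def load_bundle(model_path):
    """Load a saved model bundle and check that it has the expected parts."""
    if not os.path.exists(model_path):
        raise FileNotFoundError(
            f"Model not found at {model_path}; train a model first using train.py"
        )
    bundle = joblib.load(model_path)
    if not isinstance(bundle, dict):
        raise ValueError("Expected the model file to contain a dict bundle")
    missing = [key for key in REQUIRED_KEYS if key not in bundle]
    if missing:
        raise KeyError(
            f"Model file is missing {missing}; retrain the model to save them"
        )
    return bundle
```

Failing with a specific message at load time is easier to diagnose than an attribute error raised later during prediction.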

Getting Help:

Check the evaluation results in results/ for model performance metrics and potential issues with specific properties.

Dependencies

See requirements.txt for the complete list of dependencies. Key packages include:

pandas, numpy: Data processing
scikit-learn: Machine learning models
matplotlib, seaborn: Visualization
joblib: Model serialization


mhashir03/Property-Recommendation-System
