Machine learning-based system for recommending comparable properties for real estate appraisals

mhashir03/Property-Recommendation-System

Property Recommendation System

A machine learning-based system for recommending comparable properties for real estate appraisals. The system combines property characteristics, location data, and textual descriptions to find and rank similar properties, and it explains each recommendation in terms of the features that drove it.

Features

  • Advanced Feature Engineering: Combines property characteristics, location data, and textual descriptions
  • Multiple ML Models: Support for Random Forest, Gradient Boosting, SVM, and ensemble methods
  • Explainable AI: Detailed explanations for each recommendation using feature importance
  • Comprehensive Evaluation: Multiple metrics including precision, recall, and similarity scores
  • Real-time Predictions: Generate recommendations for any property in the dataset

Project Structure

Property-Recommendation-System/
├── src/
│   ├── data/
│   │   ├── data_loader.py           # Data loading and preprocessing
│   │   └── feature_engineering.py   # Feature extraction and transformation
│   ├── models/
│   │   └── comp_recommender.py      # ML model implementations
│   ├── evaluation/
│   │   └── evaluator.py             # Model evaluation and metrics
│   ├── utils/
│   │   └── explainability.py        # Explanation generation
│   └── config/
│       └── config.py                # Configuration settings
├── models/                          # Trained model files
├── results/                         # Evaluation results and recommendations
├── explanations/                    # Generated explanations
├── train.py                         # Model training script
├── predict.py                       # Prediction and recommendation script
└── requirements.txt                 # Python dependencies

Installation

  1. Clone the repository:
git clone <repository-url>
cd Property-Recommendation-System
  2. Install dependencies:
pip install -r requirements.txt
  3. Ensure you have the dataset file appraisals_dataset.json in the project root directory.

Usage

Training Models

Train a model using the training script:

# Train a Random Forest model
python3 train.py --data-path appraisals_dataset.json --model-types random_forest --n-comps 3

# Train multiple models
python3 train.py --data-path appraisals_dataset.json --model-types random_forest gradient_boosting --n-comps 5

# Train with ensemble method
python3 train.py --data-path appraisals_dataset.json --model-types random_forest gradient_boosting --use-ensemble

Training Options:

  • --data-path: Path to the dataset file (default: appraisals_dataset.json)
  • --model-types: Types of models to train (random_forest, gradient_boosting, svm)
  • --n-comps: Number of comparable properties to recommend (default: 3)
  • --use-ensemble: Use ensemble of multiple models
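
For orientation, the options above could be declared with argparse roughly as follows. This is only a sketch of how such a command line might be wired up, not the actual contents of train.py: the function name build_train_parser and the default for --model-types are assumptions, while the flag names and the other defaults come from the list above.

```python
import argparse


def build_train_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented training options."""
    parser = argparse.ArgumentParser(
        description="Train comparable-property recommendation models"
    )
    parser.add_argument(
        "--data-path",
        default="appraisals_dataset.json",
        help="Path to the dataset file",
    )
    parser.add_argument(
        "--model-types",
        nargs="+",  # accepts one or more model names
        choices=["random_forest", "gradient_boosting", "svm"],
        default=["random_forest"],  # assumed default; not documented
        help="Types of models to train",
    )
    parser.add_argument(
        "--n-comps",
        type=int,
        default=3,
        help="Number of comparable properties to recommend",
    )
    parser.add_argument(
        "--use-ensemble",
        action="store_true",  # flag is False unless passed
        help="Use ensemble of multiple models",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_train_parser().parse_args(
    ["--model-types", "random_forest", "gradient_boosting", "--n-comps", "5"]
)
print(args.model_types, args.n_comps, args.use_ensemble)
```

With nargs="+", several model names can follow a single --model-types flag, which matches the multi-model commands shown earlier.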

Generating Recommendations

Generate recommendations for a specific property:

# Basic recommendation
python3 predict.py --subject-id 4762597 --model-path models/random_forest_model.pkl

# With custom output path
python3 predict.py --subject-id 4762597 --model-path models/random_forest_model.pkl --output-path results/my_recommendations.json

# More recommendations with LLM explanations
python3 predict.py --subject-id 4762597 --model-path models/random_forest_model.pkl --n-comps 5 --use-llm

Prediction Options:

  • --subject-id: ID of the property to generate recommendations for (required)
  • --model-path: Path to the trained model file
  • --data-path: Path to the dataset file (default: appraisals_dataset.json)
  • --n-comps: Number of recommendations to generate (default: 3)
  • --use-llm: Use LLM for enhanced explanations
  • --output-path: Path to save recommendations JSON file
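
As a rough illustration of what --n-comps controls, the sketch below keeps the highest-scoring candidates from a mapping of candidate IDs to scores. The function name top_n_comps, the candidate labels, and the scores are hypothetical; how predict.py actually scores and selects candidates is not described here.

```python
import heapq


def top_n_comps(scores: dict, n_comps: int = 3) -> list:
    """Return the n_comps (candidate, score) pairs with the highest scores, best first."""
    return heapq.nlargest(n_comps, scores.items(), key=lambda item: item[1])


# Made-up candidate labels and scores, for illustration only.
candidate_scores = {"cand_a": 0.284, "cand_b": 0.191, "cand_c": 0.402, "cand_d": 0.077}
print(top_n_comps(candidate_scores, n_comps=2))
```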

Example Output

The system generates detailed recommendations with explanations:

===== PROPERTY RECOMMENDATIONS =====

SUBJECT PROPERTY:
ID: 4762597
Address: 142-950 Oakview Ave Kingston ON K7M 6W8
Structure Type: Townhouse
Year Built: 1976.0
GLA: 1044.0
Bedrooms: 3
Bathrooms: 1:1

RECOMMENDED COMPARABLE PROPERTIES:

1. 311 Janette St
   Similarity Score: 0.2840
   Structure Type: Freehold Townhouse
   Year Built: None
   GLA: 1500.0
   Bedrooms: 3.0
   Price: 585000.0

   Key Factors:
   - Heating forced air: 6.82 (positive)
   - Structure type freehold townhouse: 9.10 (positive)
   - Basement features: -0.06 (negative)

Available Subject IDs

To find available subject IDs in your dataset:

python3 -c "import json; data = json.load(open('appraisals_dataset.json', 'r')); print('Available IDs:', [appraisal['orderID'] for appraisal in data['appraisals'][:5]])"
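
The same lookup can be written as a small reusable function that closes the file when done. This is a sketch that assumes only the layout used by the one-liner above (a top-level appraisals list whose entries carry an orderID); the function names are illustrative.

```python
import json


def list_subject_ids(path="appraisals_dataset.json", limit=5):
    """Return the first `limit` orderID values; pass limit=None for all of them."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    return [appraisal["orderID"] for appraisal in data["appraisals"][:limit]]


def subject_exists(subject_id, path="appraisals_dataset.json"):
    """Check whether a subject ID is present, comparing as strings."""
    return str(subject_id) in {str(i) for i in list_subject_ids(path, limit=None)}


# Example usage (requires the dataset file to be present):
# print("Available IDs:", list_subject_ids())
```

Comparing as strings avoids a mismatch if the IDs are stored as text in the dataset but passed as numbers on the command line, or the other way around.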

Model Performance

The system has been evaluated on multiple metrics:

  • Precision & Recall: Share of recommended comparables that are relevant, and share of relevant comparables that are recommended
  • Similarity Scores: Quantitative similarity between properties
  • Feature Importance: Understanding which factors drive recommendations
  • Coverage: Percentage of properties that can receive recommendations

Results are saved in the results/ directory with detailed evaluation reports.
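
To make the precision and recall definitions concrete, here is a minimal sketch for a single subject property. It assumes that the "relevant" set is a list of known-good comparable IDs for that subject; the function name and the sample IDs are illustrative and are not taken from evaluator.py.

```python
def precision_recall_f1(recommended, relevant):
    """Compute precision, recall, and F1 for one subject's recommendations."""
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)  # recommended items that are also relevant
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(rel) if rel else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Two of the three recommendations are relevant; two of the three relevant items are found.
p, r, f1 = precision_recall_f1(["a", "b", "c"], ["b", "c", "d"])
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```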

Configuration

Key configuration options in src/config/config.py:

  • Model parameters (n_estimators, max_depth, etc.)
  • Feature engineering settings
  • Evaluation metrics
  • File paths and directories
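
One way such settings are often grouped is with dataclasses, as in the sketch below. The class names, field names, and values are assumptions made for illustration and may not match src/config/config.py; only n_estimators, max_depth, the dataset file name, the default of three comparables, and the directory names appear elsewhere in this README.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelParams:
    n_estimators: int = 100          # assumed value
    max_depth: Optional[int] = None  # assumed value; None leaves depth unlimited


@dataclass
class Config:
    data_path: str = "appraisals_dataset.json"
    models_dir: str = "models"              # trained model files
    results_dir: str = "results"            # evaluation results and recommendations
    explanations_dir: str = "explanations"  # generated explanations
    n_comps: int = 3
    # Mutable defaults need default_factory so instances do not share state.
    metrics: list = field(default_factory=lambda: ["precision", "recall", "f1"])
    model_params: ModelParams = field(default_factory=ModelParams)


config = Config()
print(config.results_dir, config.model_params.n_estimators)
```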

Advanced Features

Feature Engineering

The system automatically extracts and engineers features including:

  • Property characteristics (bedrooms, bathrooms, GLA, etc.)
  • Location features (coordinates, municipality, etc.)
  • Text features from property descriptions (TF-IDF)
  • Categorical encodings
  • Numerical scaling and normalization
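
The numerical scaling and categorical encoding steps can be sketched in plain Python as below. This is a simplified illustration, not the code in feature_engineering.py: it assumes z-score scaling and one-hot encoding, uses hypothetical key names (gla, structure_type), and does not handle missing values such as a Year Built of None.

```python
from statistics import mean, pstdev


def zscore(values):
    """Scale values to zero mean and unit variance (all zeros if constant)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def one_hot(values):
    """Return the sorted category list and one indicator vector per value."""
    categories = sorted(set(values))
    return categories, [[1.0 if v == c else 0.0 for c in categories] for v in values]


def build_features(props, numeric_keys, categorical_keys):
    """Build one feature row per property, plus the matching column names."""
    names, rows = [], [[] for _ in props]
    for key in numeric_keys:
        names.append(key)
        for row, value in zip(rows, zscore([float(p[key]) for p in props])):
            row.append(value)
    for key in categorical_keys:
        categories, encoded = one_hot([p[key] for p in props])
        names.extend(f"{key}={c}" for c in categories)
        for row, vector in zip(rows, encoded):
            row.extend(vector)
    return names, rows


# GLA and structure type values taken from the example output in this README.
props = [
    {"gla": 1044.0, "structure_type": "Townhouse"},
    {"gla": 1500.0, "structure_type": "Freehold Townhouse"},
]
names, rows = build_features(props, ["gla"], ["structure_type"])
print(names)
print(rows)
```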

Model Explanations

Each recommendation includes:

  • Similarity score
  • Top contributing features
  • Feature impact analysis
  • Detailed property comparisons
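
The "Key Factors" lines in the example output could be produced from a mapping of feature names to signed contributions, roughly as sketched here. Ranking by absolute contribution is an assumption (the example output is not ordered that way), and the function name is hypothetical; the contribution values are the ones shown in the example output.

```python
def format_key_factors(contributions, top_k=3):
    """Format the top_k contributions by absolute size as 'Key Factors' lines."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    lines = []
    for name, value in ranked[:top_k]:
        direction = "positive" if value >= 0 else "negative"
        lines.append(f"- {name}: {value:.2f} ({direction})")
    return lines


contributions = {
    "Heating forced air": 6.82,
    "Structure type freehold townhouse": 9.10,
    "Basement features": -0.06,
}
for line in format_key_factors(contributions):
    print(line)
```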

Evaluation Metrics

Comprehensive evaluation using:

  • Classification metrics (precision, recall, F1)
  • Ranking metrics (MAP, NDCG)
  • Similarity metrics
  • Feature importance analysis
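
For the ranking metrics, the following sketch shows the standard binary-relevance forms of average precision and NDCG@k for one ranked list. MAP is then the mean of the average precision values over all subject properties. The function names are illustrative, the ranked list is assumed to contain no duplicates, and evaluator.py may compute these metrics differently.

```python
import math


def average_precision(ranked, relevant):
    """Average of precision@rank taken at each rank where a relevant item appears."""
    relevant = set(relevant)
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the top k divided by the best achievable DCG."""
    relevant = set(relevant)
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


ranked, relevant = ["a", "b", "c"], ["a", "c"]
print(f"AP={average_precision(ranked, relevant):.4f}")
print(f"NDCG@3={ndcg_at_k(ranked, relevant, 3):.4f}")
```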

Troubleshooting

Common Issues:

  1. Subject ID not found: Ensure the subject ID exists in your dataset
  2. Model not found: Train a model first using train.py
  3. Feature engineer not found: Retrain the model to save the feature engineer
  4. JSON serialization errors: The system automatically handles timestamp conversions
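
If a JSON serialization error still appears when saving custom output, a common workaround is to pass a default handler to json.dumps that converts date and time values to ISO 8601 strings. The sketch below shows that general technique and is not the project's own conversion code; the handler name, the generated_at field, and the timestamp are illustrative. pandas Timestamp objects are datetime subclasses, so the same handler covers them.

```python
import json
from datetime import date, datetime


def to_jsonable(value):
    """Fallback for json.dumps: convert dates and datetimes to ISO 8601 strings."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


# Illustrative record; the timestamp is made up for this example.
record = {"subject_id": "4762597", "generated_at": datetime(2024, 1, 15, 9, 30)}
print(json.dumps(record, default=to_jsonable))
```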

Getting Help:

Check the evaluation results in results/ for model performance metrics and potential issues with specific properties.

Dependencies

See requirements.txt for the complete list of dependencies. Key packages include:

  • pandas, numpy: Data processing
  • scikit-learn: Machine learning models
  • matplotlib, seaborn: Visualization
  • joblib: Model serialization
