ReciSys: Hybrid Recipe Recommendation with Attention & NLP

This project studies rating prediction on the Food.com dataset using modern recommender system techniques.

I explore two complementary predictive tasks:

Hybrid Recommendation (Collaborative + Content-Based)
- Predict user–recipe ratings using an attention-based neural recommender
- Combines user embeddings, item embeddings, and rich recipe metadata
- Model: SideFeatAttAE
NLP-Based Rating Prediction
- Predict ratings directly from review text
- Models: Fine-tuned DeBERTa and TF-IDF + Ridge regression

Together, these tasks examine how structured interactions and natural language capture user preferences from different perspectives.

Dataset: Food.com (Majumder et al., EMNLP 2019)

230K+ recipes
1M+ user interactions
18 years of data (2000–2018)

After filtering:

~180K recipes
~700K interactions

Original dataset goal: personalized recipe generation
This project focuses on rating prediction and representation learning.

Goal: Predict a user's rating for a recipe.

Inputs

User ID
Recipe ID
Recipe side features:
- cooking time
- number of steps
- number of ingredients
- nutrition vector
- recipe tags

Model

User & item embeddings
Side-feature embedding via MLP
Multi-head attention over [user, item, side] tokens
MLP head + bias terms

Baselines

Global mean
User mean
Item mean
Matrix Factorization (SVD)

Metric

Mean Squared Error (MSE)

Goal: Predict rating from review text alone.

Models

Fine-tuned DeBERTa-v3 (regression)
TF-IDF + Ridge regression

Why NLP?

Captures sentiment, negation, and subtle phrasing
Complements collaborative signals

Metric

Mean Squared Error (MSE)

Model	Test MSE
SideFeatAttAE	0.712
Matrix Factorization (SVD)	0.753
Global Mean	0.845
TF-IDF + Ridge (NLP)	0.378
DeBERTa (NLP)	0.286

Attention-based hybrid models outperform classical CF baselines
Rich side information improves robustness and cold-start behavior
NLP models can recover ratings extremely well from language alone
The architecture generalizes beyond Food.com to many recommendation domains
notebook/ReciSys.ipynb – full implementation and experiments
slides/ReciSys.pptx – presentation walkthrough
report/ReciSys_Report.pdf – written report
watch: video presentation

Dataset Citation

This project uses the Food.com dataset introduced in:

Bodhisattwa Prasad Majumder*, Shuyang Li*, Jianmo Ni, Julian McAuley.
Generating Personalized Recipes from Historical User Preferences.
EMNLP–IJCNLP, 2019.

If you use this dataset, please cite the paper above.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
code		code
notebook		notebook
slides		slides
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ReciSys: Hybrid Recipe Recommendation with Attention & NLP

Dataset Citation

About

Uh oh!

Releases

Packages

Languages

rohanarcot/ReciSys

Folders and files

Latest commit

History

Repository files navigation

ReciSys: Hybrid Recipe Recommendation with Attention & NLP

Dataset Citation

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages