IKEA Hacks Search Engine

A full-stack search engine built for the "Information Retrieval" course at Università della Svizzera Italiana (USI).

The goal is to discover and explore IKEA hacks from multiple sources, including dedicated websites and Reddit's r/ikeahacks community.

Project Structure

├── frontend/          # React + Vite frontend application
├── backend/           # FastAPI backend with MongoDB
├── crawler/           # Scrapy web crawlers
└── README.md          # This file

Features

  • Full-text search across thousands of IKEA hack projects
  • Smart categorization using LLM-powered tagging
  • Similar hacks discovery using MongoDB Atlas Search (see the query sketch after this list)
  • Category browsing with popular categories
  • Multi-source aggregation from ikeahackers.net, loveproperty.com, tosize.it, and Reddit
  • Responsive UI with dark mode support
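
Both the full-text search and the similar-hacks features sit on MongoDB Atlas Search. Below is a minimal pymongo sketch of what such queries could look like; the database and collection names (ikea_hacks, hacks_all), the index name (default), and the field names (title, description) are illustrative assumptions, not the repository's actual schema.

from pymongo import MongoClient

# Connection string and names below are placeholders, not the project's config
client = MongoClient("mongodb+srv://<user>:<password>@<cluster>.mongodb.net")
hacks = client["ikea_hacks"]["hacks_all"]

# Full-text search: Atlas Search "text" operator over assumed text fields
results = hacks.aggregate([
    {"$search": {
        "index": "default",
        "text": {"query": "billy bookcase", "path": ["title", "description"]},
    }},
    {"$limit": 10},
])

# Similar-hacks discovery: the "moreLikeThis" operator ranks documents that
# resemble the text of a reference document
similar = hacks.aggregate([
    {"$search": {
        "index": "default",
        "moreLikeThis": {"like": {"title": "BILLY bookcase built-in shelving"}},
    }},
    {"$limit": 5},
])

In both cases $search must be the first stage of the aggregation pipeline, which is why limiting happens afterwards.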

Tech Stack

  • Frontend: React, Material-UI, Vite
  • Backend: FastAPI, Python 3.x
  • Database: MongoDB Atlas (with Search indexes)
  • Web Scraping: Scrapy
  • LLM Tagging: Ollama (local LLM inference)

Quick Start

Prerequisites

  • Node.js 18+ and npm
  • Python 3.9+
  • MongoDB Atlas account
  • Ollama (for LLM tagging)

1. Clone the repository

git clone git@github.com:sccjrd/ir-project.git
cd ir-project

2. Set up the Backend

See the Backend README for detailed setup instructions.

cd backend
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt

# Create .env file with your MongoDB credentials
# See backend/README.md for required environment variables

# Start the server
uvicorn app.main:app --reload

Backend runs on http://localhost:8000
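
The backend's .env holds the MongoDB connection settings. A minimal sketch with hypothetical variable names; backend/README.md lists the real ones:

# backend/.env — variable names here are illustrative, not authoritative
MONGODB_URI=mongodb+srv://<user>:<password>@<cluster>.mongodb.net
MONGODB_DB=ikea_hacks

Once uvicorn is up, FastAPI also serves interactive API docs at http://localhost:8000/docs, which is a quick way to verify the setup.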

3. Set up the Frontend

See the Frontend README for detailed setup instructions.

cd frontend
npm install

# Create .env file with backend URL
# See frontend/README.md for required environment variables

# Start the dev server
npm run dev

Frontend runs on http://localhost:5173
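
The frontend's .env only needs the backend URL. With Vite, any variable exposed to client code must be prefixed with VITE_; the exact name used here is an assumption, so check frontend/README.md:

# frontend/.env — hypothetical variable name; see frontend/README.md
VITE_API_URL=http://localhost:8000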

4. Run the Crawler (Optional)

See the Crawler README for detailed crawler documentation.

cd crawler
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Create .env file with MongoDB credentials
# See crawler/README.md for configuration details

# Run all crawlers
python scraper/run_crawlers.py

# Build combined collection
python scraper/build_hacks_all.py
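
For orientation, here is a minimal Scrapy spider in the general shape the project's crawlers take. The spider name, start URL, and CSS selectors are illustrative placeholders, not the repository's actual code:

import scrapy

class IkeaHacksDemoSpider(scrapy.Spider):
    """Illustrative spider: walks a listing page and yields one item per hack."""
    name = "ikeahacks_demo"  # placeholder, not one of the project's spiders
    start_urls = ["https://ikeahackers.net/category/hacks"]

    def parse(self, response):
        # Selectors are placeholders; the real site markup will differ
        for post in response.css("article"):
            yield {
                "title": post.css("h2 a::text").get(),
                "url": post.css("h2 a::attr(href)").get(),
                "source": "ikeahackers.net",
            }
        # Follow pagination until no "next" link remains
        next_page = response.css("a.next::attr(href)").get()
        if next_page:
            yield response.follow(next_page, callback=self.parse)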

5. Tokenize Data with LLM (Optional)

See the Backend README - LLM Tokenization section for Ollama setup.

cd backend
python -m app.tokenization.cli --limit 100
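
LLM tagging of this kind typically means prompting a local Ollama model once per document. A minimal sketch against Ollama's REST API; the model name, prompt, and helper function are assumptions for illustration, not the CLI's actual internals:

import json
import requests

def tag_hack(title: str, description: str) -> list[str]:
    """Ask a local Ollama model for category tags (illustrative helper)."""
    prompt = (
        "Suggest up to 5 short category tags for this IKEA hack.\n"
        f"Title: {title}\nDescription: {description}\n"
        "Answer with a JSON array of strings only."
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",  # Ollama's default local endpoint
        json={"model": "llama3", "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    # The model is asked for JSON; a robust version would validate this
    return json.loads(resp.json()["response"])

print(tag_hack("BILLY built-in", "Turn BILLY bookcases into wall-to-wall shelving"))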

Authors

  • Sacco Francesc Jordi
  • Vavassori Theodor

License

Academic project for the Information Retrieval 2025 course.


For detailed setup and usage instructions, please refer to the individual component READMEs:

  • backend/README.md
  • frontend/README.md
  • crawler/README.md
