A machine learning-powered game recommendation system built with Python, Streamlit, and scikit-learn. This application analyzes Steam game data to provide personalized game recommendations based on genres and user ratings.
- Smart Recommendations: Uses TF-IDF vectorization and cosine similarity
- Interactive Web Interface: Built with Streamlit for easy interaction
- Large Dataset: Analyzes over 27,000 Steam games
- Real-time Processing: Cached data loading for optimal performance
- User-friendly: Simple dropdown selection and instant results
Before running this application, ensure you have:
- Python 3.13 or higher (required, as specified in pyproject.toml)
- Git (for cloning the repository)
- Package manager: either pip or uv (recommended)
# Clone the repository
git clone <repository-url>
cd game-rec
# Or download and extract the project files manually

# Install uv if you don't have it
pip install uv
# Install all dependencies
uv sync

# Create a virtual environment (recommended)
python -m venv venv
# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

Ensure you have these files in your project directory:
game-rec/
├── app.py # Main Streamlit application
├── steam.csv # Dataset with ~27,076 Steam games
├── pyproject.toml # Project configuration
├── requirements.txt # Python dependencies
├── uv.lock # Lock file (if using uv)
└── README.md # This file
# Start the Streamlit application
streamlit run app.py

The application will:
- Start a local web server (usually at http://localhost:8501)
- Open your default browser automatically
- Display the Steam Game Recommendation interface
- Select a Game: Choose from a random sample of 10 games in the dropdown
- Get Recommendations: Click the "Rekomendasikan Game Serupa" ("Recommend Similar Games") button
- View Results: See 6 similar games with their details including:
  - Game name and genre
  - Similarity score
  - Positive and negative ratings
  - Rating ratio
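
This README does not define how the rating ratio is computed. A minimal sketch, assuming it is the share of positive ratings among all ratings (the function name `rating_ratio` is hypothetical and may not match app.py):

```python
def rating_ratio(positive: int, negative: int) -> float:
    """Share of positive ratings among all ratings (assumed definition)."""
    total = positive + negative
    # A game with no ratings at all gets 0.0 instead of a division error.
    return positive / total if total else 0.0
```

If the application instead divides positive by negative ratings, only the return expression changes, but the zero-ratings guard is still needed.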
- Python 3.13+: Main programming language
- Streamlit: Web application framework
- Pandas: Data manipulation and analysis
- Scikit-learn: Machine learning algorithms
- NumPy: Numerical computing
- Data Preprocessing:
  - Loads Steam game data from CSV
  - Combines genres, positive ratings, and negative ratings
  - Handles missing data
- Feature Engineering:
  - Creates combined text features from game attributes
  - Prepares data for vectorization
- Machine Learning Pipeline:
  - TF-IDF Vectorization: Converts text features to numerical vectors
  - Cosine Similarity: Calculates similarity between games
  - Ranking: Returns top 6 most similar games
- Caching: Uses Streamlit's @st.cache_data for performance optimization
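
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in app.py: the helper names `build_similarity` and `recommend`, the semicolon-separated genre format, and the exact way the three columns are joined into one text feature are all assumed. The tiny in-memory table stands in for steam.csv.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Tiny stand-in for steam.csv (hypothetical rows, same column names).
games = pd.DataFrame({
    "name": ["Alpha", "Beta", "Gamma", "Delta"],
    "genres": ["Action;Adventure", "Action;Adventure", "Strategy;Simulation", "Action"],
    "positive_ratings": [100, 200, 50, 10],
    "negative_ratings": [10, 20, 5, 1],
})

def build_similarity(df):
    """Combine the three columns into one text feature and compare all games."""
    # Assumed combination: genre words followed by the two rating counts.
    text = (
        df["genres"].fillna("").str.replace(";", " ", regex=False)
        + " " + df["positive_ratings"].astype(str)
        + " " + df["negative_ratings"].astype(str)
    )
    tfidf = TfidfVectorizer().fit_transform(text)  # one sparse row per game
    return cosine_similarity(tfidf)                # dense (n_games, n_games) array

def recommend(df, similarity, title, top_n=6):
    """Return up to top_n games most similar to `title`, best match first."""
    idx = df.index[df["name"] == title][0]
    # Drop the query game itself so it is never recommended.
    scores = pd.Series(similarity[idx], index=df.index).drop(idx)
    top = scores.sort_values(ascending=False).head(top_n)
    return df.loc[top.index].assign(similarity=top.values)

print(recommend(games, build_similarity(games), "Alpha"))
```

In a Streamlit app, a function like `build_similarity` is the natural candidate for `@st.cache_data`, since the matrix only needs to be computed once per dataset.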
The steam.csv dataset contains:
- 27,076 Steam games with comprehensive metadata
- Key features: name, genres, positive_ratings, negative_ratings
- Additional data: release dates, developers, publishers, platforms, etc.
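
Before running the app against a different copy of the dataset, it can help to confirm that the four key columns are present. A minimal sketch using only the standard library; the helper name `missing_columns` is hypothetical, and the in-memory sample stands in for the real file:

```python
import csv
import io

REQUIRED_COLUMNS = {"name", "genres", "positive_ratings", "negative_ratings"}

def missing_columns(csv_file) -> set[str]:
    """Return the required columns that are absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])  # empty file -> empty header
    return REQUIRED_COLUMNS - set(header)

# In-memory sample; for the real file, pass
# open("steam.csv", newline="", encoding="utf-8") instead.
sample = io.StringIO("name,genres,positive_ratings\nExample Game,Action,10\n")
print(missing_columns(sample))
```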
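File steam.csv tidak ditemukan! Pastikan file berada di direktori yang sama dengan aplikasi ini. (English: "File steam.csv not found! Make sure the file is in the same directory as this application.")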
Solution: Ensure steam.csv is in the same directory as app.py
# Check your Python version
python --version
# Should be 3.13 or higher

Solution: Install Python 3.13+ from python.org
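
The same check can be done from inside Python, for example at the top of a setup script. A small sketch; the helper name `meets_minimum` is hypothetical and is not part of app.py:

```python
import sys

MINIMUM = (3, 13)  # minimum version stated in this README

def meets_minimum(version=sys.version_info, minimum=MINIMUM) -> bool:
    """True if the (major, minor, ...) version is at least `minimum`."""
    return tuple(version[:2]) >= minimum

if not meets_minimum():
    found = sys.version.split()[0]
    print(f"Python {MINIMUM[0]}.{MINIMUM[1]}+ required, found {found}")
```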
# Update pip first
pip install --upgrade pip
# Try installing with verbose output
pip install -r requirements.txt -v

Solution: Use a virtual environment to avoid conflicts
# If port 8501 is busy, Streamlit will use the next available port
# Check terminal output for the correct URL
Solution: Check the terminal output for the correct localhost URL
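
To see whether the default port is already taken before launching, a quick probe with the standard library is enough. A minimal sketch; `port_in_use` is a hypothetical helper, not something Streamlit provides:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0

print("8501 busy:", port_in_use(8501))
```

If the port is busy, a specific one can also be requested with `streamlit run app.py --server.port 8502`.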
Solution: The application uses caching to limit repeated work, but make sure at least 2 GB of RAM is available
- First Run: May be slower due to initial data loading and model creation
- Subsequent Runs: Faster due to Streamlit caching
- Memory Usage: Efficiently handles 27K+ games
- Response Time: Recommendations appear within seconds
For production deployment, consider these enhancements:
- Database Integration: Replace CSV with PostgreSQL/MongoDB
- Redis Caching: Implement Redis for better caching
- API Integration: Connect to Steam API for real-time data
- Input Validation: Add comprehensive input sanitization
- Rate Limiting: Implement request throttling
- Authentication: Add user authentication system
- Error Tracking: Integrate Sentry or similar
- Performance Monitoring: Add application metrics
- Logging: Implement structured logging
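
As one illustration of the items above, structured logging can be prototyped with the standard library alone. This is a sketch under assumptions, not part of the current application: the logger name `game_rec`, the `JsonFormatter` class, and the `context` field are all hypothetical.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Optional context passed as extra={"context": {...}} is merged in.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)

logger = logging.getLogger("game_rec")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("recommendation served", extra={"context": {"game": "Example Game", "results": 6}})
```

One JSON object per line keeps the output easy to ship to a log aggregator or an error tracker later.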
# Example Dockerfile
FROM python:3.13-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8501
CMD ["streamlit", "run", "app.py"]

- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is open source and available under the MIT License.
- Steam for providing the game data
- Streamlit team for the excellent web framework
- Scikit-learn community for the machine learning tools
If you encounter any issues or have questions:
- Check the troubleshooting section above
- Search existing issues in the repository
- Create a new issue with detailed information about your problem
Happy Gaming! 🎮