A hands-free controller (currently for Spotify only) that uses hand gestures and face tracking to manage your music while you work. The system automatically pauses music when you're distracted and resumes when you're focused, helping you maintain deep work sessions.
Gesture controls:
- ☝️ One Finger: Play music
- ✌️ Two Fingers (hold 2s): Next track
- 🤟 Three Fingers: Pause music
- 🖖 Four Fingers (hold 2s): Previous track
- 🤏 Pinch & Hold: Drag to adjust volume (vertical control)
Focus detection:
- Auto-Pause: Automatically pauses music when you look away or leave the screen
- Smart Resume: Waits 5 seconds for you to settle back before resuming (prevents false triggers)
- Grace Period: 1.5-second buffer before pausing (ignores quick glances away)
- Task Tracking: Set your focus task and track session time
Visual feedback:
- Real-time gesture feedback
- Live volume control slider
- Progress indicators for held gestures
- Session timer
- Focus status display
Requirements:
- macOS (uses AppleScript for Spotify control)
- Python 3.7+
- Spotify desktop app installed and running
- Webcam
Installation:
- Clone the repository
git clone https://github.com/AadiPandey/flow_state.git
cd flow_state
- Run the setup script
chmod +x setup.sh
./setup.sh
The script will automatically:
- Create a virtual environment
- Install all Python dependencies
- Download MediaPipe model files
Manual installation steps:
- Clone the repository
git clone https://github.com/AadiPandey/flow_state.git
cd flow_state
- Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate
- Install dependencies
pip install opencv-python mediapipe numpy
- Download MediaPipe models
Download these two files and place them in the project root:
# Quick download commands
curl -o hand_landmarker.task https://storage.googleapis.com/mediapipe-models/hand_landmarker/hand_landmarker/float16/1/hand_landmarker.task
curl -o face_landmarker.task https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/1/face_landmarker.task

To start the app with the run script:
./run.sh
Or start it manually:
- Start Spotify on your Mac
- Activate virtual environment (if not already active)
source venv/bin/activate
- Run the controller
python main.py
- Set your task (optional) when prompted
- Position yourself in front of the webcam
- Use gestures to control playback
- Press 'q' to quit
Adjust settings in modules/config.py:
# Timing
COOLDOWN_ACTION = 1.5 # Seconds between gesture commands
HOLD_DURATION = 2.0 # Seconds to hold for next/prev track
SETTLING_TIME = 5.0 # Seconds to wait before auto-resume
GRACE_PERIOD = 1.5 # Seconds before auto-pause triggers
# Sensitivity
YAW_THRESHOLD = 0.35 # How far you can look away (lower = stricter)
PINCH_THRESHOLD = 0.05 # Pinch detection sensitivity

spotify_controller/
├── main.py # Main application entry point
├── hand_landmarker.task # MediaPipe hand detection model
├── face_landmarker.task # MediaPipe face detection model
├── modules/
│ ├── config.py # Configuration and constants
│ ├── spotify.py # Spotify control via AppleScript
│ ├── state.py # Application state management
│ ├── ui.py # Visual interface rendering
│ └── vision.py # Hand and face detection logic
└── README.md
- Vision Layer: Uses MediaPipe to detect hand gestures and face orientation in real-time
- Logic Layer: Processes gestures and focus state to determine actions
- Control Layer: Sends commands to Spotify via AppleScript
- UI Layer: Renders visual feedback with OpenCV
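As a rough illustration of what the Logic Layer might do, the sketch below turns per-frame finger counts into discrete actions using the hold and cooldown values from modules/config.py. The `GestureLogic` class, the `GESTURES` table, and the action names are hypothetical and are not taken from the project's code; only the two timing constants and the finger-to-action mapping come from this README.

```python
from typing import Optional

# Values copied from the configuration shown in this README.
COOLDOWN_ACTION = 1.5  # seconds between gesture commands
HOLD_DURATION = 2.0    # seconds a hold gesture must be sustained

# Hypothetical table: finger count -> (action name, requires hold).
GESTURES = {
    1: ("play", False),
    2: ("next", True),
    3: ("pause", False),
    4: ("previous", True),
}


class GestureLogic:
    """Assumed helper: converts a stream of finger counts into actions."""

    def __init__(self) -> None:
        self.current: Optional[int] = None   # finger count in the previous frame
        self.since = 0.0                     # time the current gesture first appeared
        self.last_action = float("-inf")     # time the last action fired

    def update(self, fingers: Optional[int], now: float) -> Optional[str]:
        # A changed (or lost) gesture restarts the hold timer.
        if fingers != self.current:
            self.current = fingers
            self.since = now
        if fingers not in GESTURES:
            return None
        action, needs_hold = GESTURES[fingers]
        # Hold gestures only fire once they have been sustained long enough.
        if needs_hold and now - self.since < HOLD_DURATION:
            return None
        # The cooldown stops one gesture from firing on every frame.
        if now - self.last_action < COOLDOWN_ACTION:
            return None
        self.last_action = now
        return action
```

Passing the timestamp in explicitly (instead of calling `time.time()` inside `update`) keeps the sketch easy to test. Note that in this version a gesture that stays up keeps firing once per cooldown interval; the real application may behave differently.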
The focus detection uses head yaw (left/right rotation) to determine if you're looking at the screen. When you look away for more than 1.5 seconds, music auto-pauses. Upon returning, the system waits 5 seconds for you to settle before resuming.
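The grace period and settling time described above can be sketched as a small state machine. This is a minimal sketch under stated assumptions, not the project's actual code: `FocusTracker`, `is_looking`, and the `"pause"`/`"resume"` return values are hypothetical names, and comparing the absolute yaw against `YAW_THRESHOLD` is an assumed interpretation of that setting. The three constants come from modules/config.py as shown in this README.

```python
from typing import Optional

# Values copied from the configuration shown in this README.
GRACE_PERIOD = 1.5    # seconds away before auto-pause triggers
SETTLING_TIME = 5.0   # seconds back before auto-resume triggers
YAW_THRESHOLD = 0.35  # how far you can look away


def is_looking(yaw: float) -> bool:
    """Assumption: the user is 'focused' while |yaw| stays within the threshold."""
    return abs(yaw) <= YAW_THRESHOLD


class FocusTracker:
    """Assumed helper: decides when to auto-pause and auto-resume."""

    def __init__(self) -> None:
        self.away_since: Optional[float] = None  # when the user started looking away
        self.back_since: Optional[float] = None  # when the user came back after a pause
        self.auto_paused = False                 # True only if this tracker paused playback

    def update(self, looking: bool, now: float) -> Optional[str]:
        if looking:
            self.away_since = None
            if self.auto_paused:
                if self.back_since is None:
                    self.back_since = now  # start the settling timer
                elif now - self.back_since >= SETTLING_TIME:
                    self.auto_paused = False
                    self.back_since = None
                    return "resume"
            return None
        # Looking away: any settling progress is discarded.
        self.back_since = None
        if self.away_since is None:
            self.away_since = now
        if not self.auto_paused and now - self.away_since >= GRACE_PERIOD:
            self.auto_paused = True
            return "pause"
        return None
```

Tracking `auto_paused` separately means the sketch only resumes music that it paused itself, so a manual pause via gesture would not be overridden.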
Camera not working:
- Check System Preferences → Security & Privacy → Camera
- Make sure Terminal/iTerm has camera access
Spotify not responding:
- Ensure Spotify desktop app is running
- Check that Spotify is not in a restricted state
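To check the first point from a script, something like the following may help. It is a sketch, not the contents of modules/spotify.py: the function names are hypothetical, and it assumes the standard macOS `osascript` command-line tool together with Spotify's AppleScript commands such as `playpause` and `next track`.

```python
import subprocess
from typing import List


def osascript_args(script: str) -> List[str]:
    """Build the argument list for running one line of AppleScript."""
    return ["osascript", "-e", script]


def spotify_command(command: str) -> List[str]:
    """Build the osascript call for a Spotify command, e.g. 'next track'."""
    return osascript_args(f'tell application "Spotify" to {command}')


def spotify_is_running() -> bool:
    """Return True if macOS reports that the Spotify app is running."""
    try:
        result = subprocess.run(
            osascript_args('application "Spotify" is running'),
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # osascript is missing, so this is probably not macOS.
        return False
    return result.stdout.strip() == "true"
```

If `spotify_is_running()` returns `True` but gestures still have no effect, running `subprocess.run(spotify_command("playpause"))` by hand shows whether AppleScript control works at all.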
Gestures not detected:
- Ensure good lighting
- Keep hand within camera frame
- Try adjusting CONFIDENCE in modules/config.py
Models not loading:
- Verify .task files are in the root directory
- Check file names match exactly: hand_landmarker.task and face_landmarker.task
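A quick way to run both checks is a short script like the one below. It is only a sketch: `missing_models` is a hypothetical helper, not part of the project, and it assumes you run it from the project root.

```python
from pathlib import Path
from typing import List

# The two model files this README says must sit in the project root.
MODEL_FILES = ("hand_landmarker.task", "face_landmarker.task")


def missing_models(root: Path) -> List[str]:
    """Return the names of model files that are not present in root."""
    return [name for name in MODEL_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_models(Path.cwd())
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("Both model files found.")
```

Because the check compares exact file names, it also catches downloads saved under a slightly different name.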
Feel free to open issues or submit pull requests with improvements!
MIT License - feel free to use this project however you'd like.
- Built with MediaPipe for gesture/face detection
- Uses OpenCV for video processing
- macOS Spotify control via AppleScript
Made with ❤️ to stop myself from getting distracted; ended up getting distracted by this for a whole day.