- I don't click, I command.
This is a voice-activated AI assistant that listens to your commands, understands them with GPT, executes them using Playwright, and responds back to you, like a real assistant should. It's not Alexa. It's not Siri. It's yours. Built with open tools. Runs locally. Works on your voice.
~🔎 Wake word detection via Porcupine
~🗣 Speech-to-text using OpenAI Whisper
~❝ ❞ Command parsing via GPT (OpenAI API)
~🦾 Automation using Playwright (open browser, search, scrape, etc.)
~📖 Text-to-speech output via Edge TTS
~📏 Modular structure: easy to plug in new commands and functions
~🌐 Local-only design - no internet dependency except GPT (parsing)
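
The modular structure from the feature list could be sketched as a small command registry. This is only an illustration, not the project's actual code: `COMMANDS`, `command`, `dispatch`, and the `{"action": ..., "args": ...}` shape of the parsed command are all assumed names, and the handlers return strings where real ones would drive Playwright.

```python
from typing import Callable, Dict

# Hypothetical registry: maps an action name to the function that handles it.
COMMANDS: Dict[str, Callable[..., str]] = {}


def command(name: str):
    """Decorator that registers a handler under the given action name."""
    def register(func: Callable[..., str]) -> Callable[..., str]:
        COMMANDS[name] = func
        return func
    return register


@command("open_browser")
def open_browser(url: str = "https://example.com") -> str:
    # A real handler would launch a browser through Playwright here.
    return f"Opening {url}"


@command("search")
def search(query: str) -> str:
    # A real handler would run the search in the browser and read the results.
    return f"Searching for {query}"


def dispatch(parsed: dict) -> str:
    """Look up the handler for a parsed command and call it with its args."""
    handler = COMMANDS.get(parsed.get("action"))
    if handler is None:
        return "Sorry, I don't know that command."
    return handler(**parsed.get("args", {}))
```

With this layout, adding a command means writing one decorated function; `dispatch({"action": "search", "args": {"query": "cats"}})` then routes to it without any change to the rest of the assistant.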
- Wake word - Porcupine (custom keyword model)
- Speech recognition - OpenAI Whisper (local model, "small" or "medium")
- Natural language parsing - OpenAI GPT API
- Automation - Playwright
- Text-to-speech - Edge TTS
- Runtime - Python 3.10+
|-main.py --> entry point, connects the modules
|-wakeword.py --> Porcupine wake word detection
|-listener.py --> Whisper transcription + GPT parser
|-automate.py --> Playwright automation
|-requirements.txt --> all the libraries required to run the program
- The wake word is detected by the custom Porcupine model
- Your voice is recorded and passed to Whisper
- Whisper transcribes the audio to text
- GPT parses the text and determines the action
- Playwright runs the automation task
- The result is converted to speech by Edge TTS
- Assistant responds - like a friend who codes.
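
The flow above can be sketched as one loop iteration with each stage injected as a callable. This is a minimal sketch under stated assumptions: `Pipeline` and all of its field names are hypothetical, and the demo wiring uses stubs in place of the real Porcupine, Whisper, GPT, Playwright, and Edge TTS integrations, so it runs without a microphone or network.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Pipeline:
    """One pass through the assistant; every stage is a placeholder callable."""
    wait_for_wake_word: Callable[[], None]   # stands in for Porcupine
    record_audio: Callable[[], bytes]        # stands in for microphone capture
    transcribe: Callable[[bytes], str]       # stands in for Whisper
    parse_command: Callable[[str], dict]     # stands in for the GPT call
    run_action: Callable[[dict], str]        # stands in for Playwright
    speak: Callable[[str], None]             # stands in for Edge TTS

    def run_once(self) -> str:
        self.wait_for_wake_word()            # block until the wake word is heard
        audio = self.record_audio()          # capture the spoken command
        text = self.transcribe(audio)        # speech to text
        parsed = self.parse_command(text)    # text to structured action
        result = self.run_action(parsed)     # perform the automation
        self.speak(result)                   # read the result back
        return result


# Stub wiring for illustration only.
spoken: List[str] = []
demo = Pipeline(
    wait_for_wake_word=lambda: None,
    record_audio=lambda: b"fake-audio",
    transcribe=lambda audio: "search for python tutorials",
    parse_command=lambda text: {
        "action": "search",
        "query": text.removeprefix("search for "),
    },
    run_action=lambda cmd: f"Searched for {cmd['query']}",
    speak=spoken.append,
)
```

Calling `demo.run_once()` walks the six stages in order. Keeping the stages as injected callables is one way to let a module such as the parser be swapped later without touching the loop.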
- Currently optimized for a single-microphone setup
- Mic detection and quality vary across systems
- The GPT API call requires an internet connection (for command parsing)
- No GUI yet - purely CLI and audio-driven. I focused on building intelligence, not appearance, for now
- Add a local NLP fallback to replace GPT
- Add a GUI with Tkinter or a React dashboard (post MERN)
- Add multi-mic/device compatibility
- Add plugin-style commands (e.g., emails, music)
- Dockerize the assistant for deployment
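
As a rough idea of what the planned local NLP fallback might look like, here is a rule-based sketch that is tried when the remote parser is missing or fails. It is an assumption-heavy illustration, not a design decision: `RULES`, `parse_locally`, `parse`, and the `remote_parser` argument are hypothetical names, and plain regular expressions stand in for whatever local NLP is eventually chosen.

```python
import re
from typing import Callable, Optional

# Hypothetical patterns: each maps a spoken phrase shape to an action name.
RULES = [
    (re.compile(r"^(?:search|look up)(?: for)? (?P<query>.+)$", re.I), "search"),
    (re.compile(r"^open (?P<target>.+)$", re.I), "open"),
]


def parse_locally(text: str) -> Optional[dict]:
    """Match the transcript against the local rules; None if nothing fits."""
    cleaned = text.strip().rstrip(".!?")
    for pattern, action in RULES:
        match = pattern.match(cleaned)
        if match:
            return {"action": action, "args": match.groupdict()}
    return None


def parse(text: str, remote_parser: Optional[Callable[[str], dict]] = None) -> dict:
    """Prefer the remote parser (e.g. a GPT call); fall back to local rules."""
    if remote_parser is not None:
        try:
            return remote_parser(text)
        except Exception:
            pass  # e.g. no network; fall through to the local rules
    return parse_locally(text) or {"action": "unknown", "args": {}}
```

A fallback like this only covers phrasings someone wrote a rule for, so it would complement the GPT parser when offline and would not match its flexibility.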
--> I built this project not as a clone of something else, but as a milestone of my capability.
--> I wanted to see if I could connect multiple systems - speech, logic, automation, and voice - into a single, usable assistant.
--> AND YES I DID IT 💪🏻
Personal use. Modify. Improve. Break it. Learn from it. This project was made to learn, not to sell.
1. OpenAI Whisper & GPT
2. Porcupine (by Picovoice)
3. Microsoft Edge TTS
4. Playwright
5. Priyanshu