JARVIS is a voice-controlled AI assistant that controls your laptop: it opens applications, plays specific music, and interacts with a range of programs.
- Open any installed application on your system
- Close applications and windows
- Minimize, maximize, and restore windows
- Switch between windows and applications
- Perform app-specific actions and searches
- Send WhatsApp messages to contacts
- Send SMS messages
- Compose emails
- Make voice and video calls
- Manage contacts (add, find, delete, list)
- Add events to your calendar
- View upcoming events
- Check schedule for specific dates
- Delete or modify events
- Play specific songs by name and artist
- Control music playback (play, pause, next, previous)
- Adjust volume
- Search for music on Spotify or YouTube Music
- Open websites
- Search the web (Google, YouTube)
- Control browser tabs (new tab, close tab, switch tabs)
- Navigate to specific URLs
- Refresh pages
- Search within specific websites and services
- Create new documents in Microsoft Office applications
- Save documents
- Format text (bold, italic, underline)
- Copy, cut, and paste content
- Select all content
- Undo and redo actions
- Shut down, restart, or sleep your computer
- Take screenshots
- Get system information
- List installed applications
- Type text in any application
- Press special keys (Enter, Tab, Escape, arrows)
- Send keyboard shortcuts
- Create folders
- Delete files
- List files in directories
- Voice recognition and text-to-speech capabilities
- Natural language understanding
- Contextual responses using local LLM (when available)
- Python 3.8 or higher
- Ollama (for running the local LLM)
- A microphone for voice input
- Speakers for voice output
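To confirm the first two prerequisites before installing, a small check like the one below may help. It is a sketch, not part of this repository: the helper name `check_prerequisites` is hypothetical, and it assumes the `ollama` command-line tool is on your PATH once Ollama is installed.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the requirements above


def check_prerequisites():
    """Return a dict describing which prerequisites appear to be met."""
    return {
        # Compare the running interpreter against the required minimum.
        "python": sys.version_info[:2] >= MIN_PYTHON,
        # shutil.which returns None when no `ollama` executable is on PATH.
        "ollama": shutil.which("ollama") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A missing `ollama` entry is not fatal here, since the setup script offers to help install it.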
- Clone this repository:
  git clone https://github.com/yourusername/jarvis.git
  cd jarvis
- Run the setup script to install dependencies and set up Ollama:
  python setup_ollama.py
  This script will:
- Install Python dependencies
- Help you install Ollama if it's not already installed
- Start the Ollama server
- Download the default language model (llama3)
- Start JARVIS:
  python main.py
- Speak to JARVIS using natural language commands:
- "Open Chrome"
- "Open Microsoft Word"
- "Open Spotify"
- "Open Calculator"
- "Open File Explorer"
- "Send message to John on WhatsApp saying I'll be late"
- "Send SMS to Mom saying I'll call later"
- "Send email to boss@example.com subject Meeting report body Here's the report"
- "Call Sarah on WhatsApp"
- "Video call Mike on Teams"
- "Add contact named John Smith with phone 555-1234 email john@example.com"
- "Find contact named Sarah"
- "List contacts"
- "Add event called Team Meeting on tomorrow at 3pm"
- "What's on today?"
- "Show my upcoming events"
- "Show events for next Monday"
- "Delete event Team Meeting on tomorrow"
- "Search for vacation photos in Google Photos"
- "Search for budget spreadsheet in Google Drive"
- "Search for Python tutorials in YouTube"
- "Search for Taylor Swift in Spotify"
- "Open presentation in PowerPoint"
- "Create new document in Word"
- "Play Shape of You by Ed Sheeran"
- "Play some music"
- "Play the next song"
- "Pause the music"
- "Volume up"
- "Volume down"
- "Open YouTube"
- "Search for Python tutorials"
- "Open a new tab"
- "Close this tab"
- "Go to github.com"
- "Refresh the page"
- "Close this window"
- "Minimize window"
- "Maximize window"
- "Switch window"
- "What window is this?"
- "Type Hello, how are you?"
- "Press Enter"
- "Press Tab"
- "Press Escape"
- "Copy"
- "Paste"
- "Select all"
- "Undo"
- "Redo"
- "Shutdown computer"
- "Restart computer"
- "Sleep computer"
- "Take a screenshot"
- "List installed apps"
- To exit JARVIS, say "Exit" or "Quit".
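Several of the example phrases above follow a fixed shape, such as "Play <song> by <artist>". The sketch below shows one way such phrases could be split into an intent and its parameters using regular expressions. It is an illustration only: the pattern table and `parse_command` are hypothetical names, and the real parser in JARVIS may work differently.

```python
import re

# Hypothetical patterns modelled on the example phrases above.
# Order matters: more specific patterns come first.
PATTERNS = [
    (
        "send_whatsapp",
        re.compile(
            r"^send message to (?P<contact>.+?) on whatsapp saying (?P<text>.+)$",
            re.IGNORECASE,
        ),
    ),
    ("play_song", re.compile(r"^play (?P<song>.+?) by (?P<artist>.+)$", re.IGNORECASE)),
    ("open_app", re.compile(r"^open (?P<app>.+)$", re.IGNORECASE)),
    ("exit", re.compile(r"^(?:exit|quit)$", re.IGNORECASE)),
]


def parse_command(utterance):
    """Return (intent, slots) for the first matching pattern, else (None, {})."""
    text = utterance.strip()
    for intent, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            # groupdict() holds the named groups, e.g. song and artist.
            return intent, match.groupdict()
    return None, {}
```

With these patterns, `parse_command("Play Shape of You by Ed Sheeran")` returns `("play_song", {"song": "Shape of You", "artist": "Ed Sheeran"})`, while a phrase that matches nothing, such as "Play some music", returns `(None, {})` and would need its own rule.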
JARVIS can use a local large language model (LLM) through Ollama to provide more intelligent responses. To enable this feature:
- Install Ollama from https://ollama.com/download
- Start the Ollama application
- Pull a model:
  ollama pull llama3
  (or another model of your choice)
- JARVIS will automatically use the local LLM when available
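To see whether the local LLM is "available" in the sense above, you can ask the Ollama server which models it has pulled. The sketch below assumes Ollama is listening on its default address (`http://localhost:11434`) and uses its `/api/tags` endpoint; the function name `list_local_models` is hypothetical and is not how JARVIS itself performs the check.

```python
import json
import urllib.error
import urllib.request

# Ollama's HTTP API listens on port 11434 by default.
OLLAMA_URL = "http://localhost:11434"


def list_local_models(base_url=OLLAMA_URL, timeout=2.0):
    """Return names of locally pulled models, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or an unexpected (non-JSON) reply.
        return None
    return [model["name"] for model in data.get("models", [])]


if __name__ == "__main__":
    models = list_local_models()
    if models is None:
        print("Ollama is not reachable; start the Ollama application first.")
    else:
        print("Pulled models:", ", ".join(models) or "(none)")
```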
By default, JARVIS uses the "jarvis" model (or falls back to "llama3"). To use a different model:
- Pull the model using Ollama:
  ollama pull mistral
- Edit the core/local_brain.py file and change the default model name in the ask_local_llm function:
  def ask_local_llm(query, model_name="mistral"):
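For orientation, here is a rough sketch of how a function with that signature could talk to Ollama, and how the "jarvis, else llama3" fallback might be chosen. It assumes Ollama's `/api/generate` endpoint with streaming turned off; `pick_model` is a hypothetical helper, and the body of `ask_local_llm` shown here is an illustration, not the code in core/local_brain.py.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address


def pick_model(available, preferred="jarvis", fallback="llama3"):
    """Choose the preferred model if it has been pulled, else the fallback."""
    # Ollama reports names with a tag suffix such as "llama3:latest".
    base_names = {name.split(":", 1)[0] for name in available}
    return preferred if preferred in base_names else fallback


def ask_local_llm(query, model_name="mistral"):
    """Send one prompt to Ollama and return the generated text."""
    payload = json.dumps(
        {"model": model_name, "prompt": query, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=payload,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as resp:
        # With "stream": False the reply is a single JSON object.
        return json.load(resp)["response"]
```

Changing the `model_name` default, as described above, only affects calls that do not pass a model explicitly.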
JARVIS can automate interactions with various applications using PyAutoGUI. This allows it to:
- Click on specific positions or images on the screen
- Send keyboard shortcuts to applications
- Interact with specific applications like browsers, Office apps, and media players
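The PyAutoGUI calls behind actions like these are short. The sketch below maps a few spoken phrases to key combinations and sends them to the focused window. The mapping and function names are hypothetical, the key bindings assume Windows or Linux (macOS would use "command" instead of "ctrl"), and `pyautogui` is imported lazily because it needs a desktop session to load.

```python
# Hypothetical mapping from spoken phrases to key combinations.
SHORTCUTS = {
    "open a new tab": ("ctrl", "t"),
    "close this tab": ("ctrl", "w"),
    "select all": ("ctrl", "a"),
    "copy": ("ctrl", "c"),
    "paste": ("ctrl", "v"),
    "undo": ("ctrl", "z"),
}


def shortcut_for(phrase):
    """Look up the key combination for a phrase, or None if unknown."""
    return SHORTCUTS.get(phrase.strip().lower())


def send_shortcut(phrase):
    """Press the mapped key combination; return False for unknown phrases."""
    keys = shortcut_for(phrase)
    if keys is None:
        return False
    import pyautogui  # lazy import: requires a graphical session

    pyautogui.hotkey(*keys)
    return True


def type_and_submit(text):
    """Type text into the focused window, then press Enter."""
    import pyautogui

    pyautogui.write(text, interval=0.05)  # small delay between keystrokes
    pyautogui.press("enter")


def click_at(x, y):
    """Click a specific screen position given in pixels."""
    import pyautogui

    pyautogui.click(x, y)
```

Because the keystrokes go to whichever window has focus, the target application should be in the foreground before any of these are called.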
To add new commands, edit the core/commands.py file and add your command to the run_command function.
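As a rough picture of what adding a command involves, the sketch below shows a dispatcher that checks the spoken text for a trigger phrase and calls a handler. The structure is an assumption: the real `run_command` in core/commands.py may be organized differently, and `tell_time` is a made-up example command.

```python
import datetime


def tell_time():
    """Example of a new command: report the current time."""
    return datetime.datetime.now().strftime("It is %H:%M")


def run_command(command):
    """Hypothetical dispatcher: match the spoken text and run a handler."""
    text = command.lower().strip()

    # Existing branches would live here, e.g. "open ...", "play ...".

    # New command: add a check for its trigger phrase and call the handler.
    if "what time is it" in text:
        return tell_time()

    return None  # the command was not recognized
```

Keeping each handler in its own small function makes a new command easy to test without speaking to the assistant.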
JARVIS automatically detects installed applications on your system. You can customize the app detection by:
- Adding common paths to the get_common_program_paths function in core/app_finder.py
- Enabling full system scanning by uncommenting the line in the get_installed_apps_list function
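The sketch below illustrates the idea of scanning a few common locations for executables. It assumes a Windows layout and is not the code in core/app_finder.py: the function bodies, the `D:\PortableApps` entry, and the `find_executables` helper are all hypothetical.

```python
import os


def get_common_program_paths():
    """Hypothetical version: directories to scan for applications on Windows."""
    candidates = [
        os.environ.get("ProgramFiles"),
        os.environ.get("ProgramFiles(x86)"),
        # A custom location would be added here, for example:
        r"D:\PortableApps",
    ]
    # Keep only entries that are set and exist on this machine.
    return [path for path in candidates if path and os.path.isdir(path)]


def find_executables(root, max_depth=2):
    """Map app name -> path for .exe files at most max_depth levels below root."""
    found = {}
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            dirnames[:] = []  # do not descend any further
        for name in filenames:
            if name.lower().endswith(".exe"):
                app = os.path.splitext(name)[0].lower()
                # Keep the first match if the same name appears twice.
                found.setdefault(app, os.path.join(dirpath, name))
    return found
```

Limiting the depth keeps the scan fast; a full system scan finds more applications but takes noticeably longer.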
- Make sure your microphone is properly connected and set as the default input device
- Speak clearly and at a moderate pace
- Try to minimize background noise
- Some applications may require administrator privileges to be controlled
- Certain applications may have unique interfaces that JARVIS is not specifically programmed to handle
- For best results, make sure applications are in their default window state
- If Ollama is not installed or running, JARVIS will fall back to using predefined responses
- Make sure you have enough disk space and RAM to run the local LLM
- Larger models provide better responses but require more resources
You can test if Ollama is working correctly by running:
python test_ollama.py
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for the GPT models that helped create this project
- The developers of all the libraries used in this project
- The open-source community for their continuous support and inspiration