A Robot That Understands the World and Lives Its Own Life
Sara is an autonomous robot powered by a Large Language Model (LLM) that perceives the world through a camera, interprets its surroundings, and makes independent decisions. Unlike traditional robots that follow predefined commands, Sara sets her own goals and adapts dynamically to new situations, just as a human would.
The features below represent just a few of the use cases I have envisioned. However, since Sara is powered by an advanced language model, she has the potential to carry out hundreds of other tasks without prior training, simply because she understands the world around her.
- 🧠 Self-Guided Goals and Continuous Evolution: Sara independently proposes her own goals, pursues them, and, upon completion, creates new ones. Isn't this what life is about—constantly chasing new objectives?
- 🤖 Decides Whether to Follow Commands: Sara reads handwritten commands and executes them if she wants to. Her personality influences her choices, making her behavior unpredictable and unique.
- 💡 Adapts to the Environment: She turns on the lights autonomously by deducing her needs, even though she lacks a light sensor.
- 🚨 Simulated Emergency Calls: Sara has access to a simulated phone call tool in her arsenal. In an emergency, without prior training, she autonomously searches for emergency numbers and takes on the role of an intelligent emergency system when making the call. Though the call is simulated, Sara believes it to be real and acts accordingly.
- 🚗 Recognizes Context in Autonomous Driving: Unlike traditional computer vision models, Sara understands contextual elements in her environment.
Explore different use cases of Sara in the following LinkedIn posts, each showcasing her intelligence in action:
- Reading Handwritten Commands & Deciding to Execute – 📽 Watch Video
- Turning on Lights by Deduction (No Light Sensor) – 📽 Watch Video
- Setting and Achieving Her Own Goals – 📽 Watch Video
- Facing an Emergency – How Will She React? (Part 1) – 📽 Watch Video
- Applying Asimov’s Laws in a Real-Life Situation (Part 2) – 📽 Watch Video
- How LLMs Could Improve Autonomous Driving – 📽 Watch Video
Sara is built using a modular architecture that integrates:
- Large Language Models (LLMs) for decision-making.
- ESP32 and IoT devices for movement and interaction.
- Computer vision (GPT-4 Vision) for environmental understanding.
- Distance sensors to enhance spatial awareness.
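As a rough illustration of how these modules could fit together, one iteration of a perceive-decide-act cycle might look like the sketch below. All function names are hypothetical and the LLM call, camera capture, and ESP32 dispatch are stubbed out; in the real robot they would go through GPT-4 Vision and Wi-Fi, not these placeholders.

```python
# Hypothetical sketch of one perceive-decide-act iteration.
# Nothing here is taken from Sara's actual codebase.

def perceive():
    """Stand-in for a camera frame interpreted by a vision model."""
    return "a dark room with a light switch on the wall"

def decide(observation, goal):
    """Stand-in for an LLM prompt mapping an observation and a goal to an action."""
    if "dark" in observation:
        return "turn_on_light"
    return "explore"

def act(action):
    """Stand-in for dispatching the chosen action to the ESP32 over Wi-Fi."""
    return f"executed: {action}"

def step(goal="keep the environment comfortable"):
    observation = perceive()
    action = decide(observation, goal)
    return act(action)

print(step())  # -> executed: turn_on_light
```

The point of the loop is that no action is hard-coded to a sensor reading: the decision step is free-form, which is what lets an LLM-driven robot handle situations it was never explicitly programmed for.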
Requirements:

- Python 3.9+
- OpenAI API Key
- ESP32 with Wi-Fi enabled
Installation:

1. Clone this repository:

   ```
   git clone https://github.com/your-repo/Sara-Robot.git
   cd Sara-Robot
   ```

2. Install the required dependencies:

   ```
   pip install -r requirements.txt
   ```

3. Set up your `.env` file with your OpenAI API key:

   ```
   OPENAI_API_KEY=your_api_key_here
   ```

4. Configure the ESP32 for Wi-Fi communication and ensure it is correctly connected to the system.

   Note: The full setup for ESP32 integration depends on the specific hardware used. Ensure that the ESP32 is programmed correctly to communicate with the robot's main system.

5. Run the main script:

   ```
   python main.py
   ```
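For the ESP32 link, one common pattern is to have the microcontroller expose a simple HTTP endpoint on the local network and have the Python side build command URLs for it. The sketch below assumes such an endpoint; the IP address, `/cmd` route, and parameter names are placeholders, not Sara's actual wire protocol.

```python
from typing import Optional
from urllib.parse import urlencode

def build_command_url(host: str, action: str, value: Optional[int] = None) -> str:
    """Build the URL for a hypothetical /cmd endpoint served by the ESP32.

    Host, route, and parameter names are illustrative placeholders.
    """
    params = {"action": action}
    if value is not None:
        params["value"] = value
    return f"http://{host}/cmd?{urlencode(params)}"

# Once the ESP32 is reachable, the command could be sent with
# urllib.request.urlopen(url) or any HTTP client.
print(build_command_url("192.168.1.50", "forward", 100))
# -> http://192.168.1.50/cmd?action=forward&value=100
```

Keeping the command protocol this simple makes the firmware side easy: a single HTTP handler on the ESP32 can parse the query string and drive the motors or relays accordingly.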
Future work:

- Improve decision-making strategies for more complex scenarios.
- Integrate more environmental sensors.
- Expand multimodal interactions, including voice-based responses.
Concept, Software, Hardware & Execution by Diego Diaz Garcia.
Stay updated with Sara’s latest developments on LinkedIn: Diego’s LinkedIn
📢 Sara is more than a project; it’s a vision of how AI can interact with the world autonomously. Do you see the potential in this?
