This repository contains the implementation of a multi-lingual voice-controlled drone system built using Microsoft AirSim, Unreal Engine, and Python. The project demonstrates three different approaches to drone control: Rule-based, NLP-based, and Zero-shot classification. It supports a wide range of Indian languages for voice commands, making drone technology more accessible to diverse communities.
## Table of Contents

- Introduction
- Features
- Approaches
- Setup Instructions
- Screenshots
- Demo Videos
- Software and Libraries Used
- Supported Languages
- System Specifications
- Contributing
- License
## Introduction

This project aims to simplify drone control by enabling multi-lingual voice commands. By leveraging the Python SpeechRecognition library (which uses the Google Web Speech API as one of its automatic speech recognition backends) and advanced natural language processing techniques, the drone interprets and executes voice commands in various Indian languages.
This innovation targets users like farmers, disaster managers, and local communities, empowering them with an intuitive and accessible way to control drones.
## Features

- Multi-lingual Voice Support: Commands can be given in many Indian languages supported by the Google Web Speech API.
- Three Control Mechanisms:
- Rule-based for predefined mappings.
- NLP-based for contextual understanding.
- Zero-shot classification for dynamic command recognition.
- Simulation: The project is developed and tested in a simulated environment using Microsoft AirSim in Unreal Engine.
## Approaches

### Rule-based Control
- File: `hello_drone_ruleBased.py`
- Matches commands against predefined synonyms mapped to drone actions.
  - Example: "Fly up", "Ascend", and "Go higher" all map to the same action.
- Simple and effective for predictable, well-defined command sets.
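The mapping can be sketched as a plain dictionary from action names to synonym sets; all identifiers below are illustrative, not taken from the actual script:

```python
from typing import Optional

# Rule-based matching: each drone action owns a set of synonym phrases,
# and a recognized utterance is looked up against them.

SYNONYMS = {
    "ascend": {"fly up", "ascend", "go higher"},
    "descend": {"fly down", "descend", "go lower"},
    "land": {"land", "touch down"},
}

def match_action(utterance: str) -> Optional[str]:
    """Return the action whose synonym set contains the utterance, else None."""
    text = utterance.strip().lower()
    for action, phrases in SYNONYMS.items():
        if text in phrases:
            return action
    return None

print(match_action("Go higher"))  # ascend
```

Lookups are case-insensitive, so the same table handles "LAND" and "land"; anything outside the table simply returns `None`.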
### NLP-based Control
- File: `hello_drone_nlp.py`
- Uses spaCy for natural language processing.
- Parses and interprets commands with semantic understanding.
- Enables more flexible interactions than the rule-based approach.
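As a dependency-free illustration of the parsing idea (the actual script relies on spaCy's tokenization and lemmatization; the keyword table and names below are hypothetical):

```python
import re
from typing import Optional, Tuple

# Sketch of command parsing: extract an action keyword and an optional
# numeric parameter. The real script uses spaCy instead of this regex tokenizer.

ACTION_KEYWORDS = {
    "up": "ascend", "higher": "ascend", "ascend": "ascend",
    "down": "descend", "lower": "descend", "descend": "descend",
    "land": "land", "forward": "move_forward",
}

def parse_command(utterance: str) -> Tuple[Optional[str], Optional[float]]:
    """Extract (action, numeric parameter) from a free-form utterance."""
    tokens = re.findall(r"[a-z]+|\d+(?:\.\d+)?", utterance.lower())
    action = next((ACTION_KEYWORDS[t] for t in tokens if t in ACTION_KEYWORDS), None)
    number = next((float(t) for t in tokens if t[0].isdigit()), None)
    return action, number

print(parse_command("move up by 5 meters"))  # ('ascend', 5.0)
```

Keyword spotting plus a number lets one phrase carry both the action and its magnitude, which is what makes this style more flexible than exact-phrase matching.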
### Zero-shot Classification
- File: `hello_drone_zero.py`
- Uses Facebook's zero-shot classification model (`facebook/bart-large-mnli`).
- Dynamically classifies commands without requiring prior training on specific phrases.
- Highly adaptable to unexpected or novel command sets.
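The Hugging Face zero-shot pipeline for this model returns a dict with `labels` and `scores` sorted by descending score; here is a minimal sketch of turning that output into an accepted action, assuming a hypothetical confidence threshold (the function name and threshold are illustrative, not the actual script's):

```python
# The zero-shot pipeline returns a result shaped like `sample` below,
# with candidate labels sorted by descending score.

def pick_action(result, threshold=0.5):
    """Accept the top-scoring label only if the model is confident enough."""
    if result["scores"][0] >= threshold:
        return result["labels"][0]
    return None

# Shape of the output of:
#   classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
#   result = classifier("please go higher", ["move up", "take off", "move down", "land"])
sample = {
    "sequence": "please go higher",
    "labels": ["move up", "take off", "move down", "land"],
    "scores": [0.82, 0.10, 0.05, 0.03],
}

print(pick_action(sample))  # move up
```

Rejecting low-confidence classifications keeps the drone from acting on utterances the model cannot place among the candidate labels.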
## Setup Instructions

### Prerequisites
- Microsoft AirSim
- Unreal Engine 4.27
- Anaconda
- Visual Studio (not Visual Studio Code)
Clone the AirSim repository:

```
git clone https://github.com/Microsoft/AirSim.git
```

Then follow the instructions in the AirSim Build Documentation.
Create a Python environment using Anaconda and install the required libraries:

```
conda create -n airsim_env python=3.8
conda activate airsim_env
pip install spacy transformers SpeechRecognition
```

- Use the Landscape Mountains project for the 3D simulation environment.
- Ensure a proper connection between Unreal Engine and AirSim using Visual Studio.
Follow these steps to run the drone control scripts:
a. Launch Unreal Engine with AirSim:
- Locate the project in the Unreal Projects folder
- Open `Landscape.sln` in Visual Studio
- Press F5 to start Unreal Engine with AirSim
- Press the Play button to spawn the drone at the Player Start point
b. Prepare Python Environment:
- Open Anaconda Prompt
- Activate the created environment:

```
conda activate airsim_env
```
c. Navigate and Execute Control Scripts:
- Navigate to the project directory:

```
cd AirSim/PythonClient/multirotor
```

- Replace the default `hello_drone.py` (the sample code provided by AirSim) with one of the provided control scripts and execute it:

```
python hello_drone_ruleBased.py
python hello_drone_nlp.py
python hello_drone_zero.py
```

## Demo Videos

Watch the drone in action in our demo videos:
- Demo Video 1: Rule-based Control using Telugu
- Demo Video 2: Hindi language
- Demo Video 3: NLP processing, English
## Software and Libraries Used

- Microsoft AirSim
- Unreal Engine 4.27
- Anaconda
- Visual Studio (NOT Visual Studio Code)
- AirSim Python API
- SpaCy
- SpeechRecognition (Google Speech Recognition API)
- Transformers (`facebook/bart-large-mnli`)
## Supported Languages

The project supports voice commands in many Indian languages. Here are the language codes:

- Hindi (`hi`)
- Bengali (`bn`)
- Telugu (`te`)
- Marathi (`mr`)
- Tamil (`ta`)
- Gujarati (`gu`)
- Kannada (`kn`)
- Malayalam (`ml`)
- Odia (`or`)
- Punjabi (`pa`)
- Assamese (`as`)
- Urdu (`ur`)
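Google's recognizer expects an IETF language tag rather than a bare code; a small helper sketch, assuming the common `-IN` region suffix for these languages (the helper and the `SUPPORTED` set are illustrative, not part of the actual scripts):

```python
# SpeechRecognition's recognize_google() takes a `language` parameter as an
# IETF tag; for the Indian languages listed above, the bare ISO code with an
# "-IN" region suffix (e.g. "hi-IN") is the common choice.

SUPPORTED = {"hi", "bn", "te", "mr", "ta", "gu", "kn", "ml", "or", "pa", "as", "ur"}

def google_language_tag(code: str) -> str:
    """Map a bare language code from the list above to a region-qualified tag."""
    if code not in SUPPORTED:
        raise ValueError(f"unsupported language code: {code}")
    return f"{code}-IN"

print(google_language_tag("te"))  # te-IN
# Usage (sketch): recognizer.recognize_google(audio, language=google_language_tag("te"))
```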
## System Specifications

- Operating System: Windows 11
- Laptop: HP Victus
- Processor: Intel Core i7 (12th Gen)
- GPU: NVIDIA RTX 3050
- RAM: 16 GB
## Contributing

Contributions are welcome! Please fork this repository and submit a pull request with your changes.
## License

This project is licensed under the MIT License. See the LICENSE file for details.
Feel free to reach out if you have any questions or suggestions. Happy flying!


