Drone Voice Interaction System Based on ERNIE Bot LLM and Multimodal Models


Help me find the coke	Where is my laptop

our project website

If you run into any problems during installation, please file a GitHub Issue or email me.

Project Overview

This project leverages the Baidu ERNIE Bot LLM to control a TELLO drone for object-finding tasks and other operations. The modules are outlined as follows:

Speech Recognition Module: VOSK
Object Detection and Recognition Module: Utilizing the multimodal models RAM and Grounding DINO
Monocular Depth Estimation Module: Based on GLPN
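
To illustrate how the detection and depth modules could work together when looking for an object, here is a minimal sketch. It is not taken from the project's code: it assumes the detector returns a pixel bounding box `(x0, y0, x1, y1)` and the depth model returns a per-pixel depth map, and the function name `estimate_target_distance` is hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple

# Hypothetical box format: (x0, y0, x1, y1) in pixels, x1/y1 exclusive.
Box = Tuple[int, int, int, int]


def estimate_target_distance(depth_map: Sequence[Sequence[float]], box: Box) -> float:
    """Return the median depth of the pixels inside the box."""
    height = len(depth_map)
    width = len(depth_map[0]) if height else 0
    x0, y0, x1, y1 = box
    # Clamp the box so a detection that spills over the frame edge still works.
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    values: List[float] = [
        depth_map[y][x] for y in range(y0, y1) for x in range(x0, x1)
    ]
    if not values:
        raise ValueError("bounding box does not overlap the depth map")
    # The median is less sensitive than the mean to background pixels in the box.
    return median(values)
```

Calling `estimate_target_distance(depth, (1, 1, 3, 3))` would summarize the depth of a 2x2 region; whether the result is in metres depends on how the depth model's output is scaled, which this sketch does not address.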

Create and activate a Conda environment (Python 3.8 or 3.9 is recommended; this project was set up with 3.9):
```
conda create -n tello python=3.9 -y
conda activate tello
```
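
If you want to confirm that the activated environment uses one of the recommended interpreters, a small check like the following may help. This is only a sketch; the helper name `is_recommended_python` is hypothetical and not part of the project.

```python
import sys
from typing import Optional, Tuple

# Versions recommended above; 3.9 is the one the project was set up with.
RECOMMENDED = {(3, 8), (3, 9)}


def is_recommended_python(version: Optional[Tuple[int, ...]] = None) -> bool:
    """Return True if the (major, minor) version is 3.8 or 3.9."""
    if version is None:
        # Default to the interpreter that is running this script.
        version = (sys.version_info.major, sys.version_info.minor)
    return tuple(version[:2]) in RECOMMENDED
```

Running `print(is_recommended_python())` inside the `tello` environment should print `True` if the environment was created with the command above.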

Clone the project:

git clone https://github.com/youngfriday/ernie_tello.git

Deploy and test each model. To work around network issues, we have uploaded the models to the link below:

https://bhpan.buaa.edu.cn/link/AAD4F562ABBE5B4648A81BE2FF50DD18C3

Download and unzip them, then put the 5 folders in your workspace folder.
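
To double-check that the unzipped folders ended up in the right place, a helper along these lines could be used. It is a sketch under assumptions: the function name `missing_folders` is hypothetical, and you must supply the folder names yourself, since only `GroundingDINO` is named in this README.

```python
from pathlib import Path
from typing import Iterable, List


def missing_folders(workspace: str, expected: Iterable[str]) -> List[str]:
    """Return the expected folder names that are not directories in workspace."""
    root = Path(workspace)
    return [name for name in expected if not (root / name).is_dir()]
```

For example, `missing_folders(".", ["GroundingDINO"])` returns an empty list when the `GroundingDINO` folder is present in the current directory; extend the list with the names of the other folders from the archive.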

Alternatively, you can deploy them from scratch if you have a fast connection or a stable download source. Either way, carefully follow each project's official setup instructions:
- RAM
- Grounding DINO (if you want to use CUDA, pay close attention to its instructions!)

  Whichever device you use, you must:
  1. Change the current directory to the GroundingDINO folder.
```
cd GroundingDINO/
```
  2. Install the required dependencies in the current directory.
```
pip install -e .
```
If you downloaded everything provided, you will find ram4test.py, dino4test.py, and depth4test.py.

Make sure all three run successfully, which means every model has been deployed correctly on your computer, then try running main.py.
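
The check described above can be automated with a short script. The following is a sketch, not part of the project: the function name `first_failing_script` is hypothetical, and it assumes the three test scripts sit in the current directory and signal failure with a non-zero exit code.

```python
import subprocess
import sys
from typing import Callable, List, Optional, Sequence

# Script names as listed in this README; their location is an assumption.
TEST_SCRIPTS = ["ram4test.py", "dino4test.py", "depth4test.py"]


def first_failing_script(
    scripts: Sequence[str] = TEST_SCRIPTS,
    run: Optional[Callable[[List[str]], int]] = None,
) -> Optional[str]:
    """Run each script in order; return the first that exits non-zero, else None."""
    if run is None:
        # Default runner: launch the script with the current interpreter.
        run = lambda cmd: subprocess.run(cmd).returncode
    for script in scripts:
        if run([sys.executable, script]) != 0:
            return script
    return None
```

If `first_failing_script()` returns `None`, all three model tests passed and it is reasonable to move on to main.py; otherwise the returned name tells you which model to revisit. The `run` parameter exists only so the loop can be exercised without launching real processes.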

Acknowledgements

Thanks to the following projects for their extraordinary work:

RAM
Grounding DINO
GLPN
GPT for Robotics
【Hackathon 5th】The Charm of Prompts: Flying a Plane with the ERNIE LLM! (提示词的魅力：用文心大模型飞飞机！)

If you can read Chinese, we highly recommend reading the last item, a tutorial series without which we could not have finished this work.
