Minor Typo in Figure Reference

### **Minor Typo in Figure Reference**

The paper "VITA: Towards Open-Source Interactive Omni Multimodal LLM" contains a minor typo in a figure reference: Section 3.4.2 cites Fig. 1 where Fig. 2 is meant. This issue proposes correcting the reference.

**Paper:** *VITA: Towards Open-Source Interactive Omni Multimodal LLM* 
**ArXiv Version:** `arXiv:2408.05211v3` (30 May 2025) 

---

### **Issue Details**

**Section:** 3.4.2 Audio Interrupt Interaction 

**Original Text:**
> To achieve this, we propose the duplex deployment framework... As illustrated in **Fig.1**, two VITA models are deployed concurrently.

**Proposed Correction:**
> To achieve this, we propose the duplex deployment framework... As illustrated in **Fig. 2**, two VITA models are deployed concurrently.

---

### **Reasoning**

* The duplex deployment scheme, which involves two concurrently deployed VITA models, is explicitly shown in **Figure 2**.
* The paper's "Introduction" section correctly references this architecture, stating: "As shown in Fig. 2, two VITA models are deployed simultaneously: one is responsible for generating responses to user queries, and the other continuously tracks environmental inputs...".
* **Figure 1** is titled "Interaction of VITA" and demonstrates the user interaction flow, not the underlying two-model architecture.

Minor Typo in Figure Reference #129