[Model Request] Add Qwen3-VL-32B-Instruct for Vision/Document OCR

## Request

Please add **Qwen3-VL-32B-Instruct** (Vision-Language Model) to Cocoon network.

## Use Case: Legal Document Processing

We are building **Starec** - an AI legal assistant for Russian legal system:
- OCR of legal documents (court decisions, laws, case files)
- Processing scanned PDFs from pravo.gov.ru (Russian official legal portal)
- Extracting structured data from criminal case volumes
- Multi-page document analysis with 256K context

## Why Qwen3-VL-32B?

| Feature | Value |
|---------|-------|
| DocVQA accuracy | ~96% (near human-level 98%) |
| OCRBench | 88.8% |
| Languages | 32 (including Russian/Cyrillic) |
| Context | 256K tokens (extendable to 1M) |
| Architecture | Dense (more stable than MoE) |

**Key advantages for legal OCR:**
- Robust in low light, blur, tilted text (scanned documents)
- Improved rare/ancient character recognition (legal terminology)
- Long document structure parsing (multi-page court decisions)
- JSON extraction accuracy matches GPT-4o (~75%)

## Technical Details

- **Model**: `Qwen/Qwen3-VL-32B-Instruct`
- **HuggingFace**: https://huggingface.co/Qwen/Qwen3-VL-32B-Instruct
- **Parameters**: 32B (dense)
- **VRAM**: ~70GB FP16, ~40GB INT8
- **vLLM compatible**: Yes

## Alternative (smaller)

If 32B is too heavy initially:
- **Qwen3-VL-8B-Instruct** (~18GB VRAM) - still excellent for OCR

## Contact

- Project: Starec (Legal AI Assistant)
- Wallet: `UQDtVYbmmARixnYhtDQGljyoXSEtM_NoJxQ4NlwDQPuLubnG`

## References

- [Qwen3-VL GitHub](https://github.com/QwenLM/Qwen3-VL)
- [Qwen3-VL Technical Report](https://arxiv.org/abs/2511.21631)
- [OmniAI OCR Benchmarks](https://getomni.ai/blog/benchmarking-open-source-models-for-ocr)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Model Request] Add Qwen3-VL-32B-Instruct for Vision/Document OCR #34

Request

Use Case: Legal Document Processing

Why Qwen3-VL-32B?

Technical Details

Alternative (smaller)

Contact

References

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Feature	Value
DocVQA accuracy	~96% (near human-level 98%)
OCRBench	88.8%
Languages	32 (including Russian/Cyrillic)
Context	256K tokens (extendable to 1M)
Architecture	Dense (more stable than MoE)

[Model Request] Add Qwen3-VL-32B-Instruct for Vision/Document OCR #34

Description

Request

Use Case: Legal Document Processing

Why Qwen3-VL-32B?

Technical Details

Alternative (smaller)

Contact

References

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions