Skip to content

[Model Request] Add Qwen3-VL-32B-Instruct for Vision/Document OCR #34

@illogicallogic4

Description

@illogicallogic4

Request

Please add Qwen3-VL-32B-Instruct (Vision-Language Model) to Cocoon network.

Use Case: Legal Document Processing

We are building Starec - an AI legal assistant for Russian legal system:

  • OCR of legal documents (court decisions, laws, case files)
  • Processing scanned PDFs from pravo.gov.ru (Russian official legal portal)
  • Extracting structured data from criminal case volumes
  • Multi-page document analysis with 256K context

Why Qwen3-VL-32B?

Feature Value
DocVQA accuracy ~96% (near human-level 98%)
OCRBench 88.8%
Languages 32 (including Russian/Cyrillic)
Context 256K tokens (extendable to 1M)
Architecture Dense (more stable than MoE)

Key advantages for legal OCR:

  • Robust in low light, blur, tilted text (scanned documents)
  • Improved rare/ancient character recognition (legal terminology)
  • Long document structure parsing (multi-page court decisions)
  • JSON extraction accuracy matches GPT-4o (~75%)

Technical Details

Alternative (smaller)

If 32B is too heavy initially:

  • Qwen3-VL-8B-Instruct (~18GB VRAM) - still excellent for OCR

Contact

  • Project: Starec (Legal AI Assistant)
  • Wallet: UQDtVYbmmARixnYhtDQGljyoXSEtM_NoJxQ4NlwDQPuLubnG

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions