Open Xtract

Extract structured data from documents, images, audio, and video using LLMs.

Installation

uv add open-xtract

Usage

from pydantic import BaseModel
from open_xtract import extract

class PdfInfo(BaseModel):
    summary: str
    language: str

result = extract(
    schema=PdfInfo,
    model="google-gla:gemini-3-flash-preview",
    url="https://example.com/document.pdf",
    instructions="Return a two-sentence summary and the primary language of the document.",
)
print(result)

Logging

To enable Logfire instrumentation for tracing:

from open_xtract import configure_logging

configure_logging()

Error Handling

from open_xtract import (
    extract,
    ExtractionError,
    ModelError,
    SchemaValidationError,
    UrlFetchError,
)

try:
    result = extract(...)
except UrlFetchError as e:
    print(f"Failed to fetch URL: {e}")
except SchemaValidationError as e:
    print(f"Output didn't match schema: {e}")
except ModelError as e:
    print(f"Model API error: {e}")
except ExtractionError as e:
    print(f"Extraction failed: {e}")
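If fetch failures turn out to be transient in your environment, one option is to wrap the call in a small retry helper. The sketch below is not part of open-xtract: `call_with_retry`, `retry_on`, `attempts`, and `base_delay` are hypothetical names introduced for illustration, and whether `UrlFetchError` is worth retrying is an assumption you should verify for your own sources.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Call fn, retrying with exponential backoff when it raises one of retry_on.

    Hypothetical helper, not provided by open-xtract.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
    # Only reached when the loop never ran.
    raise ValueError("attempts must be at least 1")
```

With the imports from the example above, the call would look like `call_with_retry(lambda: extract(...), retry_on=(UrlFetchError,))`. Exceptions not listed in `retry_on`, such as `SchemaValidationError`, propagate immediately and can be handled by the same `try`/`except` chain.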

Supported Media Types

| Type | Extensions |
| --- | --- |
| Documents | `.pdf`, `.doc`, `.docx`, `.txt`, `.html`, `.csv`, `.xls`, `.xlsx` |
| Images | `.jpg`, `.jpeg`, `.png`, `.gif`, `.webp`, `.bmp`, `.svg` |
| Audio | `.mp3`, `.wav`, `.ogg`, `.flac`, `.aac`, `.m4a` |
| Video | `.mp4`, `.mov`, `.avi`, `.mkv`, `.webm`, `.wmv` |
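To check a URL against this list before calling `extract`, a small pre-flight helper can look at the file extension. This is a minimal sketch that uses only the standard library. `MEDIA_TYPES` and `media_type_for` are hypothetical names, not part of open-xtract, and the library may detect media types differently (for example, from response headers).

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extension lists copied from the table above.
MEDIA_TYPES = {
    "document": {".pdf", ".doc", ".docx", ".txt", ".html", ".csv", ".xls", ".xlsx"},
    "image": {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp", ".svg"},
    "audio": {".mp3", ".wav", ".ogg", ".flac", ".aac", ".m4a"},
    "video": {".mp4", ".mov", ".avi", ".mkv", ".webm", ".wmv"},
}


def media_type_for(url: str) -> Optional[str]:
    """Return the media category for a URL's file extension, or None if unlisted.

    Hypothetical helper, not provided by open-xtract.
    """
    # Look only at the URL path so query strings and fragments are ignored.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    for media_type, extensions in MEDIA_TYPES.items():
        if suffix in extensions:
            return media_type
    return None


print(media_type_for("https://example.com/document.pdf"))
```

A `None` result only means the extension is not in the table. URLs without a file extension need a different check.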

Contributing

See CONTRIBUTING.md for development setup and guidelines.
