LangExtract is an extension of Google's LangExtract package that provides a powerful, unified interface for extracting structured information from unstructured text using Large Language Models (LLMs). Built with enterprise-grade scalability and reliability in mind, it integrates seamlessly with all major LLM providers, including OpenAI, Anthropic, and Google, as well as local models via Ollama. Open-weight models such as OpenAI's gpt-oss-120b and gpt-oss-20b are also supported via HuggingFace's OpenAI-compatible API.
The library enables developers to transform raw text into structured data through natural language instructions and example-driven guidance, making it ideal for information extraction, entity recognition, relationship mapping, and content analysis tasks across various domains.
- Multi-Provider Support: Works with OpenAI GPT models, Anthropic Claude, Google Gemini, the HuggingFace OpenAI-compatible API, and local Ollama models. Compatible with the latest models, including GPT-5, Claude 4, and more.
- Example-Driven Few-Shot Learning: Uses high-quality examples to guide extraction quality and consistency, minimizing the need for extensive data labeling and model fine-tuning and making the library accessible to users with varying technical expertise.
- Long-Context Processing: The tool efficiently handles large datasets while maintaining contextual accuracy, making it ideal for complex NLP tasks.
- Parallel Processing: Concurrent API calls with configurable worker pools for high-throughput processing.
- Multi-Pass Extraction: Sequential extraction passes to improve recall and find additional entities.
- Flexible Input: Process strings, documents, or URLs with automatic content downloading.
- Rich Visualization: Interactive HTML visualizations of extraction results.
- Production Ready: Environment variable management, error handling, and comprehensive testing.
```bash
git clone <this repo url>
cd langextract
pip install -e .
```

Or using uv (recommended for development):

```bash
uv init && uv sync
```

Here's a complete example based on the included example.py:

```bash
uv run example.py --provider openai --model gpt-5-nano
```

Expected outputs:

- JSONL: `output/{provider}_extraction_results.jsonl`
- HTML visualization: `output/{provider}_extraction_results.html`
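Each line of the JSONL file is one annotated document. A minimal sketch for inspecting it downstream (the file name assumes the openai provider; the exact field names follow the serialized LangExtract schema, so check the printed keys rather than hard-coding them):

```python
import json

# Inspect the JSONL output: each line is one annotated document
# with its source text and the extractions found in it.
with open("output/openai_extraction_results.jsonl") as f:
    for line in f:
        doc = json.loads(line)
        print(sorted(doc.keys()))
```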
Tip: The prompt and the in‑code example in example.py show how to nudge models toward high‑quality, consistent entity and relationship extraction using exact text spans.
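For orientation, here is a minimal sketch of the same example-driven workflow in Python. The prompt wording, example text, and entity classes are illustrative assumptions rather than the exact contents of example.py, and the save/visualize helpers are the ones inherited from the upstream LangExtract package:

```python
import langextract as lx
from langextract import inference

# 1. Describe the task in natural language (wording is illustrative).
prompt = (
    "Extract characters, emotions, and relationships from the text. "
    "Use exact text spans from the input; do not paraphrase."
)

# 2. Provide a high-quality example to anchor output format and granularity.
examples = [
    lx.data.ExampleData(
        text="ROMEO. But soft! What light through yonder window breaks?",
        extractions=[
            lx.data.Extraction(
                extraction_class="character",
                extraction_text="ROMEO",
                attributes={"emotional_state": "wonder"},
            ),
        ],
    )
]

# 3. Run the extraction against the provider of your choice.
result = lx.extract(
    text_or_documents="Lady Juliet gazed longingly at the stars...",
    prompt_description=prompt,
    examples=examples,
    model_id="gpt-5-nano",
    language_model_type=inference.OpenAILanguageModel,
)

# 4. Persist the results and render the interactive HTML visualization.
lx.io.save_annotated_documents(
    [result], output_name="extraction_results.jsonl", output_dir="output"
)
html = lx.visualize("output/extraction_results.jsonl")
with open("output/extraction_results.html", "w") as f:
    f.write(html if isinstance(html, str) else html.data)
```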
LangExtract requires Python 3.8+ and installs the following key dependencies:
- `google-genai` - Google Gemini API client
- `anthropic` - Anthropic Claude API client
- `openai` - OpenAI GPT API client
- `huggingface-hub` - HuggingFace API client
- `ollama` - Ollama API client
LangExtract supports five different language model backends:
```python
from langextract import data
from langextract.inference import ClaudeLanguageModel
model = ClaudeLanguageModel(
model_id='claude-3-5-haiku-latest', # or claude-3-5-sonnet-latest
api_key='your-api-key', # or set ANTHROPIC_API_KEY
temperature=0.0,
max_workers=10,
structured_schema=None, # Optional schema for structured output
format_type=data.FormatType.JSON
)
```

```python
from langextract import data
from langextract.inference import OpenAILanguageModel
model = OpenAILanguageModel(
model_id='gpt-5-nano', # or gpt-4o
api_key='your-api-key', # or set OPENAI_API_KEY
organization='your-org-id', # Optional
temperature=0.0,
max_workers=10,
structured_schema=None, # Optional schema for structured output
format_type=data.FormatType.JSON
)
```

```python
from langextract import data
from langextract.inference import GeminiLanguageModel
model = GeminiLanguageModel(
model_id='gemini-2.5-flash', # or gemini-1.5-pro
api_key='your-api-key', # or set GOOGLE_API_KEY
temperature=0.0,
max_workers=10,
structured_schema=None, # Optional schema for structured output
format_type=data.FormatType.JSON
)
```

```python
from langextract import data
from langextract.inference import HFLanguageModel
model = HFLanguageModel(
model_id='openai/gpt-oss-120b:cerebras', # or other HF models
api_key='your-hf-token', # or set HF_TOKEN environment variable
base_url='https://router.huggingface.co/v1', # Default HF router
temperature=0.0,
max_workers=10,
structured_schema=None, # Optional schema for structured output
format_type=data.FormatType.JSON
)
```

Popular HuggingFace Models:
- `openai/gpt-oss-120b:cerebras` - High-performance open model
- `meta-llama/llama-3.2-3b-instruct` - Meta's Llama 3.2
- `microsoft/DialoGPT-medium` - Microsoft's conversational model
- `mistralai/mixtral-8x7b-instruct` - Mistral's mixture of experts
```python
from langextract.inference import OllamaLanguageModel
model = OllamaLanguageModel(
model='gemma2:latest', # or llama3.1, mistral, etc.
model_url='http://localhost:11434', # Default Ollama endpoint
structured_output_format='json',
temperature=0.8,
constraint=None # Optional schema constraint
)
```

Set up API keys using environment variables:

```bash
# For Anthropic Claude
export ANTHROPIC_API_KEY="your-claude-api-key"
# For OpenAI GPT models
export OPENAI_API_KEY="your-openai-api-key"
# For Google Gemini
export GOOGLE_API_KEY="your-google-api-key"
# For HuggingFace models
export HF_TOKEN="your-huggingface-token"
# Ollama doesn't require an API key (local models)
```

Or create a .env file in your project root:

```bash
cp .env.example .env
# Edit .env with your API keys
```
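If you prefer to load the .env file from Python, here is a minimal sketch using python-dotenv (an assumption for illustration, not a declared dependency of this repo):

```python
import os

# python-dotenv is assumed to be installed: pip install python-dotenv
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into os.environ

# The provider clients pick their keys up from the environment.
assert os.getenv("OPENAI_API_KEY"), "Set OPENAI_API_KEY in .env or your shell"
```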
Improve recall by running multiple extraction passes:

```python
result = lx.extract(
text_or_documents=input_text,
prompt_description=prompt,
examples=examples,
extraction_passes=3, # Run 3 independent passes, and merge them
model_id='claude-3-5-haiku-latest',
language_model_type=inference.ClaudeLanguageModel
)
```

Configure parallel workers for high-throughput processing:

```python
result = lx.extract(
text_or_documents=input_text,
prompt_description=prompt,
examples=examples,
max_workers=20, # Parallel API calls
batch_length=50, # Chunks per batch
model_id='gpt-4o-mini',
language_model_type=inference.OpenAILanguageModel
)
```

Access cutting-edge open-source models via HuggingFace's router:

```python
result = lx.extract(
text_or_documents=input_text,
prompt_description=prompt,
examples=examples,
model_id='openai/gpt-oss-120b:cerebras',
language_model_type=inference.HFLanguageModel,
language_model_params={
'api_key': 'your-hf-token', # or set HF_TOKEN env var
'temperature': 0.0
}
)
```

Use structured schemas for consistent output format:

```python
from langextract import schema
# Define custom schema
custom_schema = schema.StructuredSchema.from_examples(examples)
result = lx.extract(
text_or_documents=input_text,
prompt_description=prompt,
examples=examples,
use_schema_constraints=True, # Enable structured output
language_model_params={'structured_schema': custom_schema}
)
```

Process various input types:

```python
# Process URL
result = lx.extract(
text_or_documents="https://example.com/article",
prompt_description=prompt,
examples=examples
)
# Process multiple documents
documents = [
lx.data.Document(text="Document 1 content", metadata={"source": "doc1"}),
lx.data.Document(text="Document 2 content", metadata={"source": "doc2"})
]
results = lx.extract(
text_or_documents=documents,
prompt_description=prompt,
examples=examples
)
```

API costs vary by provider and model. Key factors affecting cost (see the sketch after this list):
- Token Volume: Larger `max_char_buffer` values reduce API calls but process more tokens per call
- Extraction Passes: Each additional pass reprocesses tokens (3 passes = 3x token cost)
- Parallel Workers: `max_workers` improves speed without additional token costs
- Model Selection: Larger models (GPT-4o, Claude-3 Opus) cost more than smaller ones (GPT-4o-mini, Claude-3 Haiku)
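As a rough illustration, the sketch below combines these knobs for a cost-conscious run. The buffer size and worker count are arbitrary assumptions rather than tuned recommendations, and `input_text`, `prompt`, and `examples` are assumed to be defined as in the earlier snippets:

```python
import langextract as lx
from langextract import inference

# Cost-conscious configuration (values are illustrative assumptions):
# a small model, a single pass, larger chunks to reduce the number of
# API calls, and parallel workers for speed (no extra token cost).
result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    model_id='gpt-4o-mini',
    language_model_type=inference.OpenAILanguageModel,
    extraction_passes=1,    # add passes only if recall is insufficient
    max_char_buffer=2000,   # bigger chunks -> fewer calls, more tokens per call
    max_workers=20,         # parallelism does not change token usage
)
```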
Cost Optimization Tips:
- Start with smaller models for testing (gpt-4o-mini, claude-3-5-haiku-latest)
- Use `extraction_passes=1` initially, increase only if recall is insufficient
- Monitor usage with small test runs before processing large datasets
- Consider models hosted on HuggingFace for access to open-source alternatives
- Consider local Ollama models for cost-sensitive applications
- Add support for Azure OpenAI
- Improve example-driven extraction with more complex schemas
- Self-improving prompts, e.g., incorporating the GPT-5 Prompt Optimizer (https://cookbook.openai.com/examples/gpt-5/prompt-optimization-cookbook)
- Fix extraction instability in some LLMs