yswa-var/DOCX-agent

DOCX Agent - AI-Powered Document CRUD Operations

An intelligent document editing system that combines LangGraph, OpenAI, and docx2python to perform CRUD operations on Word documents using natural language.


🎯 Key Features

  • ✅ Natural Language Interface - Edit documents with plain English
  • ✅ Anchor-Based Navigation - Precise paragraph addressing with depth-4 enumeration
  • ✅ Breadcrumb Trails - Hierarchical context for every paragraph
  • ✅ OpenAI-Powered CRUD - Intelligent operation routing and execution
  • ✅ LangGraph Integration - Agent-based workflows with tool calling
  • ✅ JSON Export - Structured document index with metadata

🏗️ Architecture

User Natural Language Query
          ↓
   OpenAI LLM (Reasoning)
          ↓
   ┌────────────────────────────┐
   │   LangGraph Agent          │
   │   (graph.py)               │
   └────────────────────────────┘
          ↓
   ┌────────────────────────────┐
   │   Agent Tools              │
   │   • get_paragraph()        │
   │   • update_paragraph()     │
   │   • get_document_outline() │
   │   • search_document()      │
   └────────────────────────────┘
          ↓
   ┌────────────────────────────┐
   │   DOCX Manager             │
   │   (Index & Operations)     │
   └────────────────────────────┘
          ↓
   ┌────────────────────────────┐
   │   DOCX Indexer             │
   │   (Structure Parser)       │
   └────────────────────────────┘
          ↓
      DOCX File

📦 Installation

# Clone and setup
git clone https://github.com/yswa-var/DOCX-agent.git
cd DOCX-agent
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

🚀 Quick Start

1. Test Basic Operations

python test_agent.py

This will:

  • Index the sample DOCX file
  • Test all CRUD operations
  • Generate document_index.json
  • Display example prompts

2. Run with LangGraph

# Set OpenAI API key
export OPENAI_API_KEY="sk-your-key-here"

# Start LangGraph dev server
cd main
langgraph dev

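Open the URL that langgraph dev prints on startup (http://localhost:8000 in this setup) in your browser. Recent LangGraph CLI releases listen on http://127.0.0.1:2024 by default, so check the startup output if port 8000 does not respond.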

3. Try Natural Language Queries

Read Operations:

"Show me the document outline"
"What's in section 2.1?"
"Find all mentions of CPX"

Update Operations:

"Change the title to 'Updated RFP Response'"
"Update section 2.1 to say 'Company Profile'"

Search Operations:

"Find paragraphs about implementation"
"Where is the pricing section?"

⚙️ Demo Backend Setup

For the lightweight demo flow you can run just the FastAPI backend and the optional Microsoft Teams adapter.

  1. Configure environment
    cd DOCX-agent  # repository root
    cp .env.example .env  # update values if needed
  2. Start the backend API
    uvicorn backend.app:app --reload --port 8080
    • Uses in-memory LangGraph execution
    • Persists session metadata to backend/sessions.csv
    • Reads demo DOCX files from the directories listed in DOCUMENT_SEARCH_DIRS
  3. (Optional) Run the Teams bridge
    cd teams
    python app.py
    Set BACKEND_API_URL in your environment if the backend is not on http://localhost:8080.
  4. Load a document in chat
    Send /load master.docx (or another filename) from the web/Teams client to attach the sample document.
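
How the backend turns a filename such as master.docx into a path is not shown here. The sketch below is one plausible lookup, assuming DOCUMENT_SEARCH_DIRS holds directories joined by the platform path separator. Both that format and the helper name find_document are assumptions, not the repository's confirmed behaviour.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def find_document(filename: str, search_dirs: str) -> Optional[Path]:
    """Return the first existing match for filename in the listed directories."""
    for raw in search_dirs.split(os.pathsep):
        if not raw.strip():
            continue  # ignore empty entries such as a trailing separator
        candidate = Path(raw.strip()) / filename
        if candidate.is_file():
            return candidate
    return None


# Demo against a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "master.docx").write_bytes(b"")
    dirs = os.pathsep.join(["does-not-exist", tmp])
    found = find_document("master.docx", dirs)
    missing = find_document("other.docx", dirs)
```

In a real setup the second argument would come from os.environ.get("DOCUMENT_SEARCH_DIRS", ""), and a None result is the natural place to answer the /load command with a "file not found" message.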

📚 How It Works

1. Document Indexing (docx2python)

from docx2python import docx2python

with docx2python("document.docx") as docx:
    pars = docx.body  # depth-4: [table][row][col][par]
    first = pars[0][0][0][0]  # First paragraph
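
To make the depth-4 layout concrete, here is a minimal sketch that walks a nested list shaped like docx.body and yields one ["body", table, row, column, paragraph] address per non-empty paragraph. It runs on a hand-built list rather than a real file, and iter_anchors is a hypothetical helper name, not part of docx2python or this repository.

```python
from typing import Iterator, List, Tuple

Body = List[List[List[List[str]]]]  # [table][row][col][par]


def iter_anchors(body: Body) -> Iterator[Tuple[list, str]]:
    """Yield (anchor, text) for every non-empty paragraph in a depth-4 body."""
    for t, table in enumerate(body):
        for r, row in enumerate(table):
            for c, cell in enumerate(row):
                for p, text in enumerate(cell):
                    if text.strip():
                        yield ["body", t, r, c, p], text


# Hand-built stand-in for docx.body: running text stored as a single-cell
# "table", followed by a real table with one row and two columns.
sample_body: Body = [
    [[["RFP PROPOSAL RESPONSE", "", "Table of Contents"]]],
    [[["Item"], ["Price"]]],
]

anchors = list(iter_anchors(sample_body))
```

Empty strings are skipped but still counted, so the paragraph index in each anchor matches the position in the original nested list.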

2. Anchor System

Every paragraph gets a unique anchor:

{
  "anchor": ["body", 0, 0, 0, 5],
  "breadcrumb": "RFP PROPOSAL RESPONSE > Table of Contents",
  "style": "Heading 2",
  "text": "Table of Contents",
  "level": 2
}

Anchor Format: ["body", table, row, column, paragraph]
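
Going the other way, an anchor can be resolved back to its paragraph by indexing one level at a time. The following is a sketch over a plain dictionary, with resolve_anchor as an illustrative name. The project's own lookup lives in DocxManager and may work differently.

```python
from typing import List, Union

Anchor = List[Union[str, int]]


def resolve_anchor(document: dict, anchor: Anchor) -> str:
    """Return the paragraph text addressed by ["body", table, row, col, par]."""
    if len(anchor) != 5:
        raise ValueError(f"anchor needs 5 parts, got {len(anchor)}: {anchor!r}")
    section, *indices = anchor
    node = document[section]  # KeyError if the section name is unknown
    for index in indices:
        node = node[index]    # IndexError if the address no longer exists
    return node


# Toy document: one section holding a single-cell table with three paragraphs.
document = {"body": [[[["Title", "Intro", "Details"]]]]}
text = resolve_anchor(document, ["body", 0, 0, 0, 1])
```

Letting KeyError and IndexError propagate makes a stale anchor fail loudly instead of silently editing the wrong paragraph.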

3. OpenAI Integration

The LLM acts as an intelligent router:

User: "Update the pricing section to include 20% discount"

OpenAI reasoning:
1. 🔍 Search for "pricing" → find anchor
2. 📖 Get current text → understand context
3. ✏️ Modify text with discount
4. 💾 Call update_paragraph(anchor, new_text)
5. ✅ Return success message
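
The five steps above can be sketched without an LLM to show the order of operations. Everything here is a stand-in: the paragraphs live in an in-memory dictionary, search is a plain substring match, and the text modification is hard-coded where the model would normally write it.

```python
from typing import Dict, List, Tuple

Anchor = Tuple[str, int, int, int, int]

# In-memory stand-in for the document index; the real tools work on a DOCX file.
paragraphs: Dict[Anchor, str] = {
    ("body", 0, 0, 0, 0): "RFP PROPOSAL RESPONSE",
    ("body", 0, 0, 0, 7): "Pricing: standard rates apply.",
}


def search(query: str) -> List[Anchor]:
    """Step 1: case-insensitive substring search returning matching anchors."""
    q = query.lower()
    return [a for a, text in paragraphs.items() if q in text.lower()]


def update_pricing_with_discount(discount: str) -> dict:
    matches = search("pricing")
    if not matches:
        return {"success": False, "message": "No paragraph mentions pricing"}
    anchor = matches[0]
    current = paragraphs[anchor]                                # Step 2: read
    new_text = f"{current} A {discount} discount is included."  # Step 3: modify
    paragraphs[anchor] = new_text                               # Step 4: update
    return {"success": True, "message": "Paragraph updated successfully"}  # Step 5


result = update_pricing_with_discount("20%")
```

Reading the current text before writing is what lets the agent extend a paragraph rather than overwrite it.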

4. Agent Tools

Four core tools in tools.py:

# READ
await get_paragraph(["body", 0, 0, 0, 5])
await get_document_outline()
await search_document("CPX")

# UPDATE
await update_paragraph(["body", 0, 0, 0, 5], "New text")

💡 Why OpenAI for CRUD?

Traditional Approach ❌

# Manual, error-prone
anchor = ["body", 2, 5, 0, 12]
update_paragraph(anchor, "New text")  # How do you know the anchor?

AI-Powered Approach ✅

"Update the introduction paragraph"
→ AI finds it, validates context, updates correctly

Benefits:

  1. No Manual Anchor Lookup - AI finds the right paragraph
  2. Context Understanding - "pricing section" vs "first pricing mention"
  3. Multi-Step Operations - "Compare sections 3 and 4"
  4. Error Prevention - Validates before updating
  5. Natural Language - "Change X to Y in section Z"

📖 API Reference

DocxIndexer

from react_agent.docx_indexer import DocxIndexer

indexer = DocxIndexer("document.docx")
paragraphs = indexer.index()
outline = indexer.get_outline()
matches = indexer.find_by_text("search term")
indexer.save_index("output.json")
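
Once an index has been saved, it can be inspected with nothing but the standard library. The sketch below assumes the file has the shape shown under Index JSON Structure. The JSON is inlined so the example runs without a document, and filter_entries is a hypothetical helper rather than a DocxIndexer method.

```python
import json

# Inlined sample in the documented index format (normally read from output.json).
index_json = """[
  {"anchor": ["body", 0, 0, 0, 0], "breadcrumb": "RFP PROPOSAL RESPONSE",
   "style": "Heading 1", "text": "RFP PROPOSAL RESPONSE", "level": 1},
  {"anchor": ["body", 0, 0, 0, 5],
   "breadcrumb": "RFP PROPOSAL RESPONSE > Table of Contents",
   "style": "Heading 2", "text": "Table of Contents", "level": 2}
]"""

entries = json.loads(index_json)


def filter_entries(entries: list, term: str) -> list:
    """Case-insensitive substring match on each entry's text field."""
    term = term.lower()
    return [e for e in entries if term in e["text"].lower()]


matches = filter_entries(entries, "contents")

# Indented outline built from heading entries only.
outline = [
    "  " * (e["level"] - 1) + e["text"]
    for e in entries
    if e["style"].startswith("Heading")
]
```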

DocxManager

from react_agent.docx_manager import get_docx_manager

manager = get_docx_manager("document.docx")
para = manager.get_paragraph(["body", 0, 0, 0, 5])
outline = manager.get_outline()
results = manager.search("query")
manager.update_paragraph(anchor, "new text")

Agent Tools

from react_agent.tools import (
    get_paragraph,
    update_paragraph,
    get_document_outline,
    search_document
)

# All tools are async
result = await get_document_outline()
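
Because the tools are coroutines, a bare await only works inside another async function or an async-aware REPL. From a regular script, one option is asyncio.run, sketched here with a stand-in coroutine so it runs without the package installed. The return shape is an assumption, and if the real tools are wrapped as LangChain tool objects the call syntax may differ.

```python
import asyncio


async def get_document_outline() -> list:
    """Stand-in for react_agent.tools.get_document_outline (assumed return shape)."""
    await asyncio.sleep(0)  # yield control once, as a real async tool would
    return [{"text": "RFP PROPOSAL RESPONSE", "level": 1}]


async def main() -> list:
    # Inside a coroutine, tools can be awaited one after another.
    return await get_document_outline()


# asyncio.run creates an event loop, runs main() to completion, and closes the loop.
outline = asyncio.run(main())
```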

🧪 Testing

Run All Tests

python test_agent.py

Test Individual Components

# Index a document
python main/src/react_agent/docx_indexer.py response/master.docx output.json

# Test manager
python -c "
from react_agent.docx_manager import get_docx_manager
manager = get_docx_manager('response/master.docx')
print(f'Found {len(manager.get_all_paragraphs())} paragraphs')
"

📝 Example Prompts

See TESTING.md for comprehensive prompt examples.

Simple Queries:

  • "Show me the outline"
  • "What's in section 3?"
  • "Find 'implementation'"

Complex Queries:

  • "Update all pricing sections to include 15% discount"
  • "Show me the breadcrumb for the team section"
  • "List all subsections under 'About CPX'"

🛠️ Configuration

Change Default Document

Edit main/src/react_agent/docx_manager.py:

def get_docx_manager(docx_path: Optional[str] = None) -> DocxManager:
    if docx_path is None:
        docx_path = "/path/to/your/default.docx"  # Change here
    return DocxManager(docx_path)
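
An alternative to editing the source is to read the default from the environment. This is only a sketch of that pattern: the variable name DOCX_AGENT_DEFAULT_PATH and the helper resolve_docx_path are hypothetical and do not exist in the repository.

```python
import os
from typing import Optional

DEFAULT_DOCX_ENV = "DOCX_AGENT_DEFAULT_PATH"  # hypothetical variable name


def resolve_docx_path(docx_path: Optional[str] = None) -> str:
    """An explicit argument wins, then the environment variable, then a fallback."""
    if docx_path is not None:
        return docx_path
    return os.environ.get(DEFAULT_DOCX_ENV, "response/master.docx")
```

get_docx_manager could then call resolve_docx_path(docx_path) before constructing DocxManager, which keeps machine-specific paths out of version control.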

Configure LLM

Edit main/langgraph.json or set environment variables:

export OPENAI_API_KEY="sk-..."
export OPENAI_MODEL="gpt-4"

Customize System Prompt

Edit main/src/react_agent/prompts.py:

DEFAULT_SYSTEM_PROMPT = """
Your custom instructions here...
"""

📁 Project Structure

DOCX-agent/
├── main/
│   └── src/
│       └── react_agent/
│           ├── docx_indexer.py    # DOCX structure parser
│           ├── docx_manager.py    # Document operations
│           ├── tools.py           # LangGraph agent tools
│           ├── graph.py           # Agent graph definition
│           ├── state.py           # Agent state
│           └── prompts.py         # System prompts
├── response/
│   └── master.docx                # Sample document
├── test_agent.py                  # Comprehensive tests
├── TESTING.md                     # Testing guide
├── README.md                      # This file
└── requirements.txt               # Python dependencies

🔧 Dependencies

  • docx2python - Parse DOCX structure (depth-4)
  • python-docx - Edit DOCX content
  • langgraph - Agent orchestration
  • langchain - LLM integration
  • openai - GPT-4 API

📊 Output Format

Index JSON Structure

[
  {
    "anchor": ["body", 0, 0, 0, 0],
    "breadcrumb": "RFP PROPOSAL RESPONSE",
    "style": "Heading 1",
    "text": "RFP PROPOSAL RESPONSE",
    "level": 1
  },
  {
    "anchor": ["body", 0, 0, 0, 5],
    "breadcrumb": "RFP PROPOSAL RESPONSE > Table of Contents",
    "style": "Heading 2",
    "text": "Table of Contents",
    "level": 2
  }
]
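
A breadcrumb such as "RFP PROPOSAL RESPONSE > Table of Contents" can be derived from heading levels alone by keeping a stack of the current ancestors. The sketch below shows one way to do that. It is not taken from docx_indexer.py, and the "About CPX" and "Company Profile" headings are sample data borrowed from the example prompts.

```python
from typing import List, Tuple


def build_breadcrumbs(headings: List[Tuple[int, str]]) -> List[str]:
    """Given (level, text) headings in document order, return one breadcrumb each."""
    stack: List[Tuple[int, str]] = []
    crumbs: List[str] = []
    for level, text in headings:
        # Headings at the same or a deeper level are no longer ancestors.
        while stack and stack[-1][0] >= level:
            stack.pop()
        stack.append((level, text))
        crumbs.append(" > ".join(t for _, t in stack))
    return crumbs


crumbs = build_breadcrumbs([
    (1, "RFP PROPOSAL RESPONSE"),
    (2, "Table of Contents"),
    (2, "About CPX"),
    (3, "Company Profile"),
])
```

Body paragraphs would inherit the breadcrumb of the most recent heading, optionally with their own text appended.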

Tool Response Format

{
  "success": true,
  "message": "Paragraph updated successfully"
}
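
Only the success shape is documented above. A failure would presumably reuse the same two keys with "success": false, and the sketch below shows a small wrapper built on that assumption. The helper name as_tool_response and the error message format are illustrative, not the project's.

```python
from typing import Callable


def as_tool_response(action: Callable[[], object], ok_message: str) -> dict:
    """Run action and report the outcome as {"success": ..., "message": ...}."""
    try:
        action()
    except (KeyError, IndexError, ValueError) as exc:
        # Lookup and validation errors become a structured failure, not a crash.
        return {"success": False, "message": f"{type(exc).__name__}: {exc}"}
    return {"success": True, "message": ok_message}


store = {"p1": "Old text"}

ok = as_tool_response(
    lambda: store.update(p1="New text"),
    "Paragraph updated successfully",
)
bad = as_tool_response(lambda: store["missing"], "unreachable")
```

Returning a dictionary in both cases gives the LLM something it can read and react to, for example by searching again when an anchor is stale.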

🚨 Troubleshooting

"Module not found" error

source venv/bin/activate
pip install -r requirements.txt

"Cannot access document" error

Update the default path in main/src/react_agent/docx_manager.py, or pass the document path to get_docx_manager().

OpenAI API errors

export OPENAI_API_KEY="sk-..."
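
A quick preflight check can catch the most common cause before the agent starts. This sketch only inspects the environment and never calls the API. The function name check_openai_env is hypothetical, and the placeholder strings are the ones used in this README.

```python
import os
from typing import List, Mapping


def check_openai_env(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return human-readable problems with the OpenAI settings (empty if none)."""
    problems: List[str] = []
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        problems.append("OPENAI_API_KEY is not set")
    elif key in ("sk-...", "sk-your-key-here"):
        problems.append("OPENAI_API_KEY still holds the placeholder value")
    return problems
```

Calling check_openai_env() at startup and printing any returned problems gives a clearer message than a failed request deep inside the graph.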

🎯 Use Cases

  • 📄 RFP Response Automation - Update proposals with client-specific info
  • 📋 Contract Management - Search and modify contract terms
  • 📊 Report Generation - Populate templates with data
  • 📝 Document QA - Ask questions about document content
  • ✏️ Batch Editing - Update multiple sections at once

🔮 Future Enhancements

  • Support for tables and images
  • Track change history
  • Multi-document operations
  • Export to PDF
  • Template system
  • Collaboration features

📄 License

See LICENSE file for details.

🤝 Contributing

This is a prototype system. Feel free to extend and customize for your needs!

📞 Support

For questions or issues, refer to TESTING.md for detailed examples and troubleshooting.


Built with ❤️ using LangGraph, OpenAI, and docx2python
