Azure OpenAI
Access to multiple versions of GPT
Access to other LLMs within the Azure ecosystem
Offers integration with other Microsoft tools (e.g. Azure Cognitive Services)
Create a Resource: Log in to the Azure portal and create an "Azure OpenAI" resource. Select a region where the service is available.
Deploy GPT-4: Within the resource, deploy the GPT-4 model. This will make the model available for use via REST APIs or the Azure SDK.
Preprocess Financial Documents
Use OCR to extract text from the documents in their original format
Clean the extracted text
Split longer documents into manageable sections (that fit within GPT-4 token limits)
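The cleaning and splitting steps above could be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names `clean_text` and `split_into_sections` are hypothetical, and the `max_chars` budget is only a rough character-based stand-in for GPT-4's token limit (an assumption; an exact count would need a tokenizer).

```python
import re


def clean_text(raw: str) -> str:
    """Collapse runs of spaces/tabs and drop blank lines left behind by OCR."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def split_into_sections(text: str, max_chars: int = 8000) -> list[str]:
    """Group whole lines into sections of at most max_chars characters.

    max_chars is a rough proxy for a token budget (assumption). A single
    line longer than max_chars is kept as its own oversized section.
    """
    if not text:
        return []
    sections, current, size = [], [], 0
    for line in text.split("\n"):
        added = len(line) + (1 if current else 0)  # +1 for the joining newline
        if current and size + added > max_chars:
            # Current section is full: close it and start a new one.
            sections.append("\n".join(current))
            current, size = [], 0
            added = len(line)
        current.append(line)
        size += added
    if current:
        sections.append("\n".join(current))
    return sections
```

Each section returned by `split_into_sections` can then be sent to the model as a separate prompt.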
Call GPT-4 via Azure API
Use the Azure OpenAI REST API or Python SDK to send prompts and receive responses
EXAMPLE CODE:
import openai

# Set up Azure OpenAI configuration
# (this uses the legacy interface of the openai package, versions before 1.0)
openai.api_type = "azure"
openai.api_key = "<your_azure_api_key>"
openai.api_base = "https://<your_resource_name>.openai.azure.com/"
openai.api_version = "2023-06-01-preview"

# Define the prompt
def analyze_invoice(ocr_text):
    prompt = f"""
    You are given the OCR result of an invoice with the following text:
    {ocr_text}
    Please carefully analyze the text and return a JSON object with the following fields:
    - invoice_number (str): The invoice number (look for the value after 'INVOICE #')
    - vendor (str): The company or person who issued the invoice (look for company name at top)
    - date (str): The invoice date in YYYY-MM-DD format (convert from any format found)
    - amount (float): The total amount due (look for "TOTAL DUE", "TOTAL", or sum at bottom)
    - description (str): A concatenated description of the services/items
    - subtotal (float): The subtotal before tax and fees
    - tax (float): Any sales tax amount
    - service_fee (float): Any service fee amount
    Be especially careful to:
    1. Look for numerical values near labels like "TOTAL DUE", "SALES TAX", etc.
    2. Parse dates in format like "17/10/2024" to "2024-10-17"
    3. Combine line items into the description
    4. Extract the invoice number after 'INVOICE #'
    If any field cannot be determined, use null for that field.
    """
    # Call GPT-4 with the prompt
    response = openai.ChatCompletion.create(
        engine="gpt-4",  # Replace with your GPT-4 deployment name
        messages=[
            {"role": "system", "content": "You are an expert financial document analyzer."},
            {"role": "user", "content": prompt}
        ]
    )
    # Extract and return the response
    result = response['choices'][0]['message']['content']
    return result

# Example OCR text
ocr_text = """
Invoice #
12345678
Date: 10/15/2024
Vendor: Acme Corp
Subtotal: $1000.00
Tax: $50.00
Service Fee: $25.00
TOTAL DUE: $1075.00
Description: Consulting services for October 2024
"""

# Call the function
output = analyze_invoice(ocr_text)

# Print the result
print(output)
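For the REST route mentioned above, the request could be built with the standard library alone. This is a sketch under stated assumptions: the function names `build_chat_request` and `send_chat_request` are hypothetical, and the resource name, deployment name, and key are placeholders to be replaced with your own values. The URL follows the Azure OpenAI chat completions pattern, with the key passed in the `api-key` header.

```python
import json
import urllib.request


def build_chat_request(resource_name, deployment, api_key, prompt,
                       api_version="2023-06-01-preview"):
    """Build (but do not send) a chat completions request for Azure OpenAI."""
    url = (
        f"https://{resource_name}.openai.azure.com/openai/deployments/"
        f"{deployment}/chat/completions?api-version={api_version}"
    )
    body = {
        "messages": [
            {"role": "system", "content": "You are an expert financial document analyzer."},
            {"role": "user", "content": prompt},
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

Keeping construction and sending separate makes it possible to inspect or log the request before any network call is made.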
Post-Processing and Integration
Validate Output: Use regex or other text analysis methods to verify and refine GPT-4’s output.
Integrate with Azure:
Store results in Azure Blob Storage or Azure SQL Database.
Use Power BI or Azure Logic Apps for data visualization or workflow automation.
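The validation step above could look roughly like this before results are stored. It is a minimal sketch, not a definitive implementation: `validate_invoice` is a hypothetical helper, the field names follow the prompt in the example code, and the check that amount equals subtotal + tax + service_fee is an assumption based on the sample invoice and may not hold for every invoice layout.

```python
import json
import re

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
NUMERIC_FIELDS = ("amount", "subtotal", "tax", "service_fee")


def validate_invoice(raw_output: str) -> list[str]:
    """Return a list of problems found in the model output (empty if none)."""
    # The model may wrap the JSON in prose or a code fence, so take the
    # outermost {...} span before parsing.
    match = re.search(r"\{.*\}", raw_output, re.DOTALL)
    if match is None:
        return ["no JSON object found"]
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    problems = []
    date = data.get("date")
    if date is not None and not (isinstance(date, str) and DATE_RE.match(date)):
        problems.append("date is not in YYYY-MM-DD format")

    numbers_ok = True
    for field in NUMERIC_FIELDS:
        value = data.get(field)
        # null is allowed; booleans are rejected even though bool is an int subclass.
        if value is not None and (isinstance(value, bool) or not isinstance(value, (int, float))):
            problems.append(f"{field} is not a number")
            numbers_ok = False

    # Cross-check the total only when every component is present and numeric.
    parts = [data.get(name) for name in ("subtotal", "tax", "service_fee")]
    total = data.get("amount")
    if numbers_ok and total is not None and all(p is not None for p in parts):
        if abs(sum(parts) - total) > 0.01:
            problems.append("amount does not equal subtotal + tax + service_fee")
    return problems
```

An empty list means the output passed every check; anything else can be logged or routed for manual review.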
Azure AI Search
Allows for:
Document indexing
Document searching
AI-powered extraction using custom models
API Gateway/Middleware
Build a middleware layer to facilitate communication between Ollama (running locally or on a server) and Microsoft cloud services
Handle document preprocessing, call Ollama for inference, then push results to an Azure storage service
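A middleware flow along these lines might be sketched as below. Several things here are assumptions for illustration: the function names `call_ollama` and `process_document` are hypothetical, the model name `llama3` is a placeholder for whichever model is pulled locally, and the storage step is passed in as a callable so that any Azure storage client could be plugged in. The endpoint shown is Ollama's default local `/api/generate` route with streaming turned off.

```python
import json
import urllib.request
from typing import Callable

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def call_ollama(prompt: str, model: str = "llama3") -> str:
    """Run one non-streaming generation against a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["response"]


def process_document(
    raw_text: str,
    infer: Callable[[str], str],
    store: Callable[[str, str], None],
    name: str,
) -> str:
    """Preprocess a document, run inference, and hand the result to storage."""
    cleaned = " ".join(raw_text.split())  # minimal preprocessing: normalise whitespace
    result = infer(f"Extract the invoice fields as JSON:\n{cleaned}")
    store(name, result)  # e.g. a function that uploads to an Azure storage service
    return result
```

In use, `infer` would be `call_ollama` and `store` a small wrapper around the chosen Azure storage client; passing both in keeps the pipeline testable without a running server.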
Containerisation - Azure Kubernetes Service
Deploy Ollama within AKS on Microsoft Azure
https://learn.microsoft.com/en-us/azure/ai-services/openai/
https://learn.microsoft.com/en-us/azure/search/
https://learn.microsoft.com/en-us/azure/search/search-features-list
https://learn.microsoft.com/en-us/azure/aks/