diff --git a/README.md b/README.md
index 9ffe112..e7cd2fc 100644
--- a/README.md
+++ b/README.md
@@ -10,9 +10,6 @@ Classic OCR (Object Character Recognition) models lack reasoning ability based o
 
 This solution uses Azure Document Intelligence combined with GPT4-Vision. Each of the tools have their strong points and the hybrid approach is better than any of them alone.
 
-> Notes:
-> - The Azure OpenAI model needs to be vision capable i.e. GPT-4T-0125, 0409 or Omni
-
 
 ## Solution Overview
 
@@ -22,17 +19,6 @@ This solution uses Azure Document Intelligence combined with GPT4-Vision. Each o
 
 ![architecture](docs/ArchitectureOverview.png)
 
-## Prerequisites
-### Azure OpenAI Resource
-
-Before deploying the solution, you need to create an OpenAI resource and deploy a model that is vision capable.
-
-1. **Create an OpenAI Resource**:
-   - Follow the instructions [here](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/create-resource) to create an OpenAI resource in Azure.
-
-2. **Deploy a Vision-Capable Model**:
-   - Ensure the deployed model supports vision, such as GPT-4T-0125, GPT-4T-0409 or GPT-4-Omni.
-
 
 ## Deployment
 
@@ -41,7 +27,6 @@ Before deploying the solution, you need to create an OpenAI resource and deploy
 1. **Prerequisites**:
    - Install [Azure Developer CLI](https://learn.microsoft.com/en-us/azure/developer/azure-developer-cli/install-azd).
    - Ensure you have access to an Azure subscription.
-   - Create an OpenAI resource and deploy a vision-capable model.
 
 2. **Deployment Steps**:
    - Run the following commands to login (if needed):
@@ -68,7 +53,25 @@ Before deploying the solution, you need to create an OpenAI resource and deploy
 ## Running the Streamlit Frontend (recommended)
 
 To run the Streamlit app `app.py` located in the `frontend` folder, follow these steps:
+* Set up a virtual environment (Preferred)
+```bash
+python -m venv argus
+```
+Once you’ve created a virtual environment, you may activate it.
+On Windows, run:
+```bash
+argus\Scripts\activate
+```
+On Unix or MacOS, run:
+```bash
+source argus/bin/activate
+```
+To deactivate :
+```bash
+deactivate
+```
+> More information about virtual environments can be found [here](https://docs.python.org/3/tutorial/venv.html)
 
 1. Install the required dependencies by running the following command in your terminal:
    ```sh
    pip install -r frontend/requirements.txt
diff --git a/frontend/.env.temp b/frontend/.env.temp
index 0b9b2d1..bafa535 100644
--- a/frontend/.env.temp
+++ b/frontend/.env.temp
@@ -1,6 +1,6 @@
-BLOB_ACCOUNT_URL=""
+BLOB_ACCOUNT_URL="add here" # an example "https://sai664qk9lslfvbi.blob.core.windows.net/"
 CONTAINER_NAME="datasets"
-COSMOS_URL=""
+COSMOS_URL="add here" # an example "https://cbi44qk5lxyzvbi.documents.azure.com"
 COSMOS_DB_NAME="doc-extracts"
 COSMOS_DOCUMENTS_CONTAINER_NAME="documents"
-COSMOS_CONFIG_CONTAINER_NAME="configuration"
\ No newline at end of file
+COSMOS_CONFIG_CONTAINER_NAME="configuration"
diff --git a/infra/main.bicep b/infra/main.bicep
index daa408e..fa2c76d 100644
--- a/infra/main.bicep
+++ b/infra/main.bicep
@@ -21,17 +21,15 @@ param functionAppName string = 'fa${uniqueString(resourceGroup().id)}'
 
 param appServicePlanName string = '${functionAppName}-plan'
 
-// Define the Document Intelligence resource name
-param documentIntelligenceName string = 'di${uniqueString(resourceGroup().id)}'
+// Define the name of the Azure OpenAI resource name
+param azureOpenaiResourceName string = 'arg-aoai'
+// Define the name of the Azure OpenAI model name
+param azureOpenaiDeploymentName string = 'gpt-4o'
+// Define the Document Intelligence resource name
+param documentIntelligenceName string = 'di${uniqueString(resourceGroup().id)}'
 
 
 
-// Define the Azure OpenAI parameters
-@secure()
-param azureOpenaiEndpoint string
-@secure()
-param azureOpenaiKey string
-param azureOpenaiModelDeploymentName string
 
 param timestamp string = utcNow('yyyy-MM-ddTHH:mm:ssZ')
 var sanitizedTimestamp = replace(replace(timestamp, '-', ''), ':', '')
@@ -185,6 +183,42 @@ resource appServicePlan 'Microsoft.Web/serverfarms@2021-03-01' = {
   tags: commonTags
 }
 
+//Define the OpenAI resource
+resource openai 'Microsoft.CognitiveServices/accounts@2023-05-01' = {
+  name: azureOpenaiResourceName
+  location: location
+
+  sku: {
+    name: 'S0'
+  }
+  kind: 'OpenAI'
+  properties: {
+
+  }
+
+}
+// Define the OpenAI deployment
+resource openaideployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = {
+  name: azureOpenaiDeploymentName
+  sku: {
+    name: 'GlobalStandard'
+    capacity: 30
+  }
+  parent: openai
+  properties: {
+    model: {
+      name: 'gpt-4o'
+      format: 'OpenAI'
+      version: '2024-05-13'
+
+    }
+    raiPolicyName: 'Microsoft.Default'
+    versionUpgradeOption: 'OnceCurrentVersionExpired'
+
+  }
+}
+
+
 // Define the Document Intelligence resource
 resource documentIntelligence 'Microsoft.CognitiveServices/accounts@2021-04-30' = {
   name: documentIntelligenceName
@@ -276,15 +310,15 @@ resource functionApp 'Microsoft.Web/sites@2021-03-01' = {
         }
         {
           name: 'AZURE_OPENAI_ENDPOINT'
-          value: azureOpenaiEndpoint
+          value: '${openai.properties.endpoint}openai/deployments/gpt-4o/chat/completions?api-version=2024-02-15-preview'
         }
         {
           name: 'AZURE_OPENAI_KEY'
-          value: azureOpenaiKey
+          value: listKeys(openai.id, '2024-10-01').key1
         }
         {
           name: 'AZURE_OPENAI_MODEL_DEPLOYMENT_NAME'
-          value: azureOpenaiModelDeploymentName
+          value: openaideployment.name
         }
         {
           name: 'FUNCTIONS_WORKER_PROCESS_COUNT'
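
The function app settings in the last hunk build `AZURE_OPENAI_ENDPOINT` from the resource endpoint, a literal `gpt-4o` path segment, and an API version, while `AZURE_OPENAI_MODEL_DEPLOYMENT_NAME` is taken from the deployment resource. A minimal Python sketch of that composition, with a consistency check between the two settings, might look like the following. The helper names and the example base endpoint are hypothetical and not part of the repository; only the setting names and the API version string are taken from the diff.

```python
from urllib.parse import urlsplit

# Same API version string as the AZURE_OPENAI_ENDPOINT app setting in the diff.
API_VERSION = "2024-02-15-preview"


def build_chat_completions_url(base_endpoint, deployment, api_version=API_VERSION):
    """Compose the URL the same way the Bicep string interpolation does."""
    base = base_endpoint.rstrip("/") + "/"  # tolerate a missing trailing slash
    return f"{base}openai/deployments/{deployment}/chat/completions?api-version={api_version}"


def deployment_in_url(url):
    """Return the path segment that follows 'deployments', or None if absent."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    if "deployments" in parts:
        i = parts.index("deployments")
        if i + 1 < len(parts):
            return parts[i + 1]
    return None


def check_settings(env):
    """Return a list of problems found in the three OpenAI app settings."""
    problems = []
    if not env.get("AZURE_OPENAI_KEY"):
        problems.append("AZURE_OPENAI_KEY is empty")
    deployment = env.get("AZURE_OPENAI_MODEL_DEPLOYMENT_NAME", "")
    in_url = deployment_in_url(env.get("AZURE_OPENAI_ENDPOINT", ""))
    if in_url is None:
        problems.append("AZURE_OPENAI_ENDPOINT has no deployment segment")
    elif in_url != deployment:
        problems.append(
            f"endpoint targets deployment '{in_url}' but the deployment setting is '{deployment}'"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical base endpoint, used only for illustration.
    url = build_chat_completions_url("https://example.openai.azure.com/", "gpt-4o")
    print(url)
    print(check_settings({
        "AZURE_OPENAI_ENDPOINT": url,
        "AZURE_OPENAI_KEY": "placeholder",
        "AZURE_OPENAI_MODEL_DEPLOYMENT_NAME": "gpt-4o",
    }))
```

With the defaults in this patch the two settings agree. If `azureOpenaiDeploymentName` were overridden at deployment time, the literal `gpt-4o` in the endpoint would no longer match it, which is the case `check_settings` is meant to surface.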