page_type | languages | products | name | description |
---|---|---|---|---|
sample | | | Document Processing with Azure AI Samples | This collection of samples demonstrates how to use various Azure AI capabilities to build solutions to extract structured data, classify, and analyze documents. |
This repository contains a collection of code samples that demonstrate how to use various Azure AI capabilities to process documents.
The samples are intended to help engineering teams establish techniques with Azure AI Foundry, Azure OpenAI, and Azure Document Intelligence to build solutions to extract structured data, classify, and analyze documents.
The techniques demonstrated take advantage of various capabilities from each service to:
- Reduce complexity of custom model training by taking advantage of the capabilities of Generative AI models to analyze and classify documents.
- Improve reliability in document processing by combining AI service capabilities to extract structured data from any document type, with high accuracy and confidence.
- Simplify document processing workflows by providing reusable code and patterns that can be easily modified and evaluated for most use cases.
Note
All data extraction samples provide both an accuracy and confidence score for the extracted data. The accuracy score is calculated based on the similarity between the extracted data and the ground truth data. The confidence score is calculated based on OCR analysis confidence and logprobs in Azure OpenAI requests.
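As a rough illustration of how the two scores described in the note could be computed, the sketch below uses only the Python standard library. It is not the repository's implementation: the helper names (`field_accuracy`, `logprob_confidence`, `combined_confidence`), the use of `difflib.SequenceMatcher` as the similarity measure, and taking the minimum of the OCR and model confidence are all assumptions made for illustration.

```python
import math
from difflib import SequenceMatcher


def field_accuracy(extracted: str, expected: str) -> float:
    """Similarity in [0, 1] between an extracted value and its ground truth.

    Assumption: case and surrounding whitespace are ignored before comparing.
    """
    a = extracted.strip().lower()
    b = expected.strip().lower()
    return SequenceMatcher(None, a, b).ratio()


def logprob_confidence(token_logprobs: list[float]) -> float:
    """Average per-token probability recovered from log probabilities."""
    if not token_logprobs:
        return 0.0
    return sum(math.exp(lp) for lp in token_logprobs) / len(token_logprobs)


def combined_confidence(ocr_confidence: float, token_logprobs: list[float]) -> float:
    """Conservative combination (assumed): the weaker of the two signals."""
    return min(ocr_confidence, logprob_confidence(token_logprobs))


# Hypothetical values: an invoice number read by OCR with 0.98 confidence,
# and three token log probabilities returned alongside a model response.
accuracy = field_accuracy("INV-1001", "inv-1001")
confidence = combined_confidence(0.98, [-0.01, -0.05, -0.02])
print(f"accuracy={accuracy:.2f} confidence={confidence:.2f}")
```

Taking the minimum keeps the combined score from exceeding either underlying signal; a weighted average or a product would be equally reasonable choices depending on how conservative the score should be.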
Sample | Description | Example Use Cases |
---|---|---|
Data Extraction - Azure AI Document Intelligence + Azure OpenAI GPT-4o | Demonstrates how to use Azure AI Document Intelligence pre-built layout and Azure OpenAI GPT models to extract structured data from documents. | Predominantly text-based documents such as invoices, receipts, and forms. |
Data Extraction - Azure AI Document Intelligence + Phi-3.5 MoE | Demonstrates how to use Azure AI Document Intelligence pre-built layout and Microsoft's Phi-3.5 MoE model to extract structured data from documents. | Predominantly text-based documents such as invoices, receipts, and forms. |
Data Extraction - Azure OpenAI GPT-4o with Vision | Demonstrates how to use Azure OpenAI GPT-4o and GPT-4o-mini models to extract structured data from documents using their built-in vision capabilities. | Complex documents that mix text and images (diagrams, signatures, selection marks, etc.), such as reports and contracts. |
Data Extraction - Comprehensive Azure AI Document Intelligence + Azure OpenAI GPT-4o with Vision | Demonstrates how to improve the accuracy and confidence in extracting structured data from documents by combining Azure AI Document Intelligence and Azure OpenAI GPT-4o models with vision capabilities. | Any structured or unstructured document type. |
Classification - Azure OpenAI GPT-4o with Vision | Demonstrates how to use Azure OpenAI GPT-4o and GPT-4o-mini models to classify documents using their built-in vision capabilities. | Processing multiple document types or documents with varying purposes, such as contracts, legal documents, and emails. |
Classification - Azure AI Document Intelligence + Embeddings | Demonstrates how to use Azure AI Document Intelligence pre-built layout and embeddings models to classify documents based on their content. | Processing multiple document types or documents with varying purposes, such as contracts, legal documents, and emails. |
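
To make the embeddings-based classification row more concrete, here is a minimal sketch of the general idea: compare a document's embedding with one reference embedding per class and pick the closest. The function names and the tiny two-dimensional vectors are hypothetical stand-ins; real embeddings (for example from text-embedding-3-large) have many more dimensions, and the sample in the repository may use a different matching strategy.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(document_embedding: list[float],
             class_embeddings: dict[str, list[float]]) -> tuple[str, float]:
    """Return the label whose reference embedding is most similar, with its score."""
    if not class_embeddings:
        raise ValueError("at least one class embedding is required")
    scores = {
        label: cosine_similarity(document_embedding, reference)
        for label, reference in class_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores[best_label]


# Toy vectors standing in for real embeddings (assumed values).
references = {"invoice": [0.9, 0.1], "contract": [0.1, 0.9]}
label, score = classify([1.0, 0.0], references)
print(label, round(score, 3))
```

In practice a minimum similarity threshold is usually added so that documents unlike every known class are flagged for review instead of being forced into the nearest category.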
The sample repository comes with a Dev Container that contains all the necessary tools and dependencies to run the samples.
Important
An Azure subscription is required to run these samples. If you don't have an Azure subscription, create an account.
To use the Dev Container in GitHub Codespaces, follow these steps:
- Click on the Code button in the repository and select Codespaces.
- Click on the + button to create a new Codespace using the provided .devcontainer\devcontainer.json configuration.
- Once the Codespace is created, continue to the Azure environment setup section.
To use the Dev Container, you need to have the following tools installed on your local machine:
- Install Visual Studio Code
- Install Docker Desktop
- Install the Dev Containers extension (formerly Remote - Containers) for Visual Studio Code
To set up a local development environment, follow these steps:
Important
Ensure that Docker Desktop is running on your local machine.
- Clone the repository to your local machine.
- Open the repository in Visual Studio Code.
- Press F1 to open the command palette and type Dev Containers: Reopen in Container.
Once the Dev Container is up and running, continue to the Azure environment setup section.
Once the Dev Container is up and running, you can set up the necessary Azure services and run the samples in the repository by running the following commands in a pwsh terminal:
Note
For the best sample experience, run the samples in East US, which supports all the services used in the samples. Find out more about region availability for Azure AI Document Intelligence, and the GPT-4o, Phi-3.5 MoE, and text-embedding-3-large models.
az login
./Setup-Environment.ps1 -DeploymentName <UniqueDeploymentName> -Location <AzureRegion>
Note
If a specific Azure tenant is required, use the --tenant <TenantId> parameter in the az login command.
az login --tenant <TenantId>
Tip
If you want to preview the changes without deploying, you can add the -WhatIf parameter to the Setup-Environment.ps1 script.
./Setup-Environment.ps1 -DeploymentName <UniqueDeploymentName> -Location <AzureRegion> -WhatIf
The script will deploy the following resources to your Azure subscription:
- Azure AI Foundry Hub & Project, a development platform for building AI solutions that integrates with Azure AI Services in a secure manner using Microsoft Entra ID for authentication.
- Note: Phi-3.5 MoE will be deployed as a pay-as-you-go (PAYG) serverless endpoint in the Azure AI Foundry Project with its primary key stored in the associated Azure Key Vault.
- Azure AI Services, a single managed resource that provides access to Azure AI capabilities, including Azure OpenAI and Azure AI Document Intelligence.
- Note: GPT-4o and GPT-4o-mini will be deployed as Global Standard models with 10K TPM quota allocation. text-embedding-3-large will be deployed as a Standard model with 115K TPM quota allocation. These can be adjusted based on your quota availability in the main.bicep file.
- Azure Storage Account, required by Azure AI Foundry.
- Azure Monitor, used to store logs and traces for monitoring and troubleshooting purposes.
- Azure Container Registry, used to store container images for the Azure AI Foundry environment.
Note
All resources are secured by default with Microsoft Entra ID using Azure RBAC. Your user identity will be assigned the least-privilege roles needed to access the resources created. A user-assigned managed identity will also be deployed for the Azure AI Foundry environment.
After the script completes, you can run any of the samples in the repository by following their instructions.
You can contribute to the repository by opening an issue or submitting a pull request. For more information, see the Contributing guide.
This project is licensed under the MIT License.