AI & Automation | 7 min read | 2024-01-08

Processing 60+ Insurance Document Variants with Azure AI Document Intelligence

How we handled wildly inconsistent insurance certificates across 60+ format variants using Azure AI Document Intelligence custom models.

Azure AI, Document Intelligence, OCR, Automation

The Challenge

Every carrier and insurer uses a different insurance certificate format. At Corcentric we handled 60+ variants with different field positions, fonts, and table structures, arriving as both scanned and digital PDFs.

Manual entry was costing ~3 minutes per document. At volume, that's thousands of hours annually.

Azure AI Document Intelligence

We used two approaches:

1. Prebuilt Invoice Model (for standard e-invoices)

Azure's prebuilt invoice model (model ID prebuilt-invoice) covered ~70% of our documents out of the box.
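
As a rough illustration, a call to the prebuilt model with the Azure.AI.FormRecognizer .NET SDK might look like the sketch below. The class, method, and parameter names (PrebuiltInvoiceSketch, PrintInvoiceFieldsAsync, endpoint, apiKey, invoiceStream) are hypothetical. VendorName and InvoiceTotal are two fields from the prebuilt invoice schema, and running the code requires a live Document Intelligence resource.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;

public static class PrebuiltInvoiceSketch
{
    // endpoint, apiKey, and invoiceStream are supplied by the caller (hypothetical names).
    public static async Task PrintInvoiceFieldsAsync(string endpoint, string apiKey, Stream invoiceStream)
    {
        var client = new DocumentAnalysisClient(new Uri(endpoint), new AzureKeyCredential(apiKey));

        // Same call shape as for a custom model; only the model ID differs.
        AnalyzeDocumentOperation operation = await client.AnalyzeDocumentAsync(
            WaitUntil.Completed, "prebuilt-invoice", invoiceStream);

        foreach (AnalyzedDocument invoice in operation.Value.Documents)
        {
            // TryGetValue avoids an exception when the model did not return the field.
            if (invoice.Fields.TryGetValue("VendorName", out DocumentField vendor) &&
                vendor.FieldType == DocumentFieldType.String)
            {
                Console.WriteLine($"Vendor: {vendor.Value.AsString()} (confidence {vendor.Confidence})");
            }

            // InvoiceTotal is a currency field, so it is read with AsCurrency rather than AsDouble.
            if (invoice.Fields.TryGetValue("InvoiceTotal", out DocumentField total) &&
                total.FieldType == DocumentFieldType.Currency)
            {
                CurrencyValue amount = total.Value.AsCurrency();
                Console.WriteLine($"Total: {amount.Symbol}{amount.Amount} (confidence {total.Confidence})");
            }
        }
    }
}
```

Checking FieldType before calling an accessor keeps a mislabeled or unexpected field from throwing in the middle of a batch.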

2. Custom Model Training (for the other 30%)

For carrier-specific certificates, we trained custom extraction models:

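using System;
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;

// endpoint, apiKey, and insuranceCertStream are supplied by the caller.
var client = new DocumentAnalysisClient(
    new Uri(endpoint),
    new AzureKeyCredential(apiKey)
);

// Submit the document to the custom model and wait for the analysis to finish.
var operation = await client.AnalyzeDocumentAsync(
    WaitUntil.Completed,
    modelId: "insurance-cert-v3",
    document: insuranceCertStream
);

var result = operation.Value;

foreach (var document in result.Documents)
{
    // Field names and types are the ones labeled when the custom model was trained.
    // The indexer throws if a field is missing; the As* accessors throw on a type mismatch.
    var policyNumber = document.Fields["PolicyNumber"].Value.AsString();
    var coverageAmount = document.Fields["CoverageAmount"].Value.AsDouble();
    var effectiveDate = document.Fields["EffectiveDate"].Value.AsDate();

    // Persist the extracted values, keyed by policy number.
    await _insuranceRepository.UpsertAsync(new InsuranceCertificate
    {
        PolicyNumber = policyNumber,
        CoverageAmount = coverageAmount,
        EffectiveDate = effectiveDate
    });
}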

Handling Low-Confidence Extractions

We implemented a confidence-threshold routing system:

  • >90% confidence → auto-processed
  • 70–90% → flagged for human review in the dashboard
  • <70% → queued for manual entry with an AI-prefilled form
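
The routing rule above can be sketched as a small pure function. This is an illustrative sketch rather than the production implementation: the names Route, ConfidenceRouter, and Decide are hypothetical, and scoring a document by its weakest field is an assumption made for the example. Values of exactly 70% and 90% are treated as "review", which is one reading of the bands above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Route { AutoProcess, HumanReview, ManualEntry }

public static class ConfidenceRouter
{
    public const float AutoThreshold = 0.90f;   // above this: auto-processed
    public const float ReviewThreshold = 0.70f; // from this up to AutoThreshold: human review

    // Assumption: a document is only as trustworthy as its least confident field.
    // A missing (null) confidence is treated as 0, and so is an empty field list.
    public static Route Decide(IEnumerable<float?> fieldConfidences)
    {
        var scores = fieldConfidences.Select(c => c ?? 0f).ToList();
        var weakest = scores.Count == 0 ? 0f : scores.Min();

        if (weakest > AutoThreshold) return Route.AutoProcess;
        if (weakest >= ReviewThreshold) return Route.HumanReview;
        return Route.ManualEntry;
    }
}
```

With the SDK types from the extraction code, the input could be built as document.Fields.Values.Select(f => f.Confidence), since DocumentField.Confidence is a nullable float. Taking the minimum is deliberately conservative; an average or a per-field threshold would route fewer documents to review at the cost of letting a single weak field through.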

Results

  • 85% reduction in manual data entry hours
  • 60+ document variants handled by 4 custom models
  • Average processing time dropped from 3 minutes to 8 seconds per document
  • ROI achieved within 3 months of deployment