How to use Azure AI Document Intelligence to digitize historical documents
Starting point:
https://azure.microsoft.com/en-us/products/ai-services/ai-document-intelligence
Azure offers discounts for students
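As a rough starting point for the service linked above, here is a minimal sketch of sending one scanned page to the prebuilt read (OCR) model. It assumes the `azure-ai-formrecognizer` Python package, which these notes do not name; the endpoint, key and file path are placeholders you would supply, and `fake_result` is only a stand-in so the helper can be tried without an Azure account.

```python
from types import SimpleNamespace


def page_lines(result):
    """Return the recognized text lines of each page, in reading order."""
    return [[line.content for line in page.lines] for page in result.pages]


def analyze_scan(path, endpoint, key, model_id="prebuilt-read"):
    """Send one scanned file to the service and return the analysis result.

    Assumes `pip install azure-ai-formrecognizer`; endpoint and key come
    from your own Document Intelligence resource in the Azure portal.
    """
    # Imported here so page_lines() can be used without the SDK installed.
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    with open(path, "rb") as f:
        poller = client.begin_analyze_document(model_id, document=f)
        return poller.result()


# Stand-in for a real result, shaped like the pages/lines the SDK returns.
fake_result = SimpleNamespace(
    pages=[
        SimpleNamespace(
            lines=[
                SimpleNamespace(content="Baptised 3 May 1887"),
                SimpleNamespace(content="Parish of St. Mary"),
            ]
        )
    ]
)
print(page_lines(fake_result))
```

With real credentials, `page_lines(analyze_scan("scan.jpg", endpoint, key))` would give the same nested list for an actual scan. Passing the ID of a custom model as `model_id` should work the same way.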
How to use Document Intelligence Studio to train a custom extraction model:
https://www.microsoft.com/en-us/videoplayer/embed/RE5fX1c?postJsllMsg=true
Labelling:
https://www.microsoft.com/en-us/videoplayer/embed/RE5fZKB?postJsllMsg=true
Correct recognized values:
Train and Benchmark a custom Forms Recognizer using Forms Recognizer Studio - Microsoft Community Hub
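One way to decide which recognized values need manual correction is to flag fields with a low confidence score. The sketch below assumes a result object shaped like the one the `azure-ai-formrecognizer` SDK returns for a custom model (`documents`, each with a `fields` dict whose values carry `content` and `confidence`). The 0.8 threshold is an arbitrary assumption, and `sample` is made-up stand-in data.

```python
from types import SimpleNamespace


def fields_to_review(result, threshold=0.8):
    """List (doc_index, field_name, content, confidence) below the threshold.

    Fields with no confidence score at all are flagged as well.
    """
    flagged = []
    for i, doc in enumerate(result.documents):
        for name, field in doc.fields.items():
            conf = field.confidence
            if conf is None or conf < threshold:
                flagged.append((i, name, field.content, conf))
    return flagged


# Made-up stand-in for one analyzed document with two extracted fields.
sample = SimpleNamespace(
    documents=[
        SimpleNamespace(
            fields={
                "name": SimpleNamespace(content="Jon Smith", confidence=0.55),
                "year": SimpleNamespace(content="1887", confidence=0.97),
            }
        )
    ]
)
for item in fields_to_review(sample):
    print(item)
```

A sensible threshold depends on the collection; checking a few flagged and unflagged values by hand is the easiest way to tune it.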
Project sharing using Document Intelligence Studio:
https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/how-to-guides/project-share-custom-models?view=doc-intel-4.0.0#granted-access-and-permissions
Azure storage permission:
https://learn.microsoft.com/en-us/azure/role-based-access-control/role-assignments-portal?tabs=delegate-condition
Use LLMs to correct OCR errors:
https://twitter.com/marquezxavier/status/1607615809274400771?s=20&t=1N_M5eeoNVZszOD_jKzcLw
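A minimal sketch of how OCR correction with an LLM could be wired up. It does not reproduce the linked thread. `complete` is a hypothetical placeholder for whatever function sends a prompt to your model and returns its text reply; the prompt wording and the chunk size are assumptions too.

```python
def build_correction_prompt(ocr_text):
    """Wrap one chunk of OCR output in a conservative correction prompt."""
    return (
        "The following text was produced by OCR from a historical document. "
        "Fix obvious recognition errors only. Keep the original spelling, "
        "wording and line breaks wherever they are plausible.\n\n"
        "OCR text:\n" + ocr_text + "\n\nCorrected text:"
    )


def correct_ocr(ocr_text, complete, chunk_chars=2000):
    """Correct OCR text chunk by chunk; `complete` maps a prompt to a reply."""
    chunks, current, size = [], [], 0
    for line in ocr_text.splitlines():
        # Start a new chunk when adding this line would exceed the limit.
        if current and size + len(line) + 1 > chunk_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line) + 1
    if current:
        chunks.append("\n".join(current))
    return "\n".join(
        complete(build_correction_prompt(chunk)).strip() for chunk in chunks
    )


def fake_complete(prompt):
    """Stand-in for a real model call: fixes one known misreading."""
    body = prompt.split("OCR text:\n")[1].split("\n\nCorrected text:")[0]
    return body.replace("tbe", "the")


print(correct_ocr("tbe cat\ntbe dog", fake_complete, chunk_chars=8))
```

Splitting on line boundaries keeps each request small and makes it easy to compare corrected and original text line by line, which matters because a model may silently modernize historical spelling.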
Use LLMs to extract structured outputs from PDFs:
https://x.com/cameron_pfiffer/status/1854264341333418489?s=46&t=wd53aBgIFAKkLDoBZHD7Fw
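A minimal sketch of the structured-output idea, not the method from the linked post: ask the model for JSON with a fixed set of keys, then validate the reply before trusting it. The `Record` schema and its field names are hypothetical, and the reply string in the usage lines is made up; the text passed to the prompt would come from your PDF text extraction or OCR step.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Record:
    """Hypothetical schema for one entry in a historical register."""
    name: str
    date: str
    place: str


def build_extraction_prompt(text):
    """Ask for a single JSON object with exactly the Record keys."""
    keys = ", ".join(f.name for f in fields(Record))
    return (
        "Extract the following fields from the document text and answer with "
        f"a single JSON object with exactly these keys: {keys}. "
        "Use an empty string for anything not stated.\n\n" + text
    )


def parse_record(reply):
    """Parse a model reply into a Record, rejecting missing or extra keys."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object in reply")
    data = json.loads(reply[start:end + 1])
    expected = {f.name for f in fields(Record)}
    if set(data) != expected:
        raise ValueError(f"unexpected keys: {sorted(set(data) ^ expected)}")
    return Record(**{key: str(value) for key, value in data.items()})


# Made-up reply, as a model might return it with some surrounding chatter.
reply = 'Here you go: {"name": "Anna Berg", "date": "1902-05-01", "place": "Oslo"}'
print(parse_record(reply))
```

Validating the keys catches replies that drift from the schema, but it cannot tell whether the values are right, so spot checks against the source pages are still needed.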