Sunday, July 9, 2023

Hackathon Posts - Oracle AI Service: Document Understanding

 This is one of a few posts I am planning for use in upcoming hackathons.

We want to use the Oracle AI service Document Understanding. Personally, it seems to me like a nice combination of the Language and Vision services.

Oracle Cloud Infrastructure (OCI) Document Understanding is an AI service that enables developers to extract text, tables, and other key data from document files through APIs and command line interface tools. With OCI Document Understanding, you can automate tedious business processing tasks with prebuilt AI models and customize document extraction to fit your industry-specific needs. 

This is the documentation home, with most of the essentials.

 


Permissions

First, we need to gain access to the service. Access can be granted to all OCI users of the tenancy or to a specific group you belong to. For this we need to create a Policy that allows the use of ai-service-document-family in the tenancy.

    To create a Policy, open the burger menu, go to "Identity & Security" and select Policies. There, press "Create Policy". Enter the Name and Description of the Policy and turn on "Show manual editor" under Policy Builder. You now have 2 options: allow use for a specific group or for all users. Enter one of the following:
  •  allow group <group-name> to use ai-service-document-family in tenancy
  • allow any-user to use ai-service-document-family in tenancy 
 
For documents, we will probably also need access to the Object Storage where the source documents are located.
In a hackathon we might be careless and grant general access to Object Storage:
  • allow group <group_name> to use object-family in tenancy
 In real life we will probably restrict it to a specific compartment:
  • allow group <group_name> to use object-family in compartment <input_bucket_located_object_storage_compartment>
We are also required to have a policy for accessing the output location in Object Storage, in a specific compartment:
  • allow group <group_name> to manage object-family in compartment <output_bucket_located_object_storage_compartment>
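
As a small sketch, the statements above can be generated for a given group and pair of compartments, ready to paste into the manual editor. The function name and the sample group and compartment names are made up for illustration; only the statement wording comes from the policies listed above.

```python
def build_policy_statements(group, input_compartment, output_compartment):
    """Return the policy statements from this post, filled in for one group.

    All three arguments are plain names typed by the user; nothing here
    talks to OCI, it only formats the text for the manual policy editor.
    """
    return [
        # Access to the Document Understanding service itself.
        f"allow group {group} to use ai-service-document-family in tenancy",
        # Read access to the compartment holding the input bucket.
        f"allow group {group} to use object-family in compartment {input_compartment}",
        # Write access to the compartment holding the output bucket.
        f"allow group {group} to manage object-family in compartment {output_compartment}",
    ]


if __name__ == "__main__":
    # Hypothetical names, replace with your own group and compartments.
    for statement in build_policy_statements("hackathon-team", "docs-input", "docs-output"):
        print(statement)
```

This version uses the compartment-restricted form for the input bucket; for the careless hackathon variant, replace the second statement with the "in tenancy" one.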
 
If you plan to use the Oracle Cloud Shell later, you might want to do the same with "cloud-shell in tenancy". For example: allow any-user to use cloud-shell in tenancy
 

 Intro

Now we can access the AI Service - Document Understanding (under the Analytics & AI menu).
 


I really liked the way the actual product in OCI seems to be self-documenting:


On the service page you can see links to:

 




 

At the bottom of the page we see the Service capabilities:

  • Text extraction - Provides word-level and line-level text as well as the bounding box coordinates of where the text is located. Optionally, you can create a searchable PDF which embeds a transparent layer on top of a document image in PDF format to make it searchable by keywords.
  • Table extraction - Identifies tables and individual cells in order to extract content in tabular format.
  • Key value extraction - Identifies a predefined set of key fields from documents such as receipts, invoices, driver licenses, and passports. Note: some supported documents may be in limited availability (LA) only.
  • Document classification - Classifies documents into different types based on their visual appearance and high-level features, including invoice, receipt, bank statement, driver license, passport, tax form, and resume.
  • Custom key value extraction - Key value extraction model trained on your own labeled dataset for documents like domain-specific forms and intake documents.
  • Custom document classification - Document classification model trained on your own labeled dataset for industry-specific document types or granular types of one document.
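
When calling the service through the API instead of the console, each prebuilt capability is requested as a "feature". Below is a minimal sketch of turning the capability names above into such a feature list. The featureType values and the JSON shape are my assumption of what the API expects, so check them against the API reference before relying on them. The custom models are left out because they also need a reference to your trained model.

```python
import json

# Assumed mapping from the capability names in the list above to the
# feature type identifiers used in an analyze request. Verify these
# against the official API reference.
FEATURE_TYPES = {
    "text extraction": "TEXT_EXTRACTION",
    "table extraction": "TABLE_EXTRACTION",
    "key value extraction": "KEY_VALUE_EXTRACTION",
    "document classification": "DOCUMENT_CLASSIFICATION",
}


def build_features(capabilities):
    """Translate human-readable capability names into a feature list."""
    features = []
    for name in capabilities:
        key = name.strip().lower()
        if key not in FEATURE_TYPES:
            raise ValueError(f"Unknown capability: {name}")
        features.append({"featureType": FEATURE_TYPES[key]})
    return features


if __name__ == "__main__":
    # Ask for text and tables in a single request.
    print(json.dumps(build_features(["Text extraction", "Table extraction"]), indent=2))
```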


On the left side of the screen there are links for using the console UI:


 The sources can be the original demo files, your local files, or your Object Storage.
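
The same choice of source exists when you call the service from code: a local file is sent inline, while a file in a bucket is passed by reference. Here is a minimal sketch of both document descriptions. The field names (source, data, namespaceName, bucketName, objectName) are my assumption of the request format and should be verified against the API reference; the namespace, bucket and object names are hypothetical.

```python
import base64


def inline_document(data: bytes) -> dict:
    """Describe a local file: the content travels base64-encoded in the request."""
    return {
        "source": "INLINE",
        "data": base64.b64encode(data).decode("ascii"),
    }


def object_storage_document(namespace: str, bucket: str, object_name: str) -> dict:
    """Describe a file that already sits in an Object Storage bucket."""
    return {
        "source": "OBJECT_STORAGE",
        "namespaceName": namespace,
        "bucketName": bucket,
        "objectName": object_name,
    }


if __name__ == "__main__":
    # Hypothetical values, replace with a real file and bucket.
    print(inline_document(b"sample bytes of a scanned invoice"))
    print(object_storage_document("my-namespace", "docs-input", "invoice.pdf"))
```

The Object Storage variant is the one that needs the object-family policies from the Permissions section.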



 See also:

 

