This is one of a few posts planned for use in upcoming hackathons.
We want to use Oracle's Document Understanding AI service. Personally, it seems like a nice combination of the Language and Vision services.
Oracle Cloud Infrastructure (OCI) Document Understanding is an AI service that enables developers to extract text, tables, and other key data from document files through APIs and command line interface tools. With OCI Document Understanding, you can automate tedious business processing tasks with prebuilt AI models and customize document extraction to fit your industry-specific needs.
This is the documentation home, with most of the essentials.
Permissions
First, we need to gain access to the service. Access can be granted to all OCI users of the tenancy or to a specific group you belong to. For this, we need to create a policy that allows the use of ai-service-document-family in the tenancy, plus access to the Object Storage buckets used for input and output. Pick the statements that match your setup:
- allow group <group_name> to use ai-service-document-family in tenancy
- allow any-user to use ai-service-document-family in tenancy
- allow group <group_name> to use object-family in tenancy
- allow group <group_name> to use object-family in compartment <input_bucket_located_object_storage_compartment>
- allow group <group_name> to manage object-family in compartment <output_bucket_located_object_storage_compartment>
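If you set this up for several hackathon teams, it can help to generate the statements instead of typing them. Below is a minimal sketch that only formats the policy text shown above. The function name `document_understanding_policies` and the example group and compartment names are hypothetical, and the resulting statements still have to be pasted into a policy in the OCI console (or applied with your tool of choice).

```python
def document_understanding_policies(group, input_compartment, output_compartment):
    """Return the policy statements from this post for one group.

    Hypothetical helper: it only builds strings, it does not call OCI.
    """
    return [
        # Access to the Document Understanding service itself.
        f"allow group {group} to use ai-service-document-family in tenancy",
        # Read access to the bucket compartment that holds the input documents.
        f"allow group {group} to use object-family in compartment {input_compartment}",
        # Write access to the bucket compartment that receives the results.
        f"allow group {group} to manage object-family in compartment {output_compartment}",
    ]


if __name__ == "__main__":
    # Example names are placeholders, not real resources.
    for statement in document_understanding_policies(
        "hackathon-team", "docs-input", "docs-output"
    ):
        print(statement)
```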
Intro
I really liked how self-documented the actual product in OCI seems to be.
On the service page you can see links to:
- The service playlist here, including a 47-minute intro, a guide on how to set up policies, and a demo of key value labeling.
- A link to a self-paced workshop (GitHub) that seems similar to the one on Oracle LiveLabs here.
- Service Overview document
- Pretrained Document AI Models Documentation
- REST API documentation
- Python Sample code
At the bottom of the page we see the Service capabilities:
- Text extraction - Provides word-level and line-level text as well as the bounding box coordinates of where the text is located. Optionally, you can create a searchable PDF which embeds a transparent layer on top of a document image in PDF format to make it searchable by keywords.
- Table extraction - Identifies tables and individual cells in order to extract content in tabular format.
- Key value extraction - Identifies a predefined set of key fields from documents such as receipts, invoices, driver licenses, and passports. Note: some supported documents may be in limited availability (LA) only.
- Document classification - Classifies documents into different types based on their visual appearance and high-level features, including invoice, receipt, bank statement, driver license, passport, tax form, and resume.
- Custom key value extraction - Key value extraction model trained on your own labeled dataset for documents like domain-specific forms and intake documents.
- Custom document classification - Document classification model trained on your own labeled dataset for industry-specific document types or granular types of one document.
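To make the capability list more concrete, here is a rough sketch of how these capabilities might be combined into a single processor job request body. It uses only plain dictionaries and sends nothing to OCI. The field names and feature type strings (`TEXT_EXTRACTION`, `generateSearchablePdf`, `modelId` and so on) are written as I understand the REST API, so treat them as assumptions and check them against the REST API documentation linked above. The helper names, OCIDs, namespace, bucket names and the `results` prefix are hypothetical placeholders.

```python
import json

# Assumed mapping from the capabilities listed above to REST feature types.
FEATURE_TYPES = {
    "text": "TEXT_EXTRACTION",
    "table": "TABLE_EXTRACTION",
    "key_value": "KEY_VALUE_EXTRACTION",
    "classification": "DOCUMENT_CLASSIFICATION",
}


def build_features(capabilities, searchable_pdf=False, custom_models=None):
    """Build the feature list for the requested capabilities.

    custom_models optionally maps a capability name to the OCID of a
    custom-trained model (assumed to be passed as "modelId").
    """
    custom_models = custom_models or {}
    features = []
    for name in capabilities:
        feature = {"featureType": FEATURE_TYPES[name]}
        if name == "text" and searchable_pdf:
            # Optional searchable PDF output for text extraction.
            feature["generateSearchablePdf"] = True
        if name in custom_models:
            feature["modelId"] = custom_models[name]
        features.append(feature)
    return features


def build_processor_job(compartment_id, namespace, input_bucket, object_name,
                        output_bucket, capabilities, searchable_pdf=False,
                        custom_models=None):
    """Assemble an (assumed) request body that reads one object from
    Object Storage and writes results to an output bucket."""
    return {
        "compartmentId": compartment_id,
        "inputLocation": {
            "sourceType": "OBJECT_STORAGE_LOCATIONS",
            "objectLocations": [{
                "namespaceName": namespace,
                "bucketName": input_bucket,
                "objectName": object_name,
            }],
        },
        "outputLocation": {
            "namespaceName": namespace,
            "bucketName": output_bucket,
            "prefix": "results",  # hypothetical output prefix
        },
        "processorConfig": {
            "processorType": "GENERAL",
            "features": build_features(capabilities, searchable_pdf, custom_models),
        },
    }


if __name__ == "__main__":
    # All identifiers below are placeholders.
    job = build_processor_job(
        "ocid1.compartment.oc1..example", "my-namespace", "docs-input",
        "invoice.pdf", "docs-output", ["text", "table"], searchable_pdf=True,
    )
    print(json.dumps(job, indent=2))
```

Keeping the request as a plain dictionary makes it easy to inspect before wiring it into the Python SDK or a signed REST call, which is where the policies from the Permissions section come into play.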
On the left side of the screen there are links for using the console UI:
The sources can be the original demo files, your local files, or your Object Storage.