Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Intelligent document processing v8.7-alpha5 (#4779)
* Initial setup
* Initial setup of IDP structure
* Add document extraction
* Add initial content
* Edits to unstructured extraction
* Add document automation
* Self-Managed
* Edits
* Wording edits
* Storage limit
* Add reference items
* TW edits
* Edits
* Add image
* TW edits
* Meta and text edits
* Add architecture diagram
* Add architecture content
* TW edits
* TW edits
* TW edits
* Configure IDP section
* TW edits
* Update screenshots
* TW edits
* TW edits
* TW edits
* Unstructured data extraction pages
* TW edits
* Changes to remove document automation and structured extraction
* Additional TW edits
* TW edits
* Integrate IDP edits
* TW edits
* TW edits
* Add Configuration page
* Configure page edits
* Config edits
* Config changes
* Fix link format
* Rename to document template from project and add manual anchor headings
* Add versions
* Add versions
* Version edits
* Version edits
* TW edits
* Update config guide
* Add document storage and integration parameters
* TW edit
* Add extraction models
* Add note about SM SC and remove references
* Final edits
* cluster note
* Add versions detail
* TW edit
* Add review edits
* Backport to 8.7
* Add keywords meta
* Edits following Yana review
1 parent a675a14 commit ce7c9bd
Showing 76 changed files with 1,558 additions and 2 deletions.
45 changes: 45 additions & 0 deletions
docs/components/modeler/web-modeler/idp/idp-applications.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
--- | ||
id: idp-applications | ||
title: IDP applications | ||
description: "Create and manage your intelligent document processing document extraction templates in an IDP application folder." | ||
--- | ||
|
||
import IdpApplicationImg from './img/idp-application.png'; | ||
import IdpApplicationModalImg from './img/idp-application-modal.png'; | ||
|
||
Create and manage your IDP document extraction templates in an **IDP application**. | ||
|
||
<img src={IdpApplicationImg} alt="IDP application screen" style={{marginTop: '0'}} /> | ||
|
||
## Create an IDP application | ||
|
||
To create an IDP application: | ||
|
||
1. In a Web Modeler project, select **Create new** > **IDP application** to open the **Create an IDP application** modal. | ||
<img src={IdpApplicationModalImg} alt="Create an IDP application modal" width="550px" style={{marginTop: '0'}} /> | ||
- **Name**: Enter a name for the IDP application. | ||
- **Select a cluster**: Select the cluster you want to use for modeling and testing your document extraction. | ||
1. Click **Create** to create the IDP application. | ||
|
||
1. You can now create [document extraction](idp-document-extraction.md) templates inside your IDP application folder. | ||
|
||
<!-- 1. You can now create [document extraction](idp-document-extraction.md) and [document automation](idp-document-automation.md) projects inside your IDP application folder. --> | ||
|
||
:::note | ||
|
||
- Camunda recommends using a development (dev) cluster for your IDP applications. | ||
- You must [configure the required connector secrets](idp-configuration.md#configure-idp) on the selected cluster. | ||
- You cannot change the selected cluster for the IDP application once it has been created. | ||
|
||
::: | ||
|
||
## IDP application clusters | ||
|
||
The following requirements and limitations apply to IDP application clusters: | ||
|
||
| Requirement/limitation | Description | | ||
| :--------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| Connector secrets | You must [configure the required connector secrets](idp-configuration.md#configure-idp) on the selected cluster. | | ||
| Cluster health | IDP applications and projects are only fully operational when linked to a healthy, active cluster. You can select an unstable or unhealthy cluster when first creating an IDP application, and change to a stable cluster once available. | | ||
| Document handling | When creating an IDP application folder, you can only select a cluster that supports [Camunda document handling](/components/concepts/document-handling.md). For example, the cluster must be version 8.7 or higher. | | ||
| Changing cluster | You cannot change the selected cluster once the IDP application has been created. | |
41 changes: 41 additions & 0 deletions
docs/components/modeler/web-modeler/idp/idp-configuration.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
--- | ||
id: idp-configuration | ||
title: Configure IDP | ||
description: "Set up and configure intelligent document processing (IDP) in Camunda 8 SaaS and Self-Managed." | ||
--- | ||
|
||
import IdpSecretsImg from './img/idp-connector-secrets.png'; | ||
import TickImg from '/static/img/icon-list-tick.png'; | ||
import CrossImg from '/static/img/icon-list-cross.png'; | ||
|
||
Configure IDP for your Camunda 8 setup and make sure IDP can access the required components and credentials. | ||
|
||
:::note | ||
IDP is only supported for Camunda 8 SaaS with the 8.7.0-alpha5 release. Support for Camunda 8 Self-Managed and Camunda 8 Run is planned for delivery with the 8.7 release. | ||
::: | ||
|
||
## Prerequisites | ||
|
||
The following prerequisites are required for IDP: | ||
|
||
| Prerequisite | Description | | ||
| :------------------------ | :---------- | | ||
| Amazon Web Services (AWS) | <ul><li><p>Create a valid [AWS Identity and Access Management (IAM) user](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users.html) with permissions configured to allow access to Amazon Bedrock, Amazon S3, and Amazon Textract (for example, `AmazonBedrockFullAccess`).</p></li><li><p>Obtain and store the [access key pair](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html) (_access key_ and _secret access key_) for this IAM user. These are required during IDP configuration.</p></li><li><p>Create an [Amazon S3 bucket](https://aws.amazon.com/s3/) named `idp-extraction-connector` that IDP can use for document storage during document analysis and extraction.</p></li></ul> | | ||
| Web Modeler | <ul><li><p>Web Modeler is required to create, manage, publish, and integrate [IDP applications](idp-applications.md) and [document extraction](idp-document-extraction.md) templates.</p></li><li><p>IDP does not support Desktop Modeler.</p></li></ul> | | ||
|
||
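As an illustration of the AWS prerequisite, the following minimal sketch builds an IAM policy document scoped to the three services named in the table. The action list and the helper name `build_idp_policy` are assumptions made for this example, not a permission set confirmed for IDP. Check the permissions your setup actually needs before using anything like this.

```python
import json

# Bucket name taken from the prerequisites table.
BUCKET_NAME = "idp-extraction-connector"


def build_idp_policy(bucket_name: str) -> dict:
    """Return an IAM policy document (hypothetical helper, illustrative action list)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Assumed: model invocation is the only Bedrock action needed.
                "Sid": "BedrockInvoke",
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel"],
                "Resource": "*",
            },
            {
                # Assumed: synchronous and asynchronous Textract analysis.
                "Sid": "TextractAnalysis",
                "Effect": "Allow",
                "Action": [
                    "textract:AnalyzeDocument",
                    "textract:StartDocumentAnalysis",
                    "textract:GetDocumentAnalysis",
                ],
                "Resource": "*",
            },
            {
                # Object-level access limited to the document storage bucket.
                "Sid": "S3Objects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


policy = build_idp_policy(BUCKET_NAME)
print(json.dumps(policy, indent=2))
```

The printed JSON could be pasted into the IAM policy editor. A managed policy such as `AmazonBedrockFullAccess` is broader but may be simpler to start with.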
## Configure IDP | ||
|
||
Once you have completed all the required prerequisites, configure IDP in a `dev` cluster as follows: | ||
|
||
### Add AWS connector secrets to cluster {#aws-secrets} | ||
|
||
Add your AWS IAM user _access key_ and _secret access key_ as [connector secrets](/components/console/manage-clusters/manage-secrets.md) to the cluster, using the following names: | ||
|
||
- _Access key_: `IDP_AWS_ACCESSKEY` | ||
- _Secret key_: `IDP_AWS_SECRETKEY` | ||
|
||
<img src={IdpSecretsImg} alt="Connector secrets" style={{width: '750px'}} /> | ||
|
||
:::note | ||
You can rename these connector secrets, for example if you use a different bucket in other environments (such as `test`, `stage`, or `prod`). If you do, you must also update the secret names to match in the **Authentication** section of the properties panel for any related published document extraction templates. | ||
::: |
35 changes: 35 additions & 0 deletions
docs/components/modeler/web-modeler/idp/idp-document-automation.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
--- | ||
id: idp-document-automation | ||
title: Document automation | ||
description: "Document automation allows you to extract data from complex documents based on one or more linked document extraction projects." | ||
--- | ||
|
||
Extract data from complex documents based on one or more linked [document extraction](idp-document-extraction.md) projects. | ||
|
||
:::note | ||
Document automation is in development for a future release. | ||
::: | ||
|
||
## About document automation | ||
|
||
Document automation allows you to automatically extract data from complex PDF documents. | ||
|
||
For example, if you want to process large multi-page PDFs containing multiple document types (invoices, reports, forms), you can create a document automation project to extract the specific data you want. | ||
|
||
- You must link at least one [document extraction](idp-document-extraction.md) project for the LLM to accurately analyze, classify, and extract data. | ||
- Choose and test different LLMs to find the model that best suits your budget and accuracy requirements. | ||
- Document classification automatically categorizes documents into predefined classes/types, based on their content. | ||
|
||
<!-- ## Create document automation project | ||
To create a document automation project: | ||
Content... | ||
## Document automation steps | ||
Complete the following steps to configure and publish an unstructured data extraction project: | ||
1. [Step 1: Add projects]: Link one or more document extraction projects to help the system analyze and categorize your documents. | ||
1. [Step 2: Test classification] Select an LLM and test the document classification results. | ||
1. [Step 3: Publish]: Publish the project to make it available to use in your document processing BPMN diagrams. --> |
47 changes: 47 additions & 0 deletions
docs/components/modeler/web-modeler/idp/idp-document-extraction.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
--- | ||
id: idp-document-extraction | ||
title: Document extraction | ||
description: "Document extraction templates form the basis for using intelligent document processing (IDP) in your end-to-end processes. Extract data from a single type of structured or unstructured document." | ||
--- | ||
|
||
import IdpExtractionProjectModalImg from './img/idp-create-extraction-project-modal.png'; | ||
|
||
Extract data from a single type of structured or unstructured document. | ||
|
||
## About document extraction | ||
|
||
Document extraction templates form the basis for using IDP in your end-to-end processes. | ||
|
||
- Create a separate document extraction template for each type of document you want to categorize and extract data from, such as an invoice, a report, or an identity document. | ||
- Published document extraction templates can then be used to [integrate IDP into your processes](idp-integrate.md). | ||
<!-- - Published extraction projects can be [integrated into your processes](idp-integrate.md) or linked to a [document automation](idp-document-automation.md) project. --> | ||
|
||
## Create document extraction template | ||
|
||
To create a new document extraction template: | ||
|
||
1. In your [IDP application](idp-applications.md), select **Create document extraction template** or **Create new** > **Document extraction**. | ||
1. Configure the general document extraction template information and select the extraction method. | ||
<img src={IdpExtractionProjectModalImg} alt="Create a document extraction template modal" width="600px" style={{marginTop: '0'}} /> | ||
|
||
- **Name**: Enter a descriptive name for the type of document, for example “Invoice type A”. | ||
- **Description**: Enter a description to provide more detailed information about the document type. | ||
- **Extraction method**: Select **Unstructured data extraction**. | ||
|
||
<!-- - **Extraction method**: Select an extraction method: | ||
- **Form extraction**: Select this method to extract data from structured documents. | ||
- **Unstructured data extraction**: Select this method to extract data from unstructured documents. --> | ||
|
||
1. Click **Create** to create and open the new document extraction template. | ||
1. Configure the document extraction template to [extract unstructured data](idp-unstructured-extraction.md). | ||
|
||
<!-- - [Extract structured data](idp-structured-extraction.md): Configure and publish a structured data extraction project. | ||
- [unstructured data extraction project](idp-unstructured-extraction.md): Configure and publish an unstructured data extraction project. --> | ||
|
||
:::note | ||
The **Form Extraction** extraction method is currently in development for a future release. This method will allow you to create a document extraction template for extracting data from structured document types. | ||
::: | ||
|
||
<!-- :::tip | ||
Not sure which extraction method to use? See [structured and unstructured documents](idp-key-concepts.md#structured-and-unstructured-documents) to help determine what type of document(s) you will be processing. | ||
::: --> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
--- | ||
id: idp-integrate | ||
title: Integrate IDP into your processes | ||
description: "Integrate intelligent document processing (IDP) into your end-to-end processes in Web Modeler." | ||
--- | ||
|
||
import IdpElementImg from './img/idp-diagram-element.png'; | ||
|
||
Integrate your published document extraction templates into your end-to-end processes in Web Modeler. | ||
|
||
## Create and configure an IDP task | ||
|
||
You can apply a published document extraction template to a task or event via the append menu. For example: | ||
|
||
<img src={IdpElementImg} alt="An overview of intelligent document processing" style={{border: 'none', padding: '0', marginTop: '0', backgroundColor: 'transparent'}} /> | ||
|
||
- **From the canvas:** Select an element and click the **Change element** icon to change an existing element, or use the append feature to add a new element to the diagram. | ||
- **From the properties panel:** Navigate to the **Template** section and click **Select**. | ||
- **From the side palette:** Click the **Create element** icon. | ||
|
||
You can then configure the document extraction template in the properties panel, via the following sections: | ||
|
||
## Input message data | ||
|
||
### Document | ||
|
||
Specify the document object variable used for document handling, provided as a [FEEL expression](/components/modeler/feel/what-is-feel.md) with the document reference. | ||
|
||
Example: `document[1]`. | ||
|
||
### AWS S3 Bucket name | ||
|
||
Specify the name of the Amazon S3 bucket where documents can be temporarily stored during Amazon Textract analysis. | ||
|
||
Example: `idp-extraction-connector` (the Amazon S3 bucket used for document storage during extraction). | ||
|
||
## Output mapping | ||
|
||
Specify the process variables into which you want to map the IDP extraction connector response. | ||
|
||
:::info | ||
To learn more about output mapping, see [variable/response mapping](/components/connectors/use-connectors/index.md#variableresponse-mapping). | ||
::: | ||
|
||
### Result variable | ||
|
||
You can export the complete IDP extraction connector response (for example, the key-value pairs extracted from the document) into a dedicated variable that you can then access anywhere in a process. To do this, enter a unique variable name in the **Result variable** field. | ||
|
||
Example: `IDPResult` | ||
|
||
### Result expression | ||
|
||
In addition, you can choose to unpack the content of the response into multiple process variables using the **Result expression** field, as a [FEEL Context Expression](/components/concepts/expressions.md). | ||
|
||
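The difference between the **Result variable** and **Result expression** fields can be sketched as follows. This is a Python analogy rather than FEEL, and the response shape (`extractedFields`, `invoiceNumber`, `totalAmount`) is hypothetical. The actual connector response may be structured differently.

```python
# Hypothetical connector response; the field names are illustrative only.
response = {
    "extractedFields": {"invoiceNumber": "INV-001", "totalAmount": "120.50"}
}

# Result variable: the complete response is stored under a single name.
process_variables = {"IDPResult": response}

# Result expression: selected values are unpacked into separate process
# variables, similar in spirit to a FEEL context that maps names to values.
fields = response["extractedFields"]
process_variables.update(
    {
        "invoiceNumber": fields["invoiceNumber"],
        "totalAmount": fields["totalAmount"],
    }
)

print(sorted(process_variables))
```

In this sketch both fields are used together: `IDPResult` keeps the full response for later inspection, while the unpacked variables are convenient for gateway conditions and downstream tasks.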
## Error handling | ||
|
||
If an error occurs, the IDP extraction connector throws an error and includes the error response in the error variable in Operate. | ||
|
||
### Error expression | ||
|
||
You can handle an IDP extraction connector error using an Error Boundary Event and [error expressions](/components/connectors/use-connectors/index.md#error-expression). | ||
|
||
## Retries | ||
|
||
### Retries | ||
|
||
Specify the number of [retries](/components/connectors/use-connectors/outbound.md#retries), that is, how many times the IDP extraction connector repeats execution if it fails. | ||
|
||
### Retry backoff | ||
|
||
Specify a custom **Retry backoff** interval between retries instead of the default behavior of retrying immediately. | ||
|
||
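To illustrate how the **Retries** and **Retry backoff** settings interact, here is a minimal Python sketch of retry-with-backoff behavior. It is a conceptual model only, not the connector runtime's implementation. The names `run_with_retries` and `flaky_job`, and the choice to treat the retries value as the total number of attempts, are assumptions made for this example.

```python
import time


def run_with_retries(job, retries: int, backoff_seconds: float = 0.0, sleep=time.sleep):
    """Call job() up to `retries` times, waiting `backoff_seconds` between attempts."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return job()
        except Exception as error:  # a real runtime would distinguish error types
            last_error = error
            if attempt < retries:
                sleep(backoff_seconds)  # 0.0 models retrying immediately
    raise RuntimeError(f"failed after {retries} attempts") from last_error


# Hypothetical job that fails twice, then succeeds on the third attempt.
calls = {"count": 0}


def flaky_job():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("temporary failure")
    return "extracted"


# Record the waits instead of actually sleeping, so the sketch runs instantly.
waits = []
result = run_with_retries(flaky_job, retries=3, backoff_seconds=5.0, sleep=waits.append)
print(result, waits)
```

With a backoff of zero the attempts follow each other immediately, which can be unhelpful when the failure is caused by a temporarily unavailable or rate-limited service. A non-zero backoff gives the service time to recover.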
## Execution listeners | ||
|
||
Add and manage [execution listeners](/components/concepts/execution-listeners.md) to allow users to react to events in the workflow execution lifecycle by executing custom logic. |