Intelligent document processing v8.7-alpha5 (#4779)
* Initial setup

* Initial setup of IDP structure

* Add document extraction

* Add initial content

* Edits to unstructured extraction

* Add document automation

* Self-Managed

* Edits

* Wording edits

* Storage limit

* Add reference items

* TW edits

* Edits

* Add image

* TW edits

* Meta and text edits

* Add architecture diagram

* Add architecture content

* TW edits

* TW edits

* TW edits

* Configure IDP section

* TW edits

* Update screenshots

* TW edits

* TW edits

* TW edits

* Unstructured data extraction pages

* TW edits

* Changes to remove document automation and structured extraction

* Additional TW edits

* TW edits

* Integrate IDP edits

* TW edits

* TW edits

* Add Configuration page

* Configure page edits

* Config edits

* Config changes

* Fix link format

* Rename to document template from project and add manual anchor headings

* Add versions

* Add versions

* Version edits

* Version edits

* TW edits

* Update config guide

* Add document storage and integration parameters

* TW edit

* Add extraction models

* Add note about SM SC and remove references

* Final edits

* cluster note

* Add versions detail

* TW edit

* Add review edits

* Backport to 8.7

* Add keywords meta

* Edits following Yana review
mesellings authored Mar 7, 2025
1 parent a675a14 commit ce7c9bd
Showing 76 changed files with 1,558 additions and 2 deletions.
@@ -1,7 +1,7 @@
---
id: external-sso
title: Connect your IdP with Camunda
-keywords: [SSO, IDP, AzureAD, SAML]
+keywords: [SSO, AzureAD, SAML]
description: "For enterprise customers, we support integrating external identity providers."
---

45 changes: 45 additions & 0 deletions docs/components/modeler/web-modeler/idp/idp-applications.md
@@ -0,0 +1,45 @@
---
id: idp-applications
title: IDP applications
description: "Create and manage your intelligent document processing document extraction templates in an IDP application folder."
---

import IdpApplicationImg from './img/idp-application.png';
import IdpApplicationModalImg from './img/idp-application-modal.png';

Create and manage your IDP document extraction templates in an **IDP application**.

<img src={IdpApplicationImg} alt="IDP application screen" style={{marginTop: '0'}} />

## Create an IDP application

To create an IDP application:

1. In a Web Modeler project, select **Create new** > **IDP application** to open the **Create an IDP application** modal.
<img src={IdpApplicationModalImg} alt="Create an IDP application modal" width="550px" style={{marginTop: '0'}} />
- **Name**: Enter a name for the IDP application.
- **Select a cluster**: Select the cluster you want to use for modeling and testing your document extraction.
1. Click **Create** to create the IDP application.

1. You can now create [document extraction](idp-document-extraction.md) templates inside your IDP application folder.

<!-- 1. You can now create [document extraction](idp-document-extraction.md) and [document automation](idp-document-automation.md) projects inside your IDP application folder. -->

:::note

- Camunda recommends using a development (dev) cluster for your IDP applications.
- You must [configure the required connector secrets](idp-configuration.md#configure-idp) on the selected cluster.
- You cannot change the selected cluster for the IDP application once it has been created.

:::

## IDP application clusters

The following requirements and limitations apply to IDP application clusters:

| Requirement/limitation | Description |
| :--------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Connector secrets | You must [configure the required connector secrets](idp-configuration.md#configure-idp) on the selected cluster. |
| Cluster health | IDP applications and projects are only fully operational when linked to a healthy, active cluster. You can select an unstable or unhealthy cluster when first creating an IDP application, and change to a stable cluster once available. |
| Document handling | When creating an IDP application folder, you can only select a cluster that supports [Camunda document handling](/components/concepts/document-handling.md). For example, the cluster must be version 8.7 or higher. |
| Changing cluster | You cannot change the selected cluster once the IDP application has been created. |
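
The version requirement in the table above can be illustrated with a small check. This is a minimal sketch only: the function name `supports_document_handling` is hypothetical, and ignoring pre-release suffixes such as `-alpha5` is an assumption of the sketch, not documented Camunda behavior.

```python
import re

# From the table above: document handling needs cluster version 8.7 or higher.
MIN_VERSION = (8, 7)


def supports_document_handling(cluster_version: str) -> bool:
    """Return True if the major.minor part of the version is at least 8.7.

    Pre-release suffixes (for example "-alpha5") are ignored, which is an
    assumption made for this illustration.
    """
    match = re.match(r"^(\d+)\.(\d+)", cluster_version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {cluster_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison handles cases such as 8.10 being newer than 8.7.
    return (major, minor) >= MIN_VERSION
```

Comparing `(major, minor)` tuples rather than strings avoids treating `8.10` as older than `8.7`.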
41 changes: 41 additions & 0 deletions docs/components/modeler/web-modeler/idp/idp-configuration.md
@@ -0,0 +1,41 @@
---
id: idp-configuration
title: Configure IDP
description: "Set up and configure intelligent document processing (IDP) in Camunda 8 SaaS and Self-Managed."
---

import IdpSecretsImg from './img/idp-connector-secrets.png';
import TickImg from '/static/img/icon-list-tick.png';
import CrossImg from '/static/img/icon-list-cross.png';

Configure IDP for your Camunda 8 setup and make sure IDP can access the required components and credentials.

:::note
IDP is only supported for Camunda 8 SaaS with the 8.7.0-alpha5 release. Support for Camunda 8 Self-Managed and Camunda 8 Run is planned for delivery with the 8.7 release.
:::

## Prerequisites

The following prerequisites are required for IDP:

| Prerequisite | Description |
| :------------------------ | :---------- |
| Amazon Web Services (AWS) | <ul><li><p>Create a valid [AWS Identity and Access Management (IAM) user](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users.html) with permissions that allow access to Amazon Bedrock, Amazon S3, and Amazon Textract (for example, `AmazonBedrockFullAccess`).</p></li><li><p>Obtain and store the [access key pair](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html) (_access key_ and _secret access key_) for this IAM user. These are required during IDP configuration.</p></li><li><p>Create an [Amazon S3 bucket](https://aws.amazon.com/s3/) named `idp-extraction-connector` that IDP can use for document storage during document analysis and extraction.</p></li></ul> |
| Web Modeler | <ul><li><p>Web Modeler is required to create, manage, publish, and integrate [IDP applications](idp-applications.md) and [document extraction](idp-document-extraction.md) templates.</p></li><li><p>IDP does not support Desktop Modeler.</p></li></ul> |
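
To make the AWS prerequisite more concrete, the following sketch builds an IAM policy document for the IDP user as a plain Python dictionary. It is illustrative only: the function name `build_idp_policy`, the statement IDs, and the action lists are assumptions chosen for this example, not the permissions Camunda prescribes. Check the AWS documentation for the permissions your setup actually needs.

```python
import json

# Bucket name taken from the prerequisites table above.
BUCKET_NAME = "idp-extraction-connector"


def build_idp_policy(bucket_name: str = BUCKET_NAME) -> dict:
    """Build an illustrative IAM policy document for the IDP IAM user.

    The actions listed here are assumptions for illustration and are
    deliberately broad; narrow them for real environments.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access to the document storage bucket.
                "Sid": "IdpBucketAccess",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                # Access to the analysis and model services.
                "Sid": "IdpTextractAndBedrock",
                "Effect": "Allow",
                "Action": ["textract:*", "bedrock:*"],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be reviewed before use.
    print(json.dumps(build_idp_policy(), indent=2))
```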

## Configure IDP

Once you have completed all the required prerequisites, configure IDP in a `dev` cluster as follows:

### Add AWS connector secrets to cluster {#aws-secrets}

Add your AWS IAM user _access key_ and _secret access key_ as [connector secrets](/components/console/manage-clusters/manage-secrets.md) to the cluster, using the following names:

- _Access key_: `IDP_AWS_ACCESSKEY`
- _Secret key_: `IDP_AWS_SECRETKEY`

<img src={IdpSecretsImg} alt="Connector secrets" style={{width: '750px'}} />

:::note
You can rename these connector secrets if you want to change the testing bucket used in other environments (such as `test`, `stage`, or `prod`). If you do this, you must also update the names to match in the **Authentication** section of the properties panel for any related published document extraction templates.
:::
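
The relationship between the secret names and the values referenced in the properties panel can be sketched as follows. This assumes the usual Camunda connector placeholder syntax `{{secrets.NAME}}`; the helper names `secret_placeholders` and `missing_secrets` are hypothetical and exist only for this illustration.

```python
from typing import Dict, Iterable, List, Optional

# Default connector secret names from the section above.
DEFAULT_SECRET_NAMES: Dict[str, str] = {
    "access_key": "IDP_AWS_ACCESSKEY",
    "secret_key": "IDP_AWS_SECRETKEY",
}


def secret_placeholders(names: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return the placeholder string to reference each secret by name."""
    names = names or DEFAULT_SECRET_NAMES
    return {field: "{{secrets." + name + "}}" for field, name in names.items()}


def missing_secrets(
    cluster_secret_names: Iterable[str],
    names: Optional[Dict[str, str]] = None,
) -> List[str]:
    """List the required secret names that are not defined on the cluster."""
    names = names or DEFAULT_SECRET_NAMES
    defined = set(cluster_secret_names)
    return [name for name in names.values() if name not in defined]
```

If you rename the secrets for another environment, pass the new names to both helpers so the placeholders and the cluster secrets stay in sync.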
35 changes: 35 additions & 0 deletions docs/components/modeler/web-modeler/idp/idp-document-automation.md
@@ -0,0 +1,35 @@
---
id: idp-document-automation
title: Document automation
description: "Document automation allows you to extract data from complex documents based on one or more linked document extraction projects."
---

Extract data from complex documents based on one or more linked [document extraction](idp-document-extraction.md) projects.

:::note
Document automation is in development for a future release.
:::

## About document automation

Document automation allows you to automatically extract data from complex PDF documents.

For example, if you want to process large multi-page PDFs containing multiple document types (invoices, reports, forms), you can create a document automation project to extract the specific data you want.

- You must link at least one [document extraction](idp-document-extraction.md) project for the LLM to accurately analyze, classify, and extract data.
- Choose and test different LLMs to find the model that best suits your budget and accuracy requirements.
- Document classification automatically categorizes documents into predefined classes/types, based on their content.

<!-- ## Create document automation project
To create a document automation project:
Content...
## Document automation steps
Complete the following steps to configure and publish an unstructured data extraction project:
1. [Step 1: Add projects]: Link one or more document extraction projects to help the system analyze and categorize your documents.
1. [Step 2: Test classification] Select an LLM and test the document classification results.
1. [Step 3: Publish]: Publish the project to make it available to use in your document processing BPMN diagrams. -->
47 changes: 47 additions & 0 deletions docs/components/modeler/web-modeler/idp/idp-document-extraction.md
@@ -0,0 +1,47 @@
---
id: idp-document-extraction
title: Document extraction
description: "Document extraction projects form the basis for using intelligent document processing (IDP) in your end-to-end processes. Extract data from a single type of structured or unstructured document."
---

import IdpExtractionProjectModalImg from './img/idp-create-extraction-project-modal.png';

Extract data from a single type of structured or unstructured document.

## About document extraction

Document extraction templates form the basis for using IDP in your end-to-end processes.

- Create a separate document extraction template for each type of document you want to categorize and extract data from, such as an invoice, a report, or an identity document.
- Published document extraction templates can then be used to [integrate IDP into your processes](idp-integrate.md).
<!-- - Published extraction projects can be [integrated into your processes](idp-integrate.md) or linked to a [document automation](idp-document-automation.md) project. -->

## Create document extraction template

To create a new document extraction template:

1. In your [IDP application](idp-applications.md), select **Create document extraction template** or **Create new** > **Document extraction**.
1. Configure the general document extraction template information and select the extraction method.
<img src={IdpExtractionProjectModalImg} alt="Create an extraction project modal" width="600px" style={{marginTop: '0'}} />

- **Name**: Enter a descriptive name for the type of document, such as “Invoice type A”.
- **Description**: Enter a description to provide more detailed information about the document type.
- **Extraction method**: Select **Unstructured data extraction**.

<!-- - **Extraction method**: Select an extraction method:
- **Form extraction**: Select this method to extract data from structured documents.
- **Unstructured data extraction**: Select this method to extract data from unstructured documents. -->

1. Click **Create** to create and open the new document extraction template.
1. Configure the document extraction template to [extract unstructured data](idp-unstructured-extraction.md).

<!-- - [Extract structured data](idp-structured-extraction.md): Configure and publish a structured data extraction project.
- [unstructured data extraction project](idp-unstructured-extraction.md): Configure and publish an unstructured data extraction project. -->

:::note
The **Form Extraction** extraction method is currently in development for a future release. This method will allow you to create a document extraction template for extracting data from structured document types.
:::

<!-- :::tip
Not sure which extraction method to use? See [structured and unstructured documents](idp-key-concepts.md#structured-and-unstructured-documents) to help determine what type of document(s) you will be processing.
::: -->
75 changes: 75 additions & 0 deletions docs/components/modeler/web-modeler/idp/idp-integrate.md
@@ -0,0 +1,75 @@
---
id: idp-integrate
title: Integrate IDP into your processes
description: "Integrate intelligent document processing (IDP) into your end-to-end processes in Web Modeler."
---

import IdpElementImg from './img/idp-diagram-element.png';

Integrate your published document extraction templates into your end-to-end processes in Web Modeler.

## Create and configure an IDP task

You can apply a published document extraction template to a task or event via the append menu. For example:

<img src={IdpElementImg} alt="IDP element in a diagram" style={{border: 'none', padding: '0', marginTop: '0', backgroundColor: 'transparent'}} />

- **From the canvas:** Select an element and click the **Change element** icon to change an existing element, or use the append feature to add a new element to the diagram.
- **From the properties panel:** Navigate to the **Template** section and click **Select**.
- **From the side palette:** Click the **Create element** icon.

You can then configure the document extraction template in the properties panel, via the following sections:

## Input message data

### Document

Specify the document object variable used for document handling, provided as a [FEEL expression](/components/modeler/feel/what-is-feel.md) with the document reference.

Example: `document[1]`.

### AWS S3 Bucket name

Specify the name of the Amazon S3 bucket where documents can be temporarily stored during Amazon Textract analysis.

Example: `idp-extraction-connector` (the Amazon S3 bucket used for document storage during extraction).

## Output mapping

Specify the process variables into which you want to map the IDP extraction connector response.

:::info
To learn more about output mapping, see [variable/response mapping](/components/connectors/use-connectors/index.md#variableresponse-mapping).
:::

### Result variable

You can export the complete IDP extraction connector response (for example, the key-value pairs extracted from the document) into a dedicated variable that you can then access anywhere in a process. To do this, enter a unique variable name in the **Result variable** field.

Example: `IDPResult`

### Result expression

In addition, you can unpack the content of the response into multiple process variables using the **Result expression** field, written as a [FEEL context expression](/components/concepts/expressions.md).
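
The effect of a result expression can be sketched in Python as a mapping from process variable names to paths inside the response. The response shape used here (`extractedFields` with `invoiceNumber` and `total`) is hypothetical, as is the helper `unpack_response`; the actual structure of the IDP extraction connector response may differ.

```python
from typing import Any, Dict


def unpack_response(response: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Resolve each dotted path in `mapping` against `response`.

    `mapping` plays the role of the result expression: its keys are the
    process variable names, and its values are paths into the response.
    """
    variables: Dict[str, Any] = {}
    for variable_name, path in mapping.items():
        value: Any = response
        # Walk the response one key at a time, for example
        # "extractedFields.total" -> response["extractedFields"]["total"].
        for key in path.split("."):
            value = value[key]
        variables[variable_name] = value
    return variables


# Hypothetical connector response, for illustration only.
example_response = {
    "extractedFields": {"invoiceNumber": "INV-001", "total": "120.50"},
}

process_variables = unpack_response(
    example_response,
    {
        "invoiceNumber": "extractedFields.invoiceNumber",
        "invoiceTotal": "extractedFields.total",
    },
)
```

In this sketch, `process_variables` holds two separate variables, whereas the **Result variable** field would store the whole response under a single name.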

## Error handling

If an error occurs, the IDP extraction connector throws an error and includes the error response in the error variable in Operate.

### Error expression

You can handle an IDP extraction connector error using an error boundary event and [error expressions](/components/connectors/use-connectors/index.md#error-expression).

## Retries

### Retries

Specify the number of times the IDP extraction connector repeats execution if it fails (see [retries](/components/connectors/use-connectors/outbound.md#retries)).

### Retry backoff

Specify a custom **Retry backoff** interval between retries instead of the default behavior of retrying immediately.
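
As an illustration of a backoff interval, the sketch below parses a simple ISO 8601 time duration such as `PT5S` into a Python `timedelta`. It assumes the **Retry backoff** field accepts ISO 8601 durations, which this page does not state, and the helper `parse_backoff` is hypothetical. Only the hours, minutes, and seconds components are handled.

```python
import re
from datetime import timedelta

# Matches time-only ISO 8601 durations such as "PT5S", "PT1M30S", or "PT2H".
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_backoff(value: str) -> timedelta:
    """Convert a time-only ISO 8601 duration string into a timedelta."""
    match = _DURATION.match(value)
    # Reject strings that do not match, and the bare prefix "PT".
    if match is None or not any(match.groups()):
        raise ValueError(f"Unsupported duration: {value!r}")
    hours, minutes, seconds = (int(group) if group else 0 for group in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)
```

Under the assumption above, a value of `PT0S` corresponds to the default behavior of retrying immediately.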

## Execution listeners

Add and manage [execution listeners](/components/concepts/execution-listeners.md) to allow users to react to events in the workflow execution lifecycle by executing custom logic.