
IDP Connector should return response as Key/Value pair of extracted fields instead of json string #4145

Open
mathieu-stennier opened this issue Feb 26, 2025 · 3 comments · May be fixed by #4147
Assignees
Labels
kind:task Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc.

Comments

@mathieu-stennier
Contributor

What should we do?

The IDP Connector should return its response as key/value pairs of extracted fields instead of a JSON string.
Full conversation here.

The response provided by a deployed IDP connector currently looks something like this:

"response":"\n\n{\n  \"PolicyholderName\": \"Residential Builders Inc.\",\n  \"PolicyHolderAddress\": \"Parklaan 88 3500 Hasselt Belgie\",\n  \"PolicyholderEnterpriseNumber\": \"1234567890\",\n  \"PolicyholderCapacity\": \"Project Owner\",\n  \"TotalProjectValue\": \"5000000 EUR\",\n  \"ProjectDescription\": \"This project involves the construction of a new 5-story residential apartment building on a currently vacant lot. The building will have reinforced concrete construction, with one underground parking level and five above-ground floors, including a ground floor with retail spaces.\"\n}"

This is not easily parsed into variables that can then be used in isolation in later steps of the process.

See related FEEL expression feature gap here.

Why should we do it?

I would argue the connector should already parse the results we get from the LLM and provide variables that can be easily mapped as results for further activities in the process.

@mathieu-stennier mathieu-stennier added the kind:task Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc. label Feb 26, 2025
@sahilbhatoacamunda sahilbhatoacamunda self-assigned this Feb 26, 2025
@mathieu-stennier
Contributor Author

@sahilbhatoacamunda,
@sbuettner recommends that we invest in a proper schema for the response, one that lets us add more features in the future. An ExtractionResult containing the extracted fields might be enough for now, but then at least we can extend it later on.

@mathieu-stennier
Contributor Author

Note: changing this output is ok before 8.7 because we don't offer support for things built with alphas.

@sahilbhatoacamunda
Contributor

sahilbhatoacamunda commented Feb 26, 2025

@mathieu-stennier @sbuettner
The ExtractionResult will have a response which is of type Map<String, Object>. This map will be built based on the IDP extraction fields passed as taxonomy items in the request.
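
For illustration only, the response shape described here could be sketched as a Java record (Java 16+). Only the `ExtractionResult` name, the `response` member, and the `Map<String, Object>` type come from this thread; the defensive copy and everything else is an assumption, not the code in the draft PR.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a wrapper type that can gain further components later
// without changing how the extracted fields are exposed.
record ExtractionResult(Map<String, Object> response) {
  ExtractionResult {
    // Assumption: take a defensive, read-only copy. LinkedHashMap keeps the
    // field order and, unlike Map.copyOf, tolerates null values.
    response = Collections.unmodifiableMap(new LinkedHashMap<>(response));
  }
}
```

Wrapping the map in a dedicated type, rather than returning the bare map, is what would leave room for the later extensions mentioned above.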

Use cases

  1. The response from the LLM is a valid JSON string (all taxonomy items available): The Connector will parse the JSON string and build the final map by checking whether all the taxonomy items are available in the response.
  2. The response from the LLM is a valid JSON string (some taxonomy items are missing): The Connector will parse the JSON string and build the final map by checking whether all the taxonomy items are available in the response. The system will log a warning: LLM model response is missing the following keys: ('missing key 1', 'missing key 2').
  3. The response from the LLM is a valid JSON string containing a nested response: The Connector will parse the JSON string, promote the nested response, and build the final map by checking whether all the taxonomy items are available in the response.
  4. The response from the LLM is an invalid JSON string: The Connector will log an error and return an empty map as the response.
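
The four use cases above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code from the draft PR: the class and method names are hypothetical, the LLM response is assumed to have been parsed into a map by a JSON library already (with `null` standing in for a parse failure), and "promoting nested response" is interpreted here as lifting the leaf values of nested objects to the top level.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

// Hypothetical helper; names are illustrative only.
final class ExtractionMapBuilder {
  private static final Logger LOG =
      Logger.getLogger(ExtractionMapBuilder.class.getName());

  // parsed: the LLM response as a map, or null if it was not valid JSON.
  // taxonomyItems: the extraction field names passed in the request.
  static Map<String, Object> build(Map<String, Object> parsed, List<String> taxonomyItems) {
    Map<String, Object> result = new LinkedHashMap<>();
    if (parsed == null) {
      // Use case 4: invalid JSON, log an error and return an empty map.
      LOG.severe("LLM model response is not a valid JSON string");
      return result;
    }
    // Use case 3: lift values of nested objects to the top level.
    Map<String, Object> flat = new LinkedHashMap<>();
    flatten(parsed, flat);

    // Use cases 1 and 2: keep only taxonomy items, remember what is missing.
    List<String> missing = new ArrayList<>();
    for (String item : taxonomyItems) {
      if (flat.containsKey(item)) {
        result.put(item, flat.get(item));
      } else {
        missing.add(item);
      }
    }
    if (!missing.isEmpty()) {
      LOG.warning("LLM model response is missing the following keys: " + missing);
    }
    return result;
  }

  // Recurses into nested maps; on duplicate keys the first occurrence wins.
  private static void flatten(Map<?, ?> source, Map<String, Object> target) {
    for (Map.Entry<?, ?> entry : source.entrySet()) {
      if (entry.getValue() instanceof Map<?, ?> nested) {
        flatten(nested, target);
      } else {
        target.putIfAbsent(String.valueOf(entry.getKey()), entry.getValue());
      }
    }
  }
}
```

In this sketch a missing key or an invalid response never throws, so the process continues with a partial or empty map, which is exactly the behaviour the question below asks about.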

Question:
Regarding points 3 and 4: is it okay to continue the operation even if the response from the LLM is an invalid JSON string, or if some keys are missing? I'm returning an empty or partial map in those cases.

I created a draft PR with the above use cases in mind.
