Clarification needed: multiple assert statements evaluation order #67

Open
smhdfdl opened this issue Feb 13, 2025 · 4 comments

smhdfdl commented Feb 13, 2025

Mike asked this question ...

Given this DFDL annotation on an xs:sequence

<xs:appinfo source="http://www.ogf.org/dfdl/">
  <dfdl:assert test="..." />
  <dfdl:discriminator test="..." />
  <dfdl:assert test="..." />
</xs:appinfo>

The spec seems silent about the evaluation order among these 3
statement annotations. Let's call them A, B, C.

First, the spec makes it clear that the discriminator B could be
evaluated earlier than either assertion, and even before some of the
sequence content.
This is to allow optimization by a DFDL implementation. The spec is
also clear that even if the parse of the sequence content fails,
discriminator B is evaluated (with the infoset being the state at the
time of the failure). Again this is to ensure the behavior matches
that of an optimizing DFDL implementation.

But let's assume an implementation does no such optimization.

I believe these evaluation orders are legal for the 3 statements after
the sequence content has been parsed:

A, B, C - if A fails, we do know B must still be evaluated.

B, A, C - this is the minimum sort of hoist/optimization, doing the
discriminator before the asserts.

Questions:

  1. Are any other orders of evaluation allowed?

  2. If evaluation of A fails, do we still evaluate assertion C? (My
    hope is the answer here is no, because that allows consecutive asserts
    to build on each other's assumptions. But the spec is unclear.)

  3. Can users depend on the failure of A to generate a message output?
    (ex: if the assert has a message attribute, can we state that this
    message will somehow be exhibited or logged by the implementation,
    unless the failure is suppressed by backtracking at a point of
    uncertainty)

If A fails, the spec does say that B must still be evaluated.

But if A fails, will C be evaluated?

@smhdfdl smhdfdl self-assigned this Feb 13, 2025

smhdfdl commented Feb 13, 2025

Mike also added ...

Another question:

For asserts and discriminators with testKind pattern, if a given
annotation point in the schema has both, do we execute the pattern
discriminators first or use schema definition order? If the latter, do
we still evaluate the pattern discriminators even if the pattern
asserts fail?


smhdfdl commented Feb 13, 2025

Spec section 7.5.1 says the following.

"If the resolved set of statement annotations for a schema component contains multiple dfdl:assert statements, then those with testKind 'pattern' are executed before those with testKind 'expression' (the default). However, within each group the order of execution among them is not specified.
If one of the resolved set of asserts for a schema component is unsuccessful, and the failureType of the assert is ‘processingError’, then no further asserts in the set are executed."

That seems clear to me. Schema authors should not rely on the ordering of asserts.
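
To make the consequence of that wording concrete, here is a minimal Python sketch that enumerates the assert execution orders the quoted 7.5.1 text appears to permit. It is not taken from the spec or from any DFDL implementation: the `Assert` tuple and the `legal_assert_orders` function are hypothetical names invented for illustration, and the only rules modeled are "pattern before expression" and "order within each group not specified".

```python
from itertools import permutations
from typing import NamedTuple


class Assert(NamedTuple):
    """Hypothetical stand-in for a dfdl:assert; not a real DFDL API."""
    name: str
    test_kind: str = "expression"  # 'expression' is the DFDL default


def legal_assert_orders(asserts):
    """Return every order allowed by the quoted 7.5.1 text.

    Pattern asserts run before expression asserts; inside each group
    any permutation is allowed because the order is not specified.
    """
    patterns = [a for a in asserts if a.test_kind == "pattern"]
    expressions = [a for a in asserts if a.test_kind == "expression"]
    return [
        [a.name for a in p + e]
        for p in permutations(patterns)
        for e in permutations(expressions)
    ]


# The two asserts from the issue, both with the default testKind.
print(legal_assert_orders([Assert("A"), Assert("C")]))
# Adding a hypothetical pattern assert P forces P to go first.
print(legal_assert_orders([Assert("P", "pattern"), Assert("A"), Assert("C")]))
```

For the two asserts in the original question, both orders (A then C, and C then A) come out as legal, which is why a schema author cannot rely on either one.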

Spec section 9.5.1 says

"Implementations are free to optimize by recognizing and executing discriminators or asserts with testKind 'expression' earlier so long as the resulting behavior is consistent with what results from the description above."

Spec section 9.5.2, as you indicate, says

"When parsing, an attempt to evaluate a discriminator MUST be made even if preceding statements or the parse of the schema component ended in a Processing Error.

This is because a discriminator's expression can evaluate to true thereby resolving a point of uncertainty even if the complete parsing of the construct ultimately caused a Processing Error."
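
The interaction between the 7.5.1 and 9.5.2 rules quoted above can be sketched as follows. This is a minimal model under stated assumptions, not a description of how any DFDL processor works: statements are evaluated in a given order after the sequence content has been parsed, every assert has failureType 'processingError', any Processing Error (including a failed discriminator) stops later asserts, and `run_statements` and its arguments are hypothetical names.

```python
def run_statements(order, kinds, outcomes):
    """Evaluate statements in `order`; return (evaluated, error, resolved).

    kinds:    name -> 'assert' or 'discriminator'
    outcomes: name -> True if the test would succeed, False if it would fail
    Assumption: every assert has failureType 'processingError'.
    """
    evaluated = []
    error = False     # a Processing Error has occurred
    resolved = False  # a discriminator resolved the point of uncertainty
    for name in order:
        if kinds[name] == "assert":
            if error:
                # 7.5.1: no further asserts after a failed one
                # (assumed here to hold after any Processing Error).
                continue
            evaluated.append(name)
            if not outcomes[name]:
                error = True
        else:
            # 9.5.2: a discriminator is attempted even after an error.
            evaluated.append(name)
            if outcomes[name]:
                resolved = True
            else:
                error = True
    return evaluated, error, resolved


kinds = {"A": "assert", "B": "discriminator", "C": "assert"}
outcomes = {"A": False, "B": True, "C": True}  # A fails, B and C would pass
print(run_statements(["A", "B", "C"], kinds, outcomes))
print(run_statements(["B", "A", "C"], kinds, outcomes))
```

In both orders the discriminator B is evaluated and resolves the point of uncertainty, while C is never reached once A has failed.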

So an attempt to answer your questions

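  1. Yes. The order within each testKind group is not specified, so C may be evaluated before A.

  2. No. Once an assert with failureType 'processingError' fails, no further asserts in the set are executed.

  3. No, because C might be evaluated first and fail.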
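
The third answer can be illustrated with a small sketch. Assuming both A and C would fail and both carry a message attribute (the message strings below are invented), the message a user sees depends on which assert the implementation happens to run first. `first_failure_message` is a hypothetical helper, not part of any DFDL API.

```python
def first_failure_message(order, failing_messages):
    """Return the message of the first assert in `order` that fails.

    failing_messages maps the name of each failing assert to its message;
    asserts not in the mapping are assumed to succeed. Later asserts are
    never run, matching the quoted 7.5.1 rule for 'processingError'.
    """
    for name in order:
        if name in failing_messages:
            return failing_messages[name]
    return None  # no assert failed


messages = {"A": "A failed", "C": "C failed"}  # invented message text
print(first_failure_message(["A", "C"], messages))  # A failed
print(first_failure_message(["C", "A"], messages))  # C failed
```

Since both orders are legal, neither message is guaranteed to be the one reported.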


smhdfdl commented Feb 13, 2025

Regarding Mike's subsequent question about testKind 'pattern', my initial reaction is that the logic behind section 9.5.2 also applies here. It is just more restricted, in that the only thing that could possibly fail ahead of the discriminator being evaluated is one of the asserts (which must also have testKind 'pattern').

If that is so, then we don't have to say anything further about relative order of assert and discriminator execution.


smhdfdl commented Feb 13, 2025

Discussed on this call (https://github.com/OpenGridForum/DFDL/blob/master/calls/2025/2025-01-09_DFDL-WG-Call.md) and Action 345 was raised, concluding that a couple of clarifications are needed to section 7.5.1 of the DFDL 1.0 spec. Specifically ...

If the resolved set of statement annotations for a schema component contains multiple dfdl:assert statements, then those with testKind 'pattern' are executed before those with testKind 'expression' (the default). However, within each group the order of execution among them is not specified.

... should be ...

If the resolved set of statement annotations for a schema component contains multiple dfdl:assert statements, then the subset with testKind 'pattern' are executed before the subset with testKind 'expression' (the default). However, within each subset the order of execution of the statements is not specified and is implementation dependent.

This issue was raised and labeled as an erratum against the DFDL 1.0 spec.

Projects
Status: In Progress