Attach Diagnostic to syntax errors #14437

Closed
Tracked by #14429
eliaperantoni opened this issue Feb 3, 2025 · 4 comments · Fixed by #15680 · May be fixed by #14986
Labels
enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@eliaperantoni
Contributor

Is your feature request related to a problem or challenge?

For a query like:

WITH cte AS (SELECT 1 AS col), SELECT * FROM cte

The only message that the end user of an application built on top of DataFusion sees is:

SQL error: ParserError("Expected: AS, found: * at Line: 1, Column: 39")

We want to provide a richer message that references and highlights locations in the original SQL query, and that adds context to help the user understand the error. In the end, it would be possible to display errors in a fashion akin to what #13664 enabled for some errors.

See #14429 for more information.
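
To make "highlights locations in the original SQL query" concrete, here is a minimal sketch of the kind of rendering this could lead to. The `highlight` function is hypothetical and is not part of DataFusion or sqlparser. It assumes the line and column are 1-based, as in the message above, and that columns count characters.

```rust
/// Hypothetical renderer (not a DataFusion API): echo the offending line of
/// the query and place a caret under the reported column.
/// `line` and `column` are assumed to be 1-based, as in the parser message.
fn highlight(sql: &str, line: usize, column: usize) -> Option<String> {
    // `checked_sub` turns a 0 line or column into `None` instead of panicking.
    let text = sql.lines().nth(line.checked_sub(1)?)?;
    let pad = " ".repeat(column.checked_sub(1)?);
    Some(format!("{}\n{}^", text, pad))
}
```

For the query above, `highlight(sql, 1, 39)` would return the query followed by a second line with a caret under the `*`. The padding uses plain spaces, so alignment may drift if the line contains tabs or wide characters.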

Describe the solution you'd like

Attach a well-crafted Diagnostic to the DataFusionError, building on top of the foundations laid in #13664. See #14429 for more information.

The location of the error can probably be fetched from the error coming from sqlparser, requiring few changes to DataFusion itself.
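
As a rough illustration, and assuming the location is only available as text in the format of the message shown earlier ("... at Line: 1, Column: 39"), it could be recovered as sketched below. The `Location` type and the `location_from_message` helper are hypothetical names, and sqlparser may well expose the position in a more structured way, in which case string parsing would be unnecessary.

```rust
/// A 1-based line/column position, as printed in the parser's message.
/// Hypothetical type for this sketch; not taken from sqlparser or DataFusion.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct Location {
    line: u64,
    column: u64,
}

/// Hypothetical helper: pull "Line: N, Column: M" out of the message text.
/// Returns `None` when the message does not end with that pattern.
fn location_from_message(msg: &str) -> Option<Location> {
    // Search from the right so that earlier text in the message is ignored.
    let (_, rest) = msg.rsplit_once("Line: ")?;
    let (line, rest) = rest.split_once(", Column: ")?;
    // Keep only the leading digits, in case anything follows the column.
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    Some(Location {
        line: line.trim().parse().ok()?,
        column: digits.parse().ok()?,
    })
}
```

Parsing the message is brittle, since any change to its wording breaks the helper, so this would only be a fallback if no structured location is available.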

This needs some thought to figure out which kinds of syntax errors can be supported and how. For example, the query above only covers the case where the parser expects a token from a specific set; what about unbalanced parentheses, unrecognised keywords, etc.?

Describe alternatives you've considered

No response

Additional context

No response

@eliaperantoni eliaperantoni added the enhancement New feature or request label Feb 3, 2025
@alamb alamb added the good first issue Good for newcomers label Feb 4, 2025
@alamb
Contributor

alamb commented Feb 4, 2025

I think this is a good first issue as the need is clear and the tests in https://github.com/apache/datafusion/blob/85fbde2661bdb462fc498dc18f055c44f229604c/datafusion/sql/tests/cases/diagnostic.rs are well structured for extension.

@irenjj
Contributor

irenjj commented Feb 7, 2025

take

@eliaperantoni
Contributor Author

Hey @irenjj how is it going with this ticket :) Can I help with anything?

@logan-keede
Contributor

> See #14429 for more information.

@eliaperantoni How are you printing these messages?

Also, the error produced by the above-mentioned query seems to be generated by sqlparser and then propagated to DataFusion. So, AFAIK, it might not be possible to attach a Diagnostic to that particular error, as the TokenWithSpan is lost. We can still do this for the syntax errors detected in DataFusion itself, until the day DFParser is free of sqlparser::parser::Parser.
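
The distinction can be sketched with stand-in types. Everything below is hypothetical and simplified for illustration; `Span`, `TokenWithSpan` and `Diagnostic` here are local structs, not the actual sqlparser or DataFusion definitions.

```rust
/// Stand-in for a source range (hypothetical; 1-based line and columns).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    line: u64,
    start_col: u64,
    end_col: u64,
}

/// Stand-in for a token that still remembers where it came from.
#[derive(Debug, Clone)]
struct TokenWithSpan {
    text: String,
    span: Span,
}

/// Stand-in diagnostic: a message plus an optional location to highlight.
#[derive(Debug, PartialEq, Eq)]
struct Diagnostic {
    message: String,
    span: Option<Span>,
}

/// Error raised while the token is still in hand: the span can be kept.
fn expected(expected: &str, found: &TokenWithSpan) -> Diagnostic {
    Diagnostic {
        message: format!("Expected: {}, found: {}", expected, found.text),
        span: Some(found.span),
    }
}

/// Error that arrives only as text: there is no token, so no span to attach.
fn from_message(message: &str) -> Diagnostic {
    Diagnostic {
        message: message.to_string(),
        span: None,
    }
}
```

In this sketch, errors built with `expected` correspond to syntax errors detected where the token is still available, while `from_message` corresponds to an error that has already been flattened to a string by the time it is handled.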

I have opened PR #15680. It will take some time to be ready for review, but if you have some time, I would be grateful if you could let me know whether the approach makes sense to you.

4 participants