docs(pg_lexer): add some info about libpg_query and SyntaxKind #189

Merged
5 changes: 2 additions & 3 deletions README.md
@@ -27,7 +27,7 @@ We plan to support all of the above for SQL and PL/pgSQL function bodies too!

Despite the rising popularity of Postgres, support for the PL/pgSQL in IDEs and editors is limited. While there are some _generic_ SQL Language Servers[^1] offering the Postgres syntax as a "flavor" within the parser, they usually fall short due to the ever-evolving and complex syntax of PostgreSQL. There are a few proprietary IDEs[^2] that work well, but the features are only available within the respective IDE.

This Language Server is designed to support Postgres, and only Postgres. The server uses [libpg_query](https://github.com/pganalyze/libpg_query), therefore leveraging the PostgreSQL source to parse the SQL code reliably. Using Postgres within a Language Server might seem unconventional, but it's the only reliable way of parsing all valid PostgreSQL queries. You can find a longer rationale on why This is the Way™ [here](https://pganalyze.com/blog/parse-postgresql-queries-in-ruby). While libpg_query was built to execute SQL, and not to build a language server, any shortcomings have been successfully mitigated in the `parser` crate. You can read the [commented source code](./crates/parser/src/lib.rs) for more details on the inner workings of the parser.
This Language Server is designed to support Postgres, and only Postgres. The server uses [libpg_query](https://github.com/pganalyze/libpg_query), both as a git submodule for access to its protobuf file and as the [pg_query](https://crates.io/crates/pg_query/5.0.0) Rust crate, therefore leveraging the PostgreSQL source to parse the SQL code reliably. Using Postgres within a Language Server might seem unconventional, but it's the only reliable way of parsing all valid PostgreSQL queries. You can find a longer rationale on why This is the Way™ [here](https://pganalyze.com/blog/parse-postgresql-queries-in-ruby). While libpg_query was built to execute SQL, and not to build a language server, any shortcomings have been successfully mitigated in the `parser` crate. You can read the [commented source code](./crates/parser/src/lib.rs) for more details on the inner workings of the parser.

Once the parser is stable, and a robust and scalable data model is implemented, the language server will not only provide basic features such as semantic highlighting, code completion and syntax error diagnostics, but also serve as the user interface for all the great tooling of the Postgres ecosystem.

@@ -86,8 +86,7 @@ The server binary will be installed in `.cargo/bin`. Make sure that `.cargo/bin`

### Github CodeSpaces

Currently, Windows does not support `libpg_query`. You can setup your development environment
on [CodeSpaces](https://github.com/features/codespaces).
You can set up your development environment on [CodeSpaces](https://github.com/features/codespaces).

After your codespace boots up, run the following command in the shell to install Rust:

8 changes: 8 additions & 0 deletions crates/pg_lexer/README.md
@@ -0,0 +1,8 @@
# pg_lexer

The `pg_lexer` crate exposes the `lex` method, which turns an SQL query text into a `Vec<Token>`: the base for the `pg_parser` and most of pgtools's operations.

Every token has a `SyntaxKind`. That `SyntaxKind` enum is derived from `libpg_query`'s protobuf file.

Most of the SQL query text is lexed using the `pg_query::scan` method (`pg_query` is just a Rust wrapper around `libpg_query`).
However, that method does not emit the whitespace tokens we require, so the `lex` method takes care of lexing those and merging them into the result.
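
The whitespace-merging step described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: `Kind`, `Token`, and the `scan` function here are hypothetical stand-ins (the real scanner is `pg_query::scan` and the real kinds come from `SyntaxKind`). The sketch assumes only that the scanner reports byte ranges for non-whitespace tokens and skips everything in between.

```rust
/// Hypothetical token kinds for this sketch; the real `SyntaxKind` enum is
/// generated from libpg_query's protobuf file and has many more variants.
#[derive(Debug, Clone, PartialEq)]
enum Kind {
    Word,       // stand-in for whatever kind the scanner reports
    Whitespace, // filled in by the merge step below
}

#[derive(Debug, Clone, PartialEq)]
struct Token {
    kind: Kind,
    text: String,
}

/// Hypothetical stand-in for the scanner: returns the byte ranges of the
/// non-whitespace runs in `sql` and silently skips whitespace.
fn scan(sql: &str) -> Vec<(usize, usize)> {
    let mut spans = Vec::new();
    let mut start = None;
    for (i, c) in sql.char_indices() {
        if c.is_whitespace() {
            if let Some(s) = start.take() {
                spans.push((s, i));
            }
        } else if start.is_none() {
            start = Some(i);
        }
    }
    if let Some(s) = start {
        spans.push((s, sql.len()));
    }
    spans
}

/// Walk the scanner's spans in order and emit a whitespace token for every
/// gap between them, so the resulting tokens cover the whole input.
fn lex(sql: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut pos = 0;
    for (start, end) in scan(sql) {
        if start > pos {
            tokens.push(Token { kind: Kind::Whitespace, text: sql[pos..start].to_string() });
        }
        tokens.push(Token { kind: Kind::Word, text: sql[start..end].to_string() });
        pos = end;
    }
    // Trailing whitespace after the last scanned token.
    if pos < sql.len() {
        tokens.push(Token { kind: Kind::Whitespace, text: sql[pos..].to_string() });
    }
    tokens
}
```

A useful property of this approach is that concatenating the `text` of all tokens reproduces the input exactly, which is what later stages need in order to map tokens back to source positions.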
7 changes: 7 additions & 0 deletions crates/pg_lexer_codegen/README.md
@@ -0,0 +1,7 @@
# pg_lexer_codegen

This crate is responsible for reading `libpg_query`'s protobuf file and turning it into the Rust enum `SyntaxKind`.

It does so by reading the file from the installed git submodule, parsing it with a protobuf parser, and using a procedural macro to generate the enum.

Rust requires procedural macros to be defined in a different crate than where they're used, hence this \_codegen crate.
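
To make the idea of generating an enum from a protobuf file more concrete, here is a plain-Rust sketch of the text transformation involved. It is not a procedural macro and not the crate's implementation: `message_names` and `render_enum` are hypothetical helpers, the line-based scan is a crude stand-in for a real protobuf parser, and the assumption that each `message` name becomes one enum variant is made purely for illustration.

```rust
/// Hypothetical helper: collect the names of `message` declarations from
/// protobuf-like text. A real implementation would use a protobuf parser;
/// this only looks at lines that start with `message `.
fn message_names(proto: &str) -> Vec<&str> {
    proto
        .lines()
        .filter_map(|line| line.trim().strip_prefix("message "))
        .filter_map(|rest| rest.split_whitespace().next())
        .map(|name| name.trim_end_matches('{'))
        .collect()
}

/// Hypothetical helper: render a Rust enum definition as source text,
/// one variant per name. A procedural macro would emit tokens instead.
fn render_enum(name: &str, variants: &[&str]) -> String {
    let mut out = format!("pub enum {} {{\n", name);
    for variant in variants {
        out.push_str(&format!("    {},\n", variant));
    }
    out.push_str("}\n");
    out
}
```

Feeding the output of `message_names` into `render_enum("SyntaxKind", ...)` yields enum source text; doing this inside a procedural macro instead means the enum stays in sync with the protobuf file at compile time without a checked-in generated file.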