-
Notifications
You must be signed in to change notification settings - Fork 47
Exhaustive MySQL Parser #157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 160 commits
Commits
Show all changes
169 commits
Select commit
Hold shift + click to select a range
ccc341b
MySQL AST Parser
adamziel 78fdf69
Fix parser overriding parts of the parse tree as it constructs them.
adamziel c8652d5
Output ParseTree using a class, not an array for much simpler processing
adamziel 137d6ca
Manually factor left recursion into right recursion in the grammar fi…
adamziel 0a2440c
Explore support for SQL_CALC_FOUND_ROWS
adamziel 0406d71
Support VALUES() call
adamziel 87573f2
Extract queries from MySQL test suite and test the parser against them
JanJakes 9629702
Implement handling for manually added lexer symbols
JanJakes d63bc6e
Fix passing nulls to "ctype_" functions
JanJakes 8e7e2e8
Add support for hex format x'ab12', X'ab12', and bin format x'01' and…
JanJakes ebcc17e
Fix wrong MySQL version conditions (AI hallucinations)
JanJakes 1551b0e
Implement the checkCharset() placeholder function
JanJakes cdd84b4
Document manual grammar factoring
JanJakes f50b515
Fix "alterOrderList" that has a wrong definition in the original grammar
JanJakes e267f67
Fix "createUser" that was incorrectly converted from ANTLR to EBNF
JanJakes cd543af
Fix "castType" that was incomplete in the original grammar
JanJakes 135f29f
Fix "SELECT ... WHERE ... INTO @var" using a negative lookahead
JanJakes 27524dd
Fix "EXPLAIN FORMAT=..." by reordering grammar rules
JanJakes 069342f
Fix special "WINDOW" and "OVER" cases by adjusting grammar rules
JanJakes 9bfc977
Fix "GRANT" and "REVOKE" by adjusting grammar rules to solve conflicts
JanJakes ca4de77
Use ebnfutils to dump grammar conflicts
JanJakes cd3504d
Implement the determineFunction() placeholder function, unify SQL modes
JanJakes 81bbde0
Fix processing NOW() synonyms in lexer
JanJakes 71292fb
Match mysqltest commands case-insensitively
JanJakes 1ab3723
Add a script to test lexer on all the testing queries
JanJakes 42ffc1b
Replace lexer switch/case and function calls with lookup tables
JanJakes 01241b8
Fix unicode handling when extracting test queries
JanJakes f5266f1
Fix identifier matching, improve lexer performance by ~25%
JanJakes a2ac60b
Unify charset matching with identifier matching, remove non-existent …
JanJakes 90e2af6
Fix quoted text/identifier matching, improve lexer performance by ~10%
JanJakes dfcad3e
Remove dependency on ctype, improve lexer performance by ~4%
JanJakes 0f23dd3
Determine token names lazily, reduce LOC to < 3000, improve lexer per…
JanJakes 6bb8ff9
Finish manual token pass, use MySQL Workbench token IDs, add comments
JanJakes ec55c10
Fix wrong token type
JanJakes 21f3f16
Inline some simple single-use methods, remove unused methods
JanJakes b4f0e08
Move token iteration loop outside nextToken() method
JanJakes f658751
Fix and simplify lookahead logic, improve lexer performance by ~6%
JanJakes 4eee991
Fix date and time literals by reordering grammar rules
JanJakes 64e2068
Implement handling of unquoted user variable suffixes
JanJakes f1c56cc
Fix number vs identifier matching, add some tests
JanJakes 883c08a
Fix handling of identifiers preceded by a “.”
JanJakes f56f036
Implement lexeer support for NCHAR literals
JanJakes 972f7e0
Fix administration statements to support both "TABLE" and "TABLES" ke…
JanJakes df65873
Use MySQL 8.0.38 test suite, the same version as the grammar
JanJakes f25b4fa
Use the grammar MySQL version 8.0.38 as a default also in the lexer
JanJakes 1b53df5
Add a missing charset from MySQL 5, improve comments
JanJakes 905a7d1
Sort version-specific keyword list alphabetically
JanJakes 7d1cdd7
Fix conflict between “<index> USING” and “<index> TYPE” by adjusting …
JanJakes 4e5e499
Fix “LIKE …” and “LIKE (…)” by reordering grammar rules
JanJakes 4eb9dd5
Fix "sumExpr" (AVG, SUM, COUNT, …) by reordering grammar rules
JanJakes 4f968cd
Fix "REVOKE ALL ON ... FROM ..." by adding a missing rule segment
JanJakes 6188d90
Add full MySQLParser.g4 grammar
JanJakes b371daf
Add manual fixes to MySQLParser.g4 grammar
JanJakes 1765e4f
Add a script to parse grammar directly from MySQLParser.g4
JanJakes 50ef591
Handle empty branches during grammr parsing
JanJakes 28e44f5
Place and document grammar fixes below the original rules
JanJakes cd3c335
Skip perl, append_file, and write_file commands in query extraction
JanJakes 09bd0c7
Fix “histogram” rule by adding missing “USING DATA” clause
JanJakes d06be19
Fix conflict in "explainStatement" by reordering grammar rules
JanJakes cebfdd7
Handle escaped quotes when parsing test queries
JanJakes 94b7bd2
Fix "alterCommandList" to solve conflicts between "alterCommandsModif…
JanJakes ba43fc7
Add missing column visibility settings to the grammar
JanJakes 03fbb98
Fix "fieldDefinition" to solve conflict between "columnAttribute" and…
JanJakes ed15ffa
Fix "replicationStatement" to correctly support the "RESET PERSIST" s…
JanJakes 0b16f44
Implement missing "EXCEPT" and "INTERSECT" operators in the grammar
JanJakes 8df7294
Skip if and while blocks when parsing tests, skip mysqltest.test
JanJakes 0a35989
Fix processing “--“ commands and “--delimiter”
JanJakes 442a682
Fix "alterTableActions" to solve conflicts between "alterCommandsModi…
JanJakes 207078a
Fix REVOKE statement grammar
JanJakes 56c7b8d
Add missing branches to “accountLockPasswordExpireOptions”
JanJakes 9bdd088
Fix ALTER USER statement grammar
JanJakes 8f54a3f
Fix "userIdentifierOrText" to support omitting sequence after "@"
JanJakes 724e1c3
Fix conflict within "queryExpressionParens"
JanJakes 6e985b2
Fix CHARACTER VARYING support by reordering “dataType” subrules
JanJakes e2e7501
Fix ALL/ANY support by reordering subrules
JanJakes 61d2bcc
Implement MySQL-specific and version comments: /*!… */
JanJakes 3118f11
Fix SHOW GRANTS … USING by reordering grammar rules
JanJakes cb44ea7
Fix query expressions followed by “INTO …”
JanJakes 0812288
Fix “SELECT … FOR UPDATE … INTO …” by adding a missing rule part
JanJakes d5e1abb
Fix “DO … AS …” by reordering rules
JanJakes e3555d2
Fix PARTITION being defined as a non-reserved keyword
JanJakes abdbe47
Fix handling —error in MySQL test files
JanJakes 5faef36
Fix ignoring MySQL test commands without params
JanJakes 53cb169
Add missing MySQL test commands
JanJakes ea6d067
Fix handling prefixed quotes in MySQL tests
JanJakes 22ac89b
Add support for missing "INSTALL COMPONENT" statement "SET ..." suffix
JanJakes ce35d1d
Fix "indexHintList" to use only whitespace as a separator (not commas)
JanJakes a33723d
Fix and simplify line comment handling in lexer
JanJakes eb1e756
Add support for "CHANGE REPLICATION SOURCE ..." statement
JanJakes 96accd0
Mark ATTRIBUTE keyword as non-reserved
JanJakes f0c97bd
Fix START TRANSACTION statement with multiple characteristics
JanJakes d861ba7
Fix handling column attributes by reordering rules
JanJakes c74dbca
Fix GRANT “role with @“ by reordering grammar rules
JanJakes 143c421
Fix GRANT CREATE/DROP ROLE by reordering rules
JanJakes a48810c
Add support for COMMENT and ATTRIBUTE to CREATE USER
JanJakes a80dbf1
Fix FLUSH TABLES with qualified identifiers
JanJakes f6ebd7a
Fix temporal literals to support double quotes
JanJakes 09cdac7
Add support for CAST(... AT TIME ZONE ... AS DATETIME ...)
JanJakes 05d02ec
Fix transaction characteristics separator
JanJakes 8022fc5
Add support for IF NOT EXISTS for function, procedure, and trigger
JanJakes e98caad
Fix SET PASSWORD statement conflicts
JanJakes 47b7ec4
Fix handling nchar, national character, etc.
JanJakes 522867d
Ignore “query_attributes” MySQL client command
JanJakes 0d80a16
Fix "leadLagInfo" to support identifiers and variables as well
JanJakes d12d01e
Fix WEIGHT_STRING(… AS BINARY(…)) by reordering grammar rules
JanJakes ad62b8a
Add support for "ALTER INSTANCE ..." statement
JanJakes 560064a
Fix CAST(… AS FLOAT(N)) by fixing a wrong grammar rule
JanJakes d004dad
Add missing CREATE TABLE options
JanJakes 52d9ed1
Add support for ON DELETE SET DEFAULT
JanJakes fb1e5cf
Add support for ENGINE_ATTRIBUTE to tablespaces
JanJakes e5f59fe
Add support for "JSON_VALUE(..., '...' RETURNING <type>)
JanJakes 106a6e5
Skip disabled MySQL tests
JanJakes e5d3a03
Detect & convert charset when extracting tests
JanJakes 88e3ec2
Fix lexing numbers and identifiers
JanJakes c934f37
Fix CREATE/ALTER TABLE … UNION = ()
JanJakes dfee145
Improve RESET PERSIST fix
JanJakes 1c7c18c
Fix ALTER DABATASE to support optional schema name
JanJakes ae0b31a
Fix support for “:=“ assignment in “updateElement”
JanJakes 839d807
Add “phpize-grammar” logic to grammar conversion script
JanJakes 495e227
Delete all no longer necessary grammar tools
JanJakes d9827ac
Use WP coding styles for all new files
JanJakes a19d439
Ignore coding styles for compressed grammar file
JanJakes 8d810fe
Use WP coding styles for filenames, use file per class, improve direc…
JanJakes d869785
Exclude new parser tests for now
JanJakes f3f0c0f
Remove no longer needed rule name processing
JanJakes d993041
Omit grammar tools and WIP work from plugin builds
JanJakes 8c2d227
Omit all test files from plugin builds
JanJakes 2cf39d4
Omit new lexer & parser from plugin builds for now
JanJakes 09dda8a
Improve class naming and directory structure
JanJakes e42bf52
Move tokenize_query to WP_MySQL_Lexer::tokenize()
JanJakes c42ee15
Move tests under mysql directory
JanJakes 7b84cfc
Move manual lexer tests to PHPUnit
JanJakes be7257f
Move test tools to tests/tools and ignore them in PHPUnit
JanJakes 9c34178
Polish and document test downloading and extraction scripts
JanJakes c2cfd45
Skip queries testing parser stack overflow
JanJakes 03cc66a
Run the full MySQL server test suite in unit tests
JanJakes 5b2f488
Repurpose testing scripts for benchmarking, add single-parse testing …
JanJakes 1e3e254
Use strcspn to handle all types of quoted text
JanJakes 7d5869b
Use strspn to handle unquoted user-defined variables
JanJakes 04bf6e5
Replace preg_match with strspn and UTF-8 decoder when handling identi…
JanJakes 43357d0
Simplify lexer state, use and advance only byte position
JanJakes 6e3b31e
Replace all other byte scanning loops with strspn
JanJakes 3b789fe
Inline bin and hex numbers to number method
JanJakes ad4cf36
Simplify version comment processing
JanJakes 3730d80
Simplify matching EOF
JanJakes 833ca68
Remove channel, token, and token instance properties
JanJakes e0dfccd
Replace type property with local variables and function returns
JanJakes dff4649
Improve property and method naming and documentation
JanJakes 17641f7
Check for the U+0080-U+FFFF range manually, add test coverage
JanJakes ee9d59d
Improve lexer docs and naming
JanJakes b32a9dd
Unify identifier matching with other "read_" methods
JanJakes 432bf1f
Reorder methods to group "read_" methods together
JanJakes 73c6c0a
Implement integer type detection
JanJakes 4464cea
Move empty rule (ε) constant to WP_Parser_Grammar
JanJakes 79b2622
Cleanup, add docs and TODOs
JanJakes 43cec5d
Rename WP_Parser_Tree to WP_Parser_Node and document it
JanJakes 634635e
Explain the "ε" rule in a comment
JanJakes 96dd467
Rename $rule to $branches for better clarity
JanJakes dec0e3f
Use `false` rather than `null` when a parser subtree doesn't match
JanJakes afc70bb
Declare mbstring in dev dependencies
JanJakes 33d55c4
Use more descriptive file name for MySQL test queries
JanJakes dd31359
Read quote from the SQL payload instead of passing it as parameter
JanJakes ae42217
Improve code comments
JanJakes 5942fd6
Simplify next_token() loop condition
JanJakes c567a3f
Inline is-digit and is-whitespace logic
JanJakes 60d995d
Fix 0b and 0x indentifiers
JanJakes ccebf37
Fix unclosed b'...' and x'...' numbers
JanJakes 7155b7b
Rename parser testing script to dump-ast.php
JanJakes 9426969
Implement "next_token()" & "get_next_token" API
JanJakes File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.