Skip to content

git-{commit,fetch} file detection & lexing #142

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: default
Choose a base branch
from

Conversation

jeremybobbin
Copy link

git-commit lexer

  • 72-byte commit message is a STRING
  • 'Cc:', 'Co-authored-by:', 'Reviewed-by:' (not sure where to find a list for these - just pulled the most common ones from the kernel)
  • commit hashes: numbers

@jeremybobbin
Copy link
Author

Open to feedback - especially open to removing the Co-authored-by, Reviewed-by, etc, keywords as these are vendor-specific.

As long as we can get the file-detection for both & comments highlighted in commit messages, I'm happy.

@rgieseke
Copy link

rgieseke commented Feb 4, 2025

Very nice, I really like to have a git-commit lexer in Textadept, I use the terminal version for writing commit messages a lot!

I'm not sure whether the line-length should be included at the lexer level. Once there is a git-commit lexer one could write a handler to add edge_columns. There might also be different needs, for example some recommend having the first line summary to be shorter (e.g. 50 chars).

As for the word-list, I think something like "Bluetooth" is very Linux kernel specific I assume? Maybe just have the ones like "Co-authored-by"?

@jeremybobbin
Copy link
Author

jeremybobbin commented Feb 4, 2025

I'll take out the "Bluetooth" & stuff & just keep the SMTP & *-by: ones.

It looks like some lexers make use of variables on scintillua:

$ grep -ho 'scintillua\.[a-z_.]\+' $(ls lexers/* | grep -v lexer\.lua)  | sort | uniq
scintillua.angle.braces
scintillua.comment
scintillua.rest.by.sphinx.convention
scintillua.word.chars

It seems like the comments in git-rebase are limited to 76 characters long, so I'll set that as the default,
and make it configurable by scintilllua['git-commit'].summary_length or something.
... If that is something applicable - I'll think more about the appropriate scope of restricting the line length.

Thanks for your feedback

@rgieseke
Copy link

rgieseke commented Feb 5, 2025

Ye, details on what is appropriate for a lexer to be included are up to @orbitalquark - i'm just a happy user and would look forward to have some git lexing.

What could be a cool addition is embedding the diff lexer to show the diff highlighted when one uses git commit --verbose

@orbitalquark
Copy link
Owner

If you wanted to lex with a configurable property, you'd have to do this from inside a pattern function (lpeg.P(function() ... end)) because patterns are compiled at load time, and properties can be set after this. Most of the properties you list are hints for editors on language-specific things like if '<' and '>' are considered matching brace characters, what constitutes a word, etc. Configurable properties are 'fold', 'fold.scintillua.on.zero.sum.lines', etc. This works because the folder is not compiled and evaluates variables at fold time.

That said, if you can get configurable lexer properties to work, the only concern I'd have is performance, but since commit messages are supposed to be small, perhaps it's not much of an issue.

Otherwise, this looks good to me for the most part. I can do a more thorough review when you're finished.

@jeremybobbin
Copy link
Author

Just pushed the changes.

Bugs I haven't fixed:

  • it doesn't highlight the summary when there are no newlines in the file
  • if the first N lines are comments, then it doesn't highlight N+1 as a summary

Couldn't quite figure out how to match against EOF(?) for the first issue.
I'm guessing I may need to introduce state for the second?

Though in this state, it works for me, I appreciate any input you have.

@orbitalquark
Copy link
Owner

EOF is -P(1) or the idiom P(-1). I'm not sure about your second issue. It might have to do with the trailing (lpeg.P('\n')+c) that requires a match. Maybe appending + P(-1) will fix both issues?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants