TextDB doesn't work for Chinese language #1370

Open
lejunzhu opened this issue Feb 13, 2025 · 1 comment

Comments

@lejunzhu

Describe the Issue
In the same session, if I put something like "Alice and Bob are friends" in TextDB and then ask "Who's Alice's friend?", I can see the info snippet in the debug log and the model gives the right answer. But if I put a sentence with the same meaning in Chinese, there's no snippet in the log and the model doesn't know what I'm talking about.

Additional Information:
I'm using version 1.83.1.

@esolithe

This is a known limitation of the current TextDB implementation in Kobold Lite. Its handling of grammar and tenses is built around English, so it is less effective for other languages.

The two areas needed for multi language support are:

  • A stop list of common words in Chinese (i.e. words that should be ignored because they usually carry no significance, like "is", "the", etc.)
  • A stemmer (preferably written in JavaScript) that works on Chinese (it makes words more general, e.g. changes winning -> win)

If both of these exist, it should be pretty easy to add them to Lite (https://github.com/LostRuins/lite.koboldai.net), which is the front end bundled with KoboldCpp.
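
To make the two pieces concrete, here is a minimal sketch of how a per-language stop list and stemmer could plug into keyword extraction. This is not Lite's actual code: `LANGUAGES`, `extractKeywords`, `toyEnglishStem` and `segmentChinese` are all hypothetical names, the stop lists are tiny placeholders, and the English stemmer is a toy. The Chinese entry assumes that splitting text into words is done with the standard `Intl.Segmenter` API and that an identity "stemmer" is acceptable, since Chinese words are not inflected the way English ones are.

```javascript
// Hypothetical sketch only: none of these names come from Lite's source.

// Toy English stemmer, just enough to show the idea (winning -> win).
function toyEnglishStem(word) {
  if (/(.)\1ing$/.test(word)) return word.replace(/(.)\1ing$/, "$1");
  if (word.length > 4 && word.endsWith("ing")) return word.slice(0, -3);
  if (word.length > 3 && word.endsWith("s")) return word.slice(0, -1);
  return word;
}

// Chinese has no spaces between words, so it needs a word segmenter.
// Intl.Segmenter is a standard API; its exact word boundaries depend on
// the runtime's ICU data.
function segmentChinese(text) {
  const segmenter = new Intl.Segmenter("zh", { granularity: "word" });
  return Array.from(segmenter.segment(text))
    .filter((s) => s.isWordLike)
    .map((s) => s.segment);
}

// Each language supplies a tokenizer, a stop list and a stemmer.
const LANGUAGES = {
  en: {
    tokenize: (text) => text.toLowerCase().match(/[a-z]+/g) || [],
    stopWords: new Set(["is", "the", "a", "and", "are", "who", "s"]),
    stem: toyEnglishStem,
  },
  zh: {
    tokenize: segmentChinese,
    stopWords: new Set(["的", "是", "和", "了", "谁"]), // placeholder list
    stem: (word) => word, // assumption: no stemming needed
  },
};

// Tokenize, drop stop words, then stem what is left.
function extractKeywords(text, lang) {
  return lang
    .tokenize(text)
    .filter((word) => !lang.stopWords.has(word))
    .map((word) => lang.stem(word))
    .filter((word) => word.length > 0);
}

console.log(extractKeywords("Alice and Bob are friends", LANGUAGES.en));
// [ 'alice', 'bob', 'friend' ]
console.log(extractKeywords("Who's Alice's friend", LANGUAGES.en));
// [ 'alice', 'friend' ]
```

With this shape, the stored sentence and the question share the keywords "alice" and "friend", which is what lets a snippet match. Supporting another language would then mostly mean filling in a real stop list and a suitable tokenizer or stemmer for its entry.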
