TextDB doesn't work for Chinese language #1370

lejunzhu · 2025-02-13T05:00:05Z

Describe the Issue
In the same session, if I put something like "Alice and Bob are friends" in TextDB and then ask "Who's Alice's friend", I can see the info snippet in the debug log and the model gives the right answer. But if I put a sentence with the same meaning in Chinese, there is no snippet in the log and the model doesn't know what I'm talking about.

Additional Information:
I'm using version 1.83.1.

esolithe · 2025-02-13T18:48:00Z

This is a known limitation of the current TextDB implementation in Kobold Lite. The way it handles grammar and tenses is based on the English language, which makes it less effective for other languages.

The two things needed for multi-language support are:

A stop list of common words in Chinese (i.e. words which should be ignored because they usually carry no significance, like "is", "the", etc.)
A stemmer (preferably written in JavaScript) which works on Chinese (makes words more general, e.g. changes winning -> win)

If both of these exist, it should be pretty easy to add to Lite (https://github.com/LostRuins/lite.koboldai.net), which is the front end bundled in KoboldCpp.
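
To make the two pieces concrete, here is a minimal sketch of a per-language stop list plus stemmer. It is not taken from Lite's source: `pipelines`, `extractTerms`, the word lists, the suffix-stripping "stemmer", and the one-character tokenizer for Chinese are all hypothetical stand-ins chosen for illustration. In particular, how TextDB actually tokenizes text is an assumption here, and a real Chinese pipeline would likely need a proper word segmenter.

```javascript
// Hypothetical sketch: none of these names come from Kobold Lite.
const pipelines = {
  en: {
    // Tiny illustrative stop list, not a complete one.
    stopWords: new Set(["is", "the", "a", "an", "and", "are", "who", "s"]),
    // English words are separated by whitespace/punctuation.
    tokenize: (text) => text.toLowerCase().match(/[a-z]+/g) || [],
    // Toy stemmer standing in for a real one (e.g. Porter):
    // strip a common suffix, then undouble a trailing consonant.
    stem: (w) =>
      w.replace(/(ing|ed|s)$/, "").replace(/([b-df-hj-np-tv-z])\1$/, "$1"),
  },
  zh: {
    // Tiny illustrative stop list of common function words.
    stopWords: new Set(["的", "是", "和", "了", "谁"]),
    // Assumption: Chinese has no spaces between words, so this sketch
    // falls back to one token per Han character. A word segmenter
    // would give better terms.
    tokenize: (text) => text.match(/\p{Script=Han}/gu) || [],
    // Chinese words are not inflected, so stemming is the identity here.
    stem: (w) => w,
  },
};

// Tokenize, drop stop words, then stem what is left.
function extractTerms(text, lang) {
  const p = pipelines[lang];
  return p
    .tokenize(text)
    .filter((t) => !p.stopWords.has(t))
    .map(p.stem)
    .filter((t) => t.length > 0);
}

console.log(extractTerms("Alice and Bob are friends", "en"));
// → [ 'alice', 'bob', 'friend' ]
console.log(extractTerms("Who's Alice's friend", "en"));
// → [ 'alice', 'friend' ]
console.log(extractTerms("爱丽丝和鲍勃是朋友", "zh"));
// → [ '爱', '丽', '丝', '鲍', '勃', '朋', '友' ]
```

With the English pipeline, the stored sentence and the question share the terms `alice` and `friend`, which is the kind of overlap a keyword search can match on. Running Chinese text through an English-only tokenizer and stop list would produce no usable terms at all, which is consistent with the missing snippet reported above.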
