Releases: BBC-Esq/VectorDB-Plugin

v6.5.0 - Llama 3.1 & MiniCPM v2

07 Aug 18:13
ccc5d5b

General updates

  • Removed the triton dependency since the cogvlm vision model was also removed.
  • Redid all benchmarks with more accurate parameters.

Local Models

Overall, the large number of chat models had become redundant. I therefore removed models that weren't providing optimal responses, to simplify the user experience, and added Llama 3.1.

Removed Models

  • Qwen 2 - 0.5b
  • Qwen 1.5 - 0.5b
  • Qwen 2 - 1.5b
  • Qwen 2 - 7b
    • Redundant with Dolphin Qwen 2 - 7b
  • Yi 1.5 - 6b
  • Stablelm2 - 12b
  • Llama 3 - 8b
    • Redundant with Dolphin Llama 3 - 8b

Added Models

  • Dolphin Llama 3.1 - 8b

Vision Models

Overall, two vision models were removed as unnecessary, and MiniCPM-V-2_6 - 8b was added. As of this release, MiniCPM-V-2_6 - 8b is the best model in terms of quality; I recommend it if you have the time and VRAM.

Removed Models

  • cogvlm
  • MiniCPM-Llama3

Vector Models

  • Added Stella_en_1.5B_v5, which ranks very high on the leaderboard.
    • Note: this is a work in progress, as the results currently seem sub-optimal.

Current Chat and Vision Models

(see the chat model and vision model charts attached to this release)

v6.4 - stream responses

03 Aug 18:03
6cfb5a8

Improvements

  • All "local models" now stream their responses for a better user experience.
  • Various small improvements.
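
Streaming of this kind is typically a generator that yields the growing answer text; a minimal sketch with hypothetical names, not the plugin's actual code:

```python
# Hypothetical sketch of streamed responses: yield the partial text as each
# token arrives instead of returning one final string at the end.
def stream_response(tokens):
    text = ""
    for tok in tokens:
        text += tok
        yield text  # a GUI can repaint the answer with each partial string
```

Iterating the generator produces progressively longer strings, which is what lets the UI update as the model generates.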

Local Models

  • Fixed Dolphin Phi3-Medium
  • Added Yi 1.5 - 6b
  • Added H2O Danube3 - 4b
    • Great quality small model.
  • Removed Mistral v.03 - 7b
    • The model is gated, making it difficult to implement in a program, and there is a plethora of other good models.
  • Removed Llama 3.1 - 8b
    • Same reasoning as for Mistral.
  • Added Internlm 2.5 - 7b
  • Fixed Dolphin-Mistral-Nemo

Vision Models

  • Added Falcon-vlm - 11b
    • Great quality. Uses Llava 1.6's processor.

Falcon-vlm, Llava 1.6 Vicuna - 7b, and Llava 1.6 Vicuna - 13b have arguably surpassed Cogvlm and are faster while using less VRAM. Cogvlm may therefore be deprecated in the future.

Misc.

  • Most, but not all, models now download to the Models folder so you can take the folder with you. Ensuring that all models do so is a work in progress; the goal is to carry the program and all necessary files on a flash drive.

Current Chat and Vision Models

(see the chat model and vision model charts attached to this release)

v6.3.0 - whisper upgrade

31 Jul 20:06
851893b

NOTE

This release was deleted and re-uploaded a few times because of errors, but this version should work.

Updates:

  • Added the large-v3 whisper model and removed large-v2.
  • Added all three distil whisper model sizes.
  • Ensured that all whisper model files are downloaded to the Models/whisper folder in the source code folder.
  • Added error handling to the metrics bar for when the numbers exceed 100% (e.g., a model overflowing VRAM).
  • Modified gui.py to set the multiprocessing start method earlier in the script to avoid some errors.
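
Setting the multiprocessing start method early usually looks like this; a minimal sketch, assuming the fix amounts to calling set_start_method before any processes or pools are created (the function name here is hypothetical):

```python
import multiprocessing as mp

def configure_start_method(method: str = "spawn") -> str:
    # force=True overrides any start method already set by an imported library,
    # which is why calling this as early as possible in the script avoids errors.
    mp.set_start_method(method, force=True)
    return mp.get_start_method()
```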

v6.2.3 - FAST installation

30 Jul 13:26
aca7eec

Uses the impressive uv library, written in Rust, for a 2x-4x speedup of setup_windows.py.

Make sure to run pip install uv first, as outlined in the updated installation instructions.
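
Per the updated installation instructions, the order matters: uv has to be installed before setup_windows.py can use it. A sketch of the sequence (check the repository's install guide for the exact steps):

```shell
# install uv first so setup_windows.py can delegate package installs to it
pip install uv
# then run the revamped setup script
python setup_windows.py
```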

v6.2.2 - Welcome LLAVA_NEXT

27 Jul 12:11
f87ba5c

New Vector Models

Reintroducing these after an unduly long hiatus:

New Vision Models

Welcome llava-next, also known as Llava 1.6:

Other Changes

  • Removed sentence-t5-xxl vector model.
  • Set batch sizes for all current vector models.
  • Fixed a bug where the chat model didn't automatically eject when the program's window was closed, preventing the command prompt from being returned to the user.

v6.2.1 - PERFECT install patch

25 Jul 04:08
cc59151

Patch release adding dependencies that were accidentally omitted from setup_windows.py. See the release notes for version 6.2.0 for more details on the release itself.

v6.2.0 - PERFECT installation

24 Jul 17:50
d339ada
  • Note: use the setup_windows.py script attached to this release instead, or check out release 6.2.1.

Breaking Changes

  • Overhauled the installation procedure. Too many dependencies were creating conflicts that neither pip, pip-compile, nor any other approach I'm aware of could solve. Thus, setup_windows.py has been completely revamped to install on Windows + Nvidia GPU systems.
  • It should now install every library needed, every single time, without exceptions. The tradeoff is that it's slightly slower, which is no big deal.
  • If I have time, I will re-incorporate an installation procedure for CPU-only systems.

New Chat Models

Other Changes

  • Cleaned up unneeded portions of scripts.
  • Temporarily disabled the Phi3 Mini models due to errors.
  • Updated the system message for the chat models.
  • Upgraded to transformers==4.43.1 and downgraded to cuda==12.1 (to avoid errors).

Currently Supported Chat Models:

(chat model chart)

v6.1.0 - complexity growing!

19 Jul 23:00
3cacc39

Version 6.1

Stability-geared release.

Bug Fixes

  • Vector databases can now be created with images again and searched. Sentence-transformers was the main culprit.
  • Solved the issue of the DB not being created by using from_texts instead of from_documents with the TileDB library.
  • Massive improvement in stability when switching to/from "local models." This involved heavy multiprocessing troubleshooting.
  • Greatly improved the installation procedure - i.e., setup_windows.py and requirements.txt, which were responsible for many conflicting dependencies and therefore random errors.
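
The from_documents → from_texts switch essentially comes down to splitting each document into its text and metadata before indexing, which sidesteps the code path that failed. A rough stdlib-only illustration (the Document shape mirrors LangChain's, but the names here are simplified assumptions, not the plugin's code):

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    page_content: str
    metadata: dict = field(default_factory=dict)

def documents_to_texts(docs):
    # from_texts(texts, metadatas=...) accepts the same data that
    # from_documents would have extracted internally.
    texts = [d.page_content for d in docs]
    metadatas = [d.metadata for d in docs]
    return texts, metadatas
```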

Regressions

  • Temporarily commented out the original Phi3 models to solve an inference issue; dolphin phi3 works fine.

v6.0.2

02 Jul 11:35
61a467f

Improvements

  • Added all of these chat models to select from when chatting with the vector database (see the chat model chart)!
  • Added all of these vision models to select from when processing images (see the vision model chart).
  • Added ChatTTS and Google TTS as text-to-speech backends.
  • Added hyperlinks to the model cards for all vector, chat, and vision models.
  • Added the ability to restore backups of your databases.
  • Significantly improved the "Test Vision Model" tool so it can test all vision models.
  • Revamped the User Manual.
  • Added a pull-down menu to select the vector model instead of having to navigate to and select a particular folder.
  • MASSIVE refactoring.
  • MASSIVE restructuring of setup.py and requirements.txt due to the crazy increase in dependencies. "Dependency hell" is a real thing...

CUDA no longer required (sort of):

  • Previously, you were required to install CUDA and CUDNN directly from Nvidia, which was by its very nature a system-wide installation.
  • Now, ALL CUDA and CUDNN-related files are "pip installed" within the setup.py script. This allows you to either install a different version of CUDA system-wide to be used by other programs or, alternatively, not install CUDA system-wide at all.
  • This was achieved by installing CUDA and CUDNN as Python dependencies and temporarily adding these specific installations to the PATH whenever the program is running.
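
Prepending pip-installed CUDA/CUDNN directories to PATH at startup can be sketched like this; the nvidia/*/bin layout is an assumption about how Nvidia's wheels unpack, and the function name is hypothetical, not the plugin's actual code:

```python
import os
import site
from pathlib import Path

def add_pip_cuda_to_path() -> list:
    # Hypothetical: wheels such as nvidia-cudnn-cu12 unpack their libraries
    # under site-packages/nvidia/<package>/bin; prepend any such dirs to PATH
    # for this process only, leaving the system-wide PATH untouched.
    found = []
    for sp in site.getsitepackages():
        nvidia_root = Path(sp) / "nvidia"
        if nvidia_root.is_dir():
            found.extend(str(p) for p in nvidia_root.glob("*/bin") if p.is_dir())
    if found:
        os.environ["PATH"] = os.pathsep.join(found + [os.environ.get("PATH", "")])
    return found
```

Because the change is made to os.environ at runtime, it vanishes when the program exits, which is what lets a different system-wide CUDA coexist.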

IMPORTANT: restructuring of the model downloading procedure

  • Previously, the vector models were downloaded using a git clone command, and other types of models (i.e., chat, vision, TTS, whisper) were automatically downloaded to the system's cache folder.
  • Now, almost all models (except vector models) are downloaded to a specific sub-folder within the Models folder. In future releases, all models will be downloaded this way.
  • The goal is to make the program as portable as possible; for example, copying the "src" folder (and therefore all the models) to a thumb drive to use on a laptop without re-downloading everything. Eventually, all paths to models selected within the program will be relative, so wherever you move the src folder (even to a different computer), everything should work just as it did, without downloading anything again or changing any settings.
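
Relative model paths of the kind described can be built like this; a minimal sketch with hypothetical names, assuming each model lives under Models/<category>/<name> inside the src folder:

```python
from pathlib import Path

def model_dir(src_dir, category: str, model_name: str) -> Path:
    # Resolving against the src folder (rather than an absolute path) means
    # the whole folder can move to a thumb drive and the paths still work.
    return Path(src_dir) / "Models" / category / model_name
```

For example, model_dir("src", "whisper", "large-v3") yields src/Models/whisper/large-v3 no matter where the src folder itself lives.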

Bug Fixes

  • Fixed the "local model" not being removed from memory even after a different chat model was loaded.
  • Fixed the "local model" exponentially increasing memory usage when asking multiple questions (best guess: it was re-loading the model each time).
  • Re-added a script that was accidentally deleted from this repository during the last release.
  • Fixed a huge issue involving sentence-transformers, TileDB, and Langchain that threw a variety of un-fixable errors when trying to create a vector database. This required modifying the source code of sentence-transformers itself as a temporary fix, but everything seems to be working much better, so this may become permanent.
  • Fixed numerous other bugs.

Known issues

  • Image search is NOT WORKING but will be fixed in an incremental release.
  • There's an issue with Langchain specific to TileDB; specifically, the from_documents method. A temporary workaround was to modify the sentence-transformers source code. A subsequent patch will likely use the from_texts method instead, but the database seems to be working fine.

Please create an issue with any bugs you encounter!

Credit goes to the new Claude 3.5 Sonnet for finally solving the memory issue around loading/unloading chat models - in a separate process, no less.

v6.0.1 - hard fought well done

30 Jun 11:12
cef3a7a

Improvements

  • Added all of these chat models to select from when chatting with the vector database (see the chat model chart)!
  • Added all of these vision models to select from when processing images (see the vision model chart).
  • Added ChatTTS and Google TTS as text-to-speech backends.
  • Added hyperlinks to the model cards for all vector, chat, and vision models.
  • Added the ability to restore backups of your databases.
  • Significantly improved the "Test Vision Model" tool so it can test all vision models.
  • Revamped the User Manual.
  • Added a pull-down menu to select the vector model instead of having to navigate to and select a particular folder.
  • MASSIVE refactoring.
  • MASSIVE restructuring of setup.py and requirements.txt due to the crazy increase in dependencies. "Dependency hell" is a real thing...

CUDA no longer required (sort of):

  • Previously, you had to install CUDA from Nvidia before the program would work - and not only CUDA but also CUDNN, which required a developer account (albeit one that only needs an email address).
  • Now, ALL CUDA-RELATED FILES ARE PIP INSTALLED INTO THE SITE-PACKAGES FOLDER.
  • This means you no longer have to install CUDA/CUDNN system-wide - or you can install a different version system-wide. Basically, setup.py now handles everything.
  • This was achieved by setting the paths to the CUDA/CUDNN files temporarily whenever the program starts, so no manual changing of paths is needed.

IMPORTANT: restructuring of the model downloading procedure

  • Previously, the vector models were downloaded via a git clone command when you clicked a button. All other models (e.g., chat, vision) were automatically downloaded to the system's cache folder.
  • Now, all models (except vector models) are downloaded to the Models folder within the main src folder so you can see them. Vector models will eventually be downloaded the same way.
  • The goal is to make the program as portable as possible - e.g., put everything on a thumb drive and use it on your laptop without having to re-download everything. More importantly, all paths are now relative, so even if you move your src folder on your computer, the paths to the models should still work (yet to be confirmed for vector models).

Bug Fixes

  • Fixed the issue where the local chat model was not being removed from memory.
  • Fixed the issue where the local chat model would exponentially increase VRAM usage when you asked more than one question.
  • Re-added a crucial script that had been accidentally deleted, which went unnoticed until now.
  • Fixed a huge issue creating databases due to conflicts among sentence-transformers, TileDB, and Langchain. Database creation should now be far more reliable!
  • Fixed numerous other bugs.

Known issues

  • Image search is NOT WORKING currently but will be fixed in an incremental release. Whether you create an all-image database or combine images with other documents, questions about images supposedly entered into the database are not being answered.
  • There's an issue with Langchain specific to TileDB; specifically, the from_documents method. A temporary workaround was to modify the sentence-transformers source code. A subsequent patch will likely use the from_texts method instead, but the database seems to be working fine.

Please create an issue with any bugs you encounter!

Credit goes to the new Claude 3.5 Sonnet for finally solving the memory issue around loading/unloading chat models - in a separate process, no less.