v6.5.0 - Llama 3.1 & MiniCPM v2
General updates
- Removed the triton dependency, as the cogvlm vision model is also removed.
- Redid all benchmarks with more-accurate parameters.
Local Models
Overall, the large number of chat models was becoming unnecessary or redundant. Therefore, I removed models that weren't providing optimal responses to simplify the user experience, and added Llama 3.1.
Removed Models
- Qwen 2 - 0.5b
- Qwen 1.5 - 0.5b
- Qwen 2 - 1.5b
- Qwen 2 - 7b - Redundant with Dolphin Qwen 2 - 7b
- Yi 1.5 - 6b
- Stablelm2 - 12b
- Llama 3 - 8b - Redundant with Dolphin Llama 3 - 8b
Added Models
- Dolphin Llama 3.1 - 8b
Vision Models
Overall, two vision models were removed as unnecessary, and MiniCPM-V-2_6 - 8b was added. As of the date of this release, MiniCPM-V-2_6 - 8b is the best vision model in terms of quality. I currently recommend using this model if you have the time and VRAM.
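For reference, a minimal sketch of how MiniCPM-V-2_6 might be queried directly via Hugging Face transformers, assuming the repo id `openbmb/MiniCPM-V-2_6` and the `.chat()` interface from its model card (neither is part of this program's own code):

```python
# Hedged sketch of querying MiniCPM-V-2_6 with transformers.
# Assumptions: repo id "openbmb/MiniCPM-V-2_6" and its custom .chat() API.

def build_msgs(image, question):
    # MiniCPM-V chat format: a single user turn holding the image and the text.
    return [{"role": "user", "content": [image, question]}]

def describe_image(image, question="Describe this image."):
    # Heavy path: downloads ~8B parameters and needs a GPU with ample VRAM.
    import torch
    from transformers import AutoModel, AutoTokenizer
    model = AutoModel.from_pretrained(
        "openbmb/MiniCPM-V-2_6",
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
    ).eval().cuda()
    tokenizer = AutoTokenizer.from_pretrained(
        "openbmb/MiniCPM-V-2_6", trust_remote_code=True
    )
    return model.chat(image=None, msgs=build_msgs(image, question),
                      tokenizer=tokenizer)
```

Within the program itself the model is selected from the GUI; the sketch above only illustrates the underlying call pattern.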
Removed Models
- cogvlm
- MiniCPM-Llama3
Vector Models
- Added Stella_en_1.5B_v5, which ranks very high on the leaderboard.
- Note: this is a work in progress, as the results currently seem sub-optimal.
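As a sketch of how the new vector model might be used for embeddings, assuming the Hugging Face repo id `dunzhang/stella_en_1.5B_v5` and the standard `sentence-transformers` API (both assumptions, not confirmed by this release):

```python
# Hedged sketch: embedding texts with Stella_en_1.5B_v5 and comparing them.
# Assumptions: repo id "dunzhang/stella_en_1.5B_v5", sentence-transformers API.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def embed_texts(texts):
    # Heavy path: downloads a ~1.5B-parameter model; needs significant VRAM.
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("dunzhang/stella_en_1.5B_v5",
                                trust_remote_code=True)
    return model.encode(texts)  # one row per input text
```

Retrieval then reduces to ranking stored document embeddings by `cosine_similarity` against a query embedding.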
Current Chat and Vision Models

