
Commit c71bfd7

llama : fix compatibility with old 2 expert models (ggml-org#6735)
1 parent 3b8f1ec commit c71bfd7

File tree

1 file changed (+1, -1 lines)

1 file changed

+1
-1
lines changed

llama.cpp

Lines changed: 1 addition & 1 deletion
@@ -4592,7 +4592,7 @@ static bool llm_load_tensors(
     size_t ctx_size = ggml_tensor_overhead()*(ml.n_tensors + 1); // +1 for models where tok_embd is duplicated as output

     // for moe merged tensors
-    ctx_size += ggml_tensor_overhead()*hparams.n_expert*n_layer;
+    ctx_size += ggml_tensor_overhead()*n_layer*3;

     std::map<ggml_backend_buffer_type_t, ggml_context *> ctx_map;
     for (auto & it : buft_layer_count) {
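
Reading the one-line change: llm_load_tensors reserves ggml tensor overhead for the merged MoE tensors, and there appear to be three such tensors per layer (ffn_gate_exps, ffn_down_exps, ffn_up_exps). The old formula scaled with hparams.n_expert, so an old 2-expert model reserved only 2*n_layer slots, less than the 3*n_layer actually needed; the new formula reserves a fixed 3 per layer regardless of expert count. A minimal arithmetic sketch of the difference, not taken from the commit (GGML_OVERHEAD, n_layer and n_expert below are hypothetical placeholders; in llama.cpp the overhead comes from ggml_tensor_overhead() and the dimensions from the model hparams):

    // Sketch: compare the old and new overhead reservations for a
    // hypothetical old 2-expert model.
    #include <cstddef>
    #include <cstdio>

    int main() {
        const std::size_t GGML_OVERHEAD = 368; // placeholder for ggml_tensor_overhead()
        const int n_layer  = 32;               // hypothetical layer count
        const int n_expert = 2;                // an "old 2 expert" model

        // Before: scaled with the expert count -> only 2 tensor slots per layer.
        std::size_t before = GGML_OVERHEAD * n_expert * n_layer;

        // After: a fixed 3 merged MoE tensors per layer (gate, down, up experts).
        std::size_t after  = GGML_OVERHEAD * n_layer * 3;

        std::printf("reserved before: %zu bytes, after: %zu bytes\n", before, after);
        return 0;
    }

For models with many experts (e.g. 8) the old formula over-reserved but still worked; only the 2-expert case fell short, which matches the compatibility fix described in the commit title.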
