This release further modifies llama.cpp-b4139 to add full support for Llama-3_1-Nemotron-51B.
It has also been tested and confirmed to work with DeciLM-7B and Mistral-7B-v0.3.
Support for DeciLM-7B's dynamic NTK-aware RoPE scaling is not included yet; it will be added once llama.cpp itself supports it.
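
For readers unfamiliar with the term, here is a minimal sketch of what dynamic NTK-aware RoPE scaling usually means: the RoPE base is enlarged only when the sequence grows past the trained context length. The formula follows the common formulation found in Hugging Face transformers; it is not taken from this release or from llama.cpp, and the function names and parameter values below are illustrative assumptions, not DeciLM-7B's confirmed settings.

```python
def dynamic_ntk_base(base, dim, seq_len, max_pos, factor):
    """Return the RoPE base to use for a sequence of length seq_len.

    base    : original RoPE base (e.g. 10000.0)
    dim     : rotary dimension per attention head
    max_pos : context length the model was trained with
    factor  : scaling factor from the model config
    All names here are hypothetical, chosen for illustration.
    """
    if seq_len <= max_pos:
        # Within the trained context: leave the base untouched.
        return base
    # Beyond the trained context: grow the base with the sequence length.
    scale = (factor * seq_len / max_pos) - (factor - 1)
    return base * scale ** (dim / (dim - 2))


def inv_freq(base, dim):
    """Inverse rotary frequencies for each pair of dimensions."""
    return [1.0 / (base ** (i / dim)) for i in range(0, dim, 2)]


# Illustrative values only (assumed, not read from any model config).
short = dynamic_ntk_base(10000.0, 128, 4096, 8192, 2.0)
long = dynamic_ntk_base(10000.0, 128, 16384, 8192, 2.0)
freqs = inv_freq(long, 128)
```

Because the base depends on the current sequence length, the rotary frequencies cannot be fixed once at model load time, which is the part that needs support in the runtime.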