Commit bdf6019

Merge branch 'master' into croco_nex_0
2 parents 0395eb5 + 1204f97 commit bdf6019

13 files changed, +570 -61 lines changed

convert_hf_to_gguf.py

Lines changed: 57 additions & 0 deletions
@@ -2562,6 +2562,63 @@ def generate_extra_tensors(self) -> Iterable[tuple[str, Tensor]]:
         yield (self.format_tensor_name(gguf.MODEL_TENSOR.ROPE_FACTORS_SHORT), torch.tensor(short_factors, dtype=torch.float32))
 
 
+@Model.register("PhiMoEForCausalLM")
+class PhiMoeModel(Phi3MiniModel):
+    model_arch = gguf.MODEL_ARCH.PHIMOE
+
+    _experts: list[dict[str, Tensor]] | None = None
+
+    def set_gguf_parameters(self):
+        super().set_gguf_parameters()
+        self.gguf_writer.add_expert_used_count(self.hparams["num_experts_per_tok"])
+        self.gguf_writer.add_expert_count(self.hparams["num_local_experts"])
+
+    def modify_tensors(self, data_torch: Tensor, name: str, bid: int | None) -> Iterable[tuple[str, Tensor]]:
+        # process the experts separately
+        if name.find("block_sparse_moe.experts") != -1:
+            n_experts = self.hparams["num_local_experts"]
+            assert bid is not None
+
+            if self._experts is None:
+                self._experts = [{} for _ in range(self.block_count)]
+
+            self._experts[bid][name] = data_torch
+
+            if len(self._experts[bid]) >= n_experts * 3:
+                tensors: list[tuple[str, Tensor]] = []
+
+                # merge the experts into a single 3d tensor
+                for w_name in ["w1", "w2", "w3"]:
+                    datas: list[Tensor] = []
+
+                    for xid in range(n_experts):
+                        ename = f"model.layers.{bid}.block_sparse_moe.experts.{xid}.{w_name}.weight"
+                        datas.append(self._experts[bid][ename])
+                        del self._experts[bid][ename]
+
+                    data_torch = torch.stack(datas, dim=0)
+
+                    merged_name = f"model.layers.{bid}.block_sparse_moe.experts.{w_name}.weight"
+
+                    new_name = self.map_tensor_name(merged_name)
+
+                    tensors.append((new_name, data_torch))
+                return tensors
+            else:
+                return []
+
+        return [(self.map_tensor_name(name), data_torch)]
+
+    def prepare_tensors(self):
+        super().prepare_tensors()
+
+        if self._experts is not None:
+            # flatten `list[dict[str, Tensor]]` into `list[str]`
+            experts = [k for d in self._experts for k in d.keys()]
+            if len(experts) > 0:
+                raise ValueError(f"Unprocessed experts: {experts}")
+
+
 @Model.register("PlamoForCausalLM")
 class PlamoModel(Model):
     model_arch = gguf.MODEL_ARCH.PLAMO

docs/cuda-fedora.md

Lines changed: 317 additions & 0 deletions
@@ -0,0 +1,317 @@
# Setting Up CUDA on Fedora

In this guide we set up [NVIDIA CUDA](https://docs.nvidia.com/cuda/) in a toolbox container. This guide applies to:
- [Fedora Workstation](https://fedoraproject.org/workstation/)
- [Atomic Desktops for Fedora](https://fedoraproject.org/atomic-desktops/)
- [Fedora Spins](https://fedoraproject.org/spins)
- [Other Distributions](https://containertoolbx.org/distros/), including `Red Hat Enterprise Linux >= 8.`, `Arch Linux`, and `Ubuntu`.

## Table of Contents

- [Prerequisites](#prerequisites)
- [Monitoring NVIDIA CUDA Repositories](#monitoring-nvidia-cuda-repositories)
- [Using the Fedora 39 CUDA Repository](#using-the-fedora-39-cuda-repository)
- [Creating a Fedora Toolbox Environment](#creating-a-fedora-toolbox-environment)
- [Installing Essential Development Tools](#installing-essential-development-tools)
- [Adding the CUDA Repository](#adding-the-cuda-repository)
- [Installing `nvidia-driver-libs`](#installing-nvidia-driver-libs)
- [Manually Resolving Package Conflicts](#manually-resolving-package-conflicts)
- [Finalizing the Installation of `nvidia-driver-libs`](#finalizing-the-installation-of-nvidia-driver-libs)
- [Installing the CUDA Meta-Package](#installing-the-cuda-meta-package)
- [Configuring the Environment](#configuring-the-environment)
- [Verifying the Installation](#verifying-the-installation)
- [Conclusion](#conclusion)
- [Troubleshooting](#troubleshooting)
- [Additional Notes](#additional-notes)
- [References](#references)

## Prerequisites

- **Toolbox Installed on the Host System:** `Fedora Silverblue` and `Fedora Workstation` both include toolbox by default; other distributions may need to install the [toolbox package](https://containertoolbx.org/install/).
- **NVIDIA Drivers and Graphics Card Installed on the Host System (optional):** To run CUDA programs such as `llama.cpp`, the host must be set up to access your NVIDIA hardware. Fedora hosts can use the [RPM Fusion Repository](https://rpmfusion.org/Howto/NVIDIA).
- **Internet connectivity** to download packages.

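The prerequisites above can be checked from a host shell before starting. This is a minimal sketch, assuming a POSIX-compatible shell; `have_cmd` is a hypothetical helper name, and `nvidia-smi` is assumed to be the utility installed alongside the host's NVIDIA driver.

```bash
# have_cmd NAME: succeed if NAME resolves to a command in the current shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report which of the tools this guide relies on are present on the host.
for tool in toolbox nvidia-smi; do
    if have_cmd "$tool"; then
        echo "found: $tool"
    else
        echo "missing: $tool"
    fi
done
```

A missing `nvidia-smi` only matters if you intend to run CUDA programs; building them inside the toolbox does not require it.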
### Monitoring NVIDIA CUDA Repositories

Before proceeding, check whether NVIDIA has updated its CUDA repositories for your Fedora version. NVIDIA's repositories can be found at:

- [Fedora 40 CUDA Repository](https://developer.download.nvidia.com/compute/cuda/repos/fedora40/x86_64/)
- [Fedora 41 CUDA Repository](https://developer.download.nvidia.com/compute/cuda/repos/fedora41/x86_64/)

As of the latest update to this guide, these repositories do not contain the `cuda` meta-package or are missing essential components.

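The repository links above share one URL pattern, so the address for another Fedora release can be derived rather than looked up. This is a small sketch that assumes NVIDIA keeps the same pattern for future releases; `cuda_repo_url` is a hypothetical helper name.

```bash
# cuda_repo_url VERSION: print the CUDA repository URL for a Fedora release,
# following the pattern of the links listed in this guide.
cuda_repo_url() {
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/fedora%s/x86_64/\n' "$1"
}

# Print the repository URLs mentioned in this guide.
for release in 39 40 41; do
    cuda_repo_url "$release"
done
```

Each URL can then be opened in a browser to see whether a `cuda` package has been published for that release.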
### Using the Fedora 39 CUDA Repository

Since the newer repositories are incomplete, we'll use the Fedora 39 repository:

- [Fedora 39 CUDA Repository](https://developer.download.nvidia.com/compute/cuda/repos/fedora39/x86_64/)

**Note:** Fedora 39 is no longer maintained, so we recommend using a toolbox environment to prevent system conflicts.

## Creating a Fedora Toolbox Environment

This guide focuses on Fedora hosts, but with small adjustments it can work for other hosts. Using a Fedora 39 toolbox allows us to install the necessary packages without affecting the host system.

**Note:** Toolbox is available for other systems, and even without Toolbox, it is possible to use Podman or Docker.

We do not recommend installing these packages directly on the host system: Fedora 39 is out of maintenance, and the host should instead run a maintained version of Fedora.

1. **Create a Fedora 39 Toolbox:**

   ```bash
   toolbox create --image registry.fedoraproject.org/fedora-toolbox:39 --container fedora-toolbox-39-cuda
   ```

2. **Enter the Toolbox:**

   ```bash
   toolbox enter --container fedora-toolbox-39-cuda
   ```

Inside the toolbox, you can use `sudo` without a password, so you can install packages without affecting the host system.

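After entering the toolbox, it can help to confirm that the shell is running the Fedora 39 image and not the host release. This is a minimal sketch, assuming the container provides the standard `/etc/os-release` file; `os_version_id` is a hypothetical helper name.

```bash
# os_version_id FILE: print VERSION_ID from an os-release style file.
# The file is sourced in a subshell so its variables do not leak into the caller.
os_version_id() {
    ( . "$1" && printf '%s\n' "$VERSION_ID" )
}

if [ -r /etc/os-release ]; then
    echo "Running release: $(os_version_id /etc/os-release)"
fi
```

Inside the container created above this should report `39`; the host will typically report its own, newer release.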
## Installing Essential Development Tools

1. **Synchronize the DNF Package Manager:**

   ```bash
   sudo dnf distro-sync
   ```

2. **Install the Default Text Editor (Optional):**

   ```bash
   sudo dnf install vim-default-editor --allowerasing
   ```

   The `--allowerasing` flag lets DNF remove already-installed packages that conflict with the requested one, such as another default-editor package.

3. **Install Development Tools and Libraries:**

   ```bash
   sudo dnf install @c-development @development-tools cmake
   ```

   This installs essential packages for compiling software, including `gcc`, `make`, and development headers.

## Adding the CUDA Repository

Add the NVIDIA CUDA repository to your DNF configuration:

```bash
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora39/x86_64/cuda-fedora39.repo
```

After adding the repository, synchronize the package manager again:

```bash
sudo dnf distro-sync
```

## Installing `nvidia-driver-libs`

Attempt to install `nvidia-driver-libs`:

```bash
sudo dnf install nvidia-driver-libs
```

**Explanation:**

- `nvidia-driver-libs` contains the NVIDIA driver libraries required by CUDA.
- This step might fail due to conflicts with existing NVIDIA drivers on the host system.

## Manually Resolving Package Conflicts

If the installation fails due to conflicts, we'll manually download and install the required packages, excluding conflicting files.

### 1. Download the `nvidia-driver-libs` RPM

```bash
sudo dnf download --arch x86_64 nvidia-driver-libs
```

You should see a file similar to:

```
nvidia-driver-libs-560.35.05-1.fc39.x86_64.rpm
```

### 2. Attempt to Install the RPM

```bash
sudo dnf install nvidia-driver-libs-560.35.05-1.fc39.x86_64.rpm
```

**Expected Error:**

Installation may fail with errors pointing to conflicts with `egl-gbm` and `egl-wayland`.

**Note: It is important to carefully read the error messages to identify the exact paths that need to be excluded.**

### 3. Download Dependencies

```bash
sudo dnf download --arch x86_64 egl-gbm egl-wayland
```

### 4. Install `egl-gbm` with Excluded Paths

Exclude conflicting files during installation:

```bash
sudo rpm --install --verbose --hash \
  --excludepath=/usr/lib64/libnvidia-egl-gbm.so.1.1.2 \
  --excludepath=/usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json \
  egl-gbm-1.1.2^20240919gitb24587d-3.fc39.x86_64.rpm
```

**Explanation:**

- The `--excludepath` option skips installing files that conflict with existing files.
- Adjust the paths based on the error messages you receive.

### 5. Install `egl-wayland` with Excluded Paths

```bash
sudo rpm --install --verbose --hash \
  --excludepath=/usr/share/egl/egl_external_platform.d/10_nvidia_wayland.json \
  egl-wayland-1.1.17^20241118giteeb29e1-5.fc39.x86_64.rpm
```

### 6. Install `nvidia-driver-libs` with Excluded Paths

```bash
sudo rpm --install --verbose --hash \
  --excludepath=/usr/share/glvnd/egl_vendor.d/10_nvidia.json \
  --excludepath=/usr/share/nvidia/nvoptix.bin \
  nvidia-driver-libs-560.35.05-1.fc39.x86_64.rpm
```

**Note:**

- Replace the paths with the ones causing conflicts in your installation if they differ.
- The `--verbose` option prints detailed output, and `--hash` prints progress marks as each package is unpacked.

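The advice to adjust the excluded paths based on the error messages can be partly automated. The sketch below assumes each conflict is reported on a line of the form `file <path> from install of <package> conflicts with ...`; compare this against the messages you actually receive before relying on it. `conflict_paths` and `exclude_flags` are hypothetical helper names, and the package names in the sample line are placeholders.

```bash
# conflict_paths: read installer error output on stdin and print each
# conflicting file path once. Assumes lines of the form
#   file /some/path from install of PACKAGE conflicts with ...
conflict_paths() {
    sed -n 's|^[[:space:]]*file \(/[^ ]*\) from install of .*|\1|p' | sort -u
}

# exclude_flags: turn the paths on stdin into --excludepath options, one per line.
exclude_flags() {
    while IFS= read -r path; do
        printf '%s\n' "--excludepath=$path"
    done
}

# Example with a hand-written sample line (not captured output).
sample='  file /usr/share/nvidia/nvoptix.bin from install of pkg-a conflicts with file from package pkg-b'
printf '%s\n' "$sample" | conflict_paths | exclude_flags
```

The printed options can be pasted into the `rpm --install` commands shown in steps 4 to 6.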
## Finalizing the Installation of `nvidia-driver-libs`

After manually installing the dependencies, run:

```bash
sudo dnf install nvidia-driver-libs
```

You should receive a message indicating the package is already installed:

```
Package nvidia-driver-libs-3:560.35.05-1.fc39.x86_64 is already installed.
Dependencies resolved.
Nothing to do.
Complete!
```

## Installing the CUDA Meta-Package

Now that the driver libraries are installed, proceed to install CUDA:

```bash
sudo dnf install cuda
```

This installs the CUDA toolkit and associated packages.

## Configuring the Environment

To use CUDA, add its binary directory to your system's `PATH`.

1. **Create a Profile Script:**

   ```bash
   sudo sh -c 'echo "export PATH=\$PATH:/usr/local/cuda/bin" >> /etc/profile.d/cuda.sh'
   ```

   **Explanation:**

   - We add the script to `/etc/profile.d/` because the `/etc/` folder is unique to this particular container and is not shared with other containers or the host system.
   - The backslash `\` before `$PATH` prevents the variable from being expanded when the line is written, so the script contains the literal text `$PATH`.

2. **Make the Script Executable:**

   ```bash
   sudo chmod +x /etc/profile.d/cuda.sh
   ```

3. **Source the Script to Update Your Environment:**

   ```bash
   source /etc/profile.d/cuda.sh
   ```

**Note:** This command updates your current shell session with the new `PATH`. The `/etc/profile.d/cuda.sh` script ensures that the CUDA binaries are available in your `PATH` for all future sessions.

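Because step 1 uses `>>`, running it a second time appends a second `export` line to the script. As a hedged alternative for the script's contents, the sketch below extends `PATH` only when the CUDA directory is not already listed; `path_contains` is a hypothetical helper name.

```bash
# path_contains LIST DIR: succeed if DIR is one of the colon-separated entries of LIST.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Append the CUDA binary directory only when it is not already present.
if ! path_contains "$PATH" /usr/local/cuda/bin; then
    export PATH="$PATH:/usr/local/cuda/bin"
fi
```

Wrapping both arguments in colons makes the match exact, so a directory such as `/usr/local/cuda/bin2` is not mistaken for `/usr/local/cuda/bin`.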
## Verifying the Installation

To confirm that CUDA is correctly installed and configured, check the version of the NVIDIA CUDA Compiler (`nvcc`):

```bash
nvcc --version
```

You should see output similar to:

```
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Oct_29_23:50:19_PDT_2024
Cuda compilation tools, release 12.6, V12.6.85
Build cuda_12.6.r12.6/compiler.35059454_0
```

This output confirms that the CUDA compiler is accessible and indicates the installed version.

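When the check needs to run in a script, the release number can be extracted from the output shown above. This is a minimal sketch, assuming the output keeps the `release X.Y,` wording of the sample; `cuda_release` is a hypothetical helper name.

```bash
# cuda_release: read `nvcc --version` output on stdin and print the release
# number from the "Cuda compilation tools, release X.Y, ..." line.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA release: $(nvcc --version | cuda_release)"
else
    echo "nvcc not found in PATH"
fi
```

For the sample output above, the function prints `12.6`.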
## Conclusion

You have successfully set up CUDA on Fedora within a toolbox environment using the Fedora 39 CUDA repository. By manually resolving package conflicts and configuring the environment, you can develop CUDA applications without affecting your host system.

## Troubleshooting

- **Installation Failures:**
  - If you encounter errors during installation, carefully read the error messages. They often indicate conflicting files or missing dependencies.
  - Use the `--excludepath` option with `rpm` to exclude conflicting files during manual installations.

- **Driver Conflicts:**
  - Since the host system may already have NVIDIA drivers installed, conflicts can arise. Using the toolbox environment helps isolate these issues.

- **Environment Variables Not Set:**
  - If `nvcc` is not found after installation, ensure that `/usr/local/cuda/bin` is in your `PATH`.
  - Run `echo $PATH` to check if the path is included.
  - Re-source the profile script or open a new terminal session.

## Additional Notes

- **Updating CUDA in the Future:**
  - Keep an eye on the official NVIDIA repositories for updates to your Fedora version.
  - When an updated repository becomes available, adjust your `dnf` configuration accordingly.

- **Building `llama.cpp`:**
  - With CUDA installed, you can follow these [build instructions for `llama.cpp`](https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md) to compile it with CUDA support.
  - Ensure that any CUDA-specific build flags or paths are correctly set in your build configuration.

- **Using the Toolbox Environment:**
  - The toolbox environment is isolated from your host system, which helps prevent conflicts.
  - Remember that system files and configurations inside the toolbox are separate from the host. By default, the user's home directory is shared between the host and the toolbox.

---

**Disclaimer:** Manually installing and modifying system packages can make the container unstable. The above steps are provided as a guideline and may need adjustments based on your specific system configuration. Always back up important data before making significant system changes, especially as your home folder is writable and shared with the toolbox.

**Acknowledgments:** Special thanks to the Fedora community and NVIDIA documentation for providing resources that assisted in creating this guide.

## References

- [Fedora Toolbox Documentation](https://docs.fedoraproject.org/en-US/fedora-silverblue/toolbox/)
- [NVIDIA CUDA Installation Guide](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html)
- [Podman Documentation](https://podman.io/get-started)

---

examples/server/public/index.html.gz

600 Bytes
Binary file not shown.
