Commit d6330f0

📝 Major doc upgrade
1 parent 4092303 commit d6330f0

9 files changed: +3031 -1831 lines changed

docs/.vitepress/config.mts

Lines changed: 4 additions & 1 deletion
@@ -21,7 +21,6 @@ export default defineConfig({
   ],

   vue: {
-    reactivityTransform: true,
   },

   themeConfig: {
@@ -62,6 +61,10 @@ export default defineConfig({
     text: 'Getting Started',
     link: '/guide/',
   },
+  {
+    text: 'Moving between Machines',
+    link: '/guide/cluster',
+  },
   {
     text: 'Limitations',
     link: '/guide/limit',

docs/.vitepress/theme/styles/vars.css

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@
   --vp-c-brand-dark: #980218;
   --vp-c-brand-darker: #710515;
   --vp-c-brand-dimm: rgba(207, 0, 32, 0.08);
+  --vp-c-brand-1: #cf0020;
 }

 /**

docs/config/index.md

Lines changed: 7 additions & 7 deletions
@@ -8,17 +8,17 @@ layout: doc
 | Name | Version | Installation Method | Location | Uninstall Method |
 | -------------------------------- | --------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| Nvidia driver (root + container) | 525.85.05 | [Run file](https://us.download.nvidia.com/XFree86/Linux-x86_64/525.85.05/NVIDIA-Linux-x86_64-525.85.05.run), with Vulkan support, DKMS for root, no-kernel for container | Devices mount rule: `/etc/udev/rules.d/70-nvidia.rules`, <br />Module loaded on boot: `/etc/modules-load.d/modules.conf` | (not recommended)<br /> `driver.run --uninstall` |
+| Nvidia driver (root + container) | 550.78 | [Run file](https://us.download.nvidia.com/XFree86/Linux-x86_64/550.78/NVIDIA-Linux-x86_64-550.78.run), with Vulkan support, DKMS for root, no-kernel for container | Devices mount rule: `/etc/udev/rules.d/70-nvidia.rules`, <br />Module loaded on boot: `/etc/modules-load.d/modules.conf` | (not recommended)<br /> `driver.run --uninstall` |
 | CUDA | 11.8.0 | [Run file](https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run) | `/usr/local/cuda/``/usr/local/cuda-11.8` | (not recommended)<br />`/usr/local/cuda/bin/cuda-uninstaller` |
 | CUDNN | 8.6.0.163 | Local Debian Package (Require Nvidia account) | DyLib: `/usr/lib/x86_64-linux-gnu/libcudnn*`<br />Header: `/usr/include/cudnn*` | `dpkg -r` |
 | TensorRT | 8.5.3.1 | Uncompress TAR (Require Nvidia account) | Dylib: `/usr/lib/x86_64-linux-gnu/libfakeroot/libnvinfer*`<br />Header: `/usr/include/cuda/NvInfer*` | Manual deletion |
-| Tensorflow | 2.11.0 | Compile from [Github](https://github.com/tensorflow/tensorflow/commit/d5b57ca93e506df258271ea00fc29cf98383a374) d5b57ca, with CUDA and TensorRT support | `/home/ubuntu/miniconda3/lib/python3.10/site-packages/tensorflow/` | `pip uninstall` |
-| Miniconda3 | 23.1.0 | [Shell Script](https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh) | `/home/ubuntu/miniconda3` | `rm -rf` the folder |
-| PyTorch | 1.13.1 | Miniconda | `/home/ubuntu/miniconda3/lib/python3.10/site-packages/torch` | `conda remove` |
-| JupyterLab | 3.5.3 | Miniconda | `/home/ubuntu/miniconda3/lib/python3.10/site-packages/jupyter.py` | `conda remove` |
+| Tensorflow | 2.17.0 | PIP | `/home/ubuntu/miniconda3/lib/python3.10/site-packages/tensorflow/` | `pip uninstall` |
+| Miniconda3 | 24.7.1 | [Shell Script](https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh) | `/home/ubuntu/miniconda3` | `rm -rf` the folder |
+| PyTorch | 2.4.0+cu121 | Miniconda | `/home/ubuntu/miniconda3/lib/python3.10/site-packages/torch` | `conda remove` |
+| JupyterLab | 4.2.5 | Miniconda | `/home/ubuntu/miniconda3/lib/python3.10/site-packages/jupyter.py` | `conda remove` |
 | XRDP | 0.9.17 | APT | Dylib: `/usr/lib/x86_64-linux-gnu/xrdp*`, <br />Header: `/usr/include/xrdp*` | `apt autoremove` |
-| LXD (root) | 5.0.2 | SNAP | `/snap/lxd` | `snap remove --purge` |
-| ZFS (root) | 2.1.4 | APT | ` /usr/sbin/zfs` `/etc/zfs` `/usr/share/zfs` | `apt autoremove` |
+| LXD (root) | 5.0.3 | SNAP | `/snap/lxd` | `snap remove --purge` |
+| ZFS (root) | 2.1.5 | APT | ` /usr/sbin/zfs` `/etc/zfs` `/usr/share/zfs` | `apt autoremove` |

docs/guide/cluster.md

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
+# Moving between Machines
+
+The RoseLab environment provides a utility for managing and migrating your containers across servers. It can copy your full environment between servers in less than 10 minutes, making it easy to balance workloads and access new resources.
+
+## Accessing the Utility
+
+1. On any RoseLab container, navigate to `/public/common-utilities/`.
+2. Run the command `python client.py`. You might need to install Python 3.8+ and the `requests` library (`json` ships with Python's standard library and needs no installation).
+
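As an illustration of the prerequisites above (not part of the committed file), here is a minimal sketch of a pre-flight check to run before `python client.py`. The helper name `missing_prerequisites` is hypothetical, and the check assumes `requests` is the only third-party dependency, which is all the guide states.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version named in the guide


def missing_prerequisites(version=None, modules=("requests",)):
    """Return a list of human-readable problems; an empty list means ready."""
    version = tuple(version or sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(f"Python {version[0]}.{version[1]} is older than 3.8")
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing library: {name} (try: pip install {name})")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

An empty result means the interpreter and library requirements named in the guide are met; it says nothing about network access to the utility itself.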
+## Key Features
+
+The utility offers several options:
+
+1. Add or delete port mapping
+2. Copy container
+3. Start container
+4. Stop container
+5. Remove container
+6. Create new container
+
+## Container Migration Process
+
+To migrate your container to another server:
+
+1. Use option 2 to copy your environment to another server
+2. Use option 3 to start your copied container on the new server
+
+## Creating New Containers
+
+Use option 6 to create a new container from preset images on any available server.
+
+## Benefits of Using This Utility
+
+- Access to multiple servers (roselab1 through roselab4) independently
+- Quick replication of your environment on another server when your primary server is overloaded
+- Easy synchronization of environments across multiple servers
+- Set up external access to JupyterLab, TCP services, or web services without SSH forwarding
+- Restart containers that were accidentally shut down
+- Delete and recreate containers in case of environment corruption
+- Back up containers before risky operations
+- Immediate access to new RoseLab servers as they come online
+
+## Migration Time
+
+Migration speed depends on container size. With few concurrent migrations, the transfer speed exceeds 300 MB/s. A 128 GB container takes approximately 6 minutes to copy.
+
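The figures above can be cross-checked with a short calculation (an illustration, not part of the committed file). The sketch assumes decimal units (1 GB = 1000 MB), and both function names are made up for this example. At exactly 300 MB/s, 128 GB would take about 7.1 minutes, so the quoted 6 minutes implies an average of roughly 356 MB/s, which is consistent with "exceeds 300 MB/s".

```python
def copy_minutes(size_gb, rate_mb_per_s):
    """Estimated copy time in minutes, assuming decimal units (1 GB = 1000 MB)."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_gb * 1000 / rate_mb_per_s / 60


def implied_rate(size_gb, minutes):
    """Average MB/s needed to move size_gb in the given number of minutes."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return size_gb * 1000 / (minutes * 60)


if __name__ == "__main__":
    print(f"{copy_minutes(128, 300):.1f} min at 300 MB/s")
    print(f"{implied_rate(128, 6):.0f} MB/s implied by a 6 min copy")
```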
+## Differences Between Copied and Original Containers
+
+Copied containers are nearly identical to the originals, with only these differences:
+- The hostname changes from roselabX.ucsd.edu to roselabY.ucsd.edu
+- Potential hardware variations (CPU, GPU)
+
+Software configurations remain the same. For example, a website available at roselabX.ucsd.edu:55555 will also be accessible at roselabY.ucsd.edu:55555, provided your port mappings are applied across all servers.
+
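Because only the hostname changes, bookmarks and client configs can be pointed at the copy by swapping the host and keeping the port. The following is a minimal sketch using the standard library (an illustration, not part of the committed file); `retarget` is a hypothetical helper, and the `roselab1`/`roselab2` hostnames stand in for roselabX and roselabY.

```python
from urllib.parse import urlsplit, urlunsplit


def retarget(url, new_host):
    """Return url with its hostname swapped for new_host, keeping the port.

    Any user:password@ prefix in the original URL is dropped.
    """
    parts = urlsplit(url)
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    print(retarget("http://roselab1.ucsd.edu:55555/lab", "roselab2.ucsd.edu"))
```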
+## Port Mapping
+
+The utility allows you to manage TCP and HTTPS port mappings:
+- TCP mapping (e.g., roselabX.ucsd.edu:55555 → container:8888 for JupyterLab) makes your service available via http://roselabX.ucsd.edu:55555
+- HTTPS mapping makes it available via https://roselabX.ucsd.edu:55555
+
+HTTPS mapping adds a security layer and is more browser-friendly, but it only supports HTTP web services, not other TCP services such as FTP or SSH.
+
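The two mapping kinds can be modeled as a small data structure (an illustration, not part of the committed file). This is a sketch of the rules stated above, not the utility's actual data model: `PortMapping`, `suitable_kinds`, and the `"tcp"`/`"https"` labels are names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortMapping:
    host: str            # e.g. "roselabX.ucsd.edu"
    external_port: int   # port exposed on the host
    container_port: int  # port the service listens on inside the container
    kind: str            # "tcp" or "https" (labels assumed for this sketch)

    def url(self):
        """Browser URL for a web service behind this mapping.

        Only meaningful when the service speaks HTTP; a raw TCP service
        such as SSH is reached at host:external_port with its own client.
        """
        scheme = {"tcp": "http", "https": "https"}[self.kind]
        return f"{scheme}://{self.host}:{self.external_port}"


def suitable_kinds(service_speaks_http):
    """HTTPS mapping only works for HTTP services; TCP mapping works for any."""
    return ["tcp", "https"] if service_speaks_http else ["tcp"]


if __name__ == "__main__":
    jupyter = PortMapping("roselabX.ucsd.edu", 55555, 8888, "tcp")
    print(jupyter.url())
    print(suitable_kinds(service_speaks_http=False))
```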
+## Limitations
+
+- Master's students may have restricted cross-server copying permissions. Contact Rose for expanded access.
+- You can have only one container per host. To create or copy a new container, you must first remove your old one.
+- The script requires additional confirmation before container removal. Exercise extreme caution to prevent data loss.
+- It's recommended to access this utility from your "main" container, as you can't delete an active container.
+
+## Automation
+
+You can create scripts for frequent or scheduled synchronization. The current inter-server bandwidth is 25 Gbps, with plans to upgrade to 100 Gbps. The 300 MB/s data transfer rate is well within these limits.
+
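To make "well within these limits" concrete, the sketch below (an illustration, not part of the committed file) converts the link speed to MB/s and computes the share a single 300 MB/s migration occupies. It assumes decimal units and ignores protocol overhead; the function names are hypothetical. A 25 Gbps link carries 3125 MB/s, so one migration uses about 9.6% of it.

```python
def gbps_to_mb_per_s(gbps):
    """Convert a link speed in gigabits/s to megabytes/s (decimal units)."""
    return gbps * 1000 / 8


def link_share(rate_mb_per_s, link_gbps):
    """Fraction of the link consumed by one transfer at rate_mb_per_s."""
    return rate_mb_per_s / gbps_to_mb_per_s(link_gbps)


if __name__ == "__main__":
    for link in (25, 100):
        share = link_share(300, link)
        print(f"{link} Gbps link: one 300 MB/s migration uses {share:.1%}")
```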
+## Security Measures
+
+The utility implements strict security measures:
+- It only accepts connections from the RoseLAN intranet.
+- Users can only manage containers under their own name.
+- You cannot modify other users' port mappings.
+- The `/public/common-utilities/` folder is only visible to the ubuntu user, not even to root.
+- Non-ubuntu container users cannot shut down or delete containers using this utility unless they have sudo or the API key.
+
+::: warning
+If you accidentally share the API key from the client, notify the admin immediately so it can be updated.
+:::
+
+By utilizing this tool, you can efficiently manage your containers across the RoseLab servers, optimizing your workflow and making the most of the available resources.
