Commit eb1041c

Merge pull request #67 from archlitchi/update_documents
update documents about dynamic-mig feature
2 parents 739f344 + 3cb77d2 commit eb1041c

4 files changed: 20 additions, 29 deletions

README.md

Lines changed: 16 additions & 24 deletions
@@ -77,9 +77,9 @@ We will be editing the docker daemon config file which is usually present at `/e
 > *if `runtimes` is not already present, head to the install page of [nvidia-docker](https://github.com/NVIDIA/nvidia-docker)*


-### Configure scheduler
+### Configuration

-update the scheduler configuration:
+You need to enable vgpu in volcano-scheduler configMap:

 ```shell script
 kubectl edit cm -n volcano-system volcano-scheduler-configmap
@@ -111,7 +111,17 @@ data:
 - name: binpack
 ```

-Customize your installation by adjusting the [configs](doc/config.md)
+### Sharing Mode
+
+Volcano-vgpu supports two types of device-sharing: `HAMi-core` and `dynamia-mig`, A node can either using `HAMi-core`, or `Dynamic-mig`. Heterogeneous is supported(a part of node using HAMi-core, the other using Dynamic-mig)
+
+A brief introduction about these two modes:
+
+HAMi-core is a user-layer resource isolator provided by HAMi community, works on all types of GPU.
+
+Dynamic-mig is a hardware resource isolator, works on Ampere arch or later GPU.
+
+You can set the sharing mode and customize your installation by adjusting the [configs](doc/config.md)


 ### Enabling GPU Support in Kubernetes
@@ -130,28 +140,7 @@ Check the node status, it is ok if `volcano.sh/vgpu-number` is included in the a
 ```shell script
 $ kubectl get node {node name} -oyaml
 ...
-status:
-addresses:
-- address: 172.17.0.3
-type: InternalIP
-- address: volcano-control-plane
-type: Hostname
-allocatable:
-cpu: "4"
-ephemeral-storage: 123722704Ki
-hugepages-1Gi: "0"
-hugepages-2Mi: "0"
-memory: 8174332Ki
-pods: "110"
-volcano.sh/vgpu-memory: "89424"
-volcano.sh/vgpu-number: "10" # vGPU resource
 capacity:
-cpu: "4"
-ephemeral-storage: 123722704Ki
-hugepages-1Gi: "0"
-hugepages-2Mi: "0"
-memory: 8174332Ki
-pods: "110"
 volcano.sh/vgpu-memory: "89424"
 volcano.sh/vgpu-number: "10" # vGPU resource
 ```
@@ -166,6 +155,8 @@ apiVersion: v1
 kind: Pod
 metadata:
 name: gpu-pod1
+annotations:
+volcano.sh/vgpu-mode: "hami-core" # (Optional, 'hami-core' or 'mig')
 spec:
 schedulerName: volcano
 containers:
@@ -188,6 +179,7 @@ You can validate device memory using nvidia-smi inside container:
 > **WARNING:** *if you don't request GPUs when using the device plugin with NVIDIA images all
 > the GPUs on the machine will be exposed inside your container.
 > The number of vgpu used by a container can not exceed the number of gpus on that node.*
+> You can specify the mode of this task by assigning `volcano.sh/vgpu-mode` annotations, If not, both modes are possible.

 ### Monitor

examples/vgpu-case01.yml

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ spec:
 restartPolicy: OnFailure
 schedulerName: volcano
 containers:
-- image: ubuntu:20.04
+- image: ubuntu:24.04
 name: pod1-ctr
 command: ["sleep"]
 args: ["100000"]

examples/vgpu-case02.yml

Lines changed: 1 addition & 2 deletions
@@ -6,12 +6,11 @@ spec:
 restartPolicy: OnFailure
 schedulerName: volcano
 containers:
-- image: nvidia/cuda:11.2.2-base-ubi8
+- image: ubuntu:24.04
 name: pod1-ctr
 command: ["sleep"]
 args: ["100000"]
 resources:
 limits:
 volcano.sh/vgpu-number: 1 #request 1 GPU
-volcano.sh/vgpu-cores: 50 #each GPU request 50% of compute core resources
 volcano.sh/vgpu-memory: 10240 #each GPU request 10G device memory

examples/vgpu-deployment.yaml

Lines changed: 2 additions & 2 deletions
@@ -15,9 +15,9 @@ spec:
 schedulerName: volcano
 containers:
 - name: resnet101-container
-image: ubuntu:18.04
+image: ubuntu:24.04
 command: ["sleep","infinity"]
 resources:
 limits:
 volcano.sh/vgpu-number: 1 # requesting 2 vGPUs
-volcano.sh/vgpu-memory: 16384
+volcano.sh/vgpu-memory: 16384
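
For readers following the README.md change, here is a minimal sketch of a pod that pins the sharing mode through the `volcano.sh/vgpu-mode` annotation added in this commit. The annotation key, its `mig` value, and the resource names come from the diff; the pod and container names are hypothetical, and the sketch assumes the target node has been configured for Dynamic-mig.

```yaml
# Illustrative only: pod and container names are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-mig                # hypothetical name
  annotations:
    # Per the README diff, the value is 'hami-core' or 'mig'.
    volcano.sh/vgpu-mode: "mig"
spec:
  schedulerName: volcano
  containers:
    - name: pod-mig-ctr            # hypothetical name
      image: ubuntu:24.04
      command: ["sleep"]
      args: ["100000"]
      resources:
        limits:
          volcano.sh/vgpu-number: 1      # request 1 vGPU
          volcano.sh/vgpu-memory: 10240  # device memory per vGPU, as in examples/vgpu-case02.yml
```

According to the added README text, leaving the annotation out means both modes are possible for the task.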
