In my use case, I often enable and disable NOS on individual nodes by adding or removing the label nos.nebuly.com/gpu-partitioning=mps. After the node is labeled, NOS changes the GPU compute mode to exclusive. However, after the label is removed, the GPU remains in exclusive mode.
Expected behavior: NOS should revert the GPU mode to whatever it was when it started or to default.
Workaround: After removing the label, change each GPU back to default mode (or whatever mode you want). This has to be done for every GPU on the node. For example, to change GPU 0 back to default mode, run:
nvidia-smi -i 0 -c 0
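Since the workaround has to be repeated for every GPU, it can be wrapped in a small loop. The following is a minimal sketch, not something NOS provides: `reset_compute_mode` is a hypothetical helper name, and it assumes `nvidia-smi` is on the PATH of the node (or of a privileged container running on it) and that the caller has the privileges needed to change the compute mode, which is typically root.

```shell
#!/bin/sh
# Sketch: reset every GPU on this node to a given compute mode.
# reset_compute_mode is a hypothetical helper name, not part of NOS or nvidia-smi.
# Usage: reset_compute_mode [MODE]   (MODE defaults to DEFAULT, equivalent to 0)
reset_compute_mode() {
    mode="${1:-DEFAULT}"
    # List the GPU indices, one per line, without the CSV header.
    nvidia-smi --query-gpu=index --format=csv,noheader | while read -r idx; do
        # Same command as the single-GPU workaround, applied to each index.
        # stdin is redirected so the command cannot consume the index list.
        nvidia-smi -i "$idx" -c "$mode" </dev/null \
            || echo "failed to set compute mode on GPU $idx" >&2
    done
}
```

Calling `reset_compute_mode` with no arguments after removing the label sets all GPUs back to default mode. Passing another mode name accepted by `nvidia-smi -c` sets that mode instead.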
Damowerko changed the title from "MPS server leaves GPUs on node in exclusive mode" to "NOS MPS leaves GPUs on node in exclusive mode" on Mar 28, 2023
@Baenimyr Good that the device plugin supports MPS now. The problem is that it does not scale dynamically. Of course, NOS could use the NVIDIA plugin now. However, with the NVIDIA DRA driver on the horizon, it does not make sense for me personally to use NOS.