Commit eaf5cb7

Alexandru Ormenisan authored and committed
[HWORKS-1627] Kubernetes Priority Classes & Labels
1 parent 0581f14 commit eaf5cb7

6 files changed

+105
-0
lines changed

@@ -0,0 +1,104 @@
1+
# Scheduler
2+
3+
This page explains how we provide the user with access to Kubernetes properties when running Hopsworks computation on top of Kubernetes. Currently these capabilities include:
4+
5+
- [Affinity](https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/)
6+
- [Priority Classes](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass)
7+
8+
These capabilities require some configuration by a Hopsworks Admin, as described in the [Cluster configuration](#cluster-configuration), [Default Project configuration](#default-project-configuration) and [Custom Project configuration](#custom-project-configuration) sections. Hopsworks data owners can further configure defaults, as described in the [Project defaults](#project-defaults) section. Finally, all members of a project can use these capabilities within:
9+
10+
- Jobs
11+
- Jupyter Notebooks
12+
- Model Deployments
13+
14+
## Node Labels, Node Affinity and Node Anti-Affinity
15+
16+
Labels in Kubernetes are key-value pairs used to organize and select resources. This page shows how labels applied to nodes can be used for pod-to-node affinity, which determines where a pod can (or cannot) run.
17+
18+
Some basic use cases for labels and affinity:
19+
20+
- Hardware constraints (GPU, SSD)
21+
- Environment separation (prod/dev)
22+
- Co-locating related pods
23+
- Spreading pods for high availability
24+
25+
In Hopsworks we make use of the node affinity `In` operator for the Hopsworks Node Affinity and the `NotIn` operator for the Hopsworks Node Anti Affinity.
26+
27+
For more information on Kubernetes Affinity, you can check the Kubernetes [Affinity documentation](https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/) page.
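
As a rough illustration of the two operators, the fragment below sketches a pod spec that requires a node labelled `hardware=gpu` and excludes nodes labelled `environment=dev`. The label keys and values, the pod name, and the image are hypothetical, and the use of `requiredDuringSchedulingIgnoredDuringExecution` is an assumption; this page does not state the exact pod spec that Hopsworks generates.

```yaml
# Sketch only: label keys/values, pod name, and image are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: example-job
spec:
  affinity:
    nodeAffinity:
      # Assumption: a hard requirement rather than a soft preference.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              # Node Affinity: the node must carry hardware=gpu.
              - key: hardware
                operator: In
                values: ["gpu"]
              # Node Anti Affinity: the node must not carry environment=dev.
              - key: environment
                operator: NotIn
                values: ["dev"]
  containers:
    - name: main
      image: example/image:latest
```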
28+
29+
## Priority Classes
30+
31+
PriorityClasses in Kubernetes determine the scheduling and eviction priority of pods.
32+
33+
Pods with higher priority:
34+
35+
- Get scheduled first
36+
- Can preempt (evict) lower priority pods
37+
- Are less likely to be evicted under resource pressure
38+
39+
Common uses:
40+
41+
- Protecting critical workloads
42+
- Ensuring core services stay running
43+
- Managing resource competition
44+
- Guaranteeing QoS for important applications
45+
46+
For more information on Priority Classes, you can check the Kubernetes [Priority Classes documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass) page.
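
For orientation, a Priority Class is a cluster-wide Kubernetes object that a pod references by name. The sketch below shows what such an object and a pod using it might look like; the class name, value, pod name, and image are hypothetical and are not taken from a Hopsworks installation.

```yaml
# Sketch only: names, value, and image are hypothetical.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: example-high-priority
value: 100000                 # higher value means higher priority
globalDefault: false          # do not apply to pods that omit a priority class
preemptionPolicy: PreemptLowerPriority
description: "Example class for important workloads."
---
apiVersion: v1
kind: Pod
metadata:
  name: example-deployment-pod
spec:
  # The pod refers to the Priority Class by name.
  priorityClassName: example-high-priority
  containers:
    - name: main
      image: example/image:latest
```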
47+
48+
## Cluster Configuration
49+
50+
The first step in configuring Affinity and Priority Classes is performed by a Hopsworks Admin on the page found under `Cluster Settings -> Scheduler`, shown in the image below.
51+
52+
![Cluster Configuration - Node Labels and Priority Classes](../../../assets/images/guides/project/scheduler/admin_cluster_scheduler.png)
53+
54+
Configuring the use of labels and priority classes involves filtering at several levels: `kubernetes -> hopsworks cluster -> hopsworks project`.
55+
56+
The Hopsworks Cluster typically runs inside a Kubernetes Cluster and is not always its only inhabitant. So the first configuration level limits the subset of node labels and priority classes that can be used within the Hopsworks Cluster. This can be done from the `Available in Hopsworks` sub-section.
57+
58+
To list all the Kubernetes Node Labels, Hopsworks requires a cluster role with this rule:
59+
60+
```
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list"]
```
65+
66+
To list all the Kubernetes Cluster Priority Classes, Hopsworks requires a cluster role with this rule:
67+
68+
```
- apiGroups: ["scheduling.k8s.io"]
  resources: ["priorityclasses"]
  verbs: ["get", "list"]
```
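
If these permissions ever need to be created by hand, the two rules could be combined into one ClusterRole and bound to the service account that Hopsworks runs as. The following is a minimal sketch: the role name, binding name, service account name, and namespace are all hypothetical and should be replaced with the values used by the actual installation.

```yaml
# Sketch only: all names and the namespace below are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-hopsworks-scheduler-reader
rules:
  # Read node objects in order to list their labels.
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list"]
  # Read the cluster's Priority Classes.
  - apiGroups: ["scheduling.k8s.io"]
    resources: ["priorityclasses"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: example-hopsworks-scheduler-reader-binding
subjects:
  - kind: ServiceAccount
    name: example-hopsworks-sa
    namespace: example-hopsworks
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: example-hopsworks-scheduler-reader
```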
73+
74+
If the roles above are configured properly (they are configured by default with the Hopsworks installation), the Admin only selects values from the drop-down menu. If the roles are missing, the Admin has to enter the values as free text and should be careful about typos. Any typo here propagates to the other configuration and usage levels, leading to errors or misbehaviour when running computation.
75+
76+
## Default Project Configuration
77+
78+
The second level of configuration we can do is project configuration. At this level, the Hopsworks Admin restricts the Node Labels and Priority Classes that can be used within a project. This will be a subset of the ones configured for Hopsworks.
79+
In the figure above, in the sub-section `Available in Project` the Hopsworks Admin can configure the Node Labels and Priority Classes available by default in any Hopsworks Project.
80+
81+
## Custom Project Configuration
82+
83+
The project-level configuration can be customised further: the Hopsworks Admin can configure a per-project selection of Node Labels and Priority Classes under the menu option `Cluster Settings -> Project -> <ProjectName> -> edit configuration`.
84+
85+
![Custom Project Configuration - Node Labels and Priority Classes](../../../assets/images/guides/project/scheduler/admin_project_scheduler.png)
86+
87+
## Project defaults
88+
89+
Every member of a project with the role `Data Owner` can then set the default values for the project. These defaults will be set in the Advanced configuration of Jobs, Notebooks, and Deployments, but they can be modified there if required.
90+
The default Label will be used for the default Node Affinity for Jobs, Notebooks, and Deployments.
91+
92+
![ Project Default - Labels and Priority Classes](../../../assets/images/guides/project/scheduler/project_default.png)
93+
94+
## Configuration of Jobs, Notebooks, and Deployments
95+
96+
In the Advanced configuration of Jobs, Notebooks, and Deployments, we can set Affinity, Anti Affinity, and Priority Class. The Affinity and Anti Affinity can be selected from the list of allowed labels.
97+
98+
`Affinity` configures on which nodes this pod can run. If a node has any of the labels present in the Affinity option, the pod can be scheduled to run there.
99+
100+
`Anti Affinity` configures on which nodes this pod will not run. If a node has any of the labels present in the Anti Affinity option, the pod will not be scheduled to run there.
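
One way the "any of the labels" behaviour could map onto Kubernetes node affinity is sketched below. In Kubernetes, the entries of `nodeSelectorTerms` are ORed, while the `matchExpressions` inside a single term are ANDed. So two Affinity labels become two terms, and the Anti Affinity label is repeated in each term so that it always applies. This mapping is an assumption made for illustration, not a description of the pod spec Hopsworks actually produces, and all label keys and values are hypothetical.

```yaml
# Sketch only: hypothetical labels; the mapping itself is an assumption.
# Affinity:      hardware=gpu OR storage=ssd
# Anti Affinity: environment=dev
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        # Term 1: node has hardware=gpu and is not a dev node.
        - matchExpressions:
            - key: hardware
              operator: In
              values: ["gpu"]
            - key: environment
              operator: NotIn
              values: ["dev"]
        # Term 2: node has storage=ssd and is not a dev node.
        - matchExpressions:
            - key: storage
              operator: In
              values: ["ssd"]
            - key: environment
              operator: NotIn
              values: ["dev"]
```

A node satisfying either term is eligible, and a node labelled `environment=dev` satisfies neither.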
101+
102+
`Priority Class` specifies with which priority a pod will run.
103+
104+
![ Job Configuration - Affinity and Priority Classes](../../../assets/images/guides/project/scheduler/job_configuration.png)

mkdocs.yml

+1
@@ -154,6 +154,7 @@ nav:
154154
- Run Python Job: user_guides/projects/jobs/python_job.md
155155
- Run Jupyter Notebook Job: user_guides/projects/jobs/notebook_job.md
156156
- Scheduling: user_guides/projects/jobs/schedule_job.md
157+
- Kubernetes Scheduling: user_guides/projects/scheduling/kube_scheduler.md
157158
- Airflow: user_guides/projects/airflow/airflow.md
158159
- OpenSearch:
159160
- Connect: user_guides/projects/opensearch/connect.md
