
Commit e825fa9

[CLOUD-576] Drop support for GPU worker nodes in the cloud (#281)
1 parent 4d21199 commit e825fa9

24 files changed (+78, -674 lines)

CHANGELOG.md (+1)

@@ -5,6 +5,7 @@ NOTES:
 BREAKING CHANGES:
 
 ENHANCEMENTS:
+* Drop support for GPU workers on Spark
 * resource/hopsworksai_cluster: Set Default `version` to 3.4.0
 
 FEATURES:

docs/data-sources/cluster.md (-24)

@@ -63,32 +63,8 @@ data "hopsworksai_clusters" "cluster" {
 
 Read-Only:
 
-- `gpu_workers` (List of Object) (see [below for nested schema](#nestedobjatt--autoscale--gpu_workers))
 - `non_gpu_workers` (List of Object) (see [below for nested schema](#nestedobjatt--autoscale--non_gpu_workers))
 
-<a id="nestedobjatt--autoscale--gpu_workers"></a>
-### Nested Schema for `autoscale.gpu_workers`
-
-Read-Only:
-
-- `disk_size` (Number)
-- `downscale_wait_time` (Number)
-- `instance_type` (String)
-- `max_workers` (Number)
-- `min_workers` (Number)
-- `spot_config` (List of Object) (see [below for nested schema](#nestedobjatt--autoscale--gpu_workers--spot_config))
-- `standby_workers` (Number)
-
-<a id="nestedobjatt--autoscale--gpu_workers--spot_config"></a>
-### Nested Schema for `autoscale.gpu_workers.spot_config`
-
-Read-Only:
-
-- `fall_back_on_demand` (Boolean)
-- `max_price_percent` (Number)
-
-
-
 <a id="nestedobjatt--autoscale--non_gpu_workers"></a>
 ### Nested Schema for `autoscale.non_gpu_workers`
 
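
For consumers of this data source, only the `non_gpu_workers` pool remains under `autoscale`. A minimal sketch of reading it back after the change, assuming the data source takes a `cluster_id` argument (that argument is not shown in this diff):

```hcl
data "hopsworksai_cluster" "this" {
  cluster_id = var.cluster_id # assumed argument, not part of this diff
}

output "non_gpu_workers" {
  # `autoscale` is a List of Object, hence the [0] index; try() guards
  # against clusters that have no autoscale configured.
  value = try(data.hopsworksai_cluster.this.autoscale[0].non_gpu_workers, [])
}
```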

docs/data-sources/clusters.md (-24)

@@ -87,32 +87,8 @@ Read-Only:
 
 Read-Only:
 
-- `gpu_workers` (List of Object) (see [below for nested schema](#nestedobjatt--clusters--autoscale--gpu_workers))
 - `non_gpu_workers` (List of Object) (see [below for nested schema](#nestedobjatt--clusters--autoscale--non_gpu_workers))
 
-<a id="nestedobjatt--clusters--autoscale--gpu_workers"></a>
-### Nested Schema for `clusters.autoscale.gpu_workers`
-
-Read-Only:
-
-- `disk_size` (Number)
-- `downscale_wait_time` (Number)
-- `instance_type` (String)
-- `max_workers` (Number)
-- `min_workers` (Number)
-- `spot_config` (List of Object) (see [below for nested schema](#nestedobjatt--clusters--autoscale--gpu_workers--spot_config))
-- `standby_workers` (Number)
-
-<a id="nestedobjatt--clusters--autoscale--gpu_workers--spot_config"></a>
-### Nested Schema for `clusters.autoscale.gpu_workers.standby_workers`
-
-Read-Only:
-
-- `fall_back_on_demand` (Boolean)
-- `max_price_percent` (Number)
-
-
-
 <a id="nestedobjatt--clusters--autoscale--non_gpu_workers"></a>
 ### Nested Schema for `clusters.autoscale.non_gpu_workers`
 

docs/data-sources/instance_type.md (-1)

@@ -42,7 +42,6 @@ data "hopsworksai_instance_type" "supported_type" {
 ### Optional
 
 - `min_cpus` (Number) Filter based on the minimum number of CPU cores. Defaults to `0`.
-- `min_gpus` (Number) Filter based on the minimum number of GPUs. Defaults to `0`.
 - `min_memory_gb` (Number) Filter based on the minimum memory in gigabytes. Defaults to `0`.
 - `with_nvme` (Boolean) Filter based on the presence of NVMe drives. Defaults to `false`.
 
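
With `min_gpus` gone, instance selection falls back to the remaining filters. A sketch combining all three, with illustrative values; the `cloud_provider` and `node_type` usage follows the examples elsewhere in this commit:

```hcl
data "hopsworksai_instance_type" "cpu_worker" {
  cloud_provider = "AWS"    # or "AZURE"
  node_type      = "worker"
  min_cpus       = 8        # minimum number of CPU cores
  min_memory_gb  = 32       # minimum memory in gigabytes
  with_nvme      = false    # filter on the presence of NVMe drives
}
```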

docs/data-sources/instance_types.md (-1)

@@ -41,7 +41,6 @@ data "hopsworksai_instance_types" "supported_worker_types" {
 Read-Only:
 
 - `cpus` (Number)
-- `gpus` (Number)
 - `id` (String)
 - `memory` (Number)
 - `with_nvme` (Boolean)
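
Any GPU-based selection on top of this list has to go; a CPU-based projection still works. A hypothetical sketch, assuming the list of types is exposed as a `supported_types` attribute (the attribute name is not shown in this diff):

```hcl
data "hopsworksai_instance_types" "workers" {
  cloud_provider = "AWS"
  node_type      = "worker"
}

output "big_worker_ids" {
  # Post-filter on the remaining `cpus` field; `gpus` is no longer available.
  value = [for t in data.hopsworksai_instance_types.workers.supported_types : t.id if t.cpus >= 16]
}
```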

docs/resources/cluster.md (-30)

@@ -294,10 +294,6 @@ Required:
 
 - `non_gpu_workers` (Block List, Min: 1, Max: 1) Setup auto scaling for non gpu nodes. (see [below for nested schema](#nestedblock--autoscale--non_gpu_workers))
 
-Optional:
-
-- `gpu_workers` (Block List, Max: 1) Setup auto scaling for gpu nodes. (see [below for nested schema](#nestedblock--autoscale--gpu_workers))
-
 <a id="nestedblock--autoscale--non_gpu_workers"></a>
 ### Nested Schema for `autoscale.non_gpu_workers`
 
@@ -324,32 +320,6 @@ Optional:
 
 
 
-<a id="nestedblock--autoscale--gpu_workers"></a>
-### Nested Schema for `autoscale.gpu_workers`
-
-Required:
-
-- `instance_type` (String) The instance type to use while auto scaling.
-
-Optional:
-
-- `disk_size` (Number) The disk size to use while auto scaling Defaults to `512`.
-- `downscale_wait_time` (Number) The time to wait before removing unused resources. Defaults to `300`.
-- `max_workers` (Number) The maximum number of workers created by auto scaling. Defaults to `10`.
-- `min_workers` (Number) The minimum number of workers created by auto scaling. Defaults to `0`.
-- `spot_config` (Block List, Max: 1) The configuration to use spot instances (see [below for nested schema](#nestedblock--autoscale--gpu_workers--spot_config))
-- `standby_workers` (Number) The percentage of workers to be always available during auto scaling. If you set this value to 0 new workers will only be added when a job or a notebook requests the resources. This attribute will not be taken into account if you set the minimum number of workers to 0 and no resources are used in the cluster, instead, it will start to take effect as soon as you start using resources. Defaults to `0.5`.
-
-<a id="nestedblock--autoscale--gpu_workers--spot_config"></a>
-### Nested Schema for `autoscale.gpu_workers.spot_config`
-
-Optional:
-
-- `fall_back_on_demand` (Boolean) Fall back to on demand instance if unable to allocate a spot instance Defaults to `true`.
-- `max_price_percent` (Number) The maximum spot instance price in percentage of the on-demand price. Defaults to `100`.
-
-
-
 
 <a id="nestedblock--aws_attributes"></a>
 ### Nested Schema for `aws_attributes`
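
For orientation, a sketch of an `autoscale` block that remains valid after this commit. It assumes `non_gpu_workers` carries the same attributes and defaults as the removed `gpu_workers` schema above; the instance type is an illustrative AWS value:

```hcl
resource "hopsworksai_cluster" "cluster" {
  # other required arguments omitted

  autoscale {
    non_gpu_workers {
      instance_type       = "m5.2xlarge" # required; illustrative value
      disk_size           = 512          # assumed default
      min_workers         = 0            # assumed default
      max_workers         = 10           # assumed default
      standby_workers     = 0.5          # assumed default
      downscale_wait_time = 300          # assumed default

      spot_config {
        fall_back_on_demand = true # assumed default
        max_price_percent   = 100  # assumed default
      }
    }
  }
}
```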

docs/resources/cluster_from_backup.md (-30)

@@ -70,10 +70,6 @@ Required:
 
 - `non_gpu_workers` (Block List, Min: 1, Max: 1) Setup auto scaling for non gpu nodes. (see [below for nested schema](#nestedblock--autoscale--non_gpu_workers))
 
-Optional:
-
-- `gpu_workers` (Block List, Max: 1) Setup auto scaling for gpu nodes. (see [below for nested schema](#nestedblock--autoscale--gpu_workers))
-
 <a id="nestedblock--autoscale--non_gpu_workers"></a>
 ### Nested Schema for `autoscale.non_gpu_workers`
 
@@ -100,32 +96,6 @@ Optional:
 
 
 
-<a id="nestedblock--autoscale--gpu_workers"></a>
-### Nested Schema for `autoscale.gpu_workers`
-
-Required:
-
-- `instance_type` (String) The instance type to use while auto scaling.
-
-Optional:
-
-- `disk_size` (Number) The disk size to use while auto scaling Defaults to `512`.
-- `downscale_wait_time` (Number) The time to wait before removing unused resources. Defaults to `300`.
-- `max_workers` (Number) The maximum number of workers created by auto scaling. Defaults to `10`.
-- `min_workers` (Number) The minimum number of workers created by auto scaling. Defaults to `0`.
-- `spot_config` (Block List, Max: 1) The configuration to use spot instances (see [below for nested schema](#nestedblock--autoscale--gpu_workers--spot_config))
-- `standby_workers` (Number) The percentage of workers to be always available during auto scaling. If you set this value to 0 new workers will only be added when a job or a notebook requests the resources. This attribute will not be taken into account if you set the minimum number of workers to 0 and no resources are used in the cluster, instead, it will start to take effect as soon as you start using resources. Defaults to `0.5`.
-
-<a id="nestedblock--autoscale--gpu_workers--spot_config"></a>
-### Nested Schema for `autoscale.gpu_workers.spot_config`
-
-Optional:
-
-- `fall_back_on_demand` (Boolean) Fall back to on demand instance if unable to allocate a spot instance Defaults to `true`.
-- `max_price_percent` (Number) The maximum spot instance price in percentage of the on-demand price. Defaults to `100`.
-
-
-
 
 <a id="nestedblock--aws_attributes"></a>
 ### Nested Schema for `aws_attributes`

examples/complete/aws/autoscale/README.md (+3, -12)

@@ -44,15 +44,15 @@ terraform apply
 
 ## Update Autoscale
 
-You can update the autoscale configuration after creations, by changing the `autoscale` configuration block. For example, you can configure autoscale for GPU workers as follows:
+You can update the autoscale configuration after creations, by changing the `autoscale` configuration block. For example, you can configure autoscale as follows:
 
 > **Notice** that you need to run `terraform apply` after updating your configuration for your changes to take place.
 
 ```hcl
-data "hopsworksai_instance_type" "gpu_worker" {
+data "hopsworksai_instance_type" "small_worker" {
   cloud_provider = "AWS"
   node_type      = "worker"
-  min_gpus       = 1
+  min_cpus       = 8
 }
 
 resource "hopsworksai_cluster" "cluster" {
@@ -67,15 +67,6 @@ resource "hopsworksai_cluster" "cluster" {
       standby_workers     = 0.5
       downscale_wait_time = 300
     }
-
-    gpu_workers {
-      instance_type       = data.hopsworksai_instance_type.gpu_worker.id
-      disk_size           = 256
-      min_workers         = 0
-      max_workers         = 5
-      standby_workers     = 0.5
-      downscale_wait_time = 300
-    }
   }
 
 }

examples/complete/aws/basic/README.md (+4, -4)

@@ -82,13 +82,13 @@ resource "hopsworksai_cluster" "cluster" {
 }
 ```
 
-You can add a new different worker type for example another worker with at least one gpu as follows:
+You can add a new different worker type for example another worker with at least 16 cpu cores as follows:
 
 ```hcl
-data "hopsworksai_instance_type" "gpu_worker" {
+data "hopsworksai_instance_type" "my_worker" {
   cloud_provider = "AWS"
   node_type      = "worker"
-  min_gpus       = 1
+  min_cpus       = 16
 }
 
 resource "hopsworksai_cluster" "cluster" {
@@ -101,7 +101,7 @@ resource "hopsworksai_cluster" "cluster" {
   }
 
   workers {
-    instance_type = data.hopsworksai_instance_type.gpu_worker.id
+    instance_type = data.hopsworksai_instance_type.my_worker.id
     disk_size     = 512
     count         = 1
   }

examples/complete/azure/autoscale/README.md (+4, -13)

@@ -44,38 +44,29 @@ terraform apply -var="resource_group=<YOUR_RESOURCE_GROUP>"
 
 ## Update Autoscale
 
-You can update the autoscale configuration after creations, by changing the `autoscale` configuration block. For example, you can configure autoscale for GPU workers as follows:
+You can update the autoscale configuration after creations, by changing the `autoscale` configuration block. For example, you can configure your own worker as follows:
 
 > **Notice** that you need to run `terraform apply` after updating your configuration for your changes to take place.
 
 ```hcl
-data "hopsworksai_instance_type" "gpu_worker" {
+data "hopsworksai_instance_type" "my_worker" {
   cloud_provider = "AZURE"
   node_type      = "worker"
-  min_gpus       = 1
+  min_cpus       = 16
 }
 
 resource "hopsworksai_cluster" "cluster" {
   # all the other configurations are omitted for clarity
 
   autoscale {
     non_gpu_workers {
-      instance_type       = data.hopsworksai_instance_type.small_worker.id
+      instance_type       = data.hopsworksai_instance_type.my_worker.id
       disk_size           = 256
       min_workers         = 0
       max_workers         = 10
       standby_workers     = 0.5
      downscale_wait_time = 300
     }
-
-    gpu_workers {
-      instance_type       = data.hopsworksai_instance_type.gpu_worker.id
-      disk_size           = 256
-      min_workers         = 0
-      max_workers         = 5
-      standby_workers     = 0.5
-      downscale_wait_time = 300
-    }
   }
 
 }

examples/complete/azure/basic/README.md (+4, -4)

@@ -82,13 +82,13 @@ resource "hopsworksai_cluster" "cluster" {
 }
 ```
 
-You can add a new different worker type for example another worker with at least one gpu as follows:
+You can add a new different worker type for example another worker with at least 16 cpu cores follows:
 
 ```hcl
-data "hopsworksai_instance_type" "gpu_worker" {
+data "hopsworksai_instance_type" "my_worker" {
   cloud_provider = "AZURE"
   node_type      = "worker"
-  min_gpus       = 1
+  min_cpus       = 16
 }
 
 resource "hopsworksai_cluster" "cluster" {
@@ -101,7 +101,7 @@ resource "hopsworksai_cluster" "cluster" {
   }
 
   workers {
-    instance_type = data.hopsworksai_instance_type.gpu_worker.id
+    instance_type = data.hopsworksai_instance_type.my_worker.id
     disk_size     = 512
     count         = 1
   }

hopsworksai/data_source_instance_type.go (-11)

@@ -46,13 +46,6 @@ func dataSourceInstanceType() *schema.Resource {
                Default:      0,
                ValidateFunc: validation.IntAtLeast(0),
            },
-           "min_gpus": {
-               Description:  "Filter based on the minimum number of GPUs.",
-               Type:         schema.TypeInt,
-               Optional:     true,
-               Default:      0,
-               ValidateFunc: validation.IntAtLeast(0),
-           },
            "with_nvme": {
                Description: "Filter based on the presence of NVMe drives.",
                Type:        schema.TypeBool,
@@ -85,7 +78,6 @@ func dataSourceInstanceTypeRead(ctx context.Context, d *schema.ResourceData, met
 
    minMemory := d.Get("min_memory_gb").(float64)
    minCPUs := d.Get("min_cpus").(int)
-   minGPUs := d.Get("min_gpus").(int)
    withNVMe := d.Get("with_nvme").(bool)
 
    var chosenType *api.SupportedInstanceType = nil
@@ -96,9 +88,6 @@ func dataSourceInstanceTypeRead(ctx context.Context, d *schema.ResourceData, met
        if minCPUs > 0 && v.CPUs < minCPUs {
            continue
        }
-       if minGPUs > 0 && v.GPUs < minGPUs {
-           continue
-       }
        if withNVMe != v.WithNVMe {
            continue
        }
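
The user-visible effect of removing `min_gpus` from the schema: Terraform now rejects it as an unsupported argument during validation, so existing configurations must drop the filter. A migration sketch, with an illustrative replacement value:

```hcl
# Before this commit; now fails with an unsupported-argument error:
#
#   data "hopsworksai_instance_type" "worker" {
#     cloud_provider = "AWS"
#     node_type      = "worker"
#     min_gpus       = 1
#   }

# After: select on the filters that remain.
data "hopsworksai_instance_type" "worker" {
  cloud_provider = "AWS"
  node_type      = "worker"
  min_cpus       = 8 # illustrative replacement for the GPU filter
}
```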
