
feat: Added FailureDomains as Nutanix cluster mutation #684


Closed
77 changes: 77 additions & 0 deletions api/v1alpha1/crds/caren.nutanix.com_nutanixclusterconfigs.yaml
@@ -572,6 +572,83 @@ spec:
  - host
  - port
  type: object
failureDomains:
  description: |-
    failureDomains configures failure domains information for the Nutanix platform.
    When set, the failure domains defined here may be used to spread Machines across
    prism element clusters to improve fault tolerance of the cluster.
  items:
    description: NutanixFailureDomain configures failure domain information for Nutanix.
    properties:
      cluster:
        description: |-
          cluster is to identify the cluster (the Prism Element under management of the Prism Central),
          in which the Machine's VM will be created. The cluster identifier (uuid or name) can be obtained
          from the Prism Central console or using the prism_central API.
        properties:
          name:
            description: name is the resource name in the PC
            type: string
          type:
            description: Type is the identifier type to use for this resource.
            enum:
            - uuid
            - name
            type: string
          uuid:
            description: uuid is the UUID of the resource in the PC.
            type: string
        required:
        - type
        type: object
      controlPlane:
        description: indicates if a failure domain is suited for control plane nodes
        type: boolean
      name:
        description: |-
          name defines the unique name of a failure domain.
          Name is required and must be at most 64 characters in length.
          It must consist of only lower case alphanumeric characters and hyphens (-).
          It must start and end with an alphanumeric character.
          This value is arbitrary and is used to identify the failure domain within the platform.
        maxLength: 64
        minLength: 1
        pattern: '[a-z0-9]([-a-z0-9]*[a-z0-9])?'
        type: string
      subnets:
        description: |-
          subnets holds a list of identifiers (one or more) of the cluster's network subnets
          for the Machine's VM to connect to. The subnet identifiers (uuid or name) can be
          obtained from the Prism Central console or using the prism_central API.
        items:
          description: NutanixResourceIdentifier holds the identity of a Nutanix PC resource (cluster, image, subnet, etc.)
          properties:
            name:
              description: name is the resource name in the PC
              type: string
            type:
              description: Type is the identifier type to use for this resource.
              enum:
              - uuid
              - name
              type: string
            uuid:
              description: uuid is the UUID of the resource in the PC.
              type: string
          required:
          - type
          type: object
        minItems: 1
        type: array
    required:
    - cluster
    - name
    - subnets
    type: object
  type: array
  x-kubernetes-list-map-keys:
  - name
  x-kubernetes-list-type: map
prismCentralEndpoint:
  description: Nutanix Prism Central endpoint configuration.
  properties:
44 changes: 44 additions & 0 deletions api/v1alpha1/nutanix_clusterconfig_types.go
@@ -7,6 +7,8 @@ import (
	"fmt"
	"net/url"
	"strconv"

	capxv1 "github.com/nutanix-cloud-native/cluster-api-runtime-extensions-nutanix/api/external/github.com/nutanix-cloud-native/cluster-api-provider-nutanix/api/v1beta1"
)

const (
@@ -23,6 +25,14 @@ type NutanixSpec struct {
	// Nutanix Prism Central endpoint configuration.
	// +kubebuilder:validation:Required
	PrismCentralEndpoint NutanixPrismCentralEndpointSpec `json:"prismCentralEndpoint"`

	// failureDomains configures failure domains information for the Nutanix platform.
	// When set, the failure domains defined here may be used to spread Machines across
	// prism element clusters to improve fault tolerance of the cluster.
	// +listType=map
	// +listMapKey=name
	// +kubebuilder:validation:Optional
	FailureDomains []NutanixFailureDomain `json:"failureDomains,omitempty"`
}
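
The `+listType=map` and `+listMapKey=name` markers make `name` the key of the `failureDomains` list, so two entries with the same name are expected to be rejected by the API server. The following is a minimal sketch of that uniqueness rule only; the `failureDomain` type and the `duplicateNames` helper are hypothetical stand-ins for illustration and are not part of this PR.

```go
package main

import "fmt"

// failureDomain is a simplified, hypothetical stand-in for NutanixFailureDomain.
// Only the list-map key (name) matters for this sketch.
type failureDomain struct {
	Name string
}

// duplicateNames returns every name that occurs more than once,
// reported once each, in the order the first repeat is seen.
func duplicateNames(fds []failureDomain) []string {
	seen := map[string]int{}
	var dups []string
	for _, fd := range fds {
		seen[fd.Name]++
		if seen[fd.Name] == 2 {
			dups = append(dups, fd.Name)
		}
	}
	return dups
}

func main() {
	fds := []failureDomain{{Name: "fd-1"}, {Name: "fd-2"}, {Name: "fd-1"}}
	fmt.Println(duplicateNames(fds)) // prints [fd-1]
}
```

A check like this could give an earlier, friendlier error than the API server's rejection, but whether it is needed here is a design decision for the PR authors.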

type NutanixPrismCentralEndpointSpec struct {
@@ -76,3 +86,37 @@ func (s NutanixPrismCentralEndpointSpec) ParseURL() (string, int32, error) {

	return hostname, int32(port), nil
}

// NutanixFailureDomains is a list of FDs.
type NutanixFailureDomains []NutanixFailureDomain

// NutanixFailureDomain configures failure domain information for Nutanix.
type NutanixFailureDomain struct {
Contributor

Is there some validation we want to do with PE clusters and subnets, like setting either this or those values there?

Contributor Author

CAPX handles it by first checking whether a failure domain is set (https://github.com/nutanix-cloud-native/cluster-api-provider-nutanix/blob/v1.4.0-alpha.2/controllers/nutanixmachine_controller.go#L834). We need to see whether CAREN has to handle that too, since CAREN has no notion of the order in which patches are called. Do you have any proposals?

	// name defines the unique name of a failure domain.
	// Name is required and must be at most 64 characters in length.
	// It must consist of only lower case alphanumeric characters and hyphens (-).
	// It must start and end with an alphanumeric character.
	// This value is arbitrary and is used to identify the failure domain within the platform.
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	// +kubebuilder:validation:MaxLength=64
	// +kubebuilder:validation:Pattern=`[a-z0-9]([-a-z0-9]*[a-z0-9])?`
	Name string `json:"name"`

	// cluster is to identify the cluster (the Prism Element under management of the Prism Central),
	// in which the Machine's VM will be created. The cluster identifier (uuid or name) can be obtained
	// from the Prism Central console or using the prism_central API.
	// +kubebuilder:validation:Required
	Cluster capxv1.NutanixResourceIdentifier `json:"cluster"`

	// subnets holds a list of identifiers (one or more) of the cluster's network subnets
	// for the Machine's VM to connect to. The subnet identifiers (uuid or name) can be
	// obtained from the Prism Central console or using the prism_central API.
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinItems=1
	Subnets []capxv1.NutanixResourceIdentifier `json:"subnets"`

	// indicates if a failure domain is suited for control plane nodes
	// +kubebuilder:validation:Required
	ControlPlane bool `json:"controlPlane,omitempty"`
}
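
One point worth checking on the `name` pattern: it carries no `^` or `$` anchors. Assuming the CRD `pattern` is evaluated as an unanchored regular expression match (which is how Kubernetes schema validation is generally understood to behave), any string containing a single lowercase letter or digit would pass. The sketch below compares the pattern as written in the marker with a hypothetical anchored variant that is not part of this PR; the helper names are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Pattern exactly as written in the kubebuilder marker above.
	unanchored = regexp.MustCompile(`[a-z0-9]([-a-z0-9]*[a-z0-9])?`)
	// Hypothetical anchored variant, shown for comparison only.
	anchored = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)
)

// matchesUnanchored reports whether any substring of name matches the pattern.
func matchesUnanchored(name string) bool { return unanchored.MatchString(name) }

// matchesAnchored reports whether the whole of name matches the pattern.
func matchesAnchored(name string) bool { return anchored.MatchString(name) }

func main() {
	for _, name := range []string{"fd-1", "-fd", "Bad_Name"} {
		fmt.Printf("%q unanchored=%v anchored=%v\n",
			name, matchesUnanchored(name), matchesAnchored(name))
	}
}
```

If the unanchored behavior holds, names such as `-fd` or `Bad_Name` would be accepted despite the documented rules, so anchoring may be worth raising in review.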
55 changes: 54 additions & 1 deletion api/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default.

93 changes: 93 additions & 0 deletions docs/content/customization/nutanix/failure-domains.md
@@ -0,0 +1,93 @@
+++
title = "Failure Domains"
+++

Configure failure domains. Each failure domain defines the Prism Element cluster and subnets used to create the
control plane or worker node VMs of a Kubernetes cluster.

## Examples

### Configure one or more Failure Domains

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: <NAME>
spec:
  topology:
    variables:
      - name: clusterConfig
        value:
          nutanix:
            failureDomains:
              - cluster:
                  name: pe-cluster-name-1
                  type: name
                controlPlane: true
                name: failure-domain-name-1
                subnets:
                  - name: subnet-name-1
                    type: name
              - cluster:
                  name: pe-cluster-name-2
                  type: name
                controlPlane: true
                name: failure-domain-name-2
                subnets:
                  - name: subnet-name-2
                    type: name
```

Applying this configuration will result in the following value being set:

- `NutanixCluster`:

```yaml
spec:
  template:
    spec:
      failureDomains:
        - cluster:
            name: pe-cluster-name-1
            type: name
          controlPlane: true
          name: failure-domain-name-1
          subnets:
            - name: subnet-name-1
              type: name
        - cluster:
            name: pe-cluster-name-2
            type: name
          controlPlane: true
          name: failure-domain-name-2
          subnets:
            - name: subnet-name-2
              type: name
```

Note:

- Configuring failure domains is optional. If none are configured, the cluster and subnets of the respective
NutanixMachineTemplate are used to create the control plane and worker nodes.

- Only one failure domain can be used per MachineDeployment. Its worker nodes are created on the Prism Element
cluster and subnets of that failure domain.

- Control plane nodes are created on the cluster of every failure domain that has `controlPlane` set to `true`.
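
The control plane rule in the last note can be sketched as a simple filter. This is an illustration under assumptions, not the provider's actual placement logic: the `failureDomain` type is a simplified stand-in that stores the cluster as a plain string, and `controlPlaneClusters` is a hypothetical helper.

```go
package main

import "fmt"

// failureDomain is a simplified, hypothetical stand-in for NutanixFailureDomain.
type failureDomain struct {
	Name         string
	Cluster      string // Prism Element cluster name
	ControlPlane bool
}

// controlPlaneClusters returns the Prism Element cluster of every
// failure domain that is flagged as suitable for control plane nodes.
func controlPlaneClusters(fds []failureDomain) []string {
	var clusters []string
	for _, fd := range fds {
		if fd.ControlPlane {
			clusters = append(clusters, fd.Cluster)
		}
	}
	return clusters
}

func main() {
	fds := []failureDomain{
		{Name: "failure-domain-name-1", Cluster: "pe-cluster-name-1", ControlPlane: true},
		{Name: "failure-domain-name-2", Cluster: "pe-cluster-name-2", ControlPlane: false},
	}
	fmt.Println(controlPlaneClusters(fds)) // prints [pe-cluster-name-1]
}
```

An empty result here corresponds to the first note: with no eligible failure domains, placement falls back to the NutanixMachineTemplate's own cluster and subnets.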

Set the failure domain for each MachineDeployment as follows:

- `Cluster`:

```yaml
workers:
  machineDeployments:
    - class: default-worker
      name: md-0
      failureDomain: failure-domain-name-1
    - class: default-worker
      name: md-1
      failureDomain: failure-domain-name-2
```
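
Each `failureDomain` value on a MachineDeployment is meant to name one of the failure domains defined in `clusterConfig`. A minimal sketch of a consistency check for that relationship follows, assuming simplified types; `machineDeployment` and `undefinedFailureDomains` are hypothetical names, and nothing in this PR is confirmed to perform such a check.

```go
package main

import "fmt"

// machineDeployment is a simplified, hypothetical view of a topology worker entry.
type machineDeployment struct {
	Name          string
	FailureDomain string
}

// undefinedFailureDomains returns the names of machine deployments whose
// failureDomain is set but does not match any defined failure domain name.
func undefinedFailureDomains(defined []string, mds []machineDeployment) []string {
	known := make(map[string]struct{}, len(defined))
	for _, name := range defined {
		known[name] = struct{}{}
	}
	var bad []string
	for _, md := range mds {
		if md.FailureDomain == "" {
			continue // unset: the machine template's cluster and subnets apply
		}
		if _, ok := known[md.FailureDomain]; !ok {
			bad = append(bad, md.Name)
		}
	}
	return bad
}

func main() {
	defined := []string{"failure-domain-name-1", "failure-domain-name-2"}
	mds := []machineDeployment{
		{Name: "md-0", FailureDomain: "failure-domain-name-1"},
		{Name: "md-1", FailureDomain: "failure-domain-name-3"},
	}
	fmt.Println(undefinedFailureDomains(defined, mds)) // prints [md-1]
}
```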