[CLOUD-97] Add support for GCR/GKE #342

Merged · 9 commits · Mar 13, 2024
22 changes: 17 additions & 5 deletions docs/setup_installation/gcp/cluster_creation.md
@@ -35,6 +35,8 @@ Enter the name of the bucket in which the Hopsworks cluster will store its data
!!! warning
    The bucket must be empty and must be in a region accessible from the region in which the cluster is deployed.

The artifact registry used to store the cluster's Docker images (9)

<p align="center">
<figure>
<img style="border: 1px solid #000;width:700px" src="../../../assets/images/setup_installation/managed/gcp/create-instance-general.png" alt="General configuration">
@@ -97,7 +99,17 @@ To backup the storage bucket data when taking a cluster backup we need to set a
</figure>
</p>

### Step 6 VPC and Subnet selection
### Step 6 Managed Containers
The Hopsworks cluster can integrate with Google Kubernetes Engine (GKE) to launch Python jobs, Jupyter servers, and model serving on top of GKE. For more details on how to set up this integration, refer to [Integration with Google GKE](gke_integration.md).
<p align="center">
<figure>
<img style="border: 1px solid #000;width:700px" src="../../../assets/images/setup_installation/managed/gcp/create-instance-set-gke-cluster.png" alt="Add GKE cluster name">
<figcaption>Add GKE cluster name</figcaption>
</figure>
</p>


### Step 7 VPC and Subnet selection

You can select the VPC which will be used by the Hopsworks cluster.
You can either select an existing VPC or let [managed.hopsworks.ai](https://managed.hopsworks.ai) create one for you.
@@ -125,7 +137,7 @@ Select the *Subnet* to be used by your cluster and press *Next*.
</p>


### Step 7 User management selection
### Step 8 User management selection
In this step, you can choose which user management system to use. You have three choices:

* *Managed*: [managed.hopsworks.ai](https://managed.hopsworks.ai) automatically adds and removes users from the Hopsworks cluster when you add and remove users from your organization (more details [here](../common/user_management.md)).
@@ -140,7 +152,7 @@ In this step, you can choose which user management system to use. You have three
</figure>
</p>

### Step 8 Managed RonDB
### Step 9 Managed RonDB
Hopsworks uses [RonDB](https://www.rondb.com/) as the database engine for its online Feature Store. By default, the database runs on its
own VM. Premium users can scale out database services to multiple VMs
to handle increased workloads.
@@ -156,7 +168,7 @@ For details on how to configure RonDB check our guide [here](../common/rondb.md)

If you need to deploy a RonDB cluster instead of a single node please contact [us](mailto:sales@logicalclocks.com).

### Step 9 add tags to your instances.
### Step 10 Add tags to your instances
In this step, you can define tags that will be added to the cluster virtual machines.

<p align="center">
@@ -183,7 +195,7 @@ configuration as this might affect Cluster creation.
</figure>
</p>

### Step 11 Review and create
### Step 12 Review and create
Review all information and select *Create*:

<p align="center">
11 changes: 9 additions & 2 deletions docs/setup_installation/gcp/getting_started.md
@@ -24,11 +24,12 @@ To run all the commands on this page the user needs to have at least the following
storage.buckets.create
```

Make sure to enable *Compute Engine API*, *Cloud Resource Manager API*, and *Identity and Access Management (IAM) API* on the GCP project. This can be done by running the following commands. Replacing *$PROJECT_ID* with the id of your GCP project.
Make sure to enable *Compute Engine API*, *Cloud Resource Manager API*, *Identity and Access Management (IAM) API*, and *Artifact Registry API* on the GCP project. This can be done by running the following commands, replacing *$PROJECT_ID* with the ID of your GCP project.
```bash
gcloud --project=$PROJECT_ID services enable compute.googleapis.com
gcloud --project=$PROJECT_ID services enable cloudresourcemanager.googleapis.com
gcloud --project=$PROJECT_ID services enable iam.googleapis.com
gcloud --project=$PROJECT_ID services enable artifactregistry.googleapis.com
```
You can find more information about GCP cloud APIs in the [GCP documentation](https://cloud.google.com/apis/docs/getting-started).
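If you want to confirm the APIs are active before proceeding, a quick check along these lines should work (a sketch; the `grep` pattern simply matches the four services enabled above):

```bash
# List enabled services and keep only the four we just turned on
gcloud --project=$PROJECT_ID services list --enabled | grep -E 'compute|cloudresourcemanager|iam|artifactregistry'
```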
## Step 1: Connecting your GCP account
@@ -63,7 +64,7 @@ gsutil mb -p $PROJECT_ID -l US gs://$BUCKET_NAME


## Step 3: Creating a service account for your cluster instances
The cluster instances will need to be granted permission to access the storage bucket. You achieve this by creating a service account that will later be attached to the Hopsworks cluster instances. This service account should be different from the service account created in step 1, as it has only those permissions related to storing objects in a GCP bucket.
The cluster instances will need to be granted permission to access the storage bucket and the artifact registry. You achieve this by creating a service account that will later be attached to the Hopsworks cluster instances. This service account should be different from the service account created in step 1, as it only needs the permissions for storing objects in a GCP bucket and Docker images in an artifact registry repository.

### Step 3.1: Creating a custom role for accessing storage
Create a file named *hopsworksai_instances_role.yaml* with the following content:
@@ -84,6 +85,12 @@ includedPermissions:
- storage.objects.get
- storage.objects.list
- storage.objects.update
- artifactregistry.repositories.create
- artifactregistry.repositories.get
- artifactregistry.repositories.uploadArtifacts
- artifactregistry.repositories.downloadArtifacts
- artifactregistry.tags.list
- artifactregistry.tags.delete
```
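
For reference, a custom role can be created from a definition file like this with a command along the following lines (a sketch; the role ID `hopsworksai.instances` is illustrative and may differ from the one the guide actually uses):

```bash
# Create a custom IAM role from the YAML definition above
gcloud iam roles create hopsworksai.instances --project=$PROJECT_ID \
    --file=hopsworksai_instances_role.yaml
```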

!!! note
108 changes: 108 additions & 0 deletions docs/setup_installation/gcp/gke_integration.md
@@ -0,0 +1,108 @@
# Integration with Google GKE

This guide demonstrates the step-by-step process to create a cluster in [managed.hopsworks.ai](https://managed.hopsworks.ai) with integrated support for Google Kubernetes Engine (GKE). This enables Hopsworks to launch Python jobs, Jupyter servers, and serve models on top of GKE.

!!! note
    Currently, we don't support sharing GKE clusters between Hopsworks clusters; a GKE cluster can only be used by one Hopsworks cluster. Also, we only support integration with a GKE cluster in the same project as the Hopsworks cluster.

!!! note
    If you prefer Terraform over the gcloud command line, you can refer to our Terraform example [here](https://github.com/logicalclocks/terraform-provider-hopsworksai/tree/main/examples/complete/gcp/gke).

## Step 1: Attach Kubernetes developer role to the service account for cluster instances

Ensure that the Hopsworks cluster has access to the GKE cluster by attaching the Kubernetes Engine Developer role to the [service account you will attach to the cluster nodes](getting_started.md#step-3-creating-a-service-account-for-your-cluster-instances). Execute the following gcloud command to attach `roles/container.developer` to the cluster service account. Replace *\$PROJECT_ID* with your GCP project id and *\$SERVICE_ACCOUNT* with the service account you created in getting started [Step 3](getting_started.md#step-3-creating-a-service-account-for-your-cluster-instances).

```bash
gcloud projects add-iam-policy-binding $PROJECT_ID --member=$SERVICE_ACCOUNT --role="roles/container.developer"
```
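
To verify that the binding took effect, you can list the members that hold the role (this assumes *\$SERVICE_ACCOUNT* is a full member string such as `serviceAccount:name@$PROJECT_ID.iam.gserviceaccount.com`):

```bash
# Show every member granted roles/container.developer on the project
gcloud projects get-iam-policy $PROJECT_ID \
    --flatten="bindings[].members" \
    --filter="bindings.role:roles/container.developer" \
    --format="value(bindings.members)"
```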

## Step 2: Create a virtual network to be used by Hopsworks and GKE

You need to create a virtual network and a subnet in which Hopsworks and the GKE nodes will run. To do this, run the following commands, replacing *\$PROJECT_ID* with the id of the GCP project in which you will run your cluster and *\$SERVICE_ACCOUNT* with the service account you granted the role to in [Step 1](#step-1-attach-kubernetes-developer-role-to-the-service-account-for-cluster-instances). In this step, we create a virtual network `hopsworks`, a subnetwork `hopsworks-eu-north`, and three firewall rules that allow communication within the virtual network and inbound HTTP and HTTPS traffic.

```bash
gcloud compute networks create hopsworks --project=$PROJECT_ID --subnet-mode=custom --mtu=1460 --bgp-routing-mode=regional

gcloud compute networks subnets create hopsworks-eu-north --project=$PROJECT_ID --range=10.1.0.0/24 --stack-type=IPV4_ONLY --network=hopsworks --region=europe-north1

gcloud compute firewall-rules create hopsworks-nodetonode --network=hopsworks --allow=all --direction=INGRESS --target-service-accounts=$SERVICE_ACCOUNT --source-service-accounts=$SERVICE_ACCOUNT --project=$PROJECT_ID

gcloud compute firewall-rules create hopsworks-inbound-http --network=hopsworks --direction=INGRESS --target-service-accounts=$SERVICE_ACCOUNT --allow=tcp:80 --source-ranges="0.0.0.0/0" --project=$PROJECT_ID

gcloud compute firewall-rules create hopsworks-inbound-https --network=hopsworks --direction=INGRESS --target-service-accounts=$SERVICE_ACCOUNT --allow=tcp:443 --source-ranges="0.0.0.0/0" --project=$PROJECT_ID

```
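
As a sanity check, you can describe the subnetwork to confirm its CIDR range before moving on:

```bash
# Should print 10.1.0.0/24
gcloud compute networks subnets describe hopsworks-eu-north \
    --region=europe-north1 --project=$PROJECT_ID --format="value(ipCidrRange)"
```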

## Step 3: Create a GKE cluster

In this step, we create a GKE cluster and set the cluster pod CIDR to `10.124.0.0/14`. GKE offers two modes of operation for clusters: `Autopilot` and `Standard`. Choose one of the following two options to create a GKE cluster.

### Option 1: Standard cluster

Run the following gcloud command to create a zonal standard GKE cluster. Replace *\$PROJECT_ID* with the id of the GCP project in which you will run your cluster.

```bash
gcloud container clusters create hopsworks-gke --project=$PROJECT_ID --machine-type="e2-standard-8" --num-nodes=1 --zone="europe-north1-c" --network="hopsworks" --subnetwork="hopsworks-eu-north" --cluster-ipv4-cidr="10.124.0.0/14" --cluster-version="1.27.3-gke.100"
```
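
Optionally, fetch credentials for the new cluster and confirm the node is up before continuing (assumes `kubectl` is installed locally):

```bash
# Point kubectl at the new zonal cluster and list its nodes
gcloud container clusters get-credentials hopsworks-gke --zone="europe-north1-c" --project=$PROJECT_ID
kubectl get nodes
```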

Run the following gcloud command to allow all incoming traffic from the GKE cluster to Hopsworks.

```bash
gcloud compute firewall-rules create hopsworks-allow-traffic-from-gke-pods --project=$PROJECT_ID --network="hopsworks" --direction=INGRESS --priority=1000 --action=ALLOW --rules=all --source-ranges="10.124.0.0/14"
```

### Option 2: Autopilot cluster

Run the following gcloud command to create an Autopilot cluster. Replace *\$PROJECT_ID* with the id of the GCP project in which you will run your cluster.

```bash
gcloud container clusters create-auto hopsworks-gke --project $PROJECT_ID --region="europe-north1" --network="hopsworks" --subnetwork="hopsworks-eu-north" --cluster-ipv4-cidr="10.124.0.0/14"
```

Run the following gcloud command to allow all incoming traffic from the GKE cluster to Hopsworks.

```bash
gcloud compute firewall-rules create hopsworks-allow-traffic-from-gke-pods --project=$PROJECT_ID --network="hopsworks" --direction=INGRESS --priority=1000 --action=ALLOW --rules=all --source-ranges="10.124.0.0/14"
```
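
As with the standard cluster, you can optionally fetch credentials and confirm connectivity (Autopilot clusters are regional, hence `--region`; nodes are provisioned on demand, so `cluster-info` is a safer check than listing nodes):

```bash
# Point kubectl at the Autopilot cluster and verify the control plane responds
gcloud container clusters get-credentials hopsworks-gke --region="europe-north1" --project=$PROJECT_ID
kubectl cluster-info
```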

## Step 4: Create a Hopsworks cluster

In [managed.hopsworks.ai](https://managed.hopsworks.ai), follow the same instructions as in [the cluster creation guide](cluster_creation.md), except when setting *Region*, *Managed Containers*, *VPC*, and *Subnet*.

- On the General tab, choose the same region as the one used in [Step 2](#step-2-create-a-virtual-network-to-be-used-by-hopsworks-and-gke) and [Step 3](#step-3-create-a-gke-cluster) (`europe-north1`)
- On the *Managed Containers* tab, choose **Enabled** and enter the name of the GKE cluster created in [Step 3](#step-3-create-a-gke-cluster) (`hopsworks-gke`)
- On the VPC and Subnet tabs, choose the network and subnetwork created in [Step 2](#step-2-create-a-virtual-network-to-be-used-by-hopsworks-and-gke) (`hopsworks`, `hopsworks-eu-north`).

## Step 5: Configure DNS

### Option 1: Standard cluster
In the setup described in [Step 3](#option-1-standard-cluster), we use the default DNS provider, `kube-dns`. Hopsworks automatically configures `kube-dns` during cluster initialization, so no extra steps are needed here.
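
If you are curious what Hopsworks configured, one way to inspect it is to dump the `kube-dns` ConfigMap once the Hopsworks cluster is running (a sketch; the exact keys Hopsworks sets are not documented here):

```bash
# Inspect the kube-dns configuration, including any stub domains added by Hopsworks
kubectl -n kube-system get configmap kube-dns -o yaml
```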

Alternatively, if you configured `Cloud DNS` while creating the standard GKE cluster, you need to add the following firewall rule to allow incoming traffic from `Cloud DNS` on port `53` to Hopsworks. `35.199.192.0/19` is the IP range used by Cloud DNS to issue DNS requests; check [this guide](https://cloud.google.com/dns/docs/zones/forwarding-zones#firewall-rules) for more details.

```bash
gcloud compute --project=$PROJECT_ID firewall-rules create hopsworks-clouddns-forward-consul --direction=INGRESS --priority=1000 --network="hopsworks" --action=ALLOW --rules=udp:53 --source-ranges="35.199.192.0/19"
```


### Option 2: Autopilot cluster

Hopsworks internally uses Consul for service discovery, and we automatically forward traffic from Standard GKE clusters to the corresponding Hopsworks cluster. However, Autopilot clusters don't allow updating the DNS configuration through `kube-dns` and use Cloud DNS by default. Therefore, to allow seamless communication between pods running on GKE and Hopsworks, we need to add a [forwarding zone](https://cloud.google.com/dns/docs/zones/forwarding-zones) to Cloud DNS that forwards the `.consul` DNS zone to the Hopsworks head node.

First, we need to get the IP of the Hopsworks head node of your cluster. Replace *\$PROJECT_ID* with the id of the GCP project in which you run your cluster and *\$CLUSTER_NAME* with the name you gave your Hopsworks cluster during creation in [Step 4](#step-4-create-a-hopsworks-cluster). The following gcloud command retrieves the internal IP of the Hopsworks head node.

```bash
HOPSWORKS_HEAD_IP=$(gcloud compute instances describe --project=$PROJECT_ID $CLUSTER_NAME-master --format='get(networkInterfaces[0].networkIP)')
```

Use the *\$HOPSWORKS_HEAD_IP* you obtained from the above command to create the following forwarding zone on Cloud DNS:

```bash
gcloud dns --project=$PROJECT_ID managed-zones create hopsworks-consul --description="Forward .consul DNS requests to Hopsworks" --dns-name="consul." --visibility="private" --networks="hopsworks" --forwarding-targets=$HOPSWORKS_HEAD_IP
```
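
You can verify that the zone was created and points at the head node with:

```bash
# Shows the zone's DNS name and forwarding targets
gcloud dns managed-zones describe hopsworks-consul --project=$PROJECT_ID
```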

Finally, add the following firewall rule to allow incoming traffic from `Cloud DNS` on port `53` to Hopsworks. `35.199.192.0/19` is the IP range used by Cloud DNS to issue DNS requests; check [this guide](https://cloud.google.com/dns/docs/zones/forwarding-zones#firewall-rules) for more details.

```bash
gcloud compute --project=$PROJECT_ID firewall-rules create hopsworks-clouddns-forward-consul --direction=INGRESS --priority=1000 --network="hopsworks" --action=ALLOW --rules=udp:53 --source-ranges="35.199.192.0/19"
```
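
Once the Hopsworks cluster is up, you can test the forwarding end to end by resolving a `.consul` name from a pod in the GKE cluster (a sketch; the hostname `hopsworks.service.consul` is hypothetical and depends on your deployment):

```bash
# Run a throwaway pod and try to resolve a Consul-served name
kubectl run dns-check --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup hopsworks.service.consul
```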
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -207,6 +207,7 @@ nav:
- Getting Started: setup_installation/gcp/getting_started.md
- Cluster Creation: setup_installation/gcp/cluster_creation.md
- Limiting Permissions: setup_installation/gcp/restrictive_permissions.md
- GKE integration: setup_installation/gcp/gke_integration.md
- Common:
- The dashboard: setup_installation/common/dashboard.md
- Settings: setup_installation/common/settings.md