Commit

Add new metrics for Dataset. Need to add update logic
akuangkkk committed Dec 14, 2023
1 parent ee58544 commit 0b8405b
Showing 5 changed files with 70 additions and 42 deletions.
37 changes: 20 additions & 17 deletions Documentation/devhandbook.md
@@ -1,15 +1,14 @@
# Development Handbook

Make Sure you have a k8s cluster
#### Make Sure you have a k8s cluster

## Local Testing

### Step 0
Generate Helm Chart files: RUN ```./dev/build/generate.sh``` under project root.

Generate Helm Chart files by running `./dev/build/generate.sh` under project root.

### Step 1
Install Alluxio Operator via Helm Chart under ```deploy/charts/alluxio-operator```
Install Alluxio Operator via Helm Chart under `deploy/charts/alluxio-operator`
```shell
helm install operator -f operator-config.yaml deploy/charts/alluxio-operator
```
@@ -20,28 +19,32 @@ Install Alluxio Operator via Helm Chart under ```deploy/charts/alluxio-operator
```


## How to Make Operator Docker Image
## Make Alluxio K8s Operator Docker Image

### Step 0
Register a Docker Hub account, then run docker login in a terminal
Create a Docker Hub account and log in from the terminal

### Step 1
Generate Helm Chart files: RUN ```./dev/build/generate.sh``` under project root.
Generate Helm Chart files by running `./dev/build/generate.sh` under project root.

### Step 2
Run docker from the project home directory:
```shell
docker build -t <docker username>/alluxio-operator:<tag> -f dev/build/Dockerfile .
```
Build the docker image by running `docker build -t <docker username>/alluxio-operator:<tag> -f dev/build/Dockerfile .` under project root.

* For Apple Silicon Chip: `docker buildx build --platform linux/amd64 -t <docker username>/alluxio-operator:<tag> -f dev/build/Dockerfile .`

* For Apple Silicon Chip:
```docker buildx build --platform linux/amd64 -t kshou433/alluxio-operator:v1.5 -f dev/build/Dockerfile .```

* Example:
```shell
docker buildx build --platform linux/amd64 -t kshou433/alluxio-operator:v1.6 -f dev/build/Dockerfile .
```

### Step 3
```shell
docker push kshou433/alluxio-operator:v1.5
```
Push the image to Docker Hub: `docker push <docker username>/alluxio-operator:<tag>`.

* Example:
```shell
docker push kshou433/alluxio-operator:v1.6
```

### Step 4
Update image URL and tag in ```operator-config.yaml```
@@ -50,7 +53,7 @@ Update image URL and tag in ```operator-config.yaml```
## Verify the Deployment

### Check endpoints
```kubectl get endpoints```
```kubectl get endpoints -A```

### Check if prometheus is running
```kubectl get prometheus -o yaml```
17 changes: 15 additions & 2 deletions Documentation/runbook.md
@@ -4,19 +4,32 @@



## How to deploy Alluxio K8S Operator with Prometheus Operator
## How to deploy Alluxio K8S Operator with [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus)

### Install Prometheus Operator to K8S Cluster

Make sure to add the correct namespaces to the jsonnet file, then generate a new manifest file.

#### It is common that
```
prometheus+:: {
namespaces: ["default", "kube-system", "monitoring", "alluxio-operator"],
},
```

##### These errors are expected:
```
Error from server (NotFound): namespaces "alluxio-operator" not found
Error from server (NotFound): namespaces "alluxio-operator" not found
```
Proceed to the next step to deploy the `Alluxio Operator`.

### Deploy Alluxio Operator using Helm
`helm install operator -f operator-config.yaml deploy/charts/alluxio-operator`

### Update Prometheus Operator

### Use the method in the Dev Handbook to verify everything is running


### Uninstall Operator:
`helm delete operator`
54 changes: 33 additions & 21 deletions monitoring/metrics.go
@@ -21,49 +21,61 @@ var metricDescription = map[string]MetricDescription{
// Help: "Total number of times the deployment size was not as desired.",
// Type: "Counter",
//},
//"AlluxioClusterAliveWorkerTotal": {
// Name: "alluxio_cluster_alive_worker_total",
// Help: "Total number of alive worker.",
// Type: "Gauge",
//},
"AlluxioClusterAliveWorkerTotal": {
Name: "alluxio_cluster_alive_worker_total",
Help: "Total number of alive worker in Alluxio Cluster.",
Type: "Gauge",
},
"AlluxioClusterDatasetMountedCountTotal": {
Name: "alluxio_cluster_dataset_mounted_count_total",
Help: "Total number of times the dataset was mounted.",
Type: "Counter",
},
"DatasetAliveWorkerTotal": {
Name: "dataset_alive_worker_total",
Help: "Total number of alive worker in Dataset.",
Type: "Gauge",
},
}

var (
//AlluxioClusterAliveWorkerTotal = prometheus.NewGauge(
// prometheus.GaugeOpts{
// Name: metricDescription["AlluxioClusterAliveWorkerTotal"].Name,
// Help: metricDescription["AlluxioClusterAliveWorkerTotal"].Help,
// },
//)
AlluxioClusterAliveWorkerTotal = prometheus.NewGauge(
prometheus.GaugeOpts{
Name: metricDescription["AlluxioClusterAliveWorkerTotal"].Name,
Help: metricDescription["AlluxioClusterAliveWorkerTotal"].Help,
},
)
AlluxioClusterDatasetMountedCountTotal = prometheus.NewCounter(
prometheus.CounterOpts{
Name: metricDescription["AlluxioClusterDatasetMountedCountTotal"].Name,
Help: metricDescription["AlluxioClusterDatasetMountedCountTotal"].Help,
},
)
DatasetAliveWorkerTotal = prometheus.NewGauge(
prometheus.GaugeOpts{
Name: metricDescription["DatasetAliveWorkerTotal"].Name,
Help: metricDescription["DatasetAliveWorkerTotal"].Help,
},
)
)

// RegisterMetrics will register metrics with the global prometheus registry
func RegisterMetrics() {
// metrics.Registry.MustRegister(AlluxioClusterAliveWorkerTotal)
metrics.Registry.MustRegister(AlluxioClusterDatasetMountedCountTotal)
logger.Infof("Register Metrics: AlluxioClusterDatasetMountedCountTotal")
}
//func RegisterMetrics() {
// // metrics.Registry.MustRegister(AlluxioClusterAliveWorkerTotal)
// metrics.Registry.MustRegister(AlluxioClusterDatasetMountedCountTotal)
// logger.Infof("Register Metrics: AlluxioClusterDatasetMountedCountTotal")
//}

func RegisterDatasetControllerMetrics() {
// metrics.Registry.MustRegister(AlluxioClusterAliveWorkerTotal)
func RegisterAlluxioControllerMetrics() {
metrics.Registry.MustRegister(AlluxioClusterAliveWorkerTotal)
logger.Infof("Register Metrics: AlluxioClusterAliveWorkerTotal")
metrics.Registry.MustRegister(AlluxioClusterDatasetMountedCountTotal)
logger.Infof("Register Metrics: AlluxioClusterDatasetMountedCountTotal")
}

func RegisterAlluxioControllerMetrics() {
// metrics.Registry.MustRegister(AlluxioClusterAliveWorkerTotal)
logger.Infof("Register Metrics: PENDING: AlluxioClusterAliveWorkerTotal")
func RegisterDatasetControllerMetrics() {
metrics.Registry.MustRegister(DatasetAliveWorkerTotal)
logger.Infof("Register Metrics: DatasetAliveWorkerTotal")
}
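
The commit message notes that update logic is still missing for the new gauges. A minimal sketch of what such an update could look like follows. It is an illustration only, not the project's implementation: `gaugeSetter`, `fakeGauge`, and `recordAliveWorkers` are hypothetical names that do not appear in this commit, and the fake gauge stands in for the Prometheus client so the example runs with the standard library alone. A real `prometheus.Gauge` exposes `Set(float64)`, so it would satisfy the same interface.

```go
package main

import "fmt"

// gaugeSetter is the subset of a gauge this sketch needs (hypothetical name).
// A prometheus.Gauge has a Set(float64) method, so a gauge such as
// DatasetAliveWorkerTotal could be passed where a gaugeSetter is expected.
type gaugeSetter interface {
	Set(float64)
}

// fakeGauge is a hypothetical in-memory stand-in used only to keep this
// sketch free of third-party imports.
type fakeGauge struct{ value float64 }

// Set stores the latest observed value.
func (g *fakeGauge) Set(v float64) { g.value = v }

// recordAliveWorkers is a hypothetical helper: it clamps negative counts to
// zero and writes the observed number of alive workers to the gauge.
func recordAliveWorkers(g gaugeSetter, alive int) {
	if alive < 0 {
		alive = 0
	}
	g.Set(float64(alive))
}

func main() {
	g := &fakeGauge{}
	recordAliveWorkers(g, 3) // e.g. three workers reported ready
	fmt.Println(g.value)
}
```

In a controller, a helper like this would presumably be called from the reconcile loop once the current worker count is known; where that count comes from is outside what this commit shows.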

// ListMetrics will create a slice with the metrics available in metricDescription
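
The body of `ListMetrics` is not part of this hunk. As a rough sketch of what "a slice with the metrics available in metricDescription" could mean, the standalone example below rebuilds the map with the three entries enabled by this commit and returns them in a stable order. The function signature, the string field types, and the sort-by-key ordering are assumptions made for illustration; the project's real function may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// MetricDescription mirrors the fields used in the diff (Name, Help, Type).
// The string field types are inferred from the literals shown there.
type MetricDescription struct {
	Name string
	Help string
	Type string
}

// metricDescription repeats the three entries that are active after this commit.
var metricDescription = map[string]MetricDescription{
	"AlluxioClusterAliveWorkerTotal": {
		Name: "alluxio_cluster_alive_worker_total",
		Help: "Total number of alive worker in Alluxio Cluster.",
		Type: "Gauge",
	},
	"AlluxioClusterDatasetMountedCountTotal": {
		Name: "alluxio_cluster_dataset_mounted_count_total",
		Help: "Total number of times the dataset was mounted.",
		Type: "Counter",
	},
	"DatasetAliveWorkerTotal": {
		Name: "dataset_alive_worker_total",
		Help: "Total number of alive worker in Dataset.",
		Type: "Gauge",
	},
}

// ListMetrics (assumed signature) returns the descriptions sorted by map key,
// because Go map iteration order is not deterministic.
func ListMetrics() []MetricDescription {
	keys := make([]string, 0, len(metricDescription))
	for k := range metricDescription {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make([]MetricDescription, 0, len(keys))
	for _, k := range keys {
		out = append(out, metricDescription[k])
	}
	return out
}

func main() {
	for _, m := range ListMetrics() {
		fmt.Printf("%s (%s): %s\n", m.Name, m.Type, m.Help)
	}
}
```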
2 changes: 1 addition & 1 deletion operator-config.yaml
@@ -1,5 +1,5 @@
image: docker.io/kshou433/alluxio-operator
imageTag: v1.5
imageTag: v1.6
imagePullPolicy: Always
alluxio-csi:
enabled: 'false'
2 changes: 1 addition & 1 deletion pkg/alluxiocluster/alluxiocluster_controller.go
@@ -43,7 +43,7 @@ type AlluxioClusterReconcileReqCtx struct {
}

const ruleName = "alluxio-operator-rules"
const namespace = "alluxio-operator-system"
const namespace = "alluxio-operator"

func (r *AlluxioClusterReconciler) Reconcile(context context.Context, req ctrl.Request) (ctrl.Result, error) {
logger.Infof("Reconciling AlluxioCluster %s", req.NamespacedName.String())
