// docs/modules/ROOT/pages/lm-eval-tutorial.adoc
xref:component-lm-eval.adoc[LM-Eval] is a service for large language model evaluation.

[NOTE]
====
LM-Eval is only available since TrustyAI's 1.28.0 community builds.
In order to use it on Open Data Hub, you need either ODH 2.20 (or newer) or to add the following `devFlag` to your `DataScienceCluster` resource:

[source,yaml]
----
Specify the task using the Unitxt recipe format:

|`genArgs`
|Maps to the `--gen_kwargs` parameter of the lm-evaluation-harness. Here are the link:https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/interface.md#command-line-interface[details].

|`logSamples`
|If this flag is passed, the model's outputs, and the text fed into the model, will be saved at per-document granularity.

|`batchSize`
Specify extra information for the lm-eval job's pod.

** `resources`: Specify the resources for the lm-eval container.
* `volumes`: Specify the volume information for the lm-eval and other containers. It uses the `Volume` data structure of Kubernetes.
* `sideCars`: A list of containers that run along with the lm-eval container. It uses the `Container` data structure of Kubernetes.

|`outputs`
|This section defines custom output locations for storing the evaluation results. At the moment, only Persistent Volume Claims (PVCs) are supported.

|`outputs.pvcManaged`
|Create an operator-managed PVC to store this job's results. The PVC will be named `<job-name>-pvc` and will be owned by the `LMEvalJob`. After job completion, the PVC will still be available, but it will be deleted upon deleting the `LMEvalJob`. Supports the following fields:

* `size`: The PVC's size, compatible with standard PVC syntax (e.g. `5Gi`)

|`outputs.pvcName`
|Binds an existing PVC to a job by specifying its name. The PVC must be created separately and must already exist when creating the job.
|===
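As a minimal sketch of how a few of these fields might look together in an `LMEvalJob` spec (the values, and the name/value shape of the `genArgs` entries, are illustrative assumptions, not taken from this document):

```yaml
# Illustrative LMEvalJob fragment; all values are assumptions
spec:
  logSamples: true      # save per-document inputs and outputs
  batchSize: "8"        # batch size for the evaluation
  genArgs:              # forwarded to lm-evaluation-harness --gen_kwargs
    - name: max_gen_toks
      value: "1024"
```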
== Examples
Inside the custom card, it uses the HuggingFace dataset loader:

You can use other link:https://www.unitxt.ai/en/latest/unitxt.loaders.html#module-unitxt.loaders[loaders] and use the `volumes` and `volumeMounts` to mount the dataset from persistent volumes. For example, if you use link:https://www.unitxt.ai/en/latest/unitxt.loaders.html#unitxt.loaders.LoadCSV[LoadCSV], you need to mount the files to the container and make the dataset accessible for the evaluation process.
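Following the `pod.volumes` and `pod.container` fields described earlier, mounting a CSV dataset could be sketched along these lines (the PVC name and mount path are assumptions for illustration):

```yaml
# Illustrative fragment; PVC name and mount path are assumptions
pod:
  container:
    volumeMounts:
      - name: dataset-storage
        mountPath: /data            # LoadCSV can then read /data/<file>.csv
  volumes:
    - name: dataset-storage
      persistentVolumeClaim:
        claimName: my-dataset-pvc   # pre-provisioned PVC holding the CSV files
```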
=== Using PVCs as storage

To use a PVC as storage for the `LMEvalJob` results, two modes are supported at the moment: managed and existing PVCs.

Managed PVCs, as the name implies, are managed by the TrustyAI operator. To enable a managed PVC, simply specify its size:

[source,yaml]
----
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: evaljob-sample
spec:
  # other fields omitted ...
  outputs: <1>
    pvcManaged: <2>
      size: 5Gi <3>
----
<1> `outputs` is the section for specifying custom storage locations
<2> `pvcManaged` will create an operator-managed PVC
<3> `size` (compatible with standard PVC syntax) is the only supported value

This will create a PVC named `<job-name>-pvc` (in this case `evaljob-sample-pvc`) which will be available after the job finishes, but will be deleted when the `LMEvalJob` is deleted.
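Once the job finishes, the results on the managed PVC can be inspected by mounting it into a helper pod. A sketch, assuming the job from the example above (the pod name, image, and mount path are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: results-reader             # illustrative name
spec:
  containers:
    - name: reader
      image: registry.access.redhat.com/ubi9/ubi-minimal
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: results
          mountPath: /results      # evaluation output readable here
  volumes:
    - name: results
      persistentVolumeClaim:
        claimName: evaljob-sample-pvc  # <job-name>-pvc from the example above
```

With the pod running, `oc exec results-reader -- ls /results` lists the stored evaluation output.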
To use an already existing PVC, you can pass its name as a reference.
The PVC must already exist when the `LMEvalJob` is created. Start by creating a PVC, for instance:

[source,yaml]
----
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "my-pvc"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
----

And then reference it from the `LMEvalJob`:

[source,yaml]
----
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: evaljob-sample
spec:
  # other fields omitted ...
  outputs:
    pvcName: "my-pvc" <1>
----
<1> `pvcName` references the already existing PVC `my-pvc`.

In this case, the PVC is not managed by the TrustyAI operator, so it will remain available even after deleting the `LMEvalJob`.

If both managed and existing PVCs are referenced in `outputs`, the TrustyAI operator will prefer the managed PVC and ignore the existing one.
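To illustrate that precedence rule (a sketch reusing the field values from the examples above):

```yaml
# Both modes set at once: the operator prefers the managed PVC
outputs:
  pvcManaged:          # takes precedence: an operator-managed PVC is created
    size: 5Gi
  pvcName: "my-pvc"    # ignored when pvcManaged is also specified
```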
=== Using an `InferenceService`
[NOTE]

spec:
      value: "False"
    - name: tokenizer
      value: ibm-granite/granite-7b-instruct
  env:
    - name: OPENAI_TOKEN
      valueFrom:
        secretKeyRef: <2>
          name: <secret-name> <3>
          key: token <4>
----
<1> `base_url` should be set to the route/service URL of your model. Make sure to include the `/v1/completions` endpoint in the URL.
<2> `env.valueFrom.secretKeyRef` should point to a secret that contains a token that can authenticate to your model. `secretKeyRef.name` should be the secret's name in the namespace, while `secretKeyRef.key` should point at the token's key within the secret.
<3> `secretKeyRef.name` can equal the output of

[source,shell]
----
oc get secrets -o custom-columns=SECRET:.metadata.name --no-headers | grep user-one-token
----

<4> `secretKeyRef.key` should equal `token`
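For reference, a token secret of the shape the callouts above expect could be created from a manifest like this (a sketch; the secret name and token value are placeholders you must supply):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: <secret-name>              # must match secretKeyRef.name above
type: Opaque
stringData:
  token: <your-model-api-token>    # stored under the key "token"
```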
Then, apply this CR into the same namespace as your model. You should see a pod spin up in your