Commit 7baff9f

Merge pull request #57 from christinaexyou/update-gorch-tutorial
Update GORCH tutorial to RHOAI 2.20
2 parents d9a76f7 + 610b292 commit 7baff9f

1 file changed

+193
-50
lines changed

docs/modules/ROOT/pages/gorch-tutorial.adoc

Lines changed: 193 additions & 50 deletions
@@ -1,10 +1,10 @@
11
= Getting Started with GuardrailsOrchestrator
22

33
xref:component-gorch.adoc[GuardrailsOrchestrator] is a service for large language model guardrailing underpinned by the open-source project link:https://github.com/foundation-model-stack/fms-guardrails-orchestrator[fms-guardrails-orchestrator]. GuardrailsOrchestrator is a component of the xref:trustyai-operator.adoc[TrustyAI Kubernetes Operator]. In this tutorial, you will learn how to create a `GuardrailsOrchestrator` CR to
4-
perform detections on text generation output
4+
perform detections on text generation output.
55

66
[NOTE]
7-
GuardrailsOrchestrator is only available in TrustyAI's 1.30.0 community builds and later via KServe Raw Deployment mode.
7+
GuardrailsOrchestrator is available in RHOAI 2.19+ via KServe Raw Deployment mode.
88

99
To use it on Open Data Hub or OpenShift AI, first enable `KServe Raw Deployment`. In the `DSCInitialization` resource, set the value of `managementState` for the `serviceMesh` component to `Removed`.
1010

@@ -21,22 +21,11 @@ controlPlane:
2121
managementState: Removed
2222
---
2323

24-
Next, in the `DataScienceCluster` resource,under the spec.components section, set the value of of kserve.serving.managementState to `Removed` and add the following `devFlag`:
24+
Next, in the `DataScienceCluster` resource, under the `spec.components` section, set the value of `kserve.serving.managementState` to `Removed`.
2525

26-
[source,yaml]
27-
---
28-
trustyai:
29-
devFlags:
30-
manifests:
31-
- contextDir: config
32-
sourcePath: ''
33-
uri: https://github.com/trustyai-explainability/trustyai-service-operator/tarball/main
34-
managementState: Managed
35-
---
36-
37-
== The GuardrailsOrchestrator
26+
== The GuardrailsOrchestrator Service
3827

39-
The GuardrailsOrchestrator service defines a new Custom Resource Definition called: *`GuardrailsOrchestrator`*. `GuardrailsOrchestrator` objects are monitored by the xref:trustyai-operator.adoc[TrustyAI Kubernetes operator]. A GuardrailsOrchestrator object represents an orchestration service that invokes detectors on text generation input/output and standalone detections. Therefore, to run an orchestration service, you need to create a `GuardrailsOrchestrator` object with ...
28+
The GuardrailsOrchestrator service defines a new Custom Resource Definition named *`GuardrailsOrchestrator`*. `GuardrailsOrchestrator` objects are monitored by the xref:trustyai-operator.adoc[TrustyAI Kubernetes operator]. A GuardrailsOrchestrator object represents an orchestration service that invokes detectors on text generation input/output and standalone detections.
4029

4130
Here is a minimal example of a `GuardrailsOrchestrator` object:
4231

@@ -54,7 +43,7 @@ spec:
5443
<1> The orchestratorConfig field specifies a ConfigMap object that contains generator, detector, and chunker arguments.
5544
<2> The replicas field specifies the number of replicas for the orchestrator.
5645

57-
Here is a minimal example of an ochestratorConfig object:
46+
Here is a minimal example of an orchestratorConfig object:
5847
[source,yaml]
5948
---
6049
kind: ConfigMap
@@ -66,25 +55,26 @@ data:
6655
generation: <1>
6756
service:
6857
hostname: llm-predictor.guardrails-test.svc.cluster.local
69-
port: 8032
58+
port: 8033
7059
detectors: <2>
71-
regex:
72-
type: text_contents
60+
hap <2.1>:
7361
service:
74-
hostname: "127.0.0.1"
75-
port: 8080
62+
hostname: http:/detector-host/api/v1/text/contents
63+
port: 8000
7664
chunker_id: whole_doc_chunker
7765
default_threshold: 0.5
7866
---
7967

80-
<1> The generation field specifies the hostname and port of the large language model predictor service.
81-
<2> The detectors field specifies the hostname and port of the detector service, the chunker ID, and the default threshold.
68+
<1> The generation field specifies the hostname and port of the Large Language Model (LLM) predictor service.
69+
<2> The detectors field specifies the name, hostname, and port of the detector service, the chunker ID, and the default threshold.
70+
<2.1> The name of the detector. In this example, it is specified as a Hate, Abuse, and Profanity (HAP) detector.
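
The shape described by the callouts above can be checked before the ConfigMap is applied. The following is a minimal sketch, not part of the orchestrator or the operator: `validate_orchestrator_config` is a hypothetical helper, the config is assumed to be already loaded into a Python dictionary, and the `detector-host` hostname is a placeholder.

```python
from typing import Any


def validate_orchestrator_config(config: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means the shape looks right.

    Hypothetical helper: it only checks the fields named in the callouts
    (service hostname and port, chunker_id, default_threshold).
    """
    problems: list[str] = []

    def check_service(path: str, section: Any) -> None:
        # Every generation/detector entry is expected to carry a 'service' mapping.
        service = section.get("service") if isinstance(section, dict) else None
        if not isinstance(service, dict):
            problems.append(f"{path}: missing 'service' mapping")
            return
        if not service.get("hostname"):
            problems.append(f"{path}.service: missing 'hostname'")
        if not isinstance(service.get("port"), int):
            problems.append(f"{path}.service: 'port' must be an integer")

    check_service("generation", config.get("generation"))

    detectors = config.get("detectors")
    if not isinstance(detectors, dict):
        problems.append("detectors: expected a mapping of detector name to settings")
        return problems

    for name, detector in detectors.items():
        path = f"detectors.{name}"
        check_service(path, detector)
        if not isinstance(detector, dict):
            continue
        if not detector.get("chunker_id"):
            problems.append(f"{path}: missing 'chunker_id'")
        if not isinstance(detector.get("default_threshold"), (int, float)):
            problems.append(f"{path}: 'default_threshold' must be a number")
    return problems


# Mirrors the example ConfigMap; 'detector-host' is a placeholder hostname.
example_config = {
    "generation": {
        "service": {
            "hostname": "llm-predictor.guardrails-test.svc.cluster.local",
            "port": 8033,
        }
    },
    "detectors": {
        "hap": {
            "service": {"hostname": "detector-host", "port": 8000},
            "chunker_id": "whole_doc_chunker",
            "default_threshold": 0.5,
        }
    },
}

print(validate_orchestrator_config(example_config))
```

Running a check like this locally catches a missing port or threshold before the orchestrator pod has to report it.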
8271

83-
After you apply the example `orchestratorConfig` and `GuardrailsOrchestrator` above, you can check its readiness by using the following command:
72+
After you apply the example `orchestratorConfig` ConfigMap and `GuardrailsOrchestrator` CR above, you can run detections on your LLM inputs and outputs.
8473

74+
Verify the orchestrator pod is up and running:
8575
[source,shell]
8676
---
87-
oc get pods | grep gorch-sample
77+
oc get pods -n <TEST_NAMESPACE> | grep gorch-sample
8878
---
8979

9080
The expected output is:
@@ -93,6 +83,66 @@ The expected output is:
9383
gorch-sample-6776b64c58-xrxq9 3/3 Running 0 4h19m
9484
---
9585

86+
Retrieve the external HTTP route for the orchestrator:
87+
[source,shell]
88+
---
89+
GORCH_ROUTE_HTTP=$(oc get routes gorch-sample-http -o jsonpath='{.spec.host}' -n <TEST_NAMESPACE>)
90+
---
91+
92+
Send a request to the *v2/chat/completions-detection* endpoint, specifying detections against HAP content in input text and generated outputs.
93+
[source,shell]
94+
---
95+
curl -X 'POST' \
96+
"https://$GORCH_ROUTE_HTTP/api/v2/chat/completions-detection" \
97+
-H 'accept: application/json' \
98+
-H 'Content-Type: application/json' \
99+
-d '{
100+
"model": "llm",
101+
"messages": [
102+
{
103+
"content": "You dotard, I really hate this stuff",
104+
"role": "user"
105+
}
106+
],
107+
"detectors": {
108+
"input": {
109+
"hap": {}
110+
},
111+
"output": {
112+
"hap": {}
113+
}
114+
}
115+
}'
116+
---
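
The request body above can also be assembled programmatically. This is a small sketch assuming only the payload shape shown in the curl example; `build_detection_request` is a hypothetical helper name, and each detector is passed with empty parameters (`{}`) as in the example.

```python
import json


def build_detection_request(
    model: str,
    user_text: str,
    input_detectors: list[str],
    output_detectors: list[str],
) -> dict:
    """Build a chat completions-detection payload with one user message."""
    return {
        "model": model,
        "messages": [{"content": user_text, "role": "user"}],
        "detectors": {
            # Each detector name maps to its parameters; none are set here.
            "input": {name: {} for name in input_detectors},
            "output": {name: {} for name in output_detectors},
        },
    }


payload = build_detection_request("llm", "Hello there", ["hap"], ["hap"])
print(json.dumps(payload, indent=2))
```

The detector names must match the keys under `detectors` in the orchestratorConfig ConfigMap (`hap` in this tutorial).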
117+
118+
Example output with HAP content detected:
119+
[source,shell]
120+
---
121+
{"id":"086980692dc1431f9c32cd56ba607067",
122+
"object":"",
123+
"created":1743084024,
124+
"model":"llm",
125+
"choices":[],
126+
"usage":{"prompt_tokens":0,"total_tokens":0,"completion_tokens":0},
127+
"detections":{
128+
"input":[{
129+
"message_index":0,
130+
"results":[{
131+
"start":0,"end":36,"text":"<explicit_text>, I really hate this stuff",
132+
"detection":"sequence_classifier",
133+
"detection_type": "sequence_classification",
134+
"detector_id":"hap",
135+
"score":0.9634239077568054
136+
}]
137+
}]
138+
},
139+
"warnings":[{
140+
"type":"UNSUITABLE_INPUT",
141+
"message":"Unsuitable input detected. Please check the detected entities on your input and try again with the unsuitable input removed."
142+
}]
143+
}
144+
---
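
A client usually wants to know which detectors fired and with what score. The sketch below assumes only the response fields visible in the example output (`detections`, `results`, `detector_id`, `score`, `start`, `end`); `flagged_detections` is a hypothetical helper and the sample response is trimmed from the example above.

```python
import json
from typing import Any


def flagged_detections(response: dict[str, Any], min_score: float = 0.5) -> list[dict[str, Any]]:
    """Collect input and output detections whose score is at least min_score."""
    flagged: list[dict[str, Any]] = []
    detections = response.get("detections") or {}
    for direction in ("input", "output"):
        # A direction may be absent or null when nothing was detected there.
        for entry in detections.get(direction) or []:
            for result in entry.get("results") or []:
                if result.get("score", 0.0) >= min_score:
                    flagged.append(
                        {
                            "direction": direction,
                            "detector_id": result.get("detector_id"),
                            "detection_type": result.get("detection_type"),
                            "score": result.get("score"),
                            "span": (result.get("start"), result.get("end")),
                        }
                    )
    return flagged


# Trimmed copy of the example response shown above.
sample_response = json.loads(
    """
    {
      "choices": [],
      "detections": {
        "input": [{
          "message_index": 0,
          "results": [{
            "start": 0, "end": 36,
            "detection": "sequence_classifier",
            "detection_type": "sequence_classification",
            "detector_id": "hap",
            "score": 0.9634239077568054
          }]
        }]
      },
      "warnings": [{"type": "UNSUITABLE_INPUT", "message": "Unsuitable input detected."}]
    }
    """
)

for hit in flagged_detections(sample_response):
    print(hit["direction"], hit["detector_id"], round(hit["score"], 3), hit["span"])
```

Raising `min_score` filters out lower-confidence detections on the client side; it does not change the `default_threshold` the orchestrator applies.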
145+
96146
== Details of GuardrailsOrchestrator
97147
In this section, let's review all the possible parameters for the `GuardrailsOrchestrator` object and their usage.
98148

@@ -101,7 +151,9 @@ In this section, let's review all the possible parameters for the `GuardrailsOrc
101151
|Parameter |Description
102152
|`replicas`| The number of orchestrator pods to spin up
103153
|`orchestratorConfig`| The name of the ConfigMap object that contains generator, detector, and chunker arguments
104-
|`vLLMGatewayConfig **(optional)**`| The name of the ConfigMap object that contains VLLM gateway arguments
154+
|`enableBuiltInDetectors **(optional)**`| Boolean value to inject the regex detector sidecar container into the orchestrator pod. The regex detector is a lightweight HTTP server designed to parse text using predefined patterns or custom regular expressions.
155+
|`enableGuardrailsGateway **(optional)**`| Boolean value to enable controlled interaction with the orchestrator service by enforcing stricter access to its exposed endpoints. It provides a mechanism of configuring fixed detector pipelines, and then provides a unique /v1/chat/completions endpoint per configured detector pipeline.
156+
|`guardrailsGatewayConfig **(optional)**`| The name of the ConfigMap object that specifies gateway configurations
105157
|`otelExporter **(optional)**`| List of paired name and value arguments for configuring OpenTelemetry traces and/or metrics
106158

107159
* `protocol` - sets the protocol for all the OTLP endpoints. Acceptable values are `grpc` or `http`
@@ -115,25 +167,52 @@ In this section, let's review all the possible parameters for the `GuardrailsOrc
115167

116168
== Optional Configurations for GuardrailsOrchestrator
117169

118-
== Configuring the Regex Detector and vLLM Gateway
119-
The regex detector and vLLM gateway are two sidecar images that can be used with the GuardrailsOrchestrator service. To enable them, the user must (1) specify their images in a ConfigMap (2) specify detectors they wish to use as well as the routes (3) reference the ConfigMap in the `GuardrailsOrchestrator` object:
170+
== Configuring the Regex Detector and Guardrails Gateway
171+
The regex detector and guardrails gateway are two sidecar images that can be used with the GuardrailsOrchestrator service, either individually or together. They are enabled via the GuardrailsOrchestrator CR.
120172

121-
Here is an example of a ConfigMap that references the regex detector and vLLM gateway images:
173+
Here is an example of a GuardrailsOrchestrator CR that enables the regex detector and guardrails gateway sidecars:
174+
[source,yaml]
175+
---
176+
apiVersion: trustyai.opendatahub.io/v1alpha1
177+
kind: GuardrailsOrchestrator
178+
metadata:
179+
name: gorch-sample
180+
spec:
181+
orchestratorConfig: "fms-orchestr8-config-nlp"
182+
enableBuiltInDetectors: True <1>
183+
enableGuardrailsGateway: True <2>
184+
guardrailsGatewayConfig: "fms-orchestr8-config-gateway" <3>
185+
replicas: 1
186+
---
187+
188+
<1> The enableBuiltInDetectors field, if set to **True**, injects regex detectors as a sidecar container into the orchestrator pod
189+
<2> The enableGuardrailsGateway field, if set to **True**, injects the guardrails gateway as a sidecar container into the orchestrator pod
190+
<3> The guardrailsGatewayConfig field specifies a ConfigMap that reroutes the orchestrator and regex detector routes to specific paths
191+
192+
Here is an example orchestratorConfig named `fms-orchestr8-config-nlp`. Please take note that it differs from the previous example:
122193
[source,yaml]
123194
---
124-
apiVersion: v1
125195
kind: ConfigMap
196+
apiVersion: v1
126197
metadata:
127-
name: gorch-sample-config
198+
name: fms-orchestr8-config-nlp
128199
data:
129-
regexDetectorImage: 'quay.io/csantiago/regex-detector@sha256:2dbfa4680938a97d0e0cac75049c43687ad163666cf2c6ddc37643c4f516d144' <1>
130-
vllmGatewayImage: 'quay.io/csantiago/vllm-orchestrator-gateway@sha256:493ac4679d50db9c2c59967dcaa6737a995cd19f319727f33c40f159db6817db <2>
200+
config.yaml: |
201+
chat_generation:
202+
service:
203+
hostname: llm-predictor.guardrails-test.svc.cluster.local
204+
port: 8032
205+
detectors:
206+
regex:
207+
type: text_contents
208+
service:
209+
hostname: "127.0.0.1"
210+
port: 8080
211+
chunker_id: whole_doc_chunker
212+
default_threshold: 0.5
131213
---
132214

133-
<1> The regex detector is a sidecar image that provides regex-based detections
134-
<2> The vLLM gateway is a sidecar image that emulates the vLLM chat completions API and saves preset detector configurations
135-
136-
Here is an example of a vLLM gateway ConfigMap named `fms-orchestr8-config-gateway`:
215+
Here is an example of a guardrailsGatewayConfig named `fms-orchestr8-config-gateway`:
137216
[source,yaml]
138217
---
139218
kind: ConfigMap
@@ -150,6 +229,8 @@ data:
150229
detectors:
151230
- name: regex
152231
detector_params:
232+
input: true
233+
output: true
153234
regex:
154235
- email
155236
- ssn
@@ -162,7 +243,7 @@ data:
162243
detectors:
163244
---
164245

165-
Let's review all the required arguments for the regex detector:
246+
Let's review all the required arguments for the guardrailsGatewayConfig:
166247

167248
[cols="1,2a", options="header"]
168249
|===
@@ -172,18 +253,81 @@ Let's review all the required arguments for the regex detector:
172253
|`routes`| The resulting endpoints for detections
173254
|===
174255

175-
Here is an example of a corresponding `GuardrailsOrchestrator` object that references the vLLM Gateway ConfigMap:
256+
Send a request to the */v1/chat/completions* endpoint, specifying detections against PII content in input text and generated outputs.
257+
[source,shell]
258+
---
259+
curl "https://$GORCH_ROUTE_HTTP/pii/v1/chat/completions" \
260+
-H "Content-Type: application/json" \
261+
-d '{
262+
"model": "Qwen/Qwen2.5-1.5B-Instruct",
263+
"messages": [
264+
{
265+
"role": "user",
266+
"content": "say hello to me at someemail@somedomain.com"
267+
},
268+
{
269+
"role": "user",
270+
"content": "btw here is my social 123456789"
271+
}
272+
]
273+
}'
274+
---
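
Unlike the orchestrator endpoint, the gateway request carries no `detectors` field: the detector pipeline is selected by the route prefix (`pii` in the curl command above). A minimal sketch of that convention, with hypothetical helper names and a placeholder host:

```python
def gateway_chat_url(route_host: str, pipeline: str) -> str:
    """Build the per-pipeline chat completions URL exposed by the gateway."""
    return f"https://{route_host}/{pipeline.strip('/')}/v1/chat/completions"


def build_chat_request(model: str, user_messages: list[str]) -> dict:
    """Build a plain chat completions payload; no detectors are listed here."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": text} for text in user_messages],
    }


# 'gorch.example.com' is a placeholder for the value of $GORCH_ROUTE_HTTP.
url = gateway_chat_url("gorch.example.com", "pii")
body = build_chat_request(
    "Qwen/Qwen2.5-1.5B-Instruct",
    ["say hello to me at someemail@somedomain.com"],
)
print(url)
print(body)
```

Because the pipeline is fixed by the route, existing chat-completions clients only need a different base URL to have detections applied.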
176275

177-
[source,yaml]
276+
Example output with PII content detected:
277+
[source,shell]
178278
---
179-
apiVersion: trustyai.opendatahub.io/v1alpha1
180-
kind: GuardrailsOrchestrator
181-
metadata:
182-
name: gorch-sample
183-
spec:
184-
orchestratorConfig: "fms-orchestr8-config-nlp"
185-
vllmGatewayConfig: "fms-orchestr8-config-gateway"
186-
replicas: 1
279+
{
280+
"choices": [
281+
{
282+
"finish_reason": "stop",
283+
"index": 0,
284+
"logprobs": null,
285+
"message": {
286+
"audio": null,
287+
"content": "I'm sorry, I'm afraid I can't do that.",
288+
"refusal": null,
289+
"role": "assistant",
290+
"tool_calls": null
291+
}
292+
}
293+
],
294+
"created": 1741182848,
295+
"detections": {
296+
"input": null,
297+
"output": [
298+
{
299+
"choice_index": 0,
300+
"results": [
301+
{
302+
"detection": "EmailAddress",
303+
"detection_type": "pii",
304+
"detector_id": "regex-language",
305+
"end": 176,
306+
"score": 1.0,
307+
"start": 152,
308+
"text": "someemail@somedomain.com"
309+
}
310+
]
311+
}
312+
]
313+
},
314+
"id": "16a0abbf4b0c431e885be5cfa4ff1c4b",
315+
"model": "Qwen/Qwen2.5-1.5B-Instruct",
316+
"object": "chat.completion",
317+
"service_tier": null,
318+
"system_fingerprint": null,
319+
"usage": {
320+
"completion_tokens": 83,
321+
"prompt_tokens": 61,
322+
"total_tokens": 144
323+
},
324+
"warnings": [
325+
{
326+
"message": "Unsuitable output detected.",
327+
"type": "UNSUITABLE_OUTPUT"
328+
}
329+
]
330+
}
187331
---
188332

189333
== Configuring the OpenTelemetry Exporter for Metrics & Tracing
@@ -205,7 +349,6 @@ metadata:
205349
name: gorch-test
206350
spec:
207351
orchestratorConfig: "fms-orchestr8-config-nlp"
208-
vllmGatewayConfig: "fms-orchestr8-config-gateway"
209352
replicas: 1
210353
otelExporter:
211354
protocol: "http"
