xref:component-gorch.adoc[GuardrailsOrchestrator] is a service for large language model guardrailing underpinned by the open-source project link:https://github.com/foundation-model-stack/fms-guardrails-orchestrator[fms-guardrails-orchestrator]. GuardrailsOrchestrator is a component of the xref:trustyai-operator.adoc[TrustyAI Kubernetes Operator]. In this tutorial, you will learn how to create a `GuardrailsOrchestrator` CR to
-perform detections on text generation output
+perform detections on text generation output.

[NOTE]
-GuardrailsOrchestrator is only available in TrustyAI's 1.30.0 community builds and later via KServe Raw Deployment mode.
+GuardrailsOrchestrator is available in RHOAI 2.19+ via KServe Raw Deployment mode.

To use it on Open Data Hub or OpenShift AI, first enable `KServe Raw Deployment`. In the `DataScienceInitialization` resource, set the value of `managementState` for the `serviceMesh` component to `Removed`.

@@ -21,22 +21,11 @@ controlPlane:
managementState: Removed
---

-Next, in the `DataScienceCluster` resource,under the spec.components section, set the value of of kserve.serving.managementState to `Removed` and add the following `devFlag`:
+Next, in the `DataScienceCluster` resource, under the `spec.components` section, set the value of `kserve.serving.managementState` to `Removed`.
-The GuardrailsOrchestrator service defines a new Custom Resource Definition called: *`GuardrailsOrchestrator`*. `GuardrailsOrchestrator` objects are monitored by the xref:trustyai-operator.adoc[TrustyAI Kubernetes operator]. A GuardrailsOrchestrator object represents an orchestration service that invokes detectors on text generation input/output and standalone detections. Therefore, to run an orchestration service, you need to create a `GuardrailsOrchestrator` object with ...
+The GuardrailsOrchestrator service defines a new Custom Resource Definition named *`GuardrailsOrchestrator`*. `GuardrailsOrchestrator` objects are monitored by the xref:trustyai-operator.adoc[TrustyAI Kubernetes operator]. A GuardrailsOrchestrator object represents an orchestration service that invokes detectors on text generation input/output and standalone detections.

Here is a minimal example of a `GuardrailsOrchestrator` object:

@@ -54,7 +43,7 @@ spec:
<1> The orchestratorConfig field specifies a ConfigMap object that contains generator, detector, and chunker arguments.
<2> The replicas field specifies the number of replicas for the orchestrator.

-Here is a minimal example of an ochestratorConfig object:
+Here is a minimal example of an orchestratorConfig object:
-<1> The generation field specifies the hostname and port of the large language model predictor service.
-<2> The detectors field specifies the hostname and port of the detector service, the chunker ID, and the default threshold.
+<1> The generation field specifies the hostname and port of the Large Language Model (LLM) predictor service.
+<2> The detectors field specifies the name, hostname, and port of the detector service, the chunker ID, and the default threshold.
+<2.1> The name of the detector. In this example, we are specifying it as a Hate, Abuse, and Profanity (HAP) detector.

-After you apply the example `orchestratorConfig` and `GuardrailsOrchestrator` above, you can check its readiness by using the following command:
+After you apply the example `orchestratorConfig` ConfigMap and `GuardrailsOrchestrator` CR above, you can apply guardrails to your LLM inputs and outputs:

+Verify the orchestrator pod is up and running:
[source,shell]
---
-oc get pods | grep gorch-sample
+oc get pods -n <TEST_NAMESPACE> | grep gorch-sample
---

The expected output is:
@@ -93,6 +83,66 @@ The expected output is:
gorch-sample-6776b64c58-xrxq9 3/3 Running 0 4h19m
---
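
The readiness check above can also be scripted. The following is a minimal sketch, assuming the default `oc get pods` column order (NAME, READY, STATUS, ...); the helper name `pod_ready` is hypothetical and is not part of `oc` or the TrustyAI operator.

```shell
# pod_ready LINE: succeeds when one line of `oc get pods` output reports
# every container ready (for example 3/3) and a STATUS of Running.
# Hypothetical helper, shown for illustration only.
pod_ready() {
    # Split the line into its whitespace-separated columns: NAME READY STATUS ...
    set -- $1
    ready=${2%%/*}   # containers ready: the part before "/"
    total=${2##*/}   # containers expected: the part after "/"
    [ -n "$ready" ] && [ "$ready" = "$total" ] && [ "$3" = "Running" ]
}

# Example using the sample line shown above. On a cluster, the line would come from:
#   oc get pods -n <TEST_NAMESPACE> | grep gorch-sample
if pod_ready "gorch-sample-6776b64c58-xrxq9 3/3 Running 0 4h19m"; then
    echo "orchestrator pod is ready"
fi
```

Comparing the two halves of the READY column avoids hard-coding the container count, which may differ when optional sidecars are enabled.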

+Retrieve the external HTTP route for the orchestrator:
+[source,shell]
+---
+GORCH_ROUTE_HTTP=$(oc get routes gorch-sample-http -o jsonpath='{.spec.host}' -n <TEST_NAMESPACE>)
+---
+
+Send a request to the *v2/chat/completions-detection* endpoint, specifying detections against HAP content in input text and generated outputs.
+"start":0,"end":36,"text":"<explicit_text>, I really hate this stuff",
+"detection":"sequence_classifier",
+"detection_type": "sequence_classification",
+"detector_id":"hap",
+"score":0.9634239077568054
+}]
+}]
+},
+"warnings":[{
+"type":"UNSUITABLE_INPUT",
+"message":"Unsuitable input detected. Please check the detected entities on your input and try again with the unsuitable input removed."
+}]
+}
+---
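
A script that calls the orchestrator may want to branch on the `warnings` entry shown in the response above. The following is a rough sketch, assuming the response body is already held in a shell variable; the helper name `has_warning` and the abbreviated sample response are hypothetical. Matching JSON with `grep` is brittle, so a real JSON parser would be a safer choice where one is available.

```shell
# has_warning RESPONSE TYPE: succeeds when the JSON text in RESPONSE contains
# a "type" field whose value is TYPE. Hypothetical helper, shown for illustration only.
has_warning() {
    printf '%s' "$1" | grep -Eq "\"type\"[[:space:]]*:[[:space:]]*\"$2\""
}

# Abbreviated sample response, modelled on the warnings block shown above.
response='{"warnings":[{"type":"UNSUITABLE_INPUT","message":"Unsuitable input detected."}]}'
if has_warning "$response" "UNSUITABLE_INPUT"; then
    echo "input was flagged by a detector"
fi
```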
+
== Details of GuardrailsOrchestrator
In this section, let's review all the possible parameters for the `GuardrailsOrchestrator` object and their usage.

@@ -101,7 +151,9 @@ In this section, let's review all the possible parameters for the `GuardrailsOrc
|Parameter |Description
|`replicas`| The number of orchestrator pods to spin up
|`orchestratorConfig`| The name of the ConfigMap object that contains generator, detector, and chunker arguments
-|`vLLMGatewayConfig **(optional)**`| The name of the ConfigMap object that contains VLLM gateway arguments
+|`enableRegexDetectors **(optional)**`| Boolean value to inject the regex detector sidecar container into the orchestrator pod. The regex detector is a lightweight HTTP server designed to parse text using predefined patterns or custom regular expressions.
+|`enableGuardrailsGateway **(optional)**`| Boolean value to enable controlled interaction with the orchestrator service by enforcing stricter access to its exposed endpoints. It provides a mechanism for configuring fixed detector pipelines and exposes a unique `/v1/chat/completions` endpoint for each configured pipeline.
+|`guardrailsGatewayConfig **(optional)**`| The name of the ConfigMap object that specifies gateway configurations
|`otelExporter **(optional)**`| List of paired name and value arguments for configuring OpenTelemetry traces and/or metrics

* `protocol` - sets the protocol for all the OTLP endpoints. Acceptable values are `grpc` or `http`
@@ -115,25 +167,52 @@ In this section, let's review all the possible parameters for the `GuardrailsOrc

== Optional Configurations for GuardrailsOrchestrator

-== Configuring the Regex Detector and vLLM Gateway
-The regex detector and vLLM gateway are two sidecar images that can be used with the GuardrailsOrchestrator service. To enable them, the user must (1) specify their images in a ConfigMap (2) specify detectors they wish to use as well as the routes (3) reference the ConfigMap in the `GuardrailsOrchestrator` object:
+== Configuring the Regex Detector and Guardrails Gateway
+The regex detector and guardrails gateway are two sidecar images that can be used with the GuardrailsOrchestrator service, either individually or together. They are enabled via the GuardrailsOrchestrator CR.

-Here is an example of a ConfigMap that references the regex detector and vLLM gateway images:
+Here is an example of a GuardrailsOrchestrator CR that references the regex detector and guardrails gateway images: