Guard#

Motivation#

DataRobot MLOps can register GenAI apps running in external environments as external model deployments. This allows you to both monitor prediction activity and, optionally, enforce guardrails by attaching a “guard” to the app’s entrypoints.

guard and aguard are Python decorators that make it quick to add these capabilities to an existing app for development and testing purposes.

Warning

Using these decorators on mission-critical production apps is not advised. Some limitations include:

  • Risk of being unable to report all predictions to DataRobot MLOps during spikes of heavy usage, due to REST rate limiting

  • Unintended interactions between threaded application code and the threads guard uses to concurrently report data to MLOps

For production use, we recommend installing and integrating the official MLOps tracking agent into your production code and environment.

Potential use cases#

Using DataRobot MLOps, you can track your external LLM app’s service health, including request counts, response times, and other custom metrics of interest.

Using guard or aguard also allows you to alter your app’s responses based on specific rules, such as allowing only prompts within preferred topics, or intercepting the completion process if the content has undesired attributes such as toxicity, a high probability of prompt injection/jailbreak, or irrelevance to the prompt.

To set up such a system with monitoring and guardrails in place, the app’s inputs (prompts) are evaluated by a secondary (e.g., sidecar) model. If the secondary model flags an input, a preset response informs the user and the event is recorded. This data, including prompt and completion details, is then logged to DataRobot MLOps for monitoring and future analysis.

Guardrail examples#

Usage#

To add a guard to your app, you need DataRobot deployments for both the LLM app and the optional guard model. The guard model deployment contains the decision-making logic that determines whether the provided input meets the acceptable use criteria. Based on this logic, it labels the input as flagged or unflagged and returns this information to its invoker, i.e., the DRX guard decorator.

Both guard and aguard can be used without a guardrail to easily attach stand-alone monitoring to an external app.
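
For example, the following is a minimal monitoring-only sketch: the deployment and model IDs come from the external deployment registered in the next section, and my_llm_call is a hypothetical placeholder for your own completion logic.

from datarobotx.llm.chains.guard import aguard, MonitoringConfig

# IDs of the external deployment and model registered in DataRobot MLOps
monitor_config = MonitoringConfig(
    deployment_id="<external deployment id>",
    model_id="<model id>",
    inputs_parser=lambda x: {"prompt": x},
)

@aguard(monitor_config)  # no guardrail configs passed: monitoring only
async def monitored_app(prompt):
    # my_llm_call is a hypothetical placeholder for your own LLM completion logic
    return await my_llm_call(prompt)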

Registering an LLM application as an External Deployment in DataRobot#

from datarobot.mlops.connected.client import MLOpsClient
from datarobotx.common.config import context

# Connect to DataRobot MLOps using credentials from the drx context
service_url = context.endpoint.split("/api")[0]
mlops_client = MLOpsClient(service_url=service_url, api_key=context.token, verify=True)

# Create a prediction environment for the external LLM app
pred_env = {
    "name": "External Prediction Environment",
    "description": "Prediction environment for external LLM apps using DR Guardrails - created via drx",
    "platform": "other",
    "supportedModelFormats": ["externalModel"],
}
pred_env_id = mlops_client.create_prediction_environment(pred_env)

# Register a model package representing the external LLM app
model_pkg_name = "External LLM app"
model_pkg = {
    "name": model_pkg_name,
    "modelDescription": {
        "modelName": model_pkg_name,
        "description": f"{model_pkg_name} - created via drx",
    },
    # Placeholder target; the external LLM app has no conventional prediction target
    "target": {"type": "Regression", "name": "dummy_reg_val"},
}
model_pkg_id = mlops_client.create_model_package(model_pkg)

ext_depl_id = mlops_client.deploy_model_package(
    model_pkg_id,
    f"{model_pkg_name} Deployment",
    prediction_environment_id=pred_env_id,
)
# Enable drift tracking and look up the model id used when reporting predictions
mlops_client.update_deployment_settings(ext_depl_id, target_drift=True, feature_drift=True)
model_id = mlops_client.get_deployment(ext_depl_id)["model"]["id"]

These steps can also be easily performed from the DataRobot GUI.

Deploying a guardrail model as a Custom Unstructured Model#

Here’s how you can create a toxicity guardrail as an Unstructured Custom Model Deployment in DataRobot MLOps, using drx.deploy():

from datarobotx import deploy
from transformers import pipeline

# Download the toxic-bert classifier and save it locally so it can be bundled with the deployment
pipe = pipeline(
    "text-classification",
    model="unitary/toxic-bert",
)
MODEL_NAME = "toxic-bert"
path = f"./{MODEL_NAME}"
pipe.save_pretrained(path)

# Custom model hook: reload the saved pipeline when the deployment starts
def load_model(input_dir):
    from transformers import (  # pylint: disable=reimported
        AutoModelForSequenceClassification,
        AutoTokenizer,
        pipeline,
    )

    model_path = input_dir + "/" + MODEL_NAME
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    return pipeline(
        "text-classification",
        model=model,
        tokenizer=tokenizer,
    )

# Custom model hook: score a prompt and flag it when the toxicity score exceeds 0.5
def score_unstructured(model, data, query, **kwargs):
    import json

    data_dict = json.loads(data)
    outputs = model(data_dict["prompt"])
    rv = {"flagged": outputs[0]["score"] > 0.5}
    return json.dumps(rv)

guard = deploy(
    path,
    hooks={"load_model": load_model, "score_unstructured": score_unstructured},
    environment_id="64c964448dd3f0c07f47d040",  # DR GenAI drop-in environment
)
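
Before configuring the decorator, you can optionally sanity-check the hooks locally; the sketch below assumes the pipeline was saved to ./toxic-bert as above.

import json

# Load the pipeline from the current directory and score a sample prompt locally
local_model = load_model(".")
print(score_unstructured(local_model, json.dumps({"prompt": "You are a wonderful colleague."}), None))
# Expected output resembles: {"flagged": false}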

Configuring the DRX guard decorator#

from datarobotx.llm.chains.guard import aguard, MonitoringConfig, GuardrailConfig

# Monitoring: report prompts and completions to the external deployment registered above
monitor_config = MonitoringConfig(
    deployment_id=ext_depl_id,
    model_id=model_id,
    inputs_parser=lambda x: {"prompt": x}
)

# Guardrail: evaluate prompts with the toxicity custom model deployment
guardrail_config = GuardrailConfig(
    deployment_id=guard.dr_deployment.id,
    datarobot_key=guard.dr_deployment.default_prediction_server["datarobot-key"],
    prediction_server_endpoint=f"{guard.dr_deployment.default_prediction_server['url']}/predApi/v1.0",
    timeout_secs=10,
    input_parser=lambda x: {"prompt": x},
    guardrail_prompt="{prompt}",
)
    
@aguard(monitor_config, guardrail_config)
async def llm_proxy(prompt):
    # Here's where you call out to an LLM and get a completion for the input prompt;
    # call_llm is a placeholder for your own completion logic
    completion = call_llm(prompt)
    return completion
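
Note that call_llm above is a stand-in for your own completion logic. As one illustration (assuming the openai package is installed and OPENAI_API_KEY is set in the environment; the model name is arbitrary), it could look like:

from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

def call_llm(prompt: str) -> str:
    # Request a chat completion and return the text of the first choice
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content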

Now that the setup is done, let’s test our guarded LLM application.

First, let’s call the llm_proxy function as usual, with a prompt that is not toxic. The response of this function call should be the same as the unguarded LLM app’s completion to the same prompt, since our guard will not intercept the app’s usual flow.

await llm_proxy("Can I have challenger models in DataRobot MLOps?")

Now let’s call llm_proxy with a toxic prompt that we would like to intercept. This time, we expect the response to be the default blocked_msg value of our guardrail_config object, i.e., “This content has been blocked because it did not meet acceptable use guidelines.”

await llm_proxy("What should I do about dirty and smelly people?")
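
The blocked message can be customized when constructing the guardrail configuration. A minimal sketch, assuming blocked_msg (referenced above as an attribute of GuardrailConfig) is accepted as a keyword argument:

guardrail_config = GuardrailConfig(
    deployment_id=guard.dr_deployment.id,
    datarobot_key=guard.dr_deployment.default_prediction_server["datarobot-key"],
    prediction_server_endpoint=f"{guard.dr_deployment.default_prediction_server['url']}/predApi/v1.0",
    timeout_secs=10,
    input_parser=lambda x: {"prompt": x},
    guardrail_prompt="{prompt}",
    blocked_msg="Sorry, that request did not meet our acceptable use guidelines.",  # assumed keyword; overrides the default message
)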

Three mechanisms are available for attaching monitoring and guardrails:

  • aguard is a decorator for use with async Python entrypoints; an event loop must already be running when the decorated function is called, and this event loop is used to schedule interactions with the guardrail deployment(s) and DataRobot MLOps via REST API

  • guard is a decorator for use with synchronous (non-async) Python entrypoints; threading is used to concurrently interact with the guardrail deployment(s) and DataRobot MLOps via REST API (see the sketch after this list)

  • GuardChain is a chain for wrapping another langchain chain in monitoring and guardrails; it expects the wrapped chain to return exactly one output
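
The same configuration objects can be reused with the synchronous decorator. A minimal sketch, assuming guard is importable from the same module as aguard (it is imported under an alias here to avoid clashing with the guard deployment variable above); call_llm is again a placeholder for your own completion logic.

from datarobotx.llm.chains.guard import guard as drx_guard

@drx_guard(monitor_config, guardrail_config)
def llm_proxy_sync(prompt):
    # Same flow as llm_proxy, but callable from synchronous (non-async) code
    return call_llm(prompt)

llm_proxy_sync("Can I have challenger models in DataRobot MLOps?")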

API Reference#

aguard(monitor, *guardrails)

Decorator for monitoring and optionally guardrailing an async entrypoint with DR MLOps.

guard(monitor, *guardrails)

Decorator for monitoring and optionally guardrailing a synchronous entrypoint with DR MLOps.

GuardChain(**kwargs)

Apply monitoring and guardrails to a chain with DataRobot MLOps.

MonitoringConfig(deployment_id, model_id[, ...])

Monitoring deployment configuration.

GuardrailConfig(deployment_id, ...[, ...])

Guardrail configuration.