Docs
LLM Security
Getting Started

Getting Started with LLM Security

LLM Security can be addressed with a combination of:

  • LLM Security libraries for run-time security measures
  • Langfuse for the ex-post evaluation of the effectiveness of these measures

Learn more about this conceptually in our LLM Security Overview.

Example: Anonymizing Personally Identifiable Information (PII)

Exposing PII to LLMs can pose serious security and privacy risks, such as violating contractual obligations or regulatory compliance requirements, or mitigating the risks of data leakage or a data breach.

Personally Identifiable Information (PII) includes:

  • Credit card number
  • Full name
  • Phone number
  • Email address
  • Social Security number
  • IP Address

The example below shows a simple application that summarizes a given court transcript. For privacy reasons, the application wants to anonymize PII before the information is fed into the model, and then un-redact the response to produce a coherent summary.

To read more about other security risks, including prompt injection, banned topics, or malicious URLs, please check out the docs of the various libraries or read our security cookbook which includes more examples.

Install packages

In this example we use the open source library LLM Guard (opens in a new tab) for run-time security checks. All examples easily translate to other libraries such as Prompt Armor (opens in a new tab), NeMo Guardrails (opens in a new tab), Microsoft Azure AI Content Safety (opens in a new tab), and Lakera (opens in a new tab).

First, import the security packages and Langfuse tools.

pip install llm-guard langfuse openai
from llm_guard.input_scanners import Anonymize
from llm_guard.input_scanners.anonymize_helpers import BERT_LARGE_NER_CONF
from langfuse.openai import openai # OpenAI integration
from langfuse.decorators import observe, langfuse_context
from llm_guard.output_scanners import Deanonymize
from llm_guard.vault import Vault

Anonymize and deanonymize PII and trace with Langfuse

We break up each step of the process into its own function so we can track each step separately in Langfuse.

By decorating the functions with @observe(), we can trace each step of the process and monitor the risk scores returned by the security tools. This allows us to see how well the security tools are working and whether they are catching the PII as expected.

vault = Vault()
 
@observe()
def anonymize(input: str):
  scanner = Anonymize(vault, preamble="Insert before prompt", allowed_names=["John Doe"], hidden_names=["Test LLC"],
                    recognizer_conf=BERT_LARGE_NER_CONF, language="en")
  sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)
  return sanitized_prompt
 
@observe()
def deanonymize(sanitized_prompt: str, answer: str):
  scanner = Deanonymize(vault)
  sanitized_model_output, is_valid, risk_score = scanner.scan(sanitized_prompt, answer)
 
  return sanitized_model_output

Instrument LLM call

In this example, we use the native OpenAI SDK integration, to instrument the LLM call. Thereby, we can automatically collect token counts, model parameters, and the exact prompt that was sent to the model.

Note: Langfuse natively integrates with a number of frameworks (e.g. LlamaIndex, LangChain, Haystack, ...) and you can easily instrument any LLM via the SDKs.

@observe()
def summarize_transcript(prompt: str):
  sanitized_prompt = anonymize(prompt)
 
  answer = openai.chat.completions.create(
        model="gpt-3.5-turbo",
        max_tokens=100,
        messages=[
          {"role": "system", "content": "Summarize the given court transcript."},
          {"role": "user", "content": sanitized_prompt}
        ],
    ).choices[0].message.content
 
  sanitized_model_output = deanonymize(sanitized_prompt, answer)
 
  return sanitized_model_output

Execute the application

Run the function. In this example, we input a section of a court transcript. Applications that handle sensitive information will often need to use anonymize and deanonymize functionality to comply with data privacy policies such as HIPAA or GDPR.

prompt = """
Plaintiff, Jane Doe, by and through her attorneys, files this complaint
against Defendant, Big Corporation, and alleges upon information and belief,
except for those allegations pertaining to personal knowledge, that on or about
July 15, 2023, at the Defendant's manufacturing facility located at 123 Industrial Way, Springfield, Illinois, Defendant negligently failed to maintain safe working conditions,
leading to Plaintiff suffering severe and permanent injuries. As a direct and proximate
result of Defendant's negligence, Plaintiff has endured significant physical pain, emotional distress, and financial hardship due to medical expenses and loss of income. Plaintiff seeks compensatory damages, punitive damages, and any other relief the Court deems just and proper.
"""
summarize_transcript(prompt)

Inspect trace in Langfuse

In this trace (public link (opens in a new tab)), we can see how the name of the plaintiff is anonymized before being sent to the model, and then un-redacted in the response. We can now evaluate run evaluations in Langfuse to control for the effectiveness of these measures.

More Examples

Find more examples of LLM security monitoring in our cookbook.

Was this page useful?

Questions? We're here to help

Subscribe to updates