Add Governance to OpenAI Agents in 5 Minutes¶
The OpenAI Agents SDK lets you build agents that call function tools autonomously. That is powerful -- and dangerous. A single hallucinated function call can delete production data, send an email to the wrong person, or charge a credit card twice.
Aegis adds a policy layer between your agent's decisions and the actual tool calls. You define rules in YAML; Aegis enforces them at runtime. Every action is evaluated, gated, and logged -- with zero changes to your existing function tools.
What you will build: An OpenAI agent where search tools run freely, write operations require approval, delete operations are hard-blocked, and everything is recorded in an audit trail.
Time: 5 minutes.
Prerequisites¶
Aegis works with any OpenAI Agents SDK setup. The `@governed_tool` decorator wraps your existing function tools transparently.
Step 1: Define Your Policy¶
Create policy.yaml in your project root:
```yaml
version: "1"

defaults:
  risk_level: medium
  approval: approve  # Anything without an explicit rule needs human approval

rules:
  # --- Safe: auto-approve ---
  - name: search_tools
    match:
      type: "search*"
      target: "web"
    risk_level: low
    approval: auto

  - name: retrieval_tools
    match:
      type: "retriev*"
      target: "*"
    risk_level: low
    approval: auto

  # --- Hard block: no writes after hours (6 PM - 8 AM UTC) ---
  # These must come before write_ops: rules are first-match-wins,
  # so a broader write_* rule listed earlier would shadow them.
  - name: after_hours_write_block
    match:
      type: "write_*"
      target: "*"
    conditions:
      time_after: "18:00"
    risk_level: critical
    approval: block

  - name: early_morning_write_block
    match:
      type: "write_*"
      target: "*"
    conditions:
      time_before: "08:00"
    risk_level: critical
    approval: block

  # --- Needs approval: write operations ---
  - name: write_ops
    match:
      type: "write_*"
      target: "*"
    risk_level: high
    approval: approve

  # --- Hard block: deletes never run ---
  - name: block_deletes
    match:
      type: "delete_*"
      target: "*"
    risk_level: critical
    approval: block
```
How rules work:

- Rules are evaluated top to bottom. The first match wins.
- `match.type` maps to the `action_type` you pass to `@governed_tool`. Glob patterns (`*`, `?`) are supported.
- `match.target` is the system being acted on (e.g., `"web"`, `"database"`, `"crm"`).
- `conditions` add time-based and parameter-based guards.
- `approval: auto` means execute immediately; `approve` means ask a human; `block` means reject unconditionally.
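To build intuition for the first-match-wins glob semantics described above, here is a minimal sketch using the standard library's `fnmatch`. The rule shapes mirror `policy.yaml`, but this is an illustration of the matching behavior, not Aegis's actual engine:

```python
# Illustrative sketch of first-match-wins rule matching with glob patterns.
# Not Aegis internals -- just the semantics the policy file documents.
from fnmatch import fnmatch

RULES = [
    {"name": "search_tools", "type": "search*", "target": "web", "approval": "auto"},
    {"name": "block_deletes", "type": "delete_*", "target": "*", "approval": "block"},
]
DEFAULT = {"name": "<default>", "approval": "approve"}

def first_match(action_type: str, action_target: str) -> dict:
    """Return the first rule whose type AND target globs both match, else the default."""
    for rule in RULES:
        if fnmatch(action_type, rule["type"]) and fnmatch(action_target, rule["target"]):
            return rule
    return DEFAULT

print(first_match("search_news", "web")["name"])        # -> search_tools
print(first_match("delete_record", "database")["name"]) # -> block_deletes
print(first_match("write_record", "database")["name"])  # -> <default> (falls to defaults)
```

Because evaluation stops at the first match, narrower rules (e.g., conditional blocks) must appear before broader rules that would otherwise shadow them.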
Step 2: Wrap Function Tools with @governed_tool¶
The @governed_tool decorator wraps any function tool with Aegis governance.
The function's parameters are automatically captured and passed to the policy
engine for evaluation.
```python
from agents import Agent, Runner

from aegis import Action, Policy, Runtime
from aegis.adapters.base import BaseExecutor
from aegis.adapters.openai_agents import governed_tool
from aegis.runtime.approval import AutoApprovalHandler


# A simple executor (replace with your real logic)
class MyExecutor(BaseExecutor):
    async def execute(self, action):
        from aegis import Result, ResultStatus
        return Result(action=action, status=ResultStatus.SUCCESS, data={"ok": True})


# 1. Build the governed runtime
runtime = Runtime(
    executor=MyExecutor(),
    policy=Policy.from_yaml("policy.yaml"),
    approval_handler=AutoApprovalHandler(),
)


# 2. Decorate your function tools
@governed_tool(runtime=runtime, action_type="search", action_target="web")
async def web_search(query: str) -> str:
    """Search the web for information."""
    return f"Results for: {query}"


@governed_tool(runtime=runtime, action_type="write_record", action_target="database")
async def write_record(table: str, data: str) -> str:
    """Write a record to the database."""
    return f"Written to {table}: {data}"


@governed_tool(runtime=runtime, action_type="delete_record", action_target="database")
async def delete_record(record_id: str) -> str:
    """Delete a record from the database."""
    return f"Deleted record {record_id}"
```
What each decorator parameter does:
| Parameter | Description |
|---|---|
| `runtime` | An Aegis `Runtime` instance with your policy |
| `action_type` | The Aegis action type for policy matching |
| `action_target` | The Aegis action target (default: `"default"`) |
| `description` | Override the function's docstring for the action description |
When the agent calls `delete_record`, the function body never executes. Aegis evaluates the policy, finds `block_deletes`, and returns `[AEGIS BLOCKED] Action blocked by policy rule: block_deletes`. The agent sees this as a tool response and can explain to the user why the action was not performed.
Step 3: Full Example -- Governed OpenAI Agent¶
Here is a complete, runnable example. Copy it, set your API key, and run it.
"""governed_agent.py -- OpenAI Agents SDK agent with Aegis governance."""
import asyncio
from agents import Agent, Runner
from aegis import Action, Policy, Result, ResultStatus, Runtime
from aegis.adapters.base import BaseExecutor
from aegis.adapters.openai_agents import governed_tool
from aegis.runtime.approval import AutoApprovalHandler
# -- Executor (simulated for demo; replace with real API calls) --
class SimulatedExecutor(BaseExecutor):
async def execute(self, action: Action) -> Result:
responses = {
"search": {"results": ["AI safety paper", "Governance framework"]},
"write_report": {"written": True},
"delete_document": {"deleted": True},
}
data = responses.get(action.type, {"result": "ok"})
return Result(action=action, status=ResultStatus.SUCCESS, data=data)
# -- Define the policy inline (or load from policy.yaml) --
POLICY = Policy.from_dict({
"version": "1",
"defaults": {
"risk_level": "medium",
"approval": "approve",
},
"rules": [
{
"name": "search_auto",
"match": {"type": "search*", "target": "web"},
"risk_level": "low",
"approval": "auto",
},
{
"name": "write_needs_approval",
"match": {"type": "write_*", "target": "*"},
"risk_level": "high",
"approval": "approve",
},
{
"name": "block_deletes",
"match": {"type": "delete_*", "target": "*"},
"risk_level": "critical",
"approval": "block",
},
],
})
# -- Build the governed runtime --
runtime = Runtime(
executor=SimulatedExecutor(),
policy=POLICY,
approval_handler=AutoApprovalHandler(), # Auto-approve for demo; use CLIApprovalHandler() for interactive
)
# -- Create governed tools --
@governed_tool(runtime=runtime, action_type="search", action_target="web")
async def web_search(query: str) -> str:
"""Search the web for information."""
return f"Search results for '{query}': AI governance frameworks, NIST AI RMF, EU AI Act..."
@governed_tool(runtime=runtime, action_type="write_report", action_target="storage")
async def write_report(title: str, content: str) -> str:
"""Write a report to storage."""
return f"Report '{title}' saved successfully."
@governed_tool(runtime=runtime, action_type="delete_document", action_target="storage")
async def delete_document(document_id: str) -> str:
"""Delete a document from storage."""
return f"Deleted document {document_id}"
# -- Build the agent --
agent = Agent(
name="Research Assistant",
instructions=(
"You are a research assistant. Use web_search to find information, "
"write_report to save findings, and delete_document to clean up. "
"If a tool call is blocked, explain why to the user."
),
tools=[web_search, write_report, delete_document],
)
# -- Run it --
async def main():
# This will auto-approve (search is low risk)
result = await Runner.run(agent, "Search for 'AI governance best practices'.")
print("\n--- Agent Output ---")
print(result.final_output)
print("\n\n--- Now try a blocked action ---\n")
# This will be blocked by policy
result = await Runner.run(agent, "Delete document ID 42 from storage.")
print("\n--- Agent Output ---")
print(result.final_output)
# -- Check the audit trail --
print("\n--- Audit Trail ---")
for entry in runtime.audit.get_log():
print(
f" {entry['action_type']:>15} | risk={entry['risk_level']:<8} | "
f"rule={entry.get('matched_rule', '-'):<20} | "
f"result={entry.get('result_status') or '-'}"
)
asyncio.run(main())
What happens when you run this:

1. The agent calls `web_search`. Aegis evaluates the policy, matches `search_auto` (low risk, auto-approve), and executes the function.
2. The agent calls `delete_document`. Aegis matches `block_deletes` (critical risk, block). The function returns `[AEGIS BLOCKED] Action blocked by policy rule: block_deletes`. The agent sees this and reports to the user that the action was not allowed.
3. Every decision is recorded in `aegis_audit.db`.
Step 4: Check the Audit Trail¶
Aegis logs every action to a local SQLite database (`aegis_audit.db` by default).
From the CLI¶
```bash
# View all audit entries
aegis audit

# Filter by risk level
aegis audit --risk high

# Filter by result status
aegis audit --status blocked

# Export to JSON Lines for external analysis
aegis audit --export audit_export.jsonl
```
From Python¶
```python
from aegis.runtime.audit import AuditLogger

logger = AuditLogger(db_path="aegis_audit.db")

# Get all blocked actions
blocked = logger.get_log(result_status="blocked")
for entry in blocked:
    print(f"[{entry['timestamp']}] {entry['action_type']} -> {entry['action_target']}: "
          f"{entry['result_error']}")

# Count high-risk actions in this session
count = logger.count(risk_level="HIGH")
print(f"High-risk actions: {count}")

# Export everything to JSON Lines
logger.export_jsonl("audit_export.jsonl")
```
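Because the export is JSON Lines (one JSON object per line), it can be analyzed with nothing but the standard library. A minimal sketch, assuming the exported entries carry the field names documented in this guide (`action_type`, `result_status`, ...); the synthetic file here stands in for a real export:

```python
# Minimal sketch: tally an exported JSON Lines audit file with the stdlib only.
import json
from collections import Counter

def summarize(path: str) -> Counter:
    """Count audit entries per result_status (one JSON object per line)."""
    counts: Counter = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                entry = json.loads(line)
                counts[entry.get("result_status", "unknown")] += 1
    return counts

# Synthetic export file standing in for a real `aegis audit --export` output:
with open("audit_export.jsonl", "w", encoding="utf-8") as f:
    f.write('{"action_type": "search", "result_status": "success"}\n')
    f.write('{"action_type": "delete_document", "result_status": "blocked"}\n')

print(dict(summarize("audit_export.jsonl")))  # -> {'success': 1, 'blocked': 1}
```

The same pattern feeds the export into pandas, a SIEM, or any log pipeline that accepts JSON Lines.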
What the audit log captures¶
Each entry records the full lifecycle of one tool call:
| Field | Example |
|---|---|
| `session_id` | `a3f1b2c4d5e6` |
| `timestamp` | `2024-11-15T14:32:01+00:00` |
| `action_type` | `delete_document` |
| `action_target` | `storage` |
| `action_params` | `{"document_id": "42"}` |
| `risk_level` | `CRITICAL` |
| `approval` | `block` |
| `matched_rule` | `block_deletes` |
| `result_status` | `blocked` |
| `result_error` | `Action blocked by policy rule: block_deletes` |
| `human_decision` | `null` (no human involved) |
Advanced Patterns¶
Policy Merge: Base + Production Overrides¶
Maintain a base policy for development and layer production-specific rules on top:
```yaml
# base-policy.yaml
version: "1"
defaults:
  risk_level: medium
  approval: approve
rules:
  - name: search_auto
    match: { type: "search*" }
    risk_level: low
    approval: auto
```

```yaml
# prod-overrides.yaml
version: "1"
rules:
  - name: prod_block_deletes
    match: { type: "delete_*" }
    risk_level: critical
    approval: block
  - name: prod_require_approval_writes
    match: { type: "write_*" }
    risk_level: high
    approval: approve
```

```python
# Rules are concatenated in argument order and evaluated first-match-wins,
# so list the higher-priority overrides file first.
policy = Policy.from_yaml_files("prod-overrides.yaml", "base-policy.yaml")
```
Or merge programmatically:
```python
base = Policy.from_yaml("base-policy.yaml")
prod = Policy.from_yaml("prod-overrides.yaml")

combined = prod.merge(base)  # prod rules checked first, then base rules
```
Hot-Reload: Update Policy Without Restarting¶
`PolicyWatcher` monitors your YAML file and swaps the policy atomically whenever the file changes. In-flight executions are not affected.
```python
from aegis import Policy, PolicyWatcher, Runtime

runtime = Runtime(
    executor=SimulatedExecutor(),
    policy=Policy.from_yaml("policy.yaml"),
)

# Start watching -- policy reloads automatically on file change
async with PolicyWatcher(runtime, "policy.yaml", interval=2.0):
    # Your agent runs here. Edit policy.yaml and changes take effect
    # within 2 seconds -- no restart needed.
    result = await Runner.run(agent, "Do something")
```
Runtime Hooks: Custom Logging and Alerting¶
Attach callbacks to observe every policy decision, approval gate, and execution result without modifying the core pipeline:
```python
from aegis import Policy, PolicyDecision, Result, Runtime, RuntimeHooks


async def log_decision(decision: PolicyDecision) -> None:
    """Called after every policy evaluation."""
    if decision.risk_level.name in ("HIGH", "CRITICAL"):
        print(f"[ALERT] High-risk action: {decision.action.type} "
              f"-> {decision.action.target} (rule: {decision.matched_rule})")


async def log_approval(decision: PolicyDecision, approved: bool) -> None:
    """Called after every approval gate."""
    status = "APPROVED" if approved else "DENIED"
    print(f"[APPROVAL] {decision.action.type}: {status}")


async def log_execution(result: Result) -> None:
    """Called after every action execution."""
    print(f"[EXEC] {result.action.type}: {result.status.value}")


hooks = RuntimeHooks(
    on_decision=log_decision,
    on_approval=log_approval,
    on_execute=log_execution,
)

runtime = Runtime(
    executor=SimulatedExecutor(),
    policy=Policy.from_yaml("policy.yaml"),
    hooks=hooks,
)
```
Governing Both Sync and Async Tools¶
The @governed_tool decorator works with both sync and async functions. Aegis
detects the function type automatically:
```python
# Async tool -- runs natively
@governed_tool(runtime=runtime, action_type="search", action_target="web")
async def async_search(query: str) -> str:
    """Async search tool."""
    result = await some_async_client.search(query)
    return str(result)


# Sync tool -- also works
@governed_tool(runtime=runtime, action_type="lookup", action_target="cache")
def sync_lookup(key: str) -> str:
    """Sync lookup tool."""
    return cache.get(key)
```
Multiple Agents with Shared Policy¶
Use one runtime across multiple agents for consistent governance:
```python
runtime = Runtime(
    executor=SimulatedExecutor(),
    policy=Policy.from_yaml("policy.yaml"),
    approval_handler=AutoApprovalHandler(),
)


@governed_tool(runtime=runtime, action_type="search", action_target="web")
async def shared_search(query: str) -> str:
    """Shared search tool."""
    return f"Results for: {query}"


@governed_tool(runtime=runtime, action_type="write_email", action_target="email")
async def send_email(to: str, subject: str, body: str) -> str:
    """Send an email (governed)."""
    return f"Email sent to {to}"


researcher = Agent(
    name="Researcher",
    instructions="You research topics.",
    tools=[shared_search],
)

communicator = Agent(
    name="Communicator",
    instructions="You draft and send emails based on research.",
    tools=[shared_search, send_email],
)

# Both agents share the same policy and audit trail
result = await Runner.run(researcher, "Find info about AI safety")
result = await Runner.run(communicator, "Email a summary to team@example.com")
```
Dry Run: Test Policies Without Executing¶
Validate your policy against a set of actions without running any tools:
```python
plan = runtime.plan([
    Action("search", target="web", params={"query": "test"}),
    Action("write_record", target="database", params={"table": "users"}),
    Action("delete_record", target="database", params={"id": "42"}),
])

# Print what would happen
print(plan.summary())
# 1. [   AUTO] Action(search -> web) (risk=LOW, rule=search_auto)
# 2. [APPROVE] Action(write_record -> database) (risk=HIGH, rule=write_needs_approval)
# 3. [  BLOCK] Action(delete_record -> database) (risk=CRITICAL, rule=block_deletes)
```
Quick Reference¶
| Concept | Code |
|---|---|
| Load policy from YAML | `Policy.from_yaml("policy.yaml")` |
| Load + merge policies | `Policy.from_yaml_files("overrides.yaml", "base.yaml")` |
| Govern a function tool | `@governed_tool(runtime=rt, action_type=..., action_target=...)` |
| Plan actions | `plan = runtime.plan([Action(...)])` |
| Execute with governance | `results = await runtime.execute(plan)` |
| Dry run | `await runtime.execute(plan, dry_run=True)` |
| Hot-reload policy | `async with PolicyWatcher(runtime, "policy.yaml"): ...` |
| Manual policy update | `runtime.update_policy(new_policy)` |
| Query audit log | `AuditLogger().get_log(result_status="blocked")` |
| Export audit log | `AuditLogger().export_jsonl("out.jsonl")` |
Next Steps¶
- Policy syntax reference -- all match patterns, conditions, and operators
- Approval handlers -- CLI, webhook, callback, and custom handlers
- Audit log -- filtering, export, and programmatic access
- Custom adapters -- build an executor for any system
- Full API docs -- Runtime, ExecutionPlan, PolicyDecision