Testing Guide

How to run, write, and extend the Aegis test suite.

Running Tests

# Full suite (verbose)
make test

# Or directly with pytest
pytest tests/ -v

# Single file
pytest tests/test_runtime.py -v

# Single test
pytest tests/test_runtime.py::test_plan_evaluates_actions -v

# With coverage report
make coverage
# Generates htmlcov/index.html + terminal summary

Test Structure

tests/
├── conftest.py              # Shared fixtures (FakeExecutor, sample_policy, runtime)
├── test_runtime.py          # Core runtime engine tests
├── test_policy.py           # Policy loading and evaluation
├── test_action.py           # Action model
├── test_conditions.py       # Time/param-based policy conditions
├── test_approval.py         # Approval handlers
├── test_audit.py            # Audit logger (SQLite)
├── test_audit_export.py     # JSONL export
├── test_audit_logging.py    # Python logging backend
├── test_schema.py           # Policy JSON schema
├── test_cli.py              # CLI commands
├── test_adapters.py         # Adapter import guards
├── test_anthropic_adapter.py
├── test_httpx_adapter.py
├── test_playwright_adapter.py
├── test_langchain_adapter.py
├── test_crewai_adapter.py
├── test_openai_agents_adapter.py
└── ...

Shared Fixtures (conftest.py)

tests/conftest.py provides reusable fixtures available to all test files:

from tests.conftest import FakeExecutor  # records calls, configurable failures

# Fixtures available via pytest injection:
# sample_policy  — Policy with read(AUTO), write(APPROVE), delete(BLOCK) rules
# fake_executor  — FakeExecutor instance
# runtime        — Pre-configured Runtime with fake executor + auto-approval + tmp audit DB

FakeExecutor is the go-to mock for testing anything that needs a BaseExecutor:

class FakeExecutor(BaseExecutor):
    def __init__(self, fail_on: set[str] | None = None) -> None:
        self.executed: list[Action] = []   # inspect what was executed
        self._fail_on = fail_on or set()   # action types that return FAILED

    async def execute(self, action: Action) -> Result:
        self.executed.append(action)
        if action.type in self._fail_on:
            return Result(action=action, status=ResultStatus.FAILED, ...)
        return Result(action=action, status=ResultStatus.SUCCESS, ...)
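As a usage sketch, a test can pre-load failures via fail_on and inspect executed afterwards. The snippet below is self-contained for illustration: Action, Result, and FakeExecutor here are minimal stand-ins for the real classes in aegis.core and tests/conftest.py, not the actual implementations.

```python
import asyncio
from dataclasses import dataclass

# Minimal stand-ins for the real aegis Action/Result models (illustration only).
@dataclass
class Action:
    type: str
    target: str

@dataclass
class Result:
    action: Action
    status: str

class FakeExecutor:
    """Stand-alone sketch of the conftest FakeExecutor pattern."""
    def __init__(self, fail_on=None):
        self.executed = []            # inspect what was executed
        self._fail_on = fail_on or set()

    async def execute(self, action):
        self.executed.append(action)
        status = "FAILED" if action.type in self._fail_on else "SUCCESS"
        return Result(action=action, status=status)

async def demo():
    executor = FakeExecutor(fail_on={"delete"})
    ok = await executor.execute(Action("read", "salesforce"))
    bad = await executor.execute(Action("delete", "salesforce"))
    return ok.status, bad.status, len(executor.executed)

print(asyncio.run(demo()))  # ('SUCCESS', 'FAILED', 2)
```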

Writing Async Tests

All async tests use pytest-asyncio with asyncio_mode = "auto" (configured in pyproject.toml), so any async def test function is collected and awaited automatically; no @pytest.mark.asyncio marker is required:

from aegis.core.action import Action

async def test_run_one_convenience(tmp_path):
    executor = FakeExecutor()
    runtime = _make_runtime(tmp_path, executor=executor)
    result = await runtime.run_one(Action("read", "salesforce"))

    assert result.ok
    assert len(executor.executed) == 1

Use tmp_path (pytest built-in) for any test that needs a temporary database:

from aegis.runtime.audit import AuditLogger

audit = AuditLogger(db_path=tmp_path / "test.db")
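The point of tmp_path is isolation: every test gets its own fresh directory, so SQLite files never collide across tests or runs. A self-contained sketch of the same pattern, with plain sqlite3 standing in for AuditLogger:

```python
import sqlite3
import tempfile
from pathlib import Path

def audit_roundtrip(tmp_path: Path) -> int:
    # mirrors AuditLogger(db_path=tmp_path / "test.db"): one DB per test dir
    con = sqlite3.connect(tmp_path / "test.db")
    con.execute("CREATE TABLE audit (event TEXT)")
    con.execute("INSERT INTO audit VALUES ('plan')")
    con.commit()
    (count,) = con.execute("SELECT COUNT(*) FROM audit").fetchone()
    con.close()
    return count

# pytest injects tmp_path itself; a TemporaryDirectory emulates that here
with tempfile.TemporaryDirectory() as d:
    assert audit_roundtrip(Path(d)) == 1
```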

Mocking Adapters

Adapters that depend on external libraries (httpx, playwright, etc.) are tested by mocking the underlying client. The pattern:

1. Mock the response object

from unittest.mock import AsyncMock, MagicMock

mock_response = MagicMock()
mock_response.status_code = 200
mock_response.is_success = True
mock_response.json.return_value = {"id": 1, "name": "Alice"}

2. Inject the mock client

from aegis.adapters.httpx_adapter import HttpxExecutor

executor = HttpxExecutor(base_url="https://api.example.com")
mock_client = AsyncMock()
mock_client.request = AsyncMock(return_value=mock_response)
executor._client = mock_client  # bypass setup()

3. Assert on results

action = Action("get", "/users/1")
result = await executor.execute(action)

assert result.status == ResultStatus.SUCCESS
assert result.data["body"] == {"id": 1, "name": "Alice"}
mock_client.request.assert_called_once()
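This works because of plain unittest.mock semantics rather than anything aegis-specific: awaiting an AsyncMock yields its return_value, and every call is recorded for later assertions. A self-contained demonstration of those mechanics:

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {"id": 1, "name": "Alice"}

mock_client = AsyncMock()
mock_client.request = AsyncMock(return_value=mock_response)

async def fetch():
    # same shape as an adapter's internal call: await client.request(...)
    resp = await mock_client.request("GET", "/users/1")
    return resp.json()

assert asyncio.run(fetch()) == {"id": 1, "name": "Alice"}
mock_client.request.assert_called_once_with("GET", "/users/1")
```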

Testing Import Guards

Every adapter has a guard that raises ImportError when its optional dependency is missing:

def test_import_guard():
    import sys
    saved = sys.modules.pop("httpx", None)
    sys.modules["httpx"] = None  # any "import httpx" now raises ImportError
    try:
        from aegis.adapters.httpx_adapter import _require_httpx
        with pytest.raises(ImportError, match="httpx"):
            _require_httpx()
    finally:
        if saved is not None:
            sys.modules["httpx"] = saved
        else:
            sys.modules.pop("httpx", None)
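When several adapters need the same save/restore dance, it can be factored into a reusable context manager. This is a hypothetical helper, not part of the suite; it relies on documented CPython behavior: a None entry in sys.modules makes the corresponding import raise ImportError.

```python
import contextlib
import sys

@contextlib.contextmanager
def block_import(name: str):
    """Temporarily make `import <name>` raise ImportError, then restore."""
    saved = sys.modules.pop(name, None)
    sys.modules[name] = None  # None in sys.modules => ImportError on import
    try:
        yield
    finally:
        if saved is not None:
            sys.modules[name] = saved
        else:
            sys.modules.pop(name, None)

with block_import("httpx"):
    try:
        import httpx
        raised = False
    except ImportError:
        raised = True
assert raised
```

With pytest available, monkeypatch.setitem(sys.modules, "httpx", None) achieves the same blocking and undoes itself automatically at teardown.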

Coverage Goals

Target: 98%+ line coverage.

make coverage
# pytest tests/ --cov=aegis --cov-report=term-missing --cov-report=html

The coverage configuration in pyproject.toml excludes lines that cannot be meaningfully tested:

[tool.coverage.report]
exclude_lines = [
    "pragma: no cover",
    "if TYPE_CHECKING:",
    "if __name__ ==",
    "raise NotImplementedError",
    "\\.\\.\\.",
]
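In practice this means defensive raise-only branches do not drag coverage down. A hypothetical example: the final line below matches the "raise NotImplementedError" pattern and is excluded automatically.

```python
def make_exporter(fmt: str) -> str:
    """Hypothetical factory, used only to illustrate the exclusions above."""
    if fmt == "jsonl":
        return "jsonl-exporter"
    # never reached in tests; excluded by the "raise NotImplementedError" pattern
    raise NotImplementedError(fmt)

assert make_exporter("jsonl") == "jsonl-exporter"
```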

Adding Tests for a New Adapter

When you add a new adapter (e.g., src/aegis/adapters/my_adapter.py):

  1. Create the test file: tests/test_my_adapter.py

  2. Test the import guard -- every adapter with an optional dependency needs one:

    def test_my_adapter_import_guard():
        try:
            from aegis.adapters.my_adapter import MyExecutor
            MyExecutor()  # dependency installed: construction must succeed
        except ImportError as e:
            # dependency missing: the error must name the package to install
            assert "my-library" in str(e)
    
  3. Test execute() with mocked dependencies -- follow the mock injection pattern above.

  4. Test error paths -- network errors, unsupported action types, bad responses:

    async def test_execute_error_handling():
        executor, mock_client = setup_mocked_executor()
        mock_client.request = AsyncMock(side_effect=Exception("Network error"))
    
        result = await executor.execute(Action("get", "/test"))
        assert result.status == ResultStatus.FAILED
        assert "Network error" in result.error
    
  5. Test with the runtime -- verify end-to-end governance works:

    async def test_my_adapter_with_runtime(tmp_path):
        from aegis.runtime.engine import Runtime
        from aegis.runtime.approval import AutoApprovalHandler
        from aegis.runtime.audit import AuditLogger
    
        runtime = Runtime(
            executor=MyExecutor(...),
            policy=Policy(rules=[
                PolicyRule(match_type="*", approval=Approval.AUTO, risk_level=RiskLevel.LOW),
            ]),
            approval_handler=AutoApprovalHandler(),
            audit_logger=AuditLogger(db_path=tmp_path / "test.db"),
        )
        result = await runtime.run_one(Action("read", "target"))
        assert result.ok
    
  6. Run the full check pipeline before submitting:

    ruff check src/ tests/
    ruff format --check src/ tests/
    mypy src/aegis/
    pytest tests/ -v --cov=aegis