Plugins & Hooks
Overview — what are plugins?
Plugins let you intercept and modify Mellea's execution at well-defined points without changing library code. Whether you need to enforce token budgets, redact PII, log every generation call, or block unsafe tool invocations, plugins give you fine-grained control over the entire pipeline.
When an event fires during Mellea's execution (e.g., "about to call the LLM" or "tool invocation requested"), the plugin system:
- Dispatches a typed payload describing the event to all registered plugins
- Runs plugins in priority order, grouped by execution mode
- Returns a result — continue unchanged, continue with a modified payload, or block execution entirely
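The three steps above can be pictured with a small toy model. The sketch below is not Mellea's implementation: `Payload`, `Result`, `dispatch`, and the two sample hooks are hypothetical stand-ins, and execution modes are ignored so that only priority ordering and the three possible outcomes are shown.

```python
import asyncio
from dataclasses import dataclass, replace
from typing import Awaitable, Callable, Optional


@dataclass(frozen=True)
class Payload:
    """Hypothetical stand-in for a typed, frozen hook payload."""
    prompt: str
    max_tokens: int = 1024


@dataclass(frozen=True)
class Result:
    """Hypothetical stand-in for a hook result."""
    continue_processing: bool = True
    modified_payload: Optional[Payload] = None
    reason: Optional[str] = None


Hook = Callable[[Payload], Awaitable[Optional[Result]]]


async def dispatch(payload: Payload, hooks: list[tuple[int, Hook]]) -> Result:
    """Run hooks in ascending priority order, threading the payload through."""
    for _, fn in sorted(hooks, key=lambda item: item[0]):
        result = await fn(payload)
        if result is None:
            continue  # hook only observed; payload unchanged
        if not result.continue_processing:
            return result  # a hook blocked; stop immediately
        if result.modified_payload is not None:
            payload = result.modified_payload
    return Result(modified_payload=payload)


async def cap_tokens(p: Payload) -> Optional[Result]:
    if p.max_tokens > 256:
        return Result(modified_payload=replace(p, max_tokens=256))
    return None


async def reject_secrets(p: Payload) -> Optional[Result]:
    if "password" in p.prompt:
        return Result(continue_processing=False, reason="secret in prompt")
    return None


hooks = [(50, cap_tokens), (10, reject_secrets)]
ok = asyncio.run(dispatch(Payload("hello"), hooks))
blocked = asyncio.run(dispatch(Payload("my password is hunter2"), hooks))
```

Here `reject_secrets` runs first because its priority number is lower, so the second call is blocked before `cap_tokens` ever sees the payload.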
Plugins are organized into six hook categories:
- Session Lifecycle: session init, reset, and cleanup (session_pre_init, session_post_init, session_reset, session_cleanup)
- Component Lifecycle: before and after component execution (component_pre_execute, component_post_success, component_post_error)
- Generation Pipeline: before and after LLM calls (generation_pre_call, generation_post_call, generation_error)
- Validation: before and after requirement checks (validation_pre_check, validation_post_check)
- Sampling Pipeline: sampling loop events (sampling_loop_start, sampling_iteration, sampling_repair, sampling_loop_end)
- Tool Execution: before and after tool invocations (tool_pre_invoke, tool_post_invoke)
Plugins require the hooks extra: pip install 'mellea[hooks]'
Quick start — your first plugin in 5 minutes
Here is a complete, working plugin in under 20 lines of user code. It logs a one-line summary before every LLM call.
## file: https://github.com/generative-computing/mellea/blob/main/docs/examples/plugins/quickstart.py
import logging

from mellea import start_session
from mellea.plugins import HookType, hook, register

# Assumption: nothing else has configured logging yet. Python's root logger
# defaults to WARNING, so enable INFO to make the hook's output visible.
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("quickstart")


@hook(HookType.GENERATION_PRE_CALL)
async def log_generation(payload, ctx):
    """Log a one-line summary before every LLM call."""
    action_preview = str(payload.action)[:80].replace("\n", " ")
    log.info("[log_generation] About to call LLM: %r", action_preview)


register(log_generation)

with start_session() as m:
    result = m.instruct("What is the capital of France?")
    log.info("Result: %s", result)
Every hook function receives two arguments:
- payload: A frozen, typed object containing all the data relevant to this hook point. You can read any field but cannot mutate it directly.
- ctx: Read-only context with session metadata (backend name, model ID, etc.).
A hook returns either None (continue unchanged) or a PluginResult (to modify the payload or block execution). The quickstart hook returns None implicitly, as it observes without interfering.
See the full example.
Standalone function hooks
The @hook decorator turns any async function into a plugin. This is the simplest and most common way to extend Mellea.
Anatomy of a hook function
from mellea.plugins import HookType, PluginMode, hook

@hook(HookType.GENERATION_PRE_CALL, mode=PluginMode.SEQUENTIAL, priority=10)
async def my_hook(payload, ctx):
    # payload: frozen, typed; read any field
    # ctx: read-only metadata (backend, model_id, etc.)
    # return None to pass through, or a PluginResult to modify/block
    pass
- hook_type: Which event to listen for (e.g., HookType.GENERATION_PRE_CALL).
- mode: How the hook executes. Default: PluginMode.SEQUENTIAL. See Execution Modes Deep Dive.
- priority: Lower numbers run first. Default: 50.
Execution modes at a glance
| Mode | Serial/Parallel | Can Block | Can Modify | Errors Propagated |
|---|---|---|---|---|
| SEQUENTIAL | Serial | Yes | Yes | Yes |
| TRANSFORM | Serial | No | Yes | Yes |
| AUDIT | Serial | No | No | Yes |
| CONCURRENT | Parallel | Yes | No | Yes |
| FIRE_AND_FORGET | Background | No | No | No |
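The "Can Block" and "Can Modify" columns can be read as a lookup. The following is a minimal sketch that encodes those two columns as plain data; `ModeCaps`, `MODE_CAPS`, and `honored` are hypothetical names used only for illustration and are not part of the Mellea API.

```python
from typing import NamedTuple


class ModeCaps(NamedTuple):
    can_block: bool
    can_modify: bool


# Transcribed from the "Can Block" / "Can Modify" columns of the table above.
MODE_CAPS = {
    "SEQUENTIAL": ModeCaps(can_block=True, can_modify=True),
    "TRANSFORM": ModeCaps(can_block=False, can_modify=True),
    "AUDIT": ModeCaps(can_block=False, can_modify=False),
    "CONCURRENT": ModeCaps(can_block=True, can_modify=False),
    "FIRE_AND_FORGET": ModeCaps(can_block=False, can_modify=False),
}


def honored(mode: str, wants_block: bool, wants_modify: bool) -> tuple[bool, bool]:
    """Return which of a hook's requests (block, modify) the mode would honor."""
    caps = MODE_CAPS[mode]
    return (wants_block and caps.can_block, wants_modify and caps.can_modify)
```

For example, `honored("TRANSFORM", True, True)` yields `(False, True)`: the modification is kept and the block request is ignored.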
Blocking execution
Use block() to reject an operation. The caller receives a PluginViolationError.
from mellea.plugins import HookType, PluginMode, block, hook

TOKEN_BUDGET = 4000

@hook(HookType.GENERATION_PRE_CALL, mode=PluginMode.SEQUENTIAL, priority=10)
async def enforce_token_budget(payload, ctx):
    # estimate_tokens() is a placeholder for your own estimator; it is not defined here.
    estimated = estimate_tokens(payload)
    if estimated > TOKEN_BUDGET:
        return block(
            f"Estimated {estimated} tokens exceeds budget of {TOKEN_BUDGET}",
            code="TOKEN_BUDGET_001",
            details={"estimated": estimated, "budget": TOKEN_BUDGET},
        )
block() accepts:
- reason (required): Human-readable explanation.
- code: Machine-readable error code for programmatic handling.
- details: A dict with additional structured data.
The caller catches the violation as a PluginViolationError with .reason, .code, .hook_type, and .plugin_name attributes.
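On the calling side, a violation can be turned into a structured record rather than an unhandled crash. The sketch below uses a local stand-in exception so it runs without Mellea installed; in real code you would import `PluginViolationError` from `mellea.plugins`. The stand-in's constructor signature and the `run_guarded` helper are assumptions for illustration only; just the four attribute names come from the text above.

```python
class PluginViolationError(Exception):
    """Local stand-in exposing the four attributes listed above (constructor is assumed)."""

    def __init__(self, reason, code=None, hook_type=None, plugin_name=None):
        super().__init__(reason)
        self.reason = reason
        self.code = code
        self.hook_type = hook_type
        self.plugin_name = plugin_name


def run_guarded(operation):
    """Call operation() and report a plugin violation as data instead of raising."""
    try:
        return {"ok": True, "value": operation()}
    except PluginViolationError as err:
        return {
            "ok": False,
            "reason": err.reason,
            "code": err.code,
            "hook_type": err.hook_type,
            "plugin_name": err.plugin_name,
        }


def over_budget():
    # Simulates what a caller would see when a hook blocks the operation.
    raise PluginViolationError(
        "Estimated 5000 tokens exceeds budget of 4000",
        code="TOKEN_BUDGET_001",
        hook_type="generation_pre_call",
        plugin_name="enforce_token_budget",
    )


outcome = run_guarded(over_budget)
```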
Modifying payloads
Payloads are frozen Pydantic models. Direct mutation raises FrozenModelError. Instead, use the modify() helper:
from mellea.plugins import HookType, PluginMode, hook, modify

@hook(HookType.GENERATION_PRE_CALL, mode=PluginMode.SEQUENTIAL, priority=10)
async def cap_max_tokens(payload, ctx):
    opts = dict(payload.model_options or {})
    if opts.get("max_tokens", float("inf")) > 256:
        opts["max_tokens"] = 256
        return modify(payload, model_options=opts)
For more control, use model_copy(update={...}) directly and return a PluginResult:
from mellea.plugins import PluginResult

# Inside a hook body. Guard against model_options being None before unpacking it.
current = payload.model_options or {}
modified = payload.model_copy(update={"model_options": {**current, "max_tokens": 256}})
return PluginResult(continue_processing=True, modified_payload=modified)
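The copy-instead-of-mutate idea can be tried without Mellea or Pydantic. The sketch below is only an analogy: it uses a frozen dataclass from the standard library, where `dataclasses.replace` plays the role that `model_copy(update=...)` plays for Pydantic models. `FakePayload` is a hypothetical stand-in, not a Mellea type.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace


@dataclass(frozen=True)
class FakePayload:
    """Hypothetical stand-in for a frozen hook payload."""
    model_options: dict = field(default_factory=dict)


original = FakePayload(model_options={"max_tokens": 1024})

try:
    original.model_options = {"max_tokens": 256}  # direct assignment is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False

# Build a new object with the changed field; the original stays untouched.
updated = replace(original, model_options={**original.model_options, "max_tokens": 256})
```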
Payload policies
Each hook type declares which fields are writable. Changes to non-writable fields are silently discarded by the framework. For example, generation_pre_call allows modifying model_options, format, and tool_calls, but not action or context. This ensures plugins cannot make changes the framework hasn't sanctioned.
See the full policy table in the Hook Types Reference.
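One way to picture "silently discarded" is as a filter applied to a hook's proposed changes. The sketch below is a toy model over plain dicts, not the framework's code; `apply_policy` is a hypothetical helper, and the writable set is the one stated above for generation_pre_call.

```python
def apply_policy(original: dict, proposed: dict, writable: frozenset) -> dict:
    """Keep proposed values only for writable fields; drop every other change."""
    accepted = dict(original)
    for name, value in proposed.items():
        if name in writable and name in original:
            accepted[name] = value
    return accepted


GENERATION_PRE_CALL_WRITABLE = frozenset({"model_options", "format", "tool_calls"})

# Placeholder field values; real payloads carry richer objects.
original = {
    "action": "summarize",
    "context": "ctx",
    "model_options": {"max_tokens": 1024},
    "format": None,
    "tool_calls": False,
}
proposed = {**original, "action": "something else", "model_options": {"max_tokens": 256}}

result = apply_policy(original, proposed, GENERATION_PRE_CALL_WRITABLE)
```

The change to `model_options` survives, while the attempted change to `action` is dropped without an error.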
See the standalone hooks example and the payload modification example.
Class-based plugins
When you need shared state across multiple hooks (e.g., a redaction counter, a rate limiter's token bucket), group them in a Plugin subclass.
When to use a class vs. standalone functions
- Standalone functions — Best for single-concern hooks that don't share state.
- Class-based plugins — Best when multiple hooks operate on a shared concern and need access to the same instance state.
Defining a plugin
from mellea.plugins import Plugin, hook, modify, HookType

class PIIRedactor(Plugin, name="pii-redactor", priority=5):
    def __init__(self, patterns=None):
        self.patterns = patterns or [r"\d{3}-\d{2}-\d{4}"]
        self.redaction_count = 0

    @hook(HookType.COMPONENT_PRE_EXECUTE)
    async def redact_input(self, payload, ctx):
        """Scan input for PII and redact before it reaches the LLM."""
        # self.patterns, self.redaction_count available here
        ...

    @hook(HookType.GENERATION_POST_CALL)
    async def redact_output(self, payload, ctx):
        """Scan LLM output for PII and redact before returning."""
        ...
Key syntax:
- Inherit from Plugin and set name and priority as class keyword arguments.
- Decorate methods with @hook(HookType.XXX). The self parameter gives access to shared state.
- The class priority is the default for all methods. Override per-method with @hook(HookType.XXX, priority=M).
Priority resolution
Priority is resolved in this order (highest precedence first):
1. PluginSet(priority=N) override: applies to all items in the set
2. @hook(priority=M) on the method: overrides the class default
3. Plugin(priority=N) class keyword: default for all methods
4. 50: the global default if nothing else is set
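The precedence order above amounts to "first value that is set wins". A minimal sketch, assuming unset levels are represented as `None`; `resolve_priority` is a hypothetical helper, not a Mellea function.

```python
from typing import Optional

DEFAULT_PRIORITY = 50  # global default stated above


def resolve_priority(
    set_priority: Optional[int] = None,
    hook_priority: Optional[int] = None,
    class_priority: Optional[int] = None,
) -> int:
    """Return the first priority that is set, from highest to lowest precedence."""
    for candidate in (set_priority, hook_priority, class_priority):
        if candidate is not None:  # compare against None so priority 0 still counts
            return candidate
    return DEFAULT_PRIORITY
```

For instance, `resolve_priority(hook_priority=20, class_priority=5)` gives 20, and adding `set_priority=1` changes the answer to 1.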
Registering a plugin
from mellea.plugins import register
redactor = PIIRedactor()
register(redactor)
See the full PII redaction example.
Registration and scoping
Plugins can be activated at three levels. Each level determines when hooks fire and when they are cleaned up.
Global scope
Register at module level; the hook then fires for every session and every functional API call.
from mellea.plugins import register
register(log_generation) # single hook
register([hook_a, hook_b]) # multiple hooks
register(redactor) # Plugin instance
register(observability_set) # PluginSet
Remove with unregister():
from mellea.plugins import unregister
unregister(log_generation)
Session scope
Pass plugins to start_session(); they fire only within that session.
from mellea import start_session

with start_session(plugins=[enforce_content_policy, log_component]) as m:
    result = m.instruct("Explain photosynthesis.")
# plugins deregistered when session exits
With-block scope
Activate plugins for a specific block of code with guaranteed cleanup.
plugin_scope():
from mellea.plugins import plugin_scope

with plugin_scope(log_request, log_response, content_guard):
    result = m.instruct("Name the planets.")
# all three deregistered here
Plugin instance as context manager:
guard = ContentGuard()
with guard:
    result = m.instruct("What is the boiling point of water?")
# guard deregistered here
PluginSet as context manager:
with observability:
    result = m.instruct("What is the capital of France?")
# observability hooks deregistered here
All three forms support async with for async code:
async with plugin_scope(log_request, ContentGuard()):
    result = await m.ainstruct("Describe the solar system.")
Nesting
Scopes stack cleanly. Each exit deregisters only its own plugins.
with plugin_scope(log_request):       # outer scope
    with ContentGuard() as guard:     # inner scope
        result = m.instruct("...")    # log_request + guard active
    result = m.instruct("...")        # only log_request active
# no plugins active
Cleanup guarantee
Plugins are always deregistered on scope exit, even if the block raises an exception. There is no resource leak on error.
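The nesting and cleanup behavior described above is the usual try/finally pattern behind Python context managers. The sketch below is a toy registry, not Mellea's: `ACTIVE` and `toy_scope` are hypothetical names that only show why an exception inside the block cannot leave a plugin registered.

```python
from contextlib import contextmanager

ACTIVE: list[str] = []  # toy registry of currently active plugin names


@contextmanager
def toy_scope(*names: str):
    """Register names on entry and always remove them on exit."""
    ACTIVE.extend(names)
    try:
        yield
    finally:
        # Runs on normal exit and when the block raises.
        for name in names:
            ACTIVE.remove(name)


seen = []
try:
    with toy_scope("log_request"):          # outer scope
        with toy_scope("content_guard"):    # inner scope
            seen.append(list(ACTIVE))
            raise RuntimeError("boom")
except RuntimeError:
    pass
```

After the exception propagates out of both blocks, the registry is empty again.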
Re-entrant restriction
The same instance cannot be active in two overlapping scopes. Create separate instances if you need parallel or nested activation:
guard1 = ContentGuard()
guard2 = ContentGuard()  # separate instance
with guard1:
    with guard2:  # OK: different instances
        ...
See the scoped plugins example and the session-scoped example.
PluginSets — composing plugins
A PluginSet groups related hooks and plugins into a reusable, named bundle. Use it to organize plugins by concern (security, observability, compliance) and register or scope them as a unit.
Creating a pluginset
from mellea.plugins import PluginSet
security = PluginSet("security", [enforce_token_budget, enforce_description_length])
observability = PluginSet("observability", [trace_session, trace_component, trace_cleanup])
A PluginSet accepts any mix of standalone @hook functions, Plugin instances, or nested PluginSets.
Registering
# Global
register(observability)

# Session-scoped
with start_session(plugins=[security]) as m:
    ...

# With-block
with security:
    ...
Priority override
PluginSet(priority=N) overrides the priority of all contained items, including nested sets:
# All items in this set run at priority 1, regardless of their own priority settings
critical = PluginSet("critical", [hook_a, hook_b, nested_set], priority=1)
Real-world pattern
Register observability globally (fires everywhere) and security per-session (fires only where needed):
register(observability)  # global

with start_session(plugins=[security]) as m:
    # both security and observability active
    result = m.instruct("Name three prime numbers.")

with start_session() as m:
    # only observability active
    result = m.instruct("What is 2 + 2?")
See the full PluginSet composition example.
Hook types reference
This section is a comprehensive reference for every implemented hook type. For each hook, you'll find when it fires, what payload fields are available, which fields are writable, and typical use cases.
Session lifecycle
session_pre_init
Fires: Immediately when start_session() is called, before backend initialization.
Payload fields: backend_name, model_id, model_options, context_type
Writable fields: model_id, model_options
Use cases:
- Enforcing model usage restrictions
- Injecting default model options
@hook(HookType.SESSION_PRE_INIT)
async def enforce_model_policy(payload, ctx):
    if "gpt-4" in str(payload.model_id):
        return block("GPT-4 usage not permitted", code="MODEL_POLICY")
session_post_init
Fires: After the session is fully initialized, before any operations.
Payload fields: session (the MelleaSession instance)
Writable fields: (observe-only)
Use cases:
- Initializing telemetry for the session
- Logging session configuration
session_reset
Fires: When session.reset() is called to clear context.
Payload fields: previous_context
Writable fields: (observe-only)
Use cases:
- Preserving audit trails before reset
- Resetting plugin-specific state
session_cleanup
Fires: When the session closes (via close(), cleanup(), or context manager exit).
Payload fields: context, interaction_count
Writable fields: (observe-only)
Use cases:
- Flushing telemetry buffers
- Aggregating session metrics
Component lifecycle
component_pre_execute
Fires: Before any component is executed via aact(). This is the primary interception point for all generation requests.
Payload fields: component_type, action, context_view, requirements, model_options, format, strategy, tool_calls_enabled
Writable fields: requirements, model_options, format, strategy, tool_calls_enabled
Use cases:
- Policy enforcement on generation requests
- Injecting or modifying model options
- Content filtering and authorization checks
- Routing to different sampling strategies
@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.SEQUENTIAL, priority=5)
async def enforce_content_policy(payload, ctx):
    desc = str(payload.action._description).lower()
    if "financial advice" in desc:
        return block("Restricted topic", code="CONTENT_001")
component_post_success
Fires: After successful component execution.
Payload fields: component_type, action, result, context_before, context_after, generate_log, sampling_results, latency_ms
Writable fields: (observe-only)
Use cases:
- Latency and metrics collection
- Audit logging
@hook(HookType.COMPONENT_POST_SUCCESS, mode=PluginMode.AUDIT)
async def log_latency(payload, ctx):
    log.info("component=%s latency=%dms", payload.component_type, payload.latency_ms)
component_post_error
Fires: When component execution fails with an exception.
Payload fields: component_type, action, error, error_type, stack_trace, context, model_options
Writable fields: (observe-only)
Use cases:
- Error logging and alerting
- Failure analysis
Generation pipeline
generation_pre_call
Fires: Just before the backend transmits data to the LLM API.
Payload fields: action, context, model_options, format, tool_calls
Writable fields: model_options, format, tool_calls
Use cases:
- Token budget enforcement
- Prompt injection detection
- Last-mile model option adjustments
@hook(HookType.GENERATION_PRE_CALL, mode=PluginMode.SEQUENTIAL, priority=10)
async def cap_tokens(payload, ctx):
    opts = dict(payload.model_options or {})
    opts["max_tokens"] = min(opts.get("max_tokens", 4096), 256)
    return modify(payload, model_options=opts)
generation_post_call
Fires: After the LLM response is fully materialized (model_output.value is available).
Payload fields: prompt, model_output, latency_ms
Writable fields: (observe-only)
Use cases:
- Output logging and inspection
- Response caching
- Quality metrics and hallucination detection
generation_error
Fires: When the LLM backend raises an exception during output materialization, just before the exception is re-raised.
Payload fields: exception, model_output
Writable fields: (observe-only)
Use cases:
- Error telemetry and alerting on backend failures
- Logging structured error information for debugging
Validation
validation_pre_check
Fires: Before running requirement validation.
Payload fields: requirements, target, context, model_options
Writable fields: requirements, model_options
Use cases:
- Injecting additional requirements
- Overriding validation model options
validation_post_check
Fires: After all validations complete.
Payload fields: requirements, results, all_validations_passed, passed_count, failed_count, generate_logs
Writable fields: results, all_validations_passed
Use cases:
- Logging validation outcomes
- Overriding validation results
- Triggering alerts on failures
Sampling pipeline
sampling_loop_start
Fires: When a sampling strategy begins execution.
Payload fields: strategy_name, action, context, requirements, loop_budget
Writable fields: loop_budget
Use cases:
- Dynamically adjusting the iteration budget
- Logging sampling configuration
sampling_iteration
Fires: After each sampling attempt.
Payload fields: iteration, action, result, validation_results, all_validations_passed, valid_count, total_count
Writable fields: (observe-only)
Use cases:
- Iteration-level metrics
- Debugging sampling behavior
sampling_repair
Fires: When repair is invoked after a validation failure.
Payload fields: repair_type, failed_action, failed_result, failed_validations, repair_action, repair_context, repair_iteration
Writable fields: (observe-only)
Use cases:
- Analyzing failure patterns
- Logging repair events
sampling_loop_end
Fires: When sampling completes (success or failure).
Payload fields: success, iterations_used, final_result, final_action, final_context, failure_reason, all_results, all_validations
Writable fields: (observe-only)
Use cases:
- Sampling effectiveness metrics
- Cost tracking
Tool execution
tool_pre_invoke
Fires: Before invoking a tool from LLM output.
Payload fields: model_tool_call (contains name, args, callable), is_control_flow
Writable fields: model_tool_call
Use cases:
- Tool authorization (allow-listing)
- Argument validation and sanitization
@hook(HookType.TOOL_PRE_INVOKE, mode=PluginMode.CONCURRENT, priority=5)
async def enforce_tool_allowlist(payload, ctx):
    if payload.is_control_flow:
        return  # framework control-flow tools are exempt
    if payload.model_tool_call.name not in ALLOWED_TOOLS:
        return block(f"Tool '{payload.model_tool_call.name}' not permitted", code="TOOL_NOT_ALLOWED")
The payload includes an is_control_flow field that is True for framework control-flow tools (e.g. the ReAct loop's final_answer). Allowlist plugins should check this field to avoid blocking internal tools. See Control-flow tools for the recommended pattern.
tool_post_invoke
Fires: After tool execution completes.
Payload fields: model_tool_call, tool_output, tool_message, execution_time_ms, success, error, is_control_flow
Writable fields: tool_output
Use cases:
- Audit logging of tool calls
- Output transformation
- Error handling
Hook payload policy table
This table summarizes which fields are writable for each hook type. Changes to non-writable fields are silently discarded.
| Hook Point | Writable Fields |
|---|---|
| session_pre_init | model_id, model_options |
| session_post_init | (observe-only) |
| session_reset | (observe-only) |
| session_cleanup | (observe-only) |
| component_pre_execute | requirements, model_options, format, strategy, tool_calls_enabled |
| component_post_success | (observe-only) |
| component_post_error | (observe-only) |
| generation_pre_call | model_options, format, tool_calls |
| generation_post_call | (observe-only) |
| generation_error | (observe-only) |
| validation_pre_check | requirements, model_options |
| validation_post_check | results, all_validations_passed |
| sampling_loop_start | loop_budget |
| sampling_iteration | (observe-only) |
| sampling_repair | (observe-only) |
| sampling_loop_end | (observe-only) |
| tool_pre_invoke | model_tool_call |
| tool_post_invoke | tool_output |
Only SEQUENTIAL and TRANSFORM modes can modify payloads. AUDIT, CONCURRENT, and FIRE_AND_FORGET modes have their modifications silently discarded regardless of the policy table.
Execution modes deep dive
All hooks for a given hook type are sorted by priority, then dispatched in groups by execution mode. The execution order is always: SEQUENTIAL → TRANSFORM → AUDIT → CONCURRENT → FIRE_AND_FORGET.
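A consequence of this ordering is that priority only decides the order within a mode group; a low priority number does not move a hook ahead of an earlier phase. The sketch below illustrates that under the stated order. `Mode` and `dispatch_order` are hypothetical names, and the hook tuples are made-up examples.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Phases in the execution order given above."""
    SEQUENTIAL = 0
    TRANSFORM = 1
    AUDIT = 2
    CONCURRENT = 3
    FIRE_AND_FORGET = 4


def dispatch_order(hooks: list[tuple[str, Mode, int]]) -> list[str]:
    """Sort (name, mode, priority) tuples by priority, then group by mode phase."""
    by_priority = sorted(hooks, key=lambda h: h[2])
    # sorted() is stable, so priority order is preserved inside each phase.
    return [name for name, _, _ in sorted(by_priority, key=lambda h: h[1])]


order = dispatch_order([
    ("telemetry", Mode.FIRE_AND_FORGET, 1),
    ("policy", Mode.SEQUENTIAL, 10),
    ("shadow", Mode.AUDIT, 5),
    ("budget", Mode.SEQUENTIAL, 5),
    ("redact", Mode.TRANSFORM, 20),
])
```

Even with priority 1, the FIRE_AND_FORGET hook lands last, after every other phase.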
SEQUENTIAL (default)
Serial, chained execution. Each hook receives the payload from the prior hook. Can both block and modify. This is the default mode; use it when you need full control over the pipeline.
@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.SEQUENTIAL, priority=10)
async def enforce_policy(payload, ctx):
    # Can block:
    if is_unsafe(payload):
        return block("Unsafe content", code="UNSAFE")
    # Can modify (merge into existing options rather than replacing them):
    opts = {**(payload.model_options or {}), "temperature": 0.1}
    return modify(payload, model_options=opts)
TRANSFORM
Serial, chained execution after all SEQUENTIAL hooks. Can modify but cannot block (block() calls are suppressed with a warning). Use for data transformation (PII redaction, prompt rewriting) where you want to guarantee the pipeline continues.
@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.TRANSFORM, priority=20)
async def enrich_options(payload, ctx):
    opts = dict(payload.model_options or {})
    opts.setdefault("temperature", 0.7)
    return modify(payload, model_options=opts)
AUDIT
Awaited inline after TRANSFORM. Observe-only: payload modifications are discarded and violations are logged but do not block. Use for shadow policies, canary deployments, and gradual policy rollout.
@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.AUDIT, priority=30)
async def shadow_policy(payload, ctx):
    # This block() is logged but does NOT stop execution
    return block("Would block in production", code="SHADOW_001")
CONCURRENT
Dispatched in parallel after AUDIT. Can block (fail-fast on first blocking result) but cannot modify; modifications are discarded to avoid non-deterministic last-writer-wins races. Use for independent validation checks that benefit from parallel execution.
@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.CONCURRENT, priority=40)
async def rate_limit_check(payload, ctx):
    if await is_rate_limited(ctx):
        return block("Rate limit exceeded", code="RATE_LIMIT")
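Fail-fast parallel checking can be sketched with plain asyncio. This is one possible way to get that behavior, not a description of how Mellea schedules CONCURRENT hooks; `run_concurrent` and the two sample checks are hypothetical, and each check returns a blocking reason string or `None`.

```python
import asyncio


async def run_concurrent(checks):
    """Run checks in parallel; return the first blocking reason, or None if all pass."""
    tasks = [asyncio.create_task(check()) for check in checks]
    try:
        for finished in asyncio.as_completed(tasks):
            reason = await finished
            if reason is not None:
                return reason  # fail fast on the first blocking result
        return None
    finally:
        # Cancel whatever is still running and wait for the cancellations to settle.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)


async def fast_block():
    await asyncio.sleep(0.01)
    return "rate limit exceeded"


async def slow_pass():
    await asyncio.sleep(0.2)
    return None


first = asyncio.run(run_concurrent([fast_block, slow_pass]))
none_blocked = asyncio.run(run_concurrent([slow_pass]))
```

In the first run the slow check is cancelled as soon as the fast one reports a block, so the caller does not wait for it.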
FIRE_AND_FORGET
Dispatched via asyncio.create_task() after all other phases. Receives a copy-on-write snapshot of the payload. Cannot modify or block. Exceptions are logged but never propagated. Use for telemetry, async logging, and side-effects that must not slow down the pipeline.
@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.FIRE_AND_FORGET, priority=50)
async def send_telemetry(payload, ctx):
    await telemetry_client.send({"component": payload.component_type})
The FIRE_AND_FORGET log output may appear after the main result is printed. This is expected behavior, as these hooks run in the background.
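The ordering effect and the swallowed exceptions can be reproduced with a few lines of asyncio. The sketch below is a simplified model: it copies a dict instead of using a copy-on-write snapshot, records events in a list instead of logging, and all names (`telemetry_hook`, `failing_hook`, `_record_failure`, `main`) are hypothetical.

```python
import asyncio

events = []


async def telemetry_hook(snapshot):
    await asyncio.sleep(0.05)  # simulate slow network I/O
    events.append(("telemetry", snapshot["component"]))


async def failing_hook(snapshot):
    raise RuntimeError("collector unreachable")


def _record_failure(task):
    # Retrieve the exception so it is recorded rather than propagated.
    if not task.cancelled() and task.exception() is not None:
        events.append(("hook_error", str(task.exception())))


async def main():
    payload = {"component": "Instruction"}
    background = []
    for fn in (telemetry_hook, failing_hook):
        task = asyncio.create_task(fn(dict(payload)))  # each hook gets its own copy
        task.add_done_callback(_record_failure)
        background.append(task)
    events.append(("main", "result ready"))  # the pipeline continues without awaiting
    # Demo only: wait for the background tasks so their effects are observable.
    await asyncio.gather(*background, return_exceptions=True)


asyncio.run(main())
```

The "result ready" event is recorded before either hook has run, which mirrors why background log lines can show up after the main result.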
Chaining
In SEQUENTIAL and TRANSFORM modes, when multiple plugins modify the same payload, modifications are composed. Plugin B sees the output of Plugin A (after policy filtering). This enables pipelines like:
- Plugin A caps max_tokens to 256
- Plugin B (seeing the capped value) adds a temperature default
- The final payload has both modifications applied
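The pipeline just described is function composition over the payload. A minimal sketch over plain option dicts, with policy filtering left out; `chain` and the two plugin functions are hypothetical names for illustration.

```python
from functools import reduce


def cap_max_tokens(opts: dict) -> dict:
    """Plugin A: cap max_tokens at 256."""
    return {**opts, "max_tokens": min(opts.get("max_tokens", 4096), 256)}


def default_temperature(opts: dict) -> dict:
    """Plugin B: add a temperature default; it sees Plugin A's capped value."""
    return {"temperature": 0.7, **opts}


def chain(opts: dict, plugins) -> dict:
    """Feed each plugin the output of the previous one."""
    return reduce(lambda current, plugin: plugin(current), plugins, opts)


final = chain({"max_tokens": 1024}, [cap_max_tokens, default_temperature])
```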
Error handling
- SEQUENTIAL, TRANSFORM, AUDIT, CONCURRENT, FIRE_AND_FORGET: exceptions are logged and swallowed. They never affect the pipeline.
- block(): this is intentional control flow, not an error. It raises a PluginViolationError to the caller.
See the execution modes example.
Tool hooks — securing tool calls
The tool_pre_invoke and tool_post_invoke hooks give you fine-grained control over tool-call governance. See the MCP integration guide for tool calling basics.
Tool allow-listing
Block any tool not on an explicit approved list. The is_control_flow guard ensures framework tools like final_answer are not blocked:
ALLOWED_TOOLS = frozenset({"get_weather", "calculator"})

@hook(HookType.TOOL_PRE_INVOKE, mode=PluginMode.CONCURRENT, priority=5)
async def enforce_tool_allowlist(payload, ctx):
    if payload.is_control_flow:
        return  # framework control-flow tools are exempt
    tool_name = payload.model_tool_call.name
    if tool_name not in ALLOWED_TOOLS:
        return block(f"Tool '{tool_name}' is not permitted", code="TOOL_NOT_ALLOWED")
Argument validation
Inspect and reject unsafe arguments before invocation:
@hook(HookType.TOOL_PRE_INVOKE, mode=PluginMode.CONCURRENT, priority=10)
async def validate_calculator_args(payload, ctx):
    if payload.model_tool_call.name == "calculator":
        expr = payload.model_tool_call.args.get("expression", "")
        if not set(expr).issubset(set("0123456789 +-*/(). ")):
            return block("Unsafe calculator expression", code="UNSAFE_EXPRESSION")
Argument sanitization
Auto-fix arguments instead of blocking (a repair pattern):
import dataclasses
from mellea.plugins import PluginResult

# TRANSFORM rather than CONCURRENT: only SEQUENTIAL and TRANSFORM hooks can
# modify payloads; modifications from CONCURRENT hooks are discarded.
@hook(HookType.TOOL_PRE_INVOKE, mode=PluginMode.TRANSFORM, priority=15)
async def sanitize_args(payload, ctx):
    mtc = payload.model_tool_call
    if mtc.name == "get_weather":
        location = mtc.args.get("location", "").strip().title()
        if location != mtc.args.get("location"):
            new_call = dataclasses.replace(mtc, args={**mtc.args, "location": location})
            modified = payload.model_copy(update={"model_tool_call": new_call})
            return PluginResult(continue_processing=True, modified_payload=modified)
Audit logging
Fire-and-forget logging of every tool call for audit trails:
@hook(HookType.TOOL_POST_INVOKE, mode=PluginMode.FIRE_AND_FORGET)
async def audit_tool_calls(payload, ctx):
    status = "OK" if payload.success else "ERROR"
    log.info("tool=%r status=%s latency=%dms", payload.model_tool_call.name, status, payload.execution_time_ms)
Composing tool hooks
Group tool security hooks into a PluginSet for clean per-session registration:
tool_security = PluginSet("tool-security", [enforce_tool_allowlist, validate_calculator_args, audit_tool_calls])

with start_session(plugins=[tool_security]) as m:
    result = m.instruct("What's the weather in Boston?", tool_calls=True)
See the full tool hooks example.
Control-flow tools
Mellea's frameworks use internal tools for control flow. For example, the ReAct loop uses a final_answer tool to signal that the agent has finished reasoning. These tools flow through the same invocation path as user-defined tools — hooks always fire for them — but the payload carries an is_control_flow flag so each plugin can decide its own policy.
The recommended pattern for allowlist plugins is to skip control-flow tools explicitly:
ALLOWED_TOOLS = frozenset({"get_weather", "calculator"})

@hook(HookType.TOOL_PRE_INVOKE, mode=PluginMode.CONCURRENT, priority=5)
async def enforce_tool_allowlist(payload, ctx):
    if payload.is_control_flow:
        return  # framework control-flow tools are exempt
    if payload.model_tool_call.name not in ALLOWED_TOOLS:
        return block(f"Tool '{payload.model_tool_call.name}' not permitted")
Logging and telemetry plugins typically do not check this flag — they observe all tool calls including control-flow tools:
@hook(HookType.TOOL_POST_INVOKE, mode=PluginMode.FIRE_AND_FORGET)
async def log_all_tools(payload, ctx):
    logger.info("tool=%s control_flow=%s ms=%d", payload.model_tool_call.name,
                payload.is_control_flow, payload.execution_time_ms)
Querying the registry
Use is_internal_tool() to check whether a tool name is a known control-flow tool:
from mellea.plugins import is_internal_tool
is_internal_tool("final_answer") # True
is_internal_tool("get_weather") # False
Patterns and best practices
Observability stack
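Combine session tracing, component latency, and generation logging, all using FIRE_AND_FORGET or AUDIT mode so they can neither block nor modify the pipeline. AUDIT hooks are still awaited inline, so keep them fast; FIRE_AND_FORGET hooks run in the background: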
observability = PluginSet("observability", [
    trace_session_start,      # AUDIT: session_post_init
    trace_component_success,  # AUDIT: component_post_success
    trace_session_end,        # AUDIT: session_cleanup
])
register(observability) # global — fires for all sessions
Layered security
Stack enforcement across scopes:
- Global: Token budget enforcement (SEQUENTIAL)
- Session-scoped: Content policy for sensitive sessions
- With-block: Feature flags for specific operations
register(enforce_token_budget)  # global

with start_session(plugins=[content_policy]) as m:
    with plugin_scope(feature_flag_hook):
        result = m.instruct("...")
Input/output guardrails
Block PII on input (component_pre_execute) and redact PII from output (generation_post_call) using a class-based plugin with shared state:
class PIIRedactor(Plugin, name="pii-redactor", priority=5):
    @hook(HookType.COMPONENT_PRE_EXECUTE)
    async def reject_pii_input(self, payload, ctx): ...

    @hook(HookType.GENERATION_POST_CALL)
    async def redact_output(self, payload, ctx): ...
Graceful degradation with AUDIT mode
Deploy a new policy in AUDIT mode first, where violations are logged but do not block. Monitor the logs. When you're confident, promote to SEQUENTIAL:
# Phase 1: shadow mode, log violations without blocking
@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.AUDIT)
async def new_content_policy(payload, ctx):
    if is_prohibited(payload):
        return block("Would block", code="NEW_POLICY_001")

# Phase 2: enforce by changing mode to SEQUENTIAL when ready
Testing plugins
You can unit-test hook functions without running a full Mellea session. Construct a payload mock, call the function directly, and assert the result:
from unittest.mock import MagicMock

async def test_blocks_long_description():
    payload = MagicMock()
    payload.action._description = "x" * 1000
    ctx = MagicMock()
    result = await enforce_description_length(payload, ctx)
    assert result is not None
    assert result.continue_processing is False
    assert result.violation.code == "DESC_TOO_LONG"
See the full testing example.
Idempotent lifecycle hooks
If you use the advanced MelleaPlugin base class (which provides initialize() and shutdown() callbacks), make them idempotent, as they may be called once per @hook method on your plugin.
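A common way to make such callbacks idempotent is a guard flag. The sketch below uses a plain stand-in class so it runs on its own; a real plugin would derive from MelleaPlugin. Treating `initialize()` and `shutdown()` as async methods is an assumption here, and `TelemetryPlugin` and `connections_opened` are hypothetical names.

```python
import asyncio


class TelemetryPlugin:
    """Stand-in class; a real plugin would derive from MelleaPlugin."""

    def __init__(self):
        self._initialized = False
        self.connections_opened = 0

    async def initialize(self):
        if self._initialized:
            return  # already set up; repeated calls are no-ops
        self.connections_opened += 1  # placeholder for opening a real resource
        self._initialized = True

    async def shutdown(self):
        if not self._initialized:
            return  # nothing to tear down
        self.connections_opened -= 1  # placeholder for releasing the resource
        self._initialized = False


async def demo():
    plugin = TelemetryPlugin()
    await plugin.initialize()
    await plugin.initialize()  # second call, e.g. for a second @hook method
    after_init = plugin.connections_opened
    await plugin.shutdown()
    await plugin.shutdown()
    return after_init, plugin.connections_opened


after_init, after_shutdown = asyncio.run(demo())
```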
API reference
All public symbols are available from a single import:
from mellea.plugins import (
    HookType,              # Enum of all hook types (e.g., GENERATION_PRE_CALL)
    Plugin,                # Base class for class-based plugins
    PluginMode,            # Execution mode enum (SEQUENTIAL, TRANSFORM, ...)
    PluginResult,          # Return type for hooks that modify or block
    PluginSet,             # Named group of hooks/plugins for composition
    PluginViolationError,  # Exception raised when a hook blocks execution
    block,                 # Helper to create a blocking PluginResult
    hook,                  # Decorator to register an async function as a hook handler
    is_internal_tool,      # Check if a tool is a framework control-flow tool
    modify,                # Helper to create a modifying PluginResult
    plugin_scope,          # Context manager for with-block scoped activation
    register,              # Register hooks/plugins globally or per-session
    unregister,            # Remove globally-registered hooks/plugins
)
| Symbol | Description |
|---|---|
| @hook(hook_type, *, mode, priority) | Decorator that marks an async function as a hook handler |
| Plugin | Base class for multi-hook plugins with shared state. Set name and priority via class keywords |
| PluginSet(name, items, *, priority) | Groups hooks, plugins, and nested sets into a reusable bundle |
| register(items, *, session_id) | Register hooks/plugins. session_id=None for global scope |
| unregister(items) | Remove globally-registered items |
| plugin_scope(*items) | Context manager that registers on enter, deregisters on exit |
| block(reason, *, code, details) | Create a blocking PluginResult |
| modify(payload, **field_updates) | Create a modifying PluginResult via model_copy |
| is_internal_tool(tool_name) | Returns True if the tool is a framework control-flow tool (e.g. final_answer) |
| HookType | Enum with all 18 hook types |
| PluginMode | Enum: SEQUENTIAL, TRANSFORM, AUDIT, CONCURRENT, FIRE_AND_FORGET |
| PluginResult | Typed result with continue_processing, modified_payload, and violation |
| PluginViolationError | Exception with .reason, .code, .hook_type, .plugin_name |
See also: Glossary, Tools and Agents, Safety Guardrails, OpenTelemetry Tracing