跳到主要内容
版本:0.4

Composition Rules

When you pass multiple middlewares — Agent(middleware=[m1, m2, m3]) — CubePi composes them according to per-hook rules that differ on purpose. The right way to think about it is: each hook has the composition rule that makes sense for its job, and you don't have to remember "before" or "after" precedence guesses.

The rules at a glance

HookRuleOrder matters?
transform_contextChain — each sees previous outputYes
convert_to_llmLast winsOnly the last one runs
transform_system_promptChainYes
before_tool_callFirst block stopsFirst in list wins blocks
after_tool_callLater overrides earlierLast write wins
should_stop_after_turnAny True stops (OR)No
after_model_responseChain with merge semanticsSee below

transform_context and transform_system_prompt

Chain: m1's output becomes m2's input becomes m3's input. Useful for layered transforms:

agent = Agent(
middleware=[
SlidingWindow(max_messages=20), # m1: drop oldest
InjectSummary(), # m2: prepend a summary block
],
)

m2 sees the truncated list. The user-visible agent.state.messages is untouched — middleware only changes what the model receives.

convert_to_llm

Last-wins on purpose: this is the final transform before wire serialisation. Multiple owners would fight; pick one. CubePi enforces that the last middleware in the list that implements convert_to_llm is the one that runs.

If you find yourself needing two convert_to_llm middlewares, collapse them into one (call site composition: write one that calls both).

before_tool_call

First block=True short-circuits the rest. Use to chain policy layers from most-restrictive to least:

agent = Agent(
middleware=[
RateLimiter(), # blocks on rate quota
SafetyFilter(), # blocks on dangerous args
AuditLogger(), # never blocks; just records
],
)

If RateLimiter returns block=True, SafetyFilter and AuditLogger's before_tool_call don't run. AuditLogger.after_tool_call still fires because that's a different hook.

after_tool_call

Each middleware can return an AfterToolCallResult with some fields set; CubePi merges them, with later results overriding earlier ones for any field that's not None. The full result:

class AfterToolCallResult(BaseModel):
content: list[Content] | None = None
details: Any = None
is_error: bool | None = None
terminate: bool | None = None

Pattern: an early middleware adds rich details, a later one sanitises content for the model. Both run; the merged result combines details from one with the redacted content from the other.

should_stop_after_turn

Any middleware returning True ends the run. The rest of the chain isn't evaluated.

agent = Agent(
middleware=[
MaxTurns(10),
BudgetCap(usd=0.5),
FinalAnswerSentinel(), # stops when assistant says "FINAL ANSWER"
],
)

after_model_response

Chain with structured merge. Each middleware sees the current response (which may have been replaced by an earlier middleware) and returns a TurnAction:

  • response: AssistantMessage | None — if non-None, replaces the current response for downstream middlewares and for what the loop ultimately persists.
  • inject_messages: list[Message] — appended into a single list across the whole chain, then added to context before the next turn.
  • decision: "natural" | "stop" | "loop_to_model" — the last middleware's value wins.
agent = Agent(
middleware=[
ProfanityRedactor(), # rewrites response
StructuredOutputValidator(), # may decide="loop_to_model"
EventLogger(), # decision unchanged
],
)

If StructuredOutputValidator returns decision="loop_to_model" and EventLogger returns decision="natural", the loop sees "natural" — because last wins. Reorder if that's not what you wanted.

Mixing middleware with constructor callables

Agent(...) also accepts explicit hook callables (convert_to_llm=…, before_tool_call=…, etc.). When both are present, the explicit callable wins:

agent = Agent(
middleware=[LoggingMiddleware()],
before_tool_call=my_explicit_hook, # overrides the middleware version
)

Use the explicit form for one-off hooks; use middleware classes when behaviour is a coherent bundle.

A note on Middleware base class

The base Middleware class's unimplemented methods raise NotImplementedError. compose_middleware detects this by comparing to the base method and only wires hooks the middleware actually overrides. You don't need to pass-implement every method.

class JustTransform(Middleware):
async def transform_context(self, messages, *, signal=None):
return messages[-10:]
# No other hooks. CubePi won't call them.

See also