Lifecycle hooks

A module can implement up to four hooks. init fires once at gateway startup; pre, stream, and post each fire at a specific point in the request lifecycle.

`init(storage)`

Fires: once per process, at gateway startup.

Parameters: the resolved StorageAdapter. Use it to seed state, run a migration check, or warm a local cache.

Returns: Promise<void>.

Failures: if init throws, the gateway logs the error and does not load the module for the rest of the process. Other modules continue.


async init(storage) {
  await storage.db.raw(`
    CREATE TABLE IF NOT EXISTS my_module_state (
      id TEXT PRIMARY KEY,
      value TEXT NOT NULL
    )
  `);
}

The cloud and local databases use different SQL syntax. Either restrict raw() SQL to a portable subset or branch on storage.kind.
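One way to branch on storage.kind is to keep the dialect-specific SQL in a small pure function. The sketch below assumes storage.kind is either 'cloud' or 'local', and that the cloud backend accepts Postgres-style types while the local one is SQLite-like. The literal kind values, the StorageLike interface, and the updated_at column are illustrative assumptions, not part of the documented API.

```typescript
// Assumed shapes: only `kind` and `db.raw()` come from the text above.
// The literal kinds and the SQL dialects behind them are guesses.
type StorageKind = 'cloud' | 'local';

interface StorageLike {
  kind: StorageKind;
  db: { raw(sql: string): Promise<unknown> };
}

// Pure function, so the dialect choice can be unit-tested without a database.
function stateTableSql(kind: StorageKind): string {
  // Hypothetical difference: a native timestamp type in the cloud,
  // ISO-8601 text locally.
  const timestampType = kind === 'cloud' ? 'TIMESTAMPTZ' : 'TEXT';
  return `
    CREATE TABLE IF NOT EXISTS my_module_state (
      id TEXT PRIMARY KEY,
      value TEXT NOT NULL,
      updated_at ${timestampType}
    )
  `;
}

async function init(storage: StorageLike): Promise<void> {
  await storage.db.raw(stateTableSql(storage.kind));
}
```

Keeping the SQL builder separate from the hook means each dialect can be checked in isolation before the module ever touches a real database.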

`pre(ctx)`

Fires: before the provider call, in pipeline order. The first module’s pre runs first.

Parameters: RequestContext — request (mutable), metadata, storage, apiKey, logger, startTime.

Returns: Promise<PreResult>:

{ continue: true } — proceed to next module.
{ continue: false, response } — short-circuit. Remaining pre hooks AND the provider call are skipped. The supplied response is sent to the client. Post hooks run on it.

Failures: an uncaught throw is logged and the module is skipped; the pipeline continues as if the module had returned { continue: true }. The metadata key module-name.preFailed (prefixed with the failing module's name) is set so downstream modules can react.
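A downstream module can check for that key before relying on work an upstream module was supposed to do. This sketch assumes ctx.metadata is a Map and that the key is present whenever the upstream pre hook threw. The stored value is not specified here, so only presence is tested. The module name rate-limiter, the my-module.degraded key, and the PreContextLike type are hypothetical.

```typescript
// Minimal stand-in for the part of RequestContext this example touches.
interface PreContextLike {
  metadata: Map<string, unknown>;
}

// True when the gateway recorded a pre-hook failure for `moduleName`.
// Only the key's presence is checked; its value is not documented here.
function preFailed(ctx: PreContextLike, moduleName: string): boolean {
  return ctx.metadata.has(`${moduleName}.preFailed`);
}

// Hypothetical downstream module that depends on an upstream "rate-limiter".
async function pre(ctx: PreContextLike): Promise<{ continue: true }> {
  if (preFailed(ctx, 'rate-limiter')) {
    // Upstream did not run to completion: record a degraded mode instead of
    // trusting values it would normally have written.
    ctx.metadata.set('my-module.degraded', true);
  }
  return { continue: true };
}
```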


async pre(ctx) {
  try {
    const cached = await ctx.storage.kv.get(cacheKey(ctx.request));
    if (cached) {
      return {
        continue: false,
        response: JSON.parse(cached) as CanonicalResponse,
      };
    }
    return { continue: true };
  } catch (err) {
    ctx.logger.warn({ err }, 'cache lookup failed');
    return { continue: true };  // never deny the user a response
  }
}
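The example calls a cacheKey helper that the module has to supply itself. A minimal sketch, assuming the request carries a model string and a messages array (the messages field, the RequestLike type, and the my-module.cache prefix are assumptions), is to serialize the fields that determine the response in a fixed order:

```typescript
// Assumed request shape; only `model` appears elsewhere on this page.
interface RequestLike {
  model: string;
  messages: Array<{ role: string; content: string }>;
}

// Builds a deterministic key from the fields that determine the response.
// An array is serialized rather than the request object itself, so the key
// does not depend on property insertion order or on unrelated fields.
function cacheKey(request: RequestLike): string {
  const parts = [
    request.model,
    request.messages.map((m) => [m.role, m.content]),
  ];
  return `my-module.cache:${JSON.stringify(parts)}`;
}
```

Keys built this way grow with the prompt, so a production module would probably hash the serialized string before using it as a KV key.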

`stream(chunk, ctx)`

Fires: for each chunk during a streaming response, in pipeline order.

Parameters:

chunk: CanonicalChunk, the current chunk. Treat it as read-only and return either the same chunk or a modified copy.
ctx: ResponseContext, which is RequestContext plus response (accumulated so far) and durationMs.

Returns: Promise<CanonicalChunk> — the chunk to forward to the next module / client.

Failures: an uncaught throw is logged and the chunk is forwarded unmodified to the next module.
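As an illustration of the return-a-copy contract, here is a sketch of a stream hook that masks API-key-like strings. The delta field on CanonicalChunk and the sk- key pattern are assumptions made for the example; the real chunk shape may differ.

```typescript
// Assumed chunk shape: a text delta plus whatever else the gateway attaches.
interface CanonicalChunk {
  delta?: string;
  [extra: string]: unknown;
}

const SECRET_PATTERN = /sk-[A-Za-z0-9]{8,}/g;

// Returns a new chunk with matches masked; the input chunk is never mutated.
// A secret split across two chunks is NOT caught by this per-chunk approach.
function redactChunk(chunk: CanonicalChunk): CanonicalChunk {
  if (typeof chunk.delta !== 'string') return chunk;
  return { ...chunk, delta: chunk.delta.replace(SECRET_PATTERN, '[redacted]') };
}

// The hook itself; `ctx` is unused here, so it is typed loosely.
async function stream(chunk: CanonicalChunk, _ctx: unknown): Promise<CanonicalChunk> {
  return redactChunk(chunk);
}
```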

v1 has limited stream-hook coverage. mcp-optimizer and cost-guard use it; semantic-cache accumulates chunks for its post-write but doesn’t transform them. Treat this hook as planned-but-fragile until v1.1.

`post(ctx)`

Fires: after the response is sent to the client. Fire-and-forget — does not block the client.

Parameters: ResponseContext — full request + response + duration.

Returns: Promise<void>. The return value is ignored.

Failures: caught and logged. The client never sees a post-hook error.


async post(ctx) {
  // Fire-and-forget: write to a slow analytics backend without blocking
  await ctx.storage.db.from('events').insert({
    user_id: ctx.apiKey.userId,
    model: ctx.request.model,
    duration_ms: ctx.durationMs,
    input_tokens: ctx.response.usage.inputTokens,
    output_tokens: ctx.response.usage.outputTokens,
  });
}

In v1, post hooks are skipped on streaming responses. Pre hooks still fire, including cache short-circuits, which replay as synthetic streams. Post-hook support for streaming responses is planned for v1.1.

Order of execution

For a non-streaming request through [A, B, C]:


A.init()  ──── (only at boot)
B.init()
C.init()

A.pre() ──▶ B.pre() ──▶ C.pre() ──▶ provider ──▶ A.post() ──▶ B.post() ──▶ C.post()
                                                  (fire-and-forget; client already returned)

Short-circuit at B (returns { continue: false, response }):


A.pre() ──▶ B.pre() (returns response) ──▶ A.post() ──▶ B.post() ──▶ C.post()
                                            ^^^^^^^^^ post still runs on the cached response

C’s pre is skipped, but C’s post still runs: post hooks are side effects, and modules such as caches need to know the request happened.
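The ordering rules above can be summarized in a toy runner. This is a sketch of the documented behavior for non-streaming requests, not the gateway's actual implementation. The names runPipeline, Module, and Ctx are hypothetical, as is the postsSettled promise, which is returned only so the example can be observed. Storing true under the preFailed key is also an assumption.

```typescript
type Ctx = { metadata: Map<string, unknown>; response?: string };
type PreResult = { continue: true } | { continue: false; response: string };

interface Module {
  name: string;
  pre?(ctx: Ctx): Promise<PreResult>;
  post?(ctx: Ctx): Promise<void>;
}

async function runPipeline(
  modules: Module[],
  ctx: Ctx,
  callProvider: (ctx: Ctx) => Promise<string>,
): Promise<{ response: string; postsSettled: Promise<void> }> {
  let response: string | undefined;

  // Pre hooks run in pipeline order until one short-circuits.
  for (const mod of modules) {
    if (!mod.pre) continue;
    try {
      const result = await mod.pre(ctx);
      if (result.continue === false) {
        response = result.response;
        break;
      }
    } catch {
      // A throwing pre hook is skipped and flagged for downstream modules
      // (a real gateway would also log the error).
      ctx.metadata.set(`${mod.name}.preFailed`, true);
    }
  }

  // The provider is called only if no pre hook supplied a response.
  const finalResponse = response ?? (await callProvider(ctx));
  ctx.response = finalResponse;

  // Post hooks run for EVERY module, in order, without blocking the caller.
  const postsSettled = (async () => {
    for (const mod of modules) {
      try {
        await mod.post?.(ctx);
      } catch {
        // Post-hook errors are swallowed here; the client never sees them.
      }
    }
  })();

  return { response: finalResponse, postsSettled };
}
```

With modules [A, B, C] where B short-circuits, this runner calls A.pre and B.pre, skips C.pre and the provider, and then runs all three post hooks.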

Cross-module communication

Use ctx.metadata (a Map<string, unknown>) to pass values between modules:


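// In your "estimator" module
async pre(ctx) {
  // estimateRequestCost is the module's own helper
  const cost = estimateRequestCost(ctx.request);
  // Key is namespaced with the module name to avoid collisions
  ctx.metadata.set('estimator.estimated_cost', cost);
  return { continue: true };
}

// In a downstream "router" module
async pre(ctx) {
  const cost = ctx.metadata.get('estimator.estimated_cost') as number | undefined;
  // Switch to a cheaper model when the estimate exceeds the 0.10 threshold
  if (cost !== undefined && cost > 0.10) ctx.request.model = 'claude-haiku-4-5';
  return { continue: true };
}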

Convention: namespace your keys with the module name (my-module.thing) to avoid collisions.
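Because ctx.metadata stores unknown values, a small pair of helpers can enforce both the naming convention and a runtime type check instead of a bare cast. The helper names metaKey and getNumber are hypothetical; this is one possible approach, not something the gateway provides.

```typescript
// Builds a namespaced key such as "estimator.estimated_cost".
function metaKey(moduleName: string, field: string): string {
  return `${moduleName}.${field}`;
}

// Reads a numeric value, returning undefined when the key is missing or
// another module stored something that is not a finite number.
function getNumber(metadata: Map<string, unknown>, key: string): number | undefined {
  const value = metadata.get(key);
  return typeof value === 'number' && Number.isFinite(value) ? value : undefined;
}

// Usage with a plain Map standing in for ctx.metadata.
const metadata = new Map<string, unknown>();
metadata.set(metaKey('estimator', 'estimated_cost'), 0.25);
const cost = getNumber(metadata, metaKey('estimator', 'estimated_cost'));
```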
