
Human-in-the-loop

Real production agents stop and ask a human at decision points: approve this wire transfer, confirm this medical recommendation, sign off on this generated contract. The wait can be seconds, hours, or days. Mesedi models the pause as a first-class lifecycle state so that detectors do not falsely trip a time budget while a human is thinking, and it captures the eventual response as a structured event that the dashboard and downstream detectors can read.

The execution lifecycle

An execution is always in one of three lifecycle categories:

started — the agent is actively working
awaiting_human — the agent is paused waiting on a human decision
terminal — completed, crashed, halted, timeout, or validation_failed

Valid transitions are started → awaiting_human (pause), awaiting_human → started (resume), and either of those states to any terminal status. Mesedi accumulates the wall-clock time spent in awaiting_human in the execution's total_paused_ms field and increments pause_count on each pause cycle. It subtracts the paused time from time-budget checks, so a long HITL wait does not falsely trip a time_budget alert.
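The transition rules and paused-time accounting described above can be sketched as a small state machine. This is an illustrative model only, not Mesedi's implementation: the Execution class, its methods, and over_time_budget are hypothetical names, and closing an open pause when an execution goes terminal is an assumption. Only the status names and the total_paused_ms and pause_count fields come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Terminal statuses as listed in the lifecycle section.
TERMINAL = {"completed", "crashed", "halted", "timeout", "validation_failed"}


@dataclass
class Execution:
    """Hypothetical in-memory model of one execution's lifecycle."""
    started_at_ms: int
    status: str = "started"
    paused_at_ms: Optional[int] = None
    total_paused_ms: int = 0
    pause_count: int = 0

    def pause(self, now_ms: int) -> None:
        # started -> awaiting_human
        if self.status != "started":
            raise ValueError(f"cannot pause from {self.status!r}")
        self.status = "awaiting_human"
        self.paused_at_ms = now_ms
        self.pause_count += 1

    def resume(self, now_ms: int) -> None:
        # awaiting_human -> started
        if self.status != "awaiting_human":
            raise ValueError(f"cannot resume from {self.status!r}")
        self._close_pause(now_ms)
        self.status = "started"

    def finish(self, terminal_status: str, now_ms: int) -> None:
        # started or awaiting_human -> any terminal status
        if terminal_status not in TERMINAL:
            raise ValueError(f"unknown terminal status {terminal_status!r}")
        if self.status in TERMINAL:
            raise ValueError("execution is already terminal")
        if self.status == "awaiting_human":
            self._close_pause(now_ms)  # assumption: an open pause is closed here
        self.status = terminal_status

    def _close_pause(self, now_ms: int) -> None:
        self.total_paused_ms += now_ms - self.paused_at_ms
        self.paused_at_ms = None

    def active_ms(self, now_ms: int) -> int:
        """Wall-clock time minus all paused time, including an open pause."""
        paused = self.total_paused_ms
        if self.paused_at_ms is not None:
            paused += now_ms - self.paused_at_ms
        return (now_ms - self.started_at_ms) - paused


def over_time_budget(execution: Execution, now_ms: int, budget_ms: int) -> bool:
    """Hypothetical time-budget check that ignores time spent awaiting a human."""
    return execution.active_ms(now_ms) > budget_ms
```

In this model, an execution that works for one second, waits an hour for a human, then works for one more second reports two seconds of active time, so a 60-second budget is not exceeded.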

Pausing and resuming (low-level)

For raw lifecycle control, the SDK exposes thin wrappers around the PATCH endpoint:

# Python
import mesedi

@mesedi.wrap
def my_agent(input):
    # do some work...
    mesedi.pause_for_human()
    # host application blocks here while waiting on the human...
    mesedi.resume_for_agent()
    # do remaining work...

// TypeScript
import { wrap, pauseForHuman, resumeForAgent } from "mesedi";

export const myAgent = wrap(async (input) => {
  // do some work...
  await pauseForHuman();
  // host application waits on the human here...
  await resumeForAgent();
  // do remaining work...
});

Both helpers issue a synchronous PATCH instead of going through the async shipper, because the lifecycle transition must be committed before the host application blocks or releases the agent. Each helper flushes the shipper first so that the initial POST /executions for a freshly wrapped execution has landed; otherwise the synchronous PATCH could race the asynchronous POST and return 404.
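The ordering problem can be illustrated with a toy model. Everything in this sketch is hypothetical (FakeServer, Shipper, and this standalone pause_for_human are stand-ins, not SDK internals); it only shows why a PATCH sent before the queued POST has been delivered finds no execution to update.

```python
from collections import deque


class FakeServer:
    """Hypothetical stand-in for the API: executions must exist before a PATCH."""

    def __init__(self):
        self.executions = {}

    def post_execution(self, execution_id):
        self.executions[execution_id] = "started"

    def patch_status(self, execution_id, status):
        if execution_id not in self.executions:
            return 404  # the create has not landed yet
        self.executions[execution_id] = status
        return 200


class Shipper:
    """Hypothetical async shipper: creates are queued and delivered later."""

    def __init__(self, server):
        self._server = server
        self._pending = deque()

    def enqueue_create(self, execution_id):
        self._pending.append(execution_id)

    def flush(self):
        # Deliver every queued create before returning.
        while self._pending:
            self._server.post_execution(self._pending.popleft())


def pause_for_human(server, shipper, execution_id, flush_first=True):
    """Toy version of the helper: flush queued creates, then PATCH synchronously."""
    if flush_first:
        shipper.flush()
    return server.patch_status(execution_id, "awaiting_human")
```

Calling the toy helper with flush_first=False right after enqueue_create returns 404; with the default flush it returns 200 and the execution is recorded as awaiting_human.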

Capturing the full ask/answer cycle

For production HITL workflows the higher-level helpers are usually what you want. They emit a structured human_intervention event capturing the question, the SLA, the response kind, the response payload, and who decided.

# Python
import mesedi

@mesedi.wrap
def banking_agent(transfer_request):
    # prepare the transfer...
    handle = mesedi.request_human_intervention(
        question="Approve $50,000 wire transfer to account 1234?",
        sla_seconds=3600,                   # 1-hour SLA
        metadata={
            "transfer_id": "tx_demo_42",
            "amount_usd": 50000,
            "destination_account": "1234",
        },
    )
    # host application persists `handle.to_dict()` and blocks here.
    # when the human responds via your UI / webhook / queue:
    handle.complete(
        response_kind="approved",          # or "rejected" / "edited" / "timeout" / "cancelled"
        response_payload={"approver": "alice@example.com"},
        decided_by="alice@example.com",
    )
    # continue with the transfer...

// TypeScript
import {
  wrap,
  requestHumanIntervention,
  completeHumanIntervention,
} from "mesedi";

export const bankingAgent = wrap(async (transferRequest) => {
  const handle = await requestHumanIntervention(
    "Approve $50,000 wire transfer to account 1234?",
    {
      slaSeconds: 3600,
      metadata: {
        transferId: "tx_demo_42",
        amountUsd: 50000,
        destinationAccount: "1234",
      },
    },
  );
  // host application persists handle and resumes from the response handler.
  // when the human responds:
  await completeHumanIntervention(handle, {
    responseKind: "approved",
    responsePayload: { approver: "alice@example.com" },
    decidedBy: "alice@example.com",
  });
  // continue with the transfer...
});

The handle returned by request_human_intervention is JSON-serializable (Python via handle.to_dict() / HumanInterventionHandle.from_dict(); TypeScript natively), so it can survive a round-trip through Redis, Kafka, or a SQL row without losing correlation data.
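As a rough sketch of the SQL-row case, the snippet below stores a handle dict as JSON text in SQLite and reads it back. The table name, column names, and helper functions are hypothetical, and the example dict only stands in for whatever handle.to_dict() actually returns; its keys are placeholders, not the real handle schema.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open a SQLite database with a hypothetical table for pending HITL handles."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending_hitl ("
        "key TEXT PRIMARY KEY, handle_json TEXT NOT NULL)"
    )
    return conn


def save_handle(conn, key, handle_dict):
    """Persist the dict produced by handle.to_dict() under an application key."""
    conn.execute(
        "INSERT OR REPLACE INTO pending_hitl (key, handle_json) VALUES (?, ?)",
        (key, json.dumps(handle_dict)),
    )
    conn.commit()


def load_handle(conn, key):
    """Return the stored dict, ready to pass to HumanInterventionHandle.from_dict()."""
    row = conn.execute(
        "SELECT handle_json FROM pending_hitl WHERE key = ?", (key,)
    ).fetchone()
    if row is None:
        raise KeyError(key)
    return json.loads(row[0])


# Placeholder payload: the real keys are whatever handle.to_dict() emits.
example_handle_dict = {"placeholder_id": "demo", "sla_seconds": 3600}
conn = open_store()
save_handle(conn, "tx_demo_42", example_handle_dict)
restored = load_handle(conn, "tx_demo_42")
```

In a real response handler, the loaded dict would be passed to HumanInterventionHandle.from_dict() and complete() would be called on the result.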

Well-known response_kind values (recognized by the downstream HITL detectors):

approved — happy path, the human said yes
rejected — the human said no
edited — the human modified the agent's output before approving
timeout — the host application gave up waiting
cancelled — operator killed the wait

hitl_timeout

Fires when a human_intervention event indicates the human side of your loop dropped or stalled the request. Two firing conditions:

explicit — response_kind == "timeout": the host application gave up waiting before a human responded. Signature hitl_timeout:explicit.
sla_exceeded — a human responded, but wait_duration_ms > sla_seconds * 1000. The agent proceeded with the answer, but the SLA was breached. Signature hitl_timeout:sla_exceeded.

If both conditions fire on the same execution, explicit takes precedence over sla_exceeded (the stronger signal wins).
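The two firing conditions and their precedence can be sketched as follows. This is an illustration of the rules as stated, not the detector's actual code: the function names and the dict-shaped events are assumptions, and the sketch treats any non-timeout response_kind as "a human responded". The field names response_kind, wait_duration_ms, and sla_seconds are taken from the text.

```python
from typing import Iterable, Mapping, Optional

EXPLICIT = "hitl_timeout:explicit"
SLA_EXCEEDED = "hitl_timeout:sla_exceeded"


def classify_event(event: Mapping) -> Optional[str]:
    """Classify one human_intervention event, or return None if it is healthy."""
    if event["response_kind"] == "timeout":
        return EXPLICIT
    sla_seconds = event.get("sla_seconds")
    if sla_seconds is not None and event["wait_duration_ms"] > sla_seconds * 1000:
        return SLA_EXCEEDED
    return None


def hitl_timeout_signature(events: Iterable[Mapping]) -> Optional[str]:
    """Return the signature for one execution; explicit wins over sla_exceeded."""
    found = {classify_event(event) for event in events}
    if EXPLICIT in found:
        return EXPLICIT
    if SLA_EXCEEDED in found:
        return SLA_EXCEEDED
    return None
```

For example, an approval that arrives after 4,000 seconds against a 3,600-second SLA yields hitl_timeout:sla_exceeded, and adding a timeout event to the same execution changes the result to hitl_timeout:explicit.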

hitl_rejection_spike

A cross-execution signal that detects agent quality regressions from human verdicts. It fires when, in the last hour, at least five distinct executions had human_intervention events and either the fraction with rejected verdicts exceeds 40 percent or the fraction with edited verdicts exceeds 30 percent.

Two signatures: hitl_rejection_spike:rejected (humans saying NO outright) and hitl_rejection_spike:edited (humans modifying outputs before approving). The former is a stronger negative signal than the latter; both indicate the agent's outputs need correction.

The detector runs only when the current execution itself recorded a human_intervention event, so it never fires on traffic that did not involve HITL in the first place.
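A minimal sketch of the threshold logic, under stated assumptions: the caller has already filtered events to the last hour, fractions are computed over distinct executions (the text does not say whether the denominator is executions or events), and both signatures may be returned together. The function name and input shape are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def rejection_spike_signatures(
    window_events: Iterable[Tuple[str, str]],  # (execution_id, response_kind)
    current_execution_id: str,
    min_executions: int = 5,
    rejected_threshold: float = 0.40,
    edited_threshold: float = 0.30,
) -> List[str]:
    """Return the hitl_rejection_spike signatures that fire for this window."""
    kinds_by_execution: Dict[str, Set[str]] = {}
    for execution_id, response_kind in window_events:
        kinds_by_execution.setdefault(execution_id, set()).add(response_kind)

    # Only run when the current execution itself recorded a human_intervention.
    if current_execution_id not in kinds_by_execution:
        return []

    total = len(kinds_by_execution)
    if total < min_executions:
        return []

    rejected = sum(1 for kinds in kinds_by_execution.values() if "rejected" in kinds)
    edited = sum(1 for kinds in kinds_by_execution.values() if "edited" in kinds)

    signatures = []
    if rejected / total > rejected_threshold:  # strictly greater than 40 percent
        signatures.append("hitl_rejection_spike:rejected")
    if edited / total > edited_threshold:  # strictly greater than 30 percent
        signatures.append("hitl_rejection_spike:edited")
    return signatures
```

With five executions in the window, three rejections fire the rejected signature, while exactly two (40 percent) do not, because the threshold must be exceeded rather than met.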

Dashboard rendering

The execution detail page renders an accent-orange HITL banner whenever the execution is currently paused OR has accumulated paused time on previous cycles. Currently-paused executions show the paused_at timestamp; completed executions with past pause cycles show pause_count and total wait duration.

The events timeline color-codes human_intervention events by verdict: rejections and timeouts in danger red, approvals in success green, edits and cancellations in accent orange. Expanding the row reveals the full ask/answer detail including the wait duration, SLA, decided_by, and the response_payload.

What's next?

Failure classes and playbooks for the rest of the detector catalog.

Multi-agent topology and handoffs for the parent/child execution graph (HITL executions show up in the topology view too).

API reference for the lifecycle PATCH wire format.