HITL (Human-in-the-Loop)

HITL (Human-in-the-Loop) is a pattern where a Graph pauses execution synchronously at designated points to wait for human input, then resumes based on the human's response.

⚠️ HITL vs Agent Handoff​

HITL and Agent Handoff are fundamentally different patterns:

| Aspect | HITL | Agent Handoff |
| --- | --- | --- |
| Graph State | Paused (WAITING) | Continues/Completes |
| Wait Mode | Synchronous wait | Asynchronous transfer |
| Decision Maker | Graph designer | Agent itself |
| Resume Method | Resume API | New Comm |
| Use Case | Approval workflows | Chatbot→Agent escalation |

// HITL: Graph pauses and waits
graph("approval") {
agent("draft", draftAgent)
humanNode("approve", "Approve?") // 🛑 Graph pauses here
agent("publish", publishAgent)
}

// Handoff: Agent decides and transfers, Graph continues
class SmartAgent : Agent {
override suspend fun processComm(comm: Comm): SpiceResult<Comm> {
if (needsHuman(comm)) {
return handoff(comm) // 🔄 Transfer to human, Graph continues
}
return processNormally(comm)
}
}

Agent Handoff is available as of Spice 0.5.0; see the Agent Handoff documentation.

Core Components​

1. HumanNode​

A special node type that pauses graph execution:

humanNode(
id = "review",
prompt = "Please review the draft",
options = listOf(
HumanOption("approve", "Approve", "Approve and continue"),
HumanOption("reject", "Reject", "Reject and rewrite")
),
timeout = Duration.ofMinutes(30), // Optional
validator = { response -> // Optional
response.selectedOption != null
}
)
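
To make the validator's role more concrete, here is a minimal, self-contained sketch of a stricter check than the null test above: it accepts a response only when the selected option is one the node actually offered. `SimpleOption`, `SimpleResponse`, and `optionValidator` are hypothetical stand-ins defined only for this illustration. They are not Spice types, and the real `HumanOption` and `HumanResponse` classes may differ.

```kotlin
// Hypothetical stand-ins for illustration only; these are NOT the Spice classes.
data class SimpleOption(val id: String, val label: String)
data class SimpleResponse(val nodeId: String, val selectedOption: String? = null)

// Builds a validator that accepts only responses whose selected option
// matches one of the ids offered by the node.
fun optionValidator(options: List<SimpleOption>): (SimpleResponse) -> Boolean {
    val allowedIds = options.map { it.id }.toSet()
    return { response ->
        val selected = response.selectedOption
        selected != null && selected in allowedIds
    }
}
```

A validator of this shape rejects both a missing selection and an unknown option id, which is usually what an approval step with fixed options wants.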

2. DynamicHumanNode (Added in 0.8.0)​

DynamicHumanNode reads its prompt text from NodeContext at runtime, so agents can generate prompts dynamically from their processing results instead of relying on a static prompt.

dynamicHumanNode(
id = "select-reservation",
promptKey = "menu_text", // Reads from ctx.state["menu_text"] or ctx.context["menu_text"]
fallbackPrompt = "Please make a selection",
options = emptyList(), // Optional predefined options
timeout = Duration.ofMinutes(10)
)

Key Differences from HumanNode​

| Feature | HumanNode | DynamicHumanNode |
| --- | --- | --- |
| Prompt Source | Static (compile-time) | Dynamic (runtime from NodeContext) |
| Use Case | Fixed approval prompts | Agent-generated menus/messages |
| Flexibility | Limited to predefined text | Adapts to execution results |

Example: Agent-Generated Reservation Menu​

val workflowGraph = graph("reservation-workflow") {
// Agent lists reservations and stores menu in state
agent("list-reservations", listAgent)

// DynamicHumanNode reads menu from state["menu_text"]
dynamicHumanNode(
id = "select-reservation",
promptKey = "menu_text",
fallbackPrompt = "Please make a selection"
)

// Agent processes user selection
agent("cancel-reservation", cancelAgent)
}

// In the listAgent:
class ListReservationsAgent : Agent {
override suspend fun processComm(comm: Comm): SpiceResult<Comm> {
val reservations = fetchReservations(comm)

// Generate dynamic menu
val menuText = buildString {
appendLine("Which reservation would you like to select?")
appendLine()
reservations.forEachIndexed { index, res ->
appendLine("${index + 1}. ${res.name} | ${res.checkIn} | ${res.checkOut}")
}
}

// Store menu and data in comm for next node
return SpiceResult.success(
comm.reply(
content = "Found ${reservations.size} reservations",
from = id,
data = mapOf(
"menu_text" to menuText,
"reservations_json" to reservations.toJson(),
"reservations_count" to reservations.size.toString()
)
)
)
}
}

Prompt Resolution Order​

DynamicHumanNode checks the following sources in priority order:

  1. ctx.state[promptKey] - Direct state updates from previous nodes
  2. ctx.context.get(promptKey) - Metadata from AgentNode (via comm.data)
  3. fallbackPrompt - Default if key not found

Checking both state and context means the prompt still resolves after a checkpoint resume, whether the value was written directly to state or carried in through comm.data.
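
The lookup order above can be sketched as a small pure function. This is a hypothetical helper, not Spice source code: `resolvePrompt` and its plain-map parameters are assumptions made for illustration. It also assumes that a null value counts as "not found" and that non-string values are converted with `toString()`.

```kotlin
// Hypothetical helper mirroring the documented lookup order; NOT Spice source code.
// Assumptions: null values count as missing, and non-string values are stringified.
fun resolvePrompt(
    state: Map<String, Any?>,
    context: Map<String, Any?>,
    promptKey: String,
    fallbackPrompt: String
): String =
    state[promptKey]?.toString()          // 1. direct state update wins
        ?: context[promptKey]?.toString() // 2. then metadata carried via comm.data
        ?: fallbackPrompt                 // 3. otherwise the static fallback
```

With this shape, a value present in both maps resolves to the state entry, and the fallback is used only when neither map has the key.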

Checkpoint Resume Support​

DynamicHumanNode works seamlessly with checkpointing:

// Turn 1: Agent generates menu, graph pauses
val pausedReport = runner.runWithCheckpoint(
graph = workflowGraph,
input = mapOf("userId" to "user123"),
store = checkpointStore
).getOrThrow()

// Checkpoint saves:
// - state["menu_text"] = "1. Hotel A\n2. Hotel B\n..."
// - context["menu_text"] = "1. Hotel A\n2. Hotel B\n..."
// - context["reservations_json"] = "[{...}, {...}]"

// Turn 2: Resume with user selection
val finalReport = runner.resumeWithHumanResponse(
graph = workflowGraph,
checkpointId = pausedReport.checkpointId!!,
response = HumanResponse.text("select-reservation", "1"),
store = checkpointStore
).getOrThrow()

// DynamicHumanNode restores menu_text from checkpoint
// Agent accesses reservations_json from restored context

3. HumanResponse​

Human's input after interaction:

// Multiple choice response
val choiceResponse = HumanResponse.choice(
nodeId = "review",
optionId = "approve"
)

// Free text response
val textResponse = HumanResponse.text(
nodeId = "feedback",
text = "Please add more details to section 3"
)

// Response with metadata (Added in 0.8.1) ⭐ NEW
val metadataResponse = HumanResponse(
nodeId = "select-reservation",
selectedOption = "option-2",
metadata = mapOf(
"selected_item_id" to "RSV002",
"selected_item_name" to "Hotel California",
"user_notes" to "Need early check-in"
)
)

Metadata Propagation (0.8.1+) ⭐ NEW​

HumanResponse includes a metadata field that automatically propagates to ExecutionContext when resuming from a checkpoint. This ensures the next node (especially AgentNode) can access user selection data.

How It Works:

// Step 1: Agent generates data and pauses at HumanNode
val listAgent = object : Agent {
override suspend fun processComm(comm: Comm): SpiceResult<Comm> {
val items = fetchItems()
return SpiceResult.success(
comm.reply(
"Found ${items.size} items",
id,
data = mapOf(
"items_json" to items.toJson(),
"session_id" to "SESSION123"
)
)
)
}
}

// Step 2: Graph pauses, checkpoint saves agent data
val pausedResult = runner.runWithCheckpoint(
graph, input, store
).getOrThrow()

// Checkpoint contains:
// - context["items_json"] = "[{...}, {...}]"
// - context["session_id"] = "SESSION123"

// Step 3: Resume with HumanResponse containing metadata
val response = HumanResponse(
nodeId = "select-item",
selectedOption = "item-2",
metadata = mapOf(
"selected_id" to "ITEM002",
"user_comment" to "Looks good!"
)
)

val finalResult = runner.resumeWithHumanResponse(
graph, pausedResult.checkpointId!!, response, store
).getOrThrow()

// Step 4: Next AgentNode receives ALL context
val processAgent = object : Agent {
override suspend fun processComm(comm: Comm): SpiceResult<Comm> {
// ✅ Access original agent data
val itemsJson = comm.context?.get("items_json")
val sessionId = comm.context?.get("session_id")

// ✅ Access HumanResponse metadata
val selectedId = comm.context?.get("selected_id")
val userComment = comm.context?.get("user_comment")

// Process with complete context!
return processItem(selectedId, itemsJson, userComment)
}
}

Key Benefits:

  • ✅ Zero manual data passing between nodes
  • ✅ Complete context preservation across checkpoint/resume
  • ✅ Type-safe access to user input via ExecutionContext
  • ✅ Works seamlessly with multi-agent workflows

Under the Hood:

When resumeWithHumanResponse() is called, the framework automatically:

  1. Restores ExecutionContext from checkpoint
  2. Merges HumanResponse.metadata into ExecutionContext
  3. Passes merged context to next node
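
The merge step can be pictured with plain maps. The sketch below is a hypothetical model, not the framework's implementation: `mergeResumeContext` is an illustrative name, and the rule that response metadata overrides a restored value on a key collision is an assumption that the text above does not confirm.

```kotlin
// Hypothetical model of the merge step; NOT Spice source code.
// Assumption: on a key collision, the human's metadata overrides the restored value.
fun mergeResumeContext(
    restoredContext: Map<String, Any>,
    responseMetadata: Map<String, Any>
): Map<String, Any> =
    // Map.plus keeps every entry from the left map and lets the right map win on duplicates.
    restoredContext + responseMetadata
```

If your workflow relies on a particular collision behavior, it is safer to use distinct key names for agent data and response metadata than to depend on merge order.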

See Context API Documentation for more details.

4. Graph Execution States​

enum class GraphExecutionState {
RUNNING, // Normal execution
WAITING_FOR_HUMAN, // Paused for human input
COMPLETED, // Finished successfully
FAILED, // Failed with error
CANCELLED // Cancelled
}

Usage Examples​

1. Basic Approval Workflow​

val approvalGraph = graph("approval-workflow") {
agent("draft", draftAgent) // Create draft

// Human reviews and approves/rejects
humanNode(
id = "review",
prompt = "Please review the draft",
options = listOf(
HumanOption("approve", "Approve", "Approve draft and continue"),
HumanOption("reject", "Reject", "Reject draft and rewrite")
)
)

// Conditional branching based on human response
edge("review", "publish") { result ->
(result.data as? HumanResponse)?.selectedOption == "approve"
}
edge("review", "draft") { result ->
(result.data as? HumanResponse)?.selectedOption == "reject"
}

agent("publish", publishAgent)
}

val runner = DefaultGraphRunner()
val checkpointStore = InMemoryCheckpointStore()

// Step 1: Start graph execution (pauses at HumanNode)
val initialResult = runner.runWithCheckpoint(
graph = approvalGraph,
input = mapOf("content" to "Initial draft"),
store = checkpointStore
).getOrThrow()

// Verify graph paused
println("Status: ${initialResult.status}") // PAUSED
val interaction = initialResult.result as HumanInteraction
println("Prompt: ${interaction.prompt}") // "Please review the draft"

// Step 2: Get pending interactions
val pending = runner.getPendingInteractions(
checkpointId = initialResult.checkpointId!!,
store = checkpointStore
).getOrThrow()

println("Waiting for: ${pending.first().prompt}")

// Step 3: Human provides response
val humanResponse = HumanResponse.choice(
nodeId = "review",
optionId = "approve"
)

// Step 4: Resume execution
val finalResult = runner.resumeWithHumanResponse(
graph = approvalGraph,
checkpointId = initialResult.checkpointId!!,
response = humanResponse,
store = checkpointStore
).getOrThrow()

println("Final status: ${finalResult.status}") // SUCCESS

2. Free Text Input​

val feedbackGraph = graph("collect-feedback") {
agent("explain", explainerAgent)

// Get free text input from human
humanNode(
id = "get-feedback",
prompt = "Please provide your detailed feedback"
// No options = free text input mode
)

agent("process", processorAgent)
}

// ... execute and pause ...

// Human provides free text
val response = HumanResponse.text(
nodeId = "get-feedback",
text = "The explanation is clear, but please add examples for edge cases."
)

// Resume with text input
val result = runner.resumeWithHumanResponse(
graph = feedbackGraph,
checkpointId = checkpointId,
response = response,
store = checkpointStore
).getOrThrow()

3. Timeout Handling​

val urgentApprovalGraph = graph("urgent-approval") {
agent("create-request", requestAgent)

// Human must respond within 30 minutes
humanNode(
id = "urgent-review",
prompt = "URGENT: Approve within 30 minutes",
timeout = Duration.ofMinutes(30),
options = listOf(
HumanOption("approve", "Approve"),
HumanOption("reject", "Reject")
)
)

// Handle timeout
edge("urgent-review", "auto-reject") { result ->
// Timeout results in null response
result.data == null
}
edge("urgent-review", "approved") { result ->
(result.data as? HumanResponse)?.selectedOption == "approve"
}

agent("auto-reject", autoRejectAgent)
agent("approved", approvedAgent)
}

4. Multiple Sequential Approvals​

val multiApprovalGraph = graph("multi-stage-approval") {
agent("draft", draftAgent)

humanNode(
id = "technical-review",
prompt = "Technical review",
options = listOf(HumanOption("ok", "Approve"))
)

humanNode(
id = "legal-review",
prompt = "Legal review",
options = listOf(HumanOption("ok", "Approve"))
)

humanNode(
id = "executive-review",
prompt = "Executive approval",
options = listOf(HumanOption("ok", "Approve"))
)

agent("publish", publishAgent)
}

// First pause - technical review
val techPause = runner.runWithCheckpoint(graph, input, store).getOrThrow()
val techResume = runner.resumeWithHumanResponse(
graph, techPause.checkpointId!!,
HumanResponse.choice("technical-review", "ok"),
store
).getOrThrow()

// Second pause - legal review
val legalResume = runner.resumeWithHumanResponse(
graph, techResume.checkpointId!!,
HumanResponse.choice("legal-review", "ok"),
store
).getOrThrow()

// Third pause - executive review
val finalResult = runner.resumeWithHumanResponse(
graph, legalResume.checkpointId!!,
HumanResponse.choice("executive-review", "ok"),
store
).getOrThrow()
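
When the number of approval stages is not fixed, the repeated resume calls above can be folded into a loop. The following is a self-contained sketch of that control flow only. `SketchStatus`, `SketchReport`, and `runUntilDone` are hypothetical stand-ins rather than Spice types, and the `start` and `resume` lambdas stand in for the runner calls, which are suspend functions in the real API.

```kotlin
// Hypothetical stand-ins for illustration only; these are NOT Spice types.
enum class SketchStatus { PAUSED, SUCCESS }

data class SketchReport(
    val status: SketchStatus,
    val pendingNodeId: String? = null,
    val checkpointId: String? = null
)

// Starts a run and keeps resuming it until it is no longer paused.
// maxPauses guards against a graph that never leaves the paused state.
fun runUntilDone(
    start: () -> SketchReport,
    resume: (checkpointId: String, nodeId: String) -> SketchReport,
    maxPauses: Int = 10
): SketchReport {
    var report = start()
    var pauses = 0
    while (report.status == SketchStatus.PAUSED) {
        check(pauses < maxPauses) { "Exceeded $maxPauses pauses" }
        val checkpointId = checkNotNull(report.checkpointId) { "Paused report without checkpointId" }
        val nodeId = checkNotNull(report.pendingNodeId) { "Paused report without pending node" }
        report = resume(checkpointId, nodeId)
        pauses++
    }
    return report
}
```

Each iteration uses the checkpoint id from the most recent report, which matches the pattern in the example above where every resume call takes the id returned by the previous step.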

5. Conditional Branching​

val reviewGraph = graph("conditional-review") {
agent("analyze", analyzeAgent)

humanNode(
id = "decision",
prompt = "Choose next action",
options = listOf(
HumanOption("approve", "Approve as-is"),
HumanOption("revise", "Request revision"),
HumanOption("reject", "Reject completely")
)
)

// Three different paths based on human choice
edge("decision", "publish") { result ->
(result.data as? HumanResponse)?.selectedOption == "approve"
}
edge("decision", "revise") { result ->
(result.data as? HumanResponse)?.selectedOption == "revise"
}
edge("decision", "archive") { result ->
(result.data as? HumanResponse)?.selectedOption == "reject"
}

agent("publish", publishAgent)
agent("revise", reviseAgent)
agent("archive", archiveAgent)
}
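
To reason about which path a response takes, it can help to model conditional edges as plain data. The sketch below is a simplified, hypothetical model and not the Spice implementation: `ConditionalEdge` and `nextNode` are illustrative names, and the "first matching edge wins" rule is an assumption.

```kotlin
// Hypothetical, simplified model of conditional edges; NOT the Spice implementation.
data class ConditionalEdge(
    val from: String,
    val to: String,
    val condition: (selectedOption: String?) -> Boolean
)

// Returns the target of the first edge leaving `from` whose condition holds,
// or null when no edge matches (for example, when the selection is null).
// Assumption: edges are evaluated in declaration order and the first match wins.
fun nextNode(edges: List<ConditionalEdge>, from: String, selectedOption: String?): String? =
    edges.firstOrNull { it.from == from && it.condition(selectedOption) }?.to

// The three branches from the example above, expressed in this model.
val decisionEdges = listOf(
    ConditionalEdge("decision", "publish") { it == "approve" },
    ConditionalEdge("decision", "revise") { it == "revise" },
    ConditionalEdge("decision", "archive") { it == "reject" }
)
```

In this model a response that matches no edge yields null, so it is worth checking that the conditions on a node's outgoing edges cover every option the node offers.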

Integration with AgentContext​

HITL automatically preserves AgentContext across pause/resume:

withAgentContext(
userId = "user-123",
tenantId = "company-abc",
sessionId = "session-xyz"
) {
// Start graph - context is saved in checkpoint
val pausedResult = runner.runWithCheckpoint(
graph = approvalGraph,
input = mapOf("document" to "Draft v1"),
store = checkpointStore
).getOrThrow()

// ... later, when resuming ...
// Context is automatically restored
val finalResult = runner.resumeWithHumanResponse(
graph = approvalGraph,
checkpointId = pausedResult.checkpointId!!,
response = HumanResponse.choice("review", "approve"),
store = checkpointStore
).getOrThrow()

// All nodes after resume still have the same AgentContext
}

API Reference​

GraphRunner Methods​

interface GraphRunner {
/**
* Resume execution after receiving human response.
*/
suspend fun resumeWithHumanResponse(
graph: Graph,
checkpointId: String,
response: HumanResponse,
store: CheckpointStore
): SpiceResult<RunReport>

/**
* Get pending human interactions from a checkpoint.
*/
suspend fun getPendingInteractions(
checkpointId: String,
store: CheckpointStore
): SpiceResult<List<HumanInteraction>>
}

RunReport​

When a graph pauses for human input:

data class RunReport(
val graphId: String,
val status: RunStatus, // PAUSED when waiting for human
val result: Any?, // HumanInteraction when paused
val duration: Duration,
val nodeReports: List<NodeReport>,
val error: Throwable? = null,
val checkpointId: String? = null // Set when status is PAUSED
)

HumanInteraction​

Information about pending human input:

data class HumanInteraction(
val nodeId: String,
val prompt: String,
val options: List<HumanOption>,
val pausedAt: String, // ISO-8601 timestamp
val expiresAt: String? = null, // ISO-8601 timestamp (if timeout set)
val allowFreeText: Boolean = false
)
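
Because `pausedAt` and `expiresAt` are ISO-8601 strings, a caller that polls pending interactions may want to compare them against the current time. The helpers below are a hypothetical sketch using `java.time` and are not part of the Spice API. They assume the timestamps are UTC instants in the `2025-01-01T10:00:00Z` form; other ISO-8601 variants may need a different parser.

```kotlin
import java.time.Duration
import java.time.Instant

// Hypothetical helpers; NOT part of the Spice API.
// Assumption: timestamps are UTC instants such as "2025-01-01T10:00:00Z".

// Computes the expiry timestamp for an interaction paused at `pausedAt`.
fun expiryFor(pausedAt: String, timeout: Duration): String =
    Instant.parse(pausedAt).plus(timeout).toString()

// True when the interaction has a deadline and `now` has reached or passed it.
// An interaction without a deadline (expiresAt == null) never expires.
fun isExpired(expiresAt: String?, now: Instant = Instant.now()): Boolean =
    expiresAt != null && !now.isBefore(Instant.parse(expiresAt))
```

Passing `now` explicitly keeps the check deterministic in tests; production code can rely on the default.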

Use Cases​

  1. Document Approval Workflow: Draft → Review → Approve/Reject → Publish
  2. Data Validation: AI analysis → Human verification → Final decision
  3. Risky Operation Approval: Request generation → Manager approval → Execution
  4. Collaborative Workflow: AI suggestion → Human modification → AI reprocessing
  5. Quality Control: Automated check → Human inspection → Release
  6. Multi-stage Approval: Technical → Legal → Executive approvals

Best Practices​

✅ Do

  • Use HITL when the graph designer determines where human input is needed
  • Save the checkpointId from the paused report so you can resume later
  • Check status == RunStatus.PAUSED to detect a HITL pause
  • Use getPendingInteractions() to get interaction details
  • Validate human responses before resuming (optional)
  • Set reasonable timeouts for time-sensitive approvals
  • Use descriptive prompts and option labels

❌ Don't​

  • Don't use HITL when the agent should decide to escalate (use Agent Handoff instead)
  • Don't lose the checkpointId; you need it to resume
  • Don't assume the graph will complete without checking status
  • Don't ignore timeout requirements for urgent workflows

Next Steps​