Incidents¶
The most heavily used section of the dashboard. It covers three routes: the incident list, the incident detail, and the execution detail with its live log.
List¶
Route: /incidents
Role gating: none.
Searchable, filterable table of every incident in the tenant.
Columns¶
| Column | Notes |
|---|---|
| Severity | critical / high / medium / low / info |
| Type | service_down, disk_full, cpu_high, memory_high, port_unavailable, custom, etc. |
| Server | Hostname; links to the server detail |
| Status | open → classifying → recipe_proposed → awaiting_approval → executing → resolved (or failed / escalated) |
| Occurrences | Dedup counter for repeat alerts |
| Assigned | User or agent name |
| Source | daemon, webhook, manual, proactive |
| Created | Absolute timestamp |
| Resolution timer | Live ticker; turns red on SLA breach |
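The Status column describes a linear lifecycle with two alternative end states. A minimal sketch of that lifecycle as a transition lookup follows; the status names come from the table, while the function name and the rule that `failed` and `escalated` are reachable from any non-terminal status are assumptions made for illustration, not documented behavior.

```python
# Happy-path order, taken from the Status column.
HAPPY_PATH = [
    "open",
    "classifying",
    "recipe_proposed",
    "awaiting_approval",
    "executing",
    "resolved",
]

# End states listed in the table; nothing follows them in this sketch.
TERMINAL = {"resolved", "failed", "escalated"}


def next_statuses(current: str) -> set[str]:
    """Return the statuses an incident could move to next (hypothetical helper)."""
    if current in TERMINAL:
        return set()
    idx = HAPPY_PATH.index(current)  # raises ValueError for unknown statuses
    # Assumption: failed / escalated can be entered from any non-terminal status.
    return {HAPPY_PATH[idx + 1], "failed", "escalated"}
```

For example, `next_statuses("executing")` includes `resolved`, while any terminal status yields an empty set.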
Filters¶
- Status, severity, source.
- Free-text search across hostname, type, evidence.
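The filter behavior can be sketched as a single predicate over an incident record. The `Incident` dataclass and `matches` function below are hypothetical; the sketch assumes that filters combine with AND and that free-text search is a case-insensitive substring match over hostname, type, and the JSON-serialized evidence. The dashboard's actual matching rules may differ.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Incident:
    # Hypothetical record shape; field names mirror the list columns.
    hostname: str
    type: str
    severity: str
    status: str
    source: str
    evidence: dict = field(default_factory=dict)


def matches(
    incident: Incident,
    *,
    status: Optional[str] = None,
    severity: Optional[str] = None,
    source: Optional[str] = None,
    query: str = "",
) -> bool:
    """Return True if the incident passes every filter that is set."""
    if status is not None and incident.status != status:
        return False
    if severity is not None and incident.severity != severity:
        return False
    if source is not None and incident.source != source:
        return False
    if query:
        # Assumption: search is a case-insensitive substring match.
        haystack = " ".join(
            [incident.hostname, incident.type, json.dumps(incident.evidence)]
        ).lower()
        return query.lower() in haystack
    return True
```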
Actions¶
- Create incident manually. Modal: pick a server, type, severity, and initial evidence. Setting type to `custom` triggers an informational query: the agent runs the requested check fresh and resolves with the output.
- Delete an incident.
- Click a row → incident detail.
Incident detail¶
Route: /incidents/{id}
Role gating: none for read; approve / reject requires admin.
Header¶
- Severity and status badges.
- Live SLA timer.
- Assignment control. The incident may be assigned to a human user or to an AI agent; the Assign Agent button starts the agent pipeline immediately.
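The live SLA timer can be read as elapsed time since creation compared against a target. The sketch below assumes the SLA policy reduces to a single resolution target duration and that the clock starts at incident creation; the `sla_state` function is hypothetical, and real policies may vary by severity or pause in some statuses.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def sla_state(
    created_at: datetime,
    sla_target: timedelta,
    now: Optional[datetime] = None,
) -> Tuple[timedelta, bool]:
    """Return (elapsed time, breached?) for an incident (hypothetical helper)."""
    if now is None:
        now = datetime.now(timezone.utc)
    elapsed = now - created_at
    # Assumption: a breach means elapsed time strictly exceeds the target.
    return elapsed, elapsed > sla_target
```

A UI would call this on every tick and switch the timer to red once the second value becomes true.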
Tabs¶
- Timeline. The agent's full reasoning trace: every stage handoff (triage → diagnose → execute → review), every tool call with its arguments and output, and every event recorded by the agent. This is the audit trail for what the agent did and why.

  Timeline entries from OpenRemedy Guardian appear with a shield icon. Three types may be present:

  - Advisory (Hook A): appears at incident creation, either as "OpenRemedy Guardian: pass" or with a named severity such as "OpenRemedy Guardian: high". Advisory entries do not change the risk level.
  - Risk elevation (Hook B): appears when Guardian raises the recipe's effective risk before the approval gate, for example "Guardian raised risk low→high".
  - Comment flag (Hook C): appears when a human comment contains destructive intent at medium severity or higher, for example "Guardian flagged a human instruction: high".
- Evidence. JSON dump of all evidence collected on the incident (monitor output, daemon report, alert payload, anything the agent observed).
- Executions. List of recipe executions tied to this incident. Pending executions show Approve and Reject buttons; approval starts the playbook immediately. Each row links to the execution detail.
- Report. Post-mortem RCA generated by the review agent on resolved incidents.
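The Guardian risk elevation entry (Hook B) on the Timeline tab can be illustrated with a small sketch. The risk ordering, the `effective_risk` function, and the rule that Guardian can only raise risk and never lower it are assumptions for illustration; only the message format is taken from the example text above.

```python
from typing import Optional, Tuple

# Assumed ordering of risk levels, lowest to highest.
RISK_ORDER = ["low", "medium", "high", "critical"]


def effective_risk(
    recipe_risk: str, guardian_risk: Optional[str]
) -> Tuple[str, Optional[str]]:
    """Return (effective risk, timeline message or None); hypothetical helper."""
    if guardian_risk is None:
        return recipe_risk, None
    if RISK_ORDER.index(guardian_risk) <= RISK_ORDER.index(recipe_risk):
        # Assumption: Guardian never lowers the recipe's own risk level.
        return recipe_risk, None
    return guardian_risk, f"Guardian raised risk {recipe_risk}→{guardian_risk}"
```

In this sketch, a timeline entry is produced only when the effective risk actually changes.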
Review section¶
Resolved incidents show a Review Agent Performance panel that lets operators score the diagnosis and remediation quality. Reviews feed into agent learning over time.
Execution detail¶
Route: /executions/{id}
Role gating: none for read; rollback requires admin.
Reached from the Executions tab on an incident.
Sections¶
- Metadata. Recipe, server, parent incident, who approved it, timestamps.
- Live Output. WebSocket-fed stdout from the running playbook (`/ws/executions/{id}`).
- Playbook Output. Per-Ansible-task summary with status (`ok` / `changed` / `failed`), expandable stdout/stderr, and return code.
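A client that wants to follow Live Output outside the dashboard first needs the WebSocket URL. The sketch below derives it from the dashboard's base URL using the `/ws/executions/{id}` path given above. It assumes the WebSocket endpoint is served from the same host as the dashboard; the `live_output_url` function and the example hostname are hypothetical, and authentication and the message format are not covered.

```python
from urllib.parse import urlsplit, urlunsplit


def live_output_url(dashboard_url: str, execution_id: str) -> str:
    """Build the live-output WebSocket URL for an execution (hypothetical helper)."""
    parts = urlsplit(dashboard_url)
    # Use the secure WebSocket scheme when the dashboard itself is on HTTPS.
    scheme = "wss" if parts.scheme == "https" else "ws"
    path = f"/ws/executions/{execution_id}"
    return urlunsplit((scheme, parts.netloc, path, "", ""))


# Example with a placeholder hostname.
url = live_output_url("https://remedy.example.com", "42")
```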
Actions¶
- Rollback — only enabled when the execution succeeded and the recipe defines a rollback playbook. Re-runs the rollback against the same target.
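The rollback conditions combine with the role gating stated for this route. A minimal sketch of that check follows; the `Execution` dataclass, the `"succeeded"` status string, and the `can_rollback` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Execution:
    # Hypothetical record shape for this sketch.
    status: str
    rollback_playbook: Optional[str]  # None when the recipe defines no rollback


def can_rollback(execution: Execution, user_role: str) -> bool:
    """Return True if the Rollback action should be available."""
    return (
        user_role == "admin"  # rollback requires admin
        and execution.status == "succeeded"  # assumed status value
        and bool(execution.rollback_playbook)
    )
```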
Related routes¶
- servers.md: incidents are scoped to a server
- recipes.md: executions reference a recipe
- agents.md: assignments reference an agent
- sla.md: the resolution timer compares against an SLA policy