Hermes Agent InfoOps control dashboard
Hermes achievement #13

Test Suite Tamer

#13 · Tool Mastery · unlocked

Finding

Hermes changes often become operational risk when verification is treated as an optional final step instead of a required tool-use ritual after code, config, dashboard, prompt, cron, or skill edits.

Current

A typical Hermes installation may run tests when editing application code, but weaker areas often slip through: cron prompt changes are not smoke-tested, config edits are not validated with the Hermes CLI, dashboard copy changes are not checked for public-safety regressions, and skill edits are not verified against the workflow they describe. This creates a gap where “the task is done” means “the file changed,” not “the system still behaves correctly.”

Suggested

  1. Create a change-type verification matrix. Exact change: Add docs/runbooks/hermes-verification-matrix.md with rows for code, config, dashboard content, cron prompts, skills, memory/Honcho behavior, and gateway-facing changes; each row should name the minimum command, smoke check, or manual verification required before marking the work complete.
  2. Add verification gates to the operator and optimizer prompts. Exact change: Patch SOUL.md or the relevant profile prompt so every Hermes change ends with a short “Verification performed:” line naming the command, smoke check, preview check, or reason a test was not applicable.
  3. Turn recurring smoke checks into a lightweight cron habit. Exact change: Add or update an Optimizer Agent cron prompt named Hermes smoke-check review that runs weekly and reviews recent sessions for changed files, skipped verification, repeated failures, or “works in theory” endings. Restrict it to toolsets such as session_search, skills, and file.
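
The three suggestions above can be sketched in a few lines of Python. This is a minimal illustration, not part of Hermes: the change types are the matrix rows named in item 1, but the check descriptions, the function names, and the idea of scanning session summaries as plain strings are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical matrix: the keys are the change types named in the runbook
# suggestion; the values are placeholder descriptions, not real Hermes commands.
VERIFICATION_MATRIX: Dict[str, str] = {
    "code": "run the project's test suite",
    "config": "validate the config with the Hermes CLI",
    "dashboard content": "preview the page and check for public-safety regressions",
    "cron prompts": "smoke-test the prompt with a manual run",
    "skills": "verify the skill against the workflow it describes",
    "memory/Honcho behavior": "manual verification",
    "gateway-facing changes": "smoke check",
}

MARKER = "Verification performed:"


def required_check(change_type: str) -> str:
    """Return the minimum check for a change type (KeyError if it has no row)."""
    return VERIFICATION_MATRIX[change_type]


def verification_line(summary: str) -> Optional[str]:
    """Return the text after the last marker line, or None if absent or empty."""
    for line in reversed(summary.strip().splitlines()):
        stripped = line.strip()
        if stripped.startswith(MARKER):
            detail = stripped[len(MARKER):].strip()
            return detail or None
    return None


def missing_verification(summaries: List[str]) -> List[int]:
    """Return the indexes of summaries that end without a usable marker line."""
    return [i for i, s in enumerate(summaries) if verification_line(s) is None]
```

A weekly review could call missing_verification on recent session summaries and report the flagged indexes. An empty marker line is treated as missing on purpose, so a bare “Verification performed:” does not pass as verification.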

Impact

This reduces regressions by making verification visible, repeatable, and proportionate to the type of change. It also protects the public Hermes Agent Info site from accidentally exposing internal details after dashboard or content edits. Over time, the installation develops a culture where green checks are not a bonus; they are part of the definition of done.

Effort

Small: the main work is one runbook, one prompt habit, and one lightweight review cron; no new infrastructure is required.

Public page note

Safe public content includes the maturity recommendation, generic verification categories, example runbook names, smoke-check habits, and the principle that Hermes changes should be tested before being considered complete. Internal-only content includes raw test logs, private session excerpts, exact config values, environment variables, credentials, local filesystem dumps, failing production traces, and any verification output that reveals sensitive deployment details.
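
The public/internal split above could be backed by a rough pre-publish scan. The sketch below is a heuristic illustration only: the pattern set, the labels, and the function name are hypothetical, they cover only a few of the internal-only categories listed, and a clean result does not prove a page is safe to publish.

```python
import re
from typing import Dict, List, Pattern

# Hypothetical heuristics: illustrative patterns, deliberately not exhaustive.
INTERNAL_PATTERNS: Dict[str, Pattern[str]] = {
    # Shell-style assignments such as NAME=value
    "environment variable": re.compile(r"\b[A-Z][A-Z0-9_]{2,}=\S+"),
    # Credential-like keywords followed by ':' or '='
    "credential keyword": re.compile(
        r"(?i)\b(api[_-]?key|secret|password|token)\b\s*[:=]"
    ),
    # Common absolute path prefixes on Linux, macOS, and Windows
    "local filesystem path": re.compile(r"(?:/home/|/Users/|/etc/|[A-Za-z]:\\)\S*"),
    # Header line of a Python stack trace
    "stack trace": re.compile(r"Traceback \(most recent call last\)"),
}


def find_internal_details(text: str) -> List[str]:
    """Return the sorted labels of internal-only patterns found in text."""
    return sorted(
        label
        for label, pattern in INTERNAL_PATTERNS.items()
        if pattern.search(text)
    )
```

Any non-empty result would be a reason to hold a dashboard or content edit for manual review before it reaches the public site.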