Finding
A high lifetime count of Hermes tool calls demonstrates autonomy maturity only when the calls reflect purposeful agentic work, not accidental over-tooling.
Current
A real Hermes installation can accumulate many tool calls across file work, web research, cron reviews, memory updates, delegation, browser sessions, and system checks. The weak point is that raw volume can hide whether Hermes is becoming more self-directed or simply spending more tokens and actions to reach the same outcome. Without a review loop, tool-call growth is hard to interpret: it may indicate better autonomy, but it may also reveal vague prompts, oversized toolsets, repeated errors, or missing skills.
Suggested
- Track tool usage as an autonomy signal, not a vanity metric. Exact change: add an “Autonomy tool usage review” section to the Optimizer Agent cron prompt that asks for patterns in recent sessions: repeated tool sequences, unnecessary browser/terminal escalation, failed tool retries, and cases where Hermes completed work without extra user steering.
- Define healthy versus unhealthy tool-call growth. Exact change: add a short rule to SOUL.md or the main operator runbook: “Healthy tool growth means more verified work completed autonomously; unhealthy growth means repeated errors, broad default toolsets, manual rework, or tool calls that should have become a skill, cron job, or runbook.”
- Convert repeated tool sequences into durable operating assets. Exact change: add a verification habit to the task completion runbook: “When the same 3+ tool sequence appears across sessions, decide whether to create or patch a skill, restrict a cron job’s enabled_toolsets, add a dashboard note, or document the workflow in an internal runbook.”
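
The “same 3+ tool sequence across sessions” check could be approximated with a small script. This is a minimal sketch, not a Hermes feature. It assumes each session can be exported as an ordered list of tool names. The session IDs, tool names, and the repeated_sequences helper are all hypothetical.

```python
from collections import defaultdict


def repeated_sequences(sessions, length=3, min_sessions=2):
    """Return tool sequences of `length` seen in at least `min_sessions` sessions.

    `sessions` maps a session id to the ordered list of tool names it called.
    This input shape is an assumption, not a documented Hermes export format.
    """
    seen_in = defaultdict(set)  # sequence -> ids of sessions containing it
    for session_id, tools in sessions.items():
        # Slide a fixed-size window over the session's tool calls.
        for i in range(len(tools) - length + 1):
            seen_in[tuple(tools[i:i + length])].add(session_id)
    return {
        seq: sorted(ids)
        for seq, ids in seen_in.items()
        if len(ids) >= min_sessions
    }


# Hypothetical sample data; real tool names will differ per installation.
sessions = {
    "s1": ["web_search", "read_file", "write_file", "terminal"],
    "s2": ["read_file", "web_search", "read_file", "write_file"],
    "s3": ["terminal", "terminal", "browser"],
}

candidates = repeated_sequences(sessions)
for seq, ids in candidates.items():
    print(" -> ".join(seq), "appears in sessions", ids)
```

Each sequence it reports is a candidate for the decision in the rule above: create or patch a skill, restrict a toolset, add a note, or document a runbook. The window is fixed at three calls, so a longer repeated run shows up as several overlapping windows rather than one entry.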
Impact
This makes the achievement operationally useful: Hermes is judged by productive autonomy, not just activity. The installation can identify where tool use proves maturity, such as autonomous research, scheduled review, or safe file inspection, while also catching waste from repeated retries and vague prompts. Over time, repeated tool patterns become skills, cron monitors, and better routing rules, reducing token waste and user intervention.
Effort
Small: the change consists of one optimizer review prompt update, one operating rule, and one post-task verification habit. No new infrastructure is required unless the installation later chooses to expose a public-safe aggregate dashboard.
Public page note
Safe public content includes the maturity principle, generic examples of healthy and unhealthy tool-call growth, and recommendations for reviewing tool usage patterns. Internal-only content includes raw tool logs, session transcripts, exact tool-call counts tied to private work, filesystem paths, credentials, environment values, cron IDs, and any live operational controls.