Hermes operating model

Hermes Achievements

Public-safe operational notes for the 60 Hermes achievements. Each page explains the maturity signal, current gap, concrete improvements, impact, effort, and what can safely be shown publicly.

60 subpages · 44 unlocked · 16 discovered

Hermes Native

#1 · unlocked

Skill Issue? Skill Created.

Hard or repeated Hermes work is often solved once in chat, then lost as context instead of being turned into a durable skill that improves the installation.

#2 · unlocked

Skillsmith

Hermes skills often exist as useful procedures, but they lose operational value when they are not loaded, patched, pruned, and verified as part of normal work.

#3 · unlocked

Memory Keeper

Memory Keeper is weak when durable knowledge is saved opportunistically instead of through a clear memory boundary and review habit.

#4 · unlocked

Memory Palace

Durable memory becomes powerful only when it is curated into a compact operating layer instead of becoming a second chat transcript.

#5 · unlocked

Toolset Cartographer

Hermes tool choice often happens too late, after the agent has already reached for a broad default toolset instead of deliberately selecting the smallest correct…

#6 · unlocked

Config Surgeon

Hermes configuration changes often become risky when provider, gateway, TTS, model, or dashboard edits are made directly without a small-change, backup, and verification…

#14 · unlocked

Rollback Wizard

Risky Hermes changes are often made with confidence in the toolchain, but without a visible checkpoint, rollback path, and post-change verification rule.

#21 · unlocked

Gateway Dweller

Gateway-connected Hermes workflows become fragile when platform behavior is treated as an afterthought instead of a first-class operating rule.

#22 · unlocked

Plugin Goblin

Hermes plugin capability is weak when teams jump to custom builds before checking whether a native plugin, dashboard extension, or small Hermes-native integration…

#23 · unlocked

Context Dragon

Long Hermes research and coding runs become fragile when context management depends on the model surviving a huge transcript instead of deliberately preserving…

Agent Autonomy

#7 · discovered

Cron Necromancer

Recurring Hermes checks are valuable, but they become brittle when cron jobs are created as one-off reminders instead of self-contained operational monitors with clear…

#8 · unlocked

Subagent Commander

Complex Hermes work often collapses research, analysis, QA, and recommendation into one oversized agent run, which weakens evidence quality and makes errors harder to…

#17 · discovered

Full Send

Hermes can solve complete workflows end-to-end, but autonomy is weak when web, file, and terminal work are used opportunistically instead of through a deliberate…

#18 · unlocked

Toolchain Maxxer

Hermes gains autonomy from combining tools, but reliability drops when broad tool use is improvised instead of planned, scoped, and verified.

#19 · discovered

Let Him Cook

Hermes autonomy stays fragile when agents are given freedom without explicit scope, success criteria, stop rules, and a verification boundary.

#20 · unlocked

Autonomous Avalanche

A high lifetime count of Hermes tool calls only proves autonomy maturity when tool usage is measured for purposeful agentic work, not accidental over-tooling.

#60 · unlocked

Background Process Enjoyer

Long-running tests, servers, and watchers become operationally weak when they are started as unmanaged terminal work instead of tracked background processes with…

Debugging Chaos

#9 · unlocked

Actually Read The Logs

Hermes debugging becomes slower and riskier when fixes are proposed before the relevant logs, traces, and execution output have been inspected.

#37 · unlocked

Stack Trace Sommelier

Hermes debugging loses time when tracebacks are skimmed for the loudest error line instead of parsed systematically from failure boundary to root cause.

#38 · unlocked

Red Text Connoisseur

Hermes debugging stays chaotic when every red-text failure is treated as a unique emergency instead of being classified into a repeatable failure taxonomy.

#39 · unlocked

Forgot The Env Var

Auth, provider, gateway, and tool failures become slower to diagnose when Hermes does not have a safe, repeatable way to verify required configuration without exposing…

#40 · unlocked

YAML Colon Incident

Hermes configuration reliability is weak when YAML or JSON edits are treated as text changes instead of validated operational changes.
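
As a rough illustration of treating a config edit as a validated change, the sketch below parses edited JSON text before it is written anywhere. The function name `check_json_config` and the rule that the top-level value must be an object are assumptions made for this example, not documented Hermes behavior. A YAML file would need a third-party parser, which is left out here to keep the sketch standard-library only.

```python
import json


def check_json_config(text: str) -> tuple[bool, str]:
    """Validate edited JSON config text before it replaces the live file.

    Hypothetical helper: the "top level must be an object" rule is an
    assumption for this sketch, not a documented Hermes requirement.
    """
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        # Report where parsing failed so the edit can be corrected.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    if not isinstance(parsed, dict):
        return False, "top-level value must be an object"
    return True, "ok"


if __name__ == "__main__":
    print(check_json_config('{"model": "example"}'))
    print(check_json_config('{"model": "example",}'))
```

Running the check before saving means a stray comma or bracket is caught as a failed validation instead of surfacing later as a startup error.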

#41 · unlocked

Dependency Hell Tourist

Hermes install and build failures become expensive when dependency fixes are discovered interactively but not pinned, verified, and captured as reusable installation…

#42 · unlocked

Permission Denied Any%

Permission failures slow Hermes operations when agents do not have a clear escalation boundary between what Hermes can inspect safely and what requires Codex root, host…

#43 · discovered

The Fix Was Restarting It

Hermes restarts become unreliable when they are used as a generic first fix instead of a controlled reload step tied to configuration, environment, gateway, or…

#44 · discovered

Port 3000 Is Taken

Local Hermes development becomes fragile when dev-server startup assumes the default port is free instead of treating port ownership as a preflight check.
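
One possible shape for a port preflight check is sketched below, using only the Python standard library. The helper names `port_is_free` and `pick_port`, the loopback host, and the fallback range are all assumptions for illustration and do not come from Hermes itself.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": another process owns it.
            return False
    return True


def pick_port(preferred: int = 3000, attempts: int = 10) -> int:
    """Return the first free port in [preferred, preferred + attempts)."""
    for candidate in range(preferred, preferred + attempts):
        if port_is_free(candidate):
            return candidate
    raise RuntimeError(f"no free port in {preferred}..{preferred + attempts - 1}")


if __name__ == "__main__":
    print("dev server can use port", pick_port())
```

This is a preflight check only: another process can still claim the port between the check and the server start, so the server's own bind error should still be handled.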

#45 · discovered

Docker Name Collision

Docker-based Hermes operations become fragile when container creation, restart, and test commands assume names are free instead of checking existing containers first.

Research/Web

#10 · unlocked

Docs Archaeologist

Hermes decisions often become slower and riskier when operators jump straight into building, prompting, or configuration changes without first checking official Hermes…

#11 · unlocked

Citation Goblin

Hermes research can move faster than its evidence trail, leaving recommendations that are hard to verify after the session ends.

#12 · discovered

Rabbit Hole Certified

Broad web discovery can turn into an unbounded research spiral unless Hermes defines the question, source categories, stopping criteria, and preservation path before…

#34 · unlocked

Browser Possession

Browser automation becomes operationally weak when it is used as a fallback for failed scraping instead of a deliberate research and visual verification tool with clear…

Tool Mastery

#13 · unlocked

Test Suite Tamer

Hermes changes often become operational risk when verification is treated as an optional final step instead of a required tool-use ritual after code, config, dashboard,…

#15 · unlocked

File Archaeologist

Hermes decisions become weaker when the agent guesses from conversation context instead of first inspecting the local filesystem, saved artifacts, project files, and…

#16 · discovered

Patch Wizard

Small Hermes maintenance edits become risky when agents rewrite whole files instead of applying narrow, reviewable patches.
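
To make "narrow, reviewable" concrete, the sketch below renders an edit as a unified diff and counts how many lines it touches, using Python's standard `difflib`. The helper names `review_patch` and `changed_line_count`, and the file name `config.txt`, are hypothetical and chosen only for this example.

```python
import difflib


def review_patch(before: str, after: str, name: str = "config.txt") -> str:
    """Return a unified diff of an edit so it can be reviewed before saving."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(diff)


def changed_line_count(patch: str) -> int:
    """Count added and removed lines, skipping the two file header lines."""
    body = patch.splitlines()[2:]
    return sum(1 for line in body if line.startswith(("+", "-")))


if __name__ == "__main__":
    old = "timeout: 30\nretries: 3\nmode: safe\n"
    new = "timeout: 30\nretries: 5\nmode: safe\n"
    patch = review_patch(old, new)
    print(patch)
    print("changed lines:", changed_line_count(patch))
```

A small changed-line count is a cheap signal that an edit stayed narrow; a count close to the file length suggests the file was rewritten rather than patched.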

#35 · unlocked

Screenshot Hunter

Visual proof is weak when Hermes reports that a dashboard, UI, or public page “works” without capturing and reviewing evidence of what a user would actually see.

#36 · unlocked

Terminal Goblin

Hermes loses operational credibility when agents answer status, build, test, or system questions from assumptions instead of verifying live state with terminal evidence.

#54 · discovered

Image Whisperer

Visual work is weak when image generation, screenshot review, diagram QA, and vision analysis are treated as occasional tricks instead of a defined quality gate for…

#55 · unlocked

Voice Of The Machine

Voice and TTS support becomes operationally weak when agents write normal chat-length updates for an audio-first channel instead of designing responses for short,…

Model Lore

#24 · unlocked

Model Sommelier

Model routing is weak when Hermes chooses models by habit instead of matching provider, cost, latency, reliability, and reasoning depth to the actual workflow type.

#25 · unlocked

Provider Polyglot

A multi-provider Hermes setup is fragile when providers exist as fallback names only, without clear roles, verification, and failure drills.

#26 · unlocked

OpenRouter Enjoyer

OpenRouter access improves Hermes experimentation only when it is governed by a routing, fallback, and review policy instead of being used as an unlimited model buffet.

#27 · unlocked

Multi-Model Mage

A Hermes installation can use many model names across history, but the maturity gap is turning that variety into deliberate role-based routing and judged decisions…

#28 · unlocked

Five-Model Flight

Hermes model selection is weak when a single impressive answer is treated as proof of operational fit instead of being compared against a small, representative model…

#29 · unlocked

Model Hopper

Model switching becomes operationally weak when providers and models are changed by instinct instead of being routed, logged, and reviewed as part of a deliberate…

#30 · unlocked

Claude Confidant

Claude-style reasoning is operationally weak when it is used as an occasional model preference instead of a defined specialist lane for high-reasoning work.

#31 · unlocked

Gemini Cartographer

Gemini routing is operationally weak when it is treated as a model experiment instead of a mapped lane for long-context, multimodal, and research-heavy Hermes work.

#32 · discovered

Open Weights Pilgrim

Open-weight model use stays immature when Hermes only treats local or open-weight models as fallback curiosities instead of testing them through a deliberate,…

#33 · unlocked

Codex Conjurer

Codex-style coding escalation becomes risky when Hermes hands off terminal or root-level work without a clear boundary, request format, and verification loop.

Vibe Coding

#46 · discovered

Vibe Architect

Large Hermes feature or dashboard sessions become fragile when agents start editing across many files before writing down the architecture, boundaries, and verification…

#47 · discovered

One More Small Change

Iterative “small changes” become operationally expensive when edits are applied one at a time without batching, scope control, and verification after each coherent…

#48 · discovered

This Was Supposed To Be Quick

Small Hermes tasks become operationally expensive when the agent keeps expanding scope instead of stopping early to re-scope, batch, or defer the new work.

#49 · discovered

Ship First, Ask Later

Fast Hermes prototyping becomes unsafe when “ship first” is not explicitly limited to low-risk experiments with a mandatory QA gate before anything public or operational…

#50 · discovered

Pixel Goblin

Hermes-facing dashboards and public pages lose trust when visual polish depends on ad-hoc screenshot iteration instead of a repeatable frontend review loop.

#51 · discovered

CSS Exorcist

Styling bugs become operational debt when Hermes-facing pages are fixed by broad visual rewrites instead of isolated CSS diagnosis, minimal diffs, and rendered…

#52 · discovered

One Character Fix

A tiny code or prompt edit after major debugging is operationally weak if Hermes records only the final diff and not the reasoning path that proved the root cause.

#53 · unlocked

Rebase Acrobat

Git history surgery becomes operationally dangerous when agents rebase, merge, resolve conflicts, or force-push without a visible branch safety and verification routine.

Lifestyle

#56 · unlocked