Hermes Achievements
Public-safe operational notes for the 60 Hermes achievements. Each page explains the maturity signal, current gap, concrete improvements, impact, effort, and what can safely be shown publicly.
Hermes Native
Skill Issue? Skill Created.
Hard or repeated Hermes work is often solved once in chat, then lost as context instead of being turned into a durable skill that improves the installation.
#2 · unlocked · Skillsmith
Hermes skills often exist as useful procedures, but they lose operational value when they are not loaded, patched, pruned, and verified as part of normal work.
#3 · unlocked · Memory Keeper
Memory Keeper is weak when durable knowledge is saved opportunistically instead of through a clear memory boundary and review habit.
#4 · unlocked · Memory Palace
Durable memory becomes powerful only when it is curated into a compact operating layer instead of becoming a second chat transcript.
#5 · unlocked · Toolset Cartographer
Hermes tool choice often happens too late, after the agent has already reached for a broad default toolset instead of deliberately selecting the smallest correct…
#6 · unlocked · Config Surgeon
Hermes configuration changes often become risky when provider, gateway, TTS, model, or dashboard edits are made directly without a small-change, backup, and verification…
#14 · unlocked · Rollback Wizard
Risky Hermes changes are often made with confidence in the toolchain, but without a visible checkpoint, rollback path, and post-change verification rule.
#21 · unlocked · Gateway Dweller
Gateway-connected Hermes workflows become fragile when platform behavior is treated as an afterthought instead of a first-class operating rule.
#22 · unlocked · Plugin Goblin
Hermes plugin capability is weak when teams jump to custom builds before checking whether a native plugin, dashboard extension, or small Hermes-native integration…
#23 · unlocked · Context Dragon
Long Hermes research and coding runs become fragile when context management depends on the model surviving a huge transcript instead of deliberately preserving…
Agent Autonomy
Cron Necromancer
Recurring Hermes checks are valuable, but they become brittle when cron jobs are created as one-off reminders instead of self-contained operational monitors with clear…
#8 · unlocked · Subagent Commander
Complex Hermes work often collapses research, analysis, QA, and recommendation into one oversized agent run, which weakens evidence quality and makes errors harder to…
#17 · discovered · Full Send
Hermes can solve complete workflows end-to-end, but autonomy is weak when web, file, and terminal work are used opportunistically instead of through a deliberate…
#18 · unlocked · Toolchain Maxxer
Hermes gains autonomy from combining tools, but reliability drops when broad tool use is improvised instead of planned, scoped, and verified.
#19 · discovered · Let Him Cook
Hermes autonomy stays fragile when agents are given freedom without an explicit scope, success criteria, stop rules, and verification boundary.
#20 · unlocked · Autonomous Avalanche
A high lifetime count of Hermes tool calls only proves autonomy maturity when tool usage is measured for purposeful agentic work, not accidental over-tooling.
#60 · unlocked · Background Process Enjoyer
Long-running tests, servers, and watchers become operationally weak when they are started as unmanaged terminal work instead of tracked background processes with…
Debugging Chaos
Actually Read The Logs
Hermes debugging becomes slower and riskier when fixes are proposed before the relevant logs, traces, and execution output have been inspected.
#37 · unlocked · Stack Trace Sommelier
Hermes debugging loses time when tracebacks are skimmed for the loudest error line instead of parsed systematically from failure boundary to root cause.
#38 · unlocked · Red Text Connoisseur
Hermes debugging stays chaotic when every red-text failure is treated as a unique emergency instead of being classified into a repeatable failure taxonomy.
#39 · unlocked · Forgot The Env Var
Auth, provider, gateway, and tool failures become slower to diagnose when Hermes does not have a safe, repeatable way to verify required configuration without exposing…
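A minimal sketch of one way such a check could look, assuming the goal is to confirm that required variables are present without ever printing their values. The function name `check_required_env` and the variable names in the usage note are hypothetical illustrations, not part of Hermes.

```python
import os


def check_required_env(names, environ=None):
    """Report whether each variable is set, empty, or missing.

    Only the status is returned, never the value, so the report is
    safe to paste into a chat or a log.
    """
    env = os.environ if environ is None else environ
    report = {}
    for name in names:
        value = env.get(name)
        if value is None:
            report[name] = "missing"
        elif value.strip() == "":
            report[name] = "empty"
        else:
            report[name] = "set"
    return report
```

For example, `check_required_env(["EXAMPLE_PROVIDER_KEY", "EXAMPLE_GATEWAY_TOKEN"])` (both names hypothetical) returns a dictionary of statuses that can be shared without leaking a credential.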
#40 · unlocked · YAML Colon Incident
Hermes configuration reliability is weak when YAML or JSON edits are treated as text changes instead of validated operational changes.
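One way to treat a config edit as a validated change is to parse the new text before it replaces the old file. The sketch below covers JSON only, using the Python standard library; a YAML variant would need a third-party parser. The helper names `validate_json_text` and `write_config_if_valid` are hypothetical, and nothing here reflects how Hermes itself stores configuration.

```python
import json
import os
import tempfile


def validate_json_text(text):
    """Return (True, parsed) if text is valid JSON, else (False, message)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"


def write_config_if_valid(path, text):
    """Write text to path only if it parses; otherwise keep the old file."""
    ok, result = validate_json_text(text)
    if not ok:
        return False, result
    # Write to a temp file in the same directory, then swap it in,
    # so a crash mid-write cannot leave a half-written config behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    os.replace(tmp_path, path)
    return True, "written"
```

A rejected edit returns the parser's line and column, which is usually enough to locate a stray colon or trailing comma before anything is restarted.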
#41 · unlocked · Dependency Hell Tourist
Hermes install and build failures become expensive when dependency fixes are discovered interactively but not pinned, verified, and captured as reusable installation…
#42 · unlocked · Permission Denied Any%
Permission failures slow Hermes operations when agents do not have a clear escalation boundary between what Hermes can inspect safely and what requires Codex root, host…
#43 · discovered · The Fix Was Restarting It
Hermes restarts become unreliable when they are used as a generic first fix instead of a controlled reload step tied to configuration, environment, gateway, or…
#44 · discovered · Port 3000 Is Taken
Local Hermes development becomes fragile when dev-server startup assumes the default port is free instead of treating port ownership as a preflight check.
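A port preflight can be sketched with the standard `socket` module: try to bind the port, and treat a failure as "already owned". The function names and the 20-port scan window below are illustrative assumptions. The check is also inherently racy, since another process can take the port between the probe and the real server start.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


def first_free_port(start=3000, attempts=20, host="127.0.0.1"):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    return None
```

Calling `first_free_port()` before starting a dev server gives either a usable port or `None`, which is a clearer failure than an address-in-use error halfway through startup.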
#45 · discovered · Docker Name Collision
Docker-based Hermes operations become fragile when container creation, restart, and test commands assume names are free instead of checking existing containers first.
Research/Web
Docs Archaeologist
Hermes decisions often become slower and riskier when operators jump straight into building, prompting, or configuration changes without first checking official Hermes…
#11 · unlocked · Citation Goblin
Hermes research can move faster than its evidence trail, leaving recommendations that are hard to verify after the session ends.
#12 · discovered · Rabbit Hole Certified
Broad web discovery can turn into an unbounded research spiral unless Hermes defines the question, source categories, stopping criteria, and preservation path before…
#34 · unlocked · Browser Possession
Browser automation becomes operationally weak when it is used as a fallback for failed scraping instead of a deliberate research and visual verification tool with clear…
Tool Mastery
Test Suite Tamer
Hermes changes often become operational risk when verification is treated as an optional final step instead of a required tool-use ritual after code, config, dashboard,…
#15 · unlocked · File Archaeologist
Hermes decisions become weaker when the agent guesses from conversation context instead of first inspecting the local filesystem, saved artifacts, project files, and…
#16 · discovered · Patch Wizard
Small Hermes maintenance edits become risky when agents rewrite whole files instead of applying narrow, reviewable patches.
#35 · unlocked · Screenshot Hunter
Visual proof is weak when Hermes reports that a dashboard, UI, or public page “works” without capturing and reviewing evidence of what a user would actually see.
#36 · unlocked · Terminal Goblin
Hermes loses operational credibility when agents answer status, build, test, or system questions from assumptions instead of verifying live state with terminal evidence.
#54 · discovered · Image Whisperer
Visual work is weak when image generation, screenshot review, diagram QA, and vision analysis are treated as occasional tricks instead of a defined quality gate for…
#55 · unlocked · Voice Of The Machine
Voice and TTS support becomes operationally weak when agents write normal chat-length updates for an audio-first channel instead of designing responses for short,…
Model Lore
Model Sommelier
Model routing is weak when Hermes chooses models by habit instead of matching provider, cost, latency, reliability, and reasoning depth to the actual workflow type.
#25 · unlocked · Provider Polyglot
A multi-provider Hermes setup is fragile when providers exist as fallback names only, without clear roles, verification, and failure drills.
#26 · unlocked · OpenRouter Enjoyer
OpenRouter access improves Hermes experimentation only when it is governed by a routing, fallback, and review policy instead of being used as an unlimited model buffet.
#27 · unlocked · Multi-Model Mage
A Hermes installation can use many model names across history, but the maturity gap is turning that variety into deliberate role-based routing and judged decisions…
#28 · unlocked · Five-Model Flight
Hermes model selection is weak when a single impressive answer is treated as proof of operational fit instead of being compared against a small, representative model…
#29 · unlocked · Model Hopper
Model switching becomes operationally weak when providers and models are changed by instinct instead of being routed, logged, and reviewed as part of a deliberate…
#30 · unlocked · Claude Confidant
Claude-style reasoning is operationally weak when it is used as an occasional model preference instead of a defined specialist lane for high-reasoning work.
#31 · unlocked · Gemini Cartographer
Gemini routing is operationally weak when it is treated as a model experiment instead of a mapped lane for long-context, multimodal, and research-heavy Hermes work.
#32 · discovered · Open Weights Pilgrim
Open-weight model use stays immature when Hermes only treats local or open-weight models as fallback curiosities instead of testing them through a deliberate,…
#33 · unlocked · Codex Conjurer
Codex-style coding escalation becomes risky when Hermes hands off terminal or root-level work without a clear boundary, request format, and verification loop.
Vibe Coding
Vibe Architect
Large Hermes feature or dashboard sessions become fragile when agents start editing across many files before writing down the architecture, boundaries, and verification…
#47 · discovered · One More Small Change
Iterative “small changes” become operationally expensive when edits are applied one at a time without batching, scope control, and verification after each coherent…
#48 · discovered · This Was Supposed To Be Quick
Small Hermes tasks become operationally expensive when the agent keeps expanding scope instead of stopping early to re-scope, batch, or defer the new work.
#49 · discovered · Ship First, Ask Later
Fast Hermes prototyping becomes unsafe when “ship first” is not explicitly limited to low-risk experiments with a mandatory QA gate before anything public or operational…
#50 · discovered · Pixel Goblin
Hermes-facing dashboards and public pages lose trust when visual polish depends on ad-hoc screenshot iteration instead of a repeatable frontend review loop.
#51 · discovered · CSS Exorcist
Styling bugs become operational debt when Hermes-facing pages are fixed by broad visual rewrites instead of isolated CSS diagnosis, minimal diffs, and rendered…
#52 · discovered · One Character Fix
A tiny code or prompt edit after major debugging is operationally weak if Hermes records only the final diff and not the reasoning path that proved the root cause.
#53 · unlocked · Rebase Acrobat
Git history surgery becomes operationally dangerous when agents rebase, merge, resolve conflicts, or force-push without a visible branch safety and verification routine.
Lifestyle
Marathon Operator
High session volume proves Hermes is being used seriously, but it becomes an operational maturity signal only when session count is tied to learning extraction, drift…
#57 · unlocked · Cache Hit Appreciator
Hermes loses avoidable speed and cost gains when long prompts, cron prompts, and reusable task instructions change shape unnecessarily between runs.
#58 · unlocked · Night Shift Operator
Hermes can reveal unhealthy or inefficient operating rhythms, but the signal is wasted if late-night usage is treated as normal activity instead of a scheduling and…
#59 · unlocked · Weekend Warrior
Weekend Hermes usage is useful as an adoption signal, but it becomes unhealthy when repeated weekend work reveals automation gaps that should have been moved into…