Lisa’s daily start page
The front page prioritizes decisions, the most important internal links and the latest research status, without becoming an operations panel.
Today’s answers
Use primary reasoning model for planning/review; route coding prototype builds to allowed test builders.
Use free/low-cost candidates only after tool-use and reliability checks pass.
Do not move live sessions, logs, config/env, OAuth or controls into Agent Info.
Most visited internal links
Model ranking, price, provider notes
Research memory and rejected sources
Hermes vs n8n vs LangGraph
Recently tested / rejected
Useful for compact static prototype code.
Rejected because dashboard is internal.
Track Top 20/ranking changes over time.
Open decisions
- Choose real auth: SSO, basic auth, magic link or reverse proxy.
- Decide first live data source for LLM prices.
- Define write permissions for link additions.
- Pick evaluation cadence for testcenter.
Model intelligence and routing decisions
Use for complex planning, review and synthesis.
Use only after passing tool/reliability tests.
Track Hermes Agent app ranking, Top 20 movement, provider availability.
| Model | Provider | Status | Best use | Cost signal | Reason |
|---|---|---|---|---|---|
| deepseek/deepseek-v4-flash | OpenRouter | builder-tested | Static prototype coding | Low | Allowed coding model for this test. |
| minimax/minimax-m2.7 | OpenRouter | design-tested | Product/IA synthesis | Medium | Good broad planning output. |
| Free candidate pool | Multiple | monitor | Cheap auxiliary tasks | Free/low | Must pass quality/reliability tests first. |
| Rejected/unstable models | Multiple | rejected | Do not route | Variable | Failed tool-use, latency or quality gates. |
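As a rough illustration of how the table could drive routing, the sketch below copies the four rows into a small registry and returns only the entries cleared for use. The `ModelEntry` type, the `routable` helper and the rule that only statuses ending in `-tested` are cleared are assumptions made for this example, not part of the dashboard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    name: str
    provider: str
    status: str
    best_use: str


# Rows copied from the table above; field names are hypothetical.
REGISTRY = [
    ModelEntry("deepseek/deepseek-v4-flash", "OpenRouter", "builder-tested",
               "Static prototype coding"),
    ModelEntry("minimax/minimax-m2.7", "OpenRouter", "design-tested",
               "Product/IA synthesis"),
    ModelEntry("Free candidate pool", "Multiple", "monitor",
               "Cheap auxiliary tasks"),
    ModelEntry("Rejected/unstable models", "Multiple", "rejected",
               "Do not route"),
]


def routable(registry):
    """Return entries cleared for routing.

    Assumption: a status ending in "-tested" means the gates were passed;
    "monitor" and "rejected" entries are held back.
    """
    return [m for m in registry if m.status.endswith("-tested")]


if __name__ == "__main__":
    for entry in routable(REGISTRY):
        print(entry.name, "->", entry.best_use)
```

Under this rule the free candidate pool stays out of routing until its status changes, which matches the note that free or low-cost candidates are used only after the tool-use and reliability checks pass.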
Research memory, not bookmarks
All external sources must be markable as useful, adopted, rejected, monitoring, later or outdated, including why they were used or rejected.
Use for app ranking, Top 20 context and external model ecosystem signals.
Track but do not trust blindly; compare with local tests.
Checked but not used when claims lack repeatable examples.
Relevant for Phase 3 multi-agent evaluation work.
Useful when workflow is visual, repeatable and human-in-the-loop.
Useful when stateful orchestration matters more than simple automation.
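One way to model the research-memory rule is a record that carries one of the six statuses named in this section plus a mandatory reason. This is a minimal sketch: the `SourceStatus` and `SourceRecord` names and the empty-reason check are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class SourceStatus(Enum):
    USEFUL = "useful"
    ADOPTED = "adopted"
    REJECTED = "rejected"
    MONITORING = "monitoring"
    LATER = "later"
    OUTDATED = "outdated"


@dataclass
class SourceRecord:
    title: str
    status: SourceStatus
    reason: str  # why the source was used or rejected

    def __post_init__(self):
        # Assumption: a record without a reason is treated as invalid,
        # since the "why" is what separates research memory from bookmarks.
        if not self.reason.strip():
            raise ValueError("a source record needs a reason")


if __name__ == "__main__":
    record = SourceRecord(
        title="Example external eval",  # hypothetical source
        status=SourceStatus.REJECTED,
        reason="Claims lack repeatable examples.",
    )
    print(record.status.value, "-", record.reason)
```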
Agent & Orchestration
Personas, skills, tools, memory, task planning, iterative loops, prompt/context engineering and specialist roles.
Phase 2
When to use
- Specialist agent when domain is narrow and repeatable.
- Router/reviewer when outputs need validation.
- Simple Hermes task when orchestration is overkill.
Needed data views
Agent type, skills, tools, memory mode, risks, success examples, failure modes.
Tool Dashboard
Tool list, status, tests, guides, dependencies, alternatives, reliability and when-not-to-use guidance.
Phase 2
Status taxonomy
tested, monitor, rejected, revisit
Decision use
Helps Lisa choose native Hermes tool vs external service vs script.
Workflow Dashboard
Research, docs, coding, testing, review, monitoring, publishing, n8n, LangGraph, Hermes-native and multi-agent workflows.
Routing rule
- Hermes: adaptive agent work.
- n8n: repeatable visual workflow.
- LangGraph: stateful orchestration.
- Script: simple deterministic task.
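The routing rule can be sketched as a small decision function. The `Task` flags and the order in which they are checked are assumptions for this example; the rule itself only names which kind of work each option suits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    adaptive: bool = False           # needs agent judgment along the way
    stateful: bool = False           # needs orchestration state across steps
    repeatable_visual: bool = False  # same flow each time, suits a visual editor


def route(task: Task) -> str:
    """Map a task to one of the four options in the routing rule.

    Assumed priority: adaptive work first, then stateful orchestration,
    then repeatable visual workflows; everything else is a plain script.
    """
    if task.adaptive:
        return "Hermes"
    if task.stateful:
        return "LangGraph"
    if task.repeatable_visual:
        return "n8n"
    return "Script"


if __name__ == "__main__":
    print(route(Task(repeatable_visual=True)))  # n8n
    print(route(Task()))                        # Script
```

Falling through to `Script` by default keeps the simplest option as the baseline, so a heavier tool is chosen only when a task explicitly calls for it.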
Workflow record
Inputs, outputs, agents/tools, failure modes, reuse potential and examples.
Testcenter
Roadmap for model, agent, tool, workflow, integration and regression tests.
Result labels
best result, cheapest pass, needs retest, do not use
Benchmark links
External evals must include trust/rejection notes.
Multi-agent / Harness
Planner, worker, reviewer, critic, router-driven development, Kanban coordination and evaluation harnesses.
Worth it when
Parallel research, independent review, complex build/eval loops or specialist isolation are needed.
Overkill when
Single deterministic task, simple script, or no meaningful review boundary.
Info / Knowledge Hub
Guides, tips, cheapest/best/fastest/safest solution, FAQ, glossary, decision trees and reusable checklists.
Guide examples
- Use Hermes when…
- Use n8n when…
- Use LangGraph when…
- Do not overbuild when…
Boundary
Explains decisions; does not expose live controls or raw operational data.
Must never be exposed
Secrets, credentials, raw logs, raw sessions, raw configs, env values, OAuth states, private paths, live controls or runtime operations.
Allowed after login
Curated research, sanitized summaries, decision frameworks, link intelligence, model/provider notes, test summaries and workflow maps.
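The boundary between the two lists can be enforced with an allow-list rather than a block-list, so anything not explicitly permitted is dropped. The sketch below assumes records are flat dictionaries; the field names and the `sanitize` helper are hypothetical.

```python
# Hypothetical field names derived from the "Allowed after login" list.
ALLOWED_FIELDS = frozenset({
    "curated_research",
    "sanitized_summary",
    "decision_framework",
    "link_intelligence",
    "model_provider_notes",
    "test_summary",
    "workflow_map",
})


def sanitize(record: dict) -> dict:
    """Keep only allow-listed fields.

    Unknown keys (env values, raw logs, OAuth state and so on) are dropped
    by default, so a newly added sensitive field cannot leak by omission.
    """
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


if __name__ == "__main__":
    raw = {
        "test_summary": "2 of 3 candidates passed",
        "raw_logs": "...",
        "env": {"API_KEY": "redacted"},
    }
    print(sanitize(raw))
```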
Agent group
Product/IA, Access/Safety, UX Prototype, Builder, Spec Review, Quality Review.
Builder model
deepseek/deepseek-v4-flash via OpenRouter for the build artifact; MiniMax for design roles.
Artifacts saved
Prompts, outputs, model/session info, build files, checksums, review notes and evaluation rubric saved in run directory.
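A checksum manifest for a run directory could be produced roughly as follows. The source does not say which hash or layout is used, so SHA-256, the `checksum_manifest` name and the relative-path keys are assumptions for this example.

```python
import hashlib
from pathlib import Path


def checksum_manifest(run_dir):
    """Map each file under run_dir to its SHA-256 hex digest.

    Keys are POSIX-style paths relative to run_dir, sorted for stable output.
    Assumption: files are small enough to read into memory in one go.
    """
    root = Path(run_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


if __name__ == "__main__":
    import sys

    # Usage: python manifest.py <run directory>
    for name, digest in checksum_manifest(sys.argv[1]).items():
        print(digest, name)
```

Recomputing the manifest later and comparing it with the saved copy gives a simple check that prompts, outputs and build files were not changed after the run.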