final ready Current free model
OpenAI: gpt-oss-120b (free)
openai/gpt-oss-120b:free
Denne side er genereret fra Hermes LLM knowledge registry. Den viser status, research, test-evidens og næste testplan. Free-status er midlertidig metadata, ikke sidens formål.
Capabilities
include_reasoningmax_tokensmin_preasoningseedstoptemperaturetool_choicetoolstop_atop_ktop_p
Input modalities: text
Recommended use cases
- Reasoning/planning tasks with explicit constraints and step checks
- Research synthesis, source comparison, and analyst/critic work
- Tool-using agent tasks with evidence-grounded final answers
Skills and prompt patterns
- Mini-skill: use tools before claims, cite source/tool IDs, do not invent results, no production authority
- Reasoning guardrail: final answer must separate evidence, assumptions, and recommendation
- Research skill: cite sources, separate source evidence from inference, flag missing context
Best practices
- Use external source evidence before final recommendation; keep local eval evidence authoritative for behavior.
- Start with low-temperature deterministic tests before creative tasks
- Log provider, returned model, latency, status, score, prompt/skill version, and redacted errors
- Persist every result immediately, including bad outcomes
- Do not test capabilities that catalog/provider metadata says are unsupported
- Prefer native tool/function calling over brittle strict JSON when tool support exists
Test evidence
Status counts
{"200": 177, "503": 5}
Provider counts
{"OpenInference": 177, "unknown": 5}
Bad signals
{"empty_output": 5}
Recent records
| Lane | Scenario | Status | Provider | Score | Signal |
| scenario_battery | t1_smoke_danish_exact | 200 | OpenInference | 100.0 | |
| scenario_battery | t1_smoke_danish_exact | 200 | OpenInference | 100.0 | |
| scenario_battery | t1_smoke_danish_exact | 200 | OpenInference | 100.0 | |
| scenario_battery | t1_smoke_danish_exact | 200 | OpenInference | 100.0 | |
| scenario_battery | t1_smoke_danish_exact | 200 | OpenInference | 100.0 | |
| scenario_battery | t1_smoke_danish_exact | 200 | OpenInference | 100.0 | |
| scenario_battery | t1_smoke_danish_exact | 200 | OpenInference | 100.0 | |
| scenario_battery | t1_smoke_danish_exact | 200 | OpenInference | 100.0 | |
Next test plan
researchsmokemodel_specific_optimaltools_skillsworkflow_langgraph_mockfinal_recommendation_pagevalidator_loopreasoning_planningresearch_synthesis
Skipped lanes:- strict_structured_output: catalog_supported_parameters_missing_structured_outputs
Research sources
External source enrichment present: yes
Recommendation status
final ready: Ready for public final recommendation for the listed roles, subject to normal availability monitoring.
Recommended roles
research_synthesisstructured_agent_tasksvalidator_or_criticprimary_candidate_for_sandbox_retest
Risks and caveats
- reviewed_soft: empty_output (5)
- policy: free_status_is_transient (1)