OpenAI: gpt-oss-120b (free)

openai/gpt-oss-120b:free

Denne side er genereret fra Hermes LLM knowledge registry. Den viser status, research, test-evidens og næste testplan. Free-status er midlertidig metadata, ikke sidens formål.

Availability

current_free

Pricing

free

Context

131072

Test results

182

Average score

96.4

Research records

Capabilities

include_reasoningmax_tokensmin_preasoningseedstoptemperaturetool_choicetoolstop_atop_ktop_p

Input modalities: text

Recommended use cases

Reasoning/planning tasks with explicit constraints and step checks
Research synthesis, source comparison, and analyst/critic work
Tool-using agent tasks with evidence-grounded final answers

Skills and prompt patterns

Mini-skill: use tools before claims, cite source/tool IDs, do not invent results, no production authority
Reasoning guardrail: final answer must separate evidence, assumptions, and recommendation
Research skill: cite sources, separate source evidence from inference, flag missing context

Best practices

Use external source evidence before final recommendation; keep local eval evidence authoritative for behavior.
Start with low-temperature deterministic tests before creative tasks
Log provider, returned model, latency, status, score, prompt/skill version, and redacted errors
Persist every result immediately, including bad outcomes
Do not test capabilities that catalog/provider metadata says are unsupported
Prefer native tool/function calling over brittle strict JSON when tool support exists

Test evidence

Status counts

{"200": 177, "503": 5}

Provider counts

{"OpenInference": 177, "unknown": 5}

Bad signals

{"empty_output": 5}

Recent records

Lane	Scenario	Status	Provider	Score
scenario_battery	t1_smoke_danish_exact	200	OpenInference	100.0
scenario_battery	t1_smoke_danish_exact	200	OpenInference	100.0
scenario_battery	t1_smoke_danish_exact	200	OpenInference	100.0
scenario_battery	t1_smoke_danish_exact	200	OpenInference	100.0
scenario_battery	t1_smoke_danish_exact	200	OpenInference	100.0
scenario_battery	t1_smoke_danish_exact	200	OpenInference	100.0
scenario_battery	t1_smoke_danish_exact	200	OpenInference	100.0
scenario_battery	t1_smoke_danish_exact	200	OpenInference	100.0

Next test plan

researchsmokemodel_specific_optimaltools_skillsworkflow_langgraph_mockfinal_recommendation_pagevalidator_loopreasoning_planningresearch_synthesis

Skipped lanes:

strict_structured_output: catalog_supported_parameters_missing_structured_outputs

Research sources

External source enrichment present: yes

https://openrouter.ai/openai/gpt-oss-120b

status 200; gpt-oss-120b - API Pricing & Benchmarks | OpenRouter

https://openrouter.ai/openai/gpt-oss-120b%3Afree

status 200; gpt-oss-120b (free) - API Pricing & Benchmarks | OpenRouter

https://openrouter.ai/models/openai/gpt-oss-120b

status 200; gpt-oss-120b - API Pricing & Benchmarks | OpenRouter

https://huggingface.co/openai/gpt-oss-120b

status 200; openai/gpt-oss-120b · Hugging Face

Recommendation status

final ready: Ready for public final recommendation for the listed roles, subject to normal availability monitoring.

Recommended roles

research_synthesisstructured_agent_tasksvalidator_or_criticprimary_candidate_for_sandbox_retest

Risks and caveats

reviewed_soft: empty_output (5)
policy: free_status_is_transient (1)