gpt-oss-120b

Model Overview

OpenAI’s GPT-OSS models (gpt-oss-120b and gpt-oss-20b) are open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. gpt-oss-120b targets production, general-purpose, high-reasoning use cases and fits on a single 80GB GPU (such as NVIDIA H100 or AMD MI300X).

  • Model Architecture: 117B total parameters with 5.1B active parameters. The model was trained on the harmony response format and must be used with that format; it will not work correctly otherwise.
  • Model Source: openai/gpt-oss-120b
  • License: Apache 2.0 license. Build freely without copyleft restrictions or patent risk—ideal for experimentation, customization, and commercial deployment.
  • Configurable reasoning effort: Easily adjust the reasoning effort (low, medium, high) based on your specific use case and latency needs.
  • Full chain-of-thought: Gain complete access to the model’s reasoning process, facilitating easier debugging and increased trust in outputs. The chain of thought is not intended to be shown to end users.
  • Fine-tunable: Fully customize models to your specific use case through parameter fine-tuning.
  • Agentic capabilities: Use the models’ native capabilities for function calling, web browsing, Python code execution, and Structured Outputs.
  • Native MXFP4 quantization: The models were post-trained with MXFP4 quantization of the MoE weights, which lets gpt-oss-120b run on a single 80GB GPU (such as NVIDIA H100 or AMD MI300X).
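
To see why MXFP4 makes a single 80GB device plausible, a back-of-the-envelope estimate helps. The sketch below assumes the usual MXFP4 layout (4-bit elements with one shared 8-bit scale per block of 32 elements) and, as a simplification, treats every parameter as MXFP4 even though only the MoE weights are quantized that way. It counts weights only, so KV cache and activations are not included. The function names are illustrative and not part of any SDK.

```python
# Rough weight-memory estimate for gpt-oss-120b.
# Assumption: all parameters are stored as MXFP4. In practice only the MoE
# weights are, so the real footprint differs somewhat from this figure.

TOTAL_PARAMS = 117e9   # total parameters, from the model overview
ELEMENT_BITS = 4       # each MXFP4 element is a 4-bit float
SCALE_BITS = 8         # one shared 8-bit scale per block
BLOCK_SIZE = 32        # number of elements sharing one scale


def mxfp4_bits_per_param() -> float:
    """Effective bits per parameter, including the amortized block scale."""
    return ELEMENT_BITS + SCALE_BITS / BLOCK_SIZE


def weight_gigabytes(params: float, bits_per_param: float) -> float:
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


mxfp4_gb = weight_gigabytes(TOTAL_PARAMS, mxfp4_bits_per_param())
bf16_gb = weight_gigabytes(TOTAL_PARAMS, 16)
print(f"MXFP4: ~{mxfp4_gb:.0f} GB, BF16: ~{bf16_gb:.0f} GB")
# prints "MXFP4: ~62 GB, BF16: ~234 GB"
```

Under these assumptions the weights take roughly 62 GB, which leaves some headroom on an 80GB device. The same weights in 16-bit precision would need about 234 GB.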

QPC Configurations

| Precision | SoCs / Tensor slicing | NSP-Cores (per SoC) | Batch Size | Chunking Prompt Length | Context Length (CL) | Generated URL |
|---|---|---|---|---|---|---|
| MXFP6 | 4 | 16 | 1 | 1 | 8192 | https://qualcom-qpc-models.s3-accelerate.amazonaws.com/SDK1.20.2/openai/gpt-oss-120b/qpc_120b_ts4_cl8k_bs1_1spec_decode.tar.gz |
| MXFP6 | 8 | 16 | 1 | 1 | 8192 | https://qualcom-qpc-models.s3-accelerate.amazonaws.com/SDK1.20.2/openai/gpt-oss-120b/qpc_120b_ts8_cl8k_bs1_1spec_decode.tar.gz |
| MXFP6 | 4 | 16 | 1 | 1 | 4096 | https://qualcom-qpc-models.s3-accelerate.amazonaws.com/SDK1.20.2/openai/gpt-oss-120b/qpc_120b_ts4_cl4k_bs1_1spec_decode.tar.gz |
| MXFP6 | 8 | 16 | 1 | 1 | 4096 | https://qualcom-qpc-models.s3-accelerate.amazonaws.com/SDK1.20.2/openai/gpt-oss-120b/qpc_120b_ts8_cl4k_bs1_1spec_decode.tar.gz |
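
The configurations listed above differ only in the number of SoCs and the context length, so choosing one can be scripted. The following is a minimal sketch and not an official tool: `QpcConfig`, `pick_qpc`, and `download_and_extract` are hypothetical names, and the selection policy (fewest SoCs first, then the smallest context length that is large enough) is an assumption you may want to change. The URLs are copied from the listing; the download helper uses only the Python standard library.

```python
import tarfile
import urllib.request
from dataclasses import dataclass
from typing import Optional

BASE_URL = (
    "https://qualcom-qpc-models.s3-accelerate.amazonaws.com/"
    "SDK1.20.2/openai/gpt-oss-120b"
)


@dataclass(frozen=True)
class QpcConfig:
    socs: int            # SoCs / tensor-slicing degree
    context_length: int  # context length (CL) in tokens
    url: str             # archive URL from the listing


QPC_CONFIGS = [
    QpcConfig(4, 8192, f"{BASE_URL}/qpc_120b_ts4_cl8k_bs1_1spec_decode.tar.gz"),
    QpcConfig(8, 8192, f"{BASE_URL}/qpc_120b_ts8_cl8k_bs1_1spec_decode.tar.gz"),
    QpcConfig(4, 4096, f"{BASE_URL}/qpc_120b_ts4_cl4k_bs1_1spec_decode.tar.gz"),
    QpcConfig(8, 4096, f"{BASE_URL}/qpc_120b_ts8_cl4k_bs1_1spec_decode.tar.gz"),
]


def pick_qpc(available_socs: int, min_context: int) -> Optional[QpcConfig]:
    """Pick a config that fits the available SoCs and covers min_context.

    Hypothetical policy: prefer fewer SoCs, then the smaller context length.
    Returns None when no listed configuration qualifies.
    """
    candidates = [
        c for c in QPC_CONFIGS
        if c.socs <= available_socs and c.context_length >= min_context
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: (c.socs, c.context_length))


def download_and_extract(config: QpcConfig, dest_dir: str) -> None:
    """Download the archive to a temporary file and unpack it into dest_dir."""
    archive_path, _ = urllib.request.urlretrieve(config.url)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest_dir)


choice = pick_qpc(available_socs=4, min_context=6000)
print(choice.url if choice else "no matching QPC")
```

With four SoCs and a required context of 6000 tokens, the sketch selects the `ts4_cl8k` archive. Calling `download_and_extract(choice, "qpc")` would then fetch and unpack it, network access permitting.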