Qwen2.5 Coder 32B Instruct

Model Overview

Qwen2.5-Coder is the latest series of code-specific Qwen large language models (formerly known as CodeQwen).

Qwen2.5-Coder brings the following improvements:

1. Significant improvements in code generation, code reasoning, and code fixing. Building on the strong Qwen2.5 base, we scale the training data up to 5.5 trillion tokens, including source code, text-code grounding data, and synthetic data.
2. Qwen2.5-Coder-32B is the current state-of-the-art open-source code LLM, with coding abilities matching those of GPT-4o.
3. A more comprehensive foundation for real-world applications such as code agents: the model not only gains stronger coding capabilities but also maintains its strengths in mathematics and general competencies.
4. Long-context support for up to 128K tokens.
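As a quick illustration of the code-generation use case, here is a minimal sketch that loads the checkpoint named under Model Source below via the Hugging Face transformers library. The prompt is purely illustrative, and a 32B model will need several accelerators (or a quantized variant) to fit in memory; this is not an official quickstart from this page.

```python
# Minimal generation sketch for Qwen/Qwen2.5-Coder-32B-Instruct (illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-32B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # use the checkpoint's native dtype
    device_map="auto",    # shard across available accelerators
)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```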

  • Model Architecture: Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
  • Number of Parameters: 32.5B (31.0B non-embedding)
  • Number of Layers: 64
  • Number of Attention Heads (GQA): 40 for Q and 8 for KV (see the KV-cache sketch after this list)
  • Context Length: full 131,072 tokens
  • Model Source: Qwen/Qwen2.5-Coder-32B-Instruct
  • Model Repository: QwenLM/Qwen2.5-Coder
  • License: apache-2.0
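To make the GQA numbers above concrete, this back-of-the-envelope sketch estimates the KV-cache footprint at the full context length. The layer and KV-head counts come from the spec above; head_dim = 128 (consistent with 40 query heads over a 5120-wide hidden state) and fp16 cache storage are assumptions, not values stated on this page.

```python
# Rough KV-cache size for the architecture listed above.
num_layers   = 64        # from the spec
num_kv_heads = 8         # GQA: 8 KV heads shared by 40 query heads
head_dim     = 128       # assumption (40 heads x 128 = 5120 hidden width)
bytes_per_el = 2         # assumption: fp16 cache

# K and V each store num_layers * num_kv_heads * head_dim values per token.
per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_el
full_ctx  = per_token * 131_072  # full 128K context

print(f"{per_token / 1024:.0f} KiB per token, {full_ctx / 2**30:.0f} GiB at 128K")
# -> 256 KiB per token, 32 GiB at 128K
```

Under these assumptions, a full 128K-token cache alone occupies roughly 32 GiB in fp16, which is one reason the pre-compiled QPCs below use reduced-precision formats.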

QPC Configurations

| Precision | SoCs / Tensor Slicing | NSP-Cores (per SoC) | Full Batch Size | Chunking Prompt Length | Context Length (CL) | Download URL |
| --- | --- | --- | --- | --- | --- | --- |
| MXFP6 | 4 | 16 | 1 | 128 | 8192 | https://dc00tk1pxen80.cloudfront.net/SDK1.19.6/Qwen/Qwen2.5-Coder-32B-Instruct/qpc_16cores_128pl_8192cl_1fbs_4devices_mxfp6_mxint8.tar.gz |
| MXFP6 | 8 | 16 | 8 | 128 | 8192 | https://dc00tk1pxen80.cloudfront.net/SDK1.19.6/Qwen/Qwen2.5-Coder-32B-Instruct/qpc_16cores_128pl_8192cl_8fbs_8devices_mxfp6_mxint8.tar.gz |
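As one way to fetch and unpack a pre-compiled QPC from the table, here is a standard-library-only Python sketch. The URL is copied from the first table row; the local file and directory names are arbitrary choices for this example.

```python
# Fetch and extract one of the pre-compiled QPC archives listed above.
import tarfile
import urllib.request

url = ("https://dc00tk1pxen80.cloudfront.net/SDK1.19.6/Qwen/"
       "Qwen2.5-Coder-32B-Instruct/"
       "qpc_16cores_128pl_8192cl_1fbs_4devices_mxfp6_mxint8.tar.gz")

# Download the archive to the working directory.
archive, _ = urllib.request.urlretrieve(url, "qpc_mxfp6_4soc.tar.gz")

# Unpack it into a local folder next to the script.
with tarfile.open(archive, "r:gz") as tar:
    tar.extractall("qpc_mxfp6_4soc")
```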