Qwen2.5 Coder 32B Instruct

Model Overview

Qwen2.5-Coder is the latest series of code-specific Qwen large language models (formerly known as CodeQwen).

Qwen2.5-Coder brings the following improvements:

1. Significant improvements in code generation, code reasoning, and code fixing. Building on the strong Qwen2.5 base, we scale the training data up to 5.5 trillion tokens, including source code, text-code grounding data, and synthetic data.
2. Qwen2.5-Coder-32B is the current state-of-the-art open-source code LLM, with coding abilities matching those of GPT-4o.
3. A more comprehensive foundation for real-world applications such as code agents: the model not only gains stronger coding capabilities but also maintains its strengths in mathematics and general competencies.
4. Long-context support for up to 128K tokens.
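As a quick illustration of the code-generation use case, here is a minimal sketch that loads the checkpoint named under Model Source below via the Hugging Face transformers library. The prompt is purely illustrative, and a 32B model will need several accelerators (or a quantized variant) to fit in memory; this is not an official quickstart from this page.

```python
# Minimal generation sketch for Qwen/Qwen2.5-Coder-32B-Instruct (illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-32B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # use the checkpoint's native dtype
    device_map="auto",    # shard across available accelerators
)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```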

  • Model Architecture: Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
  • Number of Parameters: 32.5B (31.0B non-embedding)
  • Number of Layers: 64
  • Number of Attention Heads (GQA): 40 for Q and 8 for KV (see the KV-cache sketch after this list)
  • Context Length: full 131,072 tokens
  • Model Source: Qwen/Qwen2.5-Coder-32B-Instruct
  • Model Repository: QwenLM/Qwen2.5-Coder
  • License: apache-2.0
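To make the GQA numbers above concrete, this back-of-the-envelope sketch estimates the KV-cache footprint at the full context length. The layer and KV-head counts come from the spec above; head_dim = 128 (consistent with 40 query heads over a 5120-wide hidden state) and fp16 cache storage are assumptions, not values stated on this page.

```python
# Rough KV-cache size for the architecture listed above.
num_layers   = 64        # from the spec
num_kv_heads = 8         # GQA: 8 KV heads shared by 40 query heads
head_dim     = 128       # assumption (40 heads x 128 = 5120 hidden width)
bytes_per_el = 2         # assumption: fp16 cache

# K and V each store num_layers * num_kv_heads * head_dim values per token.
per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_el
full_ctx  = per_token * 131_072  # full 128K context

print(f"{per_token / 1024:.0f} KiB per token, {full_ctx / 2**30:.0f} GiB at 128K")
# -> 256 KiB per token, 32 GiB at 128K
```

Under these assumptions, a full 128K-token cache alone occupies roughly 32 GiB in fp16, which is one reason the pre-compiled QPCs below use reduced-precision formats.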

QPC Configurations

| Precision | SoCs / Tensor Slicing | NSP-Cores (per SoC) | Full Batch Size | Chunking Prompt Length | Context Length (CL) | Download URL |
| --- | --- | --- | --- | --- | --- | --- |
| MXFP6 | 4 | 16 | 1 | 128 | 8192 | https://dc00tk1pxen80.cloudfront.net/SDK1.19.6/Qwen/Qwen2.5-Coder-32B-Instruct/qpc_16cores_128pl_8192cl_1fbs_4devices_mxfp6_mxint8.tar.gz |
| MXFP6 | 8 | 16 | 8 | 128 | 8192 | https://dc00tk1pxen80.cloudfront.net/SDK1.19.6/Qwen/Qwen2.5-Coder-32B-Instruct/qpc_16cores_128pl_8192cl_8fbs_8devices_mxfp6_mxint8.tar.gz |
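As one way to fetch and unpack a pre-compiled QPC from the table, here is a standard-library-only Python sketch. The URL is copied from the first table row; the local file and directory names are arbitrary choices for this example.

```python
# Fetch and extract one of the pre-compiled QPC archives listed above.
import tarfile
import urllib.request

url = ("https://dc00tk1pxen80.cloudfront.net/SDK1.19.6/Qwen/"
       "Qwen2.5-Coder-32B-Instruct/"
       "qpc_16cores_128pl_8192cl_1fbs_4devices_mxfp6_mxint8.tar.gz")

# Download the archive to the working directory.
archive, _ = urllib.request.urlretrieve(url, "qpc_mxfp6_4soc.tar.gz")

# Unpack it into a local folder next to the script.
with tarfile.open(archive, "r:gz") as tar:
    tar.extractall("qpc_mxfp6_4soc")
```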