Zentree-Qualcomm Pre-compiled Model Catalog for Cloud AI Accelerators

Mistral 7B Instruct v0.1

Model Overview

The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruction fine-tuned version of the Mistral-7B-v0.1 generative text model, trained on a variety of publicly available conversation datasets.

Paper: Mistral 7B
Model Source: mistralai/Mistral-7B-Instruct-v0.1

QPC Configurations

Precision: MXFP6
SoCs / Tensor slicing: 2
NSP-Cores (per SoC): 16
Full Batch Size: 1
Chunking Prompt Length: 128
Context Length (CL): 4096
QPC URL: https://dc00tk1pxen80.cloudfront.net/SDK1.21.2/mistralai/Mistral-7B-Instruct-v0.1/mistralai_Mistral-7B-Instruct-v0.1_qpc_16cores_128pl_4096cl_1fbs_2devices_mxfp6_mxint8.tar.gz
QPC Size: 6.3GB
ONNX URL: https://dc00tk1pxen80.cloudfront.net/SDK1.21.2/mistralai/Mistral-7B-Instruct-v0.1/mistralai_Mistral-7B-Instruct-v0.1_ONNX.tar.gz
Generation Date: 18-Mar-2026
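
The archive name in the QPC URL appears to encode the same settings listed in the configuration above (cores, prompt length, context length, full batch size, devices, precision). The sketch below recovers those fields from the URL so that a download can be checked against the intended configuration. The naming pattern is inferred from this single entry and is an assumption, not a documented convention; `parse_qpc_url` and the dictionary keys are hypothetical names chosen for illustration.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# QPC URL copied from the configuration entry above.
QPC_URL = (
    "https://dc00tk1pxen80.cloudfront.net/SDK1.21.2/mistralai/"
    "Mistral-7B-Instruct-v0.1/mistralai_Mistral-7B-Instruct-v0.1_qpc_"
    "16cores_128pl_4096cl_1fbs_2devices_mxfp6_mxint8.tar.gz"
)

# Assumed naming pattern, inferred from the single file name above:
# ..._qpc_<N>cores_<N>pl_<N>cl_<N>fbs_<N>devices_<precision>...
_QPC_NAME = re.compile(
    r"_qpc_(?P<cores>\d+)cores_(?P<pl>\d+)pl_(?P<cl>\d+)cl_"
    r"(?P<fbs>\d+)fbs_(?P<devices>\d+)devices_(?P<precision>[a-z0-9]+)"
)


def parse_qpc_url(url: str) -> dict:
    """Extract configuration fields from a QPC archive URL (hypothetical helper)."""
    file_name = PurePosixPath(urlparse(url).path).name
    match = _QPC_NAME.search(file_name)
    if match is None:
        raise ValueError(f"unrecognized QPC archive name: {file_name}")
    fields = match.groupdict()
    return {
        "nsp_cores": int(fields["cores"]),      # NSP-Cores (per SoC)
        "prompt_len": int(fields["pl"]),        # Chunking Prompt Length
        "ctx_len": int(fields["cl"]),           # Context Length (CL)
        "full_batch_size": int(fields["fbs"]),  # Full Batch Size
        "devices": int(fields["devices"]),      # SoCs / Tensor slicing
        "precision": fields["precision"].upper(),
    }


if __name__ == "__main__":
    for key, value in parse_qpc_url(QPC_URL).items():
        print(f"{key}: {value}")
```

The trailing `mxint8` segment of the file name is left uninterpreted here, since the entry above does not say what it refers to.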