Model Catalog

This catalog lists all supported models optimized for Qualcomm Cloud AI accelerators, grouped by SDK version. Each entry links to the corresponding QPC configuration page.
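As a rough illustration of how this grouping can be consumed programmatically, the sketch below models a handful of rows copied from the tables in this catalog and filters them by SDK version. The `CatalogEntry` type and both helper functions are hypothetical names chosen for this example; they are not part of any Qualcomm or QEfficient API, and the list is a small sample rather than the full catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One catalog row (hypothetical structure, for illustration only)."""
    sdk_version: str
    category: str
    auto_class: str
    architecture: str
    family: str
    model: str


# A small sample of rows copied from the catalog tables, not the full list.
CATALOG = [
    CatalogEntry("1.21.4.0", "Multimodal Language Models",
                 "QEFFAutoModelForImageTextToText",
                 "Gemma3ForConditionalGeneration", "Gemma3",
                 "google/gemma-3-27b-it"),
    CatalogEntry("1.21.4.0", "Text-only Language Models",
                 "QEFFAutoModelForCausalLM",
                 "GptOssForCausalLM", "GPT-OSS", "openai/gpt-oss-120b"),
    CatalogEntry("1.21A1.1", "Text-only Language Models",
                 "QEFFAutoModelForCausalLM",
                 "GptOssForCausalLM", "GPT-OSS", "openai/gpt-oss-120b"),
    CatalogEntry("1.20.4", "Audio Models",
                 "QEFFAutoModelForSpeechSeq2Seq",
                 "Whisper", "Whisper", "openai/whisper-tiny"),
    CatalogEntry("1.20.4", "Embedding Models",
                 "QEFFAutoModel",
                 "XLMRobertaModel", "XLM-RoBERTa",
                 "intfloat/multilingual-e5-large"),
]


def models_for_sdk(catalog, sdk_version):
    """Return the model IDs listed under one SDK version, in catalog order."""
    return [e.model for e in catalog if e.sdk_version == sdk_version]


def sdk_versions_for(catalog, model):
    """Return the SDK versions a model appears under, without repeats."""
    seen = []
    for e in catalog:
        if e.model == model and e.sdk_version not in seen:
            seen.append(e.sdk_version)
    return seen


if __name__ == "__main__":
    print(models_for_sdk(CATALOG, "1.21.4.0"))
    print(sdk_versions_for(CATALOG, "openai/gpt-oss-120b"))
```

Keeping the SDK version on every row, rather than nesting rows under a version key, makes it straightforward to answer both questions: which models a given SDK supports, and which SDK versions list a given model.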

SDK Version: 1.21.4.0


Multimodal Language Models

Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: QEFFAutoModelForImageTextToText

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| Qwen2_5_VLForConditionalGeneration | Qwen2.5-VL | Qwen/Qwen2.5-VL-32B-Instruct | Link |
| Gemma3ForConditionalGeneration | Gemma3 | google/gemma-3-27b-it | Link |
| Llama4ForConditionalGeneration | Llama4 | meta-llama/Llama-4-Scout-17B-16E-Instruct | Link |
| Gemma3ForConditionalGeneration | Gemma3 | google/gemma-3-4b-it | Link |

Text-only Language Models

Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-120b | Link |
| LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-20b | Link |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-30B-A3B-Instruct-2507 | Link |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-Coder-30B-A3B-Instruct | Link |

SDK Version: 1.21.2.0 / 1.21.4.0


Text-only Language Models

Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-8B-Instruct | Link |
| GraniteForCausalLM | Granite-3.2 | ibm-granite/granite-3.2-8b-instruct | Link |
| GraniteForCausalLM | Granite-3.3 | ibm-granite/granite-3.3-8b-instruct | Link |
| Phi3ForCausalLM | Phi-4 | microsoft/phi-4 | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | Valdemardi/DeepSeek-R1-Distill-Llama-70B-AWQ | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | Valdemardi/DeepSeek-R1-Distill-Qwen-32B-AWQ | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | casperhansen/deepseek-r1-distill-llama-8b-awq | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | casperhansen/deepseek-r1-distill-qwen-7b-awq | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | ibnzterrell/Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF-AWQ-INT4 | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | nvidia/Llama-3.1-Nemotron-Nano-8B-v1 | Link |
| MistralForCausalLM | Mistral | mistralai/Mistral-7B-Instruct-v0.1 | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-1B-Instruct | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-3B | Link |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-20b | Link |
| LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-3B-Instruct | Link |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-30B-A3B-Instruct-2507 | Link |
| LlamaForCausalLM | Llama 3.3 | casperhansen/llama-3.3-70b-instruct-awq | Link |
| LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-120b | Link |

Multimodal Language Models

Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: QEFFAutoModelForImageTextToText

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| Qwen2_5_VLForConditionalGeneration | Qwen2.5-VL | Qwen/Qwen2.5-VL-32B-Instruct | Link |

SDK Version: 1.21A1.1


Text-only Language Models

Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-120b | Link |

SDK Version: 1.20.4


Multimodal Language Models

Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: QEFFAutoModelForImageTextToText

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| Llama4ForConditionalGeneration | Llama4 | meta-llama/Llama-4-Scout-17B-16E-Instruct | Link |
| StableDiffusionForTextToImage | SDXL Turbo | stabilityai/sdxl-turbo | Link |
| Gemma3ForConditionalGeneration | Gemma3 | google/gemma-3-27b-it | Link |
| MllamaForConditionalGeneration | Llama 3.2 | meta-llama/Llama-3.2-11B-Vision-Instruct | Link |
| StableDiffusionForTextToImage | Stable Diffusion XL | stabilityai/stable-diffusion-xl-base-1.0 | Link |
| Qwen2_5_VLForConditionalGeneration | Qwen2.5-VL | Qwen/Qwen2.5-VL-32B-Instruct | Link |

Text-only Language Models

Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| LlamaForCausalLM | Llama 3.3 | casperhansen/llama-3.3-70b-instruct-awq | Link |
| LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | Link |
| LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 | Link |
| Phi3ForCausalLM | Phi-4 | stelterlab/phi-4-AWQ | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | Valdemardi/DeepSeek-R1-Distill-Llama-70B-AWQ | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | Valdemardi/DeepSeek-R1-Distill-Qwen-32B-AWQ | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | casperhansen/deepseek-r1-distill-llama-8b-awq | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | casperhansen/deepseek-r1-distill-qwen-7b-awq | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | ibnzterrell/Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF-AWQ-INT4 | Link |
| Qwen2ForCausalLM | QwQ | Qwen/QwQ-32B-AWQ | Link |
| Qwen2ForCausalLM | Qwen2.5 | Qwen/Qwen2.5-Coder-32B-Instruct-AWQ | Link |
| GraniteForCausalLM | Granite-3.3 | ibm-granite/granite-3.3-8b-instruct | Link |
| Qwen2ForCausalLM | Qwen2.5 | Qwen/Qwen2.5-Coder-32B-Instruct | Link |
| Qwen2ForCausalLM | QwQ | Qwen/QwQ-32B | Link |
| MistralForCausalLM | Mistral | mistralai/Mistral-7B-Instruct-v0.1 | Link |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-20b | Link |
| LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Link |
| LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-8B-Instruct | Link |
| LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-70B-Instruct | Link |
| Phi3ForCausalLM | Phi-4 | microsoft/phi-4 | Link |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-30B-A3B-Instruct-2507 | Link |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-Coder-30B-A3B-Instruct | Link |
| LlamaForCausalLM | sarvamai | sarvamai/sarvam-1 | Link |
| LlamaForCausalLM | sarvamai | sarvamai/sarvam-m | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-3B | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-3B-Instruct | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-1B | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-1B-Instruct | Link |
| GraniteForCausalLM | Granite-3.2 | ibm-granite/granite-3.2-8b-instruct | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | nvidia/Llama-3.1-Nemotron-Nano-8B-v1 | Link |
| Qwen3ForCausalLM | Qwen3 | Qwen/Qwen3-4B | Link |
| LlamaForCausalLM | Llama 3.2 | bartowski/Llama-3.2-3B-Instruct-GGUF | Link |

SPD Models

Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| LlamaForCausalLM | Llama 3.x | Target: Meta-Llama-3.1-70B-Instruct-AWQ-INT4 / Draft: Llama-3.2-1B-Instruct | Link |
| LlamaForCausalLM | Llama 3.x | Target: llama-3.3-70b-instruct-awq / Draft: Llama-3.2-1B-Instruct | Link |

Audio Models

Task: Automatic Speech Recognition (Transcription)
QEff Auto Class: QEFFAutoModelForSpeechSeq2Seq

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| Whisper | Whisper | openai/whisper-base | Link |
| Whisper | Whisper | openai/whisper-large-v3-turbo | Link |
| Whisper | Whisper | openai/whisper-large | Link |
| Whisper | Whisper | openai/whisper-medium | Link |
| Whisper | Whisper | openai/whisper-small | Link |
| Whisper | Whisper | openai/whisper-tiny | Link |

Embedding Models

Task: Text Embedding
QEff Auto Class: QEFFAutoModel

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| XLMRobertaModel | XLM-RoBERTa | intfloat/multilingual-e5-large | Link |

SDK Version: 1.20.2


Audio Models

Task: Automatic Speech Recognition (Transcription)
QEff Auto Class: QEFFAutoModelForSpeechSeq2Seq

| Architecture | Model Family | Representative Model | QPC Configuration Link | Last Updated |
| --- | --- | --- | --- | --- |
| Whisper | Whisper | openai/whisper-base | Link | 28-Oct-2025 |
| Whisper | Whisper | openai/whisper-large-v3-turbo | Link | 28-Oct-2025 |
| Whisper | Whisper | openai/whisper-large | Link | 28-Oct-2025 |
| Whisper | Whisper | openai/whisper-medium | Link | 28-Oct-2025 |
| Whisper | Whisper | openai/whisper-small | Link | 28-Oct-2025 |
| Whisper | Whisper | openai/whisper-tiny | Link | 28-Oct-2025 |

SPD Models

Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| LlamaForCausalLM | Llama 3.x | Target: Llama-3.3-70B-Instruct / Draft: Llama-3.2-1B-Instruct | Link |

Multimodal Language Models

Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: QEFFAutoModelForImageTextToText

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| Llama4ForConditionalGeneration | Llama4 | meta-llama/Llama-4-Scout-17B-16E-Instruct | Link |

Text-only Language Models

Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM

| Architecture | Model Family | Representative Model | QPC Configuration Link | Last Updated |
| --- | --- | --- | --- | --- |
| GraniteForCausalLM | Granite-3.2 | ibm-granite/granite-3.2-8b-instruct | Link | |
| GraniteForCausalLM | Granite-3.3 | ibm-granite/granite-3.3-8b-instruct | Link | |
| LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link | 29-Oct-2025 |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-20b | Link | |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-120b | Link | 07-Oct-2025 |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Link | |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-30B-A3B-Instruct-2507 | Link | 07-Oct-2025 |

SDK Version: 1.20.1.2


Multimodal Language Models

Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: QEFFAutoModelForImageTextToText

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| Llama4ForConditionalGeneration | Llama4 | meta-llama/Llama-4-Scout-17B-16E-Instruct | Link |

SDK Version: 1.19.8

Text-only Language Models

Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| GraniteForCausalLM | Granite-3.2 | ibm-granite/granite-3.2-8b-instruct | Link |

Multimodal Language Models

Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: AutoPipelineForText2Image

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| StableDiffusionForTextToImage | SDXL Turbo | stabilityai/sdxl-turbo | Link |

SDK Version: 1.19.6

Text-only Language Models

Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | Valdemardi/DeepSeek-R1-Distill-Llama-70B-AWQ | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | deepseek-ai/DeepSeek-R1-Distill-Llama-8B | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | casperhansen/deepseek-r1-distill-llama-8b-awq | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | casperhansen/deepseek-r1-distill-qwen-7b-awq | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | Valdemardi/DeepSeek-R1-Distill-Qwen-32B-AWQ | Link |
| LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-8B-Instruct | Link |
| LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | Link |
| LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-70B-Instruct | Link |
| LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 | Link |
| LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link |
| LlamaForCausalLM | Llama 3.3 | casperhansen/llama-3.3-70b-instruct-awq | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | nvidia/Llama-3.1-Nemotron-Nano-8B-v1 | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | nvidia/Llama-3.1-Nemotron-70B-Instruct-HF | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | ibnzterrell/Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF-AWQ-INT4 | Link |
| Phi3ForCausalLM | Phi-4 | microsoft/phi-4 | Link |
| Phi3ForCausalLM | Phi-4 | stelterlab/phi-4-AWQ | Link |
| Qwen2ForCausalLM | Qwen2.5 | Qwen/Qwen2.5-Coder-32B-Instruct | Link |
| Qwen2ForCausalLM | Qwen2.5 | Qwen/Qwen2.5-Coder-32B-Instruct-AWQ | Link |
| Qwen2ForCausalLM | QwQ | Qwen/QwQ-32B | Link |
| Qwen2ForCausalLM | QwQ | Qwen/QwQ-32B-AWQ | Link |
| MistralForCausalLM | Mistral | mistralai/Mistral-7B-Instruct-v0.1 | Link |

Embedding Models

Task: Text Embedding
QEff Auto Class: QEFFAutoModel

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| BERTModel | BERT-based | BAAI/bge-large-en-v1.5 | Link |
| BERTModel | BERT-based | BAAI/bge-m3 | Link |
| BERTModel | BERT-based | BAAI/bge-base-en-v1.5 | Link |
| XLMRobertaModel | XLM-RoBERTa | intfloat/multilingual-e5-large | Link |
| XLMRobertaModel | XLM-RoBERTa | intfloat/multilingual-e5-small | Link |

Multimodal Language Models

Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: AutoPipelineForText2Image

| Architecture | Model Family | Representative Model | QPC Configuration Link |
| --- | --- | --- | --- |
| StableDiffusionForTextToImage | SDXL Turbo | stabilityai/sdxl-turbo | Link |

ℹ️ Click on the QPC Configuration Link to view detailed configuration and download options for each model and SDK version.