Model Catalog
This catalog lists all supported models optimized for Qualcomm Cloud AI accelerators, grouped by SDK version. Each entry links to the corresponding QPC configuration page.
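Because the same model can appear under several SDK versions, it can help to treat the catalog as a flat list of records and index it by model ID. The sketch below is illustrative only: `CatalogEntry` and `sdk_versions_by_model` are hypothetical names, and the rows are a small sample copied from the tables on this page, not the full catalog.

```python
from collections import defaultdict
from typing import NamedTuple


class CatalogEntry(NamedTuple):
    """One table row plus the SDK version and category it is listed under."""
    sdk_version: str
    category: str
    architecture: str
    family: str
    model_id: str


# A handful of rows copied from this page (sample only, not the full catalog).
ENTRIES = [
    CatalogEntry("1.21.4.0", "Text-only Language Models",
                 "GptOssForCausalLM", "GPT-OSS", "openai/gpt-oss-120b"),
    CatalogEntry("1.21.4.0", "Text-only Language Models",
                 "LlamaForCausalLM", "Llama 3.3", "meta-llama/Llama-3.3-70B-Instruct"),
    CatalogEntry("1.21A1.1", "Text-only Language Models",
                 "GptOssForCausalLM", "GPT-OSS", "openai/gpt-oss-120b"),
    CatalogEntry("1.20.4", "Text-only Language Models",
                 "LlamaForCausalLM", "Llama 3.3", "meta-llama/Llama-3.3-70B-Instruct"),
    CatalogEntry("1.20.4", "Embedding Models",
                 "XLMRobertaModel", "XLM-RoBERTa", "intfloat/multilingual-e5-large"),
]


def sdk_versions_by_model(entries):
    """Map each model ID to the SDK versions that list it, in input order."""
    index = defaultdict(list)
    for entry in entries:
        versions = index[entry.model_id]
        if entry.sdk_version not in versions:  # skip repeated listings
            versions.append(entry.sdk_version)
    return dict(index)


INDEX = sdk_versions_by_model(ENTRIES)
print(INDEX["openai/gpt-oss-120b"])
```

With only the sample rows above, the lookup returns the two SDK sections in which `openai/gpt-oss-120b` was entered; a full index would need every row of every table.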
SDK Version: 1.21.4.0
Multimodal Language Models
Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: QEFFAutoModelForImageTextToText
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| Qwen2_5_VLForConditionalGeneration | Qwen2.5-VL | Qwen/Qwen2.5-VL-32B-Instruct | Link |
| Gemma3ForConditionalGeneration | Gemma3 | google/gemma-3-27b-it | Link |
| Llama4ForConditionalGeneration | Llama4 | meta-llama/Llama-4-Scout-17B-16E-Instruct | Link |
| Gemma3ForConditionalGeneration | Gemma3 | google/gemma-3-4b-it | Link |
Text-only Language Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-120b | Link |
| LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-20b | Link |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-30B-A3B-Instruct-2507 | Link |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-Coder-30B-A3B-Instruct | Link |
SDK Version: 1.21.2.0 / 1.21.4.0
Text-only Language Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-8B-Instruct | Link |
| GraniteForCausalLM | Granite-3.2 | ibm-granite/granite-3.2-8b-instruct | Link |
| GraniteForCausalLM | Granite-3.3 | ibm-granite/granite-3.3-8b-instruct | Link |
| Phi3ForCausalLM | Phi-4 | microsoft/phi-4 | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | Valdemardi/DeepSeek-R1-Distill-Llama-70B-AWQ | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | Valdemardi/DeepSeek-R1-Distill-Qwen-32B-AWQ | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | casperhansen/deepseek-r1-distill-llama-8b-awq | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | casperhansen/deepseek-r1-distill-qwen-7b-awq | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | ibnzterrell/Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF-AWQ-INT4 | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | nvidia/Llama-3.1-Nemotron-Nano-8B-v1 | Link |
| MistralForCausalLM | Mistral | mistralai/Mistral-7B-Instruct-v0.1 | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-1B-Instruct | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-3B | Link |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-20b | Link |
| LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-3B-Instruct | Link |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-30B-A3B-Instruct-2507 | Link |
| LlamaForCausalLM | Llama 3.3 | casperhansen/llama-3.3-70b-instruct-awq | Link |
| LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-120b | Link |
Multimodal Language Models
Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: QEFFAutoModelForImageTextToText
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| Qwen2_5_VLForConditionalGeneration | Qwen2.5-VL | Qwen/Qwen2.5-VL-32B-Instruct | Link |
SDK Version: 1.21A1.1
Text-only Language Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-120b | Link |
SDK Version: 1.20.4
Multimodal Language Models
Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: QEFFAutoModelForImageTextToText
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| Llama4ForConditionalGeneration | Llama4 | meta-llama/Llama-4-Scout-17B-16E-Instruct | Link |
| StableDiffusionForTextToImage | SDXL Turbo | stabilityai/sdxl-turbo | Link |
| Gemma3ForConditionalGeneration | Gemma3 | google/gemma-3-27b-it | Link |
| MllamaForConditionalGeneration | Llama 3.2 | meta-llama/Llama-3.2-11B-Vision-Instruct | Link |
| StableDiffusionForTextToImage | Stable Diffusion XL | stabilityai/stable-diffusion-xl-base-1.0 | Link |
| Qwen2_5_VLForConditionalGeneration | Qwen2.5-VL | Qwen/Qwen2.5-VL-32B-Instruct | Link |
Text-only Language Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| LlamaForCausalLM | Llama 3.3 | casperhansen/llama-3.3-70b-instruct-awq | Link |
| LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | Link |
| LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 | Link |
| Phi3ForCausalLM | Phi-4 | stelterlab/phi-4-AWQ | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | Valdemardi/DeepSeek-R1-Distill-Llama-70B-AWQ | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | Valdemardi/DeepSeek-R1-Distill-Qwen-32B-AWQ | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | casperhansen/deepseek-r1-distill-llama-8b-awq | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | casperhansen/deepseek-r1-distill-qwen-7b-awq | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | ibnzterrell/Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF-AWQ-INT4 | Link |
| Qwen2ForCausalLM | QwQ | Qwen/QwQ-32B-AWQ | Link |
| Qwen2ForCausalLM | Qwen2.5 | Qwen/Qwen2.5-Coder-32B-Instruct-AWQ | Link |
| GraniteForCausalLM | Granite-3.3 | ibm-granite/granite-3.3-8b-instruct | Link |
| Qwen2ForCausalLM | Qwen2.5 | Qwen/Qwen2.5-Coder-32B-Instruct | Link |
| Qwen2ForCausalLM | QwQ | Qwen/QwQ-32B | Link |
| MistralForCausalLM | Mistral | mistralai/Mistral-7B-Instruct-v0.1 | Link |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-20b | Link |
| LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Link |
| LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-8B-Instruct | Link |
| LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-70B-Instruct | Link |
| Phi3ForCausalLM | Phi-4 | microsoft/phi-4 | Link |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-30B-A3B-Instruct-2507 | Link |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-Coder-30B-A3B-Instruct | Link |
| LlamaForCausalLM | sarvamai | sarvamai/sarvam-1 | Link |
| MistralForCausalLM | sarvamai | sarvamai/sarvam-m | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-3B | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-3B-Instruct | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-1B | Link |
| LlamaForCausalLM | Llama 3.2 | meta-llama/Llama-3.2-1B-Instruct | Link |
| GraniteForCausalLM | Granite-3.2 | ibm-granite/granite-3.2-8b-instruct | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | nvidia/Llama-3.1-Nemotron-Nano-8B-v1 | Link |
| Qwen3ForCausalLM | Qwen3 | Qwen/Qwen3-4B | Link |
| LlamaForCausalLM | Llama 3.2 | bartowski/Llama-3.2-3B-Instruct-GGUF | Link |
SPD (Speculative Decoding) Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| LlamaForCausalLM | Llama 3.x | Target: Meta-Llama-3.1-70B-Instruct-AWQ-INT4 / Draft: Llama-3.2-1B-Instruct | Link |
| LlamaForCausalLM | Llama 3.x | Target: llama-3.3-70b-instruct-awq / Draft: Llama-3.2-1B-Instruct | Link |
Audio Models
Task: Automatic Speech Recognition (Transcription)
QEff Auto Class: QEFFAutoModelForSpeechSeq2Seq
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| WhisperForConditionalGeneration | Whisper | openai/whisper-base | Link |
| WhisperForConditionalGeneration | Whisper | openai/whisper-large-v3-turbo | Link |
| WhisperForConditionalGeneration | Whisper | openai/whisper-large | Link |
| WhisperForConditionalGeneration | Whisper | openai/whisper-medium | Link |
| WhisperForConditionalGeneration | Whisper | openai/whisper-small | Link |
| WhisperForConditionalGeneration | Whisper | openai/whisper-tiny | Link |
Embedding Models
Task: Text Embedding
QEff Auto Class: QEFFAutoModel
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| XLMRobertaModel | XLM-RoBERTa | intfloat/multilingual-e5-large | Link |
SDK Version: 1.20.2
Audio Models
Task: Automatic Speech Recognition (Transcription)
QEff Auto Class: QEFFAutoModelForSpeechSeq2Seq
| Architecture | Model Family | Representative Model | QPC Configuration Link | Last Updated |
|---|---|---|---|---|
| WhisperForConditionalGeneration | Whisper | openai/whisper-base | Link | 28-Oct-2025 |
| WhisperForConditionalGeneration | Whisper | openai/whisper-large-v3-turbo | Link | 28-Oct-2025 |
| WhisperForConditionalGeneration | Whisper | openai/whisper-large | Link | 28-Oct-2025 |
| WhisperForConditionalGeneration | Whisper | openai/whisper-medium | Link | 28-Oct-2025 |
| WhisperForConditionalGeneration | Whisper | openai/whisper-small | Link | 28-Oct-2025 |
| WhisperForConditionalGeneration | Whisper | openai/whisper-tiny | Link | 28-Oct-2025 |
SPD (Speculative Decoding) Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| LlamaForCausalLM | Llama 3.x | Target: Llama-3.3-70B-Instruct / Draft: Llama-3.2-1B-Instruct | Link |
Multimodal Language Models
Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: QEFFAutoModelForImageTextToText
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| Llama4ForConditionalGeneration | Llama4 | meta-llama/Llama-4-Scout-17B-16E-Instruct | Link |
Text-only Language Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
| Architecture | Model Family | Representative Model | QPC Configuration Link | Last Updated |
|---|---|---|---|---|
| GraniteForCausalLM | Granite-3.2 | ibm-granite/granite-3.2-8b-instruct | Link | |
| GraniteForCausalLM | Granite-3.3 | ibm-granite/granite-3.3-8b-instruct | Link | |
| LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link | 29-Oct-2025 |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-20b | Link | |
| GptOssForCausalLM | GPT-OSS | openai/gpt-oss-120b | Link | 07-Oct-2025 |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Link | |
| Qwen3MoeForCausalLM | Qwen3Moe | Qwen/Qwen3-30B-A3B-Instruct-2507 | Link | 07-Oct-2025 |
SDK Version: 1.20.1.2
Multimodal Language Models
Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: QEFFAutoModelForImageTextToText
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| Llama4ForConditionalGeneration | Llama4 | meta-llama/Llama-4-Scout-17B-16E-Instruct | Link |
SDK Version: 1.19.8
Text-only Language Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| GraniteForCausalLM | Granite-3.2 | ibm-granite/granite-3.2-8b-instruct | Link |
Multimodal Language Models
Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: AutoPipelineForText2Image
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| StableDiffusionForTextToImage | SDXL Turbo | stabilityai/sdxl-turbo | Link |
SDK Version: 1.19.6
Text-only Language Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | Valdemardi/DeepSeek-R1-Distill-Llama-70B-AWQ | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | deepseek-ai/DeepSeek-R1-Distill-Llama-8B | Link |
| LlamaForCausalLM | DeepSeek-R1-Distill-Llama | casperhansen/deepseek-r1-distill-llama-8b-awq | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | casperhansen/deepseek-r1-distill-qwen-7b-awq | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Link |
| Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | Valdemardi/DeepSeek-R1-Distill-Qwen-32B-AWQ | Link |
| LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-8B-Instruct | Link |
| LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | Link |
| LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-70B-Instruct | Link |
| LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 | Link |
| LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link |
| LlamaForCausalLM | Llama 3.3 | casperhansen/llama-3.3-70b-instruct-awq | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | nvidia/Llama-3.1-Nemotron-Nano-8B-v1 | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | nvidia/Llama-3.1-Nemotron-70B-Instruct-HF | Link |
| LlamaForCausalLM | Llama-3.1-Nemotron | ibnzterrell/Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF-AWQ-INT4 | Link |
| Phi3ForCausalLM | Phi-4 | microsoft/phi-4 | Link |
| Phi3ForCausalLM | Phi-4 | stelterlab/phi-4-AWQ | Link |
| Qwen2ForCausalLM | Qwen2.5 | Qwen/Qwen2.5-Coder-32B-Instruct | Link |
| Qwen2ForCausalLM | Qwen2.5 | Qwen/Qwen2.5-Coder-32B-Instruct-AWQ | Link |
| Qwen2ForCausalLM | QwQ | Qwen/QwQ-32B | Link |
| Qwen2ForCausalLM | QwQ | Qwen/QwQ-32B-AWQ | Link |
| MistralForCausalLM | Mistral | mistralai/Mistral-7B-Instruct-v0.1 | Link |
Embedding Models
Task: Text Embedding
QEff Auto Class: QEFFAutoModel
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| BertModel | BERT-based | BAAI/bge-large-en-v1.5 | Link |
| XLMRobertaModel | XLM-RoBERTa | BAAI/bge-m3 | Link |
| BertModel | BERT-based | BAAI/bge-base-en-v1.5 | Link |
| XLMRobertaModel | XLM-RoBERTa | intfloat/multilingual-e5-large | Link |
| XLMRobertaModel | XLM-RoBERTa | intfloat/multilingual-e5-small | Link |
Multimodal Language Models
Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: AutoPipelineForText2Image
| Architecture | Model Family | Representative Model | QPC Configuration Link |
|---|---|---|---|
| StableDiffusionForTextToImage | SDXL Turbo | stabilityai/sdxl-turbo | Link |
ℹ️ Click on the QPC Configuration Link to view detailed configuration and download options for each model and SDK version.
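Each section above pairs a task with a QEff Auto Class. A minimal sketch of how that pairing could be used to pick a loader class by task is shown below. It rests on assumptions not stated on this page: that the classes are importable from a top-level `QEfficient` package and that they expose a Hugging Face style `from_pretrained` method. The helper names `resolve_auto_class` and `load_model` are hypothetical, and the text-to-image pipeline entries are left out.

```python
import importlib

# Task label -> auto class name, as listed in the section headers of this page.
AUTO_CLASS_BY_TASK = {
    "Text Generation": "QEFFAutoModelForCausalLM",
    "Vision-Language Models": "QEFFAutoModelForImageTextToText",
    "Automatic Speech Recognition": "QEFFAutoModelForSpeechSeq2Seq",
    "Text Embedding": "QEFFAutoModel",
}


def resolve_auto_class(task, package="QEfficient"):
    """Return the auto class for a task, or None if it cannot be imported.

    Assumption: the classes are exported from the top-level package.
    Raises KeyError for a task that is not in the mapping above.
    """
    class_name = AUTO_CLASS_BY_TASK[task]
    try:
        module = importlib.import_module(package)
    except ImportError:
        return None  # package not installed in this environment
    return getattr(module, class_name, None)


def load_model(task, model_id):
    """Load a catalog model with the class that matches its task.

    Assumption: the class provides `from_pretrained(model_id)`.
    """
    auto_class = resolve_auto_class(task)
    if auto_class is None:
        raise RuntimeError(
            f"{AUTO_CLASS_BY_TASK[task]} is not available; is QEfficient installed?"
        )
    return auto_class.from_pretrained(model_id)


print(AUTO_CLASS_BY_TASK["Text Generation"])
```

A call such as `load_model("Text Generation", "meta-llama/Llama-3.2-1B-Instruct")` would then be the entry point; compilation and execution settings are covered by the QPC configuration page for each model and are not sketched here.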