Model Catalog
This catalog lists the models supported on Qualcomm Cloud AI accelerators, grouped by SDK version. Each entry links to the QPC (Qualcomm Program Container) configuration page for that model.
SDK Version: 1.20.4
Text-only Language Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
Architecture | Model Family | Representative Model | QPC Configuration Link |
---|---|---|---|
LlamaForCausalLM | Llama 3.3 | casperhansen/llama-3.3-70b-instruct-awq | Link |
LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | Link |
LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 | Link |
Phi3ForCausalLM | Phi-4 | stelterlab/phi-4-AWQ | Link |
LlamaForCausalLM | DeepSeek-R1-Distill-Llama | Valdemardi/DeepSeek-R1-Distill-Llama-70B-AWQ | Link |
Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | Valdemardi/DeepSeek-R1-Distill-Qwen-32B-AWQ | Link |
LlamaForCausalLM | DeepSeek-R1-Distill-Llama | casperhansen/deepseek-r1-distill-llama-8b-awq | Link |
Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | casperhansen/deepseek-r1-distill-qwen-7b-awq | Link |
LlamaForCausalLM | Llama-3.1-Nemotron | ibnzterrell/Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF-AWQ-INT4 | Link |
Qwen2ForCausalLM | QwQ | Qwen/QwQ-32B-AWQ | Link |
Qwen2ForCausalLM | Qwen2.5 | Qwen/Qwen2.5-Coder-32B-Instruct-AWQ | Link |
GraniteForCausalLM | Granite-3.3 | ibm-granite/granite-3.3-8b-instruct | Link |
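
The rows in the table above follow a fixed four-column layout (architecture, model family, representative model, link), so they can be consumed programmatically, for example to pick out the AWQ-quantized entries. The following is a minimal sketch, assuming the rows have been copied as plain pipe-delimited text. `CatalogEntry` and `parse_row` are hypothetical helpers introduced for illustration and are not part of the SDK or of QEfficient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One catalog row (hypothetical helper type, not an SDK class)."""
    architecture: str
    family: str
    model_id: str


def parse_row(row: str) -> CatalogEntry:
    """Split a pipe-delimited catalog row into its first three columns."""
    # Drop optional leading/trailing pipes, then trim each cell.
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != 4:
        raise ValueError(f"expected 4 columns, got {len(cells)}: {row!r}")
    architecture, family, model_id, _link = cells
    return CatalogEntry(architecture, family, model_id)


# Sample rows copied from the SDK 1.20.4 text-only table.
rows = [
    "LlamaForCausalLM | Llama 3.3 | casperhansen/llama-3.3-70b-instruct-awq | Link |",
    "Phi3ForCausalLM | Phi-4 | stelterlab/phi-4-AWQ | Link |",
    "GraniteForCausalLM | Granite-3.3 | ibm-granite/granite-3.3-8b-instruct | Link |",
]

entries = [parse_row(r) for r in rows]

# Assumption: AWQ checkpoints are recognizable by "awq" in the model ID.
awq_models = [e.model_id for e in entries if "awq" in e.model_id.lower()]
print(awq_models)
```

The name-based AWQ check is only a convention observed in this catalog's model IDs; the QPC configuration page for each entry remains the authoritative source for precision and compile settings.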
SDK Version: 1.20.2
SPD Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
Architecture | Model Family | Representative Model | QPC Configuration Link |
---|---|---|---|
LlamaForCausalLM | Llama 3.x | Target: Llama-3.3-70B-Instruct / Draft: Llama-3.2-1B-Instruct | Link |
Multimodal Language Models
Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: AutoPipelineForText2Image
Architecture | Model Family | Representative Model | QPC Configuration Link |
---|---|---|---|
Llama4ForConditionalGeneration | Llama4 | meta-llama/Llama-4-Scout-17B-16E-Instruct | Link |
Text-only Language Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
Architecture | Model Family | Representative Model | QPC Configuration Link |
---|---|---|---|
GraniteForCausalLM | Granite-3.2 | ibm-granite/granite-3.2-8b-instruct | Link |
GraniteForCausalLM | Granite-3.3 | ibm-granite/granite-3.3-8b-instruct | Link |
LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link |
GptOssForCausalLM | GPT-OSS | openai/gpt-oss-20b | Link |
GptOssForCausalLM | GPT-OSS | openai/gpt-oss-120b | Link |
Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Link |
SDK Version: 1.20.1.2
Multimodal Language Models
Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: AutoPipelineForText2Image
Architecture | Model Family | Representative Model | QPC Configuration Link |
---|---|---|---|
Llama4ForConditionalGeneration | Llama4 | meta-llama/Llama-4-Scout-17B-16E-Instruct | Link |
SDK Version: 1.19.8
Text-only Language Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
Architecture | Model Family | Representative Model | QPC Configuration Link |
---|---|---|---|
GraniteForCausalLM | Granite-3.2 | ibm-granite/granite-3.2-8b-instruct | Link |
SDK Version: 1.19.6
Text-only Language Models
Task: Text Generation
QEff Auto Class: QEFFAutoModelForCausalLM
Architecture | Model Family | Representative Model | QPC Configuration Link |
---|---|---|---|
LlamaForCausalLM | DeepSeek-R1-Distill-Llama | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Link |
LlamaForCausalLM | DeepSeek-R1-Distill-Llama | Valdemardi/DeepSeek-R1-Distill-Llama-70B-AWQ | Link |
LlamaForCausalLM | DeepSeek-R1-Distill-Llama | deepseek-ai/DeepSeek-R1-Distill-Llama-8B | Link |
LlamaForCausalLM | DeepSeek-R1-Distill-Llama | casperhansen/deepseek-r1-distill-llama-8b-awq | Link |
Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Link |
Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | casperhansen/deepseek-r1-distill-qwen-7b-awq | Link |
Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Link |
Qwen2ForCausalLM | DeepSeek-R1-Distill-Qwen | Valdemardi/DeepSeek-R1-Distill-Qwen-32B-AWQ | Link |
LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-8B-Instruct | Link |
LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | Link |
LlamaForCausalLM | Llama 3.1 | meta-llama/Llama-3.1-70B-Instruct | Link |
LlamaForCausalLM | Llama 3.1 | hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 | Link |
LlamaForCausalLM | Llama 3.3 | meta-llama/Llama-3.3-70B-Instruct | Link |
LlamaForCausalLM | Llama 3.3 | casperhansen/llama-3.3-70b-instruct-awq | Link |
LlamaForCausalLM | Llama-3.1-Nemotron | nvidia/Llama-3.1-Nemotron-Nano-8B-v1 | Link |
LlamaForCausalLM | Llama-3.1-Nemotron | nvidia/Llama-3.1-Nemotron-70B-Instruct-HF | Link |
LlamaForCausalLM | Llama-3.1-Nemotron | ibnzterrell/Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF-AWQ-INT4 | Link |
Phi3ForCausalLM | Phi-4 | microsoft/phi-4 | Link |
Phi3ForCausalLM | Phi-4 | stelterlab/phi-4-AWQ | Link |
Qwen2ForCausalLM | Qwen2.5 | Qwen/Qwen2.5-Coder-32B-Instruct | Link |
Qwen2ForCausalLM | Qwen2.5 | Qwen/Qwen2.5-Coder-32B-Instruct-AWQ | Link |
Qwen2ForCausalLM | QwQ | Qwen/QwQ-32B | Link |
Qwen2ForCausalLM | QwQ | Qwen/QwQ-32B-AWQ | Link |
MistralForCausalLM | Mistral | mistralai/Mistral-7B-Instruct-v0.1 | Link |
Embedding Models
Task: Text Embedding
QEff Auto Class: QEFFAutoModel
Architecture | Model Family | Representative Model | QPC Configuration Link |
---|---|---|---|
BERTModel | BERT-based | BAAI/bge-large-en-v1.5 | Link |
BERTModel | BERT-based | BAAI/bge-m3 | Link |
BERTModel | BERT-based | BAAI/bge-base-en-v1.5 | Link |
XLMRobertaModel | XLM-RoBERTa | intfloat/multilingual-e5-large | Link |
XLMRobertaModel | XLM-RoBERTa | intfloat/multilingual-e5-small | Link |
Multimodal Language Models
Task: Vision-Language Models (Text + Image Generation)
QEff Auto Class: AutoPipelineForText2Image
Architecture | Model Family | Representative Model | QPC Configuration Link |
---|---|---|---|
StableDiffusionForTextToImage | SDXL Turbo | stabilityai/sdxl-turbo | Link |
ℹ️ Click a QPC Configuration Link to view the detailed configuration and download options for that model and SDK version.