Gemma 3 27b it

Model Overview

Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants.

Gemma 3 models are well suited to a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources, such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

  • Model Architecture: Gemma 3 has a 128K-token context window, supports over 140 languages, and is available in more sizes than previous versions.
  • Model Source: google/gemma-3-27b-it
  • License: gemma

QPC Configuration # 1

| Precision | SoCs / Tensor slicing | NSP-Cores (per SoC) | Full Batch Size | Chunking Prompt Length | Context Length (CL) | QPC URL | QPC Size | ONNX URL | Generation Date |
|---|---|---|---|---|---|---|---|---|---|
| MXFP6 | 4 | 8 | 1 | 1024 | 16192 | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_qpc_8cores_1024pl_16192cl_4devices_prefill.tar.gz | 2.4GB | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_ONNX_prefill.tar.gz | 15-Apr-2026 |
| MXFP6 | 4 | 8 | 1 | 1024 | 16192 | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_qpc_8cores_1024pl_16192cl_4devices_decode.tar.gz | 44GB | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_ONNX_decode.tar.gz | 15-Apr-2026 |
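
Each row above points at a `.tar.gz` archive, so fetching one comes down to a streamed download followed by an unpack. The following is a minimal sketch using only the Python standard library. The function name `fetch_and_extract` and the destination directory are illustrative assumptions, not part of any SDK, and the layout of the files inside the archive is not described on this page.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path

# URL copied from the prefill row of QPC Configuration # 1.
QPC_PREFILL_URL = (
    "https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/"
    "gemma-3-27b-it_qpc_8cores_1024pl_16192cl_4devices_prefill.tar.gz"
)


def fetch_and_extract(url: str, dest_dir: str) -> Path:
    """Download a .tar.gz archive from `url` and unpack it under `dest_dir`.

    Hypothetical helper for illustration; returns the path of the saved archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]

    # Stream to disk in 1 MiB blocks so a multi-gigabyte archive
    # is never held in memory all at once.
    with urllib.request.urlopen(url) as response, open(archive, "wb") as out:
        shutil.copyfileobj(response, out, length=1 << 20)

    # Unpack next to the archive. Only extract archives from a source you trust.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    return archive
```

Calling `fetch_and_extract(QPC_PREFILL_URL, "qpc/gemma-3-27b-it")` would save and unpack the 2.4GB prefill archive; the decode archive is listed at 44GB, so check free disk space for both the archive and its extracted contents first.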

QPC Configuration # 2

| Precision | SoCs / Tensor slicing | NSP-Cores (per SoC) | Full Batch Size | Chunking Prompt Length | Context Length (CL) | QPC URL | QPC Size | ONNX URL | Generation Date |
|---|---|---|---|---|---|---|---|---|---|
| MXFP6 | 2 | 16 | 1 | 128 | 4096 | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_qpc_16cores_128pl_4098cl_2devices_prefill.tar.gz | 1.1GB | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_ONNX_prefill.tar.gz | 15-Apr-2026 |
| MXFP6 | 2 | 16 | 1 | 128 | 4096 | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_qpc_16cores_128pl_4098cl_2devices_decode.tar.gz | 33GB | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_ONNX_decode.tar.gz | 15-Apr-2026 |
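
The "Chunking Prompt Length" and "Context Length (CL)" columns lend themselves to a quick back-of-the-envelope check. The sketch below assumes, and this page does not confirm it, that the prompt is processed in fixed-size chunks of the chunking prompt length during prefill, and that prompt plus generated tokens must fit within the context length. Both function names are hypothetical.

```python
def prefill_chunks(prompt_tokens: int, prompt_length: int) -> int:
    """Number of fixed-size chunks needed to cover the prompt (ceiling division).

    Assumption: prefill consumes the prompt `prompt_length` tokens at a time.
    """
    if prompt_length <= 0:
        raise ValueError("prompt_length must be positive")
    return (prompt_tokens + prompt_length - 1) // prompt_length


def max_new_tokens(prompt_tokens: int, context_length: int) -> int:
    """Upper bound on generated tokens, assuming prompt + output share the context."""
    return max(context_length - prompt_tokens, 0)


# A 1000-token prompt against the values in the tables on this page:
print(prefill_chunks(1000, 128))    # 8  (chunking prompt length 128)
print(prefill_chunks(1000, 1024))   # 1  (chunking prompt length 1024)
print(max_new_tokens(1000, 4096))   # 3096 (context length 4096)
```

Under these assumptions, a shorter chunking prompt length means more prefill passes for a long prompt, while the context length caps how much room is left for the response.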

QPC Configuration # 3

| Precision | SoCs / Tensor slicing | NSP-Cores (per SoC) | Full Batch Size | Chunking Prompt Length | Context Length (CL) | QPC URL | QPC Size | ONNX URL | Generation Date |
|---|---|---|---|---|---|---|---|---|---|
| MXFP6 | 4 | 8 | 1 | 128 | 65536 | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_qpc_8cores_128pl_64kcl_[4k,8k,13k,16k,32k,64k]ccl_4device_prefill.tar.gz | 2.4GB | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_ONNX_prefill.tar.gz | 28-Apr-2026 |
| MXFP6 | 4 | 8 | 1 | 128 | 65536 | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_qpc_8cores_128pl_64kcl_[4k,8k,13k,16k,32k,64k]ccl_4devices_decode.tar.gz | 43GB | https://dc00tk1pxen80.cloudfront.net/SDK1.21.4.0/google/gemma-3-27b-it/gemma-3-27b-it_ONNX_decode.tar.gz | 28-Apr-2026 |
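
The three configurations differ mainly in SoC count, chunking prompt length, and context length, so choosing between them can be written as a small lookup. The sketch below copies those values from the tables on this page; the `QpcConfig` class, the `pick_config` function, and the selection rule (smallest context length that fits, then fewest SoCs) are illustrative assumptions rather than an official recommendation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QpcConfig:
    name: str
    socs: int
    cores_per_soc: int
    prompt_length: int
    context_length: int


# Values copied from the three configuration tables on this page.
CONFIGS = [
    QpcConfig("QPC Configuration # 1", socs=4, cores_per_soc=8, prompt_length=1024, context_length=16192),
    QpcConfig("QPC Configuration # 2", socs=2, cores_per_soc=16, prompt_length=128, context_length=4096),
    QpcConfig("QPC Configuration # 3", socs=4, cores_per_soc=8, prompt_length=128, context_length=65536),
]


def pick_config(required_context: int, available_socs: int) -> Optional[QpcConfig]:
    """Return the smallest configuration that fits, or None if none does.

    Hypothetical rule: keep configurations whose context length covers the
    request and whose SoC count is available, then prefer the smallest
    context length and, on ties, the fewest SoCs.
    """
    candidates = [
        c for c in CONFIGS
        if c.context_length >= required_context and c.socs <= available_socs
    ]
    return min(candidates, key=lambda c: (c.context_length, c.socs), default=None)


choice = pick_config(required_context=10_000, available_socs=4)
print(choice.name if choice else "no matching configuration")
# prints "QPC Configuration # 1"
```

With only two SoCs available, the same 10,000-token request returns `None`, since QPC Configuration # 2 is the only two-SoC option and its context length is 4096.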