# Whisper medium

## Model Overview
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Whisper models generalise well to many datasets and domains without fine-tuning. The models were trained on either English-only or multilingual data: the English-only models were trained on speech recognition, while the multilingual models were trained on both speech recognition and speech translation. For speech recognition, the model predicts transcriptions in the same language as the audio; for speech translation, it predicts transcriptions in a different language from the audio.
- Model Architecture: Transformer-based encoder-decoder (sequence-to-sequence) model with 769M parameters. It was trained on 680k hours of labelled speech data annotated using large-scale weak supervision.
- Model Source: openai/whisper-medium
- License: Apache 2.0
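As a minimal usage sketch of the two tasks described above, the snippet below runs the model through the Hugging Face `transformers` ASR pipeline. The audio path is a hypothetical placeholder, and running `transcribe` downloads the model weights from the `openai/whisper-medium` source listed above.

```python
MODEL_ID = "openai/whisper-medium"


def transcribe(audio_path: str, translate: bool = False) -> str:
    """Transcribe (or translate) an audio file with Whisper medium.

    Sketch only: requires the Hugging Face `transformers` library and
    network access to download the model weights on first use.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # For the multilingual model, task="translate" requests translation
    # into English instead of same-language transcription.
    generate_kwargs = {"task": "translate"} if translate else {}
    return asr(audio_path, generate_kwargs=generate_kwargs)["text"]
```

For example, `transcribe("sample.wav")` returns a transcription in the audio's language, while `transcribe("sample.wav", translate=True)` asks the multilingual model to translate instead.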
## QPC Configurations
| Precision | SoCs / Tensor slicing | NSP-Cores (per SoC) | Batch Size | Chunking Prompt Length (PL) | Context Length (CL) | QPC URL |
|---|---|---|---|---|---|---|
| MXFP6 | 2 | 8 | 1 | 1 | 150 | https://dc00tk1pxen80.cloudfront.net/SDK1.20.4/openai/whisper-medium/whisper-medium_qpc_8cores_1pl_150cl_1fbs_2devices.tar.gz |
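The bundle filename encodes the table fields above (cores, prompt length, context length, batch size, device count). The sketch below reassembles the URL from those fields and prints the commands to fetch and unpack the pre-compiled QPC; it only prints, so run the printed commands yourself when ready to download.

```shell
# Compose the QPC bundle URL from the configuration fields in the table.
BASE="https://dc00tk1pxen80.cloudfront.net/SDK1.20.4/openai/whisper-medium"
CORES=8      # NSP cores per SoC
PL=1         # chunking prompt length
CL=150       # context length
FBS=1        # batch size field in the filename
DEVICES=2    # SoCs / tensor slicing

TARBALL="whisper-medium_qpc_${CORES}cores_${PL}pl_${CL}cl_${FBS}fbs_${DEVICES}devices.tar.gz"

# Print (not run) the download and extraction commands.
echo "curl -fLO ${BASE}/${TARBALL}"
echo "tar -xzf ${TARBALL}"
```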