Get Started with the New Advanced Audio Models in Azure OpenAI Service
We’re excited to announce the preview availability of Azure OpenAI’s advanced audio models: GPT-4o-Transcribe, GPT-4o-Mini-Transcribe, and GPT-4o-Mini-TTS. This guide gives developers the essential insights and steps to use these audio capabilities effectively in their applications.
What’s New in Azure OpenAI Audio Models?
Azure OpenAI introduces three powerful new audio models, available for deployment today in East US2 on Azure AI Foundry.
- GPT-4o-Transcribe and GPT-4o-Mini-Transcribe: Speech-to-text models that outperform previous benchmarks.
- GPT-4o-Mini-TTS: A customizable text-to-speech model that accepts detailed instructions on speech characteristics.
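To make the "detailed instructions" idea concrete, here is a minimal sketch of how a text-to-speech request might be assembled with only the Python standard library. It builds the request without sending it. The deployment name, the `alloy` voice, the `instructions` wording, and the helper name `build_tts_request` are illustrative assumptions, and the URL follows the usual Azure OpenAI deployment-scoped pattern; check the service documentation for the exact API version and fields supported in your region.

```python
import json
import urllib.request


def build_tts_request(endpoint, api_key, api_version, deployment, text,
                      voice="alloy", instructions=None):
    """Build (but do not send) a text-to-speech request.

    Hypothetical helper: the URL shape and body fields are assumptions
    based on the common Azure OpenAI REST pattern.
    """
    url = (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
           f"/audio/speech?api-version={api_version}")
    payload = {"model": deployment, "input": text, "voice": voice}
    if instructions:
        # Free-form guidance on tone, pacing, emotion, and similar traits.
        payload["instructions"] = instructions
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_tts_request(
        endpoint="https://example.openai.azure.com/",  # placeholder resource
        api_key="your-azure-openai-api-key",
        api_version="2025-04-14",                      # value from the example .env
        deployment="gpt-4o-mini-tts",                  # assumed deployment name
        text="Your order has shipped.",
        instructions="Speak warmly and slowly, like a friendly concierge.",
    )
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; the response body would then be the audio bytes.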
Model Comparison
Feature | GPT-4o-Transcribe | GPT-4o-Mini-Transcribe | GPT-4o-Mini-TTS |
---|---|---|---|
Performance | Best Quality | Great Quality | Best Quality |
Speed | Fast | Fastest | Fastest |
Input | Text, Audio | Text, Audio | Text |
Output | Text | Text | Audio |
Streaming | ✅ | ✅ | ✅ |
Ideal Use Cases | Accurate transcription for challenging environments like customer call centers and automated meeting notes | Rapid transcription for live captioning, quick-response apps, and budget-sensitive scenarios | Customizable interactive voice outputs for chatbots, virtual assistants, accessibility tools, and educational apps |
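One way to use the comparison table in application code is to encode it as data and pick a model by task and priority. The sketch below is illustrative only: the lowercase model identifiers, the `MODELS` dictionary, and the `pick_model` helper are assumptions made for this example, not part of any SDK.

```python
# Comparison table encoded as data (values taken from the table above).
MODELS = {
    "gpt-4o-transcribe": {
        "task": "speech-to-text", "quality": "best", "speed": "fast"},
    "gpt-4o-mini-transcribe": {
        "task": "speech-to-text", "quality": "great", "speed": "fastest"},
    "gpt-4o-mini-tts": {
        "task": "text-to-speech", "quality": "best", "speed": "fastest"},
}

# Ordering of the table's labels, best first.
RANKING = {
    "quality": ["best", "great"],
    "speed": ["fastest", "fast"],
}


def pick_model(task, prefer="quality"):
    """Return the model for `task` that ranks highest on `prefer`."""
    candidates = {name: info for name, info in MODELS.items()
                  if info["task"] == task}
    if not candidates:
        raise ValueError(f"no model for task {task!r}")
    order = RANKING[prefer]
    return min(candidates, key=lambda name: order.index(candidates[name][prefer]))


if __name__ == "__main__":
    print(pick_model("speech-to-text", prefer="quality"))
    print(pick_model("speech-to-text", prefer="speed"))
    print(pick_model("text-to-speech"))
```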
Technical Innovations
- Targeted Audio Pretraining: OpenAI’s GPT-4o audio models leverage extensive pretraining on specialized audio datasets, significantly improving their understanding of speech nuances.
- Advanced Distillation Techniques: Sophisticated distillation methods transfer knowledge from larger models to smaller, more efficient ones while preserving high performance.
- Reinforcement Learning: Integrated RL techniques dramatically improve transcription accuracy and reduce misrecognition, giving the speech-to-text models state-of-the-art performance on complex speech recognition tasks.
Getting Started Guide for Developers
Use the Azure OpenAI TTS Demo repository to explore GPT-4o audio models through practical, hands-on examples.
Get started: https://ift.tt/hMLoE7v
Step 1: Clone the Repository
git clone https://github.com/Azure-Samples/azure-openai-tts-demo.git
cd azure-openai-tts-demo
Step 2: Configure Your Environment
Create your virtual environment and install dependencies:
python -m venv .venv
source .venv/bin/activate # macOS/Linux
.venv\Scripts\activate # Windows
pip install -r requirements.txt
Set up your Azure credentials by creating a .env file:
cp .env.example .env
# Edit .env with your Azure OpenAI endpoint and API key
Example .env:
AZURE_OPENAI_ENDPOINT="https://<your-resource-name>.openai.azure.com/"
AZURE_OPENAI_API_KEY="your-azure-openai-api-key"
AZURE_OPENAI_API_VERSION="2025-04-14"
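Before launching the demo, it can help to confirm that the three settings are actually present. The following is a minimal standard-library sketch of reading and validating a .env file of the shape shown above. The helper names `parse_env` and `load_settings` are hypothetical, and the demo repository may load its configuration differently (for example through a dotenv package).

```python
import os

REQUIRED = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
)


def parse_env(text):
    """Parse simple KEY="value" lines; skip blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(path=".env"):
    """Return the required settings, failing loudly if any is missing."""
    with open(path, encoding="utf-8") as fh:
        values = parse_env(fh.read())
    # Real environment variables take precedence over the file.
    settings = {key: os.environ.get(key, values.get(key, "")) for key in REQUIRED}
    missing = [key for key, value in settings.items() if not value]
    if missing:
        raise RuntimeError("missing settings: " + ", ".join(missing))
    return settings


if __name__ == "__main__":
    sample = 'AZURE_OPENAI_API_VERSION="2025-04-14"\n# a comment\n'
    print(parse_env(sample))
```

Failing at startup with the names of the missing variables is usually easier to debug than an authentication error from the first API call.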
Step 3: Run the Interactive Gradio Soundboard
Launch the demo to experiment interactively:
python soundboard.py
Select different voices and vibes, then listen to the generated speech.
Step 4: Explore Additional Sample Scripts
Run sample scripts for specific audio tasks:
- Streaming audio to a file
python streaming-tts-to-file-sample.py
- Asynchronous streaming and playback
python async-streaming-tts-sample.py
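The sample scripts themselves are not reproduced here, but the core pattern behind streaming audio to a file can be sketched independently: consume chunks as they arrive and write each one immediately instead of buffering the whole response. In the sketch below, `fake_audio_chunks` is a stand-in for a real streaming TTS response and `save_stream` is a hypothetical helper; neither is taken from the repository.

```python
import asyncio
import os
import tempfile


async def fake_audio_chunks(total_bytes=10_000, chunk_size=4096):
    """Stand-in for a streaming TTS response; yields silent bytes."""
    sent = 0
    while sent < total_bytes:
        size = min(chunk_size, total_bytes - sent)
        await asyncio.sleep(0)  # hand control back, as a network read would
        yield b"\x00" * size
        sent += size


async def save_stream(chunks, path):
    """Write each chunk to `path` as it arrives; return bytes written."""
    written = 0
    # Plain blocking writes keep the sketch short; they are brief per chunk.
    with open(path, "wb") as fh:
        async for chunk in chunks:
            fh.write(chunk)
            written += len(chunk)
    return written


if __name__ == "__main__":
    target = os.path.join(tempfile.gettempdir(), "speech-sketch.bin")
    count = asyncio.run(save_stream(fake_audio_chunks(), target))
    print(f"wrote {count} bytes to {target}")
```

For playback rather than saving, the same loop would hand each chunk to an audio output device instead of a file handle, so speech can start before synthesis finishes.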
Developer Impact
Integrating Azure OpenAI advanced audio models allows developers to:
- Easily incorporate advanced transcription and TTS functionality.
- Create highly interactive, intuitive voice-driven applications.
- Enhance user experience with customizable and expressive audio interactions.
Further Exploration
- Try the code samples
- Azure OpenAI Audio models documentation
- Explore the models in Azure AI Foundry
We encourage developers to leverage these innovative audio models and share their insights and feedback!
HI-FI News
via Microsoft for Developers https://ift.tt/1v2hGLk
April 16, 2025 at 06:06PM