Speech to text
An AI Speech feature that accurately transcribes spoken audio to text.
Make spoken audio actionable
Quickly and accurately transcribe audio to text in more than 100 languages and variants. Customize models to enhance accuracy for domain-specific terminology. Get more value from spoken audio by enabling search or analytics on transcribed text or facilitating action—all in your preferred programming language.
High-quality transcription
Get accurate audio-to-text transcriptions with state-of-the-art speech recognition.
Customizable models
Add specific words to your base vocabulary or build your own speech-to-text models.
Flexible deployment
Run Speech to Text anywhere—in the cloud or at the edge in containers.
Production-ready
Access the same robust technology that powers speech recognition across Microsoft products.
Accurately transcribe speech from various sources
Convert audio to text from a range of sources, including microphones, audio files, and blob storage. Use speaker diarization to determine who said what and when. Get readable transcripts with automatic formatting and punctuation.
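As a rough illustration of the audio-file case, the sketch below builds a request for the Speech to Text REST API for short audio using only the Python standard library, then reads the text from a simple-format JSON response. The key, region, and file path are placeholders, the helper names are hypothetical, and the code assumes a brief 16 kHz, 16-bit mono PCM WAV clip. Treat it as a starting point under those assumptions rather than a reference implementation.

```python
import json
import urllib.parse
import urllib.request


def build_recognition_request(audio_bytes, key, region, language="en-US"):
    """Build (but do not send) a short-audio recognition request."""
    query = urllib.parse.urlencode({"language": language, "format": "simple"})
    url = (
        f"https://{region}.stt.speech.microsoft.com"
        f"/speech/recognition/conversation/cognitiveservices/v1?{query}"
    )
    headers = {
        # Key of the Speech resource (placeholder value supplied by the caller).
        "Ocp-Apim-Subscription-Key": key,
        # Assumes the clip is 16 kHz PCM audio in a WAV container.
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=audio_bytes, headers=headers, method="POST")


def extract_text(response_body):
    """Return the transcript from a simple-format response, or "" if none."""
    payload = json.loads(response_body)
    if payload.get("RecognitionStatus") != "Success":
        return ""
    return payload.get("DisplayText", "")


def transcribe_wav(path, key, region, language="en-US"):
    """Send one WAV file to the service and return the recognized text."""
    with open(path, "rb") as audio_file:
        request = build_recognition_request(audio_file.read(), key, region, language)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_text(response.read())
```

A call might look like `transcribe_wav("meeting.wav", key, "westus")`, where the file name and region are examples only. Building the request separately from sending it keeps the URL and headers easy to inspect without a network connection.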
Customize speech models to your needs
Tailor your speech models to understand organization- and industry-specific terminology. Overcome speech recognition barriers such as background noise, accents, or unique vocabulary. Customize your models by uploading audio data and transcripts. Automatically generate custom models using Office 365 data to optimize speech recognition accuracy for your organization.
Deploy anywhere
Run Speech to Text wherever your data resides. Build speech applications that take advantage of robust cloud capabilities, or run them on-premises using containers.
Fuel App Innovation with Cloud AI Services
Learn 5 key ways your organization can get started with AI to realize value quickly.
Comprehensive privacy and security
AI Speech, part of Azure AI Services, is certified by SOC, FedRAMP, PCI DSS, HIPAA, HITECH, and ISO.
View and delete your custom speech data and models at any time. Your data is encrypted while it's in storage.
Your data remains yours. Your audio input and transcription data aren't logged during audio processing.
Backed by Azure infrastructure, AI Speech offers enterprise-grade security, availability, compliance, and manageability.
Comprehensive security and compliance, built in
Microsoft invests more than $1 billion annually in cybersecurity research and development.
We employ more than 3,500 security experts who are dedicated to data security and privacy.
Azure has more certifications than any other cloud provider. View the comprehensive list.
Flexible pricing gives you the control you need
With Speech to Text, pay as you go based on the number of hours of audio you transcribe, with no upfront costs.
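Because billing is based on hours of audio, a quick estimate can be made by totalling the duration of the files you plan to transcribe. The sketch below does this for WAV files with Python's standard `wave` module. The function names are hypothetical, and `rate_per_hour` is a value you supply from the current pricing page, not a real price. The estimate ignores free monthly amounts and any rounding the actual bill may apply.

```python
import wave


def audio_hours(wav_paths):
    """Total duration of the given WAV files, in hours."""
    total_seconds = 0.0
    for path in wav_paths:
        with wave.open(path, "rb") as wav:
            # Duration in seconds = number of frames / frames per second.
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 3600.0


def estimate_cost(wav_paths, rate_per_hour):
    """Estimated cost: hours of audio times a caller-supplied hourly rate."""
    return audio_hours(wav_paths) * rate_per_hour
```

For example, `estimate_cost(["call1.wav", "call2.wav"], rate_per_hour=your_rate)` returns the estimate in whatever currency the rate is expressed in.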
Get started with an Azure free account
After your credit, move to pay as you go to keep building with the same free services. Pay only if you use more than your free monthly amounts.
Documentation and resources
Get started.
Browse the documentation
Create an AI Speech service with the Microsoft Learn course
Explore code samples
Check out our sample code
See customization resources
Explore and customize your voice-to-text solution with Speech Studio. No code required.
Frequently asked questions about Speech to Text
What is Speech to Text?
It is a feature within the Speech service that accurately and quickly transcribes audio to text.
What are Azure AI Services?
AI Services are a collection of customizable, prebuilt AI models that can be used to add AI to applications. They span a variety of domains, including Speech, Decision, Language, and Vision. Speech to Text is one feature within the Speech service. Other Speech-related features include Text to Speech, Speech Translation, and Speaker Recognition. An example of a Decision service is Personalizer, which allows you to deliver personalized, relevant experiences. Examples of Language services include Language Understanding, Text Analytics for natural language processing, QnA Maker for FAQ experiences, and Translator for language translation.
About this Guided Project. In this 2-hour, project-based course, you will learn how to import the necessary Python modules for the Azure Speech to Text SDK, create a function to transcribe audio to text, build a web app using Streamlit, and deploy the web app to Heroku. This project is a beginner Python project for anyone interested in learning ...
This demo will show how to use the Microsoft Azure Cognitive Services to convert audio files (.wav format) to text. GitHub code: https://github.com/caiomsouza/...
It uses the Microsoft Azure Cognitive Services Speech SDK to listen to the device's microphone and perform real-time speech-to-text and translations. An Azure Function app provides serverless HTTP APIs that the user interface calls to broadcast translated captions to connected devices using Azure SignalR Service.
Step 1: Go to the Azure portal and search for Speech services.
Step 2: Enter the resource details: a name, the region where the resource should be hosted, and a pricing tier. (For demo purposes, select the Free pricing tier.)
Step 3: Once the resource is created and deployed, a success message is displayed.
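Once the resource exists, its key and region are what client code needs. The following is a minimal sketch of a transcription function using the Speech SDK for Python (the `azure-cognitiveservices-speech` package). It assumes the key and region are stored in environment variables named `SPEECH_KEY` and `SPEECH_REGION`; those names, the helper names, and the file path are illustrative choices, not requirements of the service. The SDK is imported inside the function so the settings helper can be used without the package installed.

```python
import os


def load_speech_settings(env=None):
    """Read the resource key and region from environment variables.

    SPEECH_KEY and SPEECH_REGION are assumed names for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get("SPEECH_KEY")
    region = env.get("SPEECH_REGION")
    if not key or not region:
        raise RuntimeError("Set SPEECH_KEY and SPEECH_REGION first.")
    return key, region


def transcribe_file(path, key, region, language="en-US"):
    """Recognize a single utterance from a WAV file and return its text."""
    # Imported lazily; requires `pip install azure-cognitiveservices-speech`.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.speech_recognition_language = language
    audio_config = speechsdk.audio.AudioConfig(filename=path)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )

    # recognize_once() returns after the first utterance it detects.
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text
    if result.reason == speechsdk.ResultReason.NoMatch:
        return ""
    raise RuntimeError(f"Recognition did not complete: {result.reason}")
```

A typical call would be `key, region = load_speech_settings()` followed by `transcribe_file("sample.wav", key, region)`. Since `recognize_once()` stops after one utterance, longer recordings would need the SDK's continuous recognition instead.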