AIDive

What is Automatic Speech Recognition (ASR)

Technology that converts speech into text using audio processing and language models.

Definition

Automatic speech recognition (ASR) is used in captioning, voice assistants, dictation, call analytics, meeting minutes, and content accessibility. To produce a transcript, the system must process the audio, separate speech from background noise, recognize the words, and assemble them into meaningful text.
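To make the "separate speech from noise" step concrete, here is a minimal sketch of energy-based voice activity detection. It assumes the audio is already a list of samples in the range -1.0 to 1.0. The function names, the 160-sample frame length, and the fixed 0.1 threshold are illustrative assumptions, not part of any particular ASR product, and production systems generally use more robust methods.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS energy of each non-overlapping frame.

    A trailing partial frame (shorter than frame_len) is dropped.
    """
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies


def detect_speech(samples, frame_len=160, threshold=0.1):
    """Flag a frame as speech when its RMS energy exceeds the threshold.

    The fixed threshold is an assumption made for this sketch.
    """
    return [rms > threshold for rms in frame_rms(samples, frame_len)]


def tone(amplitude, n, freq=200.0, rate=16000):
    """Generate n samples of a sine wave (a stand-in for real audio)."""
    return [amplitude * math.sin(2 * math.pi * freq * t / rate) for t in range(n)]


# Quiet background, a louder "speech" burst, then quiet again.
signal = tone(0.01, 160) + tone(0.5, 160) + tone(0.01, 160)
print(detect_speech(signal))
```

Only the middle frame is loud enough to cross the threshold, so it is the only one flagged as speech.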

Example

After an online meeting, an ASR-based service automatically creates a transcript of the conversation, which can then be condensed into a short summary.

Why it matters

The term matters to anyone looking for transcription, voice typing, call analysis, or subtitling tools, because ASR is the technology underlying all of them.

How it works

The system analyzes the audio signal, extracts speech features, matches them with likely words, and uses context to select the most plausible phrase.
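The last step, using context to select the most plausible phrase, can be sketched as a search that combines two scores: how well each candidate word matches the audio, and how likely it is to follow the previous word. The candidate lists, all probabilities, and the `decode` function below are invented for illustration. Real systems derive these scores from trained acoustic and language models.

```python
import math

# Hypothetical candidates per word position: (word, acoustic probability).
# The audio slightly favors "here" over the similar-sounding "hear".
SLOTS = [
    [("i", 1.0)],
    [("here", 0.55), ("hear", 0.45)],
    [("you", 1.0)],
]

# Hypothetical bigram language model: P(word | previous word).
BIGRAMS = {
    ("<s>", "i"): 0.2,
    ("i", "hear"): 0.05,
    ("i", "here"): 0.001,
    ("hear", "you"): 0.1,
    ("here", "you"): 0.02,
}


def decode(slots, bigrams, floor=1e-6):
    """Pick the word sequence with the highest combined log score.

    Unseen word pairs get a small floor probability (an assumption of
    this sketch) so that the logarithm is always defined.
    """
    # best[word] = (log score, path) of the best sequence ending in word
    best = {"<s>": (0.0, [])}
    for candidates in slots:
        following = {}
        for word, p_acoustic in candidates:
            options = []
            for prev, (score, path) in best.items():
                p_context = bigrams.get((prev, word), floor)
                total = score + math.log(p_acoustic) + math.log(p_context)
                options.append((total, path + [word]))
            following[word] = max(options, key=lambda o: o[0])
        best = following
    return max(best.values(), key=lambda o: o[0])[1]


print(decode(SLOTS, BIGRAMS))  # context overrides the acoustic preference
print(decode(SLOTS, {}))       # no context: the acoustic score decides
```

With the bigram table the result is "i hear you", even though the acoustic score alone prefers "here". With an empty table every word pair gets the same floor score, so the acoustic preference wins and the result is "i here you".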

Where it is used

  • transcription
  • subtitles
  • voice assistants and call centers

Limitations

Quality depends on the language, the speaker's accent, background noise, microphone quality, overlapping voices, and specialized terminology.
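One common way to put a number on transcription quality is word error rate (WER): the minimum number of word substitutions, insertions, and deletions needed to turn the system output into a reference transcript, divided by the number of words in the reference. The sketch below is a minimal version. The function name and the lowercase-and-split normalization are assumptions, and real evaluations usually normalize punctuation and numbers more carefully.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
print(word_error_rate("turn on the kitchen light", "turn on the chicken light"))
```

A lower value is better. Comparing WER on clean audio against WER on noisy or accented audio shows how strongly the factors above affect a given system.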