
Acoustic Modeling

Natural Language Processing

An approach to speech recognition that associates an audio signal with likely sounds, syllables, or words.

Definition

Acoustic modeling underlies systems that turn speech into text. The model analyzes frequencies, pauses, intonation, noise, and pronunciation features, and then estimates which speech units were spoken. Modern systems often combine the acoustic model with a language model, which helps choose the most plausible wording among candidates that sound alike.

Example

When a person says “put the meeting at five,” the speech recognition system must distinguish the command from similar-sounding words and take into account the context.
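
The choice between similar-sounding candidates can be sketched as a weighted sum of two log scores, one from the acoustic model and one from a language model. This is only an illustration: the scores, the second candidate, the `lm_weight` value, and the function names are all made-up assumptions, not output from any real system.

```python
import math

# Hypothetical scores for two candidate transcripts (illustrative numbers only).
# "acoustic": log-probability that the audio matches the words.
# "lm": log-probability of the word sequence under a language model.
candidates = {
    "put the meeting at five": {"acoustic": math.log(0.40), "lm": math.log(0.02)},
    "put the meeting at hive": {"acoustic": math.log(0.45), "lm": math.log(0.0001)},
}

def combined_score(scores, lm_weight=0.8):
    """Add the acoustic log score and the weighted language-model log score."""
    return scores["acoustic"] + lm_weight * scores["lm"]

def best_transcript(candidates, lm_weight=0.8):
    """Return the candidate text with the highest combined score."""
    return max(candidates, key=lambda text: combined_score(candidates[text], lm_weight))

print(best_transcript(candidates))                 # put the meeting at five
print(best_transcript(candidates, lm_weight=0.0))  # put the meeting at hive
```

With the language model switched off (`lm_weight=0.0`), the slightly better acoustic match wins even though the phrase is implausible, which is why the two models are usually combined.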

Why it matters

Understanding acoustic modeling helps when evaluating and comparing transcription services, voice assistants, subtitling tools, and call-center or call-analysis products.

How it works

The audio is split into short frames, features are extracted from each frame, and the features are fed into the model. The model estimates the probabilities of sound sequences, and the system assembles them into words using a pronunciation dictionary and the surrounding context.
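
The steps above can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real recognizer: the signal is synthetic, the only feature is log energy, and `toy_acoustic_model` is a hypothetical stand-in that merely separates silence from sound instead of predicting phonemes. The 25 ms frame length and 10 ms hop are common choices, assumed here for illustration.

```python
import math

def split_into_frames(samples, frame_len, hop):
    """Cut the signal into fixed-length, overlapping frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def log_energy(frame):
    """One very simple feature per frame: log of the mean squared amplitude."""
    energy = sum(x * x for x in frame) / len(frame)
    return math.log(energy + 1e-10)  # small constant avoids log(0) on silence

def toy_acoustic_model(feature, threshold=-5.0):
    """Hypothetical stand-in for a trained model: probabilities of units per frame."""
    p_speech = 1.0 / (1.0 + math.exp(-(feature - threshold)))
    return {"sil": 1.0 - p_speech, "speech": p_speech}

def greedy_decode(posteriors):
    """Pick the most likely unit per frame and merge consecutive repeats."""
    units = []
    for probs in posteriors:
        best = max(probs, key=probs.get)
        if not units or units[-1] != best:
            units.append(best)
    return units

# Synthetic 16 kHz signal: 0.1 s of silence, 0.2 s of a 440 Hz tone, 0.1 s of silence.
sample_rate = 16000
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(3200)]
signal = silence + tone + silence

frames = split_into_frames(signal, frame_len=400, hop=160)  # 25 ms frames, 10 ms hop
features = [log_energy(frame) for frame in frames]
posteriors = [toy_acoustic_model(x) for x in features]
print(greedy_decode(posteriors))  # ['sil', 'speech', 'sil']
```

A production system would replace the energy feature with richer spectral features and the threshold function with a trained neural network, and would decode with a dictionary and language model rather than a per-frame greedy pick.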

Where it is used

  • speech recognition
  • automatic subtitles
  • voice assistants and call analytics

Limitations

Recognition quality drops with heavy background noise, a poor microphone, rare accents, fast speech, and specialized vocabulary. Business use often requires adapting the model to the domain.
