Converting spoken audio into written text. Also called speech-to-text or ASR (Automatic Speech Recognition). OpenAI's Whisper is a leading open model. Quality has improved to near-human levels for many languages. Powers voice assistants, transcription services, and accessibility tools.
AI systems that convert written text into natural-sounding spoken audio.
OpenAI's open-source speech recognition model.
Natural Language Processing.
A mathematical function applied to a neuron's output that introduces non-linearity into the network.
An optimization algorithm that combines the best parts of two other methods — AdaGrad and RMSProp.
Artificial General Intelligence.
Browse our complete glossary or subscribe to our newsletter for the latest AI news and insights.