Autoregressive model: A model that generates output one piece at a time, with each new piece depending on all the previous ones. GPT and other large language models work this way: they predict the next token based on everything that came before it. Great for text generation, but inherently sequential.
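To make the loop concrete, here is a minimal Python sketch of autoregressive generation. The `toy_next_token` function and its tiny lookup table are invented stand-ins for a real model's next-token distribution, not anything GPT actually uses; the point is only the shape of the loop.

```python
import random

def toy_next_token(context):
    # Stand-in for a real model: picks the next token from a tiny
    # hand-written table keyed on the last token (hypothetical data).
    table = {
        "the": ["cat", "dog"],
        "cat": ["sat", "ran"],
        "dog": ["barked"],
        "sat": ["down"],
    }
    candidates = table.get(context[-1], ["<eos>"])
    return random.choice(candidates)

def generate(prompt, max_tokens=5):
    # The autoregressive loop: each new token is chosen conditioned
    # on the full sequence so far, then appended to that sequence.
    tokens = list(prompt)
    for _ in range(max_tokens):
        nxt = toy_next_token(tokens)
        if nxt == "<eos>":
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate(["the"]))  # e.g. "the cat sat down"
```

Note that each step must wait for the previous one to finish, which is exactly what "inherently sequential" means here.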
Transformer: The neural network architecture behind virtually all modern AI language models. Introduced in 2017, it replaced recurrence with self-attention, letting the model process all tokens in a sequence in parallel.
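As a rough illustration of the self-attention at the Transformer's core, the numpy sketch below computes single-head scaled dot-product attention. The random weight matrices and 4-token input are placeholders, not real model weights.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    # Toy single-head self-attention: every position builds its output
    # as a weighted mix of all positions, with weights derived from
    # query-key similarity scores.
    d = X.shape[-1]
    Wq = np.random.randn(d, d) * 0.1
    Wk = np.random.randn(d, d) * 0.1
    Wv = np.random.randn(d, d) * 0.1
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)          # pairwise relevance
    return softmax(scores, axis=-1) @ V    # blend values by relevance

X = np.random.randn(4, 8)  # 4 tokens, 8-dim embeddings (made-up)
print(self_attention(X).shape)  # (4, 8)
```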
Next-token prediction: The fundamental task that language models are trained on. Given a sequence of tokens, the model must predict the token that comes next.
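As a rough illustration of the training objective, the numpy sketch below pairs each position in a sequence with the token that follows it and scores some made-up model probabilities with cross-entropy. The token ids, vocabulary size, and probability values are all invented for the example.

```python
import numpy as np

# A training sequence of token ids (hypothetical vocabulary of size 5).
tokens = np.array([0, 3, 1, 4, 2])

# Next-token prediction pairs each position with the token after it:
# the model sees tokens up to position i and must predict tokens[i+1].
inputs, targets = tokens[:-1], tokens[1:]

# Pretend the model produced these per-position probability
# distributions over the vocabulary (made-up numbers that sum to 1).
probs = np.full((len(inputs), 5), 0.1)
probs[np.arange(len(inputs)), targets] = 0.6  # mass on the right answer

# The standard training loss: mean negative log-likelihood of the
# actual next token at each position (cross-entropy).
loss = -np.log(probs[np.arange(len(inputs)), targets]).mean()
print(inputs, targets, round(loss, 3))
```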
Language model: An AI model trained on text that can understand and generate human language.
Activation function: A mathematical function applied to a neuron's output that introduces non-linearity into the network. Without it, stacked layers would collapse into a single linear transformation.
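Two of the most common activation functions are simple enough to write out directly. This sketch uses numpy, and the input values are made up for the example.

```python
import numpy as np

def relu(x):
    # Rectified Linear Unit: passes positives through, zeros out negatives.
    return np.maximum(0.0, x)

def sigmoid(x):
    # Squashes any real-valued input into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

# Applying one of these between linear layers is what lets a network
# model curved decision boundaries instead of just straight lines.
z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))     # [0.  0.  0.  1.5]
print(sigmoid(z))  # values strictly between 0 and 1
```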
Adam: An optimization algorithm that combines the strengths of two earlier methods, AdaGrad and RMSProp, by keeping running averages of both the gradient and its square.
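Here is a minimal sketch of a single Adam update step, assuming the standard hyperparameter defaults from the original paper. The toy objective f(theta) = theta^2 (gradient 2 * theta) is chosen just to show the optimizer converging.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # First moment (momentum-like running average of gradients) and
    # second moment (RMSProp-like running average of squared gradients),
    # each with bias correction for the early steps.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)
    v_hat = v / (1 - b2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta = np.array([2.0])
m = v = np.zeros_like(theta)
for t in range(1, 501):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.05)
print(theta)  # close to 0
```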
AGI: Artificial General Intelligence, a hypothetical AI system that could match or exceed human ability across virtually any cognitive task, rather than excelling only in a narrow domain.