Temperature: A parameter that controls the randomness of a language model's output. Low temperature (near 0) makes the model favor the most likely tokens, producing focused, near-deterministic text; high temperature (above 1) flattens the distribution, making outputs more random and creative. A key knob for controlling generation behavior.
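A minimal NumPy sketch of how temperature scaling is typically applied: the logits are divided by the temperature before the softmax, so low values sharpen the distribution and high values flatten it. The logit values here are purely illustrative.

```python
import numpy as np

def apply_temperature(logits, temperature=1.0):
    """Divide logits by temperature, then softmax into probabilities."""
    scaled = np.asarray(logits, dtype=float) / temperature
    exps = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return exps / exps.sum()

logits = [2.0, 1.0, 0.5]
print(apply_temperature(logits, temperature=0.2))  # sharply peaked: top token dominates
print(apply_temperature(logits, temperature=2.0))  # flatter: probability spreads out
```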
Sampling: The process of selecting the next token from the model's predicted probability distribution during text generation.
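A toy illustration of that sampling step, assuming a hypothetical four-word vocabulary and made-up probabilities; a real model produces a distribution over its full vocabulary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical next-token distribution over a tiny vocabulary.
vocab = ["the", "a", "cat", "dog"]
probs = [0.5, 0.3, 0.15, 0.05]

# Draw the next token in proportion to its predicted probability.
print(rng.choice(vocab, p=probs))
```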
Top-p sampling: A text generation method (also called nucleus sampling) that considers only the smallest set of most-likely tokens whose cumulative probability reaches a threshold p, then samples within that set.
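A sketch of one common way to implement it (the function is ours for illustration, not from any particular library): sort tokens by probability, keep the smallest prefix whose cumulative mass reaches p, renormalize, and sample within that nucleus.

```python
import numpy as np

def top_p_sample(probs, p=0.9, rng=None):
    """Sample a token index from the smallest set of most-likely
    tokens whose cumulative probability reaches p (the 'nucleus')."""
    if rng is None:
        rng = np.random.default_rng()
    order = np.argsort(probs)[::-1]              # token indices, most likely first
    sorted_probs = np.asarray(probs)[order]
    cumulative = np.cumsum(sorted_probs)
    cutoff = np.searchsorted(cumulative, p) + 1  # smallest prefix reaching p
    weights = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()
    return rng.choice(order[:cutoff], p=weights)

probs = [0.5, 0.3, 0.15, 0.05]
print(top_p_sample(probs, p=0.9))  # samples only among the top three tokens
```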
Softmax: A function that converts a vector of numbers into a probability distribution: all values lie between 0 and 1 and sum to 1.
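The definition in code, as a small NumPy sketch; subtracting the maximum before exponentiating is a standard stability trick and does not change the result.

```python
import numpy as np

def softmax(z):
    """softmax(z)_i = exp(z_i) / sum_j exp(z_j)"""
    z = np.asarray(z, dtype=float)
    exps = np.exp(z - z.max())  # max-subtraction avoids overflow
    return exps / exps.sum()

p = softmax([1.0, 2.0, 3.0])
print(p, p.sum())  # approx [0.09 0.245 0.665], sums to 1.0
```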
Activation function: A mathematical function applied to a neuron's output that introduces non-linearity into the network.
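For instance, ReLU is one widely used activation (sigmoid and tanh are others); the input values below are illustrative.

```python
import numpy as np

def relu(x):
    """ReLU: max(0, x) elementwise. Zeroing the negative part is what
    makes a stack of otherwise-linear layers non-linear overall."""
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, 0.0, 3.0])))  # [0. 0. 3.]
```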
Adam: An optimization algorithm that combines the strengths of two earlier methods, AdaGrad and RMSProp.
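A minimal sketch of a single Adam update with the paper's default hyperparameters; the function shape is ours for illustration, not a library API.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: an exponential moving average of gradients (m)
    and of squared gradients (v), both corrected for zero-initialization bias."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# One step on a toy scalar parameter with gradient 0.5:
p, m, v = adam_step(param=1.0, grad=0.5, m=0.0, v=0.0, t=1)
print(p)  # 0.999 (moved against the gradient by roughly lr)
```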
AGI: Artificial General Intelligence, a hypothetical AI system with human-level competence across a broad range of cognitive tasks rather than skill in a single narrow domain.