AI Glossary
Your guide to understanding AI and machine learning terminology. From transformers and attention to RLHF and fine-tuning — every term explained in plain language.
A
Activation Function
A mathematical function applied to a neuron's output that introduces non-linearity into the network.
Adam Optimizer
An optimization algorithm that combines the best parts of two other methods: AdaGrad and RMSProp.
AGI
Artificial General Intelligence.
AI Agent
An autonomous AI system that can perceive its environment, make decisions, and take actions to achieve goals.
AI Alignment
The research field focused on making sure AI systems do what humans actually want them to do.
AI Safety
The broad field studying how to build AI systems that are safe, reliable, and beneficial.
Anthropic
An AI safety company founded in 2021 by former OpenAI researchers, including Dario and Daniela Amodei.
Artificial Intelligence
The science of creating machines that can perform tasks requiring human-like intelligence: reasoning, learning, perception, language understanding, and decision-making.
ASI
Artificial Superintelligence.
Attention
A mechanism that lets neural networks focus on the most relevant parts of their input when producing output.
Autoencoder
A neural network trained to compress input data into a smaller representation and then reconstruct it.
Autonomous AI
AI systems capable of operating independently for extended periods without human intervention.
Autoregressive Model
A model that generates output one piece at a time, with each new piece depending on all the previous ones.
B
Backpropagation
The algorithm that makes neural network training practical: it computes the gradient of the loss with respect to every weight by applying the chain rule backward through the network.
Batch Normalization
A technique that normalizes the inputs to each layer in a neural network, making training faster and more stable.
Batch Size
The number of training examples processed together before the model updates its weights.
Beam Search
A decoding strategy that keeps track of multiple candidate sequences at each step instead of just picking the single best option.
Benchmark
A standardized test used to measure and compare AI model performance.
BERT
Bidirectional Encoder Representations from Transformers.
Bias
In AI, bias has two meanings.
BPE
Byte Pair Encoding.
C
Catastrophic Forgetting
When a neural network trained on new data suddenly loses its ability to perform well on previously learned tasks.
Chain of Thought
A prompting technique where you ask an AI model to show its reasoning step by step before giving a final answer.
Chatbot
An AI system designed to have conversations with humans through text or voice.
Chinchilla
A 2022 research paper and model from DeepMind showing that most large language models of the time were oversized and undertrained for their compute budget.
Classification
A machine learning task where the model assigns input data to predefined categories.
Claude
Anthropic's family of AI assistants, including Claude Haiku, Sonnet, and Opus.
CLIP
Contrastive Language-Image Pre-training.
CNN
Convolutional Neural Network.
Compute
The processing power needed to train and run AI models.
Computer Vision
The field of AI focused on enabling machines to interpret and understand visual information from images and video.
Constitutional AI
An approach developed by Anthropic where an AI system is trained to follow a set of principles (a 'constitution') rather than relying solely on human feedback for every decision.
Context Window
The maximum amount of text a language model can process at once, measured in tokens.
Contrastive Learning
A self-supervised learning approach where the model learns by comparing similar and dissimilar pairs of examples.
Conversational AI
AI systems designed for natural, multi-turn dialogue with humans.
Cross-Attention
An attention mechanism where one sequence attends to a different sequence.
CUDA
NVIDIA's parallel computing platform that lets developers use GPUs for general-purpose computing.
D
DALL-E
OpenAI's text-to-image generation model.
Data Augmentation
Techniques for artificially expanding training datasets by creating modified versions of existing data.
Data Poisoning
Deliberately corrupting training data to manipulate a model's behavior.
Decoder
The part of a neural network that generates output from an internal representation.
Deep Learning
A subset of machine learning that uses neural networks with many layers (hence 'deep') to learn complex patterns from large amounts of data.
Deepfake
AI-generated media that realistically depicts a person saying or doing something they never actually did.
DeepMind
A leading AI research lab, now part of Google.
Diffusion Model
A generative AI model that creates data by learning to reverse a gradual noising process.
Distillation
A technique where a smaller 'student' model learns to mimic a larger 'teacher' model.
DPO
Direct Preference Optimization.
Dropout
A regularization technique that randomly deactivates a percentage of neurons during training.
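A minimal sketch of dropout on a single layer's outputs, using only the Python standard library. The function name `dropout`, the sample values, and the rescaling of surviving activations by 1 / (1 - rate) (often called inverted dropout) are assumptions for illustration; the definition above does not specify them.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate`; rescale the survivors.

    The rescaling (inverted dropout) is an illustrative assumption that keeps
    the expected activation the same with and without dropout.
    """
    if not training or rate == 0.0:
        # At inference time nothing is dropped.
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # fixed seed so the run is repeatable
layer_output = [0.5, 1.2, -0.3, 0.8, 2.0, -1.1]  # hypothetical neuron outputs

dropped = dropout(layer_output, rate=0.5, rng=rng)
unchanged = dropout(layer_output, rate=0.5, rng=rng, training=False)
```

During training each value is either zeroed or scaled up; with `training=False` the layer output passes through untouched.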
E
Edge AI
Running AI models directly on local devices (phones, laptops, IoT devices) instead of in the cloud.
Embedding
A dense numerical representation of data (words, images, etc.).
Emergent Abilities
Capabilities that appear suddenly as language models reach certain sizes.
Emergent Behavior
Capabilities that appear in AI models at scale without being explicitly trained for.
Encoder
The part of a neural network that processes input data into an internal representation.
Encoder-Decoder
A neural network architecture with two parts: an encoder that processes the input into a representation, and a decoder that generates the output from that representation.
Epoch
One complete pass through the entire training dataset.
Ethical AI
The practice of developing AI systems that are fair, transparent, accountable, and respect human rights.
Evaluation
The process of measuring how well an AI model performs on its intended task.
Explainability
The ability to understand and explain why an AI model made a particular decision.
F
Feature Extraction
The process of identifying and pulling out the most important characteristics from raw data.
Federated Learning
A training approach where the model learns from data spread across many devices without that data ever leaving those devices.
Few-Shot Learning
The ability of a model to learn a new task from just a handful of examples, often provided in the prompt itself.
Fine-Tuning
The process of taking a pre-trained model and continuing to train it on a smaller, specific dataset to adapt it for a particular task or domain.
Flash Attention
An optimized attention algorithm that's mathematically equivalent to standard attention but runs much faster and uses less GPU memory.
Foundation Model
A large AI model trained on broad data that can be adapted for many different tasks.
Function Calling
A capability that lets language models interact with external tools and APIs by generating structured function calls.
G
GAN
Generative Adversarial Network.
GELU
Gaussian Error Linear Unit.
Gemini
Google's flagship multimodal AI model family, developed by Google DeepMind.
Generative AI
AI systems that create new content (text, images, audio, video, or code) rather than just analyzing or classifying existing data.
GPT
Generative Pre-trained Transformer.
GPU
Graphics Processing Unit.
Gradient Accumulation
A technique that simulates larger batch sizes by accumulating gradients over multiple forward and backward passes on smaller batches before updating weights.
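As a rough illustration of gradient accumulation, the sketch below compares the gradient from one batch of four examples with the gradient accumulated over two micro-batches of two. The toy model y = w * x, the mean squared error loss, the data points, and all names are assumptions chosen to keep the arithmetic easy to follow.

```python
def grad_mse(w, batch):
    """Gradient of mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]  # hypothetical (x, y) pairs
w = 0.0

# Gradient from one large batch of 4 examples.
full_grad = grad_mse(w, data)

# The same 4 examples as two micro-batches of 2. Each micro-batch gradient is
# scaled by 1 / number_of_micro_batches and summed before a single update.
micro_batches = [data[:2], data[2:]]
accumulated = 0.0
for micro_batch in micro_batches:
    accumulated += grad_mse(w, micro_batch) / len(micro_batches)

# Only one weight update happens, after all micro-batches are processed.
learning_rate = 0.01
w_after_update = w - learning_rate * accumulated
```

With equal-sized micro-batches the accumulated gradient matches the large-batch gradient up to floating-point rounding, which is why the technique can stand in for a batch that does not fit in memory.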
Gradient Descent
The fundamental optimization algorithm used to train neural networks: it repeatedly adjusts the parameters in the direction that reduces the loss.
Grounding
Connecting an AI model's outputs to verified, factual information sources.
Guardrails
Safety measures built into AI systems to prevent harmful, inappropriate, or off-topic outputs.
H
Hallucination
When an AI model generates confident-sounding but factually incorrect or completely fabricated information.
Hallucination Detection
Methods for identifying when an AI model generates false or unsupported claims.
Hugging Face
The leading platform for sharing and collaborating on AI models, datasets, and applications.
Hyperparameter
A setting you choose before training begins, as opposed to parameters the model learns during training.
I
Image Classification
The task of assigning a label to an image from a set of predefined categories.
ImageNet
A massive image dataset containing over 14 million labeled images across 20,000+ categories.
In-Context Learning
A model's ability to learn new tasks simply from examples provided in the prompt, without any weight updates.
Inference
Running a trained model to make predictions on new data.
Instruction Tuning
Fine-tuning a language model on datasets of instructions paired with appropriate responses.
L
Language Model
An AI model that understands and generates human language.
Large Language Model
An AI model with billions of parameters trained on massive text datasets.
Latent Space
The compressed, internal representation space where a model encodes data.
Layer Normalization
A technique that normalizes activations across the features of each training example, rather than across the batch.
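A small sketch of the core computation in layer normalization, assuming a plain list of feature values per example. The learnable scale and shift that real layers usually apply afterward are left out, and the function name, `eps` value, and sample batch are illustrative assumptions.

```python
import math

def layer_norm(features, eps=1e-5):
    """Normalize one example's features to roughly zero mean and unit variance."""
    mean = sum(features) / len(features)
    variance = sum((f - mean) ** 2 for f in features) / len(features)
    # eps guards against division by zero when all features are equal.
    return [(f - mean) / math.sqrt(variance + eps) for f in features]

# Two hypothetical examples; the second is the first scaled by 10.
batch = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
]

# Statistics are computed per example, never across the batch.
normalized = [layer_norm(example) for example in batch]
```

Because each example uses only its own mean and variance, the two rows come out nearly identical after normalization, and the result does not depend on what else is in the batch.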
Learning Rate
A hyperparameter that controls how much the model's weights change in response to each update.
LLaMA
Meta's family of open-weight large language models.
LLM
Large Language Model.
LoRA
Low-Rank Adaptation.
Loss Function
A mathematical function that measures how far the model's predictions are from the correct answers.
LSTM
Long Short-Term Memory.
M
Machine Learning
A branch of AI where systems learn patterns from data instead of following explicitly programmed rules.
Masked Language Modeling
A pre-training technique where random words in text are hidden (masked) and the model learns to predict them from context.
Meta-Learning
Training models that learn how to learn: after training on many tasks, they can quickly adapt to new tasks with very little data.
Midjourney
A popular AI image generation service known for its distinctive artistic style.
Mistral
A French AI company that builds efficient, high-performance language models.
Mixture of Experts
An architecture where multiple specialized sub-networks (experts) share a model, but only a few activate for each input.
MMLU
Massive Multitask Language Understanding.
Model Collapse
A degradation that happens when AI models are trained on data generated by other AI models.
Multi-Head Attention
An extension of the attention mechanism that runs multiple attention operations in parallel, each with different learned projections.
Multimodal
AI models that can understand and generate multiple types of data, such as text, images, audio, and video.
N
Narrow AI
AI systems designed for a specific task, as opposed to general intelligence.
Natural Language Processing
The field of AI focused on enabling computers to understand, interpret, and generate human language.
Neural Network
A computing system loosely inspired by biological brains, consisting of interconnected nodes (neurons) organized in layers.
Next-Token Prediction
The fundamental task that language models are trained on: given a sequence of tokens, predict what comes next.
NLP
Natural Language Processing.
NVIDIA
The dominant provider of AI hardware, best known for the GPUs used to train and run AI models.
O
Object Detection
A computer vision task that identifies and locates objects within an image, drawing bounding boxes around each one.
Open Source AI
AI models whose weights, code, and sometimes training data are publicly released for anyone to use, modify, and build upon.
OpenAI
The AI company behind ChatGPT, GPT-4, DALL-E, and Whisper.
Optimization
The process of finding the best set of model parameters by minimizing a loss function.
Overfitting
When a model memorizes the training data so well that it performs poorly on new, unseen data.
P
Parameter
A value the model learns during training, specifically the weights and biases in neural network layers.
Perplexity
A measurement of how well a language model predicts text; lower values mean better predictions.
Positional Encoding
Information added to token embeddings to tell a transformer the order of elements in a sequence.
Pre-Training
The initial, expensive phase of training where a model learns general patterns from a massive dataset.
Prompt Engineering
The art and science of crafting inputs to AI models to get the best possible outputs.
Prompting
Giving an AI model a text input (a prompt) to direct its behavior.
PyTorch
One of the most popular deep learning frameworks, originally developed by Meta.
R
RAG
Retrieval-Augmented Generation.
Reasoning
The ability of AI models to draw conclusions, solve problems logically, and work through multi-step challenges.
Recurrent Neural Network
A neural network architecture where connections form loops, letting the network maintain a form of memory across sequences.
Red Teaming
Systematically testing an AI system by trying to make it produce harmful, biased, or incorrect outputs.
Regression
A machine learning task where the model predicts a continuous numerical value.
Regularization
Techniques that prevent a model from overfitting by adding constraints during training.
Reinforcement Learning
A learning approach where an agent learns by interacting with an environment and receiving rewards or penalties.
ReLU
Rectified Linear Unit.
Representation Learning
The idea that useful AI comes from learning good internal representations of data.
Responsible AI
The practice of developing and deploying AI systems with careful attention to fairness, transparency, safety, privacy, and social impact.
Reward Model
A model trained to predict how helpful, harmless, and honest a response is, based on human preferences.
RLHF
Reinforcement Learning from Human Feedback.
RNN
Recurrent Neural Network.
RoPE
Rotary Position Embedding.
S
Sampling
The process of selecting the next token from the model's predicted probability distribution during text generation.
Scaling Laws
Mathematical relationships showing how AI model performance improves predictably with more data, compute, and parameters.
Self-Attention
An attention mechanism where a sequence attends to itself: each element looks at all other elements to understand relationships.
Self-Supervised Learning
A training approach where the model creates its own labels from the data itself.
Semantic Search
Search that understands meaning and intent rather than just matching keywords.
Sentiment Analysis
Automatically determining whether a piece of text expresses positive, negative, or neutral sentiment.
Softmax
A function that converts a vector of numbers into a probability distribution: all values between 0 and 1 that sum to 1.
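A short sketch of softmax in plain Python. Subtracting the largest input before exponentiating is a common numerical-stability step and is an addition here, not part of the definition above; the input values are made up for illustration.

```python
import math

def softmax(logits):
    """Turn a list of real numbers into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # hypothetical raw model scores
```

Larger inputs get larger probabilities, and the outputs always sum to 1, which is why softmax is commonly used to turn raw model scores into a distribution over classes or tokens.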
Speech Recognition
Converting spoken audio into written text.
Stable Diffusion
An open-source image generation model released by Stability AI.
Structured Output
Getting a language model to generate output in a specific format like JSON, XML, or a database schema.
Supervised Learning
The most common machine learning approach: training a model on labeled data where each example comes with the correct answer.
Synthetic Data
Artificially generated data used for training AI models.
System Prompt
Instructions given to an AI model that define its role, personality, constraints, and behavior rules.
T
Temperature
A parameter that controls the randomness of a language model's output.
TensorFlow
Google's open-source deep learning framework.
Text-to-Image
AI models that generate images from text descriptions.
Text-to-Speech
AI systems that convert written text into natural-sounding spoken audio.
Token
The basic unit of text that language models work with.
Tokenizer
The component that converts raw text into tokens that a language model can process.
Tool Use
The ability of AI models to interact with external tools and systems, such as browsing the web, running code, querying APIs, and reading files.
Top-P Sampling
A text generation method (also called nucleus sampling) that samples only from the smallest set of most likely tokens whose cumulative probability reaches a threshold P.
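A rough sketch of nucleus sampling in plain Python: keep the smallest set of most likely tokens whose cumulative probability reaches P, renormalize, and sample from that set. The function names, the toy vocabulary, and its probabilities are hypothetical.

```python
import random

def top_p_filter(probs, p):
    """Return the renormalized nucleus: top tokens whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    nucleus, cumulative = [], 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break  # the nucleus is complete once the threshold is reached
    return {token: prob / cumulative for token, prob in nucleus}

def sample_top_p(probs, p, rng):
    """Draw one token from the nucleus, weighted by renormalized probability."""
    nucleus = top_p_filter(probs, p)
    tokens = list(nucleus)
    weights = [nucleus[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical next-token distribution.
next_token_probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "xylophone": 0.05}

nucleus = top_p_filter(next_token_probs, p=0.9)
token = sample_top_p(next_token_probs, p=0.9, rng=random.Random(0))
```

With P = 0.9, the three most likely tokens form the nucleus and the unlikely "xylophone" can never be sampled; a lower P shrinks the nucleus further.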
TPU
Tensor Processing Unit.
Training
The process of teaching an AI model by exposing it to data and adjusting its parameters to minimize errors.
Transfer Learning
Using knowledge learned from one task to improve performance on a different but related task.
Transformer
The neural network architecture behind virtually all modern AI language models.
Turing Test
A test proposed by Alan Turing in 1950: if a human can't reliably tell whether they're talking to a machine or another human, the machine passes.
V
VAE
Variational Autoencoder.
Vector Database
A database optimized for storing and searching high-dimensional vectors (embeddings).
Vision Transformer
A transformer architecture adapted for image processing.
Voice Cloning
Using AI to create a synthetic copy of someone's voice from a small sample of their speech.
W
Weight
A numerical value in a neural network that determines the strength of the connection between neurons.
Whisper
OpenAI's open-source speech recognition model.
Word2Vec
One of the earliest successful word embedding models, from Google in 2013.
World Model
An AI system's internal representation of how the world works, including physics, cause and effect, and spatial relationships.