AI models that can understand and generate multiple types of data — text, images, audio, video. GPT-4V, Gemini, and Claude 3 are multimodal models that can process both text and images. The trend is toward models that handle all modalities natively rather than through separate systems.
Contrastive Language-Image Pre-training.
AI systems that create new content — text, images, audio, video, or code — rather than just analyzing or classifying existing data.
A mathematical function applied to a neuron's output that introduces non-linearity into the network.
An optimization algorithm that combines the best parts of two other methods — AdaGrad and RMSProp.
Artificial General Intelligence.
The research field focused on making sure AI systems do what humans actually want them to do.