How Past Conversations Skew AI Judgment
Large language models are showing biases influenced by previous conversations. The negativity effect is real, and it’s more than just a technical glitch.
AI models are supposed to be our impartial digital assistants, right? Not so fast. New research reveals that large language models from top players like OpenAI and Anthropic are influenced by previous conversations. This isn’t just a glitch in the matrix. It's the accumulated message effect on LLM judgments, or AMEL for short.
AI Bias: An Unseen Influence
Across 75,898 API calls to 11 models from big names like OpenAI, Anthropic, and Google, researchers found that these models lean toward the tone of prior conversations. Present a model with a positive history, and it swings positive. Saturate it with negativity, and it dives deeper. Specifically, models showed a -0.17 shift in judgment, and negative histories pulled 1.62 times harder than positive ones.
The kicker? The length of conversation history doesn’t matter much. Whether it’s five or fifty previous exchanges, the bias remains. For items where the model is genuinely on the fence, the effect is even more pronounced. The bias isn't just a statistical blip; it's a behavioral trait.
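For readers who want to see what numbers like these could mean in practice, here is a minimal Python sketch. It rests on assumptions: the article does not say how the -0.17 shift or the 1.62 ratio were calculated, so the definitions below (mean per-item difference from a fresh-context score, and a ratio of absolute shifts) are illustrative, and the function names and scores are made up.

```python
from statistics import mean


def judgment_shift(baseline_scores, primed_scores):
    """Mean per-item difference between primed and fresh-context scores.

    Assumed definition, not taken from the research itself.
    """
    if len(baseline_scores) != len(primed_scores):
        raise ValueError("score lists must be the same length")
    return mean(p - b for b, p in zip(baseline_scores, primed_scores))


def asymmetry_ratio(negative_shift, positive_shift):
    """How many times larger the negative-history shift is than the positive one."""
    if positive_shift == 0:
        raise ValueError("positive shift is zero; ratio is undefined")
    return abs(negative_shift) / abs(positive_shift)


# Made-up scores on a 1-5 scale, for illustration only.
baseline = [3.0, 3.0, 4.0, 2.0]        # same items judged in a fresh context
after_positive = [3.5, 3.0, 4.0, 2.5]  # judged after a positive history
after_negative = [2.5, 2.5, 3.5, 1.5]  # judged after a negative history

pos = judgment_shift(baseline, after_positive)
neg = judgment_shift(baseline, after_negative)
ratio = asymmetry_ratio(neg, pos)
```

A ratio above 1 in this toy setup would mean the negative history moved the scores further than the positive one did, which is the pattern the research reports.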
Negativity Has More Weight
Negative input has a stronger impact, and that's not just a hunch. In paired scenarios, negativity induces significantly more bias than positivity. This isn’t just about token probabilities; it touches the semantic core of how these models process information. Is that something we should ignore? Hardly.
Scaling models helps reduce bias but doesn’t eliminate it. Anthropic’s Haiku model shifts less than Nano, but biases persist even in the newer GPT-5.2 from OpenAI. The difference is subtle, not sweeping. So, what’s the solution? Start a fresh conversation for each judgment or, if you batch, keep the conversation history balanced between positive and negative items.
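Those two mitigations can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a definitive implementation: the judge is a hypothetical callable that takes a history and an item and returns a score, and stub_judge is a stand-in that only mimics a tone-following bias so the example runs without any model API.

```python
import random


def score_fresh(judge, items):
    """Mitigation 1: score every item in its own empty context."""
    return [judge([], item) for item in items]


def balanced_history(positive, negative, seed=0):
    """Mitigation 2: build a shuffled history with equal counts of each tone."""
    n = min(len(positive), len(negative))
    history = [(text, "positive") for text in positive[:n]]
    history += [(text, "negative") for text in negative[:n]]
    random.Random(seed).shuffle(history)  # avoid long same-tone runs
    return history


def stub_judge(history, item):
    """Hypothetical stand-in for a model call; drifts toward the history's tone."""
    tilt = sum(1 if tone == "positive" else -1 for _, tone in history)
    return 3.0 + 0.1 * tilt


fresh_scores = score_fresh(stub_judge, ["item one", "item two"])
history = balanced_history(["good a", "good b", "good c"], ["bad x", "bad y"])
batched_score = stub_judge(history, "item three")
```

Because the research reports that negative history pulls harder than positive history, equal counts may not cancel out exactly in a real model; a fresh context per item is the more conservative of the two options.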
Why This Matters
It's clear: biases in AI aren't just about datasets or algorithms; they're also about context. If your model is scoring content or reviewing code, prior conversations could tilt the balance unfairly. Content moderation, automated evaluations, you name it: context rules the AI roost.
So, should we care? Absolutely. As these models become more integrated into our systems, ensuring they're fair and unbiased becomes essential. Ignoring these biases could lead to skewed results that distort decision-making across industries. Our approach to tackling AI biases shouldn't wait for permission.