Your LLM Evaluator Is Probably Lying to You
Mahmoud Mabrouk, co-founder and CEO of Agenta AI, opened his AI Engineer Europe workshop with a scenario most teams will recognize: your LLM agent is in production, your observability dashboard looks clean, but customers keep saying the thing doesn't work. The culprit, he argues, isn't the agent -- it's …