AI Models Struggle with Belief Discrimination in First-Person Statements
Stanford researchers tested 24 models, including GPT-4o and DeepSeek, on 13,000 questions.
Artificial intelligence chatbots struggle to distinguish between true and false beliefs.
AI struggles with first-person statements like 'I believe that...'.