DeepSeek R1 outperforms o3-mini (medium) on the Confabulations (Hallucinations) Benchmark