More Thinking, More Bias: Length-Driven Position Bias in Reasoning Models

Reasoning-model era · Other · Evaluation methods, metric design, and benchmark results (LMSys / SWE-bench / MMLU / ARC, etc.) · AI for scientific discovery (medicine, biology, physics, climate, etc.) · Safety and alignment research, red teaming, and transparency · Text (natural language)

2026-05-11 · arXiv cs.AI

Summary

The paper identifies a length-driven position bias in reasoning models (e.g., o1 / Claude reasoning): the longer the chain of thought, the more systematically the model favors the first and last answer options. The finding cautions against the naive expectation that 'more thinking → better answers' and has material implications for how reasoning models are evaluated.

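Key points

Identifies a 'length-driven position bias' in reasoning models
The longer the reasoning, the more likely the model is to pick the first or last option
A caution against the assumption that more thinking yields better answers
Affects evaluation design for reasoning models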
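
One way an evaluator might check for the pattern described above is to group multiple-choice answers by reasoning length and compare how often the model picks an edge option (first or last) with the rate expected if picks were uniform over positions. The following is a minimal sketch under stated assumptions, not the paper's protocol: the record format, the use of a token count as the length measure, the bucket threshold, and the function name `edge_pick_rate_by_length` are all hypothetical.

```python
from collections import defaultdict


def edge_pick_rate_by_length(records, bucket_edges):
    """Share of answers on the first or last option, per reasoning-length bucket.

    records: iterable of (reasoning_tokens, chosen_index, num_options) tuples
             (hypothetical format; chosen_index is 0-based).
    bucket_edges: ascending inclusive upper bounds; longer traces go to an overflow bucket.
    Returns {bucket_label: (edge_rate, chance_rate, n)}.
    """
    stats = defaultdict(lambda: [0, 0.0, 0])  # edge picks, summed chance rate, count
    for tokens, chosen, k in records:
        label = next(
            (f"<= {edge}" for edge in bucket_edges if tokens <= edge),
            f"> {bucket_edges[-1]}",
        )
        entry = stats[label]
        entry[0] += chosen in (0, k - 1)   # True counts as 1
        entry[1] += min(2, k) / k          # edge rate if picks were uniform over positions
        entry[2] += 1
    return {label: (e / n, c / n, n) for label, (e, c, n) in stats.items()}


# Toy records, invented purely for illustration (not results from the paper).
toy = [(100, 1, 4), (120, 0, 4), (900, 0, 4), (950, 3, 4)]
for label, (edge_rate, chance_rate, n) in edge_pick_rate_by_length(toy, [500]).items():
    print(f"{label}: edge rate {edge_rate:.2f} vs uniform {chance_rate:.2f} (n={n})")
```

An edge rate that climbs above the uniform rate as the buckets get longer would be consistent with the reported bias. For that comparison to be meaningful, the option order would need to be shuffled per question so that the correct answer is not itself concentrated at the edges.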

Source

arXiv cs.AI