Extracting Search Trees from LLM Reasoning Traces Reveals Myopic Planning

Reasoning-model era · Other · AI for scientific discovery (medicine, biology, physics, climate, etc.) · Evaluation methods, metric design, and benchmark results (LMSys, SWE-bench, MMLU, ARC, etc.) · Text (natural language) · Agents, computer use, tool use

2026-05-11 · arXiv cs.AI

English summary

Proposes a method for extracting the internal search tree from an LLM's chain-of-thought reasoning trace, and uses it to show empirically that LLM planning is myopic: models are good at moves that maximize short-term reward but rarely choose exploration that pays off in the long run. The findings make the long-horizon planning limits of reasoning models and agents visible.

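Key points

Extracts search trees from LLM reasoning traces
Shows empirically that LLM planning is myopic
Strong on short-term reward, weak on long-term optimality
Points to the limits of long-horizon planning in reasoning models and agents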
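
To make "myopic" concrete, here is a minimal toy sketch in Python. It is not the paper's extraction method or data: the `Node` class, the reward values, and the function names are all hypothetical, chosen only to contrast a one-step (myopic) choice with a choice that accounts for the whole subtree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One state in a toy search tree (hypothetical structure, not from the paper)."""
    name: str
    reward: float = 0.0  # immediate reward for reaching this node
    children: list["Node"] = field(default_factory=list)


def best_return(node: Node) -> float:
    """Best total reward obtainable from this node onward, including its own reward."""
    if not node.children:
        return node.reward
    return node.reward + max(best_return(child) for child in node.children)


def myopic_choice(root: Node) -> Node:
    """Pick the child with the highest immediate reward (one-step lookahead)."""
    return max(root.children, key=lambda child: child.reward)


def farsighted_choice(root: Node) -> Node:
    """Pick the child whose subtree yields the highest total reward."""
    return max(root.children, key=best_return)


# Made-up rewards: branch A pays 5 now and 1 later; branch B pays 1 now and 10 later.
tree = Node("root", children=[
    Node("A", reward=5, children=[Node("A1", reward=1)]),
    Node("B", reward=1, children=[Node("B1", reward=10)]),
])

print(myopic_choice(tree).name)      # A
print(farsighted_choice(tree).name)  # B
```

In this toy tree the myopic rule takes branch A for its larger immediate reward, while the full-depth rule takes branch B, whose total is higher. The summary's claim is that LLM planning tends to resemble the first rule more than the second.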

Source

arXiv cs.AI