Responsible Scaling Policy

Summary

Anthropic has published an updated Responsible Scaling Policy (RSP), its core AI-safety framework. The revision revises and tightens the thresholds, requirements, and monitoring indicators for each AI Safety Level (ASL) and codifies additional commitments for more capable models. It also strengthens category-specific evaluations, such as those for cybersecurity and bioweapons-related risks, and advances transparency in self-regulation of frontier model development.

Key points

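Updated Responsible Scaling Policy published
AI Safety Level (ASL) thresholds, requirements, and monitoring strengthened
Category-specific evaluations expanded for cybersecurity, bioweapons, and other risks
Greater transparency in self-regulation of frontier model development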

Source

Anthropic News