Anthropic, the artificial intelligence company behind the popular Claude chatbot, today announced a sweeping update to its Responsible Scaling Policy (RSP), aimed at mitigating the risks of highly capable AI systems.
The policy, originally introduced in 2023, now includes new protocols intended to ensure that AI models are developed and deployed safely as they grow more powerful.
This revised policy sets out specific Capability Thresholds—benchmarks that indicate when an AI model’s abilities have reached a point where additional safeguards are necessary.