Structure vs Constitution in AI Safety
Anthropic publishes its constitution along with research about where the constitution works and where it does not. The current version (January 2026) is an ethical treatise written to Claude. The priorities are safety, ethics, Anthropic’s guidelines, and helpfulness, in that order when they conflict. Anthropic favours cultivating good values and judgment over strict rules, comparing their approach to trusting an experienced professional rather than enforcing a checklist. In March 2026, Anthropic’s operational judgment failed catastrophically: a one-line .npmignore error led to the leak of 512,000 lines of Claude Code source code.The constitution asks Claude to imagine how a “thoughtful senior Anthropic employee would react”, but what happens when the organisation’s structure fails? ...