Anthropic's Fable model draws fire for guardrails that block even basic questions
Anthropic's newly released Claude Fable model has attracted widespread criticism for safety filters so restrictive that they prevent cybersecurity researchers from doing legitimate work and reportedly refuse to answer straightforward biology questions. Anthropic apologised for what it described as 'invisible distillation guardrails', constraints baked into the model that were not clearly communicated to users. Microsoft has separately restricted internal employee access to Fable over data retention concerns, a significant vote of no confidence from one of the industry's closest AI partners. The episode illustrates the persistent tension between deploying models safely and making them genuinely useful for professional applications. It also raises questions about transparency: if a model has hidden behavioural constraints, users cannot make informed decisions about whether it fits their needs.