Claude Opus 4.7 System Prompt Changes Analyzed

Technical deep-dive into the system prompt modifications between Claude Opus 4.6 and 4.7 versions. Simon Willison breaks down the key differences in how the latest model behaves.

Anthropic released Claude Opus 4.7 without publishing a changelog of system prompt modifications, which normally wouldn't matter much except that someone actually compared the two versions side-by-side. What they found reveals how Anthropic is iterating on safety without announcing it publicly.

Simon Willison's technical analysis shows the new version is measurably more cautious about certain requests. The changes aren't dramatic, but they're deliberate.

The safety guardrails got tighter

The 4.7 system prompt added more explicit language around what Claude should refuse or be skeptical about. Willison documented specific additions to how the model should handle requests involving deception, bias, and potentially harmful content.

One notable shift: the model now explicitly states it should "avoid being manipulated" through certain framing techniques. In 4.6, this was implied or buried deeper in the instructions. In 4.7, it's front and center.

This matters because system prompts are the actual ruleset the model follows. Unlike fine-tuning or training data, they can be changed instantly and measured precisely. When you see a sudden behavior change in a Claude version, the system prompt is often where it happened.

What this means for Claude users

If you've been using Opus 4.6 for anything that pushes boundaries - security research, jailbreak testing, adversarial prompting - you'll notice 4.7 refuses more often. Not dramatically, but noticeably.

For most standard use cases, this changes nothing. Writing code, analyzing documents, answering questions: the behavior is virtually identical. The tightening only matters at the edges.

The real question is whether you care about the specific refusals. Some users see stricter safety guardrails as a feature. Others see them as unnecessary limitations on what should be a general-purpose tool. Your mileage depends on your use case.

Why Anthropic didn't announce this

This is the pattern with system prompt changes: Anthropic makes them, deploys them, and documents them only if someone asks or if the changes are massive enough to warrant a blog post.

It's pragmatic. Announcing every incremental safety adjustment would create changelog fatigue. But it also means users discover these changes reactively - when behavior shifts and they wonder why.

Compare this to ChatGPT, where OpenAI tends to communicate bigger behavioral changes. Neither approach is wrong, just different philosophies about transparency. Anthropic leans toward "we're iterating continuously" while OpenAI leans toward "we're announcing our changes."

The broader Claude version picture

If you've been tracking Claude versus ChatGPT capabilities, system prompt changes matter more than you might think. A capable model with aggressive guardrails behaves differently than the same capable model with permissive ones.

Opus 4.7 is still the same underlying model as 4.6 - same training, same weights. The capability didn't change. But the personality, constraints, and behavioral boundaries did.

This is why some users hold onto older versions. If you found 4.6's behavior more suited to your workflow, migrating to 4.7 might require rewording prompts to work around the new guardrails. It's not impossible, just different friction.

What you should do about it

If you use Claude through the web or API, you're already on 4.7. No choice there. The system prompt changes apply automatically.

If you use Claude through Cursor or other integrations, check which version you're actually running. Different integrations sometimes lag behind. You might still be on 4.6 without realizing it.

The practical takeaway: if Claude suddenly starts refusing things it used to do, check the version number first. It's probably the prompt changes, not a capability regression.