Is Claude Code Getting Worse? What 1,000 Hacker News Points Tell Us
A GitHub issue titled 'Claude Code is unusable for complex engineering tasks with Feb updates' hit 1,000+ points on Hacker News. We looked at what actually changed, why developers are frustrated, and what your options are.
April 9, 2026
A GitHub issue doesn't usually make the front page of Hacker News. When one titled "Claude Code is unusable for complex engineering tasks with Feb updates" crossed 1,000 points, that was a signal worth paying attention to.
The complaints are real, they are widespread, and the frustration behind them is understandable. But the full picture is more nuanced than the headline suggests - and for developers trying to decide whether to stick with Claude Code, switch tools, or adjust their workflow, the nuance is what actually matters.
What the complaints are actually about
The core issue reported by developers is behavioral: Claude Code started refusing more tasks, adding more unsolicited caveats, breaking large implementations into smaller steps without being asked, and producing output that is more cautious but less useful on complex multi-file work.
This is not a capability regression in the technical sense. The underlying model did not get worse at reasoning or code generation. What changed is how the model decides to apply those capabilities - and that shift has a name: safety and alignment tuning.
After major model updates, AI labs routinely run additional rounds of reinforcement learning from human feedback (RLHF) to steer model behavior. Done well, this makes the model more helpful and less likely to produce harmful output. Done too aggressively, it can create what developers call "overcautious" behavior - the model hesitates, hedges, and declines things it should be able to handle.
That appears to be what happened with the February updates. The changes that were meant to improve behavior on edge cases had side effects on legitimate complex engineering tasks.
Where it hurts most
Not all Claude Code workflows are equally affected. Developers doing well-scoped, single-file tasks largely report no issues. The regression shows up clearly in three areas:
Large refactors across multiple files. Ask Claude Code to restructure a service, rename a symbol throughout a codebase, or migrate to a new pattern - tasks that require touching 10+ files - and you get more incomplete passes, more mid-task pauses, more requests to confirm before proceeding.
Long agentic sessions. Extended sessions where Claude Code is running terminal commands, editing files, and iterating on test failures show increased drop-offs. The model is more likely to stop and ask for confirmation at points where it previously would have continued autonomously.
Ambiguous requirements. Complex engineering problems almost always involve some ambiguity. Pre-update Claude Code would make reasonable assumptions and proceed. Post-update behavior leans toward surfacing the ambiguity and waiting - which is technically more correct but often frustrating when you just want it to take a reasonable shot.
Anthropic's position
Anthropic has acknowledged the issue. The response has been measured: they are aware of the quality regression reports, they are investigating, and updates are coming. That is the standard response to this kind of issue, and it is worth taking at face value - Anthropic has a direct commercial incentive to fix this, and the HN thread gave them a very clear signal about what developers are experiencing.
Claude Code costs up to $200 a month at heavy usage. Users paying that much for a tool that has gotten meaningfully worse on their workflows are not going to stay quiet, and they haven't.
What you can actually do right now
If you are affected, a few approaches are worth trying before switching tools entirely.
Be more explicit about scope. Instead of "refactor this service to use the repository pattern", try "refactor only the UserService class in user-service.ts to use the repository pattern, starting with the database calls". Breaking the scope down explicitly reduces the model's uncertainty and the resulting hesitation.
Use CLAUDE.md to set expectations. Claude Code reads a CLAUDE.md file in your project root as context at the start of every session. Adding instructions like "proceed without asking for confirmation on file edits unless destructive" can reduce the mid-task pauses meaningfully.
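As a sketch, a CLAUDE.md section along these lines can nudge the model toward fewer mid-task pauses. The wording and headings here are ours, not an official Anthropic template - treat it as a starting point to adapt:

```markdown
# Project conventions for Claude Code

## Autonomy
- Proceed without asking for confirmation on routine file edits.
- Ask first only before destructive actions: deleting files,
  dropping database tables, force-pushing, or rewriting git history.

## Ambiguity
- When requirements are ambiguous, state your assumption in one line
  and continue, rather than stopping to ask.

## Scope
- Complete multi-file changes in a single pass where possible;
  do not break a requested refactor into smaller confirmation steps.
```

Instructions like these are context, not hard constraints - the model can still pause - but in practice they shift the default from "ask" toward "proceed and report".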
Plan mode before agent mode. Spending a turn generating a step-by-step plan before asking Claude Code to execute it gives the model a clearer road map and reduces the points where it second-guesses the approach mid-task.
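A plan-then-execute exchange might look like the following. The wording is illustrative, not a required format; the point is to separate the planning turn from the execution turn:

```text
First, produce a numbered plan for migrating UserService to the
repository pattern: list every file you will touch and what changes
in each. Do not edit anything yet.

[review and adjust the plan, then:]

Execute the plan above step by step. If a step fails, fix it and
continue; do not re-plan unless a step is impossible.
```

Because the approach is already agreed on, the model has fewer decision points at which to stop and ask for confirmation mid-task.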
The alternatives are better than they were
The honest answer is that the competition has closed the gap. A year ago, Claude Code was in a category of its own for complex agentic coding tasks. That is no longer true.
Cursor has continued improving its multi-file editing and codebase-aware features. For developers who prefer a GUI and want the best editor experience, Cursor at $20/month is a strong alternative to Claude Code - especially if you are on Claude Code's higher-tier plans.
Goose and OpenClaw have picked up users from this exact frustration. Both are open-source, both support Claude and other models as backends, and neither has the behavioral changes that triggered the GitHub issue. The tradeoff is that both require more technical setup and live in the terminal. But for developers comfortable with that - and especially those already paying for Claude API access - they are worth evaluating seriously.
It is also worth noting that all of these tools ultimately run on the same underlying models. If you use Goose with a Claude backend, you get Claude's reasoning with a different behavioral layer on top. Whether that behavioral layer is an improvement or a downgrade depends on your workflow.
The bigger pattern
Claude Code's quality regression is the latest example of a recurring dynamic in AI tools: models improve in measurable benchmarks, then get tuned for safety or commercial behavior in ways that make them feel worse in practice. This has happened with ChatGPT, Copilot, and now Claude Code.
The developer community is unusually vocal when this happens - hence the GitHub issue that hit 1,000 points on Hacker News. That vocal feedback loop is what tends to produce corrections. Anthropic will almost certainly roll back or tune around the behavioral changes that are affecting complex engineering tasks, because the commercial cost of not doing so is high.
Whether that happens in a week or a quarter is the open question. In the meantime, the workarounds above are the most practical path for developers who need the tool to work well right now.