Freestyle, Twill.ai, and the New Infrastructure Layer for Coding Agents

Two projects that landed on Hacker News this week - Freestyle and Twill.ai - are building the cloud infrastructure that coding agents need to do real work. Here's what they are and why this matters for AI coding tools.

Two projects hit Hacker News this week with a combined 325 upvotes, and they are both solving the same underlying problem: coding agents need somewhere to run. Freestyle (260 points) describes itself as a cloud for coding agents. Twill.ai (YC S25, 65 points) lets you delegate engineering tasks to agents that run in isolated sandboxes and return pull requests. They are different products but they are building the same infrastructure layer - the part of the stack that has been quietly holding AI coding tools back.

The problem: agents need real environments

The current generation of AI coding tools, such as Cursor and GitHub Copilot, is built around a model where you, the developer, are present. The agent suggests. You review. You accept or reject. You run the tests. The agent helps, but it does not own the execution environment.

That works well for augmented coding. It breaks down when you want to delegate a task entirely. If you tell an agent to add Stripe billing to your project and step away, the agent needs to install packages, run a local server, trigger webhooks, observe the output, and iterate. All of that requires a sandboxed environment with a real filesystem, real network access, and the ability to run arbitrary code without contaminating your local machine.

Without that infrastructure, "agentic" coding tools are mostly sophisticated autocomplete. With it, they become something closer to a junior developer you can actually leave tasks with.

Freestyle: cloud sandboxes on demand

Freestyle provides on-demand cloud environments that coding agents can spin up, use, and destroy. Each sandbox gets a full Linux environment, filesystem, and network access. Agents can install packages, run tests, make HTTP requests, and observe the results - the same things a developer does when setting up a project locally.
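To make the spin up, use, destroy lifecycle concrete, here is a minimal local sketch in Python. It is not Freestyle's API: the `LocalSandbox` class and `sandbox()` helper are hypothetical names, and a temporary directory stands in for a cloud environment, so it provides no real isolation. It only shows the shape of the lifecycle an agent relies on.

```python
import shutil
import subprocess
import sys
import tempfile
from contextlib import contextmanager
from pathlib import Path


class LocalSandbox:
    """Hypothetical stand-in for a cloud sandbox: just a scratch directory."""

    def __init__(self, root: Path):
        self.root = root

    def write_file(self, name: str, content: str) -> None:
        # Place a file inside the sandbox directory.
        (self.root / name).write_text(content)

    def run(self, *args: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
        # Run a command with the sandbox as its working directory
        # and capture stdout/stderr so the agent can observe the result.
        return subprocess.run(
            args, cwd=self.root, capture_output=True, text=True, timeout=timeout
        )


@contextmanager
def sandbox():
    # Spin up: create a fresh scratch directory.
    root = Path(tempfile.mkdtemp(prefix="agent-sandbox-"))
    try:
        yield LocalSandbox(root)
    finally:
        # Destroy: remove everything the task left behind.
        shutil.rmtree(root, ignore_errors=True)


def demo() -> tuple[str, bool]:
    with sandbox() as sb:
        sb.write_file("hello.py", "print('hello from the sandbox')\n")
        result = sb.run(sys.executable, "hello.py")
        root = sb.root
    # Return the observed output and whether the directory still exists.
    return result.stdout.strip(), root.exists()
```

A real sandbox service would replace the temporary directory with an isolated machine or container, but the contract is the same: the environment exists only for the duration of the task, and nothing leaks back to the developer's machine.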

The developer experience is designed around giving agents a place to work without any manual infrastructure setup. You point an agent at Freestyle, describe a task, and the agent handles provisioning and tearing down environments as needed. The 260 Hacker News points suggest this resonates with developers who have been waiting for someone to solve the environment problem.

Freestyle is API-first and aimed at teams building agent-powered developer tools rather than at individual developers using a UI. It is infrastructure for the next generation of coding products, not an end-user application.

Twill.ai: bring me a PR

Twill.ai (YC Summer 2025) takes a different angle on the same problem. Instead of selling infrastructure to developers building agent tools, Twill sells the outcome directly: you delegate a task via Slack, GitHub, or Linear, and Twill returns a pull request.

Under the hood, it runs coding agents (it integrates with Claude Code and similar tools) in isolated cloud sandboxes - the same core infrastructure problem that Freestyle solves. But the user-facing interface is task delegation, not environment management. A product manager can file a ticket in Linear, a developer can post in Slack, and Twill handles the rest: cloning the repo, setting up the environment, running the agent, opening the PR.
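The four steps named above (clone, set up, run the agent, open the PR) can be sketched as an ordered pipeline. This is an illustration of the workflow's shape only, not Twill's implementation: every name here (`TaskContext`, `delegate`, the step functions) is hypothetical, and each step is a stub that records what it would do rather than touching a real repository.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class TaskContext:
    """State carried between steps. All field names are illustrative."""
    repo_url: str
    task: str
    log: list[str] = field(default_factory=list)
    pr_title: Optional[str] = None


Step = Callable[[TaskContext], None]


def clone_repo(ctx: TaskContext) -> None:
    ctx.log.append(f"clone {ctx.repo_url}")


def set_up_environment(ctx: TaskContext) -> None:
    ctx.log.append("set up environment")


def run_agent(ctx: TaskContext) -> None:
    ctx.log.append(f"run agent on: {ctx.task}")


def open_pull_request(ctx: TaskContext) -> None:
    ctx.pr_title = f"Agent: {ctx.task}"
    ctx.log.append("open pull request")


# The order matters: a PR is only opened after every earlier step succeeded.
PIPELINE: list[Step] = [clone_repo, set_up_environment, run_agent, open_pull_request]


def delegate(repo_url: str, task: str, steps: list[Step] = PIPELINE) -> TaskContext:
    ctx = TaskContext(repo_url=repo_url, task=task)
    for step in steps:
        step(ctx)  # an exception in any step aborts the run before the PR step
    return ctx
```

Calling `delegate("https://example.com/acme/app.git", "add Stripe billing")` returns a context whose `log` lists the four steps in order and whose `pr_title` is set only by the final step.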

The YC backing suggests investors believe this workflow - describe task, receive PR - is something engineering teams will pay for. At 65 points, Twill's Hacker News reception is more modest than Freestyle's, but Twill is solving a harder product problem: it has to make autonomous agent output reliable enough that you can trust the PR without reviewing every line.

Why this changes the AI coding tool market

The standard comparison between AI coding tools today focuses on which model produces better code, which IDE integration is smoother, which subscription is better value. The Cursor vs GitHub Copilot comparison is essentially a debate about those factors.

What Freestyle and Twill.ai are building changes the comparison. Once you have reliable, sandboxed execution environments, the relevant question shifts from 'which tool helps me write better code?' to 'which tool can I trust to complete a task while I work on something else?' That is a fundamentally different product category.

Goose and OpenClaw are already designed around the autonomous task execution model - both are open-source agents you give tasks to rather than tools that suggest code while you type. Freestyle and Twill.ai are building the infrastructure that makes those agents more capable and more trustworthy.

The shift from assist to delegate

The generation of AI coding tools that launched in 2023 and 2024 was about assistance: AI that helped you write code faster, spotted bugs, and answered questions about your codebase. The next generation is about delegation: AI that you hand a task to and expect back a result.

Freestyle and Twill.ai are early indicators of what that shift requires. Not just better models - the models are already good enough for many well-scoped tasks. What is needed is the infrastructure to run agents reliably, observe their outputs, and loop on failures without human intervention at every step.
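The "loop on failures" part is easy to picture as code. The sketch below is a generic retry loop, assuming two hypothetical callables: `attempt`, which stands in for one agent pass and receives the previous failure output as feedback, and `check`, which stands in for running the test suite. Neither corresponds to a real Freestyle or Twill interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    passed: bool
    output: str  # e.g. captured test output, fed back to the agent on failure


def run_until_green(
    attempt: Callable[[str], None],
    check: Callable[[], CheckResult],
    max_attempts: int = 3,
) -> tuple[bool, int]:
    """Alternate attempt() and check() until the check passes or the budget runs out.

    Returns (succeeded, number_of_attempts_used).
    """
    feedback = ""  # the first attempt has no failure output to learn from
    for n in range(1, max_attempts + 1):
        attempt(feedback)
        result = check()
        if result.passed:
            return True, n
        feedback = result.output  # next attempt sees why this one failed
    return False, max_attempts


def demo() -> tuple[bool, int]:
    state = {"value": 0}

    def fake_agent(feedback: str) -> None:
        state["value"] += 1  # each pass gets one step closer to the target

    def fake_tests() -> CheckResult:
        ok = state["value"] >= 2
        return CheckResult(ok, "" if ok else f"expected 2, got {state['value']}")

    return run_until_green(fake_agent, fake_tests)
```

The attempt budget is the important design choice: without a cap, an agent that cannot fix the failure would loop indefinitely, so an unattended run needs a point at which it stops and reports back to a human.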

For developers evaluating AI coding tools today, this is the right question to ask: is this tool designed around assistance or delegation? Both are valuable. But they require different infrastructure, different interfaces, and different expectations about what AI is doing in your workflow.