Spec Driven Development with ralph-starter
Spec Driven Development is the biggest shift in AI coding since agents learned to run tests. Here is how ralph-starter fits in.
I ran ralph-starter figma, pasted a Figma URL, picked my tech stack, and walked away. When I came back, the generated landing page matched the Figma design at 98.2% pixel accuracy. The AI agent had caught its own font-size mismatch, fixed it, and passed a strict visual comparison, all without me writing a single line of code.
This is how Figma-to-code should work. Here is exactly how it does.
I tracked one full week of development. Half the tasks with ralph-starter, half by hand. Same sprint, same project, same me, same coffee intake (a lot).
ralph-starter works with multiple coding agents. I use Claude Code for basically everything, but I wanted to actually test the others on real tasks instead of just assuming. So I ran the same task on Claude Code, Cursor, Codex CLI, and OpenCode over the past few weeks. Some surprises, some not.
I spend more time writing specs than writing code now, and my output went up, not down. That genuinely surprised me.
I run ralph-starter a lot. Multiple tasks per day, 3 to 7 loops each. Last month I checked my Anthropic dashboard and the total was $62. I almost scrolled past it, but then I looked at the prompt caching line and did the math. Without caching? $109. I saved $47 by literally doing nothing.
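For anyone who wants to redo that math, here is a tiny sketch. The two dollar figures come straight from the paragraph above; the variable names are just mine for illustration.

```python
# Figures quoted above: monthly bill with prompt caching vs. the estimate without it.
with_caching = 62.0
without_caching = 109.0

saved = without_caching - with_caching
saved_pct = saved / without_caching * 100

print(f"saved ${saved:.0f} ({saved_pct:.0f}% of the uncached bill)")
# prints: saved $47 (43% of the uncached bill)
```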
Linear is where my team plans work. ralph-starter is where it gets built. I have been running this combo every single day for weeks now, and I want to show you exactly what my workflow looks like.
ralph-starter is a CLI tool that orchestrates AI coding agents in autonomous loops. You give it a task (or point it at a GitHub issue, a Linear ticket, or a Notion page), and it runs the agent, then checks whether tests pass, lint is clean, and the build works. If something fails, it feeds the error back to the agent and loops again. When everything passes, it commits, pushes, and opens a PR.
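To make that loop concrete, here is a minimal sketch of the control flow in Python. This is not ralph-starter's actual code: `run_loop`, `CheckResult`, `run_agent`, and the `max_loops` default are all hypothetical names I made up for illustration. The agent and the checks are passed in as plain functions, so the sketch stays independent of any real CLI.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CheckResult:
    """Outcome of one validation step (tests, lint, build). Hypothetical type."""
    name: str
    ok: bool
    output: str = ""


def run_loop(
    task: str,
    run_agent: Callable[[str, Optional[str]], None],
    checks: List[Callable[[], CheckResult]],
    max_loops: int = 5,  # assumed cap, not a documented ralph-starter default
) -> Optional[int]:
    """Run the agent until every check passes.

    Returns the number of loops it took, or None if the cap was reached.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_loops + 1):
        # The agent sees the task plus any errors from the previous round.
        run_agent(task, feedback)

        results = [check() for check in checks]
        failures = [r for r in results if not r.ok]
        if not failures:
            # Everything is green; commit, push, and open the PR from here.
            return attempt

        # Feed the failures back to the agent for the next loop.
        feedback = "\n".join(f"{r.name} failed: {r.output}" for r in failures)
    return None
```

In a real setup, each check would wrap something like a test runner, a linter, or a build command and capture its output, and `run_agent` would invoke whichever coding agent you use.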
A designer handed me a Figma file on Friday afternoon with 12 screens for a dashboard. My immediate thought was "cool, that is next week gone." Instead I pointed ralph-starter at it and had working React components before I left for the weekend.
I wanted to write the post I wish existed when I started: how to go from zero to your first automated PR with ralph-starter and Claude Code. No fluff, just the steps.
I keep saying "I type one command and get a PR" and people want to know what actually happens in between. So let me walk you through a real one.
You know the drill: you have a GitHub issue with the full spec, so you open Claude or ChatGPT, copy the issue, paste it there, get code back, paste it into your editor, run the tests, something breaks, and you go back to the chat to paste the error. I was doing this 20 times a day. Twenty. I counted.