We tested three AI coding assistants on a 1.2M-line monorepo
We wanted to see which AI coding assistant could handle real-world complexity—a massive repo, multi-service architecture, legacy code, flaky APIs, and every edge case we could find.
The Setup:
- 1.2M-line monorepo
- Six microservices in different languages
- 40+ dependencies
- A database schema that would make a junior dev cry
- An API spec that changed mid-task (on purpose)
The Tools & Cost:
- Cursor: strong autocomplete, context limit around 250K tokens, costs ~$600/year
- Windsurf: fast edits, context up to ~400K tokens, costs ~$550/year
- Dropstone: unlimited token context, remembers every file and every change, costs just $12
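For a sense of how quickly those windows fill up, here is a rough sketch (our own back-of-the-envelope estimate, not anything the vendors publish) of how you might check whether a repo's source text even fits inside a given context limit. It assumes Python with the tiktoken tokenizer and a hypothetical list of source extensions; in practice, assistants typically retrieve slices of the repo rather than ingesting everything at once.

```python
# Rough check: does a repo's source text fit in a given context window?
# tiktoken and the extension list are assumptions for illustration,
# not anything the tools above document.
import os
import tiktoken

ENC = tiktoken.get_encoding("cl100k_base")
SOURCE_EXTS = {".py", ".ts", ".js", ".go", ".java", ".sql"}  # adjust per repo

def estimate_repo_tokens(root: str) -> int:
    """Walk the repo and sum token counts across source files."""
    total = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1] not in SOURCE_EXTS:
                continue
            try:
                with open(os.path.join(dirpath, name), encoding="utf-8",
                          errors="ignore") as f:
                    # disallowed_special=() so stray special-token strings
                    # in source files don't raise an error
                    total += len(ENC.encode(f.read(), disallowed_special=()))
            except OSError:
                continue  # skip unreadable files
    return total

if __name__ == "__main__":
    tokens = estimate_repo_tokens(".")
    for tool, limit in [("Cursor", 250_000), ("Windsurf", 400_000)]:
        print(f"{tool}: repo is ~{tokens / limit:.1f}x a {limit:,}-token window")
```

On a 1.2M-line monorepo an estimate like this easily runs into the millions of tokens, which is why a fixed window fills up long before the whole codebase is in view.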
Performance:
- Cursor: lost track after 250K tokens and broke unrelated modules
- Windsurf: struggled with multi-file reasoning past 400K tokens
- Dropstone: handled the entire repo without forgetting anything, refactored across services, fixed old bugs, and wrote unit tests referencing files from days ago
Key Takeaways:
- Cursor = great for small bursts
- Windsurf = rapid prototyper, but context-constrained
- Dropstone = an AI brain that never forgets, at a fraction of the cost
Memory and context matter. When context runs out, even a good AI struggles; for $12, Dropstone can handle huge, messy projects without token limits.
Curious to hear: Has anyone else tested AI coding assistants on large projects? How did cost and context limits affect your workflow?