Launch HN: Morph (YC S23) – Apply AI code edits at 4,500 tokens/sec

bhaktatejas922 | 7/7/2025, 2:40:45 PM
Hey HN, I'm Tejas at Morph. We've built a crazy-fast Apply model that applies AI-generated code edits to your files. It clocks in at over 4,500 tokens per second, turning sometimes-lazy, imperfect AI-generated patches into fast, reliable edits.

Why? AI models spit out code that can't reliably be inserted into existing code. Full-file rewrites and brittle search-and-replace hacks are too slow, expensive, or error-prone.

Morph takes a different approach:

- Your agent outputs edits "lazily", referencing unmodified lines in the existing file (e.g. // ...existing code...)

- Morph instantly applies these edits to a file using our Fast Apply model + our inference engine. We do this with a slight variation on "speculative edits," using speculative decoding and the original code as a reference for blazing-fast generation
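To make the lazy-edit format concrete, here's a toy Python sketch of the merge problem an Apply model solves. It is my illustration, not Morph's implementation: it assumes each literal chunk in the update begins and ends with an unchanged anchor line from the original and aligns by string matching, whereas a real Apply model learns this alignment (and so handles ambiguous or fuzzy anchors):

```python
# Toy merge of a lazy edit into the original file.
# Assumption (mine, not Morph's): every literal chunk between markers
# starts and ends with an unchanged "anchor" line from the original.
MARKER = "// ...existing code..."

def apply_lazy_edit(original: str, update: str) -> str:
    orig = original.splitlines()
    out: list[str] = []
    pos = 0  # next unconsumed line of the original
    chunks = update.split(MARKER)
    for idx, chunk in enumerate(chunks):
        lines = [l for l in chunk.splitlines() if l.strip()]
        if not lines:
            if idx == len(chunks) - 1:
                out.extend(orig[pos:])  # trailing marker: keep the rest
            continue
        if idx > 0:
            # A marker preceded this chunk: copy the elided original
            # lines up to the chunk's opening anchor line.
            while pos < len(orig) and orig[pos] != lines[0]:
                out.append(orig[pos])
                pos += 1
        out.extend(lines)  # emit the rewritten region
        # Skip the original region this chunk replaces, using the
        # chunk's last line as the closing anchor.
        while pos < len(orig) and orig[pos] != lines[-1]:
            pos += 1
        pos += 1
    return "\n".join(out)

ORIGINAL = """function add(a, b) {
  return a + b;
}
function sub(a, b) {
  return a - b;
}"""

UPDATE = """// ...existing code...
function sub(a, b) {
  if (b === undefined) throw new Error("missing b");
  return a - b;
}"""
```

Running `apply_lazy_edit(ORIGINAL, UPDATE)` keeps `add` untouched and rewrites `sub`. The hard cases (duplicated anchors, moved code, edits without clean context lines) are exactly where string matching breaks and a trained Apply model earns its keep.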

This approach was pioneered by Cursor last year, but the models that set Cursor apart—like their Fast Apply model—aren’t available as an API. We built Morph so developers can build a similar experience into their own coding agents.

Try it (no payment or sign up): https://morphllm.com/dashboard

Docs: https://docs.morphllm.com/quickstart

We have two Fast Apply models: morph-v3-fast (4,500+ tok/sec) and morph-v3-large (2,500+ tok/sec). These models power Fast Apply at create.xyz, databutton, continue.dev, and more.
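For a sense of what a call looks like, here's a minimal sketch of building a request payload. The OpenAI-compatible chat-completions shape and the `<instruction>`/`<code>`/`<update>` tagged message format are my reading of the quickstart; check docs.morphllm.com for the authoritative format before relying on it:

```python
def build_apply_request(instruction: str, code: str, update: str,
                        model: str = "morph-v3-fast") -> dict:
    """Assemble a chat-completions payload for a Fast Apply call.

    Assumes an OpenAI-compatible endpoint and a tagged single-message
    format (instruction + original code + lazy update). Verify both
    against the official docs; this is a sketch, not the spec.
    """
    content = (
        f"<instruction>{instruction}</instruction>\n"
        f"<code>{code}</code>\n"
        f"<update>{update}</update>"
    )
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }
```

You would then send this payload with any OpenAI-compatible client pointed at Morph's base URL; the completion comes back as the full merged file.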

We also have more cooking:

- Inline Edit Model (Cmd-K): extremely fast inline edits that keep you in dev flow state

- Morph Tab API: our Next Edit Prediction model guesses your next code edit + action with sub-500ms latency. It's currently in private beta, but you can request early access here: https://morphllm.com/tab

Our hot takes:

(1) Raw inference speed is very important for practical coding assistants. We've found that boosting inference speed dramatically improves dev experience, far more than chasing 0.2% accuracy gains. Curious if HN agrees or disagrees.

(2) Frontier-model full-file rewrites are legacy; incremental speculative edits are the future. Many popular tools still have frontier models rewrite whole files or emit udiffs, but we've seen huge wins in speed, reliability, user retention/conversion, and cost by ditching that approach entirely. As frontier models move upmarket, they'll leave behind tasks like this: narrow, 99%-accurate tasks that can be aggressively inference-optimized.
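The intuition behind speculative edits can be shown with a toy decoding loop. This is not Morph's engine, just my illustration of why the original file is a great draft: each loop iteration stands in for one parallel verification pass of the apply model over a block of guessed tokens, so long unchanged regions cost roughly one pass per block instead of one pass per token (the resync-after-divergence rule below assumes 1:1 token replacement, a simplification):

```python
def speculative_generate(target_next, draft, block=4, eos="<eos>"):
    """Decode the target output using `draft` tokens (the original
    file) as cheap guesses. Returns (output_tokens, verify_steps)."""
    out, steps, i = [], 0, 0
    while True:
        steps += 1  # one simulated (batched) verification pass
        guesses = draft[i:i + block]
        if not guesses:
            # Draft exhausted: fall back to one token per pass.
            t = target_next(out)
            if t == eos:
                return out, steps
            out.append(t)
            continue
        diverged = False
        for k, g in enumerate(guesses):
            t = target_next(out)  # ground truth for this position
            if t == eos:
                return out, steps
            out.append(t)
            if t != g:
                # The edit changed this token; resync the draft just
                # past the mismatch (assumes 1:1 replacement).
                i += k + 1
                diverged = True
                break
        if not diverged:
            i += len(guesses)  # whole block accepted in one pass

# Toy "target model": the true edited file, revealed one token at a time.
EDITED = "def add ( a , b ) : return a * b".split()
ORIGINAL_TOKENS = "def add ( a , b ) : return a + b".split()

def target_next(prefix):
    return EDITED[len(prefix)] if len(prefix) < len(EDITED) else "<eos>"
```

Here `speculative_generate(target_next, ORIGINAL_TOKENS)` reproduces the 12-token edited output in 5 verification passes instead of the 13 a plain token-by-token loop would take; the gap widens as the unchanged fraction of the file grows, which is exactly the full-file-rewrite case.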

(3) We will see all complexity move into models (plural), not a model (singular). As benchmarks on narrow tasks saturate to 99%+ and frontier models move upmarket, those tasks will move to inference-optimized models. Frontier-model tokens will be spent only on tasks that frontier models alone can do.

We'd love to hear your ideas and experiences with coding agents! https://youtu.be/LdT8epGHJPk

– Tejas & the Morph team
