Why has nobody solved code editing in a robust way yet? I think all the implementations so far have been hacky. I also had to write my own basic one a few times[1].
I guess diffusion-based models could prove good for this use case?
> why has nobody solved the code editing in a robust way yet.
Mostly because code editing is not the problem. When coding, the solution exists outside the coding space. Code only removes ambiguity. It may conflict with earlier interpretations, or the current interpretation may be flaky, which leads to bugs, i.e. actual behavior differs from expected behavior (which also exists outside the coding space).
So trying to solve things within the coding space has been an incorrect approach since the beginning of computation. And trying to merge natural languages (great for exploring problems) and formal languages (great for specifying instructions) was seen as foolishness by Dijkstra [0].
The reason natural languages are great for problem solving is that we can easily redefine what things mean, changing the semantics of terms as our understanding evolves. And when we've settled on a set of semantics and a process, we translate that to formal notation so it stays fixed. An analogy is sketching (where we freely edit lines and just try stuff) versus oil painting, where every brush stroke is purposeful.
The most robust way is not to index. Then you can't go wrong, but it is slower. People seem to be ok with it since the ML part takes longer -- for typical codebases, at least.
I don't see how indexing is related to this? The question is about how to get the LLM to reliably apply the edit it wants to make. Even when the full current version of the file is in the context, this is one of the flakiest bits of the current LLM workflows.
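To make "reliably apply the edit" concrete, here is a minimal sketch of one common approach: the model emits a search/replace pair, and the tool refuses to apply it unless the search text matches exactly once. The names `apply_edit` and `EditError` are made up for illustration; this is not how any particular tool does it, just one way to fail loudly instead of guessing.

```python
class EditError(Exception):
    """Raised when an edit cannot be applied unambiguously."""


def apply_edit(source: str, search: str, replace: str) -> str:
    """Replace exactly one occurrence of `search` in `source`.

    Zero matches or multiple matches raise EditError, so the caller can
    ask the model to retry with more surrounding context.
    """
    if not search:
        raise EditError("search text is empty")
    count = source.count(search)  # non-overlapping occurrences
    if count == 0:
        raise EditError("search text not found; the model's view of the file may be stale")
    if count > 1:
        raise EditError(f"search text matches {count} places; add surrounding lines")
    return source.replace(search, replace, 1)


original = "def add(a, b):\n    return a - b\n"
patched = apply_edit(original, "return a - b", "return a + b")
```

The flaky part in practice is usually the zero-match case: the model reproduces the search text with slightly different whitespace or from an outdated version of the file, and an exact match like this one rejects it rather than editing the wrong place.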
esafak · 42m ago
In my mind the concern was about the LLM's mental model of what the files look like, which affects edits. I see where you're coming from too.
1. https://github.com/asadm/vibemode/blob/main/source/editor.js
[0]: https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...
https://news.ycombinator.com/item?id=44106944