It's been a few years since I've rolled up my sleeves and done some reverse engineering with Ghidra. The skill is very "use it or lose it", so I wonder if this will help me get back into it quicker. Or... a ton of hallucinations leading down dead-end rabbit holes.
Curious if anyone has given it a shot and can speak to the experience.
axoltl · 2h ago
I can't comment on MCP use specifically, but I can comment on using an LLM while reversing. I use a local instance of whatever ends up being SOTA for local reasoning LLMs at 30B-70B params, quantized to 4-6 bits. I feed it decompiled code to identify functions that are 'tedious' to reverse engineer. I recently reversed a binary that was compiled with soft float and had no symbols or strings. A lot of those functions end up being a ton of bit-twiddling. While I reversed the business logic, I had the reasoning model identify the soft float functions with very minimal prompting. It did quite well on those!
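For anyone who hasn't run into soft float, here is a rough sketch of the kind of bit-twiddling those helpers boil down to. It is not taken from the binary discussed above; the function names are made up for illustration, and it only assumes the standard IEEE-754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits).

```python
# Illustrative only: hypothetical helper names, standard IEEE-754 binary32 layout.

def decode_f32(bits: int):
    """Split a 32-bit pattern into (sign, biased exponent, fraction)."""
    sign = (bits >> 31) & 0x1        # top bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits, biased by 127
    fraction = bits & 0x7FFFFF       # low 23 bits
    return sign, exponent, fraction


def f32_is_nan(bits: int) -> bool:
    """NaN is an all-ones exponent with a non-zero fraction."""
    _, exponent, fraction = decode_f32(bits)
    return exponent == 0xFF and fraction != 0


def f32_negate(bits: int) -> int:
    """Negation just flips the sign bit."""
    return bits ^ 0x80000000
```

In a stripped binary these show up as anonymous functions full of shifts and masks with constants like 0x7FFFFF and 0xFF, which is roughly the pattern a model (or a person) can key on.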
I also tried to have it automatically build some structs from code showing the access patterns, and it failed miserably on that task. Likely a larger model (o3 or opus) would do better here.
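To make the struct task concrete, here is a minimal sketch of the mechanical part of it: turning observed (offset, size) accesses off one base pointer into a candidate layout. The function name, the input format, and the field naming are all assumptions for illustration; real recovery also has to deal with types, arrays, nested structs, and aliasing, which is where it gets hard.

```python
# Illustrative only: hypothetical helper, not part of Ghidra or any plugin.

def infer_struct_layout(accesses):
    """Build a candidate layout from (offset, size) accesses on one base pointer.

    Returns a list of (offset, size, name) tuples sorted by offset, with gaps
    between known fields marked as padding_or_unknown.
    """
    # Keep the widest access seen at each offset.
    fields = {}
    for offset, size in accesses:
        fields[offset] = max(size, fields.get(offset, 0))

    layout = []
    cursor = 0
    for offset in sorted(fields):
        if offset < cursor:
            # Overlaps the previous field (e.g. a sub-field access); skip it.
            continue
        if offset > cursor:
            layout.append((cursor, offset - cursor, "padding_or_unknown"))
        layout.append((offset, fields[offset], f"field_{offset:#x}"))
        cursor = offset + fields[offset]
    return layout
```

For example, accesses of 4 bytes at offset 0, 2 bytes at offset 4, and 8 bytes at offset 8 would yield three named fields with a 2-byte unknown gap at offset 6.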
I personally don't think letting an LLM do large parts of the reversing would be useful to me, as I build up a lot of my mental model of the system during the process, so I'd be missing out on that. But for handling annoying bits of code I'd likely just forgo otherwise? Go ham!
segmondy · 1h ago
You hit on what most people miss about LLMs: part of the work is building up a mental model of the system you are working on. When the LLM does the work, it's easy to miss out on that mental model.
jhart99 · 52m ago
I tried to use an LLM for assistance with reversing some embedded code and agree with this. I had built up a pretty decent model of what was going on before starting. It was able to explain what was going on in one perplexing function quite well, but when I'd feed it decent-sized blocks of code it would hallucinate like crazy. Still, I was quite happy with its performance at finding the basic library and ROM functions and annotating them correctly. I think it is all in how you use it.
jtang613 · 3h ago
Thanks for the interest. I wrote GhidrAssistMCP and the original GhidrAssist plugin, which work hand-in-hand, because I find they improve my RE workflow. They're not immune to hallucinations because the underlying models are not. However, hallucinations are fairly rare, and I have had very reliable results with both Claude and ChatGPT. When used together, GhidrAssist+GhidrAssistMCP have been able to do some impressive analysis tasks.
If you're just getting back in the saddle, you might want to give both a try. In particular, GhidrAssist's "Explain Function" tool is really helpful at quickly summarizing code and reducing the mental overhead of making sense of large binaries.
justmarc · 46m ago
Applies to everything. If you never had it in muscle memory, you lose it.
PradeetPatel · 27m ago
Thanks so much for sharing!
I'm interested to see how MCP and the development in AI will impact the CTF scene in the future.
leoqa · 3h ago
Why is this better than the other one?
jtang613 · 3h ago
GhidrAssistMCP features:
- several additional tools (like get_class_info, search_classes, etc.),
- it has GUI config and logging,
- and it does not rely on an external Python bridge to host the MCP Server - it's monolithic (using the official MCP Java SDK).