It's been a few years since I've rolled up my sleeves and done some reverse engineering with Ghidra. The skill is very "use it or lose it", so I wonder if this will help me get back into it quicker. Or... a ton of hallucinations leading down dead-end rabbit holes.
Curious if anyone has given it a shot and can speak to the experience.
I can't comment on MCP use specifically, but I can comment on using an LLM while reversing. I use a local instance of whatever ends up being SOTA for local reasoning LLMs at 30B-70B params, quantized to 4-6 bits. I feed it decompiled code to identify functions that are tedious to reverse engineer. I recently reversed a binary that was compiled with soft float and had no symbols or strings. A lot of those functions end up being a ton of bit-twiddling. While I reversed the business logic, I had the reasoning model identify the soft-float functions with very minimal prompting. It did quite well on those!
I also tried to have it automatically build some structs from code showing the access patterns, and it failed miserably on that task. Likely a larger model (o3 or opus) would do better here.
I personally don't think letting an LLM do large parts of the reversing would be useful to me, since I build up a lot of my mental model of the system during the process, and I'd be missing out on that. But for handling annoying bits of code I'd otherwise just skip? Go ham!
You hit the target on what most people miss about LLMs: part of the work is building up a mental model of the system you are working on. When the LLM does the work, it becomes easy to miss out on that mental model.
I tried to use an LLM for assistance with reversing some embedded code and agree with this. I had built up a pretty decent model of what was going on before starting. It was able to explain what was going on in one perplexing function quite well, but when I'd feed it decent-sized blocks of code it would hallucinate like crazy. But I was quite happy with its performance at finding the basic library and ROM functions and annotating them correctly. I think it is all in how you use it.
Thanks for the interest. I wrote GhidrAssistMCP and the original GhidrAssist plugin which work hand-in-hand because I find they improve my RE workflow. They're not immune from hallucinations because the underlying models are not. However, they are fairly rare and I have had very reliable results with both Claude and ChatGPT. When used together, GhidrAssist+GhidrAssistMCP have been able to do some impressive analysis tasks.
If you're just getting back in the saddle, you might want to give both a try. In particular, GhidrAssist's "Explain Function" tool is really helpful at quickly summarizing code and reducing the mental overhead of making sense of large binaries.
Applies to everything. If you never had it in muscle memory, you lose it.
Thanks for sharing!
I was about to start doing this, then realized I shouldn't nerd-snipe myself... The original extension definitely felt user unfriendly, so I was using Claude Code manually, feeding it an exported listing file. The listing files lack full addresses, so it wasn't optimal source material.
Thanks so much for sharing!
I'm interested to see how MCP and the development in AI will impact the CTF scene in the future.
Works great! GhidrAssist + MCP are awesome.
Why is this better than the other one?
GhidrAssistMCP features:
- several additional tools (like get_class_info, search_classes, etc.)
- GUI config and logging
- no external Python bridge to host the MCP server - it's monolithic, using the official MCP Java SDK
I wonder if embeddings could be created from open-source and library code, then used to map decompiled code back to the correct variable and function names.
It's not AI, but Ghidra has a cool feature called BSim which does something similar. Each function gets a "feature vector", which, now that I think about it, has some clear parallels to embeddings.
Wow, that is cool. With that feature and a huge database of known "feature vectors" from open-source libraries, you could focus on the actual business logic of the binary instead of trying to reverse external library functions.
I've been wondering the same thing. However you would have to have a very large database of embeddings for this to be useful, right?
OTOH, I can see this being disproportionately helpful with reverse engineering Rust and Go binaries, which usually include many open-source dependencies.
nice, now do x64dbg!