The long tail of LLM-assisted decompilation (blog.chrislewis.au)
bri3d 1 day ago [-]
Claude is doing the decompilation here, right? Has this been compared against using a traditional decompiler with Claude in the loop to improve the decompilation and ensure matched results? I would think that Claude's training data includes far more pseudo-C <-> C knowledge than pairs of C and MIPS assembly from GCC 2.7, and even if the traditional decompiler were kind of bad at N64, it would be more efficient to fix bad decompiler C than assembler.
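A minimal sketch of that loop, assuming all three helpers are hypothetical stand-ins (a real setup would call Ghidra or mips-to-c, an LLM API, and the era-correct compiler):

```python
# Hypothetical "decompiler + LLM in the loop" sketch. Every helper below is a
# stand-in, not a real tool invocation.

def decompile(asm: str) -> str:
    """Stand-in for a traditional decompiler's pseudo-C output."""
    return f"/* pseudo-C for: {asm} */ int f(void) {{ return 0; }}"

def llm_refine(pseudo_c: str, diff: str) -> str:
    """Stand-in for an LLM pass that cleans up pseudo-C given a diff report."""
    return pseudo_c.replace("pseudo-C for", "cleaned from")

def compile_and_diff(c_source: str, target_asm: str) -> str:
    """Stand-in for recompiling and diffing against the target assembly."""
    return ""  # empty string == byte-for-byte match

def matching_decomp(target_asm: str, max_rounds: int = 5):
    """Iterate: decompile, recompile, diff, let the LLM fix the pseudo-C."""
    candidate = decompile(target_asm)
    for _ in range(max_rounds):
        diff = compile_and_diff(candidate, target_asm)
        if not diff:
            return candidate  # matched the original binary
        candidate = llm_refine(candidate, diff)
    return None  # gave up without a match

print(matching_decomp("addiu $sp, $sp, -24") is not None)
```

The point is that the LLM only ever edits compiler-shaped C, never raw assembly, and the diff step keeps it honest.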
titzer 1 day ago [-]
It's wild to me that they wouldn't try this first. Feeding the asm directly into the model seems like intentionally ignoring a huge amount of work that has gone in traditional decompilation. What LLMs excel at (names, context, searching in high-dimensional space, making shit up) is very different from, e.g. coming up with an actual AST with infix expressions that represents asm code.
skerit 13 hours ago [-]
I've been doing some decompilation with Ghidra. Unfortunately, it's of a C++ game, which Ghidra isn't really great at. And thus Claude gets a bit confused about it all too. But all in all: it does work, and I've been able to reconstruct a ton of things already.
sestep 19 hours ago [-]
One of the other PhD students in my department has an NDSS 2026 paper about combining the strengths of both LLMs and traditional decompilers! https://lukedramko.github.io/files/idioms.pdf
suprjami 21 hours ago [-]
Not Claude, but there are open-weight LLMs trained specifically on Ghidra decomp and tested on their ability to help reverse engineers make sense of it:

https://huggingface.co/LLM4Binary/llm4decompile-22b-v2

There's also a dataset floating around HF which is... I think a popular N64 decomp to pseudo-C? Maybe the Mario one?

decidu0us9034 1 day ago [-]
"Claude struggles with large functions and more or less gives up immediately on those exceeding 1,000 instructions." Well, yeah, that's the thing: an N64 game is C targeting an architecture where compiler optimizations are typically lacking, the idiomatic style is lots of small tightly-scoped functions, and the system architecture itself is a lot simpler than, say, a modern amd64 PC... These days I often just feel like, why is this person telling me how easy my job is now when they seemingly don't know much about it? I just find it arrogant and insulting... Perpetually demo season.
sureglymop 15 hours ago [-]
Here's an interesting thing. I decided to do advent of code in assembly last year. What I noticed is that there must be a lot of code and binaries in AI training data but not a lot of intermediate representation. Be it LLVM IR, assembly or other forms of IR, it seems underrepresented. LLMs kept trying to give me code patterns that would make sense for high level code but not really for assembly because by hand one could find much more optimized solutions there.

But coincidentally this seems like an easy win for generated training data. Take all your code and have a compiler spit out assembly as well as the binary. Now your LLM will not only be able to act as a compiler but also make its output useful and understandable to humans.
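An analogous sketch of that pair-generation idea using Python's own `dis` module in place of a C toolchain (with real C code you'd run `gcc -S` and pair each source file with its `.s` output):

```python
# Generate (high-level source, lowered representation) training pairs.
# Python bytecode stands in here for compiler-emitted assembly.
import dis
import io

def make_pair(source: str):
    """Return a (source, lowered) pair for one snippet."""
    code = compile(source, "<snippet>", "exec")
    buf = io.StringIO()
    dis.dis(code, file=buf)  # disassemble into the buffer
    return source, buf.getvalue()

src, lowered = make_pair("x = 1 + 2\n")
# The lowered form exposes the IR-level patterns that are
# underrepresented in training corpora.
print("LOAD_CONST" in lowered)
```

Run over a large corpus, this gives the model exactly the source-to-IR correspondence the comment above says is missing.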

OptionOfT 1 day ago [-]
I'm really excited about this, especially for games whose source code was lost, like Red Alert 2.
qingcharles 21 hours ago [-]
Me too. I'm going to be reverse-engineering Elite PC (original version) and I can't help but think the source is lost. The developer seems to have totally dropped off the face of the Earth. I've contacted others who might know and nobody knows where they are.

Even the game I was a developer on, which was published by Eidos in ~1998, probably has lost source. I can't imagine anyone has the Visual SourceSafe database backup CDs lying around, but I could be wrong.

lstodd 19 hours ago [-]
You mean 1991 Elite Plus? The whole series has been reverse-engineered to death and back. Maybe you mean some other game?

Anyway, for those old titles I don't think not having the source is that much of a problem. I participated in two reimplementations of the 1994 XCOM, UFO2000 and OpenXcom, and helped the 1oom project (the first Master of Orion), and I don't think having the original source would have helped much.

qingcharles 17 hours ago [-]
No, I'm doing the original 1987 PC Elite. The later one was written by Chris Sawyer. I asked him recently and he also has no idea about Andy who wrote the prior version (both for Realtime). [both versions I assume were written in 100% ASM] Surprisingly Gemini seems to be pretty good at writing 8088 CGA assembler, especially in Deep Think mode. It one-shot an entire filled poly renderer and 3D engine.

I worked with some of the original XCOM guys after a bunch of them left Microprose to set up on their own. I wrote a lot of the graphics engine for this, which was really a direct descendant of XCOM:

https://www.youtube.com/watch?v=9UOYps_3eM0

foxtacles 21 hours ago [-]
I wonder how effective LLMs are going to be for decompiling, e.g., games written in C++ targeting the PC platform. I'm not surprised one can get reasonably good results for N64 games, which have always been the easiest to reverse for a number of reasons.
amelius 1 day ago [-]
Does this technique limit the LLM to correctness-preserving transforms?
measurablefunc 1 day ago [-]
Like all things related to LLMs, semantic correctness is left as an exercise for the reader.
seddonm1 18 hours ago [-]
I delivered a talk at Rust Sydney about this exact topic last week:

https://reorchestrate.com/posts/your-binary-is-no-longer-saf...

I am able to translate multi-thousand-line C functions and reproduce bug-for-bug implementations.

measurablefunc 18 hours ago [-]
Decompilation does not preserve semantics. You generally do not know whether the code from the decompiler will compile to a binary semantically equivalent to the one you initially decompiled.
easyThrowaway 8 hours ago [-]
Many of the decompiled console games of the '90s were originally written in C89 using an ad-hoc compiler from Metrowerks or some off-branch release of gcc-2.95, plus console-specific assemblers.

I'm willing to bet that the decompiled output is gonna be more readable than the original source code.

measurablefunc 4 hours ago [-]
Not related to what I was saying. Compilation is a many-to-one transformation & although you can try to guess an inverse there is no way to guarantee you will recover the original source b/c at the assembly level you don't have any types & structs.
seddonm1 18 hours ago [-]
My test harness loads up the original DLL then executes that in parallel against the converted code (differential testing). That closes the feedback loop the LLM needs to be able to find and fix discrepancies.

I'm also doing this on an old Win32 DLL so the task is probably much easier than a lot of code bases.
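A hedged sketch of that harness, with two ordinary Python functions standing in for "original DLL export" (which would really be loaded via ctypes) and "translated implementation under test":

```python
# Differential testing sketch: run reference and port side by side on random
# inputs, comparing both return values and side effects (buffer mutations).
import random

def original(buf, n):
    """Stand-in for the original binary's function: doubles buf[0:n], returns their sum."""
    total = 0
    for i in range(n):
        buf[i] *= 2
        total += buf[i]
    return total

def ported(buf, n):
    """Stand-in for the translated function under test."""
    for i in range(n):
        buf[i] = buf[i] * 2
    return sum(buf[:n])

def differential_test(trials=100):
    rng = random.Random(0)  # fixed seed so failures are reproducible
    for _ in range(trials):
        data = [rng.randint(-100, 100) for _ in range(8)]
        a, b = list(data), list(data)  # separate copies for each side
        n = rng.randint(0, 8)
        # Compare return values AND side effects (the mutated buffers).
        if original(a, n) != ported(b, n) or a != b:
            return False
    return True

print(differential_test())  # True when the port matches bug-for-bug
```

Any divergence, in either the return value or the mutated state, becomes a concrete counterexample to feed back to the LLM.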

measurablefunc 17 hours ago [-]
What are you tracking during the runtime tracing? Or is that written up in your link?
seddonm1 17 hours ago [-]
I am applying differential/property-based testing to all the side effects of functions (mutations) and return values. The Rust code coverage is also used to steer the LLM as it finds discrepancies in side effects.

It is written up in my link. Please bear in mind it is really hard to find the right level of detail to communicate this at, so I'm happy to answer questions.

measurablefunc 17 hours ago [-]
That's fine, that answers my question.
nemo1618 1 day ago [-]
IMO this is one of the best use cases for AI today. Each function is like a separate mini problem with an explicit, easy-to-verify solution, and the goal is (essentially) to output text that resembles what humans write -- specifically, C code, which the models have obviously seen a lot of. And no one is harmed by this use of AI; no one's job is being taken. It's just automating an enormous amount of grunt work that was previously impossible to automate.

I'm part of the effort to decompile Super Smash Bros. Melee, and a fellow contributor recently wrote about how we're doing agent-based decompilation: https://stephenjayakar.com/posts/magic-decomp/

qingcharles 21 hours ago [-]
And renaming all the variables from the auto-generated ones into something human-readable was always a thankless task, which LLMs are really good for.
m463 24 hours ago [-]
> And no one is harmed by this use of AI; no one's job is being taken

what about: see cool app, decompile it, launch competing app.

(repeat)

_aavaa_ 23 hours ago [-]
Decompiling seems like the hard way to go here. Lots of clones pop up for popular games and apps all the time. I don't think you need to go down the decompile route to achieve that.
roelljr 1 day ago [-]
If you turn this into a benchmark, it will be solved in no time :)
macabeus 1 hour ago [-]
I'm developing a pipeline runner for matching decompilation: https://github.com/macabeus/mizuchi

The initial motivation is to run benchmarks, though the foundation is flexible and can support many other use cases over time.

It's already proving useful. For example, I can run a benchmark, view the results in a dashboard, and even feed the report into Claude Code to answer questions like: "How did changing X affect the results?" or "What could be improved in the next run?"

GaggiX 9 hours ago [-]
Curating a benchmark for reverse-engineering functions doesn't actually seem like a bad idea