Decompiling the GPL violated Linux kernel using Evolutionary Algorithms

10 farooqkz 10 9/12/2025, 8:15:48 AM far.chickenkiller.com ↗

Comments (10)

synctext · 2h ago
This is a difficult project. The blog post seems to hint at reasonable feasibility, this stuff is hard! We build a less ambitious tool in the university lab: "ASTANA: Practical String Deobfuscation for Android Applications Using Program Slicing" [0].

Would advise to first read the reverse engineering related work. Genetic programming is just a technique best used when everything else has failed :-)

[0] https://arxiv.org/pdf/2104.02612

farooqkz · 7m ago
Thanks for the hints. Of course, it's very very difficult. But one thing I think you missed, is that I'm proposing a "byte equivalent decompilation". And after that, we should go into reading the code readable and understandable.

If we could create a program doing all this, automatically or semi-automatically, it will be great-great because then not releasing the kernel code doesn't matter. I believe if enough effort and time is put into it, there is a good chance we could see such a thing in like 5-7 years.

After that, we might be able to target the binary blobs, the propriety firmwares. Those might have some legal issue, of course. But as long as it is used only to write a FOSS alternative, that probably won't be an issue, I think.

temp0826 · 3h ago
Not exactly the same goal but in the ballpark, reminded me of the linux-libre patches (removes binary blobs and non-gpl code from the kernel)-

https://www.fsfla.org/ikiwiki/selibre/linux-libre/

mmarx · 2h ago
> Theoretically, we could also go for finding the semantically equivalent C code. However, last time I researched, checking semantic equivalency is a very complex problem. I think it was NP hard.

Already deciding whether two finite automata decide the same (regular) language is PSPACE-complete; it's undecidable for anything that can decide arbitrary context-free languages (which C programs can clearly do).

farooqkz · 19m ago
Well I knew that checking if two binary circuits are equivalent is NP hard. Checking semantic equivalency of C code, of course, should be harder.
ACCount37 · 2h ago
I thought this would be an actual process, not a blog post with idle musings.
farooqkz · 12m ago
I wanted to append [idea] but it became too long. I think there should be balance "musings" and actual work. I learned very valuable lessons in my wakegp research. Even though I haven't completed it. If the one doing this research would be me, I would go for no actual work as long as I can. And when I start to do the work, I should do as little as I can and move very slowly. Like the saying "When you move slowly, you can see the path ahead clear". After all, we should know if the next step we go forward, the ground under us is solid.
ginko · 2h ago
I'm having a really hard time parsing this title.
rnhmjoj · 2h ago
A dash between GPL and violated would probably help.
farooqkz · 18m ago
Thanks for the feedback. I don't think I can edit the HN entry, but I'm updating me blog post.