Ask HN: Help with doing statistics over machine code
2 phafu 2 5/12/2025, 8:45:47 AM
I'd like to do some statistics over the machine code gcc generates, such as a histogram of used instructions, average volatile/preserved registers usage of functions etc. For now just x86_64 SysV-ABI would be enough.
However I'm not aware of any pre-existing tool that lets me easily do this. The options I currently see are either make gcc output assembly and write a parser for the GNU Assembler format (possibly by reusing the asm-parser of the compiler-explorer project), or write a tool that reads (disassembles) object files directly using elfutils.
Any hints, prior work, further ideas, links to useful resources, or any other kind of help would be much appreciated.
If you do want to tie it back to the source, this looks relevant: http://icps.u-strasbg.fr/~pop/gcc-ast.html
The gcc part is only relevant with regards to what dialect of assembler I need to parse. If I go that route, I'd write a parser for the GNU assembler, and that would of course work with any code in that dialect, regardless from which compiler it came from (I haven't checked whether other compilers can produce GNU assembler though).