It seems there is no good open-source disassembler for PPC except Capstone. It even has a Perl binding - unfortunately that binding can extract only basic fields like the opcode and text, but no processor-specific operands. So I made yet another one
I would like to make a few critical remarks about Capstone itself
1) it's fat as a pig - with default settings ls -l libcapstone.a gives
-rw-rw-r-- 1 redp redp 93833624 nov 8 16:10 libcapstone.a
Sure, this can be fixed by selecting only the processors you need, but it makes a lasting impression anyway
2) it's inconsistent. Just a couple of examples
mr r31, r3
actually has instruction ID PPC_INS_OR. To get MR you must check the alias ID, and even then there are two of them: PPC_INS_ALIAS_MR & PPC_INS_ALIAS_MR_
Another example - ld r3, something
actually has PPC_REG_X3 instead of PPC_REG_R3, despite the fact that these are the same register. Why clone lots of registers if you produce the same output for them? And why not add the size of the register instead? I suspect this happens because they used the MD from LLVM and were too lazy to make some optimizations
3) it's incomplete. For example, they haven't implemented reg_access for PowerPC (as well as for Alpha, RISC-V, SPARC etc)
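To make the alias complaint from point 2 concrete, here is a minimal sketch that works on raw 32-bit instruction words and does not use Capstone at all. It relies only on the simplified mnemonics from the Power ISA: mr rA, rS is or rA, rS, rS, and lis rT, imm is addis rT, 0, imm. The function name decode_alias and the returned tuple layout are my own invention for illustration, not anybody's API.

```python
def decode_alias(word):
    """Return (base_mnemonic, alias_text_or_None) for a 32-bit PPC word.

    Hypothetical helper: it knows only `or` (X-form, primary opcode 31,
    extended opcode 444) and `addis` (D-form, primary opcode 15).
    """
    primary = (word >> 26) & 0x3F
    if primary == 31:
        xo = (word >> 1) & 0x3FF      # extended opcode
        rs = (word >> 21) & 0x1F      # source register
        ra = (word >> 16) & 0x1F      # destination register
        rb = (word >> 11) & 0x1F      # second source register
        rc = word & 1                 # record bit (the "." forms)
        if xo == 444:
            dot = "." if rc else ""
            if rs == rb:
                # both sources are the same -> simplified mnemonic mr / mr.
                return "or" + dot, f"mr{dot} r{ra}, r{rs}"
            return "or" + dot, None
    elif primary == 15:
        rt = (word >> 21) & 0x1F
        ra = (word >> 16) & 0x1F
        imm = word & 0xFFFF
        if ra == 0:
            # an RA field of 0 means literal zero -> simplified mnemonic lis
            return "addis", f"lis r{rt}, {imm:#x}"
        return "addis", None
    return None, None


print(decode_alias(0x7C7F1B78))  # ('or', 'mr r31, r3')
print(decode_alias(0x3D4000A9))  # ('addis', 'lis r10, 0xa9')
```

The point is that the base instruction and the alias are two views of the same word, so a consumer that cares about MR has to look at both.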
multiple TOCs
; prologue
addis r2, r12, 0x1d8
addi r2, r2, 0x70 ; TOC
...
addis r3, r2, 0x19
ld r3, -0x1ca8(r3) ; kmem_cache
Here r3 is computed from r2, which holds the address of the TOC, to get the address of kmem_cache. And this is how loading of a constant looks:
lis r10, 0xa9
ori r10, r10, 0xc00 ; A90C00
And yes - LIS is again an alias, PPC_INS_ALIAS_LIS; the instruction ID is PPC_INS_ADDIS.
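The arithmetic behind both snippets can be sketched in a few lines. This is only an illustration of how addis/addi/ori combine their 16-bit immediates: the helper names are mine, and the value of r12 is a made-up entry address, since the real one is not shown in the listing.

```python
MASK64 = (1 << 64) - 1


def sext16(value):
    """Sign-extend a 16-bit immediate."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def addis(ra, imm):
    """ra + (sign-extended imm << 16); pass 0 as ra for the lis form."""
    return (ra + (sext16(imm) << 16)) & MASK64


def addi(ra, imm):
    """ra + sign-extended imm."""
    return (ra + sext16(imm)) & MASK64


def ori(rs, imm):
    """rs | zero-extended imm."""
    return rs | (imm & 0xFFFF)


# Assumption: hypothetical function entry address held in r12
r12 = 0xC000000000010000

r2 = addi(addis(r12, 0x1d8), 0x70)       # TOC pointer
r3 = addis(r2, 0x19)
slot = (r3 + sext16(-0x1ca8)) & MASK64   # effective address used by ld

print(hex(r2 - r12))    # 0x1d80070
print(hex(slot - r2))   # 0x18e358

r10 = ori(addis(0, 0xa9), 0xc00)         # lis r10, 0xa9 ; ori r10, r10, 0xc00
print(hex(r10))         # 0xa90c00
```

So a disassembler that wants to resolve TOC references has to track r2 and fold each addis with the displacement of the instruction that follows it.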
The official documentation is very unclear about the possibility of having multiple TOCs. Theoretically, if the linker puts the TOC somewhere in the middle of the .bss/.data/.rdata sections, it can cover a full 32-bit address range. But what if the program is larger than 4 GB, or was linked with code from several compilers like LLVM & GCC? Unfortunately I don't have such samples, so I cannot say whether my disassembler will work correctly with several TOCs
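Where the 32-bit figure comes from can be checked with a small calculation. This sketch assumes the addis + addi pair shown above, where each instruction contributes a signed 16-bit immediate; the function names are hypothetical. For DS-form loads like ld the low displacement must also be a multiple of 4, which is ignored here.

```python
def toc_reach():
    """Lowest and highest offset from r2 reachable with addis + addi."""
    lo = (-0x8000 << 16) + (-0x8000)
    hi = (0x7FFF << 16) + 0x7FFF
    return lo, hi


def reachable(toc, addr):
    """True if addr can be formed from a single TOC pointer value."""
    lo, hi = toc_reach()
    return lo <= addr - toc <= hi


lo, hi = toc_reach()
print(hex(lo), hex(hi))      # -0x80008000 0x7fff7fff
print(hex(hi - lo + 1))      # 0x100000000
```

The window is exactly 4 GB wide but slightly asymmetric around r2, so anything further away needs either a second TOC or a different addressing sequence.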