понедельник, 24 февраля 2025 г.

ptx internals

It seems that syntax of PTX is undocumented - at least I was unable to find actual BNF grammar or reference implementation. Grammar from antlr project is greatly out-dated and don't contain instructions like ldmatrix/istypep (and btw it has 3505 lines - much bigger than c++14 syntax)

Another project is much better however it's also incomplete. Official documentation is most vague & obscure reading for last years (previous champion) - it contains word "deprecated" 24 times. So it's time to do some disassembling

First remarkable thing is lots of high-entropy parts of data in .rodata section - this is glare sign of encryption, so I made decryptor. After looking at encrypted data we can discover several undocumented features

macros

Who would have guessed that ptxas has typed macros in form
.MACRO name [args];
{ body of macro }
.ENDMACRO
 
Unfortunately built-in macros are processing inside ptxas:
  mov     rbx, cs:macroMap_ptr
  mov     rdx, cs:ptxMacroFuncsFermi_membar_ptr
  lea     rsi, aMembar                  ; "membar"
  mov     rdi, [rbx]
  call    add_hash
  mov     rdi, [rbx]
  mov     rdx, cs:ptxMacroFuncsFermi_cvt_ptr
  lea     rsi, aCvt                     ; "cvt"
  call    add_hash

Functions like ptxMacroFuncsFermi_cvt have lots of "sprintf"s, output can be intercepted with ltrace/frida on strcpy

 

conditional compilation

in form .IF ... .ELIF ... .ELSE ... .ENDIF. And it even supports .in operation! List of pre-defined symbols

 

pragma values

List of values for ptxas v10. However you can't discover those strings inside v12 - bcs guess what? they are encrypted again and algo is not the same as for macros (I wonder what exactly they are afraid of - that amd will steal all their ideas and finally write their own PTX translator?). decryptor & list for v12

 

parsing of instructions 

and finally let's check how they parse numerous instructions. They use clever trick - lookup tables (1 & 2) to identify type of used registers (first column) for each instruction depending from it's type suffix (last column) - this allows them to keep parser of instructions tiny

Комментариев нет:

Отправить комментарий