Saturday, September 13, 2025

practical ced usage: extracting sm machine ID

It's funny how paranoid NVIDIA is, trying to hide as much information from its customers as it can. One example is the so-called "special registers": even from PTX you can read only a limited set of them.

So I played a bit with some undocumented SRs, namely SR_MACHINE_ID_XX and SR_REGALLOC. I wrote code that loads legal special registers and then patched the resulting SASS instructions with my Ced.

Let's see how that code looks in

PTX

      mov.u32         %r2, %tid.x;
      st.global.u32   [%rd2], %r2;
      mov.u32         %r3, %tid.y;
      st.global.u32   [%rd2+4], %r3;
      mov.u32         %r4, %tid.z;
      st.global.u32   [%rd2+8], %r4;
      // inline asm
      mov.u32 %r1, %smid;
      // inline asm
      st.global.u32   [%rd2+12], %r1;
      mov.u32         %r5, 21;
      st.global.u32   [%rd2+16], %r5;
As you can see, nothing special: it just loads some values into r1-r5 and stores them through rd2, which holds the address of the function argument (obtained with cvta.to.global.u64).
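For reference, a kernel along the following lines should compile to PTX like the listing above. Treat it as a sketch only: the original source is not shown in this post, so the names dump_sr and run_dump_sr, the single-thread launch, and the host-side printing are all assumptions made for illustration. The one fixed point is that %smid has no CUDA builtin and has to be read through inline PTX, which matches the "inline asm" markers in the listing.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: stores tid.x/y/z, %smid and the constant 21
// into out[0..4], at the same offsets as the PTX listing.
__global__ void dump_sr(unsigned int *out)
{
    unsigned int smid;
    // %smid is not exposed as a CUDA builtin, so read it via inline PTX
    asm volatile("mov.u32 %0, %%smid;" : "=r"(smid));
    out[0] = threadIdx.x;
    out[1] = threadIdx.y;
    out[2] = threadIdx.z;
    out[3] = smid;
    out[4] = 21; // plain constant, as in the PTX
}

// Hypothetical host helper: launches one thread and copies the five words back.
cudaError_t run_dump_sr(unsigned int host[5])
{
    unsigned int *dev = nullptr;
    cudaError_t err = cudaMalloc(&dev, 5 * sizeof(unsigned int));
    if (err != cudaSuccess)
        return err;
    dump_sr<<<1, 1>>>(dev);
    err = cudaGetLastError();
    if (err == cudaSuccess) // cudaMemcpy also waits for the kernel to finish
        err = cudaMemcpy(host, dev, 5 * sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return err;
}

int main()
{
    unsigned int host[5] = {0};
    cudaError_t err = run_dump_sr(host);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < 5; ++i)
        printf("out[%d] = 0x%x\n", i, host[i]);
    return 0;
}
```

With an unpatched binary and a single thread, out[0..2] are zero and out[4] is 21; out[3] depends on which SM the block lands on.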
 

SASS


  /*0000*/    MOV R1, c[0x0][0x28] ;
  /*0010*/    S2R R0, SR_TID.X ;
  /*0020*/    MOV R5, 0x15 ;
  /*0030*/    ULDC.64 UR4, c[0x0][0x160] ;
  /*0040*/    S2R R2, SR_TID.Y ;
  /*0050*/    S2R R3, SR_TID.Z ;
  /*0060*/    S2R R4, SR_VIRTUALSMID ;
  /*0070*/    STG.E.SYS [UR4+0x10], R5 ;
  /*0080*/    STG.E.SYS [UR4], R0 ;
  /*0090*/    STG.E.SYS [UR4+0x4], R2 ;
  /*00a0*/    STG.E.SYS [UR4+0x8], R3 ;
  /*00b0*/    STG.E.SYS [UR4+0xc], R4 ;


A very straightforward translation: UR4 holds the address of the output buffer, R0 and R2-R4 are filled with "legal" special registers, and R5 holds the constant 0x15 (21). Now let's think about what we can patch:

  • at offset 0x10, replace the second operand of S2R with SR_MACHINE_ID_0 (this operand is named SRa)
  • the same goes for the S2R instructions at offsets 0x40, 0x50 and 0x60
  • and finally, I want to completely replace the load of R5 (the MOV at offset 0x20) with S2R R5, SR_REGALLOC

What can this be useful for? Well, first of all, it's just nice to know that you can do it. Secondly, you can, for example, restrict the execution of CUDA kernel code to a specific GPU card.
