windows deep internals: October 2025

Tuesday, October 28, 2025

sass disasm on perl

To illustrate the use of the modules presented in my previous post, I made yet another SASS disassembler, written entirely in Perl. It is an almost exact copy of my nvd, implemented in just 460 LoC. The only unsupported feature is register tracking, because I still haven't made a Perl binding for it. What it can do better than the original nvdisasm:

shows LUT operations
shows instructions properties/predicates
shows relocs for each code section
shows const bank params

And the most important thing: because it's based on Ced, you can patch any instruction from your script. You can also customize the output, save it somewhere such as a database via Perl DBI, or add your own passes to reveal some dirty nvidia secrets.

Barriers

Read more »

Friday, October 17, 2025

perl modules for CUBINs patching

After playing a bit with my ced, I came to the conclusion that the DSL implemented for editing is not enough: it would be good to have subroutines to patch repeated or similar instructions, to check that the patched instruction is what I want, to patch attributes/relocs, etc.

In other words, I need a full-fledged PL. Although I've read the book series "Modern Compiler Implementation" by Andrew Appel and "Crafting Interpreters", I think making my own PL is overkill, so I made several XS modules for Perl to edit/patch CUBIN files. Why Perl?

I am able to write almost anything I want in it
when I can't, I can always develop my own module(s)
it doesn't make me feel sick the way pseudo-languages like python do
and it's damn good and fast when you try to sketch out prototypes for things you have no idea how to make

ELF::FatBinary

source

for extracting/replacing CUBIN files from FatBinaries

see details here

Cubin::Ced

source

In essence this is a wrapper around Ced: it allows you to disasm/patch SASS instructions

Currently it doesn't support register tracking

See doc in POD format

Cubin::Attrs

source

Module to extract/patch attributes of CUBIN files + also relocs

doc in POD format

Sample

Read more »

Wednesday, October 1, 2025

addresses of cuda kernel functions

Quote from official document:

It is not allowed to take the address of a __device__ function in host code

It has long since stopped surprising me that the whole of CUDA is made up of ridiculous restrictions. What if I told you that paranoid nvidia lies as usual and you actually can get addresses of kernel functions in your host code?

But first let's check what workarounds we can employ to have function pointers. I don't know for what pedagogical purpose this code was intentionally written so poorly that it does not even free the allocated memory (and now millions of brainless artificial idiots will copy-paste it forever), so I made a patched version. You can see that an attempt to read from addresses previously gathered with cudaMemcpyFromSymbol results in error 1 (invalid argument).
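The usual workaround can be sketched roughly as follows. This is a minimal illustration, not the patched version mentioned above: the names `op_fn`, `add_op`, `d_add_ptr`, `apply` and `run_via_pointer` are made up for the example. A `__device__` variable is initialized with the function's address, the host copies that value out with cudaMemcpyFromSymbol, and the value is only useful when handed back to a kernel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// All names below are hypothetical; none of them come from the post.
typedef int (*op_fn)(int, int);

__device__ int add_op(int a, int b) { return a + b; }

// Device-side variable initialized with the device address of add_op.
__device__ op_fn d_add_ptr = add_op;

// Kernel that calls whatever device function pointer it receives.
__global__ void apply(op_fn f, int a, int b, int *out) { *out = f(a, b); }

// Copies the pointer value to the host, then passes it back to a kernel.
// In this sketch -1 signals a CUDA error.
int run_via_pointer(int a, int b) {
  op_fn h_ptr = nullptr;
  if (cudaMemcpyFromSymbol(&h_ptr, d_add_ptr, sizeof(h_ptr)) != cudaSuccess)
    return -1;

  int *d_out = nullptr;
  if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;

  apply<<<1, 1>>>(h_ptr, a, b, d_out);

  int h_out = -1;
  cudaError_t err =
      cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(d_out);  // release the buffer before returning
  return err == cudaSuccess ? h_out : -1;
}

int main() {
  printf("add_op via pointer: %d\n", run_via_pointer(2, 3));
  return 0;
}
```

The host never dereferences `h_ptr` here; it only carries the value from one device-side place to another.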

OK, but we could just return the address of a function directly from another kernel function, right? So I made a quick & dirty hack
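One possible shape of such a hack is sketched here. It is an assumption about the idea, not the author's code, and `op_fn`, `add_op`, `get_addr` and `fetch_device_address` are hypothetical names. Taking the address of a `__device__` function is allowed in device code, so a tiny kernel can store it in a buffer that the host then copies back.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// All names below are hypothetical; none of them come from the post.
typedef int (*op_fn)(int, int);

__device__ int add_op(int a, int b) { return a + b; }

// The kernel stores the device address of add_op where the host can fetch it.
__global__ void get_addr(op_fn *out) { *out = add_op; }

// Returns the device address of add_op, or nullptr on any CUDA error.
op_fn fetch_device_address() {
  op_fn *d_slot = nullptr;
  op_fn h_ptr = nullptr;
  if (cudaMalloc(&d_slot, sizeof(op_fn)) != cudaSuccess) return nullptr;

  get_addr<<<1, 1>>>(d_slot);

  if (cudaMemcpy(&h_ptr, d_slot, sizeof(op_fn), cudaMemcpyDeviceToHost) !=
      cudaSuccess)
    h_ptr = nullptr;
  cudaFree(d_slot);
  return h_ptr;
}

int main() {
  op_fn p = fetch_device_address();
  // The value is a device address; the host must not call through it.
  printf("device address of add_op: %p\n", (void *)p);
  return 0;
}
```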

I brute-forced all combinations of cf1 (__device__/__constant__) and variants of cudaMemcpyFromSymbol/cudaMemcpy, all with no luck

So it's time to run

cuda-gdb

Read more »
