target cudacore /full/path/to/coredump
and then type lots of info cuda XXX
So last weekend I wrote a tool to parse and dump CUDA coredumps. It even works on a machine without the CUDA SDK, which might be useful if you collect all crash dumps in some centralized storage with the help of CUDA_COREDUMP_PIPE.
But first
A little bit of theory
The second group contains the whole thread hierarchy:
- list of SMs in .cudbg.smtbl.devX section
- list of CTAs in .cudbg.ctatbl.devX.smY sections
- list of WARPs in .cudbg.wptbl.devX.smY.ctaZ sections
- and finally list of threads in each warp - in sections .cudbg.lntbl.devX.smY.ctaZ.wpI
Each thread has its own set of sections:
- for call stack - .cudbg.bt.devX.smY.ctaZ.wpI.lnJ
- registers in .cudbg.regs.devX.smY.ctaZ.wpI.lnJ
- predicates in .cudbg.pred.devX.smY.ctaZ.wpI.lnJ
- local memory in .cudbg.local.devX.smY.ctaZ.wpI.lnJ. Curiously, all of those sections have the same addresses
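The naming scheme above is regular enough to decode mechanically. Here is a minimal sketch that splits a section name into its kind and hierarchy indices. The function name parse_section_name is hypothetical, and the sketch assumes the X/Y/Z/I/J placeholders are plain decimal numbers, which the text above does not state explicitly.

```python
import re

# Matches names such as ".cudbg.regs.dev0.sm3.cta1.wp2.ln5".
# Every index suffix is optional, so the higher-level tables
# (smtbl, ctatbl, wptbl, lntbl) are parsed by the same pattern.
# Assumption: indices are written as decimal numbers.
_SECTION_RE = re.compile(
    r"^\.cudbg\.(?P<kind>[a-z]+)"
    r"(?:\.dev(?P<dev>\d+))?"
    r"(?:\.sm(?P<sm>\d+))?"
    r"(?:\.cta(?P<cta>\d+))?"
    r"(?:\.wp(?P<wp>\d+))?"
    r"(?:\.ln(?P<ln>\d+))?$"
)


def parse_section_name(name):
    """Return a dict with 'kind' plus any indices present, or None."""
    m = _SECTION_RE.match(name)
    if m is None:
        return None
    out = {"kind": m.group("kind")}
    for key in ("dev", "sm", "cta", "wp", "ln"):
        value = m.group(key)
        if value is not None:
            out[key] = int(value)
    return out
```

Grouping the parsed results by (dev, sm, cta, wp, ln) then gives back the thread hierarchy without reading any section contents.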
Where to get the faulty instruction address
- for drivers with version >= 555, the SM has an errorPC field
- the WARP has an errorPC field too
- finally, each lane has the exception and virtualPC fields in CudbgThreadTableEntry
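One way to combine these three sources is to try the most specific one first. The sketch below is only an illustration: the function pick_faulty_pc, the dict-based records, and the lane -> warp -> SM priority order are all assumptions, as is treating a nonzero exception value as "this lane faulted". Only the field names errorPC, exception and virtualPC come from the list above.

```python
def pick_faulty_pc(lane, warp, sm, driver_major):
    """Return (source, pc) for the first usable address, or None.

    lane, warp and sm are plain dicts standing in for the decoded
    table entries (hypothetical representation).
    """
    # Assumption: a nonzero 'exception' marks a faulting lane.
    if lane.get("exception"):
        return ("lane", lane["virtualPC"])
    # The warp-level errorPC, if the entry carries one.
    if warp.get("errorPC") is not None:
        return ("warp", warp["errorPC"])
    # The SM-level errorPC only exists for driver versions >= 555.
    if driver_major >= 555 and sm.get("errorPC") is not None:
        return ("sm", sm["errorPC"])
    return None
```

For example, pick_faulty_pc({"exception": 0}, {}, {"errorPC": 0x200}, 535) finds nothing, because the SM field is ignored for a 535 driver.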
Installation
nvidia-smi -q | head
Timestamp : Mon Jan 26 17:12:45 2026
Driver Version : 535.183.01
CUDA Version : 12.2
Attached GPUs : 1
GPU 00000000:01:00.0
Product Name : NVIDIA Very Expensive Card
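Since the available fields depend on the driver version (the 555 threshold mentioned earlier), it can be handy to extract the version from this output in a script. A minimal sketch, assuming the "Driver Version : ..." line format shown above; the function name driver_version is hypothetical.

```python
import re


def driver_version(smi_text):
    """Parse 'Driver Version : A.B.C' from `nvidia-smi -q` output.

    Returns a tuple of ints such as (535, 183, 1), or None if the
    line is not present.
    """
    m = re.search(
        r"^\s*Driver Version\s*:\s*(\d+(?:\.\d+)*)",
        smi_text,
        re.MULTILINE,
    )
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


sample = """Timestamp : Mon Jan 26 17:12:45 2026
Driver Version : 535.183.01
CUDA Version : 12.2
"""

version = driver_version(sample)
# The first component is the major version compared against 555.
has_sm_error_pc = version is not None and version[0] >= 555
```

To run it against a live machine, the text could come from subprocess.run(["nvidia-smi", "-q"], capture_output=True, text=True).stdout.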
Command line options
- backtraces with -b option
- grids with -g
- registers/predicates with -r
- CTA/WARP/threads with -t
Because the dump can be huge, you can restrict it to only the WARPs/threads with faulty instructions using the -e option
To set the right driver version, use the -D option
Happy debugging!
