пятница, 1 ноября 2024 г.
perl module for DWARF debug info parsing
воскресенье, 20 октября 2024 г.
binding c++ objects to perl
Sure there are lots of ways to do this (as usually in Perl: "there's more than one way to do it"), just to name few:
- swig - result looks not native and clumsy, also you need to make facade-like interfaces for really complex classes
- ffi - also has problems with complex classes
- XS - official and wide-spread way. Btw smoothest and most seamless Win32::OLE module uses it exactly
and besides you can simple call perl functions directly from c++ code (with all bells and whistles "dSP; ENTER; SAVETMPS" etc) like polymake do
So this article is just remaining for myself how to bind small-sized c++ classes to perl using XS. Lets begin with simplest one - just perl wrapper around Interval-Tree
- to create new skeleton: h2xs -cfn Module::Name
- patch Makefile.PL to add XSOPT="-C++" and change C compiler CC="g++", also highly likely we should add -lstdc++ to LIBS
- don't forget to add all source/header files to MANIFEST
Code is simple, I prefer to make my own MAGIC instead of implementation DESTROY method - you will see further why
четверг, 3 октября 2024 г.
TLS in gcc RTL
(insn 29 8 10 3 (set (reg:SI 0 ax [orig:82 _1 ] [82])
(mem/c:SI (const:DI (unspec:DI [
(symbol_ref:DI ("tls_state") [flags 0x2a] <var_decl 0x7f61bce052d0 tls_state>)
] UNSPEC_NTPOFF)) [2 tls_state.idx+0 S4 A32 AS1])) "stest.cc":15:37 81 {*movsi_internal}
(nil))
We can see attribute UNSPEC here - it can be extracted as XINT (in_rtx, 1) for GET_CODE == UNSPEC. The first problem here is that values UNSPEC_XXX are machine-specific. For example the same code while cross-compiling for mips:insn 60 13 61 (set (reg:SI 3 $3)
(unspec:SI [
(const_int 0 [0])
] UNSPEC_TLS_GET_TP)) "stest.cc":15:37 706 {*tls_get_tp_si_split}
(nil))
(insn 61 60 15 (set (reg:SI 2 $2 [201])
(reg:SI 3 $3)) "stest.cc":15:37 313 {*movsi_internal}
(nil))
(insn 15 61 16 (set (reg:SI 3 $3 [202])
(high:SI (const:SI (unspec:SI [
(symbol_ref:SI ("tls_state") [flags 0x2a] <var_decl 0x7ff96a9dfcf0 tls_state>)
] 306)))) "stest.cc":15:37 313 {*movsi_internal}
(nil))
Second - NTPOFF in comments marked as belonging to "Relocation specifiers" while
;; TLS support
UNSPEC_TP
UNSPEC_TLS_GD
UNSPEC_TLS_LD_BASE
UNSPEC_TLSDESC
UNSPEC_TLS_IE_SUN
Thirdly - lets recompile the same file with -fPIC option:(call_insn/u 9 8 11 3 (parallel [
You can notice that address of TLS var gathered with __tls_get_addr but next access to it not marked in any way - just regular chain set mem component_ref. Disgusting
(set (reg:DI 0 ax)
(call:DI (mem:QI (symbol_ref:DI ("__tls_get_addr")) [0 S1 A8])
(const_int 0 [0])))
(unspec:DI [
(symbol_ref:DI ("tls_state") [flags 0x10] <var_decl 0x7fcea13c92d0 tls_state>)
(reg/f:DI 7 sp)
] UNSPEC_TLS_GD)
]) "stest.cc":15:37 1048 {*tls_global_dynamic_64_di}
(expr_list:REG_EH_REGION (const_int -2147483648 [0xffffffff80000000])
(nil))
(nil))
(insn 11 9 46 3 (set (reg:SI 0 ax [orig:82 _1 ] [82])
(mem/c:SI (reg/f:DI 0 ax [88]) [2 tls_state.idx+0 S4 A32])) "stest.cc":15:37 81 {*movsi_internal}
(nil))
Just note for myself what else cannot be extracted from gcc RTL:
- pointer to members -
despite the fact that in file cp/cp-tree.def there are OFFSET_REF & PTRMEM_CST in real RTL they are just integer constants - pointer to member methods - similalry
- TLS are indistinguishable from regular global vars
- to be continued
понедельник, 30 сентября 2024 г.
gcc plugin to collect cross-references, part 9
Lets extract some useful results from my gcc plugin for collecting cross-references: 1, 2, 3, 4, 5, 6, 7 & 8.
I've noticed that plugin worked unbearably slowly on big source files (like compiling itself) - something above 30 minutes, so I add profiling to it (see details here). Profiler showed that consumed user-time was only 18 seconds - this is glare sign that plugin spending most of time on some kind of lock. After some meditation (and strace/ltrace) I finally found that root of problem was in sqlite - it sleeping in fdatasync on each data writing to its DB! The cure is to executePRAGMA synchronous=OFF
this gives a non-illusory x100 speed up. Can I now consider myself as one of mythical "x100 programmer", he-he?
The next problem is how to store data from multiple gcc processes into single sqlite DB (like parallel make -j X). I've decided to use Apache Thrift - old and unfashionable but it really works and you can make multithreaded RPC server in 500 lines of c++ code. Server just listen on some port and stores all data from RPC clients into sqlite DB. Command line arguments path2db port_number .To shutdown RPC server use rpcq port_number. Of course you can make your own implementation of RPC server, for example with PHP & DB/2 (unfortunately Thrift does not supports Cobol)
As a consequence now plugin has 3 implementations for data storing:
- in plain text files, self-test takes 14s
- into sqlite, self-test takes 18s
- in sqlite via RPC server listened on some arbitrary TCP port. Self-test takes 23s. Connection string looks like -fplugin-arg-gptest-db=localhost:port_number
Btw sqlite version has size 2.67Mb, Thrift rpc client - 4.65Mb. Something is fundamentally wrong with "modern frameworks"
DB schema
is very simple - in essence it consists of two tables:symtab for storing symbols
- id - primary key
- mangled name of symbol/function or value of literal
- fname - for functions this is name of file where they were declared
- bcount - amount of basic blocks of function, can be used as some complexity metric
xrefs to link symbols and functions from which they are referred
- id of function
- bb - index of basic block
- arg - if non-zero - index of function's argument
- what - id of referred symbol
- kind - one letter type of xref:
- 'c' - function call
- 'v' - call of virtual method
- 'l' - literal
- 'r' - just reference to some global symbol
- 'f' - field of some struct/class/union
- 'F' - constant (with -fplugin-arg-gptest-ic option)
Fetching results
пятница, 20 сентября 2024 г.
Tracking arguments of functions in gcc
I continuing to improve my gcc plugin for collecting cross-references: 1, 2, 3, 4, 5, 6 & 7. On this week I decided to see if I can extract source of complex types like records and most prominent kind of them is arguments of function - they are easy to identify in asm (but not so easy to bind them in gcc RTL expressions)
Having function declaration fdecl we can extract arguments with something like:
for (tree arg = DECL_ARGUMENTS (fdecl); arg; arg = DECL_CHAIN (arg))
{
auto a = DECL_RTL_IF_SET (arg);
if ( a && REG_EXPR(a) ) { // do something with argument in REG_EXPR(a)
I need only arguments that are a pointer or reference to record/union so I filtered them in method check_arg
Those was easiest part, and now try to find how arguments can be tracked in RTL.
воскресенье, 15 сентября 2024 г.
bug in gcc?
It seems that gcc not always put COMPONENT_REF when access fields of structures passed by reference. For example I add today simple static function append_name(aux_type_clutch &clutch, const char *name)
It has reference to field txt of structure aux_type_clutch but RTL looks like:
(insn 20 19 21 4 (set (reg/f:DI 0 ax [89])
First instruction just loads in register RAX parm_decl (of type aux_type_clutch) like
(mem/f/c:DI (plus:DI (reg/f:DI 6 bp)
(const_int -8 [0xfffffffffffffff8])) [258 clutch+0 S8 A64])) "gptest.cpp":1364:24 80 {*movdi_internal}
(nil))
(insn 21 20 22 4 (parallel [
(set (reg/f:DI 0 ax [orig:83 _2 ] [83])
(plus:DI (reg/f:DI 0 ax [89])
(const_int 8 [0x8])))
(clobber (reg:CC 17 flags))
]) "gptest.cpp":1364:24 230 {*adddi_1}
(expr_list:REG_EQUAL (plus:DI (mem/f/c:DI (plus:DI (reg/f:DI 19 frame)
(const_int -8 [0xfffffffffffffff8])) [258 clutch+0 S8 A64])
(const_int 8 [0x8]))
(nil)))
mov rax, [rbp+clutch]
and second add to RAX just some const 0x8 (offset to field txt):
add rax, 8
it's impossible from RTL to track back this constant to offset in COMPONENT_REF
What is more even strange - for methods you can track fields access for parameters passed by reference (like this) - for example in constructor of the same aux_type_clutch:
(insn 12 11 13 2 (set (mem:SI (plus:DI (reg/f:DI 0 ax [94])
(const_int 40 [0x28])) [4 this_12(D)->level+0 S4 A64])
(const_int 0 [0])) "gptest.cpp":465:4 81 {*movsi_internal}
(nil))
четверг, 5 сентября 2024 г.
hidden executable pages in linux kernel, part 2
In part 1 I've described how memory managed by hardware. Now lets dig into how kernel sees memory. Not surprisingly that we should check the same structures that malicious drivers update while hiding
Modules
- on kernel >= 6.4 mem[MOD_TEXT].base & mem[MOD_TEXT].size
- on kernel < 4.5 module_core & core_text_size
- otherwise core_layout.base & core_layout.text_size
vmap_area_list
False positives
среда, 28 августа 2024 г.
hidden executable pages in linux kernel
Standard method to find rootkits like this (or like this) is cross-scanning PTEs of memory without NX bit, then extract pages belonging to LKMs - thus in set difference we will gather hidden executable memory. Lets check how we can scan PTEs under linux
disclaimer
cat /boot/config-$(uname -r) | grep -E 'X86_5LEVEL|PGTABLE_LEVELS'
CONFIG_PGTABLE_LEVELS=5
CONFIG_X86_5LEVEL=y
So my kernel has 5 levels and this exactly correspond to hardware:
- pte_t - PTE, 9 bits of page address (size of page is 4096 bytes - low 12 bits)
- pmd_t - PDE. another 9 bits of address
- pud_t - PDPTE, 9 bits
- p4d_t - PML4, 9 bits
- pgd_t - PML5, 9 bits
Total 12 + 5 * 9 = 57bits
It's really amazing how memory management is implemented differently in different operating systems running on the same hardware. For example in Windows all PTE are located in huge contiguous sparse array and you can get address of PTE for some address of memory with very simple function MiGetPteAddress. Let MiGetPteAddress(addr1) is addr2. We then can continue this process for all paging levels - get PteAddress(addr2) and so on - to find if all 5 parts of address is valid. And this can be used in reverse direction - skip scan of huge PTEs areas if they are not presented in memory
Unfortunately in linux PTE not stored in one huge contiguous memory. So we need to start with top-level (from PGD) and scan all tables on lower levels. Root of pgd_t stored in init_mm->pgd. As usually var init_mm is not exported
<sarcasm>Linux widely known for the consistency, completeness and backward-compatibility of its API and being developers-friendly in general</sarcasm>
Next we need way to find valid pXX_t. Seems that there are functions pXX_present, pXX_bad and so on. The right sequence of calls is
- pXX_none
- pXX_leaf - this is damn good name for functions to check for large pages
- pXX_bad
- and finally pXX_offset to get item for next level
Unfortunately there are also so called hugeTLB pages (enabled with CONFIG_HUGETLB_PAGE):
grep 'HUGETLB_PAGE'
/boot/config-$(uname -r)
CONFIG_HUGETLB_PAGE=y
CONFIG_HUGETLB_PAGE_FREE_VMEMMAP=y
As you may expect functions pmd_huge & pud_huge non-exported too (and p4d_huge & pgd_huge are just dumb macros)
Finally we need to check if some page is executable. This is very hardware specific - for example
- Arc has flag _PAGE_EXECUTE
- aarch64 has flag _PAGE_KERNEL_EXEC
- powerpc has _PAGE_EXEC
- s390 has _PAGE_NOEXEC
so for some arch there is function pte_exec, while for another pte_no_exec. Also it's curious that there are no analogs for pud/pmd etc - so actually I have zero ideas how check executability for large & huge pages
However, this is not the end of suffering. Quick check:
grep address /proc/cpuinfo
shows that they lie - my hardware actually supports only 48bit addresses, so kernel should have only 4 levels of paging. Try to guess how they swept the trash under the carpet?
address sizes : 39 bits physical, 48 bits virtual
вторник, 20 августа 2024 г.
bpf_verifier_ops
Lets dissect some typical ebpf spyware. It sets up uprobes on
- SSL_read_ex
- SSL_read
- SSL_write_ex
- SSL_write
- gotls_write_register
- gotls_read_register
- gotls_exit_read_register
and uses bpf functions probe_read_user & probe_read_user_str to steal data and map_update_elem & ringbuf_submit to store data in bpf maps
How we can mitigate this?
Official way is to use LSM - function __sys_bpf calls security_bpf so we could register with security_add_hooks LSM hook with index bpf. This effectively prevents loading of ebpf program and sometimes is not what we want - for example in case of honeypots there is high chance that usermode program just will exits after failed ebpf program loading and you can't monitor which connections it will try to establish
Another way - is to patch bpf_func_proto for selected functions, like I did. However this is brutal method and affects all ebpf programs (I still believe that some is not spyware, he-he)
Luckily there is way to blind only some types of ebpf programs - method get_func_proto in bpf_verifier_ops. I made PoC to blind aforementioned 4 functions for BPF_PROG_TYPE_TRACING & BPF_PROG_TYPE_KPROBE only
Now we have another problem - how to check integrity of bpf_verifier_ops? I've also add this check in my lkcd. Example of output when PoC ublind is loaded looks like:
[24] type BPF_PROG_TYPE_TRACING at 0xffffffffc1357720 - ublind!s_trace_patched
get_func_proto: 0xffffffffc13551e0 - ublind!my_func_proto
is_valid_access: 0xffffffffaee24e20 - kernel!tracing_prog_is_valid_access
четверг, 25 июля 2024 г.
crowdstrike
since everyone (including "professional C++ programmers") has already written about so I will also write it down (for memory)
This was not first time: proof
Official report is very vague but we can conclude that root cause was bug in config parser leading to dereference of bad (it's unclear if it contained zero or just was uninitialized) pointer, as I assumed
Well, this is not big deal - I made thousands bsod/kernel panics. There usually long path from developers machine to production, right?
Then I read second report - and this is pure madness. They DIDN'T TESTED IT AT ALL!!! Proof is very simple - lets assume we have bug with probability of bsod P. Then for N test machines probability to not catch this bug is (1-P) ^ N. for P = 0.5 (bsod happens or not, he-he) and 8 test machines this probability 0.4%. Given that in this incident probability of bsod was close to 1 - size of their test cluster is exactly 0
Instead they used some kind of spell-checkar Content Validator. I'm almost sure that they were bitten by adherents of totalitarian eBPF sect with their “20,000 lines of code” verifier (which exists solely to drink blood and ruin lives also unable to prevents cases like this)
Well, this is already serious problems but still not disaster - like you can deploy only for very small percent of users, employ fuse to prevent spreading of malicious update (for example by tracking heartbeats from updated machines) and so on. Right? ...
So right recipe for catastrophe from group of rebels:
- never test anything - static validator is enough
- deploy in Friday
- deploy to all users at the same time
вторник, 23 июля 2024 г.
moving string literals to discardable sections
gcc
- in case of gcc plugin we must operate on RTL bcs some functions can be inlined (or even fully eliminated) - you just cannot do it in GIMPLE
- we need some tracking of cross-references between functions and literals
Unfortunately gcc does not have such tracking - all string literals stored in string pool. To make live even worse sometimes string pool belongs to function. From this comment
Only a few targets need per-function constant pools. Most can use one per-file pool
we can conclude that actually nobody ever understand when per-function constant pools are used. Probably I could patch my gcc plugin to add tracking of string literals, finding candidates for eviction to discardable section and marking them with __section__
attribute. However debugging of RTL gcc-plugins is real nightmare
Patching obj files
- currently LIEF (or binutils) cannot do it
- we need to patch not only relocs but also DWARF debug info and symbol table - bcs after removing of some string from .rodata addresses of all entries below must be changed
In general patching of binary is not perfect idea, especially if there is good alternative
четверг, 11 июля 2024 г.
ebpf map as communication channel
- ioctls
- read from driver/write to driver + possible employ polling
- file in kernfs
- futex in shared memory
- io_uring
- netlink like auditd or XFRM do
etc
Let's look at typical situation - your IT crew was bitten by violent adherent of the totalitarian sect of rust ebpf Witnesses and as a consequence you now have hundreds of ebpfs (and no single person who know how this pile of spaghetti code works)
So I made simple POC to show how you can do it. Source of driver & userland test program
Lets dive into gory details and see what other non-obvious advantages can be obtained from such heretical crossbreeding
понедельник, 1 июля 2024 г.
dumping ebpf kind_ops
четверг, 20 июня 2024 г.
frame sizes in dwarfdump
Add today dumping of stack frame sizes to my dwarfdump (well, where they are exists). Format of .debug_frame section obviously was invented by martian misantrophes so patch is huge and ugly
Sample of output for some random function from mips kernel:
// Addr 0x183D27C .text
// Frame Size 18
// FileName: drivers/char/random.c
// LocalVars:
// LVar0, tag 6965A71
// int ret
// LVar1, tag 6965A8F
// bool branch
ssize_t random_read_iter(struct kiocb* kiocb,struct iov_iter* iter);
and the same in JSON format:
"110516780":{"type":"function","file":"drivers/char/random.c","type_id":"110427157","name":"random_read_iter","addr":"25416316","section":".text","frame_size":"24","params":[{"name":"kiocb","id":"110516807","type_id":"110466611"},{"name":"iter","id":"110516828","type_id":"110470510"}],"lvars":["110516849":{"type":"var","file":"drivers/char/random.c","owner":"110516780","type_id":"110426600","name":"ret"},
"110516879":{"type":"var","file":"drivers/char/random.c","owner":"110516780","type_id":"110427073","name":"branch"}]}
Unfortunately dwarfdump can't work with kernel modules bcs they are actually just object files and for this sad reason they have relocations even for debug sections. So to properly deal with this files I need to apply relocations first and this is arch-specific action (which I prefer to avoid)
Binutils has 2 solution of this problem:
- objcopy calls bfd_simple_get_relocated_section_contents from libbfd.so and this means that tool should have dependency from it
- readelf has it's own relocation code in apply_relocations and this is huge pile of code
And I really don’t like both of the above approaches
вторник, 18 июня 2024 г.
function stack size in GCC
DW_AT_frame_base
block, right?NO
- flag_callgraph_info is set - corresponding to option -fcallgraph-info
- flag_stack_usage_info is set - corresponding to options -fstack-usage
There should probably be heartbreaking conclusion about quality of opensource in general and gcc and in particular...
суббота, 15 июня 2024 г.
stack frames size in DWARF
As you might suspect, the stack size in the kernel is quite meager so it's very important to know how much of stack occupy your driver. So I conducted inhumane experiments on my own driver lkcd to check if stack frame size can be extracted from DWARF debug info. Biggest function in my driver is lkcd_ioctl so lets explore it
mips
addiu sp,sp,-688
воскресенье, 9 июня 2024 г.
kotest fix for mips
$LC0:
.ascii "const string %f\012\000"
At the same time it does for named literals:
.type fmt_msg, @object
.size fmt_msg, 15
fmt_msg:
.ascii "enter with %d\012\000"
I am too lazy to investigate which ancient specification from past century it follows. Fortunately this is easy repairable problem - just calculate size of symbol as distance to next one (or till end of section). Since I suspect that this is not the only architecture with similar gcc behavior, I add -f option to do such kind of sizes recalculation
Some results for mips32 kernel 6.0:
find ~/linux60/ -type f -name "*.ko" | xargs ./kotest | awk -f total.awk
1890293
1497092
понедельник, 27 мая 2024 г.
kotest
Linux kernel allows you to have discardable sections in LKM and this creates problem of links between two kind of memory. As you can guess keeping pointer to already unloaded area can be very dangerous so I made simple tool kotest to check such kind of links. It divides sections of ELF file into two category and check all relocations - relocs between areas of the same type considered as ok. To keep track if some symbol from persistent area is used only from discardable sections I also use couple of reference counts
command line options
- -b take into account variables in .bss
- -h make hexdump of found vars
- -v verbose mode
find path_to_kernel_root -type f -name "*.ko" | xargs kotest
it is reliable to use for analysis only fixups?
.init.text:0000000000016155 mov rdi, offset ip_vs_genl_family
.init.text:000000000001615C mov cs:ip_vs_genl_family.module, offset __this_module
.init.text:0000000000016167 mov cs:ip_vs_genl_family.ops, offset ip_vs_genl_ops
.init.text:0000000000016172 mov cs:ip_vs_genl_family.mcgrps, 0
.init.text:000000000001617D mov qword ptr cs:ip_vs_genl_family.n_ops, 10h
.init.text:0000000000016188 call __genl_register_family
.rodata + 5A0 (ip_vs_genl_ops) rref 1 xref 0 add size 768
why not use famous objtool?
LKM loading
четверг, 16 мая 2024 г.
linux input handles
Try convince me that input_register_handle is not best place for installing keylogger, it's even strange that they were embarrassed to connect there their holy cow eBPF. Long story short - there are 3 structures in linux kernel for servicing of input devices:
- input_dev chained in list (sure non-exported) input_dev_list
- input_handler chained in list input_handler_list
- input_handle with pointer to input_handler and attached to input_dev (in list h_list)
So keylogger could
- just call input_register_handle
- to be more stealthy - patch functions pointers in already registered input_handler (very convenient that sysrq_handler missed out method event)
- attach own input_handle to desired input_dev but without registering corresponding input_handler - yes, this is perfectly legal
- patch functions pointers directly in input_dev
Guess in three tries what exactly you can extract from sysfs?
So I add to my lkcd dumping of all above-mentioned structures. Sample of output
среда, 8 мая 2024 г.
asm injection stub
- bcs our stub can be called from two different hooks we should store somewhere via which entry point we was called
- restore old hooks values
- call dlopen/dlsym and then target function (and pass it address of injection stub for delayed munmap. No, you can't free those memory directly in your target function - try to guess why)
- get right old hook and jump to it if it was installed or just return to code called __malloc_hook somewhere in libc
So I collected all parameters to do job in table dtab consisting from 6 pointers
- __malloc_hook address
- old value of __malloc_hook
- __free_hook address
- old value of __free_hook
- pointer to dlopen
- pointer to dlsym
четверг, 2 мая 2024 г.
yet another linux process injection
As you my know there are two methods
1) using LD_PRELOAD, don`t work if you want to inject into already running (and perhaps even many days) process
2) ptrace. Has the following inherent disadvantages
- target process can be ptraced by somebody else
- victim program can detect ptrace
- you just want to avoid in logs something like ptrace attach of "./a.out"[PID] was attempted by "XXX"
So I developed very rough analogs of famous VirtualAllocEx/VirtualProtectEx + simple hook to hijack execution onto assembly written shellcode to call dlopen/dlsym. Currently only x86_64 supported bcs I am too lazy to rewrite this asm stub
Prerequisites
You must have root privileges and be able to build and load kernel modules. I tested code on kernel 6.8, 5.15 and probably it also can work on 4.x, not sure about more old versions
Lets start lighting the dirty details in reverse order
суббота, 27 апреля 2024 г.
gcc: placing strings in arbitrary section
As you may know gcc always placing string literals in section .rodata. Let's assume what we want to change
this outrageous behavior - for example for shellcode literals used in function init_module (contained in section .init.text)
We can start with dumping of gcc RTL - for something like printf("some string") RTL will be symbol_ref <var_decl addr *.LC1> and in .S file this looks like
.section .rodata
.LC1:
.string "some string"
That unnamed VAR_DECL has attribute DECL_IN_CONSTANT_POOL. Probably it is possible to make gcc plugin to collect such literals referring from functions inside specific section and instead of DECL_IN_CONSTANT_POOL patch them section attribute. However this requires too many labour so lets try something more lazy
Possible solutions is to explicitly set section via gcc __attribute__:
#define RSection __attribute__ ((__section__ (".init.text")))
#define _RN(name) static const char rn_##name[] RSection =
#define _GN(name) rn_##name
...
_RN(dummy_str) "some string";
printf("%s\n", _GN(dummy_str));
Looks very ugly. And even worse - this raises compilation error:
error: ‘rn_dummy_str__’ causes a section type conflict with ‘init_module’
11 | #define _RN(name) static const char rn_##name##__[] __attribute__ ((section (".init.text"))) =
How we can fix this problem? My first thought was to write quick and dirty Perl script to scan sources for _RN markers and produce .S file where all strings were placed in right section. But then I decided to overcome my laziness and made patch for gcc - it just checks if passed declaration is initialized with STRING_CST value. Surprisingly, it works!
воскресенье, 31 марта 2024 г.
netfilter hooks
- field nf_hooks_ingress in net_dev (when CONFIG_NETFILTER_INGRESS enabled)
- on more new kernels also field nf_hooks_egress in net_dev (when CONFIG_NETFILTER_EGRESS enabled)
- lots of fields in struct netns_nf:
- hooks_ipv4
- hooks_ipv6
- hooks_arp (CONFIG_NETFILTER_FAMILY_ARP)
- hooks_bridge (CONFIG_NETFILTER_FAMILY_BRIDGE)
- hooks_decnet (CONFIG_NETFILTER_FAMILY_DECNET)
lkmem -c -n ../unpacked/101 /boot/System.map-5.15.0-101-generic
...
2 nf hooks:
[0] type 02 IPV4 idx 0 0xffffffffa7b84dd0 - kernel!apparmor_ipv4_postroute
[1] type 10 IPV6 idx 0 0xffffffffa7b84e10 - kernel!apparmor_ipv6_postroute
пятница, 8 марта 2024 г.
Profiling shared libraries on linux
Disclaimer: proposed approach uses dirty hacks & patches and tested on x86_64 only so use it at your own risk. Also no chatGPT or some another Artificial Idiots were used for this research
Lets assume that we have shared library (for example R extension or python module) and we want to know where and why it spending many hours and consuming megawatts of electricity. There is even semi-official way to do this:
- compile shared library with -pg option
- set envvar LD_PROFILE_OUTPUT to directory where you want to store profiling data
- set envvar LD_PROFILE to filename of library to profile
- run your program. Well, sounds that you need lots of things to do before this step and you can`t set up profiling dynamically
- run sprof on profiling log
Unfortunately this method just don`t work - sprof fails with cryptic message Inconsistency detected by ld.so: dl-open.c: 890: _dl_open: Assertion `_dl_debug_initialize (0, args.nsid)->r_state == RT_CONSISTENT' failed!
воскресенье, 14 января 2024 г.
failed attempts to draw graphs
CSES has several really hard graph-related tasks, for example
- New Flight Routes with directed graph (btw this task was borrowed from russian olympiad contest)
- Forbidden Cities with undirected graph
It would be a good idea to visualize those graphs. One of well-known tool to do this is Graphviz, so I wrote simple perl script to render graph from CSES plain text into their DSL. On small graphs all goes well and we can enjoy with something like
But seems that on big graphs with 200k nodes dot just can`t finish rendering and after ~2 hours of hard work met with OOM killer. Lets think how we can reduce size of graph
четверг, 4 января 2024 г.
Distinct Colors
I`ve solved yet another very funny CSES task - it looks very similar to another task called "Reachable Nodes" (my solution for it). The only difference is that we asked to count not unique nodes but colors of nodes. What can go wrong?
And this is where funny part begins - my patched solution got crashes. gdb didn`t showed nothing interesting. However I remember scary cryptic command to show stack usage:
print (char *)_environ - (char *)$sp
$1 = 8384904
Very close to default 8Mb (check ulimit -s). Wait, WHAT? Do we really have stack exhausting? Lets check - 8 * 1024 * 1024 = 8388608 bytes. Tree can have 200000 nodes. 8388608 / 200000 = ~42 bytes for each recursive DFS call. Seems to be true - in each call we store return address + stack frame RBP + 3 registers holding args (this, indexes of node and parent) - so at least 5 * 8 = 40 bytes. It`s so happened that some tests contain tree with very long stem from root till end, so yes - recursive DFS cannot visit all nodes in such tree. Solution is simple - we can emulate recursion with std::stack. As bonus for all nodes in stack we can use single bit mask to save space
Another unpleasant observation is that trees in tests ain't BINARY trees. When one picture is worth a thousand words:
Degree of node 2 is 4. This is main reason why function dfs has separate branch for processing joint nodes with only 2 descendants - bcs initially method is_fork returned only left and right