The standard way to find rootkits like this (or like this) is to cross-scan the PTEs of all memory without the NX bit and then subtract the pages that belong to LKMs - the set difference is the hidden executable memory. Let's check how we can scan PTEs under Linux
disclaimer
cat /boot/config-$(uname -r) | grep -E 'X86_5LEVEL|PGTABLE_LEVELS'
CONFIG_PGTABLE_LEVELS=5
CONFIG_X86_5LEVEL=y
So my kernel has 5 levels, and this corresponds exactly to the hardware:
- pte_t - PTE, 9 bits of page address (size of page is 4096 bytes - low 12 bits)
- pmd_t - PDE, another 9 bits of address
- pud_t - PDPTE, 9 bits
- p4d_t - PML4, 9 bits
- pgd_t - PML5, 9 bits
Total: 12 + 5 * 9 = 57 bits
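To make the arithmetic concrete, here is a minimal userspace sketch that splits a virtual address into the five 9-bit table indices and the 12-bit page offset. The names split_va and struct va_parts are mine, purely for illustration; the kernel has its own pXX_index helpers for this.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                        /* 4096-byte pages */
#define LEVEL_BITS 9                         /* 512 entries per table */
#define LEVEL_MASK ((1u << LEVEL_BITS) - 1)

/* Hypothetical container for the six parts of a 57-bit virtual address. */
struct va_parts {
    unsigned pgd, p4d, pud, pmd, pte;        /* table indices, 0..511 each */
    unsigned offset;                         /* byte offset inside the page */
};

/* Index used at a given level: 0 = PTE, 1 = PMD, ... 4 = PGD. */
static unsigned level_index(uint64_t va, int level)
{
    return (unsigned)((va >> (PAGE_SHIFT + level * LEVEL_BITS)) & LEVEL_MASK);
}

static struct va_parts split_va(uint64_t va)
{
    struct va_parts p;

    p.offset = (unsigned)(va & ((1u << PAGE_SHIFT) - 1));
    p.pte = level_index(va, 0);
    p.pmd = level_index(va, 1);
    p.pud = level_index(va, 2);
    p.p4d = level_index(va, 3);
    p.pgd = level_index(va, 4);
    return p;
}
```

For example, split_va(0xffffffff81000000) gives pgd 511, p4d 511, pud 510, pmd 8, pte 0 and offset 0.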
It's really amazing how differently memory management is implemented in different operating systems running on the same hardware. For example, in Windows all PTEs are located in one huge contiguous sparse array, and you can get the address of the PTE for any memory address with the very simple function MiGetPteAddress. Let MiGetPteAddress(addr1) be addr2. We can then continue this process for all paging levels - get MiGetPteAddress(addr2) and so on - to find out whether all 5 parts of the address are valid. This can also be used in the reverse direction: skip scanning huge PTE areas if they are not present in memory
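A rough sketch of that trick, under loud assumptions: this models the classic 4-level x64 layout with the old fixed self-map base 0xFFFFF68000000000 (newer Windows 10 builds randomize the base, and a 5-level layout would need a wider mask). The function name get_pte_address is mine; this is not the real MiGetPteAddress, just the same kind of arithmetic.

```c
#include <stdint.h>

/* Assumption: fixed PTE self-map base of older 4-level x64 Windows. */
#define PTE_BASE 0xFFFFF68000000000ULL
/* 36 bits of page number, times 8 bytes per PTE. */
#define PTE_MASK 0x7FFFFFFFF8ULL

/* Address of the PTE that maps va: index into one big sparse array. */
static uint64_t get_pte_address(uint64_t va)
{
    return ((va >> 9) & PTE_MASK) + PTE_BASE;
}
```

Applying the function to its own result climbs one paging level each time: get_pte_address(PTE_BASE) is where the PDEs start, applying it again gives the start of the PDPTEs, and so on.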
Unfortunately, in Linux the PTEs are not stored in one huge contiguous region of memory. So we need to start at the top level (the PGD) and scan all the tables on the lower levels. The root pgd_t is stored in init_mm.pgd. As usual, the variable init_mm is not exported
<sarcasm>Linux widely known for the consistency, completeness and backward-compatibility of its API and being developers-friendly in general</sarcasm>
Next we need a way to find valid pXX_t entries. It seems there are functions pXX_present, pXX_bad and so on. The right sequence of calls is
- pXX_none
- pXX_leaf - this is a damn good name for functions that check for large pages
- pXX_bad
- and finally pXX_offset to get item for next level
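The order of checks above can be shown with a small userspace model. To be clear, this is not kernel code: the "physical memory" is just an array of 512-entry tables, the flag layout (present in bit 0, large page in bit 7, frame number in bits 12..51) is borrowed from x86-64, and all names (toy_mem, walk, walk_stats) are made up for the sketch. Comments mark which kernel helper each step stands in for.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRIES   512
#define F_PRESENT (1ULL << 0)
#define F_LEAF    (1ULL << 7)                       /* large-page bit */
#define FRAME(e)  (((e) >> 12) & 0xFFFFFFFFFFULL)   /* bits 12..51 */

/* Toy "physical memory": table number N lives in tables[N]. */
struct toy_mem {
    uint64_t (*tables)[ENTRIES];
    size_t ntables;
};

struct walk_stats {
    size_t leaves;   /* present mappings found at any level */
    size_t bad;      /* entries that point nowhere sensible */
};

/* level counts down to 1, where every entry is a plain PTE. */
static void walk(const struct toy_mem *m, size_t table, int level,
                 struct walk_stats *st)
{
    for (size_t i = 0; i < ENTRIES; i++) {
        uint64_t e = m->tables[table][i];

        if (e == 0)                                 /* pXX_none() */
            continue;
        if (level == 1 || (e & F_LEAF)) {           /* PTE or pXX_leaf() */
            if (e & F_PRESENT)
                st->leaves++;
            continue;
        }
        if (!(e & F_PRESENT) || FRAME(e) >= m->ntables) {   /* pXX_bad() */
            st->bad++;
            continue;
        }
        walk(m, (size_t)FRAME(e), level - 1, st);   /* pXX_offset() */
    }
}
```

Checking for a leaf before descending matters: a large-page entry holds the address of the data itself, so treating it as a pointer to the next table would walk into garbage.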
Unfortunately there are also so-called hugeTLB pages (enabled with CONFIG_HUGETLB_PAGE):
grep 'HUGETLB_PAGE' /boot/config-$(uname -r)
CONFIG_HUGETLB_PAGE=y
CONFIG_HUGETLB_PAGE_FREE_VMEMMAP=y
As you might expect, the functions pmd_huge & pud_huge are not exported either (and p4d_huge & pgd_huge are just dumb macros)
Finally, we need to check whether a page is executable. This is very hardware-specific - for example
- ARC has flag _PAGE_EXECUTE
- aarch64 has flag _PAGE_KERNEL_EXEC
- powerpc has _PAGE_EXEC
- s390 has _PAGE_NOEXEC
so for some archs there is a function pte_exec, while for others there is pte_no_exec. It's also curious that there are no analogs for pud/pmd etc. - so I actually have zero ideas how to check executability for large & huge pages
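For x86-64 at least, one possible way out is to look at the raw entry bits. As far as I can tell from the Intel manuals, the NX (XD) bit is bit 63 in entries of every level, including large-page entries, and an NX bit on an upper-level entry covers the whole region below it. The sketch below works on raw 64-bit values only. The names are hypothetical, it assumes EFER.NXE is enabled, and it says nothing about other architectures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define X86_PAGE_PRESENT (1ULL << 0)
#define X86_PAGE_NX      (1ULL << 63)   /* execute-disable, same bit at all levels */

/*
 * entries[] holds the raw values met on the way down, top level first,
 * ending with the leaf (a PTE, or a large-page PMD/PUD entry).
 * The mapping is executable only if every entry is present
 * and none of them has NX set.
 */
static bool x86_path_is_exec(const uint64_t *entries, size_t n)
{
    if (n == 0)
        return false;
    for (size_t i = 0; i < n; i++) {
        if (!(entries[i] & X86_PAGE_PRESENT))
            return false;
        if (entries[i] & X86_PAGE_NX)
            return false;
    }
    return true;
}
```

Inside a kernel module the raw values would come from pgd_val, p4d_val, pud_val, pmd_val and pte_val.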
However, this is not the end of the suffering. Quick check:
grep address /proc/cpuinfo
shows that they lie - my hardware actually supports only 48-bit virtual addresses, so the kernel should have only 4 levels of paging. Try to guess how they swept the trash under the carpet?
address sizes : 39 bits physical, 48 bits virtual