As you can see from the function bpf_map_alloc_id, all BPF maps are stored in the map_idr IDR (backed by a radix tree) and synchronized with the map_idr_lock spinlock. No surprise that you can't view them from user mode: there is a bpf syscall command BPF_MAP_GET_NEXT_ID, but it can only enumerate map IDs. So today I added some code to view BPF maps; lkmem -c -d -B gives output like:
bpf_maps at 0xffffffff929c1880: 15
[0] id 3 UDPrecvAge at 0xffff99e344f48000
type: 1 BPF_MAP_TYPE_HASH
key_size 8 value_size 8
[1] id 4 UDPsendAge at 0xffff99e344cb4c00
type: 1 BPF_MAP_TYPE_HASH
key_size 38 value_size 8
Also, the disassembly of JITted eBPF code now looks better:
mov rdi, 0xffff99e344f48000 ; UDPrecvAge
call 0xffffffff90c191f0 ; __htab_map_lookup_elem
This letter explains how the JIT replaces the opcode sequence

bpf_mov r1, const_internal_map_id
bpf_call bpf_map_lookup

with a direct load of the map's 64-bit address (the BPF_LD_IMM64 pseudo-instruction). However, this code is not optimal: every such instruction occupies 10 bytes on x86-64. Let's consider instead employing a constant pool and putting all map addresses somewhere after the function body. Sure, this requires at least 8 bytes per address, plus perhaps some padding for alignment, but now we can emit code like:
mov rdi, qword [rel map1_addr] ; 7 bytes
call __htab_map_lookup_elem
...
; somewhere after the function
map1_addr: resq 1 ; JIT should put the real map address here
If a function has 3 or more references to the same map, we get a net reduction in JITted code size: 10 bytes per movabs versus 7 bytes per RIP-relative load plus a single 8-byte pool slot.