Commit 3087fd6

Add heuristics to suggest absolute addresses
When looking for relative addresses, perform heuristic checks for absolute addresses. The original heuristic only counted negative numbers as offsets. This is insufficient because a kernel loaded at an absolute address will almost certainly have its virtual addresses in the top half of the virtual address space, and those also read as negative numbers.

Two heuristics have been added. The first checks that the top 3 nybbles are 0xFFF. True negative offsets are unlikely to be *THAT* negative: the kernel is on the order of a few tens of MB, so sign extension leaves their top nybbles set. The second checks for zeros in the top byte using the mask 0x3F. This assumes the kernel is loaded near the bottom of the kernel address space, and it will catch the 3G/1G split.

Strictly speaking, the second heuristic should never trip if the first one doesn't.
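
As an illustration of the two masks described in the message, here is a minimal sketch that applies them to a few hand-picked values. The helper names `heuristic_masks`, `looks_negative` and `looks_absolute`, and the sample values, are assumptions made for this example and are not part of the commit.

```python
def heuristic_masks(bits):
    """Return (negative_mask, absolute_mask) for a 32- or 64-bit word."""
    negative_mask = 0xFFF << (bits - 12)  # top 3 nybbles
    absolute_mask = 0x3F << (bits - 8)    # low 6 bits of the top byte
    return negative_mask, absolute_mask


def looks_negative(value, bits):
    """True if the top 3 nybbles are all ones (a small-magnitude negative)."""
    negative_mask, _ = heuristic_masks(bits)
    return (value & negative_mask) == negative_mask


def looks_absolute(value, bits):
    """True if the masked bits of the top byte are all zero."""
    _, absolute_mask = heuristic_masks(bits)
    return (value & absolute_mask) == 0


# Hypothetical sample values, chosen only to exercise the masks.
samples = [
    (32, 0xC0008000),          # address in a 3G/1G-split kernel
    (32, 0xFFFF8000),          # -0x8000 as an unsigned 32-bit word
    (64, 0xFFFFFFFFFFFF0000),  # -0x10000 as an unsigned 64-bit word
]
for bits, value in samples:
    print(f'{value:#x}: negative={looks_negative(value, bits)}, '
          f'absolute={looks_absolute(value, bits)}')
```

In this sketch the 3G/1G address is flagged as absolute and not as a small negative, while the two sign-extended offsets are flagged the other way around.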
1 parent 9f88c33 commit 3087fd6

1 file changed, 32 additions and 0 deletions

vmlinux_to_elf/kallsyms_finder.py

@@ -9,6 +9,7 @@
 from enum import Enum
 from sys import argv, stdout
 import logging
+import math
 
 try:
     from architecture_detecter import guess_architecture, ArchitectureName, architecture_name_to_elf_machine_and_is64bits_and_isbigendian, ArchitectureGuessError
@@ -966,6 +967,37 @@ def find_kallsyms_addresses_or_symbols(self):
 
         if self.has_base_relative:
             number_of_negative_items = len([offset for offset in tentative_addresses_or_offsets if offset < 0])
+
+            # Many kernels put their addresses in the upper half of the
+            # virtual address space. This means that many of the addresses
+            # will look like negative numbers. On the other hand, there
+            # should be the same zeros in the high byte(s). A true
+            # negative will probably have the top 3 nybbles or so as
+            # 0xfff00000. Let's check this as well. Let's perform these
+            # checks
+            BITS = 64 if self.is_64_bits else 32
+            NEGATIVE_HEURISTIC_MASK = 0xFFF << (BITS - 12)  # Mask for the top 3 nybbles
+            ABSOLUTE_HEURISTIC_MASK = 0x3f << (BITS - 8)  # Mask for zeros in the top byte
+
+            heuristically_negative = len([offset for offset in tentative_addresses_or_offsets if (offset & NEGATIVE_HEURISTIC_MASK) == NEGATIVE_HEURISTIC_MASK])
+            heuristically_absolute = len([offset for offset in tentative_addresses_or_offsets if (offset & ABSOLUTE_HEURISTIC_MASK) == 0])
+
+            heuristic_negative_percent = heuristically_negative / len(tentative_addresses_or_offsets)
+            heuristic_absolute_percent = heuristically_absolute / len(tentative_addresses_or_offsets)
+
+            if heuristic_negative_percent < 0.5:
+                logging.warning(f'[!] WARNING: Less than half ({math.trunc(heuristic_negative_percent * 100)}%) of offsets are negative')
+                logging.warning( ' You may want to re-run this utility, overriding the relative base')
+
+            if heuristic_absolute_percent > 0.5:
+                logging.warning(f'[!] WARNING: More than half ({math.trunc(heuristic_absolute_percent * 100)}%) of offsets look like absolute addresses')
+                logging.warning( '[!] You may want to re-run this utility, overriding the relative base')
+
+            if heuristic_absolute_percent > 0.5 or heuristic_negative_percent < 0.5:
+                logging.info( '[i] Note: sometimes there is junk at the beginning of the kernel and the load address is not the guessed')
+                logging.info( ' base address provided. You may need to play around with different load addresses to get everything')
+                logging.info( ' to line up. There may be some decent tables in the kernel with known patterns to line things up')
+                logging.info( ' heuristically, but I have not explored this yet.')
 
             logging.info('[i] Negative offsets overall: %g %%' % (number_of_negative_items / len(tentative_addresses_or_offsets) * 100))
 
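
The percentage checks in this hunk can be tried outside the tool with a standalone function. What follows is a rough sketch under stated assumptions: the name `heuristic_fractions`, its return shape, and the empty-input guard are hypothetical and do not appear in the commit, and the sample table is made up.

```python
def heuristic_fractions(values, is_64_bits):
    """Return (negative_fraction, absolute_fraction) for candidate offsets.

    Uses the same masks as the diff. Returns (0.0, 0.0) for empty input,
    which is an assumption of this sketch; the commit does not handle it.
    """
    if not values:
        return 0.0, 0.0
    bits = 64 if is_64_bits else 32
    negative_mask = 0xFFF << (bits - 12)  # top 3 nybbles
    absolute_mask = 0x3F << (bits - 8)    # low 6 bits of the top byte
    negative = sum(1 for v in values if (v & negative_mask) == negative_mask)
    absolute = sum(1 for v in values if (v & absolute_mask) == 0)
    return negative / len(values), absolute / len(values)


# Hypothetical 32-bit table: three absolute-looking addresses, one small negative.
table = [0xC0008000, 0xC0008100, 0xC0100000, -0x1000]
neg, absolute = heuristic_fractions(table, is_64_bits=False)
should_warn = absolute > 0.5 or neg < 0.5
print(neg, absolute, should_warn)
```

A value whose masked top-byte bits are zero cannot also have its top three nybbles equal to 0xFFF, so the two counts never overlap and the fractions sum to at most 1. That is why the absolute warning cannot fire unless the negative warning fires too, as the commit message notes.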
