module: Fix performance regression on modules with large symbol tables

Commit 554bdfe5acf3715e87c8d5e25a4f9a896ac9f014 (module: reduce string
table for loaded modules) introduced an optimization to shrink the size of
the resident string table. Part of this involves calling bitmap_weight()
on the strmap bitmap once for each core symbol. strmap contains one bit
for each byte of the module's strtab.

For kernel modules with a large number of symbols, adding a
bitmap_weight() call to each iteration of the add_kallsyms() loop caused
a significant "insmod" performance regression from 2.6.31 to 2.6.32.
bitmap_weight() has to scan the bitmap up to the requested bit, so it is
expensive when the bitmap is large, and the total cost of the loop grows
with the number of core symbols multiplied by the size of strtab.

The proposed alternative optimizes the common case in this loop
(is_core_symbol() == true, and the symbol name is not a duplicate), while
penalizing the exceptional case of a duplicate symbol name.

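To illustrate the fast-path/slow-path idea, here is a minimal userspace
sketch. It is not the code from this patch. weight() is a stand-in for
bitmap_weight(), and struct strmap_cursor and new_offset() are
hypothetical names. The sketch assumes that core symbol names are mostly
visited in increasing strtab order, so a running count can replace a
full rescan. A name that falls behind the cursor (for example, a
duplicate) takes the slow path and pays for a full count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for bitmap_weight(): count set entries in map[0..n). */
static size_t weight(const bool *map, size_t n)
{
	size_t w = 0;

	for (size_t i = 0; i < n; i++)
		w += map[i];
	return w;
}

/* Hypothetical cursor. Invariant: new_off == weight(map, old_off). */
struct strmap_cursor {
	size_t old_off;	/* strtab bytes already counted */
	size_t new_off;	/* set bits seen among those bytes */
};

/* Translate an old strtab offset into its offset in the compacted table. */
static size_t new_offset(const bool *map, size_t len,
			 struct strmap_cursor *c, size_t old)
{
	assert(old <= len);

	/* Slow path: name lies behind the cursor, so recount from the start. */
	if (old < c->old_off)
		return weight(map, old);

	/* Fast path: only scan the bytes that have not been counted yet. */
	while (c->old_off < old)
		c->new_off += map[c->old_off++];
	return c->new_off;
}
```

With a zero-initialized cursor, a run of lookups at increasing offsets
scans each strtab byte at most once in total, whereas calling weight()
for every symbol rescans the prefix each time.
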
My test was run on a 600 MHz MIPS processor, using a kernel module with
15,000 "core" symbols and 10,000 symbols in .init.text. .strtab takes up
250,227 bytes.

Original code: insmod takes 3.39 seconds
Patched code:  insmod takes 0.07 seconds

Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>