ipc/sem: optimize perform_atomic_semop()
This is the main workhorse that deals with semop user calls such that
the waitforzero or semval update operations, on the set, can complete on
not as the sma currently stands. Currently, the set is iterated twice
(setting semval, then backwards for the sempid value). Slowpaths, and
particularly SEM_UNDO calls, must undo any altered sem when it is
detected that the caller must block or has errored-out.
With larger sets, there can occur situations where this involves a lot
of cycles and can obviously be a suboptimal use of cached resources in
shared memory. Ie, discarding CPU caches that are also calling semop
and have the sembuf cached (and can complete), while the current lock
holder doing the semop will block, error, or does a waitforzero
operation.
This patch proposes still iterating the set twice, but the first scan is
read-only, and we perform the actual updates afterward, once we know
that the call will succeed. In order to not suffer from the overhead of
dealing with sops that act on the same sem_num, such (rare) cases use
perform_atomic_semop_slow(), which is exactly what we have now.
Duplicates are detected before grabbing sem_lock, and uses simple a
32/64-bit hash array variable to based on the sem_num we are working on.
In addition add some comments to when we expect to the caller to block.
[akpm@linux-foundation.org: coding-style fixes]
[colin.king@canonical.com: ensure we left shift a ULL rather than a 32 bit integer]
Link: http://lkml.kernel.org/r/20161028181129.7311-1-colin.king@canonical.com
Link: http://lkml.kernel.org/r/20160921194603.GB21438@linux-80c1.suse
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>