Add hcall_entry and hcall_exit tracepoints. This removes the inline
assembly HCALL_STATS code and converts HCALL_STATS to use the new
tracepoints.

To keep the disabled case as quick as possible, we embed a status word
in the TOC so we can get at it with a single load, which keeps the
overhead to a minimum. Time taken for a null hcall:

No tracepoint code:    135.79 cycles
Disabled tracepoints:  137.95 cycles

For reference, before this patch, enabling HCALL_STATS resulted in a null
hcall taking 201.44 cycles!
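
As a rough illustration of the disabled-case check described above, here
is a minimal userspace sketch in C. It is not the code in this patch: the
real check lives in the assembly hcall wrappers and loads the status word
via the TOC. All names below (hcall_tracepoint_refcount, do_hcall, the
event counters and the reg/unreg helpers) are assumptions made up for the
example, and do_hcall is only a stand-in for the hypervisor call.

```c
/* Hypothetical status word: nonzero while any hcall tracepoint is enabled. */
static unsigned long hcall_tracepoint_refcount;

/* Counters standing in for real tracepoint probes (illustration only). */
static unsigned long entry_events;
static unsigned long exit_events;

static void trace_hcall_entry(unsigned long opcode)
{
	(void)opcode;
	entry_events++;
}

static void trace_hcall_exit(unsigned long opcode, long retval)
{
	(void)opcode;
	(void)retval;
	exit_events++;
}

/* Stand-in for the actual hypervisor call; always reports success. */
static long do_hcall(unsigned long opcode)
{
	(void)opcode;
	return 0;
}

/* Called when a probe is attached or detached (hypothetical helpers). */
static void hcall_tracepoint_regfunc(void)
{
	hcall_tracepoint_refcount++;
}

static void hcall_tracepoint_unregfunc(void)
{
	hcall_tracepoint_refcount--;
}

static long hcall(unsigned long opcode)
{
	long rc;

	/* Disabled case: one load of the status word and one branch. */
	if (!hcall_tracepoint_refcount)
		return do_hcall(opcode);

	/* Enabled case: bracket the call with the entry and exit hooks. */
	trace_hcall_entry(opcode);
	rc = do_hcall(opcode);
	trace_hcall_exit(opcode, rc);
	return rc;
}
```

With the status word at zero, hcall() costs one extra load and branch
over calling do_hcall() directly, which is the property the timings
above are measuring.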

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>