Ash wrote:
It doesn't make much sense to put the optimized ASM in there, nor is
there much hope of GCC having a good day and doing a lot of optimisation.
So far the best option would be the macro with a lookup table (only
one global kernel table, though).
What about using inline assembler in a macro - only as a platform
specific optimization of course?
result bsr inlined 46ffffe9
it took 1751088 21%
With GCC, this should be much faster using the following macro:
#define get_bits(value) \
({ \
int bits = -1; \
__asm("bsr %1, %0\n" \
: "+r" (bits) \
: "rm" (value)); \
bits; \
})
This macro returns -1 when no bits are set. I tested it and it works as
expected. If -1 as the "error" value isn't suitable, you can change the
initial value to 0 (line 3 of the macro).
result macro 46ffffe9
it took 653692 7%
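For platforms without bsr, or where the zero case has to be handled
explicitly, something like the following might serve as a fallback. It is
only a sketch: the name get_bits_checked is made up for illustration, and it
assumes GCC's __builtin_clz is available. The macro above relies on bsr
leaving the destination register untouched when the source is 0, which
Intel's manuals describe as undefined, so this version tests for 0 first
instead.

```c
#include <limits.h>

/* Hypothetical helper (not from the original post): returns the index of
 * the highest set bit in value, or -1 when no bit is set. */
static inline int get_bits_checked(unsigned int value)
{
    if (value == 0)
        return -1;  /* explicit check: __builtin_clz(0) is undefined */

    /* Width of unsigned int in bits, minus one, minus the leading zeros. */
    return (int)(sizeof(value) * CHAR_BIT - 1) - __builtin_clz(value);
}
```

On x86, GCC will usually turn this into a bsr (or lzcnt) plus a test for
zero, so the cost over the raw macro should be small, though I have not
measured it here.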
The table approach doesn't seem to be bad either <g>.
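For comparison, a table-based version could look roughly like this. It is a
sketch under assumptions, not the code that was timed above: the names
highbit_table, highbit_table_init and get_bits_table are invented, the table
is indexed per byte, and an inline function is used instead of a macro so
the argument is evaluated only once.

```c
#include <stdint.h>

/* Hypothetical single global table: highest set bit of each byte value,
 * with -1 for 0. Must be filled once by highbit_table_init(). */
static signed char highbit_table[256];

static void highbit_table_init(void)
{
    int i;

    highbit_table[0] = -1;
    /* The highest bit of i is one position above the highest bit of i/2. */
    for (i = 1; i < 256; i++)
        highbit_table[i] = (signed char)(highbit_table[i >> 1] + 1);
}

/* Returns the index of the highest set bit of v, or -1 when v is 0. */
static inline int get_bits_table(uint32_t v)
{
    if (v >> 24)
        return 24 + highbit_table[v >> 24];
    if (v >> 16)
        return 16 + highbit_table[(v >> 16) & 0xff];
    if (v >> 8)
        return 8 + highbit_table[(v >> 8) & 0xff];
    return highbit_table[v & 0xff];
}
```

The table costs 256 bytes and up to three compares per lookup, which is the
trade-off against the single bsr instruction.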
Regards,
Mark