@Timo, @Thomas, @Ged
Byte order swappers should always be fastcall for performance reasons.
They have no need for the benefits of the cdecl calling convention.
Using cdecl in this case makes the call overhead (stack pushes, frame
setup, stack cleanup) bigger than the swap itself.
Think about it for a bit.. Some pseudocode shows what I mean:
..CDECL...
     push hi
     push lo
     call Swapper
     mov  dsthi, edx
     mov  dstlo, eax
     add  esp, 8      // caller pops both pushed dwords
     ...
// UINT64 __cdecl
Swapper:
     push  ebp
     mov   ebp, esp
     mov   eax, [ebp+8]  // lo
     mov   edx, [ebp+12] // hi
     bswap eax
     xchg  eax, edx
     bswap eax
     pop   ebp
     ret
..FASTCALL...
     mov  edx, hi
     mov  eax, lo
     call Swapper
     mov  dsthi, edx
     mov  dstlo, eax
     ...
// UINT64 __declspec(naked) __fastcall
Swapper:
     bswap eax
     xchg eax, edx
     bswap eax
     ret
Sadly the compiler designers were not (yet) clever enough
to make the fastcall regs EAX, EDX, ECX, in that exact order
(they are ECX, EDX), but even as it stands today it costs
just one extra mov..
Swapper:
     mov   eax, ecx
     bswap eax
     xchg  eax, edx
     bswap eax
     ret
(If you actually link against a binary swapper compiled outside your
control with the cdecl convention, the argument of course falls apart,
as you must comply with the binary.)
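
For anyone who wants to check what the bswap / xchg / bswap sequence in
the pseudocode above computes, here is a rough C sketch of the same
operation. The names swap32 and swap64 are made up for illustration,
they are not taken from any existing header, and this says nothing
about which calling convention a given compiler will pick for it:

```c
#include <stdint.h>

/* Hypothetical helper: reverse the bytes of one 32-bit value,
   i.e. what a single BSWAP does to a register. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
           (v << 24);
}

/* Hypothetical helper: reverse the bytes of a 64-bit value by
   swapping each 32-bit half and exchanging the halves, which is
   the bswap / xchg / bswap pattern. */
static uint64_t swap64(uint64_t v)
{
    uint32_t lo = (uint32_t)v;         /* corresponds to EAX on entry */
    uint32_t hi = (uint32_t)(v >> 32); /* corresponds to EDX on entry */

    /* The swapped low half becomes the new high half and vice versa. */
    return ((uint64_t)swap32(lo) << 32) | swap32(hi);
}
```

Applying swap64 twice gives back the original value, which makes a
handy sanity check when comparing it against a hand-written asm version.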
Keep up the good work..
Best Regards
// Love