Am 03.06.2011 21:48, schrieb Alex Ionescu:
What about on UP, where a strong compiler will realize
that the call
only raises IRQL to dispatch, and since these apis are in NTOSKRNL,
they might get in-lined or modified.
This would only happen, if the function is
either delcared as inline or
inside the same translation unit. Both of that is not the case here.
If it was the case (say we inlined KeAcquireQueuedSpinLock) then we
would need to add an explicit memory barrier to that inline function.
On AMD64, raising the IRQL means only modifying cr8.
So the function will become:
OldIrql == __readcr8();
__writecr8(DISPATCH_LEVEL);
inc dword ptr [reg];
__writecr8(OldIrql);
Is ordering still guaranteed in this scenario?
No, not in this case. (Which is IMO
a bad definition of KeRaiseIrql or a
bad implementation of __writecr8) But how does the use of volatile fix
this? It doesn't!
MSDN says about volatile objects:
"A write to a volatile object (volatile write) has Release semantics; a
reference to a global or static object that occurs before a write to a
volatile object in the instruction sequence will occur before that
volatile write in the compiled binary."
But we would need release + acquire semantics. and even then it wouldn't
guarantee that __writecr8() is considered "reference to a global or
static object". And gcc might even behave different.
Luckily at least all Spinlock functions have an implicit memory barrier
and are thus not prone to this problem.
So instead of adding volatile here and there and random places or even
to all shared data structures, we need to make sure, we stick to some
locking rules, which would mean not to expect KeRaiseIrql() to provide
an implicit memory barrier and to make sure all inlined spinlock
functions will have a proper memory barrier.
Regards,
Timo