Am 04.06.2011 01:12, schrieb Alex Ionescu:
On 2011-06-03, at 5:31 PM, Timo Kreuzer wrote:
Am 03.06.2011 21:48, schrieb Alex Ionescu:
What about on UP, where a strong compiler will
realize that the call
only raises IRQL to dispatch, and since these apis are in NTOSKRNL,
they might get in-lined or modified.
This would only happen, if the function is
either delcared as inline
or inside the same translation unit. Both of that is not the case here.
If it was the case (say we inlined KeAcquireQueuedSpinLock) then we
would need to add an explicit memory barrier to that inline function.
This isn't true -- a modern optimizing compiler uses link-time code
generation which crosses compilation units and inline semantics.
For example, in Windows 7, this function stores the address of the
addend in ESI, and all callers move the address in ESI prior.
Completely different from how it's defined in source, but the result
of LTGC.
True, I didn't consider LTCG. I don't know how it handles
instruction
reordering, but to prevent it, we can use a _ReadWriteBarrier in
KxAcquireSpinLock and KxReleaseSpinLock. This way, even if inlined we
will get proper ordering for all our spinlock functions.
We also need to take the CPUs out-of-order execution into account. mov
cr8,x is a serializing instruction, so that is no problem. Other
architectures would also have to make sure that changing the irql is a
serializing instruction.
Is ordering still guaranteed in this scenario?
No, not in this case. (Which is IMO a bad definition of KeRaiseIrql
or a bad implementation of __writecr8)
This is how KeRaiseIrql(DISPATCH_LEVEL) behaves on AMD64.... it's an
inline in wdm.h, which I'll paste, since last time I quoted MSDN you
said I was making stuff up.
As I said above. I agree with you that it does not
guarantee ordering in
this case.
__forceinline
KIRQL
KfRaiseIrql (
__in KIRQL NewIrql
)
{
KIRQL OldIrql;
OldIrql = KeGetCurrentIrql();
NT_ASSERT(OldIrql <= NewIrql);
WriteCR8(NewIrql);
return OldIrql;
}
But how does the use of volatile fix this? It
doesn't!
MSDN says about volatile objects:
"A write to a volatile object (volatile write) has Release semantics;
a reference to a global or static object that occurs before a write
to a volatile object in the instruction sequence will occur before
that volatile write in the compiled binary."
But we would need release + acquire semantics. and even then it
wouldn't guarantee that __writecr8() is considered "reference to a
global or static object". And gcc might even behave different.
These are all good points.
Luckily at least all Spinlock functions have an implicit memory
barrier and are thus not prone to this problem.
But not in UP -- which is what ReactOS Builds as.
Not in case of inlined functions
and with LTCG, which both does not yet
apply to ros. For safety in regards to inlining / LTCG, I added the
memory barriers now.
So instead of adding volatile here and there and random places or
even to all shared data structures, we need to make sure, we stick to
some locking rules, which would mean not to expect KeRaiseIrql() to
provide an implicit memory barrier and to make sure all inlined
spinlock functions will have a proper memory barrier.
Fair enough, so then those should be fixed.
Regards,
Timo
_______________________________________________
Ros-dev mailing list
Ros-dev(a)reactos.org <mailto:Ros-dev@reactos.org>
http://www.reactos.org/mailman/listinfo/ros-dev
_______________________________________________
Ros-dev mailing list
Ros-dev(a)reactos.org
http://www.reactos.org/mailman/listinfo/ros-dev