Am 04.06.2011 01:12, schrieb Alex Ionescu:

On 2011-06-03, at 5:31 PM, Timo Kreuzer wrote:

Am 03.06.2011 21:48, schrieb Alex Ionescu:
What about on UP, where a strong compiler will realize that the call only raises IRQL to dispatch, and since these apis are in NTOSKRNL, they might get in-lined or modified.
This would only happen, if the function is either delcared as inline or inside the same translation unit. Both of that is not the case here.
If it was the case (say we inlined KeAcquireQueuedSpinLock) then we would need to add an explicit memory barrier to that inline function.

This isn't true -- a modern optimizing compiler uses link-time code generation which crosses compilation units and inline semantics.

For example, in Windows 7, this function stores the address of the addend in ESI, and all callers move the address in ESI prior. Completely different from how it's defined in source, but the result of LTGC.
True, I didn't consider LTCG. I don't know how it handles instruction reordering, but to prevent it, we can use a _ReadWriteBarrier in KxAcquireSpinLock and KxReleaseSpinLock. This way, even if inlined we will get proper ordering for all our spinlock functions.
We also need to take the CPUs out-of-order execution into account. mov cr8,x is a serializing instruction, so that is no problem. Other architectures would also have to make sure that changing the irql is a serializing instruction.




Is ordering still guaranteed in this scenario?
No, not in this case. (Which is IMO a bad definition of KeRaiseIrql or a bad implementation of __writecr8)

This is how KeRaiseIrql(DISPATCH_LEVEL) behaves on AMD64.... it's an inline in wdm.h, which I'll paste, since last time I quoted MSDN you said I was making stuff up.
As I said above. I agree with you that it does not guarantee ordering in this case.


__forceinline
KIRQL
KfRaiseIrql (
    __in KIRQL NewIrql
    )
{

    KIRQL OldIrql;

    OldIrql = KeGetCurrentIrql();

    NT_ASSERT(OldIrql <= NewIrql);

    WriteCR8(NewIrql);
    return OldIrql;
}

But how does the use of volatile fix this? It doesn't!
MSDN says about volatile objects:
"A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary."
But we would need release + acquire semantics. and even then it wouldn't guarantee that __writecr8() is considered "reference to a global or static object". And gcc might even behave different.

These are all good points.


Luckily at least all Spinlock functions have an implicit memory barrier and are thus not prone to this problem.

But not in UP -- which is what ReactOS Builds as.
Not in case of inlined functions and with LTCG, which both does not yet apply to ros. For safety in regards to inlining / LTCG, I added the memory barriers now.



So instead of adding volatile here and there and random places or even to all shared data structures, we need to make sure, we stick to some locking rules, which would mean not to expect KeRaiseIrql() to provide an implicit memory barrier and to make sure all inlined spinlock functions will have a proper memory barrier.

Fair enough, so then those should be fixed.


Regards,
Timo



_______________________________________________
Ros-dev mailing list
Ros-dev@reactos.org
http://www.reactos.org/mailman/listinfo/ros-dev

_______________________________________________ Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev