The notification routine can change the list, as there is no lock involved.
Therefore, you should grab the next entry before notifying the driver, to maintain a consistent list state.
-- Best regards, Alex Ionescu
On 2011-06-02, at 1:43 PM, pschweitzer@svn.reactos.org wrote:
DriverNotificationRoutine(DeviceObject, TRUE);/* Go to the next entry */ListEntry = ListEntry->Flink;
Hi,
The notification routine can change the list, as there is no lock involved.
List is locked by the caller. Have a look to: IoRegisterFsRegistrationChange().
Regarding all your remarks: thank you! They have been used to fix code and commited in r52073.
Regards, P. Schweitzer
Ah, I didn't see the caller.
There's still the issue of the missing volatile. It's required to make sure there is strict ordering between the spinlock acquisition and the increment/decrement.
-- Best regards, Alex Ionescu
On 2011-06-03, at 4:46 AM, Pierre Schweitzer wrote:
Hi,
The notification routine can change the list, as there is no lock involved.
List is locked by the caller. Have a look to: IoRegisterFsRegistrationChange().
Regarding all your remarks: thank you! They have been used to fix code and commited in r52073.
Regards, P. Schweitzer
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Am 03.06.2011 13:24, schrieb Alex Ionescu:
Ah, I didn't see the caller.
There's still the issue of the missing volatile. It's required to make sure there is strict ordering between the spinlock acquisition and the increment/decrement.
Ordering is guaranteed, since the spinlock functions act as memory barriers, otherwise a lot of our code wouldn't work. Suggested reading: http://kernel.org/doc/Documentation/volatile-considered-harmful.txt
Regards, Timo
While the paper on Linux's spinlock semantics was very interesting, it remains the fact that this is not the case in Windows in this particular instance.
A lot of ReactOS code *is* missing calls such as KeMemoryBarrier() and (volatile), and only works by chance, so the argument that "otherwise our code wouldn't work" is a bit of a fallacy.
You also need to think outside the strict-ordering x86 box. Most of ReactOS' code is totally borked on IA64, PPC or ARM (and semi-broken on x64 too, which has looser ordering).
Of course, feel free to ignore the suggestion.
-- Best regards, Alex Ionescu
On 2011-06-03, at 8:08 AM, Timo Kreuzer wrote:
Am 03.06.2011 13:24, schrieb Alex Ionescu:
Ah, I didn't see the caller.
There's still the issue of the missing volatile. It's required to make sure there is strict ordering between the spinlock acquisition and the increment/decrement.
Ordering is guaranteed, since the spinlock functions act as memory barriers, otherwise a lot of our code wouldn't work. Suggested reading: http://kernel.org/doc/Documentation/volatile-considered-harmful.txt
Regards, Timo
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Am 03.06.2011 17:55, schrieb Alex Ionescu:
While the paper on Linux's spinlock semantics was very interesting, it remains the fact that this is not the case in Windows in this particular instance.
A lot of ReactOS code *is* missing calls such as KeMemoryBarrier() and (volatile), and only works by chance, so the argument that "otherwise our code wouldn't work" is a bit of a fallacy.
You also need to think outside the strict-ordering x86 box. Most of ReactOS' code is totally borked on IA64, PPC or ARM (and semi-broken on x64 too, which has looser ordering).
Of course, feel free to ignore the suggestion.
-- Best regards, Alex Ionescu
I won't ignore the suggestion. I'd rather try to convince you that its useless. Lets look at this particular instance.
Irql = KeAcquireQueuedSpinLock(Queue); OldValue = (*Ulong)--; KeReleaseQueuedSpinLock(Queue, Irql);
KeAcquireQueuedSpinLock is declared as:
_DECL_HAL_KE_IMPORT KIRQL FASTCALL KeAcquireQueuedSpinLock( IN OUT KSPIN_LOCK_QUEUE_NUMBER Number);
Thats really all we need to know. Its a fastcall function. Its not inline, but a function calls across the translation unit boundaries. This alone is enough to guarantee strict compiler ordering of memory accesses around the bounds of the call. The compiler cannot know what will happen inside the function that is being called, so it won't make any assumptions about reordering possibilities.
See: http://publications.gbdirect.co.uk/c_book/chapter8/sequence_points.html
MSDN: http://msdn.microsoft.com/en-us/library/d45c7a5d%28VS.80%29.aspx "Function-call operator. The function-call expression and all arguments to a function, including default arguments, are evaluated/*and all side effects completed*/ prior to entry to the function. There is no specified order of evaluation among the arguments or the function-call expression."
Also described in the ansi-C standard, but pretty abstract: http://www.bsb.me.uk/ansi-c/ansi-c-one-file#sec_2_1_2_3 http://www.bsb.me.uk/ansi-c/ansi-c-one-file#sec_3_3_2_2
Regards, Timo
What about on UP, where a strong compiler will realize that the call only raises IRQL to dispatch, and since these apis are in NTOSKRNL, they might get in-lined or modified.
On AMD64, raising the IRQL means only modifying cr8.
So the function will become:
OldIrql == __readcr8(); __writecr8(DISPATCH_LEVEL); inc dword ptr [reg]; __writecr8(OldIrql);
Is ordering still guaranteed in this scenario?
-- Best regards, Alex Ionescu
On 2011-06-03, at 3:35 PM, Timo Kreuzer wrote:
Am 03.06.2011 17:55, schrieb Alex Ionescu:
While the paper on Linux's spinlock semantics was very interesting, it remains the fact that this is not the case in Windows in this particular instance.
A lot of ReactOS code *is* missing calls such as KeMemoryBarrier() and (volatile), and only works by chance, so the argument that "otherwise our code wouldn't work" is a bit of a fallacy.
You also need to think outside the strict-ordering x86 box. Most of ReactOS' code is totally borked on IA64, PPC or ARM (and semi-broken on x64 too, which has looser ordering).
Of course, feel free to ignore the suggestion.
-- Best regards, Alex Ionescu
I won't ignore the suggestion. I'd rather try to convince you that its useless. Lets look at this particular instance.
Irql = KeAcquireQueuedSpinLock(Queue); OldValue = (*Ulong)--; KeReleaseQueuedSpinLock(Queue, Irql);KeAcquireQueuedSpinLock is declared as:
_DECL_HAL_KE_IMPORT KIRQL FASTCALL KeAcquireQueuedSpinLock( IN OUT KSPIN_LOCK_QUEUE_NUMBER Number);
Thats really all we need to know. Its a fastcall function. Its not inline, but a function calls across the translation unit boundaries. This alone is enough to guarantee strict compiler ordering of memory accesses around the bounds of the call. The compiler cannot know what will happen inside the function that is being called, so it won't make any assumptions about reordering possibilities.
See: http://publications.gbdirect.co.uk/c_book/chapter8/sequence_points.html
MSDN: http://msdn.microsoft.com/en-us/library/d45c7a5d%28VS.80%29.aspx "Function-call operator. The function-call expression and all arguments to a function, including default arguments, are evaluated and all side effects completed prior to entry to the function. There is no specified order of evaluation among the arguments or the function-call expression."
Also described in the ansi-C standard, but pretty abstract: http://www.bsb.me.uk/ansi-c/ansi-c-one-file#sec_2_1_2_3 http://www.bsb.me.uk/ansi-c/ansi-c-one-file#sec_3_3_2_2
Regards, Timo
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Am 03.06.2011 21:48, schrieb Alex Ionescu:
What about on UP, where a strong compiler will realize that the call only raises IRQL to dispatch, and since these apis are in NTOSKRNL, they might get in-lined or modified.
This would only happen, if the function is either delcared as inline or inside the same translation unit. Both of that is not the case here. If it was the case (say we inlined KeAcquireQueuedSpinLock) then we would need to add an explicit memory barrier to that inline function.
On AMD64, raising the IRQL means only modifying cr8.
So the function will become:
OldIrql == __readcr8(); __writecr8(DISPATCH_LEVEL); inc dword ptr [reg]; __writecr8(OldIrql);
Is ordering still guaranteed in this scenario?
No, not in this case. (Which is IMO a bad definition of KeRaiseIrql or a bad implementation of __writecr8) But how does the use of volatile fix this? It doesn't! MSDN says about volatile objects: "A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary." But we would need release + acquire semantics. and even then it wouldn't guarantee that __writecr8() is considered "reference to a global or static object". And gcc might even behave different.
Luckily at least all Spinlock functions have an implicit memory barrier and are thus not prone to this problem.
So instead of adding volatile here and there and random places or even to all shared data structures, we need to make sure, we stick to some locking rules, which would mean not to expect KeRaiseIrql() to provide an implicit memory barrier and to make sure all inlined spinlock functions will have a proper memory barrier.
Regards, Timo
On 2011-06-03, at 5:31 PM, Timo Kreuzer wrote:
Am 03.06.2011 21:48, schrieb Alex Ionescu:
What about on UP, where a strong compiler will realize that the call only raises IRQL to dispatch, and since these apis are in NTOSKRNL, they might get in-lined or modified.
This would only happen, if the function is either delcared as inline or inside the same translation unit. Both of that is not the case here. If it was the case (say we inlined KeAcquireQueuedSpinLock) then we would need to add an explicit memory barrier to that inline function.
This isn't true -- a modern optimizing compiler uses link-time code generation which crosses compilation units and inline semantics.
For example, in Windows 7, this function stores the address of the addend in ESI, and all callers move the address in ESI prior. Completely different from how it's defined in source, but the result of LTGC.
On AMD64, raising the IRQL means only modifying cr8.
So the function will become:
OldIrql == __readcr8(); __writecr8(DISPATCH_LEVEL); inc dword ptr [reg]; __writecr8(OldIrql);
Is ordering still guaranteed in this scenario?
No, not in this case. (Which is IMO a bad definition of KeRaiseIrql or a bad implementation of __writecr8)
This is how KeRaiseIrql(DISPATCH_LEVEL) behaves on AMD64.... it's an inline in wdm.h, which I'll paste, since last time I quoted MSDN you said I was making stuff up.
__forceinline KIRQL KfRaiseIrql ( __in KIRQL NewIrql ) {
KIRQL OldIrql;
OldIrql = KeGetCurrentIrql();
NT_ASSERT(OldIrql <= NewIrql);
WriteCR8(NewIrql); return OldIrql; }
But how does the use of volatile fix this? It doesn't! MSDN says about volatile objects: "A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary." But we would need release + acquire semantics. and even then it wouldn't guarantee that __writecr8() is considered "reference to a global or static object". And gcc might even behave different.
These are all good points.
Luckily at least all Spinlock functions have an implicit memory barrier and are thus not prone to this problem.
But not in UP -- which is what ReactOS Builds as.
So instead of adding volatile here and there and random places or even to all shared data structures, we need to make sure, we stick to some locking rules, which would mean not to expect KeRaiseIrql() to provide an implicit memory barrier and to make sure all inlined spinlock functions will have a proper memory barrier.
Fair enough, so then those should be fixed.
Regards, Timo
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Am 04.06.2011 01:12, schrieb Alex Ionescu:
On 2011-06-03, at 5:31 PM, Timo Kreuzer wrote:
Am 03.06.2011 21:48, schrieb Alex Ionescu:
What about on UP, where a strong compiler will realize that the call only raises IRQL to dispatch, and since these apis are in NTOSKRNL, they might get in-lined or modified.
This would only happen, if the function is either delcared as inline or inside the same translation unit. Both of that is not the case here. If it was the case (say we inlined KeAcquireQueuedSpinLock) then we would need to add an explicit memory barrier to that inline function.
This isn't true -- a modern optimizing compiler uses link-time code generation which crosses compilation units and inline semantics.
For example, in Windows 7, this function stores the address of the addend in ESI, and all callers move the address in ESI prior. Completely different from how it's defined in source, but the result of LTGC.
True, I didn't consider LTCG. I don't know how it handles instruction reordering, but to prevent it, we can use a _ReadWriteBarrier in KxAcquireSpinLock and KxReleaseSpinLock. This way, even if inlined we will get proper ordering for all our spinlock functions. We also need to take the CPUs out-of-order execution into account. mov cr8,x is a serializing instruction, so that is no problem. Other architectures would also have to make sure that changing the irql is a serializing instruction.
Is ordering still guaranteed in this scenario?
No, not in this case. (Which is IMO a bad definition of KeRaiseIrql or a bad implementation of __writecr8)
This is how KeRaiseIrql(DISPATCH_LEVEL) behaves on AMD64.... it's an inline in wdm.h, which I'll paste, since last time I quoted MSDN you said I was making stuff up.
As I said above. I agree with you that it does not guarantee ordering in this case.
__forceinline KIRQL KfRaiseIrql ( __in KIRQL NewIrql ) {
KIRQL OldIrql; OldIrql = KeGetCurrentIrql(); NT_ASSERT(OldIrql <= NewIrql); WriteCR8(NewIrql);return OldIrql; }
But how does the use of volatile fix this? It doesn't! MSDN says about volatile objects: "A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary." But we would need release + acquire semantics. and even then it wouldn't guarantee that __writecr8() is considered "reference to a global or static object". And gcc might even behave different.
These are all good points.
Luckily at least all Spinlock functions have an implicit memory barrier and are thus not prone to this problem.
But not in UP -- which is what ReactOS Builds as.
Not in case of inlined functions and with LTCG, which both does not yet apply to ros. For safety in regards to inlining / LTCG, I added the memory barriers now.
So instead of adding volatile here and there and random places or even to all shared data structures, we need to make sure, we stick to some locking rules, which would mean not to expect KeRaiseIrql() to provide an implicit memory barrier and to make sure all inlined spinlock functions will have a proper memory barrier.
Fair enough, so then those should be fixed.
Regards, Timo
Ros-dev mailing list Ros-dev@reactos.org mailto:Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Finally addressed in r52117. Thanks!
"ReactOS Development List" ros-dev@reactos.org wrote on Fri, June 3rd, 2011, 1:24 PM:
Ah, I didn't see the caller.
There's still the issue of the missing volatile. It's required to make sure there is strict ordering between the spinlock acquisition and the increment/decrement.
-- Best regards, Alex Ionescu
I confirm: Timo and I agreed that the volatiles were needed because of a lower-level bug, and Timo fixed that, making the volatiles useless.
Best regards, Alex Ionescu
On Mon, Jun 6, 2011 at 2:35 PM, Timo Kreuzer timo.kreuzer@web.de wrote:
Am 06.06.2011 19:06, schrieb Pierre Schweitzer:
Finally addressed in r52117.
Thanks!
Oops, looks like I reverted it :D
No, seriously: please read the discussion about this to the end.
Timo
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
go go Timo go! :D
On Mon, Jun 6, 2011 at 9:09 PM, Alex Ionescu ionucu@videotron.ca wrote:
I confirm: Timo and I agreed that the volatiles were needed because of a lower-level bug, and Timo fixed that, making the volatiles useless.
Best regards, Alex Ionescu
On Mon, Jun 6, 2011 at 2:35 PM, Timo Kreuzer timo.kreuzer@web.de wrote:
Am 06.06.2011 19:06, schrieb Pierre Schweitzer:
Finally addressed in r52117.
Thanks!
Oops, looks like I reverted it :D
No, seriously: please read the discussion about this to the end.
Timo
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
If only you were that efficient to revert your changes when you are wrong.
May you revert r51748, or shall I? If you forgot, here is a reminder: most of concerned devs agreed on a revert, given Alex Ionescu's reasons.
So now, apply your own methods to your stuff, or I'll do.
Thanks.
"ReactOS Development List" ros-dev@reactos.org wrote on Mon, June 6th, 2011, 8:35 PM:
Am 06.06.2011 19:06, schrieb Pierre Schweitzer:
Finally addressed in r52117. Thanks!
Oops, looks like I reverted it :D No, seriously: please read the discussion about this to the end.
Timo
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Am 07.06.2011 21:20, schrieb Pierre Schweitzer:
If only you were that efficient to revert your changes when you are wrong.
Me? Wrong? You must be kidding! ;-)
May you revert r51748, or shall I? If you forgot, here is a reminder: most of concerned devs agreed on a revert, given Alex Ionescu's reasons.
So now, apply your own methods to your stuff, or I'll do.
Thanks.
Done.
Btw, just an explanation why I reverted your change so quickly without asking first. In fact, I didn't even realize that you just comitted it, I somehow thought it was old code and realized only later that you just comitted it, after Colin mentioned something on irc :)