ion@svn.reactos.com wrote:
- DPRINT("Waiting on Critical Section: %x\n", CriticalSection);
- if (CriticalSection->DebugInfo) CriticalSection->DebugInfo->EntryCount++;
I'm just reading the diffs here, so I could be way off, but shouldn't this use an InterlockedIncrement... otherwise you could get a context switch between in the middle of your increment to another threading attempting to lock the critical section. It is just debug information, but still...
Also, looking at the rest of the code, I notice that spin count doesn't appear to be implemented.
I don't have a build environment even setup, so take the attached patch with a grain of salt.
The changes in RtlpCreateCriticalSEctionSem() are because the InterlockCompareExchangePointer() should already have written the new event into the data structure.
Maybe on day I'll manage to get my build environment setup again.
- Joseph
Index: C:/Users/galb/ReactOS/reactos/lib/ntdll/rtl/critical.c =================================================================== --- C:/Users/galb/ReactOS/reactos/lib/ntdll/rtl/critical.c (revision 12774) +++ C:/Users/galb/ReactOS/reactos/lib/ntdll/rtl/critical.c (working copy) @@ -133,7 +133,17 @@ { HANDLE Thread = (HANDLE)NtCurrentTeb()->Cid.UniqueThread;
- /* Try to Lock it */ + /* Try for SpinCount times to Lock it */ + for(LONG i = 0; i < CriticalSection->SpinCount; i++) + { + if ( RtlTryEnterCriticalSection(CriticalSection) ) + return STATUS_SUCCESS; + } + + /* + * Still didn't get it; we have to bump the LockCount, which + * might get it for us, and then wait + */ if (InterlockedIncrement(&CriticalSection->LockCount)) {
/* @@ -398,12 +408,12 @@
/* Increase the Debug Entry count */ DPRINT("Waiting on Critical Section: %x\n", CriticalSection); - if (CriticalSection->DebugInfo) CriticalSection->DebugInfo->EntryCount++; + if (CriticalSection->DebugInfo) InterlockedIncrement(&CriticalSection->DebugInfo->EntryCount);
for (;;) {
/* Increase the number of times we've had contention */ - if (CriticalSection->DebugInfo) CriticalSection->DebugInfo->ContentionCount++; + if (CriticalSection->DebugInfo) InterlockedIncrement(&CriticalSection->DebugInfo->ContentionCount);
/* Wait on the Event */ DPRINT("Waiting on Event: %x\n", CriticalSection->LockSemaphore); @@ -523,21 +533,14 @@ return; }
- if (!(hEvent = InterlockedCompareExchangePointer((PVOID*)&CriticalSection->LockSemaphore, - (PVOID)hNewEvent, - 0))) { - - /* We created a new event succesffuly */ - hEvent = hNewEvent; - } else { - - /* Some just created an event */ - DPRINT("Closing already created event!\n"); - NtClose(hNewEvent); + + if ( InterlockedCompareExchangePointer((PVOID*)&CriticalSection->LockSemaphore, + (PVOID)hNewEvent, + 0)) != 0) { + /* Someone else just created the event */ + NtClose(hNewEvent); } - - /* Set either the new or the old */ - CriticalSection->LockSemaphore = hEvent; + DPRINT("Event set!\n"); }
Joseph Galbraith wrote:
ion@svn.reactos.com wrote:
- DPRINT("Waiting on Critical Section: %x\n", CriticalSection);
- if (CriticalSection->DebugInfo)
CriticalSection->DebugInfo->EntryCount++;
I'm just reading the diffs here, so I could be way off, but shouldn't this use an InterlockedIncrement... otherwise you could get a context switch between in the middle of your increment to another threading attempting to lock the critical section. It is just debug information, but still...
I suppose that makes sense... I'll look into it.
Also, looking at the rest of the code, I notice that spin count doesn't appear to be implemented.
No, not as of now. It was based on WINE code which doesn't implement this yet (AFAIK). I have been optimizing it and adding more feature to make it stabler, so spin lock support for MP builds was on my list. Thanks for your patch... I'll take a look at it.
I don't have a build environment even setup, so take the attached patch with a grain of salt.
The changes in RtlpCreateCriticalSEctionSem() are because the InterlockCompareExchangePointer() should already have written the new event into the data structure.
Yeah, you're right, it was a bit useless since the call already sets the new event if sucesful. We still have to compare with the old event though, not with 0.
Maybe on day I'll manage to get my build environment setup again.
Are you on win32? Do you need any help setting it up?
- Joseph
Best regards, Alex Ionescu
Alex Ionescu wrote:
Joseph Galbraith wrote:
ion@svn.reactos.com wrote:
- DPRINT("Waiting on Critical Section: %x\n", CriticalSection);
- if (CriticalSection->DebugInfo)
CriticalSection->DebugInfo->EntryCount++;
I'm just reading the diffs here, so I could be way off, but shouldn't this use an InterlockedIncrement... otherwise you could get a context switch between in the middle of your increment to another threading attempting to lock the critical section. It is just debug information, but still...
I suppose that makes sense... I'll look into it.
Also, looking at the rest of the code, I notice that spin count doesn't appear to be implemented.
No, not as of now. It was based on WINE code which doesn't implement this yet (AFAIK).
Yes it does. http://cvs.winehq.org/cvsweb/~checkout~/wine/dlls/ntdll/critsection.c?rev=1....
I have been optimizing it and adding more feature to make it stabler, so spin lock support for MP builds was on my list. Thanks for your patch... I'll take a look at it.
It would be good if someone could implement the stack backtrace capture functions as we could then remove critsec debugging hacks from all over the wine tree.
Rob
Rob Shearman wrote:
Yes it does. http://cvs.winehq.org/cvsweb/~checkout~/wine/dlls/ntdll/critsection.c?rev=1....
Hi Rob,
Looks like I was using an old revision of the code, since I hadn't seen the Spincount code. I will add it.
I have been optimizing it and adding more feature to make it stabler, so spin lock support for MP builds was on my list. Thanks for your patch... I'll take a look at it.
It would be good if someone could implement the stack backtrace capture functions as we could then remove critsec debugging hacks from all over the wine tree.
Ah, so that's what was using the hacks. If they are removed, then the code would be much easier to share. By the way, I've made some serious performance improvements to the code, by adding a static buffer which is used to hold up to 64 debug sections. The rest are then allocated by using the heap. This has a two fold advantage:
1) The first CS (used by the Heap Manager on ROS, probably on WINE too), one for RtlInitializeHeapManager and the other for RtlAllocateHeap now also have a valid Debug Section, because they use the static one. 2) Because the kernel-call is removed, it offers significant improvments on ROS, on WINE too, since there is less code that must run (a tiny loop versus a whole heap allocation).
I'm thinking about sending a patch to WineDevel... since it's only two functions to be added and the brute heap calls to be replaced by the function names, I think it would be easy to integrate.
Best regards, Alex Ionescu
Alex Ionescu wrote:
By the way, I've made some serious performance improvements to the code, by adding a static buffer which is used to hold up to 64 debug sections. The rest are then allocated by using the heap. This has a two fold advantage:
- The first CS (used by the Heap Manager on ROS, probably on WINE
too), one for RtlInitializeHeapManager and the other for RtlAllocateHeap now also have a valid Debug Section, because they use the static one. 2) Because the kernel-call is removed, it offers significant improvments on ROS, on WINE too, since there is less code that must run (a tiny loop versus a whole heap allocation).
A call to RtlAllocateHeap doesn't normally result in a kernel-call.
I'm thinking about sending a patch to WineDevel... since it's only two functions to be added and the brute heap calls to be replaced by the function names, I think it would be easy to integrate.
For most process-wide critical sections we don't allocate a DebugInfo anyway; we just store the critical section and the DebugInfo structure in the .data section of the DLL and don't call RtlInitializeCriticalSectionAndSpinCount. So, I don't think it would be that much of a performance improvement for Wine. Anyway, if you wanted the performance, you should just disable the code that allocates the DebugInfo, like release versions of Windows do. However, patches to manipulate the EntryCount and ContentionCount members would be welcome as they could be useful in identifying performance bottlenecks.
Rob
Robert Shearman wrote:
Alex Ionescu wrote:
A call to RtlAllocateHeap doesn't normally result in a kernel-call.
Even without them, heap code is much much slower, then allocating a static buffer, because of all the searches and control that must be done.
I'm thinking about sending a patch to WineDevel... since it's only two functions to be added and the brute heap calls to be replaced by the function names, I think it would be easy to integrate.
For most process-wide critical sections we don't allocate a DebugInfo anyway; we just store the critical section and the DebugInfo structure in the .data section of the DLL and don't call RtlInitializeCriticalSectionAndSpinCount.
Yes, unfortunately shell32.dll calls RtlDeleteCriticalSection in changenotify.c and iconcache.c. I had to remove those calls from the ROS tree because the code will think the Debug Data is in a heap and try to free it, which raises a warning. I think it's wrong to call them on wine too, even if no errors happen, because you shouldn't delete sections you statically allocate in your dll .data section.
So, I don't think it would be that much of a performance improvement for Wine.
Probabaly minimal, yes, for now.
Anyway, if you wanted the performance, you should just disable the code that allocates the DebugInfo, like release versions of Windows do.
People keep telling me this, but I have not been able to reproduce this behaviour on my retail machines. Perhaps this was in NT4?
However, patches to manipulate the EntryCount and ContentionCount members would be welcome as they could be useful in identifying performance bottlenecks.
Once my code becomes more stable and stack backtrace gets added, I will send everythign to wine, also in hope that they will get rid of the static CS and use dynamic ones...its' much cleaner, imo.
Rob
Best regards, Alex Ionescu
Alex Ionescu wrote:
Joseph Galbraith wrote:
The changes in RtlpCreateCriticalSEctionSem() are because the InterlockCompareExchangePointer() should already have written the new event into the data structure.
Yeah, you're right, it was a bit useless since the call already sets the new event if sucesful. We still have to compare with the old event though, not with 0.
Well, if the exchange worked, the old value was 0; if the exchange didn't work, then it was something other than 0.
So, if we get 0 back from InterlockCompareExhcnagePointer(), then we know the exchange took place, and we had no previous handle. Otherwise, we know the exchange didn't take place, and we must clean up our handle.
(I think anyway.)
Maybe on day I'll manage to get my build environment setup again.
Are you on win32? Do you need any help setting it up?
Well, yes, win32. I've been setup before, but I haven't kept up with the tool chain.
Do we still need the djcc stuff for building freeloader?
What would be really cool was an MSI that would get me a complete 'ReactoOS' build environment in one simple step. Maybe I'll look into it.
Actually, best would be the VC++ build system :-)
Thanks,
- Joseph
Joseph Galbraith wrote:
Alex Ionescu wrote:
Joseph Galbraith wrote:
The changes in RtlpCreateCriticalSEctionSem() are because the InterlockCompareExchangePointer() should already have written the new event into the data structure.
Yeah, you're right, it was a bit useless since the call already sets the new event if sucesful. We still have to compare with the old event though, not with 0.
Well, if the exchange worked, the old value was 0; if the exchange didn't work, then it was something other than 0.
So, if we get 0 back from InterlockCompareExhcnagePointer(), then we know the exchange took place, and we had no previous handle. Otherwise, we know the exchange didn't take place, and we must clean up our handle.
You're right again...you should submit a patch to wine for that as well.
Well, yes, win32. I've been setup before, but I haven't kept up with the tool chain.
Do we still need the djcc stuff for building freeloader?
What would be really cool was an MSI that would get me a complete 'ReactoOS' build environment in one simple step. Maybe I'll look into it.
I'm currently working on this exact project!
Actually, best would be the VC++ build system :-)
Soon, soon.
Thanks,
- Joseph
Best regards, Alex Ionescu