ion@svn.reactos.org wrote:
Fix shamefully dangerously broken Work Thread/Queue/Item implementation:
- Do not pollute the kernel with 10 real-time threads and 5 high-priority threads in order to manage work items. Work threads are very low priority (< 7) and should never pre-empt user threads like they do now. 1 priority 7, 5 priority 5 and 3 priority 4 threads are now properly created.
- Implement a worker thread balance set manager. On SMP systems, it is able to determine when a new thread should be allocated to execute on a free CPU. On both UP and MP, it is also able to detect if a work queue has deadlocked, and will allocate new dynamic threads to unfreeze the queue.
- Add check for threads returning with APCs disabled, and re-enable APCs if this happens. This hack is used in NT for broken drivers.
- Lots of code changes to support dynamic threads, which:
  - Can terminate.
  - Use a 10 minute timeout on the kernel queue.
- Add skeleton code for swapping worker thread stacks as well as worker thread shutdown (not yet implemented).
- Add WORKER_INVALID bugcheck definition.
- These changes seem to make ROS a lot more responsive.
NDK:
- Make more compatible with MS IFS
- Fix EX_WORK_QUEUE definition.
- Fix ETHREAD offsets.
- Fix RtlIsNameLegalDOS8Dot3 definition.
- Move splay tree defines to IFS.
Updated files:
trunk/reactos/include/ndk/exfuncs.h
trunk/reactos/include/ndk/extypes.h
trunk/reactos/include/ndk/ifssupp.h
trunk/reactos/include/ndk/iofuncs.h
trunk/reactos/include/ndk/obfuncs.h
trunk/reactos/include/ndk/pstypes.h
trunk/reactos/include/ndk/rtlfuncs.h
trunk/reactos/include/ndk/rtltypes.h
trunk/reactos/lib/rtl/dos8dot3.c
trunk/reactos/ntoskrnl/ex/work.c
trunk/reactos/ntoskrnl/ntoskrnl.mc
trunk/reactos/w32api/include/ddk/ntifs.h
Ros-svn mailing list Ros-svn@reactos.org http://www.reactos.org/mailman/listinfo/ros-svn
After applying this commit, I get a crash while compiling ros on ros with 'make clean' on a clean tree on my SMP machine:
Assertion (Thread->State == Waiting) == (Thread->WaitBlockList != NULL) failed at ./ntoskrnl/ke/wait.c:722
I don't get a backtrace. It may be a problem with the broken ASSERT statement (DBG=1 and KDBG=0).
- Hartmut
Hi,
Can you try KDBG=1 and do a "bt" please? This should show a stack trace. Also, is this the SMP build?
Best regards, Alex Ionescu
Can you also try separating the ASSERT in two statements please? It's really confusing in the way it's written.
Best regards, Alex Ionescu
Alex Ionescu wrote:
> Can you try KDBG=1 and do a "bt" please? This should show a stack trace. Also, is this the SMP build?
It is an SMP build. I'm not able to install a build with KDBG=1; it always crashes at the end of the second-stage setup. I've created a backtrace by adding KeRosDumpStackFrames. This is r20554 with a heavily modified HAL and kernel in the processor/APIC initialization.
- Hartmut
(./ntoskrnl/ke/wait.c:724 CPU0) KiAbortWaitThread: 818f75e8, Status: 102, 818f7690
Frames:
<ntoskrnl.exe:95f9 (./ntoskrnl/ke/wait.c:725 (KiAbortWaitThread))>
<ntoskrnl.exe:955e (./ntoskrnl/ke/wait.c:701 (KiWaitTest))>
<ntoskrnl.exe:841e (./ntoskrnl/ke/timer.c:307 (KiHandleExpiredTimer))>
<ntoskrnl.exe:83d1 (./ntoskrnl/ke/timer.c:273 (KiExpireTimers))>
<ntoskrnl.exe:334f (./ntoskrnl/ke/dpc.c:554 (KiDispatchInterrupt))>
<80507204> hal/halx86/mp/mpsirql.c:93, HalpLowerIrql
<8050767D> hal/halx86/mp/mpsirql.c:337, HalEndSystemInterrupt
<80504419> hal/halx86/mp/apic.c:820, MpsTimerHandler
<80506F6F> hal\halx86\mp\mps.S:85, MpsTimerInterrupt
<ntoskrnl.exe:8d93f (./ntoskrnl/ps/thread.c:90 (PspSystemThreadStartup))>
Assertion (Thread->State == Waiting) == (Thread->WaitBlockList != NULL) failed at ./ntoskrnl/ke/wait.c:727 for CPU0
KeBugCheckWithTf at ntoskrnl\ke\i386\exp.c:1242
M:\Sandbox\ros_work\reactos>svn diff ntoskrnl\ke\wait.c
Index: ntoskrnl/ke/wait.c
===================================================================
--- ntoskrnl/ke/wait.c (revision 20554)
+++ ntoskrnl/ke/wait.c (working copy)
@@ -719,6 +719,11 @@
 
     /* If we are blocked, we must be waiting on something also */
     DPRINT("KiAbortWaitThread: %x, Status: %x, %x \n", Thread, WaitStatus, Thread->WaitBlockList);
+    if (!((Thread->State == Waiting) == (Thread->WaitBlockList != NULL)))
+    {
+        DPRINT1("KiAbortWaitThread: %x, Status: %x, %x \n", Thread, WaitStatus, Thread->WaitBlockList);
+        KeRosDumpStackFrames(NULL, 10);
+    }
     ASSERT((Thread->State == Waiting) == (Thread->WaitBlockList != NULL));
 
     /* Remove the Wait Blocks from the list */
Hi,
Can you try the latest build please? It fixes some bugs in worker threads and waiting mechanisms.
Best regards, Alex Ionescu
Alex Ionescu wrote:
> Can you try the latest build please? It fixes some bugs in worker threads and waiting mechanisms.
I get the same result:
(./ntoskrnl/ke/wait.c:750 CPU0) KiAbortWaitThread: 81381a18, Status: 102, 81381ac0
Frames:
<ntoskrnl.exe:9c36 (./ntoskrnl/ke/wait.c:751 (KiAbortWaitThread))>
<ntoskrnl.exe:9b98 (./ntoskrnl/ke/wait.c:726 (KiWaitTest))>
<ntoskrnl.exe:866b (./ntoskrnl/ke/timer.c:307 (KiHandleExpiredTimer))>
<ntoskrnl.exe:861e (./ntoskrnl/ke/timer.c:273 (KiExpireTimers))>
<ntoskrnl.exe:336b (./ntoskrnl/ke/dpc.c:554 (KiDispatchInterrupt))>
<804DF204>
<804DF67D>
<804DC419>
<804DEF6F>
<ntoskrnl.exe:8f0fd (./ntoskrnl/ps/thread.c:90 (PspSystemThreadStartup))>
- Hartmut
2006/1/5, Hartmut Birr osexpert@googlemail.com:
The thread state (Thread->State) is 'Ready'.
- Hartmut
Hi,
There are three places in the kernel that remove wait blocks: KiAbortWaitThread, KiInsertQueue and KiBlockThread. Only KiBlockThread sets Thread->WaitBlockList to NULL. Is this the problem?
- Hartmut
Hi,
I have been reading Windows Internals II and Windows Internals 4th Edition and I see that the wait blocks are actually supposed to be a circular list... so there should never really be any "NULL". It's possible I based some previous code on this knowledge, which is now conflicting with the ROS implementation of a null-terminated list. I will change the wait code to use circular lists as documented and post a patch.
Best regards, Alex Ionescu
I think the real problem isn't whether the list is NULL-terminated or whether the thread's WaitBlockList entry is NULL. The real problem is that KiAbortWaitThread is called for a thread that is not waiting.
- Hartmut
Hartmut Birr wrote:
> I think the real problem isn't whether the list is NULL-terminated or whether the thread's WaitBlockList entry is NULL.
Hmm, perhaps not, but it's still an issue I'm going to tackle tonight (last time I tried, it failed for mysterious reasons).
> The real problem is that KiAbortWaitThread is called for a thread that is not waiting.
OK, since this happened after my worker thread patch and the worker threads use kernel queues, I reviewed the kernel queue implementation and found a number of important flaws. Whether or not they cause this problem I can't tell for sure, but I've also added a debug print before the KeAbortWaitThread call. Let me know if this patch fixes anything, or if the DPRINT shows that the thread isn't really waiting.
Best regards, Alex Ionescu
I've tested your changes (r20579 with r20601, 20605, 20606). KeAbortWaitThread is called for waiting threads only. But I'm running into another problem. Compiling ros on ros (with the nice parameter '-j2') hangs after some time. If I look at taskmgr or ctm, only the idle thread consumes CPU time. Sometimes I can stop the compile with Ctrl-C, sometimes not. I wasn't able to compile ros on the SMP machine. On the UP machine, one of four compile runs finishes.
- Hartmut
Hi Hartmut, try ps.exe and dump all the threads and processes. Thanks, James
I've attached the output from ps. It seems that it isn't possible to terminate as.exe; it has no running threads.
- Hartmut
P PID PPID KTime UTime NAME
t TID KTime UTime State WaitResson
w PID Hwnd WndStile TID WndName
P 0 0 0:07:27 0:00:00 ProcName:
P 4 0 0:00:17 0:00:00 ProcName: System
  t 8 8:44:17 0:00:00 Wait WrQueue
  t 12 8:44:17 0:00:00 Wait WrQueue
  t 16 8:44:17 0:00:00 Wait WrQueue
  t 20 8:44:19 0:00:00 Wait WrQueue
  t 24 8:44:18 0:00:00 Ready Executive
  t 28 8:44:17 0:00:00 Wait WrQueue
  t 32 8:44:17 0:00:00 Wait WrQueue
  t 36 8:44:17 0:00:00 Ready Executive
  t 40 8:44:21 0:00:00 Wait WrQueue
  t 44 8:44:17 0:00:00 Wait Executive
  t 48 8:44:18 0:00:00 Wait Executive
  t 52 8:44:24 0:00:00 Wait Executive
  t 56 8:44:17 0:00:00 Wait Executive
  t 60 8:44:17 0:00:00 Wait Executive
  t 64 8:44:22 0:00:00 Wait Executive
  t 68 8:44:17 0:00:00 Wait DelayExecution
  t 72 8:44:17 0:00:00 Wait Executive
  t 76 8:44:18 0:00:00 Wait Executive
  t 128 8:44:17 0:00:00 Wait Executive
  t 132 8:44:20 0:00:00 Wait Executive
  t 264 8:44:18 0:00:00 Wait Executive
P 80 4 0:00:00 0:00:00 ProcName: smss.exe
  t 88 8:44:17 0:00:00 Wait UserRequest
  t 92 8:44:17 0:00:00 Wait UserRequest
  t 96 8:44:17 0:00:00 Wait UserRequest
  t 100 8:44:17 0:00:00 Wait UserRequest
  t 104 8:44:17 0:00:00 Wait UserRequest
  t 120 8:44:17 0:00:00 Wait UserRequest
P 108 80 0:01:54 0:00:01 ProcName: csrss.exe
  t 84 8:44:17 0:00:00 Wait UserRequest
  t 124 8:44:18 0:00:00 Wait UserRequest
  t 136 8:44:17 0:00:00 Wait UserRequest
  t 140 8:44:18 0:00:00 Wait Executive
  t 152 8:44:19 0:00:00 Wait UserRequest
  t 156 8:44:17 0:00:00 Wait UserRequest
  t 160 8:44:20 0:00:00 Wait Executive
    w 108 20020 96000000 160
  t 164 8:44:18 0:00:00 Wait Executive
    w 108 20022 86000000 164
  t 168 8:44:17 0:00:00 Wait Executive
    w 108 20024 86000000 168
  t 180 8:44:19 0:00:00 Wait UserRequest
  t 196 8:44:17 0:00:00 Wait UserRequest
  t 208 8:44:17 0:00:00 Wait UserRequest
  t 228 8:44:17 0:00:00 Wait UserRequest
  t 252 8:44:17 0:00:00 Wait UserRequest
  t 316 8:44:17 0:00:00 Wait UserRequest
  t 328 8:44:18 0:00:00 Wait UserRequest
  t 332 8:46:13 0:00:00 Wait Executive
    w 108 200d4 04cf0000 332
    w 108 200d6 14ca0000 332 Command Prompt
    w 108 200de 14ca0000 332 Command Prompt
  t 340 8:44:18 0:00:00 Wait UserRequest
  t 380 8:44:17 0:00:00 Wait UserRequest
  t 400 8:44:17 0:00:00 Wait UserRequest
  t 412 8:44:17 0:00:00 Wait UserRequest
P 144 108 0:00:00 0:00:00 ProcName: winlogon.exe
  t 148 8:44:18 0:00:00 Wait UserRequest
    w 144 20032 84000000 148 SAS
P 172 144 0:00:00 0:00:00 ProcName: services.exe
  t 176 8:44:19 0:00:00 Wait Executive
  t 184 8:44:17 0:00:00 Wait UserRequest
P 188 172 0:00:00 0:00:00 ProcName: eventlog.exe
  t 192 8:44:19 0:00:00 Wait UserRequest
  t 204 8:44:17 0:00:00 Wait UserRequest
  t 200 8:44:19 0:00:00 Wait UserRequest
P 212 172 0:00:00 0:00:00 ProcName: umpnpmgr.exe
  t 216 8:44:19 0:00:00 Wait UserRequest
  t 224 8:44:19 0:00:00 Wait UserRequest
  t 236 8:44:19 0:00:00 Wait UserRequest
  t 256 8:44:19 0:00:00 Wait UserRequest
  t 268 8:44:19 0:00:00 Wait UserRequest
  t 272 8:44:19 0:00:00 Wait UserRequest
  t 276 8:44:19 0:00:00 Wait UserRequest
  t 280 8:44:19 0:00:00 Wait UserRequest
  t 284 8:44:19 0:00:00 Wait UserRequest
  t 288 8:44:19 0:00:00 Wait UserRequest
  t 292 8:44:18 0:00:00 Wait UserRequest
  t 296 8:44:19 0:00:00 Wait UserRequest
  t 308 8:44:19 0:00:00 Wait UserRequest
  t 312 8:44:19 0:00:00 Wait UserRequest
P 232 172 0:00:00 0:00:00 ProcName: dhcp.exe
  t 220 8:44:18 0:00:00 Wait DelayExecution
  t 240 8:44:17 0:00:00 Wait Executive
P 244 144 0:00:00 0:00:00 ProcName: userinit.exe
  t 248 8:44:18 0:00:00 Wait UserRequest
P 300 244 0:00:01 0:00:00 ProcName: explorer.exe
  t 304 8:44:22 0:00:00 Wait Executive
    w 300 60040 84000000 304
    w 300 20042 84000000 304
    w 300 20044 84000000 304
    w 300 20046 84000000 304
    w 300 2006c 04c00000 304
    w 300 20074 94000000 304 Program Manager
    w 300 20076 50010000 304 Program Manager
    w 300 20078 56010340 304 Program Manager
    w 300 2007c 40000002 304 Program Manager
    w 300 20088 96040000 304 Program Manager
    w 300 2008c 5000000b 304 Start
    w 300 20090 50000045 304 Running Applications
    w 300 20092 56001341 304 Running Applications
    w 300 20094 84800000 304 Running Applications
    w 300 20096 52000000 304 Running Applications
    w 300 20098 84800003 304 Running Applications
    w 300 2009a 50000000 304 Running Applications
    w 300 2009c 84800003 304 Running Applications
    w 300 2009e 56000b4d 304 Running Applications
    w 300 200a0 84800001 304 Running Applications
    w 300 200a2 5600a249 304 Running Applications
    w 300 200aa 86040000 304 Start Menu
P 260 300 0:00:00 0:00:00 ProcName: cmd.exe
  t 324 8:44:18 0:00:00 Wait UserRequest
P 344 260 0:00:04 0:00:00 ProcName: _make.EXE
  t 336 8:44:31 0:00:00 Wait UserRequest
P 372 300 0:00:00 0:00:00 ProcName: cmd.exe
  t 376 8:44:19 0:00:00 Wait UserRequest
P 360 344 0:00:00 0:00:00 ProcName: gcc.exe
  t 404 8:44:19 0:00:00 Wait UserRequest
P 396 360 0:00:00 0:00:00 ProcName: as.exe
Hi!
Hartmut Birr wrote:
> t 8 8:44:17 0:00:00 Wait WrQueue
>     ^^^^^^^
Oh man! NOOOOO! The kernel time is not initialized correctly, so this is meaningless! Sorry, this was a good try, BTW. 8^<
James
Hi,
From what I understood from Waxdragon, my 20632 should have fixed this... can you try again please?
Best regards, Alex Ionescu
On 1/7/06, Hartmut Birr osexpert@googlemail.com wrote:
> I've tested your changes (r20579 with r20601, 20605, 20606). KeAbortWaitThread is called for waiting threads only. But I'm running into another problem. Compiling ros on ros (with the nice parameter '-j2') hangs after some time.
When I try to selfhost with a -O2 build, I see a hang about 90% into the build. The hang I see is *hard*; at least under VMware, nothing is responding. I don't use "-j2".
> If I look at taskmgr or ctm, only the idle thread consumes CPU time.
I have seen this once. I don't remember what I was doing at the time; it may have been compiling.
> Sometimes I can stop the compile with Ctrl-C, sometimes not. I wasn't able to compile ros on the SMP machine. On the UP machine, one of four compile runs finishes.
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
WD -- <Alex_Ionescu> it's like saying let's rename Ke to Kernel because people think it's Ketchup
Currently I'm on r20600. After fixing some bugs (timer.c, gate.c and wait.c), I'm able to compile ros on ros. After updating to r20601, it is broken again. I will commit the fixes once I reach the head revision.
- Hartmut
Index: ntoskrnl/ke/timer.c
===================================================================
--- ntoskrnl/ke/timer.c (revision 20601)
+++ ntoskrnl/ke/timer.c (working copy)
@@ -165,6 +165,11 @@
     return KeSetTimerEx(Timer, DueTime, 0, Dpc);
 }
 
+ULONG _help1;
+ULONG _help2;
+ULONG _help3;
+ULONG _help4;
+
 /*
  * @implemented
  *
@@ -247,6 +252,7 @@
     /* Query Interrupt Times */
     InterruptTime = KeQueryInterruptTime();
 
+    _help1 = 1;
     /* Loop through the Timer List and remove Expired Timers. Insert them into the Expired Listhead */
     LIST_FOR_EACH_SAFE(Timer, tmp, &KiTimerListHead, KTIMER, TimerListEntry)
     {
@@ -259,7 +265,9 @@
         RemoveEntryList(&Timer->TimerListEntry);
         InsertTailList(&ExpiredTimerList, &Timer->TimerListEntry);
     }
+    _help1 = 0;
 
+    _help2 = 1;
     /* Expire the Timers */
     while (!IsListEmpty(&ExpiredTimerList))
     {
@@ -267,11 +275,13 @@
 
         /* Get the Timer */
         Timer = CONTAINING_RECORD(CurrentEntry, KTIMER, TimerListEntry);
+        Timer->Header.Inserted = FALSE;
         DPRINT("Expiring Timer: %x\n", Timer);
 
         /* Expire it */
         KiHandleExpiredTimer(Timer);
     }
+    _help2 = 0;
 
     DPRINT("Timers expired\n");
 
@@ -301,7 +311,9 @@
     /* Set it as Signaled */
     DPRINT("Setting Timer as Signaled\n");
     Timer->Header.SignalState = TRUE;
+    _help3 = 1;
     KiWaitTest(&Timer->Header, IO_NO_INCREMENT);
+    _help3 = 0;
 
     /* If the Timer is periodic, reinsert the timer with the new due time */
     if (Timer->Period) {
@@ -321,10 +333,11 @@
         DPRINT("Timer->Dpc %x Timer->Dpc->DeferredRoutine %x\n", Timer->Dpc, Timer->Dpc->DeferredRoutine);
 
         /* Insert the DPC */
+        _help4 = 1;
         KeInsertQueueDpc(Timer->Dpc, NULL, NULL);
-
+        _help4 = 0;
         DPRINT("Finished dpc routine\n");
     }
 }
Index: ntoskrnl/ke/gate.c
===================================================================
--- ntoskrnl/ke/gate.c (revision 20601)
+++ ntoskrnl/ke/gate.c (working copy)
@@ -136,7 +136,6 @@
     /* Reschedule the Thread */
     DPRINT("Unblocking the Thread\n");
     KiUnblockThread(WaitThread, &WaitStatus, EVENT_INCREMENT);
-    return;
 
 quit:
     /* Release the Dispatcher Database Lock */
Index: ntoskrnl/ke/wait.c
===================================================================
--- ntoskrnl/ke/wait.c (revision 20601)
+++ ntoskrnl/ke/wait.c (working copy)
@@ -152,6 +152,7 @@
             {
                 /* FIXME: The timer already expired, we should find a new ready thread */
                 Status = STATUS_SUCCESS;
+                CHECKPOINT1;
                 break;
             }
 
@@ -329,6 +330,7 @@
                 {
                     /* Return a timeout if we couldn't insert the timer */
                     Status = STATUS_TIMEOUT;
+                    CHECKPOINT1;
                     goto DontWait;
                 }
             }
@@ -560,6 +562,7 @@
             {
                 /* Return a timeout */
                 Status = STATUS_TIMEOUT;
+                CHECKPOINT1;
                 goto DontWait;
             }
 
@@ -808,6 +811,10 @@
     {
         KiDispatchThreadNoLock(Ready);
     }
+    else
+    {
+        KeReleaseDispatcherDatabaseLockFromDpcLevel();
+    }
 
     /* Lower irql back */
     KeLowerIrql(OldIrql);
Index: ntoskrnl/include/internal/ke.h
===================================================================
--- ntoskrnl/include/internal/ke.h (revision 20601)
+++ ntoskrnl/include/internal/ke.h (working copy)
@@ -62,7 +62,6 @@
 #define KeReleaseDispatcherDatabaseLockFromDpcLevel() \
     KeReleaseSpinLockFromDpcLevel(&DispatcherDatabaseLock);
 #define KeReleaseDispatcherDatabaseLock(OldIrql) \
-    KeReleaseSpinLockFromDpcLevel(&DispatcherDatabaseLock); \
     KiExitDispatcher(OldIrql);
 #endif
 
@@ -675,7 +674,7 @@
 
 VOID
 NTAPI
-KeApplicationProcessorInit(VOID);
+KeApplicationProcessorInit(ULONG);
 
 VOID
 NTAPI
From: Hartmut Birr
Currently I'm on r20600.
I was wondering "why such an old version", until I looked up the date: 6 Jan. 150 commits in 3 days, I'd say this project is alive and kicking :-)
GvG
svn.reactos.org is up and running again. I just updated my local tree.
----- Original Message -----
From: "Hartmut Birr" osexpert@googlemail.com
To: "ReactOS Development List" ros-dev@reactos.org
Sent: 9 January 2006 16:32
Subject: Re: [ros-dev] Re: [ros-svn] [ion] 20554: - Fix shamefully dangerously broken Work Thread/Queue/Item implementation: