Author: ion
Date: Tue Jun 21 22:43:55 2011
New Revision: 52412
URL: http://svn.reactos.org/svn/reactos?rev=52412&view=rev
Log:
[NTOSKRNL]: It seems NtYieldExecution in Windows heavily biases towards avoiding a yield as much as possible, and thus attempts to locklessly check the PRCB's ready summary. Problem being that the old code was running a bunch of instructions before checking for this ready summary, and doing a double-fs/gs dereference to get the ready summary. Upon reading some WRK documentation, it appears Windows uses a KiGetCurrentReadySummary macro, which WinDBG analysis shows to be an inline that reads the ReadySummary straight from PrcbData with only a single segment read, and no code preceeding it. Although it's "unlikely" to ever really be a problem, it's perhaps best to replicate this behavior since there could be dependencies on it. Comments in file point to PDF with this information (hosted on Microsoft Research's website, publically accessible).
[NTOSKRNL]: Also, fix locking order such that the current PRCB is only read after the IRQL is raised to SYNCH_LEVEL. I'm too tired to tell if this is a requirement, but the PDF shows it being done that way in Windows, so I've re-ordered the operations to match. Intuition would indicate that this shouldn't matter, as fs:1C shouldn't change from under us in the middle of execution, but why argue with public source code?
vicmarcal said it's okay if this breaks the build (I don't have any way to build ReactOS anymore).
Modified:
trunk/reactos/ntoskrnl/ke/thrdschd.c
Modified: trunk/reactos/ntoskrnl/ke/thrdschd.c
URL: http://svn.reactos.org/svn/reactos/trunk/reactos/ntoskrnl/ke/thrdschd.c?rev…
==============================================================================
--- trunk/reactos/ntoskrnl/ke/thrdschd.c [iso-8859-1] (original)
+++ trunk/reactos/ntoskrnl/ke/thrdschd.c [iso-8859-1] Tue Jun 21 22:43:55 2011
@@ -696,6 +696,28 @@
return OldAffinity;
}
+//
+// This macro exists because NtYieldExecution locklessly attempts to read from
+// the KPRCB's ready summary, and the usual way of going through KeGetCurrentPrcb
+// would require getting fs:1C first (or gs), and then doing another dereference.
+// In an attempt to minimize the amount of instructions and potential race/tear
+// that could happen, Windows seems to define this as a macro that directly acceses
+// the ready summary through a single fs: read by going through the KPCR's PrcbData.
+//
+// See http://research.microsoft.com/en-us/collaboration/global/asia-pacific/
+// programs/trk_case4_process-thread_management.pdf
+//
+// We need this per-arch because sometimes it's Prcb and sometimes PrcbData, and
+// because on x86 it's FS, and on x64 it's GS (not sure what it is on ARM/PPC).
+//
+#ifdef _M_IX86
+#define KiGetCurrentReadySummary() __readfsdword(FIELD_OFFSET(KIPCR, PrcbData.ReadySummary))
+#elif _M_AMD64
+#define KiGetCurrentReadySummary() __readgsdword(FIELD_OFFSET(KIPCR, Prcb.ReadySummary))
+#else
+#error Implement me!
+#endif
+
/*
* @implemented
*/
@@ -703,16 +725,23 @@
NTAPI
NtYieldExecution(VOID)
{
- NTSTATUS Status = STATUS_NO_YIELD_PERFORMED;
+ NTSTATUS Status;
KIRQL OldIrql;
- PKPRCB Prcb = KeGetCurrentPrcb();
- PKTHREAD Thread = KeGetCurrentThread(), NextThread;
+ PKPRCB Prcb;
+ PKTHREAD Thread; NextThread;
+
+ /* NB: No instructions (other than entry code) should preceed this line */
/* Fail if there's no ready summary */
- if (!Prcb->ReadySummary) return Status;
-
- /* Raise IRQL to synch */
+ if (!KiGetCurrentReadySummary()) return STATUS_NO_YIELD_PERFORMED;
+
+ /* Now get the current thread, set the status... */
+ Status = STATUS_NO_YIELD_PERFORMED;
+ Thread = KeGetCurrentThread();
+
+ /* Raise IRQL to synch and get the KPRCB now */
OldIrql = KeRaiseIrqlToSynchLevel();
+ Prcb = KeGetCurrentPrcb();
/* Now check if there's still a ready summary */
if (Prcb->ReadySummary)