When he merges with the tip I am really ready for 0.2.6
Thanks Steven
_______________________________________________
Ros-dev mailing list
Ros-dev@reactos.com
http://reactos.com:8080/mailman/listinfo/ros-dev
Do you want to delay 0.2.6 until I merge with tip? Otherwise it will be for 0.3.x/0.2.7.
Best regards, Alex Ionescu
Yes!!!!! Delay 0.2.6 until you're ready to merge!
One Vote to delay! 8^) James
Hi Alex!
Is this one of the problems your branch fixes? It sometimes happens after 12+ hours of operation, but it is always the same bug check!
Thanks, James
KeBugCheck at ob/object.c:1054
Bug detected (code 0 param 0 0 0 0)
The bug code is undefined. Please use an existing code instead.
Frames: <ntoskrnl.exe: ca0c> <ntoskrnl.exe: ca2c> <ntoskrnl.exe: 77c1d> <ntoskrnl.exe: 231d4> <ntoskrnl.exe: 779fa> <ntoskrnl.exe: 77b40> <ntoskrnl.exe: 77c8c> <ntoskrnl.exe: 2329f> <ntoskrnl.exe: 779fa> <ntoskrnl.exe: 77b40> <ntoskrnl.exe: 77c8c> <ntoskrnl.exe: 19647> <ntoskrnl.exe: 39ab> <7FFE0304>
It's someone referencing an object while it's being deleted... my branch probably doesn't fix it, but my Object Manager future branch will ;)
Best regards, Alex Ionescu
Hi,
Well, I'm doing the ROS-on-ROS build thing: I walk away from it, come back, and find the same bug check, the same 1054, every time!
Thanks, James
Can you build with DBG = 1 and give me the new stack trace? (it will have module file + line number in it).
Best regards, Alex Ionescu
Alex,
This is the stack trace (I guess) from your svn branch. I have two cmd instances running: the first is compiling ROS and the second is running ctm (Console Task Manager). You can see the plist at the bottom of the post.
http://rafb.net/paste/results/oqcaSV21.html
Thanks, James
Fixed, let me know how the new test goes.
Hi!
That passed four ROS-on-ROS builds, so I started taskmgr.exe and closed ctm and the second cmd. I started the fifth pass and it failed. It's the object.c bug, and it looks the same as on the main branch too.
http://rafb.net/paste/results/gupYXE57.html
Thanks, James
This test is the same as above, using taskmgr.exe while compiling ROS on ROS. The branch is HEAD (current SVN):
http://rafb.net/paste/results/qOCyyY29.html
8^) James
I believe this is due to a bug in the object parsing code, where it doesn't reference an object when walking into deeper levels. Could you please try my attached patch on SVN trunk and see if it fixes it?
Best Regards, Thomas
Index: cm/regobj.c
===================================================================
--- cm/regobj.c (revision 13818)
+++ cm/regobj.c (working copy)
@@ -212,11 +212,6 @@
           return(STATUS_REPARSE);
         }
     }
-
-  ObReferenceObjectByPointer(FoundObject,
-                             STANDARD_RIGHTS_REQUIRED,
-                             NULL,
-                             UserMode);
   }

   DPRINT("CmiObjectParse: %s\n", FoundObject->Name);
Index: ob/object.c
===================================================================
--- ob/object.c (revision 13818)
+++ ob/object.c (working copy)
@@ -439,14 +439,16 @@
       /* reparse the object path */
       NextObject = NameSpaceRoot;
       current = PathString.Buffer;
-
+    }
+
+  if (NextObject != NULL)
+    {
       ObReferenceObjectByPointer(NextObject,
                                  DIRECTORY_TRAVERSE,
                                  NULL,
                                  UserMode);
     }
-
-  if (NextObject == NULL)
+  else
     {
       break;
     }
I think the problem is a missing status check after a call to ObReferenceXXX. The object is being deleted, so the call returns an error and the object isn't referenced. Later the object gets one dereference call too many.
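A toy model of that failure mode, as a sketch only: the types and functions below (`toy_object`, `toy_reference`, `toy_dereference`) are hypothetical stand-ins, not ReactOS code or the real ObReferenceXXX API. It assumes a reference attempt fails, and leaves the count untouched, once deletion has started.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an object header; not the ReactOS layout. */
typedef struct {
    long ref_count;
    bool being_deleted;
} toy_object;

/* Toy analogue of a reference call: fails once deletion has started. */
static bool toy_reference(toy_object *obj)
{
    if (obj->being_deleted)
        return false;           /* count is left untouched on failure */
    obj->ref_count++;
    return true;
}

static void toy_dereference(toy_object *obj)
{
    obj->ref_count--;
}

/* Unchecked pattern: the result is ignored, the dereference always runs. */
static long use_object_unchecked(toy_object *obj)
{
    (void)toy_reference(obj);
    toy_dereference(obj);       /* one too many if the reference failed */
    return obj->ref_count;
}

/* Checked pattern: only drop a reference that was actually taken. */
static long use_object_checked(toy_object *obj)
{
    if (toy_reference(obj))
        toy_dereference(obj);
    return obj->ref_count;
}
```

With an object that starts at a count of 1 and is already being deleted, the unchecked variant leaves the count at 0 while the checked variant leaves it at 1.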
- Hartmut
Yes, although looking at the code without my changes, there obviously have to be too many dereferences. Just think of the case where the parse routine returns STATUS_SUCCESS and a pointer to the next object. In this case the ObFindObject routine doesn't reference it, because it assumes the parse routine does. That assumption is wrong and likely triggered the bug when handling symbolic links, which don't reference the next object pointer. In the next loop iteration it might find another next object and dereference the previous one, which in the case of symbolic links wasn't referenced. So every time objects were parsed but there were deeper levels still to be parsed, some objects might be dereferenced too often.
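One way to picture a balanced walk, as a rough sketch: `toy_node` and `toy_walk` are hypothetical names and this is not the ObFindObject code. It assumes the walker itself takes a reference on each next level before dropping the one on the current level, so no level is ever dereferenced without having been referenced first.

```c
#include <stddef.h>

/* Hypothetical path level with a reference count; not a ReactOS type. */
typedef struct toy_node {
    long ref_count;
    struct toy_node *child;     /* next path level, or NULL */
} toy_node;

/* Walk up to `depth` levels down from `root`.
 * The returned node carries one reference owned by the caller. */
static toy_node *toy_walk(toy_node *root, int depth)
{
    toy_node *current = root;

    current->ref_count++;               /* reference the starting level */
    while (depth-- > 0 && current->child != NULL) {
        toy_node *next = current->child;

        next->ref_count++;              /* reference the next level first */
        current->ref_count--;           /* then drop the previous level */
        current = next;
    }
    return current;
}
```

After a walk, every intermediate level is back at its original count and only the returned level holds one extra reference.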
I think my patch should fix this problem, so someone who can reproduce it should give it a try.
Sorry for this bad description but I'm a bit in a hurry.
Best Regards, Thomas
Hi,
I think your patch isn't correct. The parsing routine of an object must always reference the returned object, because another thread may try to remove this object. For a long time I have been searching for a bug which corrupts the registry and crashes ROS on my SMP machine. It was always triggered by the font substitution query routine from win32k. I've added a missing locking operation and moved the referencing into the locked region. This fixes my SMP crash and may also fix James's problem. My test condition is compiling ROS on ROS in one console and running ctm in another one.
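The pattern of looking up and referencing inside the same locked region can be sketched as follows. This is an illustration under assumptions, not the patch itself: it uses a POSIX mutex in place of the kernel resource lock, and `toy_key`, `toy_keys`, `toy_registry_lock` and `toy_find_and_reference` are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key object; not the ReactOS KEY_OBJECT. */
typedef struct {
    const char *name;
    long ref_count;
} toy_key;

static pthread_mutex_t toy_registry_lock = PTHREAD_MUTEX_INITIALIZER;
static toy_key toy_keys[] = { { "Machine", 1 }, { "User", 1 } };

/* Find a key by name and reference it before the lock is released,
 * so no other thread can remove it between lookup and reference. */
static toy_key *toy_find_and_reference(const char *name)
{
    toy_key *found = NULL;
    size_t i;

    pthread_mutex_lock(&toy_registry_lock);
    for (i = 0; i < sizeof toy_keys / sizeof toy_keys[0]; i++) {
        if (strcmp(toy_keys[i].name, name) == 0) {
            found = &toy_keys[i];
            found->ref_count++;     /* taken while still holding the lock */
            break;
        }
    }
    pthread_mutex_unlock(&toy_registry_lock);

    return found;                   /* NULL if no key matched */
}
```

A remover would have to take the same lock, so it either runs before the lookup (and the key is not found) or after the reference has been taken.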
- Hartmut
M:\Sandbox\ros_work\reactos>set SVN_EDITOR=notepad
M:\Sandbox\ros_work\reactos>d:\programme\subversion\bin\svn.exe diff ntoskrnl\cm
Index: ntoskrnl/cm/registry.c
===================================================================
--- ntoskrnl/cm/registry.c (revision 13819)
+++ ntoskrnl/cm/registry.c (working copy)
@@ -20,7 +20,6 @@

 POBJECT_TYPE CmiKeyType = NULL;
 PREGISTRY_HIVE CmiVolatileHive = NULL;
-KSPIN_LOCK CmiKeyListLock;

 LIST_ENTRY CmiHiveListHead;

@@ -336,7 +335,6 @@
   CmiVolatileHive->RootSecurityCell = RootSecurityCell;
 #endif

-  KeInitializeSpinLock(&CmiKeyListLock);

   /* Create '\Registry\Machine' key. */
   RtlInitUnicodeString(&KeyName,
Index: ntoskrnl/cm/regobj.c
===================================================================
--- ntoskrnl/cm/regobj.c (revision 13819)
+++ ntoskrnl/cm/regobj.c (working copy)
@@ -76,6 +76,9 @@
                 KeyName.Length);
   KeyName.Buffer[KeyName.Length / sizeof(WCHAR)] = 0;

+  /* Acquire hive lock */
+  KeEnterCriticalRegion();
+  ExAcquireResourceExclusiveLite(&CmiRegistryLock, TRUE);

   FoundObject = CmiScanKeyList(ParsedKey,
                                &KeyName,
@@ -91,6 +94,8 @@
                               Attributes);
       if (!NT_SUCCESS(Status) || (SubKeyCell == NULL))
         {
+          ExReleaseResourceLite(&CmiRegistryLock);
+          KeLeaveCriticalRegion();
           RtlFreeUnicodeString(&KeyName);
           return(STATUS_UNSUCCESSFUL);
         }
@@ -104,6 +109,9 @@
                                 &LinkPath);
           if (NT_SUCCESS(Status))
             {
+              ExReleaseResourceLite(&CmiRegistryLock);
+              KeLeaveCriticalRegion();
+
               DPRINT("LinkPath '%wZ'\n", &LinkPath);

               /* build new FullPath for reparsing */
@@ -152,6 +160,8 @@
                               (PVOID*)&FoundObject);
       if (!NT_SUCCESS(Status))
         {
+          ExReleaseResourceLite(&CmiRegistryLock);
+          KeLeaveCriticalRegion();
           RtlFreeUnicodeString(&KeyName);
           return(Status);
         }
@@ -179,7 +189,12 @@
           if (NT_SUCCESS(Status))
             {
               DPRINT("LinkPath '%wZ'\n", &LinkPath);
+
+              ExReleaseResourceLite(&CmiRegistryLock);
+              KeLeaveCriticalRegion();

+              ObDereferenceObject(FoundObject);
+
               /* build new FullPath for reparsing */
               TargetPath.MaximumLength = LinkPath.MaximumLength;
               if (EndPtr != NULL)
@@ -212,12 +227,9 @@
               return(STATUS_REPARSE);
             }
         }
-
-      ObReferenceObjectByPointer(FoundObject,
-                                 STANDARD_RIGHTS_REQUIRED,
-                                 NULL,
-                                 UserMode);
     }
+  ExReleaseResourceLite(&CmiRegistryLock);
+  KeLeaveCriticalRegion();

   DPRINT("CmiObjectParse: %s\n", FoundObject->Name);

@@ -274,6 +286,10 @@

   ObReferenceObject (ParentKeyObject);

+  /* Acquire hive lock */
+  KeEnterCriticalRegion();
+  ExAcquireResourceExclusiveLite(&CmiRegistryLock, TRUE);
+
   if (!NT_SUCCESS(CmiRemoveKeyFromList(KeyObject)))
     {
       DPRINT1("Key not found in parent list ???\n");
@@ -302,6 +318,9 @@

   ObDereferenceObject (ParentKeyObject);

+  ExReleaseResourceLite(&CmiRegistryLock);
+  KeLeaveCriticalRegion();
+
   if (KeyObject->NumberOfSubKeys)
     {
       KEBUGCHECK(REGISTRY_ERROR);
@@ -532,11 +551,9 @@
 CmiAddKeyToList(PKEY_OBJECT ParentKey,
                 PKEY_OBJECT NewKey)
 {
-  KIRQL OldIrql;

   DPRINT("ParentKey %.08x\n", ParentKey);

-  KeAcquireSpinLock(&CmiKeyListLock, &OldIrql);

   if (ParentKey->SizeOfSubKeys <= ParentKey->NumberOfSubKeys)
     {
@@ -568,7 +585,6 @@
                              NULL,
                              UserMode);
   NewKey->ParentKey = ParentKey;
-  KeReleaseSpinLock(&CmiKeyListLock, OldIrql);
 }


@@ -576,11 +592,9 @@
 CmiRemoveKeyFromList(PKEY_OBJECT KeyToRemove)
 {
   PKEY_OBJECT ParentKey;
-  KIRQL OldIrql;
   DWORD Index;

   ParentKey = KeyToRemove->ParentKey;
-  KeAcquireSpinLock(&CmiKeyListLock, &OldIrql);
   /* FIXME: If list maintained in alphabetic order, use dichotomic search */
   for (Index = 0; Index < ParentKey->NumberOfSubKeys; Index++)
     {
@@ -591,7 +605,6 @@
                           &ParentKey->SubKeys[Index + 1],
                           (ParentKey->NumberOfSubKeys - Index - 1) * sizeof(PKEY_OBJECT));
           ParentKey->NumberOfSubKeys--;
-          KeReleaseSpinLock(&CmiKeyListLock, OldIrql);

           DPRINT("Dereference parent key: 0x%x\n", ParentKey);

@@ -599,7 +612,6 @@
           return STATUS_SUCCESS;
         }
     }
-  KeReleaseSpinLock(&CmiKeyListLock, OldIrql);

   return STATUS_UNSUCCESSFUL;
 }
@@ -611,13 +623,12 @@
                ULONG Attributes)
 {
   PKEY_OBJECT CurKey;
-  KIRQL OldIrql;
   ULONG Index;
-
+  NTSTATUS Status;
+
   DPRINT("Scanning key list for: %wZ (Parent: %wZ)\n",
          KeyName, &Parent->Name);

-  KeAcquireSpinLock(&CmiKeyListLock, &OldIrql);
   /* FIXME: if list maintained in alphabetic order, use dichotomic search */
   for (Index=0; Index < Parent->NumberOfSubKeys; Index++)
     {
@@ -627,8 +638,7 @@
           if ((KeyName->Length == CurKey->Name.Length)
               && (_wcsicmp(KeyName->Buffer, CurKey->Name.Buffer) == 0))
             {
-              KeReleaseSpinLock(&CmiKeyListLock, OldIrql);
-              return CurKey;
+              break;
             }
         }
       else
@@ -636,13 +646,23 @@
           if ((KeyName->Length == CurKey->Name.Length)
               && (wcscmp(KeyName->Buffer, CurKey->Name.Buffer) == 0))
             {
-              KeReleaseSpinLock(&CmiKeyListLock, OldIrql);
-              return CurKey;
+              break;
             }
         }
     }
-  KeReleaseSpinLock(&CmiKeyListLock, OldIrql);

+  if (Index < Parent->NumberOfSubKeys)
+    {
+      Status = ObReferenceObjectByPointer(CurKey,
+                                          STANDARD_RIGHTS_REQUIRED,
+                                          NULL,
+                                          UserMode);
+      if (NT_SUCCESS(Status))
+        {
+          return CurKey;
+        }
+    }
+
   return NULL;
 }
My patch does reference returned objects; that's the whole point. This wasn't done for symbolic links, which caused too many dereferences. Alex Ionescu and James Tabor independently tested it: compiling ROS on ROS works fine, and all other known cases that triggered this crash also seem to be fixed. The patch is already in Alex's branch and is going to be merged to trunk. However, holding the registry lock while parsing might be a good idea; I should probably test on an SMP machine.
Best Regards, Thomas
I did some more investigation, and your patch does indeed look better and more correct, though I haven't tried it myself. I'll tell Alex to revert my changes from his branch.
Best Regards, Thomas