Hey everyone,
a few days ago, I started investigating a report in Bugzilla containing an application that raises an exception that goes unhandled, and the stack trace that gets printed is clearly corrupt.
The report is the following: http://www.reactos.org/bugzilla/show_bug.cgi?id=5684
At first I thought the context wasn't being recorded properly, since both EBP and ESP are reported as being 0 at the time the exception was raised. However, there are applications for which we show a correct stack trace in case of an unhandled exception, so this most likely wasn't the case.
My next thought was that we were somehow corrupting the context, but this wasn't the case either. The exception is also being dispatched nicely: exception handlers get the correct details.
Lastly, the UnhandledExceptionFilter seemed to be passing the context on to the stack trace printing function without altering it, so the problem isn't there either. So where else could something be happening? That's right, inside one of the exception handler functions.
Some of you may recognize the exception code (0xeedfade) in the bug report as a Delphi exception. I did too, so I fired up my copy of Delphi 6 and went digging in its exception handler functions.
Every occurrence of UnhandledExceptionFilter in its exception handler functions showed the context wasn't being passed to it, but instead the registration frame was. This happens in a very predictive way however, so creating a patch (read: filthy hack) that detects this and fixes up the context record was easy. It is heavily tied to exactly this kind of defect, no general case.
Running the application still triggers the exception of course, but now we get a reliable stack trace. A naive search in Bugzilla indicates at least the following reports are affected by this problem:
Bug 3602, bug 3927, bug 5123 and bug 5684.
I haven't tested them all, but the tell-tale signs are there.
What does one do when encountering such a bug in compiler? Check if it still exists in the latest version and file a bug report of course. I got confirmation that the bug is present in the recently released Delphi XE with the help of Marco van de Voort of the FreePascal project (shameless plug, I know).
A bug report has been filed with what I consider an extreme amount of detail and how to solve it in their product. Did I mention the report also contains a shameless plug to the ReactOS project?
Today, to my astonishment, I found another report on their site which.. you guessed it.. mentions this problem. This report appears to have been made in 2007, so guessing when (newly compiled) Delphi applications will finally start calling UnhandledExceptionFilter correctly is an exercise I leave to the reader.
Now to get to the question that some of you may already be asking, and is the purpose of this mail: Are we putting this hack into our codebase so we get a decent stack trace when an application messes up in the manner this mail is about, or not? Applications can mess this up in a lot of different ways, so it won't work on those.
I warmly invite everyone to weigh in on this attempt of a discussion (which I hope will be productive), and hope people will see (again) that not everything that goes wrong in ReactOS is the fault of the OS.
References for those interested in the nasty details:
Bug report: http://qc.embarcadero.com/wc/qcmain.aspx?d=89377
Patch:
Index: dll/win32/kernel32/except/except.c =================================================================== --- dll/win32/kernel32/except/except.c (revision 49448) +++ dll/win32/kernel32/except/except.c (working copy) @@ -218,6 +218,13 @@ PEXCEPTION_RECORD ExceptionRecord = ExceptionInfo->ExceptionRecord; PCONTEXT ContextRecord = ExceptionInfo->ContextRecord;
+ /* Attempt to hack around bad calls of UnhandledExceptionFilter */ + if ((ULONG_PTR) (ExceptionRecord->ExceptionAddress) != (ULONG_PTR) (ContextRecord->Eip)) + { + DPRINT1("Bogus context detected, using context record HACK!!\n"); + ContextRecord = (PCONTEXT) ExceptionInfo[1].ExceptionRecord; + } + /* Print a stack trace. */ DbgPrint("Unhandled exception\n"); DbgPrint("ExceptionCode: %8x\n", ExceptionRecord->ExceptionCode); @@ -415,11 +422,17 @@
if (dwExceptionCode == 0xeedface) { - DPRINT1("Delphi Exception at address: %p\n", ExceptionRecord.ExceptionInformation[0]); + DPRINT1("Delphi 2 Exception at address: %p\n", ExceptionRecord.ExceptionInformation[0]); DPRINT1("Exception-Object: %p\n", ExceptionRecord.ExceptionInformation[1]); DPRINT1("Exception text: %s\n", ExceptionRecord.ExceptionInformation[2]); }
+ if (dwExceptionCode == 0xeedfade) + { + DPRINT1("Delphi Exception at address: %p\n", ExceptionRecord.ExceptionInformation[0]); + DPRINT1("Exception-Object: %p\n", ExceptionRecord.ExceptionInformation[1]); + } + /* Trace the wine special error and show the modulename and functionname */
if (dwExceptionCode == 0x80000100 /*EXCEPTION_WINE_STUB*/) {
WBR,
Roel Messiant
First thanks for your detailed analysis. I have looked into this problem some time ago as well and I found a website, which documents this behaviour here: http://www.delphi-forum.de/viewtopic.php?p=290872&sid=90dd5488f7ea8262bc... (German, sorry)
The parameters passed for these delphi exceptions are different and they are only to be handled by the delphi exception handler, which is triggered after the first chance exception and should display a message box or something and terminate the program.
Your suggestion to "hack" this into reactos seems reasonable, and I wouldn't consider this a hack, but rather a feature.
The best way to handle this is before it get's forwarded to the kernel debugger, so probably in KiDipatchException. It just has to make sure that the user mode exception handler receives the original parameters.
Regards, Timo
I think you're talking about the ExceptionInformation part of the EXCEPTION_RECORD structure? Yeah, Delphi can of course put its own data in there (the 7 parameters as mentioned in the link), but that's not really what I was talking about.
All I found that goes wrong is the Delphi exception handler functions don't pass the CONTEXT structure to the UnhandledExceptionFilter function, which they really should do.
I get the impression you think the exception handlers get called with the wrong arguments, but that's not the case. We don't do anything wrong, just the application.
WBR,
Roel Messiant
I have looked into this again in a little more detail. What I think is happening is the following:
in delphi an exception is raised.
Delphi.Sytstem._RaiseExcept RaiseException RtlRaiseException RtlDispatchException RtlpExecuteHandlerForExecption Delphi.System._ExceptionHandler UnhandledExceptionFilter
EXCEPTION_DISPOSITION __cdecl Delphi.System.ExceptionHandler( struct _EXCEPTION_RECORD *ExceptionRecord, void * EstablisherFrame, struct _CONTEXT *ContextRecord, void * DispatcherContext );
Source code can be found here (search for _RaiseExcept, _ExceptionHandler): http://www.getunderstand.com/documents/sample_reports/udelphi_example_report...
what you can see in the code for _ExceptionHandler is 10680 LEA EAX,[ESP+4] 10681 PUSH EAX 10682 CALL UnhandledExceptionFilter
This is an unmodified esp, so esp points to the return address and esp+4 points to the function parameters so it's effectively doing something like
|EXCEPTION_POINTER ExceptionPointers; ||||ExceptionPointers||||.||ExceptionRecord| = ExceptionRecord;| ExceptionPointers||||.||||ContextRecord| = EstablisherFrame; UnhandledExceptionFilter(&|ExceptionPointers|);
It is done that way in multiple locations (line 9641, 9666 (push / esp+8), 10028, 10064, ...) so It's obviously done intentionally.
Correct, that's exactly the same as what I found, and the bug report I filed contains pretty much the same explanation. I suspect the coder(s) of their exception handler functions thought the first 2 parameters should be passed to UnhandledExceptionFilter. Intentional or not however, this doesn't make them correct.
The only reason I think they got away with it is because UnhandledExceptionFilter doesn't tend to use the context, but when it does.. fireworks.
As for "our way around it", I hope it's clear now that this hack/feature would belong in UnhandledExceptionFilter or maybe the function it calls, PrintStackTrace.
I've also noticed the additional debug output you arranged today when the exception code is 0xeedfade. You can't reuse the 0xeedface branch for this, it tends to crash on printing the exception text then. A separate if branch would be needed.
0xeedface was the Delphi 2 exception code, and I suspect the semantics of the ExceptionInformation parameters changed after that (which would explain why they changed the code to 0xeedfade in the first place).
WBR,
Roel Messiant