Hi,
I am pleased to announce the long-awaited (or absolutely surprising)
release of PSEH3.
This complete, from-scratch rewrite takes ReactOS exception handling to a
new level. It comprises the following advancements:
- Use of enums instead of const variables for internal trylevel counting
and other mechanisms. Enums can be declared in a nested manner as well,
and only enums are treated by GCC as full compile-time constants, while
const variables are treated as dynamic and are only optimized away by
the compiler later. This generally results in better optimized code.
- No more use of dynamically sized arrays. Those make the compiler emit a
slow call to _chkstk. This could have been prevented in PSEH2 if enums
had been used instead of consts. The new code uses the same registration
frame for the top level and nested levels, which also simplifies the code.
- No more use of nested function stack trampolines. These are small code
chunks that are dynamically put on the stack when the address of
a nested function is taken. The address of the nested function then
points to this trampoline, which sets up a pointer for nested variable
access (nested functions can access variables in the parent frame). These
were parsed by PSEH2 to calculate the necessary nested frame pointer.
PSEH3 instead uses a static scope table that contains the address of the
nested function. This is only possible by first declaring the function
and putting the implementation after the address is stored in the static
table. This is because nested functions that do not access any variables
in the parent frame do not need a stack trampoline and their addresses
are thus considered compiler constants. GCC doesn't notice that a nested
function uses variables of the parent frame, before the function is
actually implemented, so with only a prototype, it assumes a constant
address. To get the value of the parent frame pointer, PSEH3 calls the
nested function twice: the first time with a bogus frame pointer and a
parameter that tells the function to return the (assumed) address of the
current registration frame. Without a proper parent frame pointer the
nested function will return an offset that we can use together with the
real address of the registration frame (which we know when calling the
filter or finally functions) to calculate the proper parent frame
pointer. The second call passes a different parameter and allows the
function to do the actual work and access variables in the parent frame.
- Improved performance due to use of "asm goto". This construct lets us
have labels that are never accessed by any C code without the compiler
removing them. So we no longer need the old hack of
comparing a "volatile" 0 against 0 just to keep GCC from optimizing
away a label. Even though there is a bug/deficiency/limitation in asm
goto that would usually not allow the combination with static C jump
labels, we can still do that thanks to some tricks that force GCC to do
the right thing. More information:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51840
- Improved performance due to faster registration. The new registration
code is smaller and saves only esp and ebp. Due to some tricks in PSEH3,
GCC is forced to assume all registers are being clobbered when the
__except block is run. Filter functions and __finally functions don't
make use of registers from the previous code. So all necessary register
saving and restoring is now done entirely by GCC.
- The trick to make GCC assume all registers are clobbered at the
beginning of the __except block has one additional huge advantage. Now
all variables that are accessed by the __except block will be stored in
memory locations, so that modifications in the __try block will be
available in the __except block, something that doesn't work with PSEH2.
This does not prevent the compiler from doing constant expression
elimination! So when you do 2 different constant assignments to
a variable in the __try block, expect them to be reduced to only
the later assignment. GCC does not know that execution can be
interrupted in the middle of the __try block between those 2
assignments. If you need that, you must use a volatile variable.
- Less stack usage. PSEH3 stores much less data on the stack. No
trampolines, only esp and ebp are saved, registration frames are smaller.
- No more need for _SEH2_YIELD(). This macro was used to make sure the
registration frames are properly unregistered when leaving the try or
except block using goto or return. This is not necessary anymore due to
the use of __attribute__((cleanup)). This attribute is attached to the
registration frame and causes the cleanup function to be called
automatically whenever the variable goes out of scope. This happens
when a goto or return is used in the code.
- Human-readable warnings when special SEH intrinsics, like
_exception_info() or _abnormal_termination(), are used outside of their
valid context.
- Cleaner and more readable code, so you won't go braindead when
trying to figure out what's going on. Header: < 300 lines of commented
code instead of 370 lines of uncommented code; lib: < 250 lines of
commented C code + < 70 lines of commented asm code instead of 290 lines
of uncommented C code + 120 lines of uncommented asm code.
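
To illustrate the enum point from the list above, here is a minimal
sketch of nested trylevel counting. The names try_level and
previous_level are made up for this mail and are not the identifiers
used in the PSEH3 header; the two-step shadowing is just one way to nest
enums.

```c
/* Hypothetical names, not the ones used in the PSEH3 header. */
enum { try_level = 0 };               /* file scope: outside of any __try */

int deepest_try_level(void)
{
    enum { previous_level = try_level };                /* 0 */
    {
        /* First nesting level: shadows the file scope enumerator. */
        enum { try_level = previous_level + 1 };        /* 1 */
        enum { previous_level = try_level };            /* 1 */
        {
            /* Second nesting level. */
            enum { try_level = previous_level + 1 };    /* 2 */

            /* A static array needs a constant bound. An enumerator
               qualifies; a "const int" would be rejected here in C. */
            static char slots[try_level];

            return (int)sizeof(slots);
        }
    }
}
```

Because the bound is a real compile-time constant, no dynamically sized
array and therefore no _chkstk call is involved.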
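
The frame pointer calculation described in the trampoline point can be
sketched as plain arithmetic. This is only a model: the function names
and the offset value are assumptions for illustration. In the real code
the offset is whatever GCC picked for the registration frame inside the
parent frame.

```c
#include <stdint.h>

/* Hypothetical offset of the registration frame inside the parent frame. */
#define REGISTRATION_OFFSET 0x24u

/* Stand-in for the first call: reports where the nested function
   believes the registration frame lives for a given frame pointer. */
static uintptr_t reported_registration(uintptr_t frame_pointer)
{
    return frame_pointer + REGISTRATION_OFFSET;
}

/* Given the real address of the registration frame, recover the
   parent frame pointer that the second call needs. */
uintptr_t recover_parent_frame(uintptr_t real_registration)
{
    /* With a bogus frame pointer of 0, the reported address is the offset. */
    uintptr_t offset = reported_registration(0);

    return real_registration - offset;
}
```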
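
For the "asm goto" point, a minimal sketch of how a label can be kept
alive without any C code jumping to it. This is not the PSEH3 macro
code; run_guarded and handler are made-up names, and the empty asm
template is an assumption chosen to keep the example portable.

```c
/* Hypothetical function; the empty asm template emits no instructions. */
int run_guarded(int value)
{
    /* Lists "handler" as a possible jump target, so GCC has to keep the
       label and the code behind it, although no C statement jumps there. */
    __asm__ goto ("" : : : "memory" : handler);

    return value * 2;

handler:
    /* Never reached in this sketch. In an SEH implementation something
       outside of the C control flow would transfer control here. */
    return -1;
}
```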
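
The __attribute__((cleanup)) mechanism that replaces _SEH2_YIELD() can be
sketched as follows. The struct, the counter and the function names are
hypothetical stand-ins; the real registration frame and unregister
routine look different.

```c
/* Hypothetical stand-in for the chain of registered frames. */
static int active_frames = 0;

struct frame { int registered; };

/* Called by GCC whenever a tagged variable goes out of scope. */
static void unregister_frame(struct frame *f)
{
    if (f->registered) {
        f->registered = 0;
        active_frames--;
    }
}

int active_frame_count(void)
{
    return active_frames;
}

/* Leaves the scope with return: the return value is evaluated first,
   then the cleanup function runs. */
int leave_with_return(void)
{
    struct frame f __attribute__((cleanup(unregister_frame))) = { 1 };

    active_frames++;
    return active_frames;
}

/* Leaves the scope with goto: the cleanup function runs on the way out. */
int leave_with_goto(void)
{
    {
        struct frame f __attribute__((cleanup(unregister_frame))) = { 1 };

        active_frames++;
        goto done;
    }
done:
    return active_frames;
}
```

In both cases the frame is unregistered without any explicit call at the
exit point, which is what makes a separate yield macro unnecessary.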
I also promised Amine to look into implementing a version that is
compatible with clang. Initially I planned to include it in this
release, but it turns out that clang massively lacks the features needed,
and a completely different approach will probably be used for that.
The new code is only compatible with GCC 4.7 and later, and it's also
currently disabled (cmake option USE_PSEH3).
Testbot showed no noticeable regressions.
Have fun.
Timo