Hi,
I have come to the conclusion that using -O2 is beneficial even for DBG = 1 builds, and that it should be enabled by default on all builds. The reason typically given for not using optimizations on a "Debug" build is that they apparently make assembly code harder to read. I have realized otherwise, and with the example that I include below, I'm sure we will all agree on this. I note the following advantages in using -O2 on a DBG = 1 build as well:
- -O2 makes the compiler do additional checks. For example, gcc will NOT detect uninitialized variables unless -O2 is being used, even though they are a very important programming bug. Apart from finding more bugs, it also makes trunk compilable. Right now, I see at least two commits by Thomas or others being made every week in order to fix some code which used uninitialized variables (I myself have been guilty of this). This means that some of us, like Thomas, have to constantly fix other people's mistakes.
- -O2 means fewer last-minute blockers. Because we release with -O2 but almost never build that way, this creates a big problem for people like Andrew or Brandon, who handle the release process and do testing. Because the -O2 build gets less testing coverage, it is very possible for a critical bug to be in ROS for a month before anyone notices it at release time, in which case we will all have to scramble to find a fix for it.
- -O2 will not undefine DBG or change anything else in the code. All the advantages, extra error checking and assertions of the DBG = 1 build would remain.
- -O2 builds are much faster, greatly helping testing speed.
- -O2 builds are much more likely to bring up race conditions and other important timing bugs we need to watch out for.
- -O2 means easier debugging. This point is really important, because until I realized how true it was, I didn't want to bring this up. Here is a pseudo (but real) disassembly of something I've seen in my DBG = 1 kernel binary while debugging:
0x40b845:
    push ebp
    mov ebp, esp
    sub esp, 4
    mov [ebp-4], fs:18h
    mov eax, [ebp-4]
    leave
    retn

0x4bc8a5:
    push ebp
    mov ebp, esp
    sub esp, 4
    call 0x40b845
    mov ecx, [eax+1c]
    mov [ebp-4], eax
    mov eax, [ebp-4]
    leave
    retn

0x42b845:
    push ebp
    mov ebp, esp
    sub esp, 4
    call 0x4bc8a5
    mov ecx, [eax+124]
    mov [ebp-4], eax
    mov eax, [ebp-4]
    leave
    retn

KeFooBar:
    push ebp
    mov ebp, esp
    sub esp, 4c
    call 0x42b845
    mov [ebp-0xc], eax
    mov eax, [ebp-0xc]
    <..>
    leave
    retn
This is how it looks with -O2:

KeFooBar:
    push ebp
    mov ebp, esp
    sub esp, 4c
    mov eax, fs:124h
    <..>
    leave
    retn
I hope we can all agree on which one of these is readable. The -O2 build clearly shows you that eax is fs:124h, which you ought to know is Prcb->CurrentThread; even if you don't, you can easily check in a header. The non-O2 build calls three other functions, two of which merely call other functions themselves (due to the lack of symbols, you have no way of knowing what these functions are doing), until you finally get to a function which reads fs:18h. You then realize that this is the PCR, walk back, and realize that Pcr->0x1c is the PRCB and Prcb->0x124 is the current thread.
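To make the call chain described above concrete, here is a minimal C sketch of the kind of layered accessors that could produce it. All names (MOCK_PCR, MockGetCurrentThread and so on) are hypothetical stand-ins, not actual ReactOS code, and a plain global pointer replaces the fs-relative access so the sketch stays portable. Built without optimization, gcc will typically emit each accessor as a separate call with its own stack frame; with -O2 it can usually inline the whole chain into a couple of loads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures;
   the real layouts and offsets are different. */
typedef struct MOCK_THREAD { int Id; } MOCK_THREAD;
typedef struct MOCK_PRCB { MOCK_THREAD *CurrentThread; } MOCK_PRCB;
typedef struct MOCK_PCR { MOCK_PRCB *Prcb; } MOCK_PCR;

/* Assumption: a global pointer stands in for the fs-relative PCR access. */
static MOCK_PCR *MockPcr = NULL;

void MockSetPcr(MOCK_PCR *Pcr)
{
    MockPcr = Pcr;
}

/* Innermost accessor: corresponds to the function that reads the PCR. */
MOCK_PCR *MockGetPcr(void)
{
    assert(MockPcr != NULL);
    return MockPcr;
}

/* Middle accessor: only calls the one above and follows one pointer. */
MOCK_PRCB *MockGetCurrentPrcb(void)
{
    return MockGetPcr()->Prcb;
}

/* Outer accessor: what a caller such as KeFooBar would use. */
MOCK_THREAD *MockGetCurrentThread(void)
{
    return MockGetCurrentPrcb()->CurrentThread;
}
```

Comparing the output of gcc -S with and without -O2 on a file like this is a quick way to see the difference for yourself.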
Yes, this example could easily be destroyed by saying "use a #define with inline assembly", but I can bring many more; we can't start using inline assembly everywhere... MSVC does an amazing job at optimizing these things, and even gcc isn't that bad, if only you let it. Code built without -O2 makes horrible usage of the stack, which forces you to memorize a lot more addresses than code which simply stores values in registers. Because humans are smart, the loops generated by -O2 are also much closer to what someone who understands assembly is used to (for example, the loop will use ecx, and not a stack variable that you need to memorize). I consider myself an expert on assembly coding, and I simply have great trouble reading non-O2 kernels, so how exactly does it help debugging?
In the end, I am convinced that the only disadvantage of using -O2 by default is that it will slightly increase build times. I don't think this increase is more than a minute or two, at most, for a complete build. If this issue is really critical to some people, then perhaps only core system files should use -O2 (kernel32, ntdll, ntoskrnl, csr, win32k, drivers, etc).
I know some of the developers on IRC are strongly for this, but I want to make sure I get a broader opinion.
Best regards, Alex Ionescu
Alex Ionescu wrote:
> - -O2 makes the compiler do additional checks. For example, gcc will NOT detect uninitialized variables unless -O2 is being used, even though they are a very important programming bug. Apart from finding more bugs, it also makes trunk compilable. Right now, I see at least two commits by Thomas or others being made every week in order to fix some code which used uninitialized variables (I myself have been guilty of this). This means that some of us, like Thomas, have to constantly fix other people's mistakes.
> - -O2 means less last-minute blockers. Because we release in -O2 but almost never build it like that, this creates a big problem for people like Andrew or Brandon, which handle the release process and do testing. Because the -O2 build gets less testing coverage, it is very possible for a critical bug to be in ROS for a month before anyone notices it at release time, in which case we will all have to scramble to find a fix for it.
> - -O2 will not undefine DBG or change anything else in the code. All the advantages, extra error checking and assertions of the DBG =1 build would remain.
> - -O2 builds are much faster, greatly helping testing speed.
> - -O2 builds are much more likely to bring up race conditions and other important timing bugs we need to watch out for.
> - -O2 means easier debugging. This point is really important because until I realized how true it was, I didn't want to bring this up. Here is a pseudo(but real) disassembly of something I've seen in my dbg = 1 kernel binary while debugging:
I've been building with -O3 exclusively for a long time, for most of these reasons. I've had to fix countless bugs/warnings in the past that were only exposed with optimizations enabled. So I really do support a switch to -O2/-O3, although people with slower development machines might not like this. Depending on the hardware and OS, the build time might increase dramatically. However, at least GCC 4.0.x still has the sibling call optimization bug, so -fno-optimize-sibling-calls should be used, at least for now. I'm not sure if it's been fixed in the 4.1 branch.
- Thomas
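The class of bug discussed above can be illustrated with a small sketch. FindFirstNonZero is a hypothetical helper, not ReactOS code. In GCC releases of that era, the uninitialized-variable warning relied on data-flow analysis that only ran when optimizing, so the mistake described in the comment would typically go unreported in an unoptimized build; newer compilers may behave differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the first non-zero entry,
   or -1 if there is none. */
int FindFirstNonZero(const int *Values, size_t Count)
{
    /* If this initializer were missing, Index would be read uninitialized
       whenever the loop finds nothing. That is the kind of bug which an
       optimizing build with warnings enabled is expected to flag. */
    int Index = -1;
    size_t i;

    assert(Values != NULL || Count == 0);

    for (i = 0; i < Count; i++)
    {
        if (Values[i] != 0)
        {
            Index = (int)i;
            break;
        }
    }

    return Index;
}
```

The broken variant would appear to work in every test where a non-zero entry exists, which is why such bugs tend to survive until somebody builds with optimizations and warnings turned on.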
> - -O2 means easier debugging. This point is really important because until I realized how true it was, I didn't want to bring this up. Here is a pseudo(but real) disassembly of something I've seen in my dbg = 1 kernel binary while debugging:
>
> ... Lots of code
>
> This is how it looks with -O2
>
> ... Far less code
>
> I hope we can all agree on which one of these is readable.
For me the most readable is the source form, not the disassembly. Compiling with optimization turned on means the correspondence between source and object forms is often broken, making tracing hard. Local variables are often kept in registers instead of on the stack and are much harder to inspect.
Since I seem to be just about the only one using a source-level debugger (I really cannot understand how the rest of you can live without it, but that's just my opinion <g>) I don't want to block the switch to -O2, but please create an easy way to switch it off.
GvG
Why not have three levels of DBG?
0 = DBG off
1 = as now
2 = -O2 with debug on

The release build should change from -Oz to -O3 to gain more speed. Size is not an issue; make it faster. Whether the exe file ends up larger or smaller, the larger one can be faster than the smaller one.

That is what I think.
----- Original Message ----- From: "Ge van Geldorp" gvg@reactos.org To: "'ReactOS Development List'" ros-dev@reactos.org Sent: den 4 January 2006 09:35 Subject: RE: [ros-dev] Optimization Proposal
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Magnus Olsen wrote:
> The release build should change from -Oz to -O3 to gain more speed. Size is not an issue; make it faster. Whether the exe file ends up larger or smaller, the larger one can be faster than the smaller one.
I read somewhere that sometimes "optimized for size" is actually faster than "optimized for speed" in modern processors.
-Oz does not optimize loops, for example by moving calculations that can be done outside a loop out of it, and other nice stuff. It feels like -Oz is slower than -O3. I use -O3 with DBG=1 and the standard settings with DBG=0, and it feels like the release build is slower than DBG=1 with -O3. -O3 produces slightly bigger exe files, but they are much better optimized for speed than with -Oz.
----- Original Message ----- From: "Thomas Weidenmueller" w3seek@reactos.com To: "ReactOS Development List" ros-dev@reactos.org Sent: den 4 January 2006 14:43 Subject: Re: [ros-dev] Optimization Proposal
Royce Mitchell III wrote:
> I read somewhere that sometimes "optimized for size" is actually faster than "optimized for speed" in modern processors.
Generally that's true. But I don't know if there's a noticeable difference in ReactOS ;)
- Thomas
No, we don't need more compile-time configurations, as they give us build problems. We need fewer.
Casper
-----Original Message----- From: ros-dev-bounces@reactos.org [mailto:ros-dev-bounces@reactos.org] On Behalf Of Magnus Olsen Sent: 4. januar 2006 10:48 To: ReactOS General List; ReactOS Development List Subject: Re: [ros-dev] Optimization Proposal
Ge van Geldorp wrote:
> > - -O2 means easier debugging. This point is really important because until I realized how true it was, I didn't want to bring this up. Here is a pseudo(but real) disassembly of something I've seen in my dbg = 1 kernel binary while debugging:
> >
> > ... Lots of code
> >
> > This is how it looks with -O2
> >
> > ... Far less code
> >
> > I hope we can all agree on which one of these is readable.
>
> For me the most readable is the source form, not the disassembly. Compiling with optimization turned on means the correspondence between source and object forms is often broken, making tracing hard. Local variables are often kept in registers instead of on the stack and are much harder to inspect.
>
> Since I seem to be just about the only one using a source-level debugger (I really cannot understand how the rest of you can live without it, but that's just my opinion <g>) I don't want to block the switch to -O2, but please create an easy way to switch it off.
>
> GvG
How are you source-level debugging ros?
+1 provided there is an easy way to switch that off. (which certainly will be easy :-)).
I'm using DBG builds all the time, and in fact having to build with -O2 sounds better to me.
WBR, Aleksey Bragin.
On Jan 4, 2006, at 11:35 AM, Ge van Geldorp wrote:
> Since I seem to be just about the only one using a source-level debugger (I really cannot understand how the rest of you can live without it, but that's just my opinion <g>) I don't want to block the switch to -O2, but please create an easy way to switch it off.
>
> GvG