Care to explain?
If it can be optimized the wrong way -- use a pragma or fix the compiler.
I hate these kinds of patches. I think Timo would agree.
Best regards, Alex Ionescu
On Fri, Sep 30, 2011 at 5:03 PM, dgorbachev@svn.reactos.org wrote:
Author: dgorbachev Date: Fri Sep 30 21:03:29 2011 New Revision: 53907
URL: http://svn.reactos.org/svn/reactos?rev=53907&view=rev Log: [NTOSKRNL] Use inline asm in KiIsNpxErrataPresent(). C code can be optimized in a wrong way.
Modified: trunk/reactos/ntoskrnl/ke/i386/cpu.c
Modified: trunk/reactos/ntoskrnl/ke/i386/cpu.c URL: http://svn.reactos.org/svn/reactos/trunk/reactos/ntoskrnl/ke/i386/cpu.c?rev=...
============================================================================== --- trunk/reactos/ntoskrnl/ke/i386/cpu.c [iso-8859-1] (original) +++ trunk/reactos/ntoskrnl/ke/i386/cpu.c [iso-8859-1] Fri Sep 30 21:03:29 2011 @@ -1254,9 +1254,9 @@ INIT_FUNCTION KiIsNpxErrataPresent(VOID) {
- BOOLEAN ErrataPresent;
- static double Value1 = 4195835.0, Value2 = 3145727.0;
- INT ErrataPresent; ULONG Cr0;
volatile double Value1, Value2;
/* Disable interrupts */ _disable();
@@ -1269,9 +1269,30 @@ Ke386FnInit();
/* Multiply the magic values and divide, we should get the result back*/
- Value1 = 4195835.0;
- Value2 = 3145727.0;
- ErrataPresent = (Value1 * Value2 / 3145727.0) != 4195835.0;
+#ifdef __GNUC__
- __asm__ __volatile__
- (
"fldl %1\n\t""fdivl %2\n\t""fmull %2\n\t""fldl %1\n\t""fsubp\n\t""fistpl %0\n\t": "=m" (ErrataPresent): "m" (Value1),"m" (Value2)- );
+#else
- __asm
- {
fld Value1fdiv Value2fmul Value2fld Value1fsubpfistp ErrataPresent- };
+#endif
/* Restore CR0 */ __writecr0(Cr0);@@ -1280,7 +1301,7 @@ _enable();
/* Return if there's an errata */
- return ErrataPresent;
- return ErrataPresent != 0;
}
VOID
Am 01.10.2011 00:51, schrieb Alex Ionescu:
Care to explain?
If it can be optimized the wrong way -- use a pragma or fix the compiler.
I hate these kinds of patches. I think Timo would agree.
Best regards, Alex Ionescu
I agree with Dmitri that the previous code might not do what its supposed to. On the other hand I would prefer a C solution.
The compiler could make ErrataPresent = ((Value1 * Value2) != (3145727.0 * 4195835.0)); out of it, since that's the same.
A better way of doing it would be:
volatile double Value1 = 4195835.0, Value2 = 3145727.0, Value3; Value3 = Value1 / Value2; ErrataPresent = Value3 != 1.333820449;
Anyway, this is a classical reactos discussion. While there are tons of things to do, we fix and discuss some completely obsolete bs. We have 2011! These machines are almost 20 years old and probably only exist in a museum!
The fun thing is, we're not the only people. Here's an FAQ from 2011 with the same topic: http://www.trnicely.net/pentbug/pentbug.html
Regards, Timo
To continue a classical reactos discussion, is there any reason why 32-bit versions of __writecr0, etc. use 64-bit arguments? It just causes that worse code is generated.
Am 01.10.2011 20:39, schrieb Dmitry Gorbachev:
To continue a classical reactos discussion, is there any reason why 32-bit versions of __writecr0, etc. use 64-bit arguments? It just causes that worse code is generated.
Thats what MSDN says. But that might be wrong. There are no headers that contain it and the compiler doesn't care. So feel free to make it uintptr_t
Am 01.10.2011 22:58, schrieb Timo Kreuzer:
Thats what MSDN says. But that might be wrong. There are no headers that contain it and the compiler doesn't care. So feel free to make it uintptr_t
Thomas has found a declaration in VS10 headers. and it says its unsigned int for x86 and unsigned __int64 for x64
It is not a compiler fault. The compiler does here two legitimate things:
1) Math can be moved outside _disable/_enable/__writecr0. It can be fixed by making other local vars volatile.
2) Some time ago, you suggested to look into building with -Ofast option. So I went agead and compiled with it. GCC replaces division by multiplication, which causes some loss of precision; thus, the test fails. This, however, works:
ErrataPresent = (Value1 / Value2 * 3145727.0) != 4195835.0;
I hate these kinds of patches.
I do not see a reason for grievance. This handwritten piece is shorter than compiler-generated code, it is only for i386, there is no need to port it to arm and amd64, not speaking already that it's just for "some completely obsolete bs".
Those three intrinsics should be barriers -- why is it moving the math outside?
If -Ofast is a problem, that function simply needs a pragma to disable optimization.
Best regards, Alex Ionescu
On Sat, Oct 1, 2011 at 12:38 PM, Dmitry Gorbachev d.g.gorbachev@gmail.comwrote:
- Math can be moved outside _disable/_enable/__writecr0. It can be
fixed by making other local vars volatile.
Those three intrinsics should be barriers -- why is it moving the math outside?
The math does not read or write any memory, and the result lies in a register. It can be lawfully moved across the barriers.
So, as I understand, you want C code back?
On Oct 1, 2011, at 10:39 PM, Dmitry Gorbachev wrote:
Those three intrinsics should be barriers -- why is it moving the math outside?
The math does not read or write any memory, and the result lies in a register. It can be lawfully moved across the barriers.
So, as I understand, you want C code back?
Since this code is not portable at all by definition, the only reason to make it C would be to support more compilers. However if compiler becomes a problem here (that was the root of the problem), I would advise to make an exception and leave the new code in assembly.
The compiler is not the problem here, the code is.