Hi!
Since I have seen mesa32 crash with SSE enabled, I looked into fixing it a bit. While looking at ntoskrnl/ke/i386/tskswitch.S I found FIXMEs for debug/FPU save/restore. Now I am wondering where this information should be stored. I think it should be part of the TSS (include/ntos/tss.h), but I am not sure. For SSE support, the SSE registers would also have to be saved and restored on task switches, and probably some other things I do not know about yet. Maybe somebody can tell me where the FPU state should be stored so I can try to implement it.
Thanks, blight
Hi,
The FPU registers should be stored in a reserved area at the top of the kernel stack. Two areas are needed, one for the user mode state and one for the kernel mode state. When a thread starts, the FPU is disabled. The first FPU instruction raises an exception. The exception handler sets a mark in the thread structure (NpxFlags?) and restores a saved state if one exists. It must check whether to restore the user mode or the kernel mode state. On a thread switch, the mark of the current thread is checked and the FPU state is saved if necessary. The FPU is then disabled for the new thread. Another problem is the SSE instructions. The BIOS may or may not enable them. Currently, ros doesn't enable the SSE instructions. If the cpu is set to pentium in the makefile and optimisation is enabled, gcc will always use SSE instructions.
- Hartmut
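The scheme described above can be modelled in plain C roughly as follows. This is only a user-space sketch under assumptions: sim_thread, sim_cpu, on_fpu_trap and on_thread_switch are made-up names, the fpu_disabled flag stands in for the real "FPU disabled" state, memcpy stands in for the real save/restore instructions, and only one save area per thread is modelled instead of the two (user and kernel mode) mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SAVE_AREA_SIZE 512   /* assumed size, large enough for either format */

typedef struct {
    unsigned char save_area[SAVE_AREA_SIZE]; /* stands in for the stack-top area */
    bool has_saved_state;  /* a state was saved on an earlier switch */
    bool used_fpu;         /* the "mark": FPU touched since the last switch */
} sim_thread;

typedef struct {
    bool fpu_disabled;                   /* models the FPU being disabled */
    unsigned char regs[SAVE_AREA_SIZE];  /* models the live FPU registers */
} sim_cpu;

/* The first FPU instruction of a thread after a switch ends up here. */
static void on_fpu_trap(sim_cpu *cpu, sim_thread *t)
{
    assert(cpu != NULL && t != NULL);
    cpu->fpu_disabled = false;
    if (t->has_saved_state)
        memcpy(cpu->regs, t->save_area, SAVE_AREA_SIZE);  /* restore */
    else
        memset(cpu->regs, 0, SAVE_AREA_SIZE);             /* fresh state */
    t->used_fpu = true;                                   /* set the mark */
}

/* Thread switch: save only if the outgoing thread carries the mark. */
static void on_thread_switch(sim_cpu *cpu, sim_thread *old_thread)
{
    assert(cpu != NULL && old_thread != NULL);
    if (old_thread->used_fpu) {
        memcpy(old_thread->save_area, cpu->regs, SAVE_AREA_SIZE);
        old_thread->has_saved_state = true;
        old_thread->used_fpu = false;
    }
    cpu->fpu_disabled = true;  /* the new thread starts with the FPU disabled */
}
```

A thread that never touches the FPU never takes the trap, so nothing is saved or restored for it.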
-----Original Message----- From: ros-dev-bounces@reactos.com [mailto:ros-dev-bounces@reactos.com] On Behalf Of Gregor Anich Sent: Friday, October 08, 2004 6:02 PM To: ros-dev@reactos.com Subject: [ros-dev] Save FPU on task switch/SSE support
_______________________________________________ Ros-dev mailing list Ros-dev@reactos.com http://reactos.com:8080/mailman/listinfo/ros-dev
Hi again ;)
Do you have any idea where the SSE registers should be stored? The thing with the flag for the FPU state is nice, since it allows skipping the FPU save when it is not needed. Do you know if we can do the same for SSE?
Thanks, blight
-----Original Message----- From: ros-dev-bounces@reactos.com [mailto:ros-dev-bounces@reactos.com] On Behalf Of Gregor Anich Sent: Friday, October 08, 2004 7:50 PM To: ReactOS Development List Subject: Re: [ros-dev] Save FPU on task switch/SSE support
We can do the same for SSE. If SSE is available we must use fxsave/fxrstor instead of fsave/frstor, and the buffer is 512 bytes (108 bytes for the plain FPU).
- Hartmut
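As a small illustration of the two buffer sizes, the choice could be wrapped like this. The names (npx_save_area_size, npx_buffer_ok, the enum constants) are hypothetical. The 16-byte alignment requirement of fxsave is not mentioned in the mail above; it is added here from the instruction's documented behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    FSAVE_AREA_SIZE  = 108,  /* fsave/frstor image (plain FPU) */
    FXSAVE_AREA_SIZE = 512,  /* fxsave/fxrstor image (FPU + SSE) */
    FXSAVE_ALIGNMENT = 16    /* fxsave needs a 16-byte aligned buffer */
};

static_assert(FXSAVE_AREA_SIZE % FXSAVE_ALIGNMENT == 0,
              "consecutive fxsave areas stay aligned");

/* Size of the per-thread save buffer, depending on CPU support. */
static size_t npx_save_area_size(int cpu_has_fxsr)
{
    return cpu_has_fxsr ? FXSAVE_AREA_SIZE : FSAVE_AREA_SIZE;
}

/* Only the fxsave format imposes an alignment requirement. */
static int npx_buffer_ok(const void *buf, int cpu_has_fxsr)
{
    if (!cpu_has_fxsr)
        return 1;
    return ((uintptr_t)buf % FXSAVE_ALIGNMENT) == 0;
}
```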
Hi,
As I added the KPRCB I noticed the last field is called NpxSaveArea and contains the last FPU/SSE state, iirc.
You might want to save it there...but what troubles me is that the KPRCB is a CPU-wide structure, not thread-limited.
Best regards, Alex Ionescu
-----Original Message----- From: ros-dev-bounces@reactos.com [mailto:ros-dev-bounces@reactos.com] On Behalf Of Alex Ionescu Sent: Friday, October 08, 2004 11:48 PM To: ReactOS Development List Subject: Re: [ros-dev] Save FPU on task switch/SSE support
For each thread we need two save areas (user and kernel mode) because win32k and FreeType use the FPU. This is different from Windows. On a callback from kernel to user mode we need more than these two save areas for a thread. The save area in the kpcr structure is for entering/leaving standby mode.
- Hartmut
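One possible way to reserve the two areas at the top of a kernel stack is sketched below. Everything here is an assumption for illustration: the names, the order of the areas, and the use of 512-byte, 16-byte-aligned areas. The extra areas needed for kernel-to-user callbacks are not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define NPX_AREA_SIZE  512u   /* assumed: one fxsave-sized area */
#define NPX_AREA_ALIGN 16u    /* assumed alignment of each area */

typedef struct {
    uintptr_t user_area;    /* save area for the user mode state */
    uintptr_t kernel_area;  /* save area for the kernel mode state */
    uintptr_t initial_sp;   /* first usable stack address below the areas */
} npx_stack_layout;

/* Carve both areas out of the top of a downward-growing kernel stack. */
static npx_stack_layout npx_reserve_areas(uintptr_t stack_top)
{
    npx_stack_layout l;
    uintptr_t top = stack_top & ~(uintptr_t)(NPX_AREA_ALIGN - 1);

    assert(top >= 2u * NPX_AREA_SIZE);
    l.user_area   = top - NPX_AREA_SIZE;
    l.kernel_area = top - 2u * NPX_AREA_SIZE;
    l.initial_sp  = l.kernel_area;
    return l;
}
```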
When were the processors that supported this first available? If they weren't available for Microsoft to include support for them in Win95 or 98, what happens when those instructions are used in programs under those windows versions?
-ShadowFlare
Considering that parts of NT4 already supported IA64, I think it's safe to assume that Win95 most probably supported even SSE.
As for what happens when a program with SSE instructions is executed on a non-SSE CPU, I believe the opcodes will be invalid and generate GPFs.
Best regards, Alex Ionescu
IIRC Win95 and NT4 knew nothing but plain FPU. fxsave & co were introduced with the introduction of WDM, in Win98 and NT5.
/Mike
If that's the case, would running multiple programs that try to use these other types of instructions cause random crashes?
-ShadowFlare
If that were the case, Intel would have broken backwards compatibility, which I think they usually do not do. If an OS does not know about a feature, it usually does not enable it, and in most cases the processor then behaves as it would without the feature (since it's disabled ;-)
CR4 (control register 4) has a bit called OSFXSR, which means "Operating System support for FXSAVE and FXRSTOR". If it is not set, SSE instructions are disabled.
From the IA-32 manual: "When set, this flag performs the following functions: (1) indicates to software that the operating system supports the use of the FXSAVE and FXRSTOR instructions, (2) enables the FXSAVE and FXRSTOR instructions to save and restore the contents of the XMM and MXCSR registers along with the contents of the x87 FPU and MMX registers, and (3) enables the processor to execute any of the SSE/SSE2/SSE3 instructions, with the exception of the PAUSE, PREFETCHh, SFENCE, LFENCE, MFENCE, MOVNTI, and CLFLUSH. If this flag is clear, the FXSAVE and FXRSTOR instructions will save and restore the contents of the x87 FPU and MMX instructions, but they may not save and restore the contents of the XMM and MXCSR registers. Also, if this flag is clear, the processor will generate an invalid opcode exception (#UD) whenever it attempts to execute an SSE/SSE2/SSE3 instruction, with the exception of the PAUSE, PREFETCHh, SFENCE, LFENCE, MFENCE, MOVNTI, and CLFLUSH. The operating system or executive must explicitly set this flag."
You can get the PDFs at http://developer.intel.com/design/PentiumIII/documentation.htm#manuals (IA-32 Intel Architecture Software Developer’s Manual Volume 1, 2A, 2B, 3)
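To make the bit positions concrete, here is a small sketch that works on plain integers instead of the real registers, so it can be tried in user mode. The function names are made up. The bit numbers (CPUID leaf 1 EDX bit 24 for FXSR and bit 25 for SSE, CR4 bit 9 for OSFXSR) come from the IA-32 manual; reading CPUID and writing CR4 for real is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_EDX_FXSR (1u << 24)  /* CPU supports fxsave/fxrstor */
#define CPUID1_EDX_SSE  (1u << 25)  /* CPU supports SSE */
#define CR4_OSFXSR      (1u << 9)   /* OS supports fxsave/fxrstor */

/* Return the CR4 value an OS would load: OSFXSR is only set when the
   CPU reports both FXSR and SSE. */
static uint32_t cr4_enable_sse(uint32_t cr4, uint32_t cpuid1_edx)
{
    const uint32_t needed = CPUID1_EDX_FXSR | CPUID1_EDX_SSE;

    if ((cpuid1_edx & needed) == needed)
        return cr4 | CR4_OSFXSR;
    return cr4;
}

/* With OSFXSR clear, SSE instructions raise #UD (per the quoted text). */
static bool sse_raises_ud(uint32_t cr4)
{
    return (cr4 & CR4_OSFXSR) == 0;
}
```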
Hi!
I have thought about it a bit more and talked to tamlin a bit. Here is how I think some things should be done, and some things I am not sure about:
1) Initialize NpxState of KTHREAD to cr0 & (MP | TS | EM) in Ke386InitThreadWithContext (maybe in Ke386InitThread too? I don't think so...) and change FLOATING_SAVE_AREA to FX_SAVE_AREA in that function - ke/i386/thread.c
2) In Ki386ContextSwitch we save cr0 & (MP | TS | EM) into the old thread's NpxState, and put the NpxState of the new thread or'ed with TS into cr0 (this could be skipped, we could just set TS unless there was a reason to run some processes with EM set and some without; MP should never change I think - but this way is safer) - ke/i386/tskswitch.S
3) In KiTrapHandler, handle the device-not-present exception in a special way if TS in cr0 is set (or maybe assert that TS is set?) - can the device-not-present fault happen for non-FPU/SSE/MMX code? - ke/i386/exp.c
I am not sure when to save the FPU state. I think it has to be saved when the thread enters Ki386ContextSwitch with TS not set, which means the thread has used the FPU. Then the new thread runs with TS set, and its first FPU instruction ends up in KiTrapHandler, where we handle it specially: we restore the previous (or initial) FPU state (FX_SAVE_AREA) from the top of the kernel stack, clear TS and let the thread run until it reaches Ki386ContextSwitch again, where the FPU state is saved if TS is clear.
The bad part is that we will always have to save it if it was used, I think, even if no other program uses the FPU. At first I thought we could save the old thread's state in the exception handler for the new thread's first FPU opcode (the previous thread is saved somewhere, isn't it?) and then restore the new thread's previous state. But this wouldn't work if one thread used the FPU, then we switched to another thread (which doesn't use the FPU) and then to yet another one that does use the FPU: the previous thread would then not be the first thread but the second one, which didn't use the FPU.
I'd appreciate some comments on this.
Thanks, blight
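The cr0 masking from points 1) and 2) could look like this in C. The helper names are hypothetical. The bit positions (MP bit 1, EM bit 2, TS bit 3) are from the IA-32 manual. One detail worth noting if this is written in C: & binds tighter than |, so the mask has to be parenthesised.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_MP (1u << 1)  /* monitor coprocessor */
#define CR0_EM (1u << 2)  /* emulation */
#define CR0_TS (1u << 3)  /* task switched */
#define CR0_NPX_MASK (CR0_MP | CR0_EM | CR0_TS)

/* Point 1 and the first half of point 2: keep only the FPU-related bits. */
static uint32_t npx_state_from_cr0(uint32_t cr0)
{
    return cr0 & CR0_NPX_MASK;
}

/* Second half of point 2: put the new thread's bits into cr0, with TS
   forced on so that its first FPU instruction traps. */
static uint32_t cr0_for_new_thread(uint32_t cr0, uint32_t npx_state)
{
    assert((npx_state & ~CR0_NPX_MASK) == 0);
    return (cr0 & ~CR0_NPX_MASK) | npx_state | CR0_TS;
}
```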
-----Original Message----- From: ros-dev-bounces@reactos.com [mailto:ros-dev-bounces@reactos.com] On Behalf Of Gregor Anich Sent: Sunday, October 10, 2004 2:13 AM To: ReactOS Development List Subject: Re: [ros-dev] Save FPU on task switch/SSE support
- Initialize NpxState of KTHREAD to cr0 & (MP | TS | EM) in Ke386InitThreadWithContext (maybe in Ke386InitThread too? I don't think so...) and change FLOATING_SAVE_AREA to FX_SAVE_AREA in that function - ke/i386/thread.c
MP, EM and NE should be set at startup and never change again. NpxState should be initialised so that the first FPU/SSE instruction triggers the initialisation of the FPU. NpxState contains one status for kernel mode and one for user mode. The status is one of:
- context is invalid, FPU must be initialized
- context is valid, FPU state must be restored
- FPU has executed an instruction, FPU state must be saved
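The three statuses and the per-mode pair could be written down like this. It is only a sketch: the type and constant names are invented, and how the real NpxState field would encode them is left open.

```c
#include <assert.h>

typedef enum {
    NPX_INVALID,  /* context is invalid, FPU must be initialized */
    NPX_VALID,    /* context is valid, FPU state must be restored */
    NPX_EXECUTE   /* FPU has executed an instruction, state must be saved */
} npx_status;

typedef enum { NPX_ACT_INIT, NPX_ACT_RESTORE, NPX_ACT_SAVE } npx_action;

typedef struct {
    npx_status user;    /* status of the user mode context */
    npx_status kernel;  /* status of the kernel mode context */
} npx_state;

/* A new thread starts with both contexts invalid, so the first FPU/SSE
   instruction in either mode triggers initialisation. */
static npx_state npx_state_init(void)
{
    npx_state s = { NPX_INVALID, NPX_INVALID };
    return s;
}

/* What has to happen to a context the next time it is touched. */
static npx_action npx_required_action(npx_status s)
{
    assert(s == NPX_INVALID || s == NPX_VALID || s == NPX_EXECUTE);
    if (s == NPX_INVALID)
        return NPX_ACT_INIT;
    return (s == NPX_VALID) ? NPX_ACT_RESTORE : NPX_ACT_SAVE;
}
```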
- In Ki386ContextSwitch we save cr0 & (MP | TS | EM) into the old thread's NpxState, and put the NpxState of the new thread or'ed with TS into cr0 (this could be skipped, we could just set TS unless there was a reason to run some processes with EM set and some without; MP should never change I think - but this way is safer) - ke/i386/tskswitch.S
In Ki386ContextSwitch we must check NpxState:
a) user mode has executed an FPU instruction -> save user mode context, set NpxState to 'user mode valid', set TS and go on
b) kernel mode has executed an FPU instruction -> save kernel mode context, set NpxState to 'kernel mode valid', set TS and go on
c) no FPU instruction was executed, TS is already set
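Cases a) to c) translate into a short function. This is a model only: the save itself is replaced by a counter, the names are invented, and it assumes that at most one of the two contexts is in the 'execute' state at any time.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NPX_INVALID, NPX_VALID, NPX_EXECUTE } npx_status;

typedef struct {
    npx_status user, kernel;
    int user_saves, kernel_saves;  /* counters standing in for real saves */
} sim_thread;

/* NpxState check for the outgoing thread; returns the TS value that
   should be in effect when the new thread starts running. */
static int context_switch_npx(sim_thread *old_thread)
{
    assert(old_thread != NULL);
    if (old_thread->user == NPX_EXECUTE) {            /* case a) */
        old_thread->user_saves++;
        old_thread->user = NPX_VALID;
    } else if (old_thread->kernel == NPX_EXECUTE) {   /* case b) */
        old_thread->kernel_saves++;
        old_thread->kernel = NPX_VALID;
    }
    /* case c): nothing to save, TS is already set */
    return 1;
}
```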
- In KiTrapHandler, handle the device-not-present exception in a special way if TS in cr0 is set (or maybe assert that TS is set?) - can the device-not-present fault happen for non-FPU/SSE/MMX code? - ke/i386/exp.c
In KiTrapHandler (device not present) we must check NpxState and the mode by testing bit 0 of the saved cs:
A) called from user mode -> clear TS
Aa) user mode has a valid context -> restore user mode context, set NpxState to 'user mode execute' and go on
Ab) user mode has an invalid context -> initialize FPU, set NpxState to 'user mode execute' and go on

B) called from kernel mode -> clear TS
Ba) user mode has executed an FPU instruction -> save user mode context, set NpxState to 'user mode valid'
Bb) kernel mode has a valid context -> restore kernel mode context, set NpxState to 'kernel mode execute' and go on
Bc) kernel mode has an invalid context -> initialize FPU, set NpxState to 'kernel mode execute' and go on
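The trap handler cases can be modelled the same way. Again this is a sketch with invented names: init, restore and save are counters rather than real FPU operations, and the mode test is the "bit 0 of the saved cs" check described above.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NPX_INVALID, NPX_VALID, NPX_EXECUTE } npx_status;

typedef struct {
    npx_status user, kernel;
    int inits, restores, saves;  /* counters standing in for FPU operations */
} sim_thread;

/* Bring one context into the 'execute' state (cases Aa/Ab and Bb/Bc). */
static void npx_activate(sim_thread *t, npx_status *ctx)
{
    if (*ctx == NPX_VALID)
        t->restores++;
    else if (*ctx == NPX_INVALID)
        t->inits++;
    *ctx = NPX_EXECUTE;
}

/* Device-not-present trap: saved_cs is the code selector from the trap
   frame, ts points at the modelled TS bit. */
static void npx_trap(sim_thread *t, unsigned saved_cs, int *ts)
{
    assert(t != NULL && ts != NULL);
    *ts = 0;                              /* A) and B): clear TS */
    if (saved_cs & 1u) {                  /* A) called from user mode */
        npx_activate(t, &t->user);
    } else {                              /* B) called from kernel mode */
        if (t->user == NPX_EXECUTE) {     /* Ba) save the user context first */
            t->saves++;
            t->user = NPX_VALID;
        }
        npx_activate(t, &t->kernel);
    }
}
```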
4.) Entering kernel mode (Syscall/int 2e):
We must check the previous mode. If the previous mode was user mode we set TS.
5.) Leaving kernel mode (Sysret/int 2e):
We must check the previous mode. We set TS. If the previous mode was user mode we set NpxState to 'kernel mode invalid'.
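Points 4 and 5 are small enough to show in a few lines. The names are hypothetical and TS is again just a field in a struct, not the real cr0 bit.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NPX_INVALID, NPX_VALID, NPX_EXECUTE } npx_status;
typedef struct { npx_status user, kernel; } npx_state;
typedef struct { int ts; } sim_cpu;

/* 4.) Entering kernel mode: coming from user mode, set TS so that the
   first kernel FPU instruction traps. */
static void on_kernel_entry(sim_cpu *cpu, int prev_mode_user)
{
    assert(cpu != NULL);
    if (prev_mode_user)
        cpu->ts = 1;
}

/* 5.) Leaving kernel mode: always set TS; when returning to user mode
   the kernel mode context is marked invalid. */
static void on_kernel_exit(sim_cpu *cpu, npx_state *st, int prev_mode_user)
{
    assert(cpu != NULL && st != NULL);
    cpu->ts = 1;
    if (prev_mode_user)
        st->kernel = NPX_INVALID;
}
```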
6.) NtW32Call/NtCallbackReturn:
We must save/restore the current context + NpxState and initialize the new context + NpxState on the new stack.
I am not sure when to save the FPU state. I think it has to be saved when the thread enters Ki386ContextSwitch with TS not set, which means the thread has used the FPU. Then the new thread runs with TS set, and its first FPU instruction ends up in KiTrapHandler, where we handle it specially: we restore the previous (or initial) FPU state (FX_SAVE_AREA) from the top of the kernel stack, clear TS and let the thread run until it reaches Ki386ContextSwitch again, where the FPU state is saved if TS is clear.
The bad part is that we will always have to save it if it was used, I think, even if no other program uses the FPU. At first I thought we could save the old thread's state in the exception handler for the new thread's first FPU opcode (the previous thread is saved somewhere, isn't it?) and then restore the new thread's previous state. But this wouldn't work if one thread used the FPU, then we switched to another thread (which doesn't use the FPU) and then to yet another one that does use the FPU: the previous thread would then not be the first thread but the second one, which didn't use the FPU.
If a thread has used the FPU, we must always save the FPU state. Another idea is to put a pointer to the last thread that used the FPU into the new thread. If the new thread uses the FPU, it can save the old state into that thread's save area. On a thread switch, either the saved pointer or the current thread is put into the new thread. This gets very difficult on thread termination, because another thread may still hold an 'fpu save pointer' to the terminating thread.
- Hartmut
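For comparison, here is a sketch of a related variant, not the scheme described above: instead of handing the pointer from thread to thread, a single "FPU owner" pointer is kept per CPU. The save is then deferred until another thread actually touches the FPU, and thread termination only has to clear that one pointer. All names are made up, memcpy stands in for the real save/restore, and a single CPU is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SAVE_AREA_SIZE 512

typedef struct { unsigned char save_area[SAVE_AREA_SIZE]; } sim_thread;

typedef struct {
    sim_thread *fpu_owner;               /* thread whose state is in regs */
    unsigned char regs[SAVE_AREA_SIZE];  /* models the live FPU registers */
    int ts;                              /* models CR0.TS */
} sim_cpu;

/* Thread switch: nothing is saved here, the FPU is only disabled. */
static void on_thread_switch(sim_cpu *cpu)
{
    assert(cpu != NULL);
    cpu->ts = 1;
}

/* First FPU instruction of the current thread after a switch. */
static void on_fpu_trap(sim_cpu *cpu, sim_thread *cur)
{
    assert(cpu != NULL && cur != NULL);
    cpu->ts = 0;
    if (cpu->fpu_owner == cur)
        return;                          /* registers are still ours */
    if (cpu->fpu_owner != NULL)          /* save the previous owner's state */
        memcpy(cpu->fpu_owner->save_area, cpu->regs, SAVE_AREA_SIZE);
    memcpy(cpu->regs, cur->save_area, SAVE_AREA_SIZE);
    cpu->fpu_owner = cur;
}

/* Thread termination: drop the pointer so nothing is saved into a dead thread. */
static void on_thread_exit(sim_cpu *cpu, sim_thread *dying)
{
    assert(cpu != NULL && dying != NULL);
    if (cpu->fpu_owner == dying)
        cpu->fpu_owner = NULL;
}
```

In the three-thread case from the mail (first thread uses the FPU, second does not, third does), the owner pointer still names the first thread when the third one traps, so the state ends up in the right save area.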