I'm working on bug 713 (http://www.reactos.com/bugzilla/show_bug.cgi?id=713 for the click-happy). What I have found is this:
- Telnet opens a socket.
- Sending and receiving are handled by different threads, let's call them S and R.
- Thread R starts a recv() operation, which is translated to an ioctl on the socket.
- No data is available, so thread R blocks, waiting for FileObject->Event.
- I type something, which is handled by thread S. Thread S starts a send() operation.
- This is again translated to an ioctl on the socket. Since this is the same socket as used by R, the FileObject will be the same.
- The send can complete immediately, so IoCompleteRequest is called, which sets FileObject->Event.
- Thread R is unblocked, because the event it was waiting for was set by thread S. However, the IRP of thread R was never completed. The recv() call returns with bogus info.
- Thread R starts another recv(). When some data arrives from the server, two IRPs are waiting for it. This eventually leads to the crash.
The fundamental problem seems to be multiple overlapping I/O operations which all use FileObject->Event to signal their completion. I have no idea how to fix that...
Gé van Geldorp.
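The sequence described above can be reproduced in a small single-threaded model. This is only a sketch: the names Event, FileObject, Irp, complete_request and waiter_wakes are hypothetical simplifications and not the real ReactOS kernel structures or functions. The one assumption carried over from the report is that every request on a file object signals the same event on completion.

```c
#include <stdbool.h>

/* Hypothetical, simplified model; not the real kernel structures. */
typedef struct { bool signaled; } Event;
typedef struct { Event event; } FileObject;   /* one event shared by all requests */
typedef struct { FileObject *file; bool completed; } Irp;

/* Models request completion for synchronous I/O: mark the request done
   and signal the event of the file object it was issued on. */
static void complete_request(Irp *irp) {
    irp->completed = true;
    irp->file->event.signaled = true;
}

/* Models the waiter waking up: it only looks at the shared event,
   not at whether its own request has completed. */
static bool waiter_wakes(const Irp *irp) {
    return irp->file->event.signaled;
}

/* Returns true if thread R wakes up although its own IRP is still pending. */
bool spurious_wakeup_with_shared_event(void) {
    FileObject socket = { { false } };
    Irp recv_irp = { &socket, false };  /* thread R: blocked, no data yet */
    Irp send_irp = { &socket, false };  /* thread S: same socket, same file object */

    complete_request(&send_irp);        /* the send completes immediately */

    return waiter_wakes(&recv_irp) && !recv_irp.completed;
}
```

In this model spurious_wakeup_with_shared_event() reports that R is woken by S's completion while R's own request is still outstanding, which is the situation that leads to the bogus recv() result.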
We don't support serialization of asynch. I/O against the same file object. :-D The file object has 4 members used for this lock:
    ULONG Waiters;
    ULONG Busy;
    PVOID LastLock;
    KEVENT Lock;
AFAICS, it's basically a mutex, but it's reimplemented as a "file object lock" to support alertable waits and customized rundown.
I remember seeing some of these members incorrectly used in Cc for some bizarre synchronization, but that might have been removed by now...
G.
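As a rough illustration of how a mutex-like lock could be built from a Busy flag and a Waiters count, here is a user-mode sketch in C11. Everything in it is an assumption: FileLockModel and the file_lock_* functions are hypothetical names, the blocking wait on the Lock event is left out, and LastLock is not modelled. It is not the Windows or ReactOS implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-mode model of two of the four members. */
typedef struct {
    atomic_ulong waiters;  /* threads that found the lock held */
    atomic_ulong busy;     /* 1 while a synchronous request owns the file object */
} FileLockModel;

void file_lock_init(FileLockModel *lock) {
    atomic_init(&lock->waiters, 0);
    atomic_init(&lock->busy, 0);
}

/* Fast path: atomic_exchange returns the previous value,
   so 0 means the lock was free and is now ours. */
bool file_lock_try_acquire(FileLockModel *lock) {
    return atomic_exchange(&lock->busy, 1) == 0;
}

/* Slow-path bookkeeping: a thread that failed to acquire registers itself
   before it would block on the Lock event (the blocking itself is omitted). */
void file_lock_begin_wait(FileLockModel *lock) {
    atomic_fetch_add(&lock->waiters, 1);
}

void file_lock_end_wait(FileLockModel *lock) {
    atomic_fetch_sub(&lock->waiters, 1);
}

/* Release the lock. Returns true when the Lock event would have to be
   signaled because at least one thread is registered as a waiter. */
bool file_lock_release(FileLockModel *lock) {
    atomic_store(&lock->busy, 0);
    return atomic_load(&lock->waiters) > 0;
}
```

A second synchronous request on the same file object would fail file_lock_try_acquire(), register as a waiter, and retry once the owner's file_lock_release() indicates that the event should be set.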
Ros-dev mailing list Ros-dev@reactos.com http://reactos.com:8080/mailman/listinfo/ros-dev
From: Gunnar Dalsnes
We don't support serialization of asynch. I/O against the same file object. :-D
Ok, let me rephrase :) Both the recv() and the send() call are synchronous: they are not expected to return before they have completed their job. However, they are being called from two different threads and so are in progress at the same time. Seems like a pretty common scenario for network apps to me.
Gé van Geldorp.
Sorry, I meant synch. I/O.
Hmm. I know that before I moved, I created different events used by each synchronous call in msafd to correct exactly that problem. Is that not working?
From: art yerkes
Hmm. I know that before I moved, I created different events used by each synchronous call in msafd to correct exactly that problem. Is that not working?
The problem was that the handle was opened with the FILE_SYNCHRONOUS_IO_NONALERT option. This caused NtDeviceIoControlFile to do an internal wait if the driver returned STATUS_PENDING. As a result, STATUS_PENDING was no longer returned as the status by NtDeviceIoControlFile to msafd, so the waits in msafd were bypassed. I've attached the proposed patch to bug 713. No comments so far.
Gé van Geldorp.
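The effect described above can be shown with a tiny model of the status that the caller gets back. This is a sketch under stated assumptions: HandleModel, model_ioctl_status and caller_waits_on_own_event are hypothetical names, the MODEL_STATUS_* constants stand in for STATUS_SUCCESS and STATUS_PENDING, and the final status after the internal wait is simply assumed to be success.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NT status codes discussed above. */
enum { MODEL_STATUS_SUCCESS = 0x0, MODEL_STATUS_PENDING = 0x103 };

/* Hypothetical handle model: only records whether the handle was opened
   with a synchronous I/O option such as FILE_SYNCHRONOUS_IO_NONALERT. */
typedef struct { bool synchronous_io; } HandleModel;

/* Models the status the caller sees from the ioctl system call. */
int model_ioctl_status(const HandleModel *handle, int driver_status) {
    if (driver_status == MODEL_STATUS_PENDING && handle->synchronous_io) {
        /* The system call waits internally (on the file object's event),
           so the caller only ever sees the final status.
           Assumption: the request eventually succeeds. */
        return MODEL_STATUS_SUCCESS;
    }
    return driver_status;
}

/* Models the caller (msafd in this thread): it waits on its own
   per-call event only when the system call reports "pending". */
bool caller_waits_on_own_event(const HandleModel *handle, int driver_status) {
    return model_ioctl_status(handle, driver_status) == MODEL_STATUS_PENDING;
}
```

With a synchronous handle the caller never sees the pending status, so its own wait is skipped and the only wait left is the internal one on the shared event. With a handle opened for asynchronous I/O the pending status reaches the caller and its per-call event is used.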
I believe the socket needs to be opened for async IO then.
That's easy enough to fix. The ioctls need to pass down their own event object and not rely on signaling of the file object.
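A minimal single-threaded model of that suggestion, in C. The names Event, FileObject, Irp, user_event and complete_request are hypothetical, and the rule that completion signals the request's own event when one was passed down (falling back to the file object's event otherwise) is an assumption of this sketch, not a statement about the actual kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the real kernel structures. */
typedef struct { bool signaled; } Event;
typedef struct { Event event; } FileObject;
typedef struct {
    FileObject *file;
    Event *user_event;   /* per-request event, or NULL to use the file object's */
    bool completed;
} Irp;

/* Models completion: signal the request's own event if it has one,
   otherwise fall back to the shared file object event. */
static void complete_request(Irp *irp) {
    Event *target = irp->user_event ? irp->user_event : &irp->file->event;
    irp->completed = true;
    target->signaled = true;
}

/* Returns true if the receive is only woken by its own completion. */
bool recv_stays_blocked_with_own_event(void) {
    FileObject socket = { { false } };
    Event recv_event = { false };
    Event send_event = { false };
    Irp recv_irp = { &socket, &recv_event, false };  /* thread R */
    Irp send_irp = { &socket, &send_event, false };  /* thread S, same socket */

    complete_request(&send_irp);                 /* the send completes first */
    bool recv_woken_early = recv_event.signaled; /* R's event must be untouched */

    complete_request(&recv_irp);                 /* data arrives later */

    return !recv_woken_early && send_event.signaled &&
           recv_event.signaled && recv_irp.completed;
}
```

Because each request carries its own event, completing the send no longer wakes the thread that is blocked in recv().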