Phillip Susi wrote:
Wait a second... why is it sending only one byte at a time? The caller is only send()ing one byte at a time?
This is true. Although send() will return indicating the data has been sent, nagling should hold the actual packet back until it is full. However, this should only happen while there is unconfirmed data in flight, i.e. no ACK has come back yet.
To avoid any confusion, you can switch nagling off with the TCP_NODELAY flag
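For anyone who wants to try this from a test program, here is a minimal sketch of switching nagling off on a socket. It assumes a POSIX-style sockets API; the helper names set_tcp_nodelay() and get_tcp_nodelay() are made up for illustration and are not part of any stack discussed here.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable (on = 1) or disable (on = 0) TCP_NODELAY on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int set_tcp_nodelay(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Read the flag back: 1 if nagling is off, 0 if it is on, -1 on error. */
int get_tcp_nodelay(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) != 0)
        return -1;
    return value != 0;
}
```

On Winsock the call is the same apart from the headers and a cast of the option pointer to const char *.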
I'm interested in this bug, but I'm not around tonight to play about with it :(
From: Murphy, Ged (Bolton)
This is true. Although Send() will return indicating the data has been sent, nagling should hold the actual packet back until it is full, however this should only happen if there is unconfirmed data i.e. no ACK.
Ok, so it seems that the sending side is correct in holding back the packet. In ReactOS the problem then is that the ACK is delayed for too long. I've checked the delayed ACK path. Basically we fire a timer every 0.5 sec to do cleanup tasks. Only one of every 5 of these timer ticks is passed on to the oskittcp lib (routine TCPTimeout only calls TimerOskitTCP on every 5th tick). This means that the oskittcp routine tcp_fasttimo() which is responsible for sending delayed ACKs is only called every 2.5 sec, which corresponds exactly to the delays in the loop.c test program I posted earlier.
So, the solution seems to be to call tcp_fasttimo() more often. Question is how often? There's a comment in the accompanying tcp_slowtimo() that it should be called every 500ms. So I guess tcp_fasttimo() should be called substantially more often than that?
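The tick arithmetic described above can be sketched as follows. This is only an illustration of the divider, not the actual TCPTimeout code; the constant and function names are hypothetical, and the 500ms / every-5th-tick values are the ones quoted above.

```c
/* Hypothetical names; the values come from the description above. */
enum { BASE_TICK_MS = 500, TICKS_PER_CALL = 5 };

/* Interval between two forwarded calls when only every `divider`-th
 * base tick is passed on to the library. */
unsigned forwarded_interval_ms(unsigned base_tick_ms, unsigned divider)
{
    return base_tick_ms * divider;
}

/* Count how many of `ticks` base ticks actually get forwarded,
 * using the same kind of counter-and-reset divider. */
unsigned forwarded_calls(unsigned ticks, unsigned divider)
{
    unsigned counter = 0, calls = 0;

    for (unsigned i = 0; i < ticks; i++) {
        if (++counter == divider) {
            counter = 0;
            calls++;
        }
    }
    return calls;
}
```

With a 500ms base tick and a divider of 5, the forwarded interval works out to 2500ms, which is the 2.5 sec delay seen in the test program.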
To avoid any confusion, you can switch nagling off with the TCP_NODELAY flag
Haven't checked yet if we implement this.
I'm interested in this bug, but I'm not around tonight to play about with it :(
I promise not to fix it tonight ;)
GvG
Ge van Geldorp wrote:
Ok, so it seems that the sending side is correct in holding back the packet. In ReactOS the problem then is that the ACK is delayed for too long. I've checked the delayed ACK path. Basically we fire a timer every 0.5 sec to do cleanup tasks. Only one of every 5 of these timer ticks is passed on to the oskittcp lib (routine TCPTimeout only calls TimerOskitTCP on every 5th tick). This means that the oskittcp routine tcp_fasttimo() which is responsible for sending delayed ACKs is only called every 2.5 sec, which corresponds exactly to the delays in the loop.c test program I posted earlier.
The delayed ACK is also correct behavior. Since there is plenty of room left in the window, the ACK is supposed to be delayed so it can ACK further data with one packet.
So, the solution seems to be to call tcp_fasttimo() more often. Question is how often? There's a comment in the accompanying tcp_slowtimo() that it should be called every 500ms. So I guess tcp_fasttimo() should be called substantially more often than that?
This is not the solution.
To avoid any confusion, you can switch nagling off with the TCP_NODELAY flag
Haven't checked yet if we implement this.
My current conclusion is that mozilla is poorly behaved because it sends one byte at a time. Heck, it shouldn't even be using tcp/ip for local IPC. You might want to file a bug with them on this issue. Sending larger data blocks will be FAR more efficient, however, it should not be as bad as you are reporting as long as they do set the TCP_NODELAY flag. If they do set that flag and ReactOS does not correctly implement it, that would explain why it works ok on windows and not ReactOS.
You guys must be ACK'd off with this ;)
On 12/21/05, Phillip Susi psusi@cfl.rr.com wrote:
Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
-- "I had a handle on life, but then it broke"
From: Phillip Susi
The delayed ACK is also correct behavior. Since there is plenty of room left in the window, the ACK is supposed to be delayed so it can ACK further data with one packet.
I'm not arguing that delayed ACK is incorrect, but 2.5 sec??? Searching the Internet, 200ms seems a much more accepted value. This is also the value used by the MS stack: http://www.microsoft.com/technet/itsolutions/network/deploy/depovg/tcpip2k.mspx (under "Delayed Acknowledgements").
My current conclusion is that mozilla is poorly behaved because it sends one byte at a time. Heck, it shouldn't even be using tcp/ip for local IPC.
You might want to cut them some slack, their code needs to be cross-platform. TCP/IP is available on every platform they support, otherwise a browser doesn't make much sense :-)
You might want to file a bug with them on this issue.
No, it is our job to make ReactOS behave like Windows, not the job of the app vendors to change their code to suit ReactOS.
If they do set that flag and ReactOS does not correctly implement it, that would explain why it works ok on windows and not ReactOS.
My test program does not set that flag and still works ok on Windows.
Thanks for your input, I really appreciate it (and learn a lot of new things).
GvG
Ge van Geldorp wrote:
I'm not arguing that delayed ACK is incorrect, but 2.5 sec??? Searching the Internet, 200ms seems a much more accepted value. This is also the value used by the MS stack: http://www.microsoft.com/technet/itsolutions/network/deploy/depovg/tcpip2k.mspx (under "Delayed Acknowledgements").
That does sound like a more reasonable value, but if nagling is disabled, it should not matter.
You might want to cut them some slack, their code needs to be cross-platform. TCP/IP is available on every platform they support, otherwise a browser doesn't make much sense :-)
Yes, but shared memory is also available on every platform ;)
No, it is our job to make ReactOS behave like Windows, not the job of the app vendors to change their code to suit ReactOS.
Yes, that is true, but mozilla is still poorly behaved so it would be nice to let them know so they can improve on it eventually. Correctly implementing TCP_NODELAY should provide a decent work around to this poor behavior, but it would be much better if they would send the whole block of data at once instead of one byte at a time.
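To illustrate what sending the whole block at once could look like, here is a minimal sketch assuming a POSIX sockets API. The helper name send_all() is made up for this example; it has nothing to do with the Mozilla code, which I have not looked at.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hand the whole buffer to the stack, retrying on short writes and on
 * EINTR. Returns 0 once every byte has been accepted, -1 on error. */
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, try again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The caller builds the message in a buffer first and then makes one send_all() call, instead of calling send() once per byte.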
My test program does not set that flag and still works ok on Windows.
It may still be slow, just not AS slow due to the lower timeout. If that is the case then you should see a significant speed up by enabling TCP_NODELAY. You should also see that behavior on bsd or linux systems. Lowering the timeout will reduce the effects, but it will only be hiding the real problems.
Phillip Susi wrote:
Ge van Geldorp wrote:
I'm not arguing that delayed ACK is incorrect, but 2.5 sec??? Searching the Internet, 200ms seems a much more accepted value. This is also the value used by the MS stack: http://www.microsoft.com/technet/itsolutions/network/deploy/depovg/tcpip2k.mspx (under "Delayed Acknowledgements").
That does sound like a more reasonable value, but if nagling is disabled, it should not matter.
If an app is going to send one byte at a time, disabling the nagle algorithm is _exactly_ the _wrong_ thing to do!
The nagle algorithm exists precisely so that applications can send one byte at a time and not have sucky performance.
Now if an app is _always_ going to do
send(); recv();
(i.e., there is going to be one call to send followed by a call to recv), disabling nagle might be appropriate.
But any app that does:
send(); send(); . . . send(); recv();
should not disable the nagle algorithm except in rare (I can't think of any) circumstances.
Thanks,
Joseph
Joseph Galbraith wrote:
If an app is going to send one byte at a time, disabling the nagle algorithm is _exactly_ the _wrong_ thing to do!
The nagle algorithm exists precisely so that applications can send one byte at a time and not have sucky performance.
No, it exists to prevent such applications from flooding the network with a storm of small packets and causing congestion due to the increased overhead. The loopback interface is not going to run into this problem so if you insist on sending small blocks of data and want them to go through fast, you want to turn off nagle.
But any app that does:
send(); send(); . . . send(); recv();
should not disable the nagle algorithm except in rare (I can't think of any) circumstances.
This is exactly one such case. The app makes several sends but the total amount of data is less than one MSS, so nagle will cause delays. These delays hurt performance, and do not do any good when you are using the loopback interface.
If the app is sending a lot of data, just broken up into numerous small sends, then nagle will tend to increase the total throughput by lowering the overhead of sending numerous packets and should be left on, but in this case, the app is sending small amounts of data, but needs it to get there quickly, so it should be turned off.
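For reference, the decision nagle makes on each send can be sketched roughly like this. It is a simplified model of the commonly described rule, not the oskittcp code; the function and parameter names are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified nagle decision.
 *   pending - bytes queued but not yet sent
 *   mss     - maximum segment size
 *   unacked - bytes sent but not yet acknowledged
 *   nodelay - true if TCP_NODELAY is set on the socket
 * Returns true if a segment may go out right now. */
bool nagle_may_send_now(size_t pending, size_t mss, size_t unacked,
                        bool nodelay)
{
    if (pending == 0)
        return false;       /* nothing to send */
    if (nodelay)
        return true;        /* nagle switched off */
    if (pending >= mss)
        return true;        /* a full segment is always allowed */
    return unacked == 0;    /* small segment only if nothing is in flight */
}
```

In the one-byte-at-a-time case the first byte goes out at once, and every later small send has to wait until the previous one is ACKed. That is where the delayed ACK timer comes into play.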
It turns out that the Mozilla control does set the TCP_NODELAY flag; we just don't implement it yet. I looked at adding it, but we're missing setsockopt() altogether. It doesn't make much sense to add it to our current code now, with Alex's rewrite coming up.
If there are no objections, I want to commit the attached patch. It does 2 things:
- Send delayed ACKs every 200ms instead of every 2500ms
- Force Nagle off for the loopback interface (testing shows that Windows does this too, there's no congestion problem on the loopback interface)
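To make the second point concrete, here is a rough sketch of the kind of check involved. This is not the attached patch. It is an illustration with hypothetical function names, and it assumes IPv4 addresses in host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the IPv4 address (host byte order) lies in 127.0.0.0/8. */
bool is_ipv4_loopback(uint32_t addr_host_order)
{
    return (addr_host_order >> 24) == 127;
}

/* Hypothetical policy: nagle stays on unless the application asked for
 * TCP_NODELAY or the peer is on the loopback interface. */
bool use_nagle(uint32_t peer_addr_host_order, bool nodelay_requested)
{
    if (nodelay_requested)
        return false;
    return !is_ipv4_loopback(peer_addr_host_order);
}
```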
GvG
Having read all the posts in this thread, I think we should do as MS does and set it to every 200ms instead of every 2500ms.
change it to 200ms +1
----- Original Message -----
From: "Ge van Geldorp" gvg@reactos.org
To: "'ReactOS Development List'" ros-dev@reactos.org
Sent: 22 December 2005 11:05
Subject: RE: [ros-dev] Loopback problem
Ge van Geldorp wrote:
It turns out that the Mozilla control does set the TCP_NODELAY flag; we just don't implement it yet. I looked at adding it, but we're missing setsockopt() altogether. It doesn't make much sense to add it to our current code now, with Alex's rewrite coming up.
If there are no objections, I want to commit the attached patch. It does 2 things:
- Send delayed ACKs every 200ms instead of every 2500ms
- Force Nagle off for the loopback interface (testing shows that Windows does this too, there's no congestion problem on the loopback interface)
GvG
+1
Joseph Galbraith wrote:
Phillip Susi wrote:
Ge van Geldorp wrote:
I'm not arguing that delayed ACK is incorrect, but 2.5 sec??? Searching the Internet, 200ms seems a much more accepted value. This is also the value used by the MS stack: http://www.microsoft.com/technet/itsolutions/network/deploy/depovg/tcpip2k.mspx (under "Delayed Acknowledgements").
That does sound like a more reasonable value, but if nagling is disabled, it should not matter.
If an app is going to send one byte at a time, disabling the nagle algorithm is _exactly_ the _wrong_ thing to do!
The nagle algorithm exists precisely so that applications can send one byte at a time and not have sucky performance.
We know this. If you read back through the previous mails, you'll see that disabling the algorithm was suggested for testing purposes only.
But any app that does:
send(); send(); . . . send(); recv();
should not disable the nagle algorithm except in rare (I can't think of any) circumstances.
I can give you a few (besides debugging) :
- Streaming media
- X servers
- network games
If packet timing is more important than avoiding the bandwidth consumed by packet overhead, it's perfectly reasonable to disable it.
Ged.