Wireshark-users: Re: [Wireshark-users] TCP question: retransmission or prodding the peer?
From: Bill Meier <wmeier@xxxxxxxxxxx>
Date: Fri, 21 Feb 2014 15:23:59 -0500
On 2/21/2014 3:23 AM, netztier@xxxxxxxxxx wrote:
2. I have several observations:

a. The basic request/response sequence as follows:

[ SEQ/ACK ] analysis snipped

So: The fact that the seq & ack in 4 and 5 are the same is just as expected.

packet 4 is just an "ack" with no data
packet 5 is data (with same seq/ack as the previous)

However: for some reason, B took 2.5 secs to send (the start of) a response to packet 3 in packet 5. We know that B received packet 3 immediately because B sent an ack in packet 4 (after the usual 200 ms delay). So: The "B" application failed to respond immediately even though we know that "B" received the packet at the network level.

A (10.33.53.121) is the card reader and TCP initiator, while B (147.88.243.121) is the card server and TCP responder.

I beg to disagree here (if we remove the two ARP packets from the capture and restart numbering from 1):

I guess we disagree a bit about the details (see below). :)
In any case, I would expect that getting a capture somewhere near the server may help to clear things up.
Packet 3 of the TCP session (packet 5 in the originally attached capture) was the A's 3rd packet of the 3-way-handshake.
Correct
Packet 4 is such a suspicious "retransmission" after 7.5 ms, with a 1-byte payload (kind of a request?)
Uh, no: this is not a retransmission. This is "A" sending the first byte of data after the connection has been established. I don't consider it suspicious. If you look at most any TCP connection startup, you'll see that the sequence is exactly the same. (That is: the seq num and the ack for the first data packet sent from the connection initiator after the 3-way handshake will match those of the ACK sent in the 3rd packet of the 3-way handshake.)
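To illustrate why those numbers match, here is a minimal sketch in Python. It is a toy model, not taken from the capture: the initial sequence numbers 1000 and 5000 and all class and function names are made up for illustration, and loss or reordering is not modelled. The point is only that a bare ACK occupies no sequence space, so the first data segment from the initiator reuses the seq/ack of the handshake's final ACK.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    sender: str
    flags: str    # e.g. "SYN", "SYN,ACK", "ACK", "PSH,ACK"
    seq: int
    ack: int
    length: int   # payload bytes


def seq_space(flags, length):
    # SYN and FIN each occupy one sequence number; payload occupies `length`.
    return length + ("SYN" in flags) + ("FIN" in flags)


class Endpoint:
    def __init__(self, name, isn):
        self.name = name
        self.snd_nxt = isn  # next sequence number this side will send
        self.rcv_nxt = 0    # next sequence number expected from the peer

    def send(self, flags, length=0):
        seg = Segment(self.name, flags, self.snd_nxt, self.rcv_nxt, length)
        self.snd_nxt += seq_space(flags, length)
        return seg

    def receive(self, seg):
        # In-order delivery only; loss and reordering are not modelled.
        self.rcv_nxt = seg.seq + seq_space(seg.flags, seg.length)


def handshake_then_first_byte(isn_a=1000, isn_b=5000):
    """Return the first five segments: 3-way handshake, 1 data byte, its ACK."""
    a, b = Endpoint("A", isn_a), Endpoint("B", isn_b)
    trace = []
    for sender, receiver, flags, length in [
        (a, b, "SYN", 0),
        (b, a, "SYN,ACK", 0),
        (a, b, "ACK", 0),       # packet 3: final ACK of the handshake
        (a, b, "PSH,ACK", 1),   # packet 4: first data byte from A
        (b, a, "ACK", 0),       # packet 5: B acks that byte
    ]:
        seg = sender.send(flags, length)
        receiver.receive(seg)
        trace.append(seg)
    return trace


if __name__ == "__main__":
    for number, seg in enumerate(handshake_then_first_byte(), start=1):
        print(number, seg)
```

In this model packets 3 and 4 carry identical seq and ack values, and only packet 5 shows an ack number advanced by one byte.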
Packet 5 is B's response to packet 4, an empty ACK
acking the first data byte received from A.
Packet 6 is probably A's request to B, the first packet with some discernible payload, and probably is a "true" request of something.
As I noted above: I'm pretty sure that the complete request consists of the first byte sent plus these bytes. The split between the two packets has to do with the Nagle algorithm.
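For reference, a rough Python sketch of the two usual ways an application avoids this kind of split. How the card reader's software actually writes to its socket is not known, so this is only an assumption about the pattern (one small write followed by a second write, with Nagle holding the second until the first is acked). The helper names are hypothetical; `TCP_NODELAY` and `sendall` are the standard socket facilities.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # With TCP_NODELAY set, small writes are not held back while
    # earlier data is still unacknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def send_whole_request(sock, parts):
    """Join the pieces of a request and hand them to the stack in one call.

    Writing the request in a single call gives the stack no reason to
    split it, whether or not Nagle is enabled.
    """
    sock.sendall(b"".join(parts))
```

Either option alone would normally be enough; building the complete request before sending is often the less intrusive fix, since it leaves Nagle in place for everything else.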
Packet 7 is B's ACK to packet 6, with a delay of 0.2 seconds
Packet 8 is B's ACK to ... (what?) with the delay of 2.5 seconds and a 1 byte payload.
B didn't receive any more data from A since B sent the ACK in 7 so the ACK number sent by B is unchanged.
Packet 9 is A's ACK to packet 8 etc... ad FIN

I've no idea as to why. Does "Only during the first TCP connection" suggest some kind of initial setup going on in "B"?

That is what I assume. User swipes the card - card reader contacts server. This first TCP session is always this short, just 18 or 20 packets. But a lot of them have this strange delay between packets 7 and 8. The next TCP sessions follow immediately (retrieval of print job list), and while printing, there is a longish one, probably for the "live billing" to the card of each single page printed.

That being said: there's another issue having to do with the "send 1 byte, wait for ack, send remaining bytes" pattern. Rather than me trying to explain: Do a web search on "Nagle algorithm" and TCP_NODELAY for an explanation. Basically: the software isn't programmed quite right (IMHO).

If I understand the bit with TCP_NODELAY correctly, setting this socket option when calling a socket causes TCP to send every data chunk that gets "pushed down the stack" from the application immediately. Given the application's nature, and the requirement that every page printed must be reported back to the card reader (and written to the card), I think that disabling Nagle is not quite wrong.

Another thing I find a bit interesting: The window size advertised by B (card server?) just keeps decreasing as data is received from A. Normally that would mean that the app isn't taking the data from the network layer. However, that appears not to be the case since the request/response sequence seems to complete OK.

The server's (B, 147.88.243.205) advertised window size starts with 8k in the SYN ACK, then jumps to almost 64240 bytes and decreases down to 64102 bytes.

What kind of system is the card server? Some kind of minimal system?

Currently, I do not know. I assume it is a Windows server.

Actually: I see that the continually decrementing window size advertisement applies to both the card reader and the card server.

Agreed, the card reader starts at 32k, increases to 33580 and decreases to 33551.

Given that we're talking embedded devices, have you discussed this issue with the vendor?

The issue in its early days had been tracked with the vendor. Back then however, the network did actually have a problem: restrictive port security timers aged-out the card reader's MAC address from the CAM table prematurely. When the card reader was not used for more than 5 min, it had cleared its own ARP table, and had to start with ARPing for its default gateway. With port security enabled, MAC-learning on the Cisco 2960S is done in software on the CPU, not the port ASICs. The first ARP request was lost and the card reader had to retransmit it, sometimes even twice. We worked around this by increasing port security timers and by reducing arp timeout on the upstream L3-Switch to 4 minutes. Cisco L3 devices with CEF enabled perform active ARP cache maintenance. 1 min before expiry, they unicast-request an ARP resolution from all known entries for a given subnet/interface. This request/reply sequence every 3 minutes keeps the CAM table entries alive.
Interesting ...
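The timer arithmetic behind that workaround can be written down as a small Python sketch. The 4-minute ARP timeout and the 1-minute refresh lead come from the description above; the port-security aging value is not stated in the thread, so it is a parameter here and the 5 used in the example call is purely hypothetical. The sketch also assumes that the card reader answers each unicast ARP request and that its reply is what refreshes the CAM entry.

```python
def refresh_interval_min(arp_timeout_min, refresh_lead_min=1):
    """Minutes between ARP refreshes sent by the L3 switch."""
    return arp_timeout_min - refresh_lead_min


def cam_entry_stays_alive(aging_min, arp_timeout_min, refresh_lead_min=1):
    """True if refresh traffic arrives before the CAM entry would age out.

    aging_min is the port-security inactivity aging time (an assumed
    value; the thread does not give the configured number).
    """
    return refresh_interval_min(arp_timeout_min, refresh_lead_min) < aging_min


if __name__ == "__main__":
    print(refresh_interval_min(4))       # 4 min timeout, refresh 1 min early
    print(cam_entry_stays_alive(5, 4))   # hypothetical 5 min aging time
```

With a 4-minute timeout the refresh happens every 3 minutes, so any aging time longer than that keeps the entry learned.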
No more lost ARPs since.

Thinking about this a bit more: It's certainly possible that the issue is lost data from the server to the reader. IOW: packet 5 above is actually a retransmission which eventually makes it through. Depending upon the TCP implementation, it could be that the retransmission timeout is 2.5 secs.

I would agree that frame 8 with its 2.5s delay is a retransmission that eventually gets through, but then again... We've seen frame 7 get to the card reader. If the card reader's (A's) reaction to frame 7 were lost somewhere upstream in the network, the capture should've seen it go from card reader to switchport. The ethernet hub I used to capture was between the card reader and its usual switchport - and that switchport hasn't seen a malformed frame in months. So for some reason, there was no reaction from the card reader to frame 7. This might be perfectly ok - probably there is no reaction required.
There's no need for any reaction from A to receiving frame 7 from B. Frame 7 from B is just an ACK with no data and therefore A does not need to respond. (A is awaiting a reply with data from B).
And: If it were packet loss due to corruption (bad cabling) or congestion, I believe it would have to be random. Always missing a packet after "packet 7" isn't random. Of all the samples I have (more than a dozen), the 2.5s delay is always between packet 7 and 8. So if there actually is a packet missing from card reader (A) to Server (B), it never left the card reader. Which brings us back to the vendor.
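A quick way to check that kind of consistency across many captures is to compute, per session, where the largest inter-packet gap falls. The Python sketch below is only an illustration: the function name is made up, and the sample timestamps are invented except for the three deltas mentioned in this thread (7.5 ms before packet 4, 0.2 s before packet 7, 2.5 s before packet 8).

```python
def largest_gap(timestamps):
    """Return (packet_number, gap_seconds) for the largest inter-packet gap.

    timestamps are per-session packet times in seconds, packet 1 first.
    packet_number is the 1-based number of the packet that arrived
    after the gap.
    """
    if len(timestamps) < 2:
        raise ValueError("need at least two packets")
    gaps = [
        (later - earlier, index + 2)
        for index, (earlier, later) in enumerate(zip(timestamps, timestamps[1:]))
    ]
    gap, packet_number = max(gaps)
    return packet_number, gap


if __name__ == "__main__":
    # Illustrative session; only the 7.5 ms, 0.2 s and 2.5 s deltas
    # are taken from the thread, the rest is invented.
    session = [0.0, 0.001, 0.002, 0.0095, 0.010, 0.011, 0.211, 2.711]
    print(largest_gap(session))
```

Running this over each session's timestamps and counting how often the answer is packet 8 would show whether the delay really is position-dependent rather than random.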
See above. I don't think there's anything missing from A at this point.
I'm pretty convinced that the issue is with the data from B which A is awaiting at this point.
I would guess that the first step would be to do a capture adjacent to the server to rule that possibility out.

That's what I'll attempt next. I hope it is a server that runs on a platform where I can capture directly on the server. Capture of a non-delay-affected session will follow.

Best regards & thanks a lot
Marc