Wireshark-dev: Re: [Wireshark-dev] Help Needed
From: Sake Blok <sake@xxxxxxxxxx>
Date: Wed, 6 Feb 2008 17:30:56 +0100
On Wed, Feb 06, 2008 at 04:28:53PM +0530, Priyadarshi Parida wrote:
> 
> I have some problem in understanding segmentation in protocols in TCP and
> top of TCP.
> 
> I have fair idea in networking and protocols stacks n routing but seem I
> have some confusion.
> 
> When TCP assembles segments:
> What should be the end condition that TCP should assume and think its done
> with assembling all segments.

TCP is a streaming protocol, which means it just transmits data from
one side to the other, ensuring that every octet is delivered in the
right order to the upper layer protocol. TCP does not reassemble
segments into PDUs, as it has no knowledge of what type of PDUs it
is carrying and therefore does not know where one starts and stops. So
it just passes its data to the upper layer protocol.

> I understood that Push flag is useful.

The Push flag is used by the upper layer protocol to tell the TCP
stack to send whatever remains in its buffers straight away without
waiting for more data (which it might otherwise do to fill up whole
segments instead of transmitting smaller segments).

> Again I thought may be sequence number is useful.

The sequence numbers are used to keep track of all octets that have
been received. If some octets are missing, the receiver sends duplicate
ACKs, which prompt the sender to retransmit the lost data. Without
sequence numbers, TCP would not be able to pass all data to the upper
layer protocol without gaps and in the correct order.
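
To make the role of the sequence numbers concrete, here is a rough Python
sketch of in-order delivery. It is only an illustration, not how any real
TCP stack is written: the function name deliver_in_order is made up, and
it assumes segments never partially overlap and that sequence numbers do
not wrap around.

```python
def deliver_in_order(segments, initial_seq):
    """Hypothetical sketch: segments is a list of (seq, payload) in arrival order.

    Returns the bytes handed to the upper layer and the next expected
    sequence number (the value the receiver would put in its ACK).
    """
    pending = {}              # out-of-order segments, keyed by sequence number
    next_seq = initial_seq    # first octet we have not yet delivered
    delivered = bytearray()
    for seq, payload in segments:
        if seq + len(payload) <= next_seq:
            continue          # everything in this segment was already delivered
        pending.setdefault(seq, payload)
        # Deliver as much contiguous data as we now have.
        while next_seq in pending:
            data = pending.pop(next_seq)
            delivered += data
            next_seq += len(data)
    return bytes(delivered), next_seq


# The second segment arrives before the gap is filled; the last one is a duplicate.
arrivals = [(1000, b"GET "), (1008, b"HTTP"), (1004, b"/ab "), (1004, b"/ab ")]
data, next_expected = deliver_in_order(arrivals, 1000)
```

Note that the upper layer only ever sees the ordered bytes in data; the
segment boundaries are gone by the time the data is handed over.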

> Not sure what exactly is the right approach for assembling.

The right approach is to let the upper layer protocol take care of it.

> Now some thing at top of TCP, Lets say HTTP.
> 
> If HTTP has Content Length than its fine and easy to assemble segments in
> HTTP, but How if content Length is missing IN HTTP.

If there is no Content-Length header, then there could be a
Transfer-Encoding header, telling the client that the response is chunked.
In that case, the object is sent in chunks. Each chunk is preceded by its
length (in hexadecimal). After all chunks, there is a chunk with length
0, which tells the client to stop assembling the data.
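
As a minimal sketch of the chunked scheme described above, the following
Python function decodes a complete chunked body. The name decode_chunked
is just for illustration, and the sketch assumes the whole body is already
in memory, skips chunk extensions, and ignores any trailer fields after
the final zero-length chunk.

```python
def decode_chunked(body):
    """Hypothetical sketch: decode a complete chunked HTTP body (bytes)."""
    out = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a line holding its size in hexadecimal.
        line_end = body.index(b"\r\n", pos)
        size_field = body[pos:line_end].split(b";", 1)[0]  # drop chunk extensions
        size = int(size_field.strip(), 16)
        pos = line_end + 2
        if size == 0:
            break             # zero-length chunk: stop assembling
        out += body[pos:pos + size]
        pos += size + 2       # skip the chunk data and its trailing CRLF
    return bytes(out)


decoded = decode_chunked(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
```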

There could also be no Content-Length header when HTTP keep-alives are not
used. In that case, the client just collects all data until the TCP FIN.
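
From the application's point of view, "collect until the FIN" usually
looks like reading until recv() returns no data. A small Python sketch,
assuming a blocking stream socket; the helper name read_until_close is
made up, and the demo uses socket.socketpair() as a stand-in for a real
TCP connection:

```python
import socket


def read_until_close(sock):
    """Hypothetical helper: collect data until the peer closes the stream."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:          # empty result: the peer has closed its side
            break
        chunks.append(data)
    return b"".join(chunks)


# Stand-in for a connection without keep-alive: send the body, then close.
server_side, client_side = socket.socketpair()
server_side.sendall(b"body without a length")
server_side.close()
body = read_until_close(client_side)
client_side.close()
```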

> How if content length data is distributed in two different frames. I mean we
> have Hex values for Conte Length: as last values in one segment and value
> for it lets say 223 in beginning of next segment

The upper layer protocol has no knowledge of packet boundaries. It just
gets a stream of data. So to HTTP, the Content-Length header is just
seen as a complete line, no matter how it was split across segments, and
HTTP knows how much data to expect.
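
A rough Python sketch of that idea: the receiver appends whatever arrives
to a buffer and only parses once the header block is complete, so it does
not matter where the segments were cut. The class name HttpMessageBuffer
is hypothetical, and the sketch only handles the Content-Length case.

```python
class HttpMessageBuffer:
    """Hypothetical sketch: accumulate stream data until a full HTTP body is present."""

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, segment):
        """Add received bytes; return the body once complete, else None."""
        self.buffer += segment
        header_end = self.buffer.find(b"\r\n\r\n")
        if header_end == -1:
            return None       # header block not complete yet
        content_length = None
        # Skip the start line, then look for Content-Length among the headers.
        for line in bytes(self.buffer[:header_end]).split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                content_length = int(value.strip())
        if content_length is None:
            raise ValueError("this sketch only handles Content-Length")
        body_start = header_end + 4
        if len(self.buffer) - body_start < content_length:
            return None       # body not complete yet
        return bytes(self.buffer[body_start:body_start + content_length])


response = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
cut = response.index(b"Length")   # split in the middle of the header name
msg = HttpMessageBuffer()
first = msg.feed(response[:cut])
second = msg.feed(response[cut:])
```

Here first is None because the header line is still incomplete, and the
body only comes out after the second piece has been appended.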


As for dissection in Wireshark, if the upper layer is allowed to
reassemble TCP segments (a setting in the TCP protocol preferences), then
it can tell the TCP dissector to keep the data across packets until
a whole PDU has been collected. This is necessary because Wireshark
dissects each packet on its own, so without reassembly the upper layer
protocol would only see part of its PDU. By telling the TCP dissector to
collect more data, it will in the end get the whole PDU in one go and so
is able to dissect the PDU as a whole.
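
The hand-off between the two dissectors can be simulated in a few lines
of Python. This is not Wireshark code and does not use its API; the names
NEED_MORE, dissect_pdu and tcp_layer are invented, and the upper layer
protocol here is a toy one (a 1-byte length followed by that many payload
bytes).

```python
NEED_MORE = object()   # hypothetical marker: "keep this data, call me again later"


def dissect_pdu(data):
    """Toy upper-layer dissector: 1-byte length, then the payload."""
    if len(data) < 1 or len(data) < 1 + data[0]:
        return NEED_MORE               # PDU continues in a later packet
    length = data[0]
    return data[1:1 + length], 1 + length


def tcp_layer(packets):
    """Toy TCP layer: hold undissected data across packets on request."""
    held = b""
    pdus = []
    for payload in packets:
        data = held + payload
        while data:
            result = dissect_pdu(data)
            if result is NEED_MORE:
                break                  # wait for the next packet
            pdu, consumed = result
            pdus.append(pdu)
            data = data[consumed:]
        held = data                    # leftover bytes carried to the next packet
    return pdus


pdus = tcp_layer([b"\x05he", b"llo\x02o", b"k"])
```

The first packet alone cannot be dissected, so its bytes are held; once
the second packet arrives the first PDU is complete and is dissected in
one go.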

I hope this clears things up a bit for you!

Cheers,
    Sake