Wireshark-users: Re: [Wireshark-users] match packets at sender and receiver
From: Andrej van der Zee <andrejvanderzee@xxxxxxxxx>
Date: Tue, 20 Apr 2010 17:20:02 +0900
Hi Ian,

Thank you again for your method. I used the following fields for
identifying a packet:

* src ip
* dst ip
* src port
* dst port
* tcp seq nr
* tcp ack seq nr
* ip id
* packet length

In a single 1.7GB cap-file with 17424367 TCP packets I get 27
identical packets based on the above id.
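
As a rough illustration, the identifier above can be written as a fixed-size key with an equality check. This is only a sketch: the names `pkt_key` and `pkt_key_equal` are made up for this example, IPv4 is assumed, and all fields are assumed to have been converted to host byte order already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical composite key built from the eight fields listed above.
 * IPv4 only; all values are assumed to be in host byte order. */
struct pkt_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t seq;      /* TCP sequence number */
    uint32_t ack_seq;  /* TCP acknowledgment number */
    uint16_t ip_id;    /* IP Identification field */
    uint16_t length;   /* packet length (IP total length assumed here) */
};

/* Compare field by field rather than with memcmp(), so that any padding
 * bytes the compiler inserts cannot influence the result. */
static bool pkt_key_equal(const struct pkt_key *a, const struct pkt_key *b)
{
    assert(a != NULL && b != NULL);
    return a->src_ip == b->src_ip
        && a->dst_ip == b->dst_ip
        && a->src_port == b->src_port
        && a->dst_port == b->dst_port
        && a->seq == b->seq
        && a->ack_seq == b->ack_seq
        && a->ip_id == b->ip_id
        && a->length == b->length;
}
```

Two packets captured at different points are treated as the same packet only when every field agrees, so a retransmission with a new IP ID, for example, would not match the original.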

Are there any other fields in the tcphdr struct that I could use? I am
not sure what they mean:

struct tcphdr {
    unsigned short source;
    unsigned short dest;
    unsigned long seq;
    unsigned long ack_seq;
#if __BYTE_ORDER == __LITTLE_ENDIAN
    unsigned short res1:4;
    unsigned short doff:4;
    unsigned short fin:1;
    unsigned short syn:1;
    unsigned short rst:1;
    unsigned short psh:1;
    unsigned short ack:1;
    unsigned short urg:1;
    unsigned short res2:2;
#elif __BYTE_ORDER == __BIG_ENDIAN
    unsigned short doff:4;
    unsigned short res1:4;
    unsigned short res2:2;
    unsigned short urg:1;
    unsigned short ack:1;
    unsigned short psh:1;
    unsigned short rst:1;
    unsigned short syn:1;
    unsigned short fin:1;
#endif
    unsigned short window;
    unsigned short check;
    unsigned short urg_ptr;
};
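
For reference, the same header layout can also be read byte by byte from the raw packet, which avoids depending on the endianness-specific bit-fields. The following is a minimal sketch, not taken from any library: `tcp_fields`, `rd16`, `rd32` and `tcp_parse` are hypothetical names, and `buf` is assumed to point at the first byte of the TCP header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of the fixed 20-byte TCP header. */
struct tcp_fields {
    uint16_t source;   /* source port */
    uint16_t dest;     /* destination port */
    uint32_t seq;      /* sequence number */
    uint32_t ack_seq;  /* acknowledgment number */
    uint8_t  doff;     /* data offset: header length in 32-bit words */
    uint8_t  flags;    /* FIN=0x01 SYN=0x02 RST=0x04 PSH=0x08 ACK=0x10 URG=0x20 */
    uint16_t window;   /* advertised receive window */
    uint16_t check;    /* checksum */
    uint16_t urg_ptr;  /* urgent pointer */
};

/* Read big-endian (network order) integers from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if fewer than 20 bytes are available. */
static int tcp_parse(const uint8_t *buf, size_t len, struct tcp_fields *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 20)
        return -1;
    out->source  = rd16(buf);
    out->dest    = rd16(buf + 2);
    out->seq     = rd32(buf + 4);
    out->ack_seq = rd32(buf + 8);
    out->doff    = (uint8_t)(buf[12] >> 4);
    out->flags   = (uint8_t)(buf[13] & 0x3f);
    out->window  = rd16(buf + 14);
    out->check   = rd16(buf + 16);
    out->urg_ptr = rd16(buf + 18);
    return 0;
}
```

Reading the fields this way yields host-order values directly, so they can be compared between two capture files without further conversion.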

Thank you,
Andrej



On Wed, Apr 7, 2010 at 8:54 AM, Ian Schorr <ian.schorr@xxxxxxxxx> wrote:
> On Tue, Apr 6, 2010 at 10:45 PM, Andrej van der Zee
> <andrejvanderzee@xxxxxxxxx> wrote:
>> What I would like to know is how to match packets on both ends of the
>> line, provided that I have the IP numbers. Are there any unique packet
>> identifiers that appear in the cap-files on both ends? What should I
>> use? For example, when I study the cap-file in Wireshark, I see under
>> "Internet Protocol" an "Identification" number that seems to be
>> incremented for packets over the same connection (or conversation?).
>> Is this Identification number generated by Wireshark or is it really
>> in the packet headers? Does it appear in both cap files? In that case,
>> I could use a tuple <IP, Identification> to match packets on both
>> ends.
>
> IP IDs are actually in the packet header.  Two ways to know:  1) Click
> on the field.  Notice how 2 bytes (containing the IP ID value) are
> highlighted in the bottom pane, the data portion of the packet.  2)
> Generally, Wireshark-generated fields are enclosed in square
> brackets.  (Neither rule is guaranteed to hold in every case, but
> both are generally true and are SUPPOSED to always be true.)
>
> You could use an ID field like IP ID to identify your packets.
> However, IP ID is not only not guaranteed to be unique within your
> capture, but it's one of the most likely fields to not be unique.  For
> one thing, it's a relatively small field (16-bit) and even if a host
> increments the ID steadily (i.e. it doesn't re-use IDs more often than
> it has to), it will re-use an ID after sending only 65536 packets.  A
> reasonably busy system is going to wrap IDs pretty quickly.
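
To put a number on how quickly a 16-bit ID wraps, here is a small worked calculation. It is only a sketch: `ip_id_wrap_seconds` is a made-up helper, and it assumes a single global counter that increments once per packet sent.

```c
#include <assert.h>

#define IP_ID_SPACE 65536u  /* 2^16 possible Identification values */

/* Seconds until a steadily incrementing global IP ID counter wraps,
 * for an assumed send rate in packets per second. */
static double ip_id_wrap_seconds(double packets_per_second)
{
    assert(packets_per_second > 0.0);
    return (double)IP_ID_SPACE / packets_per_second;
}
```

At an assumed 10,000 packets per second this gives about 6.6 seconds between wraps, so a long capture from a busy host will contain the same IP ID many times.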
>
>    Incidentally, different hosts increment the IDs differently.  Some
>    increment it globally - once per packet they send.  Some have an
>    incrementing counter for each host they're talking to.  One host
>    that I was looking at yesterday does some weird things - it seems
>    to be aware at IP-level what packets are still in-flight on the
>    network (based on being unacknowledged at TCP level).  It only
>    generated unique IDs for each in-flight packet, but once they'd
>    been acknowledged, it'd reuse that IP ID.  So it tended to use IDs
>    0, 1, and maybe 2 ALL the time.  Weird.
>
> And really, that's the tricky part.  There aren't really any fields in
> TCP/IP packets that are guaranteed to be unique.  There's always SOME
> chance of miscorrelating two packets that share the same properties
> that you're checking for.
>
> Anyway, if you were going to go that route, you'd probably get much
> better results by comparing the TCP sequence number (tcp.seq), plus
> the 4-tuple (source and destination IPs and ports).  Although it's
> still not 100% guaranteed to be unique, it's much, much more likely.
> Although the numbers will wrap, that will only happen after about
> 4 billion bytes (2^32) have been transferred.
> There's a chance that if connections are being opened and closed
> frequently, then port numbers will be reused and that the host will
> also start TCP sequence numbers at the same point, but that's also
> extremely unlikely (and a big no-no: to avoid attacks, hosts typically
> should assign a random initial sequence number).
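
The same kind of estimate for the 32-bit sequence number, again only as a sketch: `tcp_seq_wrap_seconds` is a made-up helper, and it assumes one connection sending at a constant rate, ignoring the single sequence numbers consumed by SYN and FIN.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_SEQ_SPACE UINT64_C(4294967296)  /* 2^32 sequence numbers */

/* Seconds until the sequence number of one connection wraps, for an
 * assumed constant payload rate in bytes per second. */
static double tcp_seq_wrap_seconds(double bytes_per_second)
{
    assert(bytes_per_second > 0.0);
    return (double)TCP_SEQ_SPACE / bytes_per_second;
}
```

At an assumed 12.5 million bytes per second (about 100 Mbit/s) on a single connection, the sequence space wraps after roughly 344 seconds, far less often than the IP ID would at a comparable packet rate.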
>
> Now compare the IP IDs as well, and it's now very, very, VERY unlikely
> that two packets you compare match criteria but aren't actually the
> same.  Theoretically there is a chance of misidentification, but
> practically, for your purposes, that's probably plenty accurate.
>
>
> Keep in mind also that if the network is modifying your packets
> en route, you'll have trouble.  If there is a TCP proxy of some kind,
> a NAT/PAT device, or even a router that "fragments" packets, it may
> seriously impact what you're comparing.
> ___________________________________________________________________________
> Sent via:    Wireshark-users mailing list <wireshark-users@xxxxxxxxxxxxx>
> Archives:    http://www.wireshark.org/lists/wireshark-users
> Unsubscribe: https://wireshark.org/mailman/options/wireshark-users
>             mailto:wireshark-users-request@xxxxxxxxxxxxx?subject=unsubscribe
>



-- 
Andrej van der Zee
Koenji-minami 2-40-19A
Suginami-ku, Tokyo
166-0003 JAPAN
Mobile: +81-(0)80-65251092
Phone/Fax: +81-(0)3-3318-3155