Wireshark-dev: Re: [Wireshark-dev] Reassembly of IP fragments gets confused by multiple packets
From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Sat, 6 Feb 2016 11:10:19 -0800
On Jan 20, 2016, at 8:43 AM, Anders Broman <anders.broman@xxxxxxxxxxxx> wrote:

> Trying to summarize…
>  
> captured on the "all" interface of a Linux machine acting as a router, or merged two captures from networks on different sides of a router.
>  
> various sorts of tunneling

Including, for example, MPLS pseudo-wires.

> (or "other sorts of tunneling", if you view VLANs as a form of tunneling)

That might be the best way to think about VLANs - treat them the same way various tunnels, pseudo-wires, etc. are treated.

On the other hand, as a comment:

	https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=4561#c5

in the bug you're quoting says:

	For VLAN I know cases where UL traffic and DL traffic belong to different VLAN, still to the same flow. Thus a differentiation according to VLAN id must be configurable.

so VLANs shouldn't always be treated as tunnels.

I don't know whether that's an issue for any other form of pseudo-wire or tunnel - or even for physical networks, as identified by interface IDs; I could imagine a case where a machine has multiple physical interfaces on the same network, and different fragments of an IP datagram, or different packets in a given TCP connection, or... might travel on different interfaces.

So is this really just a question of

	1) "wires", whether physical or virtual (a VLAN being a "virtual wire") or pseudo-wire or...

and

	2) whether the machine on which the capture is being done routes traffic between "wires" or not?

If the machine has multiple "wires" (or radio channels) and *is* routing between them, a capture that sees traffic from multiple "wires" might see multiple copies of network-layer packets on different "wires".

If it's not routing between them, and is, for example, using different "wires" for an uplink and a downlink, or has bonded multiple "wires" together and is sending traffic to another machine over multiple different "wires", then fragments from a given fragmented IP datagram, or packets on a given TCP connection, or... might appear on different "wires".

In the first case, you want to distinguish between "wires" when doing reassembly, TCP analysis and desegmentation, etc.

In the second case, you *don't* want to do that.
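
To make the difference concrete, here's a rough Python sketch (not Wireshark code; the function names, the dictionary fields, and the integer "wire" numbers are all made up for illustration, and offsets are in bytes rather than the 8-byte units real IP uses). The only thing that changes between the two cases is whether the "wire" is part of the reassembly key:

```python
def try_join(parts):
    """Return the full payload if the fragments in parts are contiguous and
    end with a last fragment; otherwise return None."""
    data = b""
    for offset in sorted(parts):
        if offset != len(data):
            return None          # gap before this fragment
        data += parts[offset]["data"]
        if not parts[offset]["more"]:
            return data          # last fragment reached
    return None                  # still waiting for the last fragment


def reassemble(fragments, use_wire):
    """Process fragments in capture order.

    Returns (completed payloads, number of reassemblies left incomplete)."""
    pending = {}
    completed = []
    for frag in fragments:
        key = (frag["src"], frag["dst"], frag["ip_id"])
        if use_wire:
            key += (frag["wire"],)   # hypothetical internal "wire" number
        parts = pending.setdefault(key, {})
        parts[frag["offset"]] = frag
        payload = try_join(parts)
        if payload is not None:
            completed.append(payload)
            del pending[key]
    return completed, len(pending)


def frag(wire, offset, data, more):
    """Build one fragment of a single hypothetical datagram."""
    return {"src": "10.0.0.1", "dst": "10.0.0.2", "ip_id": 7,
            "wire": wire, "offset": offset, "data": data, "more": more}


# Router case: the same two fragments are seen once on each "wire".
routed = [frag(1, 0, b"AAAAAAAA", True), frag(2, 0, b"AAAAAAAA", True),
          frag(1, 8, b"BBBB", False), frag(2, 8, b"BBBB", False)]

# Bonded/uplink-downlink case: one datagram, fragments split across "wires".
bonded = [frag(1, 0, b"AAAAAAAA", True), frag(2, 8, b"BBBB", False)]
```

With these inputs, reassemble(routed, use_wire=True) completes both copies cleanly, while use_wire=False completes one copy and leaves a stray half-finished reassembly behind; for bonded it's the other way around, and use_wire=True never completes anything.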

Would that be sufficient to handle the two cases?  Or are there cases where the machine might be routing between some sets of "wires" and using other sets of "wires" as parallel pipes to and from a given destination host, so that whether the "wire" should be taken into account depends on the "wire"?
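
If the answer to that second question turns out to be "yes", one possible shape for it, purely as a sketch, would be to let the user say which "wires" are parallel pipes and fold each such set into a single domain for reassembly purposes. Everything here (the function names, the idea of a "domain", the assumption that the groups don't overlap) is hypothetical:

```python
def build_domains(bonded_groups):
    """Map each grouped "wire" number to a shared domain label.

    bonded_groups is an iterable of sets of "wire" numbers that should be
    treated as parallel pipes; the groups are assumed not to overlap."""
    domain_of = {}
    for group in bonded_groups:
        label = min(group)       # any member works as the label
        for wire in group:
            domain_of[wire] = label
    return domain_of


def reassembly_domain(domain_of, wire):
    """Return the domain to use in reassembly keys for this "wire".

    A "wire" that isn't in any group is its own domain."""
    return domain_of.get(wire, wire)
```

So with "wires" 3 and 4 declared as a bonded pair, both would map to domain 3, while "wires" 1 and 2, which the machine routes between, would stay distinct.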

> The right generalization might be to have some sort of "network tag" which incorporates a network interface ID plus all VLAN tags for the packet ("all VLAN tags" to handle QinQ).
>  
>  
> So if we go for network tag, or key should that be created by
> Outer VLAN tag, Hash of Source MAC, protocol-level, interface index(Pcap-ng)?
>  
> Outer VLAN tag should take care of, VLAN and QinQ, right?
> Source MAC should take care of, “duplicate caused by mirroring” and alike(?)
> Pinfo- curr_layer_num Should take care of tunneling(?)
> Interface index should take care of ANY interface traces(?)
>  
> What size should the key be, is 32bits enough?

If we have a notion of "wire", perhaps we could assign sequential internal numbers to "wires" - just as we do for conversations - and use the internal "wire" number.
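
As a minimal sketch of that numbering (Python for brevity, not a proposal for the actual data structure; the class name, the method name, and the choice of interface ID plus the full VLAN tag stack as the identity of a "wire" are assumptions for illustration):

```python
class WireTable:
    """Hand out sequential internal numbers to "wires" as they are first seen."""

    def __init__(self):
        self._numbers = {}

    def wire_number(self, interface_id, vlan_tags=()):
        """Return the internal number for this "wire", assigning the next
        unused number the first time the "wire" is seen."""
        key = (interface_id, tuple(vlan_tags))   # all tags, to cover QinQ
        if key not in self._numbers:
            self._numbers[key] = len(self._numbers) + 1
        return self._numbers[key]
```

The number stays small and fixed-size no matter how many components go into identifying a "wire", so the question of how many bits a hash-based key needs mostly goes away.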

We could also use that for filtering, e.g. 'show me everything on this "wire"'.
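
In sketch form, assuming each packet carried its internal "wire" number (the Packet class and its fields are invented for this example, and this says nothing about what the display filter syntax would look like):

```python
from dataclasses import dataclass


@dataclass
class Packet:
    number: int   # frame number in the capture
    wire: int     # hypothetical internal "wire" number


def on_wire(packets, wire):
    """Return the packets seen on the given "wire", in capture order."""
    return [p for p in packets if p.wire == wire]


capture = [Packet(1, 1), Packet(2, 2), Packet(3, 1)]
```

Here on_wire(capture, 1) would keep frames 1 and 3 and drop frame 2.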

(And maybe we could have a sidebar in Wireshark, showing all the "wires", conversations, etc. we've seen, and let you just click on one to show only the packets on that "wire", in that conversation, etc.)