Ethereal-dev: [ethereal-dev] Decoding multi-frame "packets"

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: guy@xxxxxxxxxx (Guy Harris)
Date: Fri, 11 Dec 1998 22:16:34 -0800 (PST)
(This is a followup to a comment I made about the topic of decoding
"packets" that take more than one frame.  It isn't about whether to
write packet decoders in C, or in a language that's translated into C,
or in an interpreted language, or...; these techniques could possibly be
used with any or all of those approaches, although it might affect how
those approaches are made to work.)

> I have been thinking about that (what got me thinking about working on
> packet analyzers in the first place was "snoop"'s inability to decode
> more than the first frame's worth of data in an NFS READDIR reply), and
> have some ideas on how to do it.

One trick is handling the detail display for frames past the first one.

One way of doing that would be to lift an idea from Microsoft's Network
Monitor (NetMon) - there are some other potential benefits from that as
well.

NetMon decoders attach a list of property instances to a frame.  Each
protocol decoded by NetMon has a list of properties, e.g. ARP has an
"Opcode" property.  The ARP decoder would, for example, attach an
instance of the "Opcode" property to each ARP packet; the property
instance includes a pointer to the starting location in the frame of the
data for that instance (NetMon memory-maps the capture file), and the
length of the data in bytes.

Property instances also have a level number, giving a decode tree like
that of Ethereal and other GUI decoders.
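To make the shape of that concrete, here is a minimal sketch of a property
instance list in Python.  It is not NetMon's actual data structure; the
names PropertyInstance, Frame, decode_arp and render are made up for
illustration, and the decoder assumes a 14-byte Ethernet header in front
of an ARP packet for IPv4 over Ethernet.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PropertyInstance:
    name: str    # property name, e.g. "Opcode"
    offset: int  # where the property's data starts within the frame
    length: int  # length of that data in bytes
    level: int   # depth in the decode tree


@dataclass
class Frame:
    data: bytes
    properties: List[PropertyInstance] = field(default_factory=list)


def decode_arp(frame: Frame, start: int = 14) -> None:
    """Attach ARP properties; 'start' assumes a 14-byte Ethernet header."""
    # The whole ARP packet (28 bytes for IPv4 over Ethernet) is the top level.
    frame.properties.append(PropertyInstance("ARP", start, 28, 0))
    # The opcode is the 2-byte field 6 bytes into the ARP packet.
    frame.properties.append(PropertyInstance("Opcode", start + 6, 2, 1))


def render(frame: Frame) -> List[str]:
    """Turn the property list into indented lines, reading the data lazily."""
    lines = []
    for p in frame.properties:
        raw = frame.data[p.offset:p.offset + p.length]
        lines.append("  " * p.level + f"{p.name}: {raw.hex()}")
    return lines
```

Note that the decoder records only where the data is; the display code
pulls the bytes out of the frame when it needs them.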

The decoding is done when the capture file is read - it (presumably)
goes sequentially through the list of packets, calling the decoders for
each one, building up the property instance lists for the frames.

A display filter, to limit which of the packets are shown in the summary
view, can be constructed out of, among other things, relational
operations on properties - depending on the type of the property,
different relational operators are possible, such as "equals", "doesn't
equal", "greater than", etc.  One could, for example, say "show all ARP
packets where the 'Opcode' property equals 1", or "show all NFS ops
where the 'Filename' property equals 'hello_sailor.c'".  (Well, one
could if it let you specify a string value; it's not clear that it
does....)
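A filter of that kind could be sketched roughly as follows.  This is only
an illustration, not NetMon's filter language: make_filter and the
property names "ARP.Opcode" and "NFS.Filename" are hypothetical, and for
brevity each frame is represented as a dictionary of already-extracted
property values rather than as offsets into the frame.

```python
import operator

# Relational operators a filter may use; which ones make sense would depend
# on the property's type.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def make_filter(prop, op, value):
    """Build a predicate that tests one property against one value."""
    compare = OPS[op]

    def matches(frame_props):
        # A frame that doesn't carry the property never matches.
        return prop in frame_props and compare(frame_props[prop], value)

    return matches


# Hypothetical decoded frames, one dict of property values per frame.
frames = [
    {"ARP.Opcode": 1},
    {"ARP.Opcode": 2},
    {"NFS.Filename": "hello_sailor.c"},
]

arp_requests = make_filter("ARP.Opcode", "==", 1)
shown = [i for i, f in enumerate(frames) if arp_requests(f)]
```

With the three frames above, only the first one passes the
"Opcode equals 1" filter.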

If you click on a packet in the summary list, NetMon doesn't have to
decode the packet; it just has to display the property list.  Doing all
the decoding sequentially may also make stateful decoding easier, as
many protocols require information about packets earlier in the packet
stream to decode some packets.  TFTP requires it because most TFTP
packets go between ports neither of which is the official UDP TFTP
port - the reply to the first packet specifies what port on the server
handles the rest of the TFTP session.  ONC RPC services such as NFS
require it because ONC RPC replies don't contain the procedure number;
the client is supposed to match a reply with an outstanding request.
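For the ONC RPC case, the state needed is small: remember the procedure
number of each call, keyed by its transaction ID (XID), and look it up
when the reply shows up.  A rough sketch, with track_rpc as a made-up
name, and with the simplifying assumptions that each input is one
complete RPC message and that XIDs alone identify a conversation (real
code would presumably also key on addresses and ports):

```python
import struct

CALL, REPLY = 0, 1


def track_rpc(messages):
    """Walk RPC messages in capture order; return {message_index: procedure}."""
    outstanding = {}   # xid -> procedure number taken from the call
    procedure_of = {}
    for index, msg in enumerate(messages):
        # Every RPC message starts with a 4-byte XID and a 4-byte type.
        xid, msg_type = struct.unpack_from(">II", msg, 0)
        if msg_type == CALL:
            # A call continues with RPC version, program, version, procedure.
            _rpcvers, _prog, _vers, proc = struct.unpack_from(">IIII", msg, 8)
            outstanding[xid] = proc
            procedure_of[index] = proc
        elif msg_type == REPLY and xid in outstanding:
            # The reply carries no procedure number; borrow it from the call.
            procedure_of[index] = outstanding.pop(xid)
    return procedure_of
```

A reply whose call wasn't captured simply gets no procedure number, so it
could be decoded only as far as the RPC header.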

(However, one needn't do a full decode to make that work - one merely
needs to attach enough data to a frame to allow one to fully decode it,
so you don't *need* an "attach a property list" decoding scheme to
handle that.  In addition, one could do filtering and searching with a
property list without attaching a property list to the packet in an
initial decoding phase - one could arrange to have the decoders
do property matching if passed the right argument, and, in effect,
generate properties on the fly.)

The way I was thinking of decoding multi-frame "packets" was to have
packet decoder routines take, as an argument, not a single frame, but a
"frame set", which specifies a list of parts of frames.

For example, the IP decoder would attach properties for the IP header to
each frame, but, if a frame was part of a fragmented IP packet, and the
IP decoder didn't have all the fragments yet, it'd create a reassembly
data structure for the first fragment it sees, and attach subsequent
fragments to that structure, until it found all fragments.  When it
found all fragments, it'd construct a "frame set", specifying, for each
frame, the IP payload portion (leaving the IP header out), and pass that
to the appropriate next-level decoder, if any (UDP or TCP or ICMP
or...).  (If a frame *isn't* part of a fragmented IP packet, it'd
immediately make a one-frame frame set out of the payload, and hand it
to the decoder, if any.)
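A sketch of how the reassembly step might produce a frame set is below.
It is only an illustration under stated assumptions: Piece, Reassembler
and the key tuple are invented names, fragment offsets are given in
bytes, and overlapping or duplicated fragments are not handled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Piece:
    frame: int   # index of the frame in the capture
    offset: int  # where the IP payload starts within that frame
    length: int  # number of payload bytes that frame contributes


class Reassembler:
    def __init__(self):
        # key -> list of (fragment_offset, more_fragments, Piece)
        self.pending = {}

    def add(self, key, frag_offset, more_fragments, piece) -> Optional[List[Piece]]:
        """Return a frame set once the datagram is complete, else None."""
        if frag_offset == 0 and not more_fragments:
            return [piece]  # not fragmented: a one-frame frame set
        frags = self.pending.setdefault(key, [])
        frags.append((frag_offset, more_fragments, piece))
        frags.sort(key=lambda f: f[0])
        expected = 0
        for off, _more, p in frags:
            if off != expected:
                return None  # there is still a hole before this fragment
            expected = off + p.length
        if frags[-1][1]:
            return None  # the final fragment hasn't been seen yet
        del self.pending[key]
        return [p for _off, _more, p in frags]
```

The key would be whatever identifies a datagram, e.g. a tuple of source
address, destination address, protocol and IP identification.  The frame
set that comes back lists the payload pieces in order, IP headers left
out, ready to hand to the next-level decoder.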

If the fragmented datagram was, say, an NFS-over-UDP READDIR reply, it'd
just look like UDP to IP, so IP would hand it to UDP, which would
recognize it as ONC RPC (perhaps by seeing that, at the appropriate
offsets, there was a 0 or 1 for "call" or "reply", and at another offset
there was a 2 for RPC version 2, and at another offset there was an RPC
program number it knows about, e.g. 100003 for NFS - although "snoop"
appears to detect RPC calls even if it *doesn't* know about the
program).  The ONC RPC code would decode the RPC header, and then hand
the packet to NFS, which would further decode it - as a READDIR reply,
given that RPC decoded it as a reply, and that the procedure number was
that of READDIR.
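The "look at the appropriate offsets" check for a call could be sketched
as below.  The function name and the KNOWN_PROGRAMS table are made up;
the field layout (XID, message type, RPC version, program, version,
procedure, each a 4-byte big-endian integer) is that of an ONC RPC call
header.  Only calls are checked here, since a reply doesn't carry the RPC
version or program number.

```python
import struct

# Hypothetical table of RPC programs this decoder knows about.
KNOWN_PROGRAMS = {100003: "NFS"}


def looks_like_rpc_call(payload: bytes):
    """Return (program_name, version, procedure) if this looks like a call."""
    if len(payload) < 24:
        return None  # too short to hold the fixed part of a call header
    _xid, msg_type, rpcvers, prog, vers, proc = struct.unpack_from(">6I", payload, 0)
    if msg_type != 0:   # 0 means "call"
        return None
    if rpcvers != 2:    # ONC RPC version 2
        return None
    if prog not in KNOWN_PROGRAMS:
        return None
    return KNOWN_PROGRAMS[prog], vers, proc
```

Dropping the KNOWN_PROGRAMS test would make the check looser, at the cost
of more false positives on non-RPC traffic.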

A decoder would extract field values from the frame set if it needed
them to make decisions about the packet, and otherwise just attach
properties saying e.g. "there's a 4-byte file ID starting at this
offset", leaving it up to the display or print or filtering code to
extract the data from the frame set.  The extracting would have to
handle data that crossed frame boundaries.
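The extraction routine might look something like this sketch, in which a
frame set is represented (as an assumption, for illustration only) as a
list of (frame_index, start, length) tuples, and "offset" counts bytes
from the start of the reassembled packet:

```python
def extract(frames, frame_set, offset, length):
    """Copy 'length' bytes starting 'offset' bytes into the frame set.

    frames:    list of bytes objects, one per captured frame
    frame_set: list of (frame_index, start, length) pieces, in order
    """
    out = bytearray()
    for frame_index, start, piece_len in frame_set:
        if offset >= piece_len:
            offset -= piece_len  # the field starts in a later piece
            continue
        # Take what this piece can supply, up to what is still needed.
        take = min(length - len(out), piece_len - offset)
        begin = start + offset
        out += frames[frame_index][begin:begin + take]
        offset = 0  # any following piece is read from its beginning
        if len(out) == length:
            break
    if len(out) != length:
        raise ValueError("field runs past the end of the frame set")
    return bytes(out)
```

For example, with two frames that each carry a 3-byte header followed by
4 bytes of payload, a 4-byte field starting 2 bytes into the payload
comes half from the first frame and half from the second.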

Properties would be attached to the frame where the property's data
begins.

This is one place where the property list stuff might be hard to do
without - jumping into the *middle* of an NFS READDIR reply, say, might
be hard, but if the entire READDIR reply were decoded in one chunk,
that'd be easier.

(However, it might be possible, instead, just to attach to a frame an
indication that, to decode it, you have to back up to a particular frame
and start there.)

For protocols running over TCP, it's a bit more complicated, as there's
no correspondence between frame boundaries and "packet" boundaries -
"packet" protocols running over TCP have to put packet boundaries in
themselves, as is the case with DNS-over-TCP, ONC-RPC-over-TCP, and
NetBIOS-over-TCP.  For those, the TCP reassembly code would have to hand
TCP frames sequentially to a decoder for the protocol layer that imposes
the packet boundaries (after stripping out retransmitted data, and
presumably showing it as retransmitted data), and *that* layer would
assemble the data into "packets", with a frame set specifying the packet
data, as well as decoding the headers that specify packet boundaries.
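
As a sketch of what that layer might do, take DNS-over-TCP, where each
message is preceded by a 2-byte big-endian length.  The function below is
an illustration only: split_messages is a made-up name, a frame set is
again a list of (frame_index, start, length) tuples, and the input is
assumed to be the TCP payloads already in sequence order with
retransmitted data stripped out.

```python
def split_messages(segments):
    """Split a TCP byte stream into one frame set per DNS message.

    segments: list of (frame_index, payload_bytes), in stream order
    """
    messages = []
    header = bytearray()  # bytes of the 2-byte length prefix seen so far
    remaining = 0         # message bytes still needed; 0 means "in prefix"
    current = []          # pieces of the message being assembled
    for frame_index, payload in segments:
        pos = 0
        while pos < len(payload):
            if remaining == 0:
                # Collect the length prefix, which may itself be split
                # across two segments.
                header.append(payload[pos])
                pos += 1
                if len(header) == 2:
                    remaining = int.from_bytes(header, "big")
                    header.clear()
                continue
            take = min(remaining, len(payload) - pos)
            current.append((frame_index, pos, take))
            pos += take
            remaining -= take
            if remaining == 0:
                messages.append(current)
                current = []
    return messages
```

A message that is still incomplete when the capture ends is simply not
returned.  ONC-RPC-over-TCP and NetBIOS-over-TCP would need their own
versions of the prefix parsing, since their headers differ.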