Ethereal-dev: Re: [ethereal-dev] Standards in decode routines

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Gilbert Ramirez <gramirez@xxxxxxxxxx>
Date: Fri, 2 Apr 1999 22:25:27 -0600

On Sat, Apr 03, 1999 at 11:20:47AM +0900, Richard Sharpe wrote:
> Hi folks,
> 
> I have insinuated my way onto the team by contributing code and promising
> to contribute more :-) At some stage I am going to have to add my name to
> the list of contributors on the help screen :-) 

Make it so.

> I am starting to wonder if we have any standards around information
> presented by the various decode routines. I am looking for some guidelines,
> and don't mind being part of the solution (ie, coming up with guidelines if
> they are needed). Are there any guidelines? As I am looking more and more
> at TCP level protocols (like SMTP, DNS Zone Transfers, SSH, Telnet, etc),
> it may be that there is a need for such guidelines.

There are no guidelines. Ethereal has been evolving. We have mostly
stayed away from the protocols that are stacked above the
session-oriented transports (TCP, SPX) because of the difficulty of
matching packets of fragmented TCP write()s. Guy has some good ideas on
how to fix this.

What I'd like to do in regards to this is to have an "unfragment" mode
in ethereal. When used, multiple packets that are known to be part of a
single TCP transmission (write()) will be coalesced together. The packet
summary line will be two lines high. The first line will be the decoding
of the packet (once you put all the packets together for a single TCP
write(), you can send that entire buffer to the next higher layer,
dissect_http() or whatever). The second line will contain a drop-down
list of the individual frames that make up this TCP write(). If you
select one of these frames, a window pops up showing you the protocol
tree and hex dump of that single frame.

As far as guidelines go, I have begun writing a small tutorial on how to
write dissect() routines. It's not yet complete; I wasn't going to show
it until it was ready, but since you asked.... I plan to go into more
detail on how to create the packet summary line, how to associate fields
in the protocol tree with byte ranges in the hex dump, and how to use
the various utility routines in ethereal (ip_to_strval, etc). If anyone
wants to add some paragraphs, please send me the additions! (or wait
until it's in CVS)

> THINGS ABOUT ETHEREAL I HAVE A PROBLEM WITH
> 
> 1. It does not presently handle enough protocols. I am contributing towards
> fixing this, as are many others.

Perhaps this tutorial will help with that.

--gilbert

How to Add a New Protocol to Ethereal
=====================================
$Id$

Overview of Decoding Routines
-----------------------------
To add a new protocol to ethereal's decoding collection, you need to know
how to program in the C language. In the future ethereal might provide
an easy scripting language so users can decode arbitrary protocols,
but until then, you have to write in C.

When analyzing a trace file, ethereal makes use of the file in two
different stages.  Immediately after opening the capture file, ethereal
loops through the entire file, packet-by-packet. It calls the top-level
packet-decoding function, dissect_packet(), for each packet. But the
decoding done at this time is minimal: this is the time in which the
packet list is created, showing a single summary line for each packet. More
importantly, ethereal also records the offset of each packet in the trace
file.

In the GUI, when the user highlights a specific packet, ethereal seeks
to the point in the capture file where that packet is located, and again
calls dissect_packet(). This time, however, a full decode is done. At
this time the tree view and byte view are created and displayed in
ethereal's window.
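
The two stages described above can be sketched roughly as follows. This is
only an illustration: the capture format (a 1-byte length followed by that
many bytes) and the names index_packets() and read_packet_at() are made up
for the sketch and are not ethereal's real file format or functions.

```c
#include <stdio.h>

/* Toy capture format (an assumption for this sketch, not ethereal's real
 * format): each record is a 1-byte length followed by that many bytes. */

/* Stage 1: walk the whole file once, remembering where each record starts.
 * Returns the number of records found (at most max_packets). */
static size_t index_packets(FILE *fp, long *offsets, size_t max_packets)
{
	size_t count = 0;

	rewind(fp);
	while (count < max_packets) {
		long pos = ftell(fp);
		int len = fgetc(fp);

		if (len == EOF)
			break;
		offsets[count++] = pos;
		/* A real first pass would build the summary line here. */
		if (fseek(fp, (long)len, SEEK_CUR) != 0)
			break;
	}
	return count;
}

/* Stage 2: seek straight to one record and read it back for a full decode.
 * Returns the payload length, or -1 on error. */
static int read_packet_at(FILE *fp, long offset,
		unsigned char *buf, size_t buf_len)
{
	int len;

	if (fseek(fp, offset, SEEK_SET) != 0)
		return -1;
	len = fgetc(fp);
	if (len == EOF || (size_t)len > buf_len)
		return -1;
	if (fread(buf, 1, (size_t)len, fp) != (size_t)len)
		return -1;
	return len;
}
```

Keeping only the offsets from the first pass means the second pass can jump
to any packet without rereading the packets before it.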

Packet decoding is done in a hierarchical fashion. This is natural since the
protocol layers are stacked upon each other. It is dissect_packet()'s job to
figure out which datalink type the packet is encapsulated in (ethernet,
token-ring, fddi, PPP, etc.) and call the appropriate dissect_*() routine.
After the datalink dissect_*() routine decodes its portion of the packet, it
is responsible for determining which dissect_*() routine to call next. The
dissect_*() routines continue to call other dissect_*() routines until some
dissect_*() determines that there is no more data to decode.

The calling sequence for a packet in a telnet trace, for example, might look
like this:

ethereal -> dissect_packet()             Top level routine
            -> dissect_eth()             Ethernet layer
               -> dissect_ip()           IP layer
                  -> dissect_tcp()       TCP layer
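
A stripped-down sketch of such a chain is shown below. The signatures are
simplified (a packet length replaces the frame_data and tree arguments), and
the link_type enum, call_trace buffer and trace() helper are hypothetical
names added only so the order of calls can be seen. The header facts used
are the standard ones: a 14-byte Ethernet header with the type in bytes
12-13 (0x0800 for IP), and an IP header whose length is the low nibble of
its first byte times 4, with the protocol number in byte 9 (6 for TCP).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins; the real ethereal types and signatures differ. */
enum link_type { LINK_ETHERNET, LINK_UNKNOWN };

static char call_trace[64];	/* records which dissectors ran, in order */

static void trace(const char *name)
{
	size_t used = strlen(call_trace);

	snprintf(call_trace + used, sizeof call_trace - used, "%s%s",
		used ? " -> " : "", name);
}

static void dissect_tcp(const unsigned char *pd, int offset, int len)
{
	(void)pd; (void)offset; (void)len;
	trace("tcp");		/* last dissector in this sketch */
}

static void dissect_ip(const unsigned char *pd, int offset, int len)
{
	int header_len;

	trace("ip");
	if (len - offset < 20)
		return;		/* shorter than a minimal IP header */
	header_len = (pd[offset] & 0x0f) * 4;
	if (pd[offset + 9] == 6)	/* IP protocol 6 is TCP */
		dissect_tcp(pd, offset + header_len, len);
}

static void dissect_eth(const unsigned char *pd, int len)
{
	trace("eth");
	if (len < 14)
		return;		/* shorter than an Ethernet header */
	if (pd[12] == 0x08 && pd[13] == 0x00)	/* type 0x0800 is IP */
		dissect_ip(pd, 14, len);
}

static void dissect_packet(enum link_type lt, const unsigned char *pd, int len)
{
	call_trace[0] = '\0';
	trace("packet");
	if (lt == LINK_ETHERNET)
		dissect_eth(pd, len);
}
```

Each routine decodes only its own header and then decides who is called
next, so no routine needs to know about layers other than the one directly
above it.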

How to Structure Your New Routine
---------------------------------
Take a look at any packet-*.c file. Each dissect_*() routine works the same
way. The datalink-level dissect routine accepts 3 arguments:

void dissect (const u_char *data_buffer,
		frame_data *packet_information,
		proto_tree *tree_widget);

Each non-datalink-level dissect accepts 4 arguments:

void dissect (const u_char *data_buffer,
		int data_offset,
		frame_data *packet_information,
		proto_tree *tree_widget);

The data_buffer is where a copy of the packet is stored. The data_offset
is where this protocol layer's portion of the packet begins. It is
an index into data_buffer[]. The frame_data structure contains low-level
information about the packet; you will rarely need this in your own
dissect_*() routine.  The tree_widget is an abstract object representing
the point in the protocol tree where the information from *this* layer
will be attached.  The protocol tree is the widget viewable in the middle
pane of ethereal.  In the GTK version of ethereal, a proto_tree is really
a GtkWidget. But the programmer is never aware of that.

Notice that there is no flag passed to your dissect_*() routine that tells
your code whether ethereal wants a short decode (for the summary line) or a
full decode. Your code determines this from the value of the tree widget. If
tree_widget is NULL, then you have no tree to draw on, so ethereal
wants a short decode for the summary line. If tree_widget != NULL, then
there is a tree object to draw on, so ethereal wants a full decode.

In your routine then, enclose any tree-related code in an if-statement:

	if (tree_widget) {
		/* do something */
	}

Any code that is used in the creation of the summary line should exist outside
the true condition of the if statement.
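
As a rough sketch of that layout, here is a toy UDP-style routine. The
toy_frame and toy_tree structures and the tree_add() helper are hypothetical
stand-ins for frame_data, proto_tree and ethereal's real tree routines; only
the placement of the summary code and the tree code is the point.

```c
#include <stdio.h>

/* Hypothetical stand-ins for frame_data and proto_tree. */
struct toy_frame { char summary[64]; };
struct toy_tree  { int n_items; char items[4][64]; };

/* Hypothetical helper: append one "label: value" line to the tree. */
static void tree_add(struct toy_tree *tree, const char *label, int value)
{
	if (tree->n_items < 4)
		snprintf(tree->items[tree->n_items++], sizeof tree->items[0],
			"%s: %d", label, value);
}

static void dissect_udp(const unsigned char *pd, int offset,
		struct toy_frame *fd, struct toy_tree *tree)
{
	/* UDP ports are 16-bit big-endian values at bytes 0-1 and 2-3. */
	int sport = (pd[offset] << 8) | pd[offset + 1];
	int dport = (pd[offset + 2] << 8) | pd[offset + 3];

	/* Summary-line code runs on every call, so it stays outside the if. */
	snprintf(fd->summary, sizeof fd->summary, "UDP %d > %d", sport, dport);

	/* Tree code runs only when there is a tree to draw on. */
	if (tree) {
		tree_add(tree, "Source port", sport);
		tree_add(tree, "Destination port", dport);
	}
}
```

Calling this with a NULL tree fills in only the summary; calling it with a
tree fills in both.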

Your protocol layer ends in one of three ways:

	1. Another protocol layer exists after yours.
	2. Application data exists after your protocol layer.
	3. The packet ends with your protocol layer.

If another protocol layer exists after yours, your code must determine which
protocol that is and call the appropriate dissect_*() routine. You must also
determine the offset at which the next protocol layer begins and pass that in
the arguments to the next dissect_*() routine. The next protocol begins at the
byte after your protocol layer ends.

If your protocol layer is carrying data which does not need to be decoded, but
which should be marked as data in the protocol tree and hex dump, then call
the dissect_data() routine with the appropriate offset. It marks the rest of
the packet as data and does no decoding of the data.

If the packet ends with your protocol layer, do nothing. Simply return; you
are the end of the dissect_*() chain.
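
The three endings can be sketched with a toy UDP routine. The signatures
here are simplified assumptions (a packet length replaces the frame_data and
tree arguments), dissect_dns() and dissect_data() are stubs, and the
last_step and last_offset variables exist only so the outcome can be
inspected. The UDP header is a fixed 8 bytes and DNS uses port 53.

```c
#define UDP_HEADER_LEN	8
#define PORT_DNS	53

enum next_step { NEXT_NONE, NEXT_DNS, NEXT_DATA };

/* Hypothetical bookkeeping so the sketch can be checked. */
static enum next_step last_step;
static int last_offset;

/* Stub for the next protocol layer. */
static void dissect_dns(const unsigned char *pd, int offset, int len)
{
	(void)pd; (void)len;
	last_step = NEXT_DNS;
	last_offset = offset;
}

/* Stub: mark the rest of the packet as undecoded data. */
static void dissect_data(const unsigned char *pd, int offset, int len)
{
	(void)pd; (void)len;
	last_step = NEXT_DATA;
	last_offset = offset;
}

static void dissect_udp(const unsigned char *pd, int offset, int len)
{
	int sport, dport, payload;

	last_step = NEXT_NONE;
	last_offset = -1;
	if (len - offset < UDP_HEADER_LEN)
		return;		/* truncated header, nothing more to decode */

	sport = (pd[offset] << 8) | pd[offset + 1];
	dport = (pd[offset + 2] << 8) | pd[offset + 3];

	/* The next layer begins at the byte after this header ends. */
	payload = offset + UDP_HEADER_LEN;

	if (payload >= len)
		return;				/* 3. packet ends here */
	if (sport == PORT_DNS || dport == PORT_DNS)
		dissect_dns(pd, payload, len);	/* 1. another protocol */
	else
		dissect_data(pd, payload, len);	/* 2. application data */
}
```

Note that the offset passed on is always the caller's own offset plus its
own header length, never a value counted from the start of the layer.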