Wireshark-dev: [Wireshark-dev] Fileshark (AKA Dissecting Files with Wireshark)
From: Evan Huus <eapache@xxxxxxxxx>
Date: Thu, 20 Jun 2013 11:54:37 -0700
This topic has come up a couple of times already on the list, and
people have submitted several patches to dissect files, but there
has been some worry about scope creep and the potential architectural
differences between file and packet dissection, so not a lot has
gotten done and most of those patches are still sitting in bugzilla.

We just had a fairly in-depth discussion at Sharkfest on the topic, so
I'm sending this out to summarize our conclusions and to ask for
feedback from those who didn't attend Sharkfest this year (and from
those who did, if I got something wrong).

Everything here is still tentative and subject to change, but
hopefully it will provide a good set of guidelines going forward.

The obvious benefits of dissecting files in Wireshark are two-fold.
First, our epan framework library is very well suited to the task and
makes writing file dissectors very easy for the most part. Second,
many network protocols carry files (HTTP, etc.) and it would be
beneficial to be able to decode those files in Wireshark proper.

The potential problems, as already mentioned, are scope creep and
possible conflicts in needs between packet and file dissection.

The tentative plan we have come up with is as follows:
1. Create a third dissector library libfiledissectors (in addition to
the existing libdissectors and libdirtydissectors) and link all
file-type dissectors into that.

2. Link libfiledissectors into Wireshark so that network protocols
(HTTP, etc.) can call them.

3. Create a separate application called Fileshark that doesn't use
wiretap and links only against libfiledissectors. When dissecting
files directly, wiretap simply gets in the way. Additionally, people
have pointed out several possible UI changes that make sense for files
but not for packets. While hopefully Fileshark and Wireshark will be
able to share much of their code, there are still places where they
will diverge and we want to be able to do that cleanly.

Some of the consequences of this:
- Wireshark proper will not open non-capture files directly. It will
parse files it sees over transport protocols, but parsing an on-disk
file directly will require opening it in Fileshark.
- File dissectors should not register with wiretap. They should
register with network protocols like mime-type/http and with whatever
mechanism ends up being used for Fileshark registrations.
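
To make the registration point a bit more concrete, here is a rough,
self-contained sketch of a file dissector registering against a media
type rather than with wiretap. It is a toy model only, not the epan
API: every name in it (file_dissector_fn, register_file_dissector,
find_file_dissector, dissect_png) is hypothetical, and whatever
registration mechanism Fileshark ends up with may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical file-dissector signature: returns the number of bytes
 * recognized, or 0 if the data is not in this format. */
typedef size_t (*file_dissector_fn)(const unsigned char *data, size_t len);

/* One row of a toy "media type -> dissector" table. */
struct media_type_entry {
    const char *media_type;
    file_dissector_fn dissect;
};

#define MAX_ENTRIES 16
static struct media_type_entry media_type_table[MAX_ENTRIES];
static size_t media_type_count;

/* Register a file dissector under a media type (no wiretap involved). */
static void register_file_dissector(const char *media_type,
                                    file_dissector_fn fn)
{
    assert(media_type_count < MAX_ENTRIES);
    media_type_table[media_type_count].media_type = media_type;
    media_type_table[media_type_count].dissect = fn;
    media_type_count++;
}

/* Look up the dissector for a media type; NULL if none is registered.
 * A protocol such as HTTP would call this with its Content-Type value. */
static file_dissector_fn find_file_dissector(const char *media_type)
{
    for (size_t i = 0; i < media_type_count; i++) {
        if (strcmp(media_type_table[i].media_type, media_type) == 0)
            return media_type_table[i].dissect;
    }
    return NULL;
}

/* Toy PNG dissector: only checks the 8-byte PNG file signature. */
static size_t dissect_png(const unsigned char *data, size_t len)
{
    static const unsigned char sig[8] =
        { 0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n' };

    if (len < sizeof sig || memcmp(data, sig, sizeof sig) != 0)
        return 0;
    return sizeof sig;
}
```

In this model, Wireshark proper would reach dissect_png only through a
lookup such as find_file_dissector("image/png") made by a network
protocol, while Fileshark would hand it the on-disk bytes directly.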

Obviously this is not a small project, but hopefully we can leverage
as much existing infrastructure as possible to make it relatively
painless.

Regarding patches: creating libfiledissectors should be fairly easy to
do, at which point it makes sense to start accepting patches for file
formats as long as they behave appropriately. Work on the Fileshark
application proper is also needed, so patches in that direction are
welcome as well.

Again, this is not a final declaration of law, just an approximate
direction we want to go in.

Thoughts? Comments? Suggestions?

Cheers,
Evan