Wireshark-dev: Re: [Wireshark-dev] GSoC 2013: Process Information
From: Gerald Combs <gerald@xxxxxxxxxxxxx>
Date: Wed, 24 Apr 2013 11:20:25 -0700
On 4/24/13 3:26 AM, Ashish wrote:
> Hi all,
> 
> This mail is in reply to the mail sent (below) by Kostadin.
> 
> 
>     I'm contacting you with an intent to request some further info about
>     the task "Process Information" as found on the Wireshark's Google
>     Summer of Code 2013 project page.
> 
>     After some brief research on the matter, I can't help but suspect
>     that this task is too simple for a full project commitment,
>     although I may well be overlooking some of its complexity.
> 
> 
> I'm interested in working on this project and I'm going to apply for it.
> I too thought it was a simple task when I first read it on the ideas
> page. But after going through some related literature, I understand
> that it would take a good part of the summer to complete.
> 
> 
>     This task seems like it could feasibly be done by calling the
>     commands netstat and tasklist on Windows, or netstat or ss on
>     Linux, from C and looking up the port given in Wireshark's Layer 4
>     packet info in the command output. But I don't know how time
>     efficient this is, so maybe direct kernel access would be preferred?
> 
>     However, I noticed that when looking up the port of a UDP packet,
>     the port often closes quickly and can't be found in the table (I
>     recall someone addressing this issue in the bug page given as a
>     reference), so I suppose a solution to this could be a working set
>     data structure, which remembers only the set of recently used ports
>     and their PIDs so as to limit memory consumption. I would
>     appreciate feedback on this idea.
> 
> Yes, remembering the port in a struct form would be one task but you
> can't just correlate a recently used port to any packet that comes
> within a millisecond or a lesser time interval. Pardon me if I have
> understood your "remembering" part wrong.
> 
> Also, as far as I know, calling netstat won't be efficient, since I
> believe that netstat relies on polling mechanisms, which you can't
> rely on to capture short bursts of packets. Reason? The same as above.
> 
> @others: Can anyone confirm that the Process Information task is not as
> simple as it looks, i.e. that it takes more than a struct holding the
> port number of a network socket and pulling in info from existing tools
> like netstat, lsof etc.?
> 
> I feel that packets should be handled at the kernel level, i.e. using
> Netfilter's hooks and other kernel structures/objects which can reveal
> the port information at Layer 3 or a packet's source information at
> Layer 4.
> 
> I would be really grateful if someone could brainstorm this idea with
> me, as it would help me revise/update my proposal before submitting
> it.

Polling the system's TCP and UDP connection tables is trivial but its
usefulness is limited since it assumes that your interesting traffic has
a corresponding table entry at the instant you poll. This may not be the
case for short-lived connections such as DNS or DHCP and it certainly
won't be the case for ICMP or non-IP protocols.
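
As a rough illustration of the polling approach on Linux, here is a
minimal Python sketch that parses a snapshot in the /proc/net/udp (or
/proc/net/tcp) layout and maps local ports to socket inodes. The
function name and the sample row are made up for this example; it is not
Wireshark code. It also shows the limitation: a port with no table entry
at the instant of the snapshot simply isn't found.

```python
# Hypothetical sketch: parse a /proc/net/{tcp,udp}-style snapshot.
# SAMPLE is an invented row in that layout, not real capture data.
SAMPLE = (
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when "
    "retrnsmt   uid  timeout inode ref pointer drops\n"
    "   0: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000 "
    "00000000     0        0 12345 2 0000000000000000 0\n"
)


def parse_proc_net(text):
    """Return {local_port: socket_inode} for each row in the snapshot."""
    table = {}
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated rows
        # local_address looks like "0100007F:0035": hex address, hex port
        port = int(fields[1].rsplit(":", 1)[1], 16)
        table[port] = int(fields[9])  # the tenth field is the socket inode
    return table


snapshot = parse_proc_net(SAMPLE)
print(snapshot.get(53))     # inode of the socket bound to port 53
print(snapshot.get(40000))  # None: no entry existed when we polled
```

On a real system one would read /proc/net/udp instead of SAMPLE and then
match the inode against the socket:[inode] links under /proc/<pid>/fd to
find the owning process, which typically needs elevated privileges for
other users' processes.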

System event tracing (e.g. Event Tracing for Windows, dtrace, or
whatever happens to be popular on Linux this month) or Guy's suggestion
of exposing process information through libpcap would be better, but
neither is trivial.

The project should also consider authorization. We've been trying to get
people to stop running Wireshark as root for the better part of a decade
and gathering process information shouldn't impede (much less reverse)
that effort.