Wireshark-dev: Re: [Wireshark-dev] Wireshark memory handling
From: Erlend Hamberg <hamberg@xxxxxxxxxxxx>
Date: Tue, 13 Oct 2009 20:00:04 +0200
On Saturday 10. October 2009 03.48.29 Guy Harris wrote:
> The data Wireshark currently keeps in its address space that could
> grow in size as the capture file grows are:
> 
> 	the frame_data structure (epan/frame_data.h) - one structure instance
> per packet;

Ok, so, if my understanding is correct, a frame_data structure is created for 
every packet that is read. According to gdb, this structure is 80 bytes on a 
32-bit machine and 128 bytes on a 64-bit machine.
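
To get a feel for what that means for large captures, here is a quick 
back-of-the-envelope sketch. The helper names are made up for illustration 
and the sizes are just the gdb figures above, so they will differ between 
builds. For 10 million packets it gives 800,000,000 bytes on 32-bit and 
1,280,000,000 bytes on 64-bit for the frame_data structures alone.

```c
#include <stdint.h>
#include <stdio.h>

/* Sizes reported by gdb in this thread; assumed, not taken from the headers. */
#define FRAME_DATA_BYTES_32BIT 80u
#define FRAME_DATA_BYTES_64BIT 128u

/* Bytes used by frame_data structures alone for a given packet count.
 * Hypothetical helper, not part of Wireshark. */
uint64_t frame_data_overhead(uint64_t packets, uint64_t bytes_per_frame)
{
    return packets * bytes_per_frame;
}

/* Print the overhead in MiB for a few capture sizes. */
void print_frame_data_overhead(void)
{
    const uint64_t counts[] = { 1000000u, 10000000u, 100000000u };
    const double mib = 1024.0 * 1024.0;

    for (size_t i = 0; i < sizeof counts / sizeof counts[0]; i++) {
        printf("%llu packets: %.1f MiB (32-bit), %.1f MiB (64-bit)\n",
               (unsigned long long)counts[i],
               frame_data_overhead(counts[i], FRAME_DATA_BYTES_32BIT) / mib,
               frame_data_overhead(counts[i], FRAME_DATA_BYTES_64BIT) / mib);
    }
}
```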

> 	the text for some or all of the columns in all of the rows of the
> packet list (all, in current releases of Wireshark; some, in the
> development branch);

Ok, not much to save here after the introduction of the new packet list, I 
guess.

> 	various per-packet private data attached to some frames by dissectors;
> 
> 	various per-dissector private data structures;

Hard to avoid. :-)

> 	the results of reassembly.

See below.

> The data from the frames in the capture file are not kept in
> Wireshark's address space - they are read in as necessary, into a
> small number of buffers (one for the main window, and one for each
> packet window opened).  *HOWEVER*, if data from a frame is reassembled
> into a higher-level multiple-frame packet, the result of the
> reassembly is, as noted, kept in Wireshark's address space.

So, when Wireshark reads the capture file, if it finds a single-frame packet, 
it will only create a frame_data structure in memory and possibly data from 
the dissector for that type of packet. But if the packet is made up of several 
frames, the packet is reassembled and kept in memory? If so, do you think this 
could be changed? Would it be worth it?
 
> People complain about it enough that, while in *most* cases it might
> not be a problem, we frequently get mail from people who have to split
> up capture files to read them - I'd call it enough of a problem that
> we should work on it (ideally, by reducing the amount of address space
> required by the aforementioned data items).

Yes, absolutely.

It would still be nice if it were possible for people to analyse more data 
than will fit in virtual memory (in the case of Linux, Solaris, etc., where the 
swap space is fixed). I see that there is an "abstraction" of memory 
allocation in epan/emem.c (se_alloc* and friends), but g_malloc and plain 
malloc are used as well, it seems.
If the functions in emem.c were used for all memory allocation and freeing, 
this could be done by intercepting requests for memory in those functions.
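
A minimal sketch of what such interception could look like, assuming every 
allocation went through one pair of functions. The names tracked_alloc, 
tracked_free, tracked_in_use and tracked_set_limit are made up for 
illustration and are not part of emem.c. The sketch only counts bytes and 
refuses requests that exceed a budget; that refusal is the point where a real 
implementation might instead move data out to disk.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header stored in front of each block so the size is known at free time.
 * The extra members only force an alignment suitable for any payload. */
typedef union {
    size_t size;
    long double align_ld;
    long long align_ll;
    void *align_ptr;
} tracked_header;

static size_t tracked_bytes = 0;         /* payload bytes currently handed out */
static size_t tracked_limit = SIZE_MAX;  /* budget; unlimited by default */

void tracked_set_limit(size_t limit) { tracked_limit = limit; }
size_t tracked_in_use(void) { return tracked_bytes; }

/* Allocate through the accounting layer; returns NULL when over budget. */
void *tracked_alloc(size_t size)
{
    if (tracked_bytes > tracked_limit || size > tracked_limit - tracked_bytes)
        return NULL;                     /* hook point: spill, evict, ... */
    if (size > SIZE_MAX - sizeof(tracked_header))
        return NULL;                     /* header + payload would overflow */

    tracked_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    tracked_bytes += size;
    return h + 1;                        /* payload starts after the header */
}

/* Release a block obtained from tracked_alloc and update the counter. */
void tracked_free(void *p)
{
    if (p == NULL)
        return;
    tracked_header *h = (tracked_header *)p - 1;
    tracked_bytes -= h->size;
    free(h);
}
```

This only works if nothing bypasses the wrapper, which is why the mix of 
se_alloc, g_malloc and plain malloc matters.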

What is the status of the use of these functions? I got the impression from 
README.malloc that these are recommended, but I mostly see allocations done 
using g_malloc. Or are those just allocations that should outlive a capture 
session?
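
For reference, my reading of README.malloc is that the se_ functions hand out 
memory that lives for the capture session and is then released in bulk, 
without individual frees. A toy version of that idea, with made-up names 
(session_alloc, session_free_all, session_live_blocks) and none of the real 
emem.c machinery, could look like this:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Each block is linked into a list so the whole session can be released in
 * one call. The extra members only force a safe payload alignment. */
typedef union session_block {
    union session_block *next;
    long double align_ld;
    long long align_ll;
} session_block;

static session_block *session_blocks = NULL;
static size_t session_block_count = 0;

size_t session_live_blocks(void) { return session_block_count; }

/* Allocate memory that stays valid until session_free_all() is called. */
void *session_alloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(session_block))
        return NULL;                     /* header + payload would overflow */

    session_block *b = malloc(sizeof *b + size);
    if (b == NULL)
        return NULL;
    b->next = session_blocks;            /* push onto the session list */
    session_blocks = b;
    session_block_count++;
    return b + 1;
}

/* Release everything allocated since the last call, e.g. when the capture
 * file is closed. Callers never free individual blocks. */
void session_free_all(void)
{
    while (session_blocks != NULL) {
        session_block *next = session_blocks->next;
        free(session_blocks);
        session_blocks = next;
    }
    session_block_count = 0;
}
```

Allocations that have to outlive a capture session would not fit this scheme, 
which may be why they go through g_malloc instead.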

-- 
Erlend Hamberg
"Everything will be ok in the end. If its not ok, its not the end."
GPG/PGP:  0xAD3BCF19
45C3 E2E7 86CA ADB7 8DAD 51E7 3A1A F085 AD3B CF19
