Wireshark-dev: Re: [Wireshark-dev] Reduce memory consumption by re-reading data from file for r
From: Jeff Morriss <jeff.morriss.ws@xxxxxxxxx>
Date: Wed, 03 Nov 2010 15:00:10 -0400
Jakub Zawadzki wrote:
Hi,

On Wed, Nov 03, 2010 at 04:24:21PM +0100, Anders Broman wrote:
Looking more closely at the reassembly code, the best solution may be to save the reassembled data
to a new temp file on the first pass and read from that file when the reassembled data is needed again,
possibly by attaching the file offset pointer to the per-packet data or storing it in the reassembled hash table.
Ultimately it would be nice to store this data in the libpcap file and have it available the next time the file
is opened.

Does anyone see any problems with having a temp file with reassembled PDUs? I suppose this file can become
almost as big as the original trace file.
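
For illustration, the temp-file scheme described above could be sketched roughly as follows. This is a minimal Python sketch, not Wireshark code: the `ReassemblyStore` class, its method names, and the tuple keys are all hypothetical, and the in-memory dictionary merely stands in for the reassembled hash table or per-packet data.

```python
import os
import tempfile


class ReassemblyStore:
    """Hypothetical store: reassembled PDUs live in a temp file and only
    their (offset, length) pairs are kept in memory."""

    def __init__(self):
        # Binary temp file; removed automatically when closed.
        self._file = tempfile.TemporaryFile()
        # Stand-in for the reassembled hash table: key -> (offset, length).
        self._index = {}

    def put(self, key, data):
        # First pass: append the reassembled bytes and remember where they went.
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(data)
        self._index[key] = (offset, len(data))

    def get(self, key):
        # Later passes: re-read the bytes from disk instead of keeping them in RAM.
        offset, length = self._index[key]
        self._file.seek(offset)
        return self._file.read(length)

    def close(self):
        self._file.close()
```

Only the index stays resident; as noted, the temp file itself can still grow to nearly the size of the original trace.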

Temp files are also kind of memory, so you won't decrease memory by using them :>
Btw. it's like reinventing swap, so if you want to use your disk as memory,
it's enough to create some big file and do mkswap & swapon.
(Well maybe on 32-bit systems it's not so easy...)

I like Jeff Morriss's idea about mmap()ed files.

It depends on what the goal is.

Memory mapped files count as part of the process' address space so mapping the capture file means that, for reassembly, we'd be trading malloc'd memory for mmap()'d memory *and* it would mean we're holding the whole file--including the bits not used in reassembly--in memory. (We don't currently do this.)

For those of us with a 64-bit address space this isn't so bad, unless we start swapping. For 32-bit users it means there's no way they could even attempt to load a file larger than 2 Gbytes[1]--even if they aren't reassembling and/or have disabled every feature they can in an attempt to get the file loaded. (Personally I think the whole world should have moved to 64-bit by now, but hey...)

[1] minus whatever memory Wireshark needs for other purposes

Memory-mapped files also are problematic when doing live captures: as the file grows we'd have to remap the memory (yikes!) or maybe create thousands of maps to one file--though mmap() says the file offset has to be aligned on page size boundaries, creating yet another problem.
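
To make the alignment problem concrete, here is a minimal sketch of mapping a small window at an arbitrary file offset. It is written in Python rather than C for brevity and is not Wireshark code; `map_window` and `read_at` are hypothetical helper names. Python's `mmap` module requires the offset to be a multiple of `mmap.ALLOCATIONGRANULARITY` (the page size on Unix), so the start is rounded down and the caller has to skip the leading slack.

```python
import mmap


def map_window(fileobj, offset, length):
    """Map `length` bytes starting at `offset`, which need not be aligned.

    Returns (mapping, delta), where `delta` is the number of slack bytes
    before the requested data caused by rounding the offset down.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    aligned = (offset // gran) * gran  # round down to a legal mmap offset
    delta = offset - aligned
    mapping = mmap.mmap(fileobj.fileno(), delta + length,
                        access=mmap.ACCESS_READ, offset=aligned)
    return mapping, delta


def read_at(fileobj, offset, length):
    """Copy `length` bytes at `offset` out of a temporary mapping."""
    mapping, delta = map_window(fileobj, offset, length)
    try:
        return mapping[delta:delta + length]
    finally:
        mapping.close()
```

Each window costs a map and unmap, and a packet that straddles a granularity boundary simply gets a slightly larger mapping; doing this for thousands of packets is the bookkeeping burden described above.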