Sorry about the late reply. I am one of the other students in the group.
Thanks for your answers. I have commented below and would appreciate further
feedback.
On Monday 5. October 2009 20.23.42 Guy Harris wrote:
> The paper says
>
> Since exhausting the available primary memory is the problem ...
>
> What does "primary memory" refer to here?
That could certainly have been worded more clearly. By primary memory, we mean
main memory, as your reasoning led you to conclude.
The "problem", as we have understood it, and as we have seen it to be, is that
Wireshark keeps its internal representation (from reading a capture file) in
memory. I write "problem" in quotes, because in most use cases I guess that
this is not a problem at all, and this is also how almost any program
operates.
We work for an external customer who uses Wireshark and would like to be able
to analyze more data than fits in a machine's virtual memory, without having
to split up the captured data.
To be able to do this we looked at the two solutions mentioned in the PDF
Håvar sent, namely using a database and using memory-mapped files. Our main
focus is 64-bit machines, due to 64-bit OSes' liberal limits on a process's
address space. Doing the memory management ourselves, juggling what is mapped
into the 2 GiB address space of a 32-bit process at any given time, is
considered out of scope for this project. (We are going to work on this until
mid-November.)
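To illustrate what we mean by "liberal limits", here is a small, hypothetical
demonstration (not project code) of how much address space a 64-bit process
can reserve without touching physical memory. It assumes a Linux-like mmap;
the 1 TiB size and the MAP_NORESERVE flag are our own choices for the example.

/* Reserve a huge region of address space without committing RAM or swap.
 * On a 64-bit OS this normally succeeds; the point is that address space,
 * unlike physical memory, is not the scarce resource there. */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 1ULL << 40;  /* try to reserve 1 TiB of address space */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        perror("mmap");
    else
        printf("reserved %zu bytes of address space at %p\n", len, p);
    return 0;
}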
[...]
> In effect, using memory-mapped files allows the application to extend
> the available backing store beyond what's pre-allocated (note that OS
> X and Windows NT - "NT" as generic for all NT-based versions of
> Windows - both use files, rather than a fixed set of separate
> partitions, as backing store, and I think both will grow existing swap
> files or add new swap files as necessary; I know OS X does that),
> making more virtual memory available.
So, on OS X (and possibly other modern OSes), as long as there is hard disk
space available, a process will never run out of memory? (A process can have
an address space of ~18 exabytes on 64-bit OS X. [1])
This would mean that the problem only continues to exist on operating systems
using a fixed swap space, as most (all?) Linux distributions still do.
> The right long-term fix for a lot of this problem is to figure out how
> to make Wireshark use less memory; we have some projects we're working
> on to do that, and there are some additional things that can be done
> if we support fast random access to all capture files (including
> gzipped capture files, so that involves some work).
Absolutely.
> However, your
> scheme would provide a quicker solution for large captures that
> exhaust the available main memory and swap space, as long as you can
> intercept all the main allocators of main memory (the allocators in
> epan/emem.c can be intercepted fairly easily; the allocator used by
> GLib might be harder, but it still might be possible).
Yes, the solution we planned is to create memory-mapped files of a
configurable size as they are needed and map them directly into the process's
address space. When this is enabled, memory allocation would have to be
intercepted and "re-routed" to the next available chunk of a memory-mapped
file. This would of course be significantly slower, and only a real benefit on
a 64-bit system, but it would at least make such analyses *possible*.
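To make the idea concrete, here is a rough, hypothetical sketch of such an
allocator. This is not Wireshark code: the function name, chunk size and /tmp
file naming are our own assumptions, and a real version would have to hook
into the existing allocators (e.g. epan/emem.c) and handle errors, freeing and
32-bit systems properly.

/* Minimal sketch of an mmap-backed bump allocator: each chunk is a freshly
 * created file of a configurable size, mapped into the process's address
 * space; allocations are carved out of the current chunk and a new chunk
 * (and backing file) is created when it runs out. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define CHUNK_SIZE (64 * 1024 * 1024)   /* configurable chunk size */

static uint8_t *chunk_base = NULL;      /* currently mapped chunk */
static size_t   chunk_used = 0;         /* bytes handed out from it */
static int      chunk_count = 0;

/* Create a new backing file, size it, and map it read/write. */
static int new_chunk(void)
{
    char path[64];
    snprintf(path, sizeof path, "/tmp/wireshark-chunk-%d", chunk_count++);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    unlink(path);                        /* space is reclaimed when unmapped */
    if (ftruncate(fd, CHUNK_SIZE) < 0) {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                           /* the mapping keeps the file alive */
    if (p == MAP_FAILED)
        return -1;

    chunk_base = p;
    chunk_used = 0;
    return 0;
}

/* Hypothetical replacement for an emem-style allocator entry point. */
void *mmap_backed_alloc(size_t size)
{
    size = (size + 7) & ~(size_t)7;      /* keep allocations 8-byte aligned */
    if (size > CHUNK_SIZE)
        return NULL;
    if (chunk_base == NULL || chunk_used + size > CHUNK_SIZE)
        if (new_chunk() < 0)
            return NULL;
    void *p = chunk_base + chunk_used;
    chunk_used += size;
    return p;
}

int main(void)
{
    char *s = mmap_backed_alloc(32);
    strcpy(s, "backed by a memory-mapped file");
    puts(s);
    return 0;
}

The important property is MAP_SHARED: dirty pages can be written back to the
chunk files on disk instead of competing for the system's swap space.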
Comments are welcome. :-)
[1]
http://developer.apple.com/mac/library/documentation/Performance/Conceptual/ManagingMemory/Articles/AboutMemory.html
--
Erlend Hamberg
"Everything will be ok in the end. If its not ok, its not the end."
GPG/PGP: 0xAD3BCF19
45C3 E2E7 86CA ADB7 8DAD 51E7 3A1A F085 AD3B CF19