Ethereal-dev: Re: [Ethereal-dev] Performance. Ethereal is slow when using large captures.
Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.
From: Richard Sharpe <rsharpe@xxxxxxxxxxxxxxxxx>
Date: Thu, 13 Nov 2003 21:55:37 -0800 (PST)
On Fri, 14 Nov 2003, Ronnie Sahlberg wrote:

> Ethereal is very slow when working with very large captures containing
> hundreds of thousands of packets.
> Every operation we do that makes Ethereal rescan the capture file takes
> a long time.

Yup. I have several 300-400+MB captures of NetBench traces etc., and
things can get really bad :-(

> I would like to start looking at methods to make Ethereal faster,
> especially for large captures, trading a larger memory footprint for
> faster refiltering speed, but would like to know if there are others
> interested in helping.
> (You don't have to be a programmer; someone who knows how to run gprof
> and read the data would be very, very helpful as well.)

Well, I am a programmer, and, within the constraints of the little
remaining time I have, I would like to help work on this.

> If there is interest in this, I propose we take 0.9.16 as the baseline
> and measure the execution speed and memory footprint of Ethereal for
> this version to start with.

Sounds good.

> A good example capture might be setting the snaplen to 100 and
> downloading Ethereal from the website over and over until the capture
> file has 100,000 packets (10MB).

Well, I can supply some fraction of my NetBench captures :-)

> Create three color filters: "ip", "tcp" and "frame.pkt_len>80" (random
> filter strings off the top of my head).
> Open the UDP and the TCP conversation lists.
> Load the capture into Ethereal.
>
> Then measure the time it takes to refilter the capture as fast as
> possible 10 times using the display filter "tcp.port==80".
> After the 10th refilter of the trace, measure the memory footprint of
> Ethereal using ps.
>
> Repeat this using a gzipped copy of the capture file.
> This could be the baseline.
> (The time to perform the test should be a couple of minutes at least so
> that we can see any differences between test implementations more
> easily.)
>
> After this it would be good if someone ran this test and collected gprof
> data to see where most of the CPU time is spent.
> I guess, but am not sure, that it is in the actual dissection of the
> packets. It would be great to have this verified.
> Everything below this line assumes the theory that the dissection of
> packets takes up a significant part of the time to refilter a capture.
>
> Hypothesis: a lot of time is spent redissecting the packets over and
> over, every time we refilter the trace.
> Possible solution: reduce the number of times we redissect the packets.
> (We would need a preference option called "Faster refiltering at the
> expense of memory footprint". Memory is cheap but CPU power is finite.)
>
> Currently, every time we rescan a capture file we loop over every single
> packet and:
>   dissect the packet completely and build an edt tree (in file.c);
>   prune the edt tree to contain only those fields that are part of any
>   of the filters we use (display filter, color filters, the filters
>   used by taps, etc.);
>   evaluate each filter, one at a time, using the now-pruned edt tree;
>   discard the edt tree and go to the next packet.
>
> Possible optimization:
> When we dissect the packet the first time, keep the full edt tree
> around and never prune it.
> The loop would then be:
> Loop over all the packets:
>   for this packet, look up the unpruned edt tree (don't dissect the
>   packet at all);
>   apply all filters, one at a time, to the unpruned edt tree.
>
> This would, at the expense of keeping the potentially very
> memory-consuming edt tree around, eliminate the need to redissect the
> packet. The memory required by the edt tree could probably be optimized
> over time.
> The risk is: how will the time it takes to evaluate a filter be
> affected by the tree not being pruned?
> What is the execution time compared to the number of fields held in the
> edt tree? O(n) or O(n^2)? Maybe we would have to do major brain surgery
> there as well?
> Maybe it turns out to be unfeasible? Who knows until we have tried.
> Still, SOME things we do would require a redissect of the packet anyway,
> such as clicking on a packet in the packet list, but that is not
> performance critical.
>
> Can anyone see any obvious flaws in the reasoning above which would
> make it unfeasible or the approach irrelevant?
>
> Would anyone who knows gprof be interested in doing the measurements on
> the "baseline" as suggested, and in keeping the same sample file for
> all further performance tests we do, so we know whether we are going in
> the right direction or not?
>
> _______________________________________________
> Ethereal-dev mailing list
> Ethereal-dev@xxxxxxxxxxxx
> http://www.ethereal.com/mailman/listinfo/ethereal-dev

--
Regards
-----
Richard Sharpe, rsharpe[at]ns.aus.com, rsharpe[at]samba.org,
sharpe[at]ethereal.com, http://www.richardsharpe.com
- References:
- [Ethereal-dev] Performance. Ethereal is slow when using large captures.
- From: Ronnie Sahlberg