Ethereal-dev: [Ethereal-dev] Re: Performance: change the time format for a file with 500 000 f

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Jeff Morriss <morriss@xxxxxxxxx>
Date: Thu, 04 Dec 2003 19:46:48 -0500


Jeff Morriss wrote:


Speaking of performance...

Whatever happened to making changing the time format faster? E.g. from this thread:

http://www.ethereal.com/lists/ethereal-dev/200302/msg00029.html


I ask because I recently had the pleasure of looking at a 170,000 packet
file and wanted to change the time display format--I ended up giving up
after 5 minutes.  :(


If it's still not possible to make it very fast, would it be possible to
make the GUI update more frequently while it's doing this?  Currently it
seems to update every 3 seconds (which makes clicking "Stop" difficult)...

I started looking into increasing N_PROGBAR_UPDATES when changing the
time display but quickly realized that any change I made didn't have
much affect when the number of packets displayed was large.

A little investigation showed why: it's doing O(N^2) work because it does:

for each packet
   packet_list_find_row_from_data()
   update the time


where 'packet_list_find_row_from_data()' does a linear search of the
rows and then returns the row number.

The attached patch changes this function to make a (cheap) educated guess that the caller might be doing a linear update of all the rows--if so, it short cuts the search. Of course, this helps only when there are no display filters that have eliminated packets from the display, but hey... Look at the results when it does help:

On a 2.4 GHz Xeon system, loading my 310,000 packet capture took this much time (this is for reference only):

real    0m14.379s
user    0m11.920s
sys     0m0.400s

Changing the time display without this patch took:

real    34m13.849s
user    33m58.330s
sys     0m5.400s


With the patch it took:

real    8m56.941s
user    8m45.020s
sys     0m1.550s


Changing the time format is still slow because it's still doing O(N^2) work:

'packet_list_set_text()' is called for each packet with a row number--which was gotten from the potentially O(N) 'packet_list_find_row_from_data()' above.

This then calls 'gtk_clist_set_text()' which uses the (I believe) O(N) 'ROW_ELEMENT()' to get the pointer to the actual row element again (even though we just had that value when we were in 'packet_list_find_row_from_data()'). I suspect new versions of 'packet_list_set_text()' and 'gtk_clist_set_text()' which take pointers to the row elements (making them O(1)) would be a Good Thing, but I'll have to look at that a bit later...

(I noted a couple of other places that do the same 'packet_list_find_row_from_data()' followed almost immediately by 'gtk_clist_set_text()', so there's probably a lot of room for improvement here...)

(Maybe all uses of ROW_ELEMENT() should be investigated for possible improvements?)

Attachment: gtkclist.patch.gz
Description: GNU Zip compressed data