Wireshark-bugs: [Wireshark-bugs] [Bug 7149] fast conversation lookup
Date: Thu, 19 Apr 2012 12:31:47 -0700 (PDT)
https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=7149

--- Comment #11 from Cristian Constantin <const.crist@xxxxxxxxxxxxxx> 2012-04-19 12:31:47 PDT ---
(In reply to comment #9)
> Further exploration reveals: all of the tested functions ("Flow Graph", "Voip
> Calls", etc.) call cf_retap_packets() in file.c, which causes Wireshark to
> completely reparse whatever packets have been selected for analysis (typically
> the whole capture's worth). This in itself is questionable, but not relevant
> here.

cristian: I have seen this as well. pretty annoying, I did not have time to
look into it though.

> 
> Parsing the capture when loading it into Wireshark for the very first time is
> fast, because of the 'last' pointer. Each packet creates a new element on the
> end of the list, which is accessible in O(1) time due to the 'last' pointer, so
> this is O(n) on the number of packets.
> 
> Re-parsing, however, is slow because the list already exists. The first packet
> can be found instantly, as it is the head. The second packet requires looking
> one further into the list, etc. This is O(n) amortized per lookup, which means
> it's O(n^2) for the number of packets. (Coincidentally, the very last packet
> can be found in O(1) again due to the 'last' pointer, but that doesn't affect
> the order of the running-time in any meaningful way).

cristian: I have tried to give a more succinct description of this in the "bug"
Description. you are right, it is O(n^2) because:

1+2+3+...+n=n(n+2)/1 -> of order O(n^2)

> 
> While there are several structurally nicer ways to solve this problem, they're
> all fairly invasive in terms of changing dissectors and data structures.
> Cristian's caching idea should actually fix this particular case if implemented
> correctly, since the cached pointer will end up acting like an iterator through
> the list when reparsing (bringing the file back to O(n)).

cristian: the problem itself is quite interesting and it seems it needs some
special tailored data structures; one could also try some kind of binary search
since the chain lists are ordered on the frame number (the problem is that they
have to grow dynamically)

> 
> Will experiment more.

cristian: thanks.

-- 
Configure bugmail: https://bugs.wireshark.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching all bug changes.