Wireshark-bugs: [Wireshark-bugs] [Bug 7149] New: fast conversation lookup
Date: Thu, 19 Apr 2012 06:46:59 -0700 (PDT)
https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=7149

           Summary: fast conversation lookup
           Product: Wireshark
           Version: 1.7.x (Experimental)
          Platform: x86-64
        OS/Version: Debian
            Status: NEW
          Severity: Enhancement
          Priority: Medium
         Component: Wireshark
        AssignedTo: bugzilla-admin@xxxxxxxxxxxxx
        ReportedBy: const.crist@xxxxxxxxxxxxxx


Build Information:
wireshark 1.7.1 (SVN Rev Unknown from unknown)

Copyright 1998-2012 Gerald Combs <gerald@xxxxxxxxxxxxx> and contributors.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Compiled (64-bit) with GTK+ 2.24.10, with Cairo 1.10.2, with Pango 1.29.4, with
GLib 2.30.2, with libpcap, with libz 1.2.3.4, with POSIX capabilities (Linux),
with SMI 0.4.8, with c-ares 1.7.5, with Lua 5.1, without Python, with GnuTLS
2.12.10, with Gcrypt 1.5.0, with MIT Kerberos, with GeoIP, with PortAudio
V19-devel (built Jul 20 2011 00:01:38), with AirPcap.

Running on Linux 3.1.0-1-amd64, with locale en_US.UTF-8, with libpcap version
1.1.1, with libz 1.2.3.4, GnuTLS 2.12.10, Gcrypt 1.5.0, without AirPcap.

Built using gcc 4.6.1.

--
Caching the last element of each conversation hash chain list speeds up
building the hash/chain lists, but it does NOT help much when an arbitrary
conversation that is already in the hash table is looked up.

I did some profiling and tracing and saw that a lot of CPU time is spent in
conversation_lookup_hashtable() when Wireshark is asked to show the
"Flow Graph", "TCP Conversations", or "VoIP Calls". I used two types of
captures, each with over 500k packets:

- TCP packets having the _same_ src IP addr, src TCP port, dst IP addr, and
  dst TCP port
- (mostly) SIP packets containing SDP payloads that advertise the _same_ IP
  addr and UDP port for media

These types of captures lead to _huge_ chain lists behind a single hash bucket
(the one to which the conversation is actually mapped).

The solution would be to cache the last found conversation in the head of the
chain list and to use it whenever possible; most of the time the lookup will
then be O(1) instead of O(n), where n is the number of elements in the list.
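
To illustrate the idea (this is NOT the attached patch, only a minimal
sketch), here is a simplified chain list with a cached "latest found" pointer
in its head. The type conv_t, the field names, and the helper functions are
all hypothetical; the sketch also assumes that a chain is ordered by ascending
setup frame and that a lookup wants the last conversation set up at or before
a given frame.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified conversation record (not Wireshark's real struct). */
typedef struct conv {
    unsigned     setup_frame;   /* frame in which this conversation was set up */
    struct conv *next;          /* next conversation sharing the same key */
    struct conv *last;          /* head only: tail of the chain, for O(1) append */
    struct conv *latest_found;  /* head only: most recent successful lookup */
} conv_t;

/* Counts chain nodes visited by lookups; for illustration only. */
static unsigned long nodes_visited;

/* Turn a record into the head of a new, single-element chain. */
static void chain_init(conv_t *head, unsigned setup_frame)
{
    head->setup_frame = setup_frame;
    head->next = NULL;
    head->last = NULL;
    head->latest_found = NULL;
}

/* Append c (setup_frame already filled in) to the chain; assumes frame order. */
static void chain_append(conv_t *head, conv_t *c)
{
    conv_t *tail = head->last ? head->last : head;

    assert(tail->setup_frame <= c->setup_frame);
    c->next = NULL;
    c->last = NULL;
    c->latest_found = NULL;
    tail->next = c;
    head->last = c;
}

/*
 * Return the last conversation in the chain whose setup_frame is <= frame_num,
 * or NULL if there is none. With use_cache set, the walk resumes from the
 * previously found conversation whenever that one is not past frame_num.
 */
static conv_t *chain_lookup(conv_t *head, unsigned frame_num, int use_cache)
{
    conv_t *start = head;
    conv_t *match = NULL;
    conv_t *c;

    if (head == NULL)
        return NULL;

    if (use_cache && head->latest_found != NULL &&
        head->latest_found->setup_frame <= frame_num)
        start = head->latest_found;

    for (c = start; c != NULL && c->setup_frame <= frame_num; c = c->next) {
        match = c;
        nodes_visited++;
    }

    if (use_cache && match != NULL)
        head->latest_found = match;   /* remember the hit for the next lookup */

    return match;
}
```

Under these assumptions, looking up frames in increasing order visits at most
two nodes per call with the cache, while the uncached walk always restarts from
the head. A lookup for a frame earlier than the cached conversation still
falls back to the full walk, so the worst case stays O(n).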

See the attached patch.
