Wireshark-dev: [Wireshark-dev] String Handling API
From: Sebastien Tandel <sebastien@xxxxxxxxx>
Date: Sun, 24 Dec 2006 23:39:23 +0100
Hi all, Guy *really* thinks the only solution is to get an accurate string handling library and I *really* want to get rid of useless warnings and confusing code (think to the one(s) who will change the code when having a string handling API and who will have to know which field is a string(character) or not) ... I am proposing to work on this string handling API (at least on the first part of it). As described on the website iconv can't be the only way to determine character encoding. Glib2.0 is working with iconv. I thought however it was a good starting point anyway to work with glib2.0 API as wireshark is already using it. I then thought in a first draw to get an intermediate interface to the glib2.0 API for the manipulation of unicode strings as it lets the door open to differentiate the platform and the possibility to add others solutions for manipulation strings and character encoding determination. Do you think it's a good starting point? Have you more ideas than the ones described in the wishlist page? I also read again Guy's mail about (guint8*) ... here are some others thoughts on ctype functions and friends ... > Note also that you can safely pass a guint8 or guchar to one of the > <ctype.h> routines, but you can't safely pass a gchar to them, as they > might get sign-extended into negative values if the 8th bit is set (I > think that none of the popular platforms for Windows and modern UN*Xes > have C compilers with "char" an unsigned type, so I think "might" can be > replaced by "will" in practice). > Two questions : 1) How many of these functions are used in wireshark? 2) Are they correctly used? number of ctype functions : 89 (dissectors part) - So few ctype functions, we might hope that none is using a (g)char. However here are the files using gchar (or char) : (patch provided except for packet-tds) packet-catapult-dct2000.c : 449 packet-exec.c : 436, 454 packet-sip : 1178, 1236, 1298 packet-slimp3 : 372, 439 packet-tds : 958 (#if 0 but nevertheless) 10% of misuse and ctype functions are not heavily used ... sic :-/ We can then conclude that the use of guint8* in tvb_get_* don't avoid misuses of ctype functions with gchar. It does not avoid them and no one noticed them. I think you'd better put a comment on ctype and guchar(guint8) in the README.developer. packet-ftp do not use (g)char for ctypes functions but will use gchar for tvb_get_nstringz0. In this case, we nevertheless get useful warning! There are more than 100 warnings of this style in wireshark!(no patch for these ones :() There is even more confusion when seeing the signature of proto_tree_add_string using char ... :-/ Just to mention furthermore the number of g_str.* (or g_???printf) functions : more than 1000 (in dissector part) It's really a pity to get all these useless warnings. All these useless warnings are hiding all the useful ones ... (see here above) Best Regards, Sebastien Tandel
Index: gtk/isis_analysis.c =================================================================== Index: gtk/isis_analysis.h =================================================================== Index: epan/dissectors/packet-sip.c =================================================================== --- epan/dissectors/packet-sip.c (révision 20211) +++ epan/dissectors/packet-sip.c (copie de travail) @@ -1151,7 +1151,7 @@ guint transport_slash_count = 0; gboolean transport_name_started = FALSE; gboolean colon_seen = FALSE; - gchar c; + guchar c; gchar *param_name = NULL; /* skip Spaces and Tabs */ Index: epan/dissectors/packet-exec.c =================================================================== --- epan/dissectors/packet-exec.c (révision 20211) +++ epan/dissectors/packet-exec.c (copie de travail) @@ -46,8 +46,8 @@ /* Forward declaration we need below */ void proto_reg_handoff_exec(void); -gboolean exec_isprint_string(gchar *string); -gboolean exec_isdigit_string(gchar *string); +gboolean exec_isprint_string(guchar *string); +gboolean exec_isdigit_string(guchar *string); /* Variables for our preferences */ static gboolean preference_info_show_username = TRUE; @@ -113,7 +113,7 @@ proto_tree *exec_tree=NULL; /* Variables for extracting and displaying data from the packet */ - gchar *field_stringz; /* Temporary storage for each field we extract */ + guchar *field_stringz; /* Temporary storage for each field we extract */ gint length; guint offset = 0; @@ -250,7 +250,7 @@ */ if(length == 1 || (exec_isdigit_string(field_stringz) && length <= EXEC_STDERR_PORT_LEN)){ - proto_tree_add_string(exec_tree, hf_exec_stderr_port, tvb, offset, length, field_stringz); + proto_tree_add_string(exec_tree, hf_exec_stderr_port, tvb, offset, length, (gchar*)field_stringz); /* Next field we need */ hash_info->state = WAIT_FOR_USERNAME; } else { @@ -270,13 +270,13 @@ /* Check if this looks like the username field */ if(length != 1 && length <= EXEC_USERNAME_LEN && exec_isprint_string(field_stringz)){ - proto_tree_add_string(exec_tree, hf_exec_username, tvb, offset, length, field_stringz); + proto_tree_add_string(exec_tree, hf_exec_username, tvb, offset, length, (gchar*)field_stringz); /* Store the username so we can display it in the * info column of the entire conversation */ if(!hash_info->username){ - hash_info->username=se_strdup(field_stringz); + hash_info->username=se_strdup((gchar*)field_stringz); } /* Next field we need */ @@ -298,7 +298,7 @@ /* Check if this looks like the password field */ if(length != 1 && length <= EXEC_PASSWORD_LEN && exec_isprint_string(field_stringz)){ - proto_tree_add_string(exec_tree, hf_exec_password, tvb, offset, length, field_stringz); + proto_tree_add_string(exec_tree, hf_exec_password, tvb, offset, length, (gchar*)field_stringz); /* Next field we need */ hash_info->state = WAIT_FOR_COMMAND; @@ -321,13 +321,13 @@ /* Check if this looks like the command field */ if(length != 1 && length <= EXEC_COMMAND_LEN && exec_isprint_string(field_stringz)){ - proto_tree_add_string(exec_tree, hf_exec_command, tvb, offset, length, field_stringz); + proto_tree_add_string(exec_tree, hf_exec_command, tvb, offset, length, (gchar*)field_stringz); /* Store the username so we can display it in the * info column of the entire conversation */ if(!hash_info->command){ - hash_info->command=se_strdup(field_stringz); + hash_info->command=se_strdup((gchar*)field_stringz); } } else { @@ -427,7 +427,7 @@ /* Custom function to check if an entire string is printable. */ gboolean -exec_isprint_string(gchar *string) +exec_isprint_string(guchar *string) { guint position; @@ -445,7 +445,7 @@ /* Custom function to check if an entire string is digits. */ gboolean -exec_isdigit_string(gchar *string) +exec_isdigit_string(guchar *string) { guint position; Index: epan/dissectors/packet-slimp3.c =================================================================== --- epan/dissectors/packet-slimp3.c (révision 20211) +++ epan/dissectors/packet-slimp3.c (copie de travail) @@ -231,7 +231,7 @@ gint i1; gint offset = 0; guint16 opcode; - char lcd_char; + guchar lcd_char; char lcd_str[MAX_LCD_STR_LEN + 1]; int to_server = FALSE; int old_protocol = FALSE; Index: epan/dissectors/packet-catapult-dct2000.c =================================================================== --- epan/dissectors/packet-catapult-dct2000.c (révision 20211) +++ epan/dissectors/packet-catapult-dct2000.c (copie de travail) @@ -26,6 +26,8 @@ # include "config.h" #endif +#include <glib.h> + #include <string.h> #include <ctype.h> #include <epan/packet.h> @@ -105,7 +107,7 @@ void proto_register_catapult_dct2000(void); static dissector_handle_t look_for_dissector(char *protocol_name); -static void parse_outhdr_string(char *outhdr_string); +static void parse_outhdr_string(guchar *outhdr_string); static void attach_fp_info(packet_info *pinfo, gboolean received, const char *protocol_name, int variant); @@ -433,7 +435,7 @@ /* Populate outhdr_values array with numbers found in outhdr_string */ -void parse_outhdr_string(char *outhdr_string) +void parse_outhdr_string(guchar *outhdr_string) { int n = 0; @@ -444,7 +446,7 @@ guint digits; /* Find digits */ - for (digits = 0; digits < strlen(outhdr_string); digits++, n++) + for (digits = 0; digits < strlen((gchar*)outhdr_string); digits++, n++) { if (!isdigit(outhdr_string[n])) { @@ -702,7 +704,7 @@ (strcmp(protocol_name, "fp_r5") == 0) || (strcmp(protocol_name, "fp_r6") == 0)) { - parse_outhdr_string((char*)tvb_get_ephemeral_string(tvb, outhdr_start, outhdr_length)); + parse_outhdr_string(tvb_get_ephemeral_string(tvb, outhdr_start, outhdr_length)); attach_fp_info(pinfo, direction, protocol_name, atoi((char*)tvb_get_ephemeral_string(tvb, variant_start, variant_length))); }
- References:
- [Wireshark-dev] guint8* and gchar* ... and Vim ?! :)
- From: Sebastien Tandel
- Re: [Wireshark-dev] guint8* and gchar* ... and Vim ?! :)
- From: Guy Harris
- [Wireshark-dev] guint8* and gchar* ... and Vim ?! :)
- Prev by Date: Re: [Wireshark-dev] [Wireshark-commits] rev 20209: /trunk/epan/ftypes/ /trunk/epan/ftypes/: ftype-pcre.c
- Next by Date: [Wireshark-dev] build problem
- Previous by thread: Re: [Wireshark-dev] guint8* and gchar* ... and Vim ?! :)
- Next by thread: [Wireshark-dev] How to dissect bit information??
- Index(es):