Wireshark-bugs: [Wireshark-bugs] [Bug 13033] Various TreeItem:add_packet_field bugs
Date: Mon, 24 Oct 2016 07:13:54 +0000

Comment # 3 on bug 13033 from
(In reply to Redford from comment #0)
> Only the first character of UTF-16-LE sequence is displayed. Sample input
> for this field: 3C00610031003E000000 (hex, 10 bytes)
> 
> 4. parsing strings returns \oct sequence instead of string
> Code: same as above (3.), but with "ENC_UTF_16 + ENC_STRING +
> ENC_LITTLE_ENDIAN" replaced by "ENC_UTF_16".

That means the same thing as "...replaced by 'ENC_UTF_16 + ENC_BIG_ENDIAN'", as
ENC_BIG_ENDIAN is 0. So the string is treated as *big-endian* UTF-16, which
means that the sample input is U+3C00 U+6100 U+3100 U+3E00 U+0000, which
maps to UTF-8 (Wireshark's internal encoding for strings) as...

> Result: "test: \343\260\200\346\204\200\343\204\200\343\270\200"

"\343\260\200\346\204\200\343\204\200\343\270\200", if represented as a C
string.
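
The mapping above can be checked outside Wireshark. This is a minimal Python
sketch, not Wireshark code: it decodes the 10 sample bytes from comment #0
both ways and prints the UTF-8 bytes of the big-endian reading as C-style
octal escapes. The helper name c_octal_escape is made up for this
illustration.

```python
# The 10 sample bytes from comment #0.
raw = bytes.fromhex("3C00610031003E000000")


def c_octal_escape(data: bytes) -> str:
    """Render printable ASCII as-is and every other byte as a \\ooo escape."""
    return "".join(
        chr(b) if 0x20 <= b < 0x7F else "\\%03o" % b for b in data
    )


# Big-endian reading: what plain ENC_UTF_16 selects, since ENC_BIG_ENDIAN is 0.
big = raw.decode("utf-16-be").rstrip("\x00")
# Little-endian reading: what the reporter intended.
little = raw.decode("utf-16-le").rstrip("\x00")

print([f"U+{ord(c):04X}" for c in big])     # code points of the BE reading
print(c_octal_escape(big.encode("utf-8")))  # UTF-8 bytes as octal escapes
print(little)                               # the LE reading
```

The second line printed should match the "Result:" string quoted above, and
the last line is the "<a1>" text that the little-endian reading gives.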

> btw.: Is there any documentation for these ENC_* flags? Especially the
> string ones seems quite cryptic to me (e.g. what does ENC_STRING really do).

ENC_STRING is documented here:

    https://www.wireshark.org/docs/wsdg_html_chunked/lua_module_Tree.html

"Another new feature added to this function in Wireshark version 1.11.3 is the
ability to extract native number ProtoFields from string encoding in the
TvbRange, for ASCII-based and similar string encodings. For example, a
ProtoField of ftypes.UINT32 type can be extracted from a TvbRange containing
the ASCII string "123", and it will correctly decode the ASCII to the number
123, both in the tree as well as for the second return value of this function.
To do so, you must set the encoding argument of this function to the
appropriate string ENC_* value, bitwise-or’d with the ENC_STRING value (see
init.lua). ENC_STRING is guaranteed to be a unique bit flag, and thus it can
be added instead of bitwise-or’ed as well. Only single-byte ASCII digit string
encoding types can be used for this, such as ENC_ASCII and ENC_UTF_8."
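
To illustrate why adding the flags works as well as or-ing them, and why
adding ENC_BIG_ENDIAN changes nothing, here is a small Python sketch. The
numeric values below are assumptions for the sake of the example; check
init.lua or epan/proto.h in your tree for the real ones. The helper
combine_or is likewise made up for this illustration.

```python
# Illustrative flag values, assumed for this sketch; check init.lua or
# epan/proto.h in your Wireshark tree for the real ones.
ENC_BIG_ENDIAN = 0x00000000
ENC_LITTLE_ENDIAN = 0x80000000
ENC_UTF_16 = 0x00000004
ENC_STRING = 0x03000000


def combine_or(*flags: int) -> int:
    """Bitwise-OR all flags together."""
    result = 0
    for flag in flags:
        result |= flag
    return result


flags = (ENC_UTF_16, ENC_STRING, ENC_LITTLE_ENDIAN)

# The flags occupy disjoint bits, so "+" and "|" give the same value.
print(sum(flags) == combine_or(*flags))

# ENC_BIG_ENDIAN is 0, so adding it is a no-op.
print(ENC_UTF_16 + ENC_BIG_ENDIAN == ENC_UTF_16)

# Adding the same flag twice is where "+" and "|" diverge.
print(ENC_UTF_16 + ENC_UTF_16 == combine_or(ENC_UTF_16, ENC_UTF_16))
```

The first two comparisons hold and the third does not, which is why
bitwise-or is the safer habit when a flag might be specified more than once.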

ENC_BIG_ENDIAN, ENC_LITTLE_ENDIAN, ENC_ASCII, ENC_UTF_8, etc. are documented in
the doc/README.dissector file in the source code:

    https://code.wireshark.org/review/gitweb?p=wireshark.git;a=blob_plain;f=doc/README.dissector;hb=HEAD

