Ethereal-dev: Re: [Ethereal-dev] Unicode strings ...
Guy Harris wrote:
On Sun, Aug 12, 2001 at 05:12:34PM -0700, Guy Harris wrote:
My inclination would be to go with UTF-8 internally,
UTF-8 has a number of advantages, and we can use iconv for conversion,
but that requires another library :-(
...although that would require either that "proto_tree_add_string()"
and so on take an argument specifying the character set to be used or
that some indication of the character set be associated with the
field.
The latter has the disadvantage that not all occurrences of the field
would necessarily have the same character set, so I'd be inclined to
add a character-set argument.
"proto_tree_add_item()" might also have to have that argument added
(the byte-order argument couldn't, I think, be used, as if the
character set in the packet is UCS-2, you'd have to specify whether
it's big-endian or little-endian UCS-2).
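
To illustrate the byte-order point, here is a rough sketch of what a
character-set argument could look like if the byte order is folded into
the character-set identifier itself. The names charset_t, CHARSET_*,
charset_unit_size() and charset_read_unit() are made up for this example
and are not part of the existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical character-set identifiers (not existing Ethereal names).
 * UCS-2 appears twice so that the byte order travels with the charset. */
typedef enum {
    CHARSET_ASCII,
    CHARSET_UTF8,
    CHARSET_UCS2_BE,
    CHARSET_UCS2_LE
} charset_t;

/* Number of bytes occupied by one code unit in the given character set. */
static size_t charset_unit_size(charset_t cs)
{
    return (cs == CHARSET_UCS2_BE || cs == CHARSET_UCS2_LE) ? 2 : 1;
}

/* Read one code unit starting at p, honouring the charset's byte order.
 * The caller must make sure charset_unit_size(cs) bytes are available. */
static uint16_t charset_read_unit(const uint8_t *p, charset_t cs)
{
    assert(p != NULL);
    switch (cs) {
    case CHARSET_UCS2_BE:
        return (uint16_t)((p[0] << 8) | p[1]);
    case CHARSET_UCS2_LE:
        return (uint16_t)((p[1] << 8) | p[0]);
    default:
        return p[0];
    }
}
```

With something like this, the same two bytes 0x00 0x41 read as 0x0041
under CHARSET_UCS2_BE and as 0x4100 under CHARSET_UCS2_LE, which is the
distinction a single byte-order flag shared with the integer fields
could not carry on its own.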
A problem I have just encountered is that proto_tree_add_string() etc. all
expect the string you are adding to be passed as the "value" argument. What
I want is a routine, like proto_tree_add_ucs2_xx, that will take a
string of length X from the TVB at a given offset, mark its extent in the
byte view, and then convert that string into the internal format (UTF-8) for
display. I would also want to do formatting as well, which means more
work ...
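
As a rough idea of the conversion half of such a routine, here is a
minimal sketch that turns a run of UCS-2 bytes into a UTF-8 string. The
function name ucs2_to_utf8() and its arguments are hypothetical. It works
on a plain byte buffer rather than a TVB, and it does not touch the
protocol tree or the byte view. Replacing surrogate code units with
U+FFFD is my own assumption, since UCS-2 proper has no surrogate pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Convert len bytes of UCS-2 at src into a newly allocated,
 * NUL-terminated UTF-8 string (hypothetical helper, not an existing API).
 * little_endian selects the byte order of the input.
 * A trailing odd byte is ignored. Returns NULL if allocation fails;
 * the caller frees the result with free().
 */
static char *ucs2_to_utf8(const uint8_t *src, size_t len, int little_endian)
{
    size_t units = len / 2;
    size_t i, o = 0;
    char *out;

    assert(src != NULL || len == 0);

    /* Each 16-bit code unit needs at most 3 bytes in UTF-8. */
    out = malloc(units * 3 + 1);
    if (out == NULL)
        return NULL;

    for (i = 0; i < units; i++) {
        const uint8_t *p = src + 2 * i;
        unsigned c = little_endian ? (unsigned)(p[0] | (p[1] << 8))
                                   : (unsigned)((p[0] << 8) | p[1]);

        /* Surrogates are not valid UCS-2 characters; emit U+FFFD. */
        if (c >= 0xD800 && c <= 0xDFFF)
            c = 0xFFFD;

        if (c < 0x80) {
            out[o++] = (char)c;
        } else if (c < 0x800) {
            out[o++] = (char)(0xC0 | (c >> 6));
            out[o++] = (char)(0x80 | (c & 0x3F));
        } else {
            out[o++] = (char)(0xE0 | (c >> 12));
            out[o++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    return out;
}
```

A real proto_tree_add_ucs2_xx would presumably fetch the bytes from the
TVB, call something like this, and then add the converted string with
the original offset and length so that the right bytes are highlighted.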
--
Richard Sharpe, rsharpe@xxxxxxxxxx, LPIC-1
www.samba.org, www.ethereal.com, SAMS Teach Yourself Samba
in 24 Hours, Special Edition, Using Samba