Ethereal-dev: Re: [Ethereal-dev] New dissector: CDT (CompressedDataType) [PATCH]

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <gharris@xxxxxxxxx>
Date: Tue, 22 Nov 2005 16:56:32 -0800

Stig Bjørlykke wrote:

I found a problem using a Norwegian letter in my name in the AUTHORS file.

I've fixed a couple of problems with Scandinavian names. (A *quick* scan didn't show any problems with other names with accented letters, but that doesn't mean there aren't any.)

I suppose this file should be in UTF-8, sorry about that.

UTF-8 is probably the best choice, as we have names other than those in Western European languages. It's the standard character encoding for OS X; I think it's at least supported by other modern UN*Xes (and the trend might be to make it the preferred encoding); and, hopefully, at least *some* Windows apps can be made to assume that files not in Unicode are in UTF-8 rather than in the local code page.

AUTHORS also starts with the UTF-8 byte order mark:

	http://www.unicode.org/unicode/faq/utf_bom.html#25

so perhaps some of those Windows apps will recognize it as UTF-8.
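
For illustration, here is roughly how an application might recognize that mark: a minimal sketch that checks whether a buffer begins with the three bytes EF BB BF. The function name has_utf8_bom is hypothetical and is not taken from the Ethereal source or from any Windows application.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Ethereal): returns 1 if the first
 * len bytes of buf begin with the UTF-8 byte order mark EF BB BF,
 * and 0 otherwise (including when fewer than 3 bytes are available).
 */
int has_utf8_bom(const unsigned char *buf, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

    return len >= sizeof bom && memcmp(buf, bom, sizeof bom) == 0;
}
```

A caller would read the first few bytes of the file and, if the check succeeds, skip those three bytes and treat the rest as UTF-8; the absence of the mark says nothing either way about the encoding.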

What is the policy for using such letters in other files?

I'm not sure we have one. I'd say "avoid them in string constants displayed to the user", at least for now, as we don't currently have code to handle UTF-8 strings with GTK+ 1.2[.x] (GTK+ 2.x uses UTF-8 internally) or to handle them when you're sending them to a terminal or a pipe (for example, for arrows, use "->" rather than any ISO 10646 arrow character).
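
As a rough sketch of what "avoid them in string constants" could look like if checked mechanically, the following tests whether a string contains only 7-bit ASCII bytes. The name is_plain_ascii is hypothetical; nothing like it is assumed to exist in the Ethereal tree.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of Ethereal): returns 1 if every byte
 * of the NUL-terminated string s is 7-bit ASCII, 0 otherwise.  Any
 * byte above 0x7F would be part of a multi-byte UTF-8 sequence or a
 * local-code-page character.
 */
int is_plain_ascii(const char *s)
{
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s > 0x7F)
            return 0;
    }
    return 1;
}
```

With this, "src -> dst" passes, while the same string written with an ISO 10646 arrow character encoded in UTF-8 does not.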