Wireshark-dev: Re: [Wireshark-dev] Replacing g_iconv and different codesets
From: Michael Lum <michael.lum@xxxxxxxxxxxxxxxxx>
Date: Fri, 20 Dec 2013 11:59:20 -0800
Okay, thanks for the responses.

I started to make some changes, but it's probably more than I have time for.

But in case I pick it up, I have a question about the ENC_ values from proto.h.

This is what I have from SVN:

#define ENC_CHARENCODING_MASK   0x7FFFFFFE      /* mask out byte-order bits */
#define ENC_ASCII               0x00000000
#define ENC_UTF_8               0x00000002
#define ENC_UTF_16              0x00000004
#define ENC_UCS_2               0x00000006
#define ENC_EBCDIC              0x00000008
#define ENC_WINDOWS_1250        0x0000000A
#define ENC_ISO_8859_1          0x0000000C
#define ENC_ISO_8859_2          0x0000000E
#define ENC_ISO_8859_5          0x00000014
#define ENC_ISO_8859_9          0x0000001C

I can't detect a pattern aside from not using the least significant bit.

I was thrown off by the gaps between 0x0000000E and 0x00000014, and between 0x00000014 and 0x0000001C.

Michael Lum (michael.lum@xxxxxxxxxxxxxxxxx) | STAR SOLUTIONS | Principal Software Engineer
4600 Jacombs Road, Richmond BC, Canada V6V 3B1 | +1.604.303.2315
 

> -----Original Message-----
> From: wireshark-dev-bounces@xxxxxxxxxxxxx 
> [mailto:wireshark-dev-bounces@xxxxxxxxxxxxx] On Behalf Of Guy Harris
> Sent: December 20, 2013 11:35 AM
> To: Developer support list for Wireshark
> Subject: Re: [Wireshark-dev] Replacing g_iconv and different codesets
> 
> 
> On Dec 20, 2013, at 11:24 AM, Jakub Zawadzki <darkjames-ws@xxxxxxxxxxxx> wrote:
> 
> > In euc-kr [1] you can see that it's using ksc5601_to_ucs4(), which
> > can be found in ksc5601.h [2].
> > ksc5601_to_ucs4() uses the conversion tables __ksc5601_hangul_to_ucs,
> > __ksc5601_hanja_to_ucs, and __ksc5601_sym_to_ucs from ksc5601.c [3].
> > 
> > These tables are quite big (about 1K lines), and __ksc5601_sym_to_ucs
> > uses C99 array index initialization (which you would need to expand
> > (convert to a switch?) before committing).
> > 
> > So I'd rather suggest still using g_iconv() for EUC-KR, but feel free
> > to introduce a new ENC_EUC_KR and move it to core.
> 
> ...and even if we do use g_iconv() for EUC-KR (or anything 
> else), it should ideally be buried in tvb_get_string_enc() 
> and so on, rather than directly used in dissectors.