Comment # 35
on bug 8382
from Evan Huus
(In reply to comment #34)
> > I agree we'll want a flag for string fields at some point,
> > though I'm not sure if it should be on the hf field or in the encoding arg
> > of the tree_add call.
>
> The encoding arg says what character encoding is used for the string in the
> packet.
>
> The formatting arg says how it should be presented, and the same string
> might be presented in different ways in different contexts:
OK, makes sense.
> for XML, we'd probably want to encode all non-printable characters as
> entities, except that the 1.1 spec says in section 4.1 "Character and Entity
> References":
>
> http://www.w3.org/TR/xml11/#sec-references
>
> that "Characters referred to using character references must match the
> production for Char.", and the production for Char is
>
> Char ::= [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
>
> which excludes NUL, the surrogate blocks, U+FFFE, and U+FFFF, so I'm not
> sure how you say, in XML, "this string has the value "hi\0mom\xFFFE"";
Interesting, I'd never dug that deep into XML. If that's the production for
Char, then perhaps those characters simply aren't permitted in XML strings,
meaning an XML string can't contain an embedded NUL in any form, escaped or
not. If that's the case, I'm not sure what the appropriate thing to do is.
Strip them? Invent our own escape format?
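
To make the restriction concrete, here is a rough Python sketch that tests code points against the Char production quoted above and reports which characters of the example string fall outside it. The names `is_xml11_char` and `find_disallowed` are hypothetical helpers for illustration only, not Wireshark functions, and the example string reads the "\xFFFE" above as U+FFFE.

```python
from typing import List, Tuple

# Ranges copied from the XML 1.1 Char production quoted above:
# Char ::= [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML11_CHAR_RANGES = (
    (0x1, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
)


def is_xml11_char(code_point: int) -> bool:
    """Return True if the code point matches the Char production."""
    return any(lo <= code_point <= hi for lo, hi in XML11_CHAR_RANGES)


def find_disallowed(text: str) -> List[Tuple[int, int]]:
    """Return (index, code point) for each character outside Char."""
    return [
        (index, ord(ch))
        for index, ch in enumerate(text)
        if not is_xml11_char(ord(ch))
    ]


# The example string from the discussion: "hi", NUL, "mom", U+FFFE.
sample = "hi\0mom\ufffe"
for index, code_point in find_disallowed(sample):
    print(f"index {index}: U+{code_point:04X} is outside the Char production")
```

Whatever we decide to do with such characters (strip them, or use some escape format of our own), a check along these lines would identify where that handling is needed.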
> for JSON, we can encode anything (RFC 4627 says "All Unicode characters
> may be placed within the quotation marks except for the characters that must
> be escaped: quotation mark, reverse solidus, and the control characters
> (U+0000 through U+001F)", which sure sounds as if a JSON string can have an
> embedded NUL);
Unicode NUL is U+0000, so it falls within the set of control characters that
must be escaped, but it does sound permitted as long as you escape it.
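
As a quick illustration of that reading, the sketch below uses Python's standard `json` module (chosen only for demonstration; it says nothing about how Wireshark would emit JSON) to encode a string with an embedded NUL and decode it again. The variable names are arbitrary.

```python
import json

# The example string from the discussion: "hi", NUL, "mom", U+FFFE.
value = "hi\0mom\ufffe"

# json.dumps escapes control characters, so the NUL is written as an
# escape sequence rather than as a raw byte in the output.
encoded = json.dumps(value)
print(encoded)

# Decoding the escaped form restores the original string, NUL included.
decoded = json.loads(encoded)
print(decoded == value)
```

The round trip suggests that, unlike XML, JSON can carry the string unchanged as long as the writer escapes it.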