Re: Next step: msg#00031

ietf.apps-discuss

Subject: Re: Next step

Frank Ellermann said:
> o Java uses the form \uNNNN, but can represent characters outside
> Plane 0 (i.e., above U+FFFF) only by the use of surrogate pairs.
>
> One of the reasons why anything with \u or \U is a non-starter, there
> are too many incompatible conventions in use.

In particular, C uses \U with 8 hex digits because, at the time, there was
a serious possibility that ISO 10646 would still allow all 32 bits (or at
least 31) to be used.

Had it been a few years later, we would probably have made \U indicate 6
hex digits rather than 8.
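To make the difference between the two conventions concrete, here is a minimal sketch in Python, which happens to follow C's \uNNNN / \UNNNNNNNN lengths in its own string literals. The helper names (to_surrogate_pair, java_escape, c_escape) are made up for this illustration and come from no library or draft.

```python
# Illustration only: how one code point outside Plane 0 is written
# under the Java-style and C-style escape conventions.

def to_surrogate_pair(cp):
    """Split a code point above U+FFFF into UTF-16 high and low surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not outside Plane 0")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

def java_escape(cp):
    """Java-style: always \\uNNNN, using a surrogate pair above U+FFFF."""
    if cp <= 0xFFFF:
        return "\\u%04X" % cp
    high, low = to_surrogate_pair(cp)
    return "\\u%04X\\u%04X" % (high, low)

def c_escape(cp):
    """C-style: \\uNNNN up to U+FFFF, \\UNNNNNNNN (8 hex digits) above."""
    if cp <= 0xFFFF:
        return "\\u%04X" % cp
    return "\\U%08X" % cp

print(java_escape(0x12345))  # two 4-digit escapes forming a surrogate pair
print(c_escape(0x12345))     # one 8-digit escape
```

The same character therefore has two unrelated spellings, and neither matches a 5- or 6-digit form.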

> There is one significant disadvantage of the recommended form. The
>
> No, there are more, folks will assume that it's a convention they know
> or a variant of U+NNNN[N[N]] with an arbitrary number of leading 0s.
> Nobody will use \U012345 when they can hope to get away with \U12345.

+1

> should not introduce any security issues that are not present as a
>
> My objections are also security considerations, because folks will
> screw up with this encoding it could cause havoc.

+2

If UTF-8 can make a security issue out of having more than one way to
encode a character, so can we.

[Which reminds me: being able to encode ASCII characters in this form might
be a security issue as well, or it might be a useful benefit.]
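For anyone who has not met the UTF-8 problem, the following Python sketch shows the overlong-encoding case. The function naive_decode_2byte is a deliberately careless decoder written for this illustration; strict_decoder_rejects just wraps Python's built-in strict UTF-8 codec.

```python
# Illustration only: "/" (U+002F) has exactly one valid UTF-8 encoding, 0x2F.
# The two-byte sequence C0 AF is an "overlong" form that a careless decoder
# maps to the same character, slipping past any check done on the raw bytes.

def naive_decode_2byte(data):
    """Decode a 2-byte UTF-8 sequence WITHOUT rejecting overlong forms."""
    b0, b1 = data
    return chr(((b0 & 0x1F) << 6) | (b1 & 0x3F))

def strict_decoder_rejects(data):
    """Return True if Python's strict UTF-8 decoder refuses the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False

overlong_slash = b"\xc0\xaf"

print(b"/" in overlong_slash)                  # a byte-level filter sees no slash
print(naive_decode_2byte(overlong_slash))      # the careless decoder produces one
print(strict_decoder_rejects(overlong_slash))  # a strict decoder refuses the input
```

Any escape notation that lets one character be spelled several ways invites the same mismatch between what a filter checks and what a decoder later produces.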

> In theory your proposal is compatible with C044, but in practice I
> fear that it won't work as you expect it. I could live with e.g.
> "authors SHOULD either pick hex. NCRs as in XML or" (your proposal),
> but in fact I think that the XML-notation is much better.

My only, small, discomfort is that people will expect all protocols to
accept both hex (&#x1234;) and decimal (&#1234;).
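As a small illustration of the two forms, the sketch below uses Python's standard html.unescape, which accepts both. The helper to_hex_ncr is made up for this example and shows what a hex-only writer might emit.

```python
import html

# Illustration only: the same digits mean different characters depending
# on whether the reference is hexadecimal or decimal.
hex_form = "&#x1234;"   # hexadecimal: U+1234
dec_form = "&#1234;"    # decimal 1234, which is U+04D2

hex_char = html.unescape(hex_form)
dec_char = html.unescape(dec_form)

def to_hex_ncr(ch):
    """Write a single character as a hexadecimal numeric character reference."""
    return "&#x%X;" % ord(ch)

print("U+%04X" % ord(hex_char))
print("U+%04X" % ord(dec_char))
print(to_hex_ncr(dec_char))
```

A protocol that specifies only the hex form would have to decide what to do when it receives the decimal one.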

--
Clive D.W. Feather | Work: <clive@xxxxxxxxx> | Tel: +44 20 8495 6138
Internet Expert | Home: <clive@xxxxxxxxxx> | Fax: +44 870 051 9937
Demon Internet | WWW: http://www.davros.org | Mobile: +44 7973 377646
THUS plc | |



