Re: FWD: I-D ACTION:draft-klensin-unicode-escapes-00.txt (msg#00013, ietf.apps-discuss)
John C Klensin said:
>> - it should be clear that this is for newly-designed protocols
>>   only. it shouldn't be interpreted as a request to change
>>   existing protocols (including deployed and nonstandard
>>   protocols being standardized by IETF), as this would generally
>>   break backward compatibility by changing the meaning of '\'
>
> That was intended to be clear already. If it is not
> sufficiently so, suggested text, or at least a place to put it,
> would be welcome.

How about adding "new" before "protocols" in the middle paragraph of 1.1
and in the abstract?

>> - it should be clear that this is for occasional use of
>>   non-ASCII characters within a protocol field that is
>>   constrained to contain only ASCII characters (or a subset),
>>   rather than a recommendation for how to represent non-ASCII
>>   characters in a protocol field that is capable of carrying,
>>   say, UTF-8.

> I don't know if it is clear enough or not. At some level, if
> you didn't conclude that it was clear on reading the draft, then
> that is evidence that it isn't clear enough... but I don't know
> how carefully you read it.

I don't think it would hurt to add something in 1.1. I'm not sure how to
word it, but something like "Some protocols already accept native UTF-8
or some other encoding of Unicode, and this recommendation does not
apply to such protocols."

> I've looked at several RFCs
> and U+NNNN seems to be the preferred format for character
> literals and, more commonly, for identifying the code point
> associated with a named character. It is also, fwiw, the one I
> prefer for that purpose. But it is fairly poor for inline use
> in a protocol. The authoritative definition and reference for
> that form is the "Code Points" section of "Appendix A:
> Notational Conventions" of Unicode 5.0 (the reference to the
> book is the I-D).

I don't have that book. The online version 4.1 suggests the notation
<U+0061, U+0300>, which can be abbreviated to <0061, 0300>.

This would still need some kind of introductory indicator (like \u) to
show that it's a Unicode escape.

>> one more caveat: protocol specifications need to specify this
>> notation explicitly (either directly or by reference to the
>> published RFC) if they are going to use it. conversely, this
>> notation SHOULD NOT (maybe MUST NOT) be used unless it is part
>> of the protocol specification.

> Please suggest text for specifying those rules. I constructed
> this rather more as advice to protocol designers and, to a
> lesser extent, to document authors, rather than a base for
> notational definitions to be included by reference. That could
> be changed, but I'd welcome textual suggestions.

"This specification is a recommendation to protocol designers and
document authors. A protocol or other specification MUST NOT be
interpreted as using it unless it explicitly copies this syntax or
refers to this RFC as normative."

> But it is also, if I have done
> the calculation correctly, %C3%83 and that form (used in URIs
> and IRIs) is seriously non-intuitive and certainly can't be
> converted visually.

I certainly agree that encoding of UTF-8 sequences is the wrong thing to
do.

Oh: you should explicitly forbid the use of surrogates to encode
characters above U+FFFF.

> But I
> have no particularly strong commitment to any particular
> recommendation as long as we establish a recommendation.

(1) I agree that anything is better than nothing.

(2) While \uXXXX is better than encoded UTF-8, it's far worse than
something explicitly delimited.

-- 
Clive D.W. Feather  | Work:  <clive@xxxxxxxxx>    | Tel:    +44 20 8495 6138
Internet Expert     | Home:  <clive@xxxxxxxxxx>   | Fax:    +44 870 051 9937
Demon Internet      | WWW: http://www.davros.org  | Mobile: +44 7973 377646
THUS plc            |                             |
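
To make the last few points concrete (an explicit introducer, explicit delimiters, code points rather than encoded UTF-8, and no surrogates), here is a minimal Python sketch. The `\u{...}` syntax and the names `decode_escapes` and `encode_non_ascii` are assumptions made up for illustration; they are not the notation the draft specifies, and handling of a literal backslash is deliberately left out.

```python
import re

# Hypothetical delimited escape: \u{H..H} with 1 to 6 hex digits.
# This syntax is an assumption for illustration, not the draft's notation.
_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}")


def decode_escapes(text: str) -> str:
    """Replace each \\u{H..H} escape with the code point it names."""

    def _replace(match: "re.Match[str]") -> str:
        code_point = int(match.group(1), 16)
        # Surrogates are not characters; a character above U+FFFF must be
        # written as a single escape, never as a surrogate pair.
        if 0xD800 <= code_point <= 0xDFFF:
            raise ValueError("surrogate U+%04X is not allowed" % code_point)
        if code_point > 0x10FFFF:
            raise ValueError("U+%X is outside the Unicode range" % code_point)
        return chr(code_point)

    return _ESCAPE.sub(_replace, text)


def encode_non_ascii(text: str) -> str:
    """Escape every non-ASCII character so the result is pure ASCII."""
    return "".join(
        ch if ord(ch) < 0x80 else "\\u{%04X}" % ord(ch)
        for ch in text
    )


if __name__ == "__main__":
    escaped = encode_non_ascii("a\u0300 and \U0001F600")
    print(escaped)
    print(decode_escapes(escaped) == "a\u0300 and \U0001F600")
```

Because the closing brace marks the end of the escape, a character above U+FFFF needs neither a fixed digit count nor a surrogate pair, and a following hex digit in the text cannot be mistaken for part of the escape.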