RE: alpha, print, graph, blank, etc.
msg#00253, text.unicode.devel
Mark Davis wrote:

> The POSIX/C-style property names (punct, alpha, lower, upper,
> digit, xdigit, alnum, cntrl, graph, print, space, blank) are
> not well specified, and don't really map well to the broader
> types of characters available in Unicode/10646. For example,
> there is no provision for titlecase, [...]

My 0.2 euros: IMHO, title-case letters should be treated as *both* upper-case and lower-case. I.e., my suggestion is that:

- is[w]lower() returns TRUE for both lower-case and title-case letters;
- is[w]upper() returns TRUE for both upper-case and title-case letters;
- is[w]alpha() returns TRUE for any Unicode letter (general category L*).

For applications unaware of the existence of "title-case" letters, this preserves the basic semantics of is[w]alpha() (namely, "Is it a letter?"), and one of the most basic semantics of is[w]lower() and is[w]upper() (namely, "Can this character be converted to lower/upper-case?").

For applications aware of the existence of "title-case" letters, is[w]upper(), is[w]lower(), and is[w]alpha() can be used in combination to determine the exact "case type" of any letter:

    if (iswalpha(c))
    {
        if (iswupper(c) && iswlower(c))
        {
            printf("This is a title-case letter (Lt).\n");
        }
        else if (iswupper(c) && !iswlower(c))
        {
            printf("This is an upper-case letter (Lu).\n");
        }
        else if (!iswupper(c) && iswlower(c))
        {
            printf("This is a lower-case letter (Ll).\n");
        }
        else /* if (!iswupper(c) && !iswlower(c)) */
        {
            printf("This is a letter with no case distinctions (Lo or Lm).\n");
        }
    }
    else
    {
        printf("This is not a letter.\n");
    }

Unfortunately, there is no corresponding trick to obtain a "to-title-case" functionality, apart from a non-portable construct such as:

    c1 = towctrans(c2, wctrans("Title-case"));

Anyway, converting to title case is something less fundamental than upper/lower-casing, and it only makes sense at the string level.

_ Marco
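
The combination test above depends on the C library actually classifying title-case letters as both upper and lower, which is not something every locale guarantees. To try the idea independently of any locale, here is a minimal sketch that assumes a tiny hand-written category table. The names toy_category(), prop_iswupper(), prop_iswlower(), prop_iswalpha(), and classify() are hypothetical, invented for this illustration; they are not part of any C library, and the table covers only a handful of code points.

```c
#include <assert.h>
#include <wchar.h>

/* Simplified Unicode general categories; only the letter ones matter here. */
enum gc { GC_LU, GC_LL, GC_LT, GC_LM, GC_LO, GC_OTHER };

/* Hypothetical lookup covering a few code points only (assumption:
   a real implementation would consult the full Unicode database). */
static enum gc toy_category(wint_t c)
{
    if (c >= 0x41 && c <= 0x5A) return GC_LU;   /* A..Z */
    if (c >= 0x61 && c <= 0x7A) return GC_LL;   /* a..z */
    switch (c) {
    case 0x01C4: return GC_LU;  /* DZ with caron, upper-case */
    case 0x01C5: return GC_LT;  /* Dz with caron, title-case */
    case 0x01C6: return GC_LL;  /* dz with caron, lower-case */
    case 0x02B0: return GC_LM;  /* modifier letter small h   */
    case 0x05D0: return GC_LO;  /* Hebrew letter alef        */
    default:     return GC_OTHER;
    }
}

/* The proposed semantics: title-case counts as both upper and lower. */
static int prop_iswupper(wint_t c)
{
    enum gc g = toy_category(c);
    return g == GC_LU || g == GC_LT;
}

static int prop_iswlower(wint_t c)
{
    enum gc g = toy_category(c);
    return g == GC_LL || g == GC_LT;
}

static int prop_iswalpha(wint_t c)
{
    return toy_category(c) != GC_OTHER;   /* any L* category */
}

enum case_type { CASE_TITLE, CASE_UPPER, CASE_LOWER, CASE_NONE, NOT_LETTER };

/* Recovers the case type using only the three predicates,
   never the category table itself. */
static enum case_type classify(wint_t c)
{
    if (!prop_iswalpha(c)) return NOT_LETTER;
    if (prop_iswupper(c) && prop_iswlower(c)) return CASE_TITLE;
    if (prop_iswupper(c)) return CASE_UPPER;
    if (prop_iswlower(c)) return CASE_LOWER;
    return CASE_NONE;                     /* Lo or Lm */
}

/* Small self-check over the code points in the toy table. */
static void check_examples(void)
{
    assert(classify(0x01C4) == CASE_UPPER);
    assert(classify(0x01C5) == CASE_TITLE);
    assert(classify(0x01C6) == CASE_LOWER);
    assert(classify(0x05D0) == CASE_NONE);
    assert(classify(0x0020) == NOT_LETTER);
}
```

Calling check_examples() from a main() exercises one code point of each kind. The point of the sketch is that classify() never looks at the category directly: the three predicates alone carry enough information to tell Lt, Lu, Ll, and caseless letters apart.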