Re: multi-byte character sets in sed: msg#00006 (editors.sed.user)

> I usually use gVim that I can match character correctly with the encoding
> set to cp936. Though in gVim I can match both a double-byte Character and
> a single-byte ASCII letter by '.', I still want to know if it could be
> achieved with sed. Or does sed plan to put the encoding support into
> future versions that we can pass the encoding to sed either by environment
> variable or by commandline option?

Yes, starting with version 4.1, sed has full support for MBCS (multi-byte character sets). The environment variables are LC_CTYPE and LC_COLLATE.

However, if you use them you may encounter weird behavior when a script expects an environment with the default values of these variables, i.e. LC_CTYPE=C LC_COLLATE=C. For example, some locales demand that ranges (e.g. [A-Z]) match case-insensitively, and this is by now the most reported sed non-bug (this behavior is mandated by POSIX).

Paolo
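To make the locale dependence concrete, here is a rough sketch of how one might compare the C locale with a multi-byte one. It is only an illustration under a few assumptions: the sample input is the UTF-8 encoding of one CJK character rather than cp936, the locale name en_US.UTF-8 is a guess that may not exist on your system (check `locale -a`), and LC_ALL is used because it overrides LC_CTYPE and LC_COLLATE if the caller already has it set.

```shell
#!/bin/sh
# Pin the locale to C so that bracket ranges follow plain byte order.
# LC_ALL takes precedence over LC_CTYPE and LC_COLLATE, which makes the
# result independent of whatever the calling environment has set.
c_result=$(printf 'aBc\n' | LC_ALL=C sed 's/[A-Z]/_/g')
echo "C locale, [A-Z]:   $c_result"

# In the C locale '.' matches a single byte, so a multi-byte character
# is replaced once per byte. The octal escapes are the three bytes of
# the UTF-8 encoding of U+4E2D.
bytes_result=$(printf '\344\270\255\n' | LC_ALL=C sed 's/./X/g')
echo "C locale, '.':     $bytes_result"

# With a multi-byte locale, '.' is expected to match the whole character.
# ASSUMPTION: en_US.UTF-8 is installed; if it is not, sed may silently
# fall back to C-locale behavior and give the same result as above.
mb_result=$(printf '\344\270\255\n' | LC_ALL=en_US.UTF-8 sed 's/./X/g')
echo "UTF-8 locale, '.': $mb_result"
```

For a script that relies on byte-order ranges such as [A-Z], setting LC_ALL=C at the top of the script (or on the sed invocation itself) is the usual way to avoid the surprise described above. For cp936 input you would need a locale using that encoding, whose name varies between systems.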