|
(unknown): msg#00005editors.sed.user
Hi, I wonder whether sed supports, or will support, an encoding setting. Chinese text mostly uses the so-called GBK encoding, which is double-byte. The problem is that ASCII letters, which are encoded as single bytes, can be mixed in among the double-byte Chinese characters, which means I cannot simply match a Chinese character with the regexp '..', because '..' might match either a Chinese character or the combination of an ASCII letter and half of a Chinese character. It seems that NLS only translates sed's messages from English to Chinese, which does not help with this problem. I usually use gVim, where I can match characters correctly with the encoding set to cp936. Although in gVim I can match both a double-byte character and a single-byte ASCII letter with '.', I still want to know whether this can be achieved with sed. Or are there plans to add encoding support to future versions of sed, so that we can pass the encoding to sed either through an environment variable or a command-line option? -- Regards, hq00e |
|
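The mismatch described in the message can be reproduced outside sed. The following is only a minimal sketch in Python, not anything sed itself does: it assumes the sample string "a中文" (an illustrative choice, not from the original post) and uses Python's built-in `gbk` codec to contrast byte-oriented matching of '..' with character-oriented matching of '.' after decoding.

```python
import re

# Illustrative sample (an assumption, not from the original message):
# one ASCII letter followed by two Chinese characters.
text = "a中文"

# In GBK, the ASCII letter takes 1 byte and each Chinese character takes 2 bytes.
raw = text.encode("gbk")

# Byte-oriented matching: '..' pairs up bytes with no regard for character
# boundaries, so the first match is 'a' plus the first half of the next character.
byte_pairs = re.findall(rb"..", raw)

# Character-oriented matching: decode first, then '.' matches one whole
# character, whether it was stored as one byte or two.
chars = re.findall(r".", raw.decode("gbk"))

print(byte_pairs)
print(chars)
```

The first result splits a Chinese character across two matches and leaves a trailing half-character unmatched, while the second yields the three characters intact. This is the behaviour the poster gets from gVim with the encoding set to cp936 and is asking for in sed.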