GnuWin32 sed can handle double-byte character: msg#00052

editors.sed.user

Subject: GnuWin32 sed can handle double-byte character


On Thu, 08 Dec 2005 01:19:34 +0800, Paolo Bonzini wrote:

> Yes, starting from version 4.1 on sed has full support for MBCS. The

Yes, you are right. The GnuWin32 build of sed works well when processing
Chinese characters. But not every GNU sed build with a version number of
4.1 or higher supports MBCS well on the Windows platform. I think it has
to do with how sed was compiled: some i18n support option was disabled
at build time.


E:\bin>echo a阿 | od -t x1
0000000 61 b0 a2 20 0d 0a <--'\xb0 \xa2' is a Chinese Char
0000006

The following script swaps the first two characters.

With GNU sed v4.1.4 from the GnuWin32 project on sourceforge.net:
E:\bin>echo a阿 | GWsed "s/\(.\)\(.\)/\2\1/" |od -t x1
0000000 b0 a2 61 20 0d 0a <-- desired output
0000006

With the GNU sed v4.1.1 DJGPP port:
E:\bin>echo a阿 | gsed "s/\(.\)\(.\)/\2\1/" |od -t x1
0000000 b0 61 a2 20 0d 0a <-- undesired output
0000006
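
The difference between the two results can be reproduced outside sed. The
following is only a sketch in Python, not what either sed build does
internally: it assumes the console encoding is GBK (where 阿 is b0 a2), and
the helper names swap_bytes and swap_chars are made up for illustration.
Matching "." against raw bytes splits the double-byte character, while
matching against decoded text keeps it whole.

```python
import re

# "a阿 " followed by CRLF, as produced by the cmd.exe echo above.
# Assumption: the console code page is GBK, so 阿 encodes as b0 a2.
DATA = "a阿 \r\n".encode("gbk")


def swap_bytes(raw: bytes) -> bytes:
    """Swap the first two *bytes*: '.' matches a single byte here."""
    return re.sub(rb"(.)(.)", rb"\2\1", raw, count=1)


def swap_chars(raw: bytes, encoding: str = "gbk") -> bytes:
    """Swap the first two *characters*: decode first, so '.' matches 阿 whole."""
    text = raw.decode(encoding)
    return re.sub(r"(.)(.)", r"\2\1", text, count=1).encode(encoding)


if __name__ == "__main__":
    print(swap_bytes(DATA).hex(" "))  # b0 61 a2 20 0d 0a  (undesired)
    print(swap_chars(DATA).hex(" "))  # b0 a2 61 20 0d 0a  (desired)
```

The byte-wise result matches the DJGPP output and the character-wise result
matches the GnuWin32 output, which is consistent with the DJGPP build
treating the input as single-byte data.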



PS: I found this in the "TODO" file from the ssed 3.62 source:
'Make pcre/reg{perl,posix}.c handle [[=x=]] and [[.x.]] and MBCS'
Does that mean the current version of ssed does not support MBCS?

--
Regards,
hq00e


