Subject: bug#1654: 23.0.60; auto encoding detection (detect-coding-region) not working




Kenichi Handa <handa@xxxxxxxx> writes:

> In article <ukwsabwo8x.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>, poppyer
> <poppyer@xxxxxxxxx> writes:
>
>> But for big5, the list returned by
>> "(detect-coding-region (region-beginning) (region-end))"
>> does not contain big5. I do understand that gbk and big5 sequences might
>> not be easy to distinguish, but in this case both encodings are
>> compatible with the input text, so both should be in the returned
>> list. Am I right?
>
> You are right. But the current Emacs can't have both GBK
> and Big5 in the list of coding systems to try when detecting,
> because they are in the same coding-system category
> (i.e. charset-base). I know that this restriction is not
> good, and improving it is on my todo list, but I still haven't
> had time to work on it.
>

I just re-examined the code and found the bug.
It is in lisp/language/chinese.el, near line 125:
=================
(define-coding-system 'chinese-big5
  "BIG5 8-bit encoding for Chinese (MIME:Big5)"
  :coding-type 'charset
  :mnemonic ?B
  :charset-list '(ascii big5)
  :mime-charset 'big5)
=====================
should be:
=================
(define-coding-system 'chinese-big5
  "BIG5 8-bit encoding for Chinese (MIME:Big5)"
  :coding-type 'big5 ;; change charset to big5 here, poppyer
  :mnemonic ?B
  :charset-list '(ascii big5)
  :mime-charset 'big5)
=====================

Once Emacs is recompiled with this change, I get
(coding-system-category 'big5) => coding_category_big5

coding_category_big5 is already defined in coding.c, so now:
gbk belongs to coding_category_charset
big5 belongs to coding_category_big5
sjis belongs to coding_category_sjis

These are three different categories, so all three can appear in the
list returned by detect-coding-region.
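
For anyone who wants to check this on their own build, here is a minimal
sketch. The two helper names (my-coding-categories and
my-region-detects-p) are made up for illustration and are not part of
Emacs; they only wrap the existing functions coding-system-category,
coding-system-base and detect-coding-region. The category symbols printed
will depend on the build, so I give no expected output here.

```elisp
;; Hypothetical helpers for illustration only; not part of Emacs.

(defun my-coding-categories (systems)
  "Return an alist mapping each coding system in SYSTEMS to its category."
  (mapcar (lambda (cs)
            (cons cs (coding-system-category cs)))
          systems))

(defun my-region-detects-p (coding-system start end)
  "Return t if CODING-SYSTEM is among those detected between START and END.
Aliases and end-of-line variants are compared through their base system."
  (let ((base (coding-system-base coding-system)))
    (and (memq base
               (mapcar #'coding-system-base
                       (detect-coding-region start end)))
         t)))

;; Example use, evaluated with the Chinese text selected as the region:
;;   (my-coding-categories '(chinese-gbk chinese-big5 japanese-shift-jis))
;;   (my-region-detects-p 'chinese-big5 (region-beginning) (region-end))
```

If the patched definition is in effect, the first form should show
chinese-gbk and chinese-big5 in different categories, and the second
should then be able to return t for text that is valid in both encodings.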

Cheers,
poppyer





