Support Requests item #815184, was opened at 2003-09-30 15:22
Message generated for change (Comment added) made by ronboris
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=424136&aid=815184&group_id=39046
Category: tv_check
Group: None
Status: Open
Priority: 5
Submitted By: Ron Boris (ronboris)
Assigned to: Robert Eden (rmeden)
Summary: Shows with accented characters
Initial Comment:
Adding programs containing accented characters (e.g.,
Fútbol) to the shows.xml file seems to cause tv_check
to stop working. It gives the following error:
not well-formed at line 367, column 17, byte 14090
at /PerlApp/XML/Parser.pm line 168
Trying to open the shows.xml file in IE generates the
following error:
An invalid character was found in text content. Error
processing resource 'file:///C:/Program
Files/xmltv/shows.xml'. Line 367, Position 18
<shows title="F
The show was added using tv_check --configure.
Is there a workaround for this problem or is a
fix/enhancement necessary?
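
The parser error above can be reproduced outside tv_check. The following is a minimal sketch in Python, not tv_check's own Perl code; it assumes the file held "ú" as a single ISO-8859-1 byte and had no <?xml ?> line. The one-line sample document and the try_parse helper are made up for illustration. Python's ElementTree is built on Expat, the same parser library that Perl's XML::Parser wraps, so the failure should be of the same kind.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-line shows file; b"\xfa" is "ú" as a single ISO-8859-1 byte.
body = b'<shows title="F\xfatbol" />'


def try_parse(data):
    """Return the title attribute, or the parser's error message."""
    try:
        return ET.fromstring(data).get("title")
    except ET.ParseError as err:
        return f"error: {err}"


# No declaration: the parser assumes UTF-8 and rejects the lone 0xFA byte.
without_header = try_parse(body)

# With a declaration naming the encoding actually used, the same bytes parse.
with_header = try_parse(b'<?xml version="1.0" encoding="ISO-8859-1"?>\n' + body)

print(without_header)
print(with_header)
```

If the assumption holds, the first call reports a well-formedness error at the accented character and the second returns the title intact.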
Thanks for your support.
Ron Boris
----------------------------------------------------------------------
>Comment By: Ron Boris (ronboris)
Date: 2004-01-26 20:08
Message:
Logged In: YES
user_id=823152
The change looks good. Programs containing accented
characters can be added using tv_check --configure, and the
encoding header is added to the file. tv_check finds and lists
occurrences of these programs.
I consider this fixed.
Ron
----------------------------------------------------------------------
Comment By: Robert Eden (rmeden)
Date: 2004-01-26 03:55
Message:
Logged In: YES
user_id=270469
I just committed a change to tv_check that supports UTF-8
encoded characters and encoding header.
If tv_check writes the show file (--configure or --myreplay)
it will now create the header.
You still get the error you mention if there is an accented
character w/o the header... just add the header manually and
you're back in business.
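
For anyone who would rather script the manual step, here is a rough sketch in Python (not part of tv_check). The function name ensure_xml_declaration and the sample data are hypothetical, and the encoding passed in has to match how the file was really saved (UTF-8 or ISO-8859-1); declaring the wrong one will not fix anything.

```python
def ensure_xml_declaration(data: bytes, encoding: str) -> bytes:
    """Prepend an XML declaration unless the data already starts with one.

    `encoding` must name the encoding the bytes are really stored in.
    """
    if data.lstrip().startswith(b"<?xml"):
        return data  # already has a header (or another leading <?xml...?>)
    header = f'<?xml version="1.0" encoding="{encoding}"?>\n'.encode("ascii")
    return header + data


# Hypothetical file content saved as ISO-8859-1 without a header.
original = b'<shows title="F\xfatbol" />'
fixed = ensure_xml_declaration(original, "ISO-8859-1")
print(fixed)
```

Running the function a second time on its own output leaves the data unchanged, so it is safe to apply repeatedly.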
Please let me know how it works.
CVS has the update, and an EXE is here:
http://alpha-exe.xmltv.org
Robert
----------------------------------------------------------------------
Comment By: Robert Eden (rmeden)
Date: 2004-01-07 06:53
Message:
Logged In: YES
user_id=270469
btw... I was hoping this would fix itself when xmltv.exe
changed to perl 5.8.
It didn't. :(
I'll keep the ticket open and look into it when I get a chance.
Robert
----------------------------------------------------------------------
Comment By: Ron Boris (ronboris)
Date: 2003-10-26 23:04
Message:
Logged In: YES
user_id=823152
Thanks for your response. I think you are right about
character encoding being the problem. The show was added
using tv_check --configure. tv_check doesn't specify any
encoding when it creates/saves a shows.xml file (i.e., there's
no <?xml ?> specification at all).
If I copy the <?xml ?> specification (<?xml version="1.0"
encoding="ISO-8859-1"?>) and the "Fútbol" title (<shows
title="Fútbol" />) from a downloaded listings file to the shows
file using an editor, I can open the file without error in IE and
tv_check. However, tv_check does not find the show in the
listings file when it runs. If I save the shows file with
tv_check, it drops the <?xml ?> specification and changes
the title to "Fútbol", and tv_check does not find the show in
the listings file when it runs.
Whatever I do, it does not work. Apparently, tv_check does
not recognize the encoding used in the listings file for non-
standard characters.
----------------------------------------------------------------------
Comment By: Ed Avis (epaepa)
Date: 2003-10-26 17:48
Message:
Logged In: YES
user_id=10769
How did you add 'Fútbol' to your shows.xml file? Did you
just change the file in an editor?
I suspect character encoding is to blame. At the top of
shows.xml it will say what character set and encoding is
used, for example 'UTF-8' or 'ISO-8859-1'. If it's UTF-8
then your editor needs to understand that, and add the
accented U character as a multibyte sequence.
Alternatively, if you want to use an editor that doesn't
understand UTF-8, you could change the XML declaration at the
top of the file to specify ISO-8859-1. Make sure there aren't any
existing accented characters in the file when you do that.
Essentially, UTF-8 and ISO-8859-1 are the same for
unaccented Latin characters, but they store more exotic
things differently. A file encoded in UTF-8 cannot contain
ISO-8859-1 character sequences, and vice versa.
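
The difference is easy to see at the byte level. A small illustrative sketch in Python (the variable names are only for the example):

```python
title = "Fútbol"

latin1_bytes = title.encode("iso-8859-1")  # "ú" is one byte: 0xFA
utf8_bytes = title.encode("utf-8")         # "ú" is two bytes: 0xC3 0xBA

print(latin1_bytes)  # b'F\xfatbol'
print(utf8_bytes)    # b'F\xc3\xbatbol'

# Unaccented text is byte-for-byte identical in both encodings.
plain_same = "Football".encode("iso-8859-1") == "Football".encode("utf-8")

# UTF-8 bytes read as ISO-8859-1 do not fail; they silently turn into
# the wrong characters, so a title comparison would no longer match.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # FÃºtbol

# ISO-8859-1 bytes read as UTF-8 are rejected outright.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
```

The two failure modes differ: one direction produces an error, the other produces a wrong but readable title.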
----------------------------------------------------------------------