Best way to read UTF-8 data?: msg#00111 tv.xmltv.devel
I'm trying to read an xmltv file that is in utf-8 mode from a perl script however I'm having some weird problems. The xml file is like this:

---
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE tv SYSTEM "xmltv.dtd">
...
<programme start="200411150600 +0100" stop="200411150700 +0100" channel="animalplanet.dagenstv.se">
  <title lang="sv">Skogstigrar - berättelsen om Sita</title>
  <desc>Del 2. En fängslande dokumentär som följer förhållandet mellan den sextonåriga bengaliska tigrinnan Sita och den enda ungen av hankön i hennes kanske sista kull.</desc>
</programme>
---

My script goes like this:

---
use XMLTV;

my $data = XMLTV::parsefile($ARGV[0]);
my ($encoding, $credits, $ch, $progs) = @$data;
my $langs = [ 'en', 'sv'];

foreach (@$progs) {
    my ($title, $langt) = @{XMLTV::best_name($langs, $_->{'title'})};
    my ($desc, $langd) = @{XMLTV::best_name($langs, $_->{'desc'})};
}
---

The problem is that if I look at $title, instead of looking like this:

Skogstigrar - berättelsen om Sita

it looks like this:

Skogstigrar - berÃ¤ttelsen om Sita

same thing with the description, instead of being like this from the xml file:

Del 2. En fängslande dokumentär som följer förhållandet mellan den sextonåriga bengaliska tigrinnan Sita och den enda ungen av hankön i hennes kanske sista kull.

it shows up like this:

Del 2. En fÃ¤ngslande dokumentÃ¤r som fÃ¶ljer fÃ¶rhÃ¥llandet mellan den sextonÃ¥riga bengaliska tigrinnan Sita och den enda ungen av hankÃ¶n i hennes kanske sista kull.

Any ideas on the correct way to read UTF-8 data from a perl script?

Cheers
Chris
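
The symptom described above (å showing up as Ã¥) is the pattern one would expect when UTF-8 bytes are interpreted as ISO-8859-1 somewhere between the file and the display. The following is only a minimal sketch of that mechanism using the core Encode module. It does not use XMLTV, the variable names are illustrative, and it makes no claim about where the misinterpretation actually happens in the script above (it could just as well be the terminal or the output layer).

```perl
use strict;
use warnings;
use utf8;                          # string literals in this file are UTF-8
use Encode qw(encode decode);

# Sample text: the title from the listing in the post.
my $title = "Skogstigrar - berättelsen om Sita";

# The UTF-8 bytes as they would sit in the XML file ("ä" becomes 0xC3 0xA4).
my $bytes = encode('UTF-8', $title);

# Interpreting those bytes as ISO-8859-1 turns every byte into one character,
# so the two-byte sequence for "ä" becomes the two characters "Ã" and "¤".
my $garbled = decode('ISO-8859-1', $bytes);

# Interpreting the same bytes as UTF-8 recovers the original text.
my $decoded = decode('UTF-8', $bytes);

# Encode character strings as UTF-8 on output (assumes a UTF-8 terminal).
binmode(STDOUT, ':encoding(UTF-8)');
print "garbled: $garbled\n";
print "decoded: $decoded\n";
```

If the strings are already decoded character strings, as in `$decoded` here, the remaining thing to check would be that the output handle's encoding matches what the terminal or consumer expects.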