How can I become universal utf/unicode: msg#00028 (lang.perl.modules.lwp)
I don't know where else to post this question. I'm already using LWP::UserAgent and HTML::Parser, and I can fetch and parse documents without problems. However, I would like my code to handle any encoding. I'm using Perl 5.8.3 with the latest HTML::Parser as of today.

When fetching a document, sometimes you know its encoding and sometimes you have no idea. How do I convert the incoming Web page to UTF-8 regardless of its original encoding, and also encode non-ASCII characters as entities such as &Aacute; (for keyword matching)?

Maybe I'm stupid, but I've tried everything I can think of, as well as following some examples I've found, and no matter what I do it just doesn't work. Any help would be appreciated.

Thanks,
John
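To make the goal concrete, here is a minimal sketch of the conversion being asked about, not a definitive answer. It assumes the charset name has already been obtained somehow (for example from the Content-Type header or a meta tag) and falls back to ISO-8859-1 when the name is missing or unknown; that fallback is an assumption, not something the original post specifies. The helper name `normalize_text` is hypothetical. It is meant for text segments (such as those delivered by an HTML::Parser text handler), not for whole documents, because it also decodes entities like `&lt;`.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use HTML::Entities qw(decode_entities encode_entities);

# Hypothetical helper (not part of LWP or HTML::Parser).
# $bytes   : raw octets of a text segment
# $charset : declared charset name, or undef if unknown
# Returns (UTF-8 octets, ASCII string with non-ASCII chars as entities).
sub normalize_text {
    my ($bytes, $charset) = @_;

    # Assumed fallback when the charset is missing or not known to Encode.
    $charset = 'ISO-8859-1'
        unless defined $charset && Encode::find_encoding($charset);

    my $chars = decode($charset, $bytes);   # octets -> Perl characters
    $chars = decode_entities($chars);       # &eacute; or &#233; -> the character

    my $utf8 = encode('UTF-8', $chars);     # characters -> UTF-8 octets

    # Encode everything except newline and printable ASCII as entities,
    # so that an accented capital A becomes &Aacute;
    my $entities = encode_entities($chars, '^\n\x20-\x7e');

    return ($utf8, $entities);
}

# Example: a Latin-1 byte 0xC1 (capital A with acute) followed by "rbol".
my ($utf8, $entities) = normalize_text("\xC1rbol", 'ISO-8859-1');
print "$entities\n";
```

Working on decoded characters first and producing both forms from the same string keeps the UTF-8 output and the entity form consistent with each other. Detecting the encoding when nothing is declared is a separate problem that this sketch does not attempt to solve.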