
Re: LWP produces unreadable characters: msg#00058

Subject: Re: LWP produces unreadable characters
tom arnall wrote:
> i am trying to use LWP to fetch material from the following site:
> 
>       http://education.yahoo.com/reference
>                       
> using, to pick an example:
> 
>       http://education.yahoo.com/reference/dict_en_es/spanish/
>               curia;_ylt=A9FJq_FmyklEIWYAkAD2s8sF
>               
> the result is unreadable characters, tho' lynx does fine with it (tho', of 
> course, it doesn't return a true copy of the html), and the stuff looks fine 
> when i use the 'view source' function on the page displayed in my browser 
> window.
> 
> my code:
> 
>       $host = 'education.yahoo.com';
>       $base = 'reference/dict_en_es/spanish_index';
>       $file = 'B;_ylt=Ag1EmmqDVPhfK7pmll28XUX2s8sF';

After a few minor adjustments, it works for me.

use strict;
use LWP;
use URI;

my $host = 'http://education.yahoo.com';
my $path =
'/reference/dict_en_es/spanish_index/B;_ylt=Ag1EmmqDVPhfK7pmll28XUX2s8sF';
my $uri = URI->new_abs($path,$host);

# I like a URI object because I often run in a loop and change the
# query string, host, etc.

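# Request headers copied from a browser session.
my %header = (
    'Keep-Alive' => '300',
    'Connection' => 'keep-alive',
    # The agent string is one line; it is concatenated here only for width.
    'User-Agent' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) '
        . 'Gecko/20050925 Firefox/1.0.4 (Debian package 1.0.4-2sarge5)',
    'Pragma' => 'no-cache',
    'Cache-control' => 'no-cache',
    'Accept' => 'image/png,*/*;q=0.5',
    'Accept-Encoding' => 'gzip,deflate',
    'Accept-Charset' => 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
    'Accept-Language' => 'en-us,en;q=0.5',
    # Host must be a bare host name, not a URL with a scheme.
    'Host' => $uri->host,
    );

# The constructor takes options such as agent or timeout, not request
# headers, so the headers are passed to get() instead.
my $ua = LWP::UserAgent->new;

my $res1 = $ua->get($uri, %header);
die $res1->status_line unless $res1->is_success;

# decoded_content undoes the gzip/deflate encoding requested above;
# content would return the raw, still-compressed bytes.
my $page = $res1->decoded_content;
binmode STDOUT, ':encoding(UTF-8)';
print $page;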
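
One possible cause of the unreadable characters, though the thread does not confirm it, is a gzip-compressed body printed without decoding. The sketch below illustrates the difference between content and decoded_content without touching the network. The response is built by hand, and the HTML string and header values are made up for illustration, not taken from the Yahoo page.

```perl
use strict;
use warnings;
use HTTP::Response;
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical page body, standing in for what a server would send.
my $html = "<html><body>curia</body></html>";

# Compress it the way a server may when the client sends
# Accept-Encoding: gzip.
gzip(\$html => \my $gzipped) or die "gzip failed: $GzipError";

# Build the response by hand so the example needs no network access.
my $res = HTTP::Response->new(200, 'OK', [
    'Content-Type'     => 'text/html; charset=ISO-8859-1',
    'Content-Encoding' => 'gzip',
], $gzipped);

my $raw     = $res->content;          # compressed bytes, unreadable if printed
my $decoded = $res->decoded_content;  # gzip undone, charset decoded

print "raw body starts with the gzip magic bytes\n"
    if substr($raw, 0, 2) eq "\x1f\x8b";
print "decoded body matches the original HTML\n"
    if defined $decoded && $decoded eq $html;
```

If the script asks for gzip or deflate in Accept-Encoding, reading the body through decoded_content is the safer choice; leaving that header out is another way to get plain text back.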


-Tim


