
Patch for WWW::RobotRules.pm: msg#00030

lang.perl.modules.lwp

Subject: Patch for WWW::RobotRules.pm

I've got a spider that uses LWP::RobotUA (WWW::RobotRules), and a few
users of the spider have complained that the warning messages are
not obvious enough. I agree: when they are spidering multiple hosts,
the message doesn't tell them which robots.txt had the problem.

So maybe something like:

--- RobotRules.pm.old 2004-04-09 08:37:08.000000000 -0700
+++ RobotRules.pm 2004-09-16 09:46:03.000000000 -0700
@@ -70,7 +70,7 @@
         }
         elsif (/^\s*Disallow\s*:\s*(.*)/i) {
             unless (defined $ua) {
-                warn "RobotRules: Disallow without preceding User-agent\n";
+                warn "RobotRules: [$robot_txt_uri] Disallow without preceding User-agent\n";
                 $is_anon = 1; # assume that User-agent: * was intended
             }
             my $disallow = $1;
@@ -97,7 +97,7 @@
             }
         }
         else {
-            warn "RobotRules: Unexpected line: $_\n";
+            warn "RobotRules: [$robot_txt_uri] Unexpected line: $_\n";
         }
     }
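
For a spider that cannot carry a patched copy of the module, roughly the same effect might be had from the calling side. The sketch below is only an illustration: with_warning_label is a hypothetical helper, not part of LWP, and the warn inside the demo call stands in for whatever the parser would emit. It tags every warning raised during a call with a label such as the robots.txt URI.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): run $code and prefix every
# warning raised while it runs with "[$label] ".
sub with_warning_label {
    my ($label, $code) = @_;
    my $outer = $SIG{__WARN__};    # remember any handler already installed
    local $SIG{__WARN__} = sub {
        my $msg = "[$label] " . join('', @_);
        if (ref($outer) eq 'CODE') {
            $outer->($msg);        # hand the labelled message to the outer handler
        }
        else {
            print STDERR $msg;     # otherwise print it to STDERR ourselves
        }
    };
    return $code->();
}

# Demo: this warn stands in for a warning raised while parsing robots.txt.
with_warning_label('http://example.com/robots.txt', sub {
    warn "RobotRules: Unexpected line: foo\n";
});
```

In the spider, the call to wrap would presumably be the parse itself, e.g. with_warning_label($uri, sub { $rules->parse($uri, $txt) }), where $rules is the WWW::RobotRules object and $uri and $txt are the robots.txt URI and content.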





--
Bill Moseley
moseley@xxxxxxxx



