Include/exclude lists: msg#00285

nutch-user.lucene.apache.org

Subject: Include/exclude lists

Is there any way, other than the config files, to specify the URL filter
parameters? I have a few dozen sites to crawl, and for each site I
want to specify its own includes and excludes. I don't want to have
to go into the config file and change the urlfilter.regex.file
property each time. Can I specify that on the command line to
bin/nutch generate or something?
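
For reference, the setting in question usually lives in
conf/nutch-site.xml and looks roughly like the sketch below. The
value shown, regex-urlfilter.txt, is the stock default file name; a
per-site setup would presumably point it at a different file for
each crawl.

    <!-- Sketch only: names the file that the regex URL filter reads -->
    <property>
      <name>urlfilter.regex.file</name>
      <value>regex-urlfilter.txt</value>
    </property>

The file it names holds one rule per line, with + marking an include
and - an exclude, and the first matching rule wins. The host
example.com below is a placeholder, not one of the sites being
crawled:

    # Hypothetical per-site rules
    # skip image URLs
    -\.(gif|jpg|png)$
    # accept anything on the placeholder host
    +^http://www\.example\.com/
    # reject everything else
    -.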

--
http://www.linkedin.com/in/paultomblin
