apache - Nutch 2.1 URL injection takes forever
I'm trying to deploy Nutch 2.1 on Ubuntu 12.04 by following the tutorial. Everything goes fine until I try to inject the URLs into the database. When I type ($ bin/nutch inject urls) and press Enter, I get:
InjectorJob: starting
InjectorJob: urlDir: urls
and it stays there (for hours) until I decide to cancel the execution. The urls directory contains a file with the URLs. I added the proxy and port details in nutch-site.xml as suggested here, but that doesn't solve it. I also tried Apache Nutch 2.2.1, and the issue continues.
If you know how to fix this issue, please help me!
Thanks in advance.
Ubuntu defaults to the loopback IP address 127.0.1.1 for the machine's host name in the hosts file. HBase (according to this page) requires the loopback IP address to be 127.0.0.1.
By default, the Ubuntu /etc/hosts file contains (with mycomputername being your computer's name):
127.0.0.1 localhost
127.0.1.1 mycomputername
Use sudo gedit /etc/hosts to update the hosts file as follows:
127.0.0.1 localhost
127.0.0.1 mycomputername
Reboot Ubuntu. Nutch should no longer have trouble injecting URLs into HBase.
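If you would rather not edit the file by hand, the same change can be sketched as a small shell function. This is only an illustration: the function name fix_loopback and the .bak backup suffix are my own choices, not anything Nutch or HBase provides, and it assumes the 127.0.1.1 entry starts its line as in the default Ubuntu file.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): rewrite a leading
# 127.0.1.1 entry in a hosts file to 127.0.0.1.
# Usage: fix_loopback [hosts-file]   (defaults to /etc/hosts)
fix_loopback() {
    hosts_file="${1:-/etc/hosts}"

    # Keep a backup next to the original before changing anything.
    cp "$hosts_file" "$hosts_file.bak" || return 1

    # Read from the backup and overwrite the original. Only an address
    # at the start of a line, followed by whitespace, is replaced.
    sed 's/^127\.0\.1\.1\([[:space:]]\)/127.0.0.1\1/' "$hosts_file.bak" > "$hosts_file"
}
```

Try it on a copy of the file first (for example, fix_loopback /tmp/hosts-copy) and compare the result with the .bak file. Changing /etc/hosts itself requires root, and the reboot mentioned above still applies.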