Issues I Met When Running NutchIndexing and How I Fixed Them
1. Background

Recently I set up a Big Data cluster (using Bigtop 1.3.0) with three Arm servers and tried to run HiBench on it. My goal is to make all the benchmarks in HiBench 7.0 run and pass on Arm servers. Everything went well until I got to NutchIndexing.

NutchIndexing is a benchmark which "tests the indexing sub-system in Nutch, a popular open source (Apache project) search engine. The workload uses the automatically generated Web data whose hyperlinks and words both follow the Zipfian distribution with corresponding parameters."

This post lists all the issues I met when running NutchIndexing, and how I fixed each of them. For information on how to set up a cluster with Bigtop 1.3.0 and how to install and run HiBench 7.0, please see my other posts, listed here.

Overall, my test setup is like this, with 'Head Node' as the master and 'Node 2' and 'Node 3' as slaves. From here on, I will refer to them as node-001, node-002, and node-003.

Profile s