Cygwin is used to run Nutch on Windows. You can also run Nutch directly on Linux, in which case Cygwin is not needed.
1) Go to the Cygwin site and download setup.exe.
2) Run setup.exe to set up Cygwin. No additional package is required to run Nutch.
3) Download the Nutch package (please choose at least version 0.8).
4) Unzip the package, preferably to the Cygwin home folder for easy access.
5) Test that the installation works by typing the following in the nutch folder:
Verify that the following is shown:
6) Set CLASSPATH to the Lucene core JAR (the core version may vary):
7) Set JAVA_HOME to the folder where Java is installed.
Note: When setting CLASSPATH or JAVA_HOME, do not use a path that passes through a folder with a space in its name.
For example, naming the Nutch folder 'Nutch 0.9' instead of 'Nutch-0.9' will result in the CLASSPATH or JAVA_HOME not being recognized.
8) Type the following to verify that the paths are set correctly: './bin/nutch crawl'
The above output will appear if CLASSPATH is set correctly.
Nutch is now ready to crawl and index.
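As a rough sketch of steps 6 and 7, the snippet below sets both variables in a Cygwin bash session and warns if either path contains a space. Every path in it is a hypothetical example (the Nutch folder name, the Lucene JAR version, and the JDK location are assumptions), and path_has_no_spaces is a helper written for this illustration, not part of Nutch or Cygwin.

```shell
#!/bin/bash
# Sketch only: all paths are hypothetical; substitute your own locations.

# Succeeds (exit status 0) when the given path contains no space character.
path_has_no_spaces() {
    case "$1" in
        *" "*) return 1 ;;
        *)     return 0 ;;
    esac
}

# Assumed locations of the Nutch folder, the Lucene core JAR, and the JDK.
NUTCH_DIR="$HOME/nutch-0.9"
LUCENE_JAR="$NUTCH_DIR/lib/lucene-core-2.1.0.jar"   # core version may vary
JDK_DIR="/cygdrive/c/Java/jdk1.6.0"

# Warn about any path that would trip over the spaces problem noted above.
for p in "$LUCENE_JAR" "$JDK_DIR"; do
    if ! path_has_no_spaces "$p"; then
        echo "Warning: '$p' contains a space; rename the folder." >&2
    fi
done

export CLASSPATH="$LUCENE_JAR"
export JAVA_HOME="$JDK_DIR"

echo "CLASSPATH=$CLASSPATH"
echo "JAVA_HOME=$JAVA_HOME"
```

After running these lines in the same shell session (or adding them to ~/.bashrc), repeat the check from step 8 in the Nutch folder.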
For further information on how to use Nutch, please follow the tutorials on the Nutch website and the java.net introduction to Nutch. The URLs are given in the Introduction post.