Set up and configure a single-node Hadoop cluster environment; include a one-page document on your understanding of the process and purpose, along with supporting screenshots. Start the cluster and all associated services as needed to prepare the cluster for use in subsequent tasks.
Use the following steps to learn what you need to accomplish this task.
Step 1 – Reading
- Read Chapter 1 in “Pro Hadoop”
- Read Chapter 2 in “Pro Hadoop”
- Read Chapter 3 in “Hadoop for Dummies”
Step 2 – Task 1 – Environment Creation in Skytap Using VirtualBox
- Install Oracle VirtualBox in the virtual environment. If needed, you can download Oracle VirtualBox from the following hyperlink (https://www.virtualbox.org/wiki/Downloads).
- Unzip the Cloudera CDH archive provided on the desktop. If you do not have the needed files, you can download the Cloudera CDH from (http://www.cloudera.com/downloads/quickstart_vms/5-8.html). Ensure you choose the correct version for your virtualization software (i.e., the VirtualBox format for use with Oracle VirtualBox).
- Import the Cloudera CDH image (*.vmdk) by clicking File -> Import Appliance. Browse to the location of the VMDK on the Desktop, click Open, and then click Next. Set the CPU count to the maximum number available based on machine capacity. Set the RAM to a minimum of 8 * 1,024 = 8,192 MB (more if available). Rename the virtual machine if desired. Then click Import and let the process complete.
- After the import completes, select the VM and click Start. Let the virtual machine boot up.
- Once the virtual machine has finished booting, you will see the Cloudera desktop. Open a terminal and type the following command:
$ sudo /home/cloudera/cloudera-manager --express
Let the process complete; it will display a URL and the accompanying username (cloudera) and password (cloudera).
- Right-click the URL shown in the terminal and click Open. Enter the username and password to log in to Cloudera Manager and view the dashboard. All services will display as stopped.
- Next to the cluster name, “Cloudera Quickstart”, click the drop-down menu and select Start to start the entire cluster and its associated services.
- Once the tasks have completed, close the progress window to return to the dashboard, which shows each service's status in red, green, or yellow. As long as a service displays one of these colors, it is running and is fine.
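The import, sizing, and start steps above can also be scripted with VirtualBox's command-line tool, which may help if the environment has to be rebuilt. The sketch below only builds the `VBoxManage` command lines and prints them; it does not run them. It assumes `VBoxManage` is on the PATH, and the appliance file name and VM name are hypothetical placeholders. Note that `VBoxManage import` expects an `.ovf` or `.ova` appliance file, which is typically unpacked alongside the `.vmdk` disk.

```python
import os
import shlex


def vm_setup_commands(appliance_path, vm_name, ram_gb=8, cpus=None):
    """Return VBoxManage commands (as argument lists) to import, size, and start a VM."""
    if cpus is None:
        # "Maximum number available": fall back to the host's CPU count.
        cpus = os.cpu_count() or 1
    memory_mb = ram_gb * 1024  # 8 GB -> 8,192 MB, as in the step above
    return [
        # Import the appliance and name the resulting VM.
        ["VBoxManage", "import", appliance_path, "--vsys", "0", "--vmname", vm_name],
        # Apply the CPU and RAM settings.
        ["VBoxManage", "modifyvm", vm_name, "--cpus", str(cpus), "--memory", str(memory_mb)],
        # Boot the VM.
        ["VBoxManage", "startvm", vm_name],
    ]


if __name__ == "__main__":
    # Hypothetical file and VM names; replace with the actual ones.
    for command in vm_setup_commands("cloudera-quickstart.ovf", "cloudera-cdh"):
        print(shlex.join(command))
```

Printing the commands first makes it possible to review them before pasting them into a terminal on the machine that hosts VirtualBox.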
Step 3 – Task 1 – Report
Write a report (4-6 pages) that includes:
- A cover page and table of contents following APA standards,
- A short research report on Big Data and Big Data platforms,
- Set up and configure a single-node Hadoop cluster environment; include a document on your understanding of the process and purpose, along with supporting screenshots.
- Start the cluster and all associated services as needed to prepare the cluster for use in subsequent tasks; include a document on your understanding of the process and purpose, along with supporting screenshots.
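For the cluster start-up item, the service states shown on the dashboard could also be captured as text to accompany the screenshots. The following is a minimal sketch that assumes the Cloudera Manager REST API is reachable from where the script runs. The API version `v10`, the base URL, and the cluster name passed in are assumptions to adjust to what the terminal and dashboard actually show, and the helper names `summarize_services` and `fetch_services` are hypothetical. `summarize_services` works on already-parsed JSON, so it can be tried without a running cluster.

```python
import base64
import json
import urllib.parse
import urllib.request


def summarize_services(payload):
    """Map each service name to (serviceState, healthSummary) from a services listing."""
    return {
        item["name"]: (
            item.get("serviceState", "UNKNOWN"),
            item.get("healthSummary", "UNKNOWN"),
        )
        for item in payload.get("items", [])
    }


def fetch_services(base_url, cluster, user="cloudera", password="cloudera"):
    """Fetch the services listing for one cluster.

    The URL layout and API version are assumptions; check them against
    the Cloudera Manager instance in use.
    """
    url = f"{base_url}/api/v10/clusters/{urllib.parse.quote(cluster)}/services"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

As a usage example, the result of `fetch_services("http://localhost:7180", "Cloudera Quickstart")` (URL and port assumed) can be passed to `summarize_services`, and the returned dictionary printed one service per line for inclusion in the report.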