CSE 548 Advanced Computer Network Security
Email Trust in MobiCloud using Hadoop Framework
Updates
Sayan Cole
Jaya Chakladar
Group No: 1
Overview
• Installation of Hadoop
• Understanding the existing email trust system and its suitability as a MapReduce application
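The email trust search maps naturally onto MapReduce. As a rough local sketch of that pattern (the pipeline and its sample input below are illustrative stand-ins, not the project's actual mapper and reducer, which would read email metadata from HDFS via Hadoop Streaming):

```shell
# Minimal local simulation of the MapReduce pattern:
# map (tokenize) -> shuffle (sort) -> reduce (count per key).
# The input line is made-up sample data, not real email metadata.
printf 'trusted sender\ntrusted domain\nunknown sender\n' |
  tr ' ' '\n' |   # map: emit one key per input token
  sort |          # shuffle: group identical keys together
  uniq -c         # reduce: count occurrences of each key
```

In Hadoop Streaming the same roles are played by a mapper script emitting key/value lines on stdout and a reducer script consuming the sorted stream on stdin.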
Project Tasks (updated)

Task | Responsible | Status
Learn MapReduce and Hadoop | Jaya & Sayan | 100 %
Install and configure Hadoop in MobiCloud | Jaya & Sayan | 60 %
Develop UI web application | Jaya | 25 %
Search mapper algorithm | Sayan | 25 %
Search reduction algorithm | Jaya | 25 %
HDFS data store creation and updates | Jaya & Sayan | –
Testing and problem resolution | Jaya & Sayan | Not started
Delivery and demo | Jaya & Sayan | Not started
Software and Hardware Requirements
• Hadoop
• Database software, e.g. MySQL, or the Hadoop Distributed File System (HDFS)
• 3 or 4 Android phones mapped to virtual machines on two different Linux boxes
Hadoop Single Cluster Installation Prerequisites
• Java 6
– Add the Canonical partner repository to the apt sources
– Update the source list
– Install the JDK
– Select Sun's Java as the default on the machine
• Add a dedicated Hadoop system user
• Configure SSH
– Configure SSH access for the Hadoop system user
– Generate an SSH key for the Hadoop user
– Enable SSH access to the local machine with the new key
• Disable IPv6
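The prerequisite steps above can be sketched as a command sequence, assuming Ubuntu with apt (package names, the `hduser` account name, and paths follow the common single-node setup guides and may differ on other systems):

```shell
# Java 6: Canonical partner repository, then install and select Sun's JDK
sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"
sudo apt-get update
sudo apt-get install sun-java6-jdk
sudo update-java-alternatives -s java-6-sun

# Dedicated Hadoop system user and group (names are illustrative)
sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser

# SSH key for the Hadoop user, authorized for passwordless local login
su - hduser -c 'ssh-keygen -t rsa -P ""'
su - hduser -c 'cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys'

# Disable IPv6 system-wide (takes effect after reboot or sysctl -p)
echo 'net.ipv6.conf.all.disable_ipv6 = 1' | sudo tee -a /etc/sysctl.conf
```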
Hadoop Single Cluster Installation
• Download Hadoop from Apache mirror sites and extract
• Set JAVA_HOME in /conf/hadoop-env.sh
• Configure core-site.xml
– Set hadoop.tmp.dir to a local directory
– Set fs.default.name to the HDFS URI
• Configure mapred-site.xml to set the host and port of the MapReduce job tracker.
• Configure hdfs-site.xml to specify the number of replications for each file in the system.
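For illustration, the three configuration files for a single node might look like the following (the temp directory and port numbers are placeholders, not values fixed by the project):

```xml
<!-- core-site.xml: local temp directory and default file system -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>

<!-- mapred-site.xml: host and port of the MapReduce job tracker -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
  </property>
</configuration>

<!-- hdfs-site.xml: one replica per file on a single-node cluster -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```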
Hadoop Single Cluster Installation
• Format the Hadoop HDFS name node – make sure the data is backed up first, since formatting erases it
• Start the single-node cluster; this starts the name node, data node, job tracker, and task tracker.
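Assuming Hadoop is installed under /usr/local/hadoop (an illustrative path), these two steps come down to:

```shell
# WARNING: formatting erases any existing data in HDFS - back it up first
/usr/local/hadoop/bin/hadoop namenode -format

# Start the single-node cluster: name node, data node,
# job tracker, and task tracker
/usr/local/hadoop/bin/start-all.sh

# jps should now list the four Hadoop daemon processes
jps
```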
Hadoop Multiple Cluster Installation
• Set up two single-node clusters as described above
• Designate one machine as the master and the other as the slave
• Shut down the clusters on both machines
• Update /etc/hosts on both machines with the appropriate names (master and slave) and addresses
• Configure SSH between master and slave
– The Hadoop user must be able to connect to both master and slave
– The connection must be passwordless
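A sketch of the passwordless SSH setup (the hostnames and the `hduser` account are assumptions carried over from the single-node setup):

```shell
# On the master, as the Hadoop user:
# authorize the master's public key on the slave
ssh-copy-id -i ~/.ssh/id_rsa.pub hduser@slave

# Verify that both hosts accept the key without a password prompt
ssh hduser@master hostname
ssh hduser@slave hostname
```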
Hadoop Multiple Cluster Installation
• The master node runs the master daemons: the name node for HDFS and the job tracker
• Both nodes run the slave daemons: a data node for HDFS and a task tracker
Hadoop Multiple Cluster Installation
• Master vs. slave configuration
– On the master, /conf/masters lists the master
– On the slaves, /conf/slaves lists two entries: master and slave
• Update core-site.xml on all machines to set fs.default.name to hdfs://master:<Port number>
• Update mapred-site.xml on all machines to set mapred.job.tracker to master:<Port number>
• Change the dfs.replication variable in hdfs-site.xml to the number of nodes available, 4 in our case.
• Format the name node
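The per-node configuration changes above might look like the following fragments, applied on every machine (the port numbers are illustrative placeholders; only the property values differ from the single-node setup):

```xml
<!-- core-site.xml: point every node at the master's name node -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54310</value>
</property>

<!-- mapred-site.xml: point every node at the master's job tracker -->
<property>
  <name>mapred.job.tracker</name>
  <value>master:54311</value>
</property>

<!-- hdfs-site.xml: one replica per available node -->
<property>
  <name>dfs.replication</name>
  <value>4</value>
</property>
```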
Hadoop Multiple Cluster Installation
• Start up the multi-node cluster
– Start the HDFS daemons: the name node on the master and the data nodes on the slaves
– Start the MapReduce daemons: the job tracker on the master and the task trackers on the slaves
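Both start-up steps are issued on the master only; Hadoop starts the slave daemons over SSH using the conf/slaves list (the install path below is illustrative):

```shell
# Start HDFS: name node on the master, data nodes on the slaves
/usr/local/hadoop/bin/start-dfs.sh

# Start MapReduce: job tracker on the master, task trackers on the slaves
/usr/local/hadoop/bin/start-mapred.sh
```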
Challenges faced so far
• Multi node setup errors
Project Time Line