Hadoop infrastructure for education

Post on 02-Jul-2015


HADOOP INFRASTRUCTURE FOR EDUCATION

Darko Marjanović, darko@elab.rs

Miloš Milovanović, milovanovicm@elab.rs

Božidar Radenković, boza@elab.rs

University of Belgrade

Faculty of Organizational Sciences

Laboratory for E-business

Laboratory for E-business

• Exists within the Faculty of Organizational Sciences, University of Belgrade

• Has organized e-learning courses since 2001, using the Moodle LMS and a blended learning approach

• More than 1000 students take our courses each year

• Research areas:

• E-business

• Internet and mobile technologies

• Big Data

• Cloud Computing

• E-education

• Adaptive e-services

• Internet of things

• Social media

Overview

• Introduction

• Hadoop model for education

• Implementation

• Cluster organization

• Conclusion

Introduction

• Education institutions need to have access to relevant information in order to offer high-quality education to students.

• Main problem: information arrives at organizations

• from a variety of sources

• at rapidly increasing speed

• in a variety of types.

• Hadoop is a possible solution to this problem

Hadoop

• Apache Hadoop is an open-source software framework for storage and large-scale processing of data-sets on clusters of commodity hardware.

• All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common and thus should be automatically handled in software by the framework.
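The processing side of Hadoop follows the MapReduce model: a map step emits key/value pairs, the framework groups them by key, and a reduce step aggregates each group. The following is a minimal local sketch of that flow in plain Python, not Hadoop API code. The log lines, the `user,action` format, and all function names are assumptions made up for illustration; they are not taken from the slides.

```python
from collections import defaultdict

# Hypothetical Moodle-style log lines in "user,action" form (invented for illustration).
LOG_LINES = [
    "student1,course_view",
    "student2,quiz_attempt",
    "student1,quiz_attempt",
    "student3,course_view",
    "student2,course_view",
]


def map_phase(line):
    """Emit one (action, 1) pair per log line, as a mapper would."""
    _user, action = line.split(",")
    yield action, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts for a single key, as a reducer would."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence on a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


ACTION_COUNTS = run_job(LOG_LINES)
```

On a real cluster the map and reduce steps run in parallel on the nodes that hold the data, and a failed task is rescheduled by the framework; the sketch only shows the data flow.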

Big Data

• Big data is a blanket term for any collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.

Hadoop model for education

Guidelines used for deploying the Hadoop model:

• Efficient data import

• Reliable manipulation

• Flexible output

Model for managing Big Data in educational institutions

Implementation

• Three-node cluster

• Integration with Moodle LMS

• Distributed storage

• While running, a Hadoop cluster consumes a significant amount of resources, so monitoring and controlling them is essential.

Implemented Hadoop e-learning infrastructure

Cluster organization

• The Master node plays the central role and is responsible for Hadoop’s performance.

• To make better use of hard disk space, the implementation described here also runs a Data Node on the Master node.

• To prevent data loss between nodes, the network infrastructure is constantly monitored.

• Data replication serves as the mechanism for preserving data within the cluster.

Hadoop cluster organization in Laboratory for E-business
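To illustrate why data replication preserves data in a three-node cluster, here is a small sketch that places block replicas on nodes and checks which blocks survive a single node failure. It rests on several assumptions: the slides do not state the replication factor (HDFS defaults to 3 via `dfs.replication`), the node names are hypothetical, and the round-robin placement is a simplification, not the placement policy HDFS actually uses.

```python
def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes, round-robin (simplified)."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    placement = {}
    for block in range(num_blocks):
        start = block % len(nodes)
        placement[block] = [
            nodes[(start + i) % len(nodes)] for i in range(replication)
        ]
    return placement


def surviving_blocks(placement, failed_node):
    """Return the blocks that keep at least one replica after one node fails."""
    return [
        block
        for block, holders in placement.items()
        if any(node != failed_node for node in holders)
    ]


# Hypothetical node names; the master also stores data, as described in the slides.
NODES = ["master", "slave1", "slave2"]

# Assumed replication factor of 2, chosen only for illustration.
PLACEMENT = place_replicas(num_blocks=6, nodes=NODES, replication=2)
```

With two or more replicas per block, any single node can fail without losing a block. With a replication factor of 1, every block stored on the failed node is lost, which is why running a Data Node on the Master node trades extra disk space for extra exposure if the factor is set too low.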

Conclusion

• A scalable Hadoop-based platform that brings Big Data to the e-learning environment is presented.

• The main contribution of the paper is an environment for manipulating data generated from a variety of sources in educational activities.

• The primary objective is to improve the e-learning process.

• Future research is directed toward:

• optimizing integration with e-learning services

• integration with a cloud platform
