
OPENSTACK LOG FILE AUTOMATION DUPLICATION

MUHAMMAD NADZMI B MOHD ZAMERI

BACHELOR OF COMPUTER SCIENCE (COMPUTER NETWORK SECURITY) WITH HONOURS

BTBL 16043903


TABLE OF CONTENTS

ABSTRACT

Chapter 1
1.1 Introduction
1.2 Background
1.3 Problem Statement
1.4 Objective
1.5 Project Method
1.6 Limitation of Work

Chapter 2
2.1 Introduction
2.2 Cloud Computing
2.3 Virtual Machine
2.4 Openstack
2.5 Log File
2.7 Summary of Openstack and Log File System Research Paper

Chapter 3
3.1 Introduction
3.2 Flowchart
3.3 Openstack Installation and Configuration
3.3.1 Installing Centos 7 in a Virtual Machine
3.3.2 Installing Openstack
3.3.3 Configuring Openstack System
3.3.4 Openstack System Flow Design

REFERENCE


ABSTRACT

OpenStack Log File Duplication aims to reduce and simplify the log file system in the OpenStack dashboard while providing an option to back up the log file of each OpenStack node to Swift. In this project we will focus on creating backups to make OpenStack safer to use, and on centralising the OpenStack log file system into one large log file that is easy for an average user to read.


CHAPTER 1

1.1 Introduction

In this chapter we will discuss the background of OpenStack, the problem statement, the objectives, the scope and the limitation of work. In the Background section we will cover the history of the OpenStack software project, how it came about and what kind of services it provides. In the Problem Statement section, we will discuss the limitations of OpenStack and how they inconvenience the user. This leads to the Objective section, where we state the criteria that we want to achieve during the development of this project. After that we will discuss the project scope: who will use the project, where, and for what use case. Lastly, we will discuss the limitations of developing the system.


1.2 Background

The internet has become an essential aspect of everyday life in this day and age, where everyone and everything is connected to it. As such, the growth of cloud computing services must not be taken lightly, as their importance grows day by day for both personal and public use in completing and storing everyday tasks. One such service is an open source Infrastructure-as-a-Service (IaaS) platform named OpenStack. OpenStack started in 2010 as a joint project between the National Aeronautics and Space Administration (NASA) and Rackspace Hosting, intended to help organizations by offering cloud computing services that can run on regular hardware. It has been managed by the OpenStack Foundation since 2012.

As OpenStack is a cloud computing service, it is important to keep track of your data. This shows the importance of log files in the system, as they record the data flow and the authentication events that have happened in the system. With that comes a great reliance on log files in security management and data management, as we search the log files whenever suspicious activity is suspected on our cloud. However, there is no alternative way of handling the system if a specific log file is corrupted or lost. Other than that, it is hard for a normal user to understand a log file for the first time. The log file automation duplication system addresses both points: in theory it makes it harder to lose a log file, because the project keeps a duplicate, and it makes the log file easier to read, because the log file data can be sorted and visualized.


1.3 Problem statement

Nowadays log files are an essential part of a computer network environment. This is because a log file lets us view in depth the changes that have been made to our system, which helps us in securing and debugging it. As such, many problems occur when a log file is lost or corrupted, because there is no easy way to retrieve the lost log file and the data that was stored in it is lost as well.

Secondly, log file data is too complicated, as a long history of events is compiled and saved in a single log file. This makes it harder for an average user to read the history of the system. Lastly, log files are separated and not centralized for every node in a system. Because the logs are split into smaller parts, we may have to check multiple log files to find one problem: for example, to understand a problem in log file A we must search log file B, which in turn points to an error in log file C. As such, researching a log file becomes time consuming and tedious.

1.4 Objective

1. To develop a log file duplication system so that when a log file is corrupted or lost there is a way to obtain the previous and current backups of the log file.

2. To design a simplified OpenStack dashboard that can summarise the log file data so that an average user can read it easily.

3. To test the usability of the system in increasing productivity by centralizing the log files obtained from the services.


1.5 Project Method

The method of the project is divided into three parts: the installation, the modification of the OpenStack system, and the testing and implementation of the system. The first part is installing the system by setting up and configuring the CentOS 7 operating system so that OpenStack can use the computer as a server that hosts the online cloud storage. The second part is to modify the existing modules of the OpenStack system so that they receive the data and log files, make an exact duplicate of them and store it in the dashboard.

The last part is the testing and implementation of the system. We will set it up in a real use case environment, where we will monitor the usability of the system and its performance in helping users to recover log files that have been lost through corruption or other means.


1.6 Limitation of Work

The project has several limitations that cannot be avoided:

I. Computer hardware performance

The hardware of a computer determines how fast and how efficiently an OpenStack system will work. An older computer runs more slowly than a new one and therefore gives the system less speed to process, store and record all of the data running in it. It should also be noted that a desktop CPU generally cannot run as fast as a server CPU, which makes it slower at encrypting and decrypting data.

II. Internet connection speed and bandwidth

An internet connection is a crucial part of the OpenStack system: as cloud computing software, it relies on the network for everything that it does. Because of that, the speed of the connection plays a crucial part in transferring and updating the data of the OpenStack system. With geographical differences in internet speed, we may find it difficult to run the system if the connection is too slow when we access it from another area.


CHAPTER 2

LITERATURE REVIEW

2.1 Introduction

In this chapter we will explain some of the components and terms that were used in the previous chapter and some that will be used in later chapters. These terms are cloud computing, virtual machine, log file and OpenStack. They are explained in more depth to deepen our understanding of how, what and why we are using them in the current project.


2.2 Cloud Computing

Cloud computing is a widely used term that is commonly associated with the growth of the internet over the past few decades. This is because cloud computing is usually delivered in a manner that is accessible anywhere and at any time through the internet. It is a high-level service, located on the internet or "cloud", that can be provisioned with minimal management effort, because cloud computing is a large pool of configurable computer system resources that is shared to achieve economies of scale. This helps decrease the time taken for many organisations to set up their businesses and also reduces the maintenance cost of the system, as only one big core system has to be maintained rather than multiple smaller systems.

2.3 Virtual Machine

A virtual machine (VM) is emulator software that simulates and virtualises a separate computer inside a host environment and is able to perform most computing tasks as if it were a separate computer. The VM is also known as the guest, while the main computer is known as the host, as it hosts the VM in its computing environment. VMs are capable of running most programs and applications that a normal computer can. Sometimes a task is performed in a different way, but as the outcome is mostly the same this is overlooked by most people. VMs are divided into two categories because of their many uses. These uses depend on how closely the VM corresponds to a real computer, and a VM cannot perform tasks that surpass what the host computer can do.

The first category is the system virtual machine, which is a substitute for a normal computer and is capable of simulating a whole new environment within a VM. This allows you to install multiple different logical environments that can be used separately on a single physical computer. We will be using this kind of VM in our current project. The second category is the process virtual machine, which is constructed to perform a computing task in a platform-independent environment.

2.4 Openstack

OpenStack is cloud computing software that is mostly deployed as infrastructure as a service (IaaS) and is used to build and manage public or private cloud platforms. It is a free, open source software platform that is backed by a substantial number of companies and countless community members working to improve the platform. As OpenStack is cloud computing software, it is only logical that it provides its own VMs, which allow users to manage and modify their cloud environment easily. OpenStack comprises seven main components: Nova, Swift, Cinder, Neutron, Horizon, Keystone and Glance. Each component has its own tasks and uses in operating the OpenStack software.


Figure 2.1: Openstack Components

The first component is Horizon, the dashboard of OpenStack. It is the graphical user interface (GUI) for the user and the first component that the user sees when launching OpenStack. Horizon gives access to the other OpenStack components through their application programming interfaces (APIs) and gives the user an administration service to manage the cloud. The second component is Nova, the brain or main computing engine of OpenStack. Nova is tasked with a crucial part of OpenStack, which is to deploy and manage the large number of VM instances in an OpenStack cloud.

After that comes the Keystone component. Keystone is the identity or authentication service of OpenStack. It is tasked with providing and listing the identities of all the users in OpenStack and storing their permissions, that is, which components they are allowed to use and modify.

The fourth component of OpenStack is Swift. Swift is an object storage module that is tasked with storing the objects and files in the OpenStack system. After that we have Glance, the image service for OpenStack. Glance provides images, that is, saved virtual hard disks, which OpenStack can use, for example, when launching a new VM.

The sixth component is Neutron, the networking component of OpenStack. Neutron helps the VMs in OpenStack to communicate with each other easily and quickly. Finally there is Cinder, the block storage of OpenStack. Cinder also provides a storage service like Swift, but in a different way: Swift distributes objects across the storage cluster, whereas Cinder uses a more traditional approach and provides block volumes that can be attached to a VM and accessed directly, which suits cases where the speed of accessing the data is the priority for the user.


2.5 Log File

In an IT environment a log file is crucial in developing, maintaining and securing a system or application. A log file is a file that records the modifications or activities that have occurred in a system. As such, a log file lists all of the changes a user makes to the system, which is important because it can provide a clue or evidence that a change has happened. Other than that, a log file can also help us in debugging a system when we are faced with an error. The act of keeping a log file is called logging, in which a log file is saved so that it can be used if an error occurs. Log files can be divided into many categories to improve their readability; some examples of these categories are event logs, transaction logs, message logs and many more.
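As an illustration of what a single log record holds, the sketch below splits one line into its fields. The line layout (timestamp, process ID, level, module name, message) is an assumption modelled on the layout commonly seen in OpenStack service logs, and the sample line and the name parse_log_line are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout: "<date> <time>.<ms> <pid> <LEVEL> <module> <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<pid>\d+) "
    r"(?P<level>[A-Z]+) "
    r"(?P<module>\S+) "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    record["timestamp"] = datetime.strptime(record["timestamp"], "%Y-%m-%d %H:%M:%S.%f")
    record["pid"] = int(record["pid"])
    return record


# Hypothetical sample line, made up for illustration.
sample = "2018-12-05 10:15:30.123 2345 ERROR nova.compute.manager Instance failed to spawn"
print(parse_log_line(sample))
```

Once every line is turned into a record like this, the records can be filtered by level or sorted by timestamp, which is the basis for the summarised view that the project aims for.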

2.7 Summary of Openstack and log file system Research Paper

The table below shows the summary of the literature review related to OpenStack and log file systems.

Title of Paper: LOG FILE MANAGEMENT TOOLS
Author & Year: Alan Gatto, Dean Cottle, Oleg Fylypenko, Shivakumar Gurusiddappa, Kevin Haselhuhn, Greg Hollis, Luis Lamprea, Sergey Aleksin, Gaurav Kumar, Narendra Datar, Michael Pougnet, Poras Bharucha, Brett Dale (19-12-2017)
Background: When providing technical support for computer systems support specialists often use log files generated by various components of the systems to diagnose technical issues. Such logs are generally stored in various locations scattered across various computers (e.g., servers) of the software system and across the file systems of those computers thereby complicating collecting those logs from a customer computer system installation.

Title of Paper: METHODS AND SYSTEMS TO DETECT ANOMALIES IN COMPUTER SYSTEM BEHAVIOR BASED ON LOG-FILE SAMPLING
Author & Year: Darren Brown, Junyuan Lin, Nicholas Kushmerick (30-10-2018)
Background: Methods and systems that detect computer system anomalies based on log file sampling are described. Computers systems generate log files that record various types of operating system and software run events in event messages. For each computer system, a sample of event messages are collected in a first time interval and a sample of event messages are collected in a recent second time interval. Methods calculate a difference between the event messages collected in the first and second time intervals. When the difference is greater than a threshold, an alert is generated. The process of repeatedly collecting a sample of event messages in a recent time interval, calculating a difference between the event messages collected in the recent and previous time intervals, comparing the difference to the threshold, and generating an alert when the threshold is violated may be executed for each computer system of a cluster of computer systems.

Title of Paper: SYSTEM, METHOD, AND COMPUTER READABLE MEDIA FOR IDENTIFYING A USER-INITIATED LOG FILE RECORD IN A LOG FILE
Author & Year: Danny Yen-Fu Chen, David A. Cox, Sheryl S. Kinstler, Fabian F. Morgan (03-01-2017)
Background: A system, a method, and a computer readable media for identifying a user-initiated log file record in a log file are provided. The log file has a user-initiated log file record and a repeating pattern of log file records automatically generated by a software program. The system allows a user to identify first and second timestamp values corresponding to first and second times which identify a time interval of interest in the log file. The system further analyzes the log file to identify the user-initiated log file record having a timestamp value between the first and second timestamp values. The system further identifies the repeating pattern of log file records in the log file.

Title of Paper: OPENSTACK AND SOFTWARE-DEFINED NETWORKING: THE ENORMOUS POTENTIAL OF OPEN SOURCE SOFTWARE COLLABORATION
Author & Year: Hoai Le (September 2017)
Background: Throughout the theoretical part, cloud computing, OpenStack architecture, OpenStack core services, Software-Defined Networking architecture, and SDN-related technologies were researched. The outcome indicated that OpenStack and Software-Defined Networking could play well together. It also showed why people favored OpenStack, and why it has become one of the fastest growing open source communities.

Title of Paper: DISTRIBUTED LOG ANALYSIS ON THE CLOUD USING MAPREDUCE
Author & Year: Galip Aydin, Ibrahim Riza Hallac (10-02-2018)
Background: In this paper we describe our work on designing a web based, distributed data analysis system based on the popular MapReduce framework deployed on a small cloud; developed specifically for analyzing web server logs. The log analysis system consists of several cluster nodes, it splits the large log files on a distributed file system and quickly processes them using MapReduce programming model. The cluster is created using an open source cloud infrastructure, which allows us to easily expand the computational power by adding new nodes. This gives us the ability to automatically resize the cluster according to the data analysis requirements. We implemented MapReduce programs for basic log analysis needs like frequency analysis, error detection, busy hour detection etc. as well as more complex analyses which require running several jobs. The system can automatically identify and analyze several web server log types such as Apache, IIS, Squid etc. We use open source projects for creating the cloud infrastructure and running MapReduce jobs.
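The frequency analysis mentioned in the last entry can be pictured with a small map and reduce pair. This is only a single-process sketch of the MapReduce idea, not the cited authors' implementation. The function names and the sample lines are hypothetical, and each chunk stands in for the part of a log file that one cluster node would process.

```python
from collections import Counter
from functools import reduce

LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def map_chunk(lines):
    """Map step: count log levels within one chunk of lines."""
    counts = Counter()
    for line in lines:
        for level in LEVELS:
            if f" {level} " in line:
                counts[level] += 1
                break
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def level_frequency(chunks):
    """Run the map step on every chunk and fold the partial results together."""
    return reduce(reduce_counts, map(map_chunk, chunks), Counter())


# Hypothetical sample data: two chunks, as if held by two nodes.
chunks = [
    ["10:00:01 INFO service started", "10:00:02 ERROR disk full"],
    ["10:00:03 ERROR disk full", "10:00:04 WARNING slow response"],
]
print(level_frequency(chunks))
```

Because the reduce step only adds counters, the chunks can be processed in any order, which is what lets the work be spread over several nodes.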


CHAPTER 3

METHODOLOGY

3.1 INTRODUCTION

In this chapter I will report the methodology that was proposed by other researchers and how their research helps in building and improving the present framework. I will present the framework and system model, the flowchart and the approach I take in carrying out the project. Selecting the methodology that is most suited to the development of this project is crucial in determining its outcome, as choosing the wrong methodology may hinder the project timeline and become the reason for the project's delay or discontinuation. This is because a developer relies on the project timeline to guide the work, so a methodology that disrupts the timeline will unintentionally affect the whole project.


3.2 FLOWCHART

Figure 3.0 Flowchart of Openstack log file automation duplication

Figure 3.0 shows the overall flow chart for configuring the log file system in an OpenStack environment. The first step is to run a virtual machine and install the CentOS 7 operating system, preferably the minimal edition, on that virtual machine. Secondly, we install the OpenStack framework on that machine and then configure the existing log file system and the OpenStack dashboard, so that we can implement a duplication task for the log file system and possibly improve the OpenStack dashboard.


3.3 OPENSTACK INSTALLATION AND CONFIGURATION

3.3.1 INSTALLING CENTOS 7 IN A VIRTUAL MACHINE

Installing CentOS 7 in a virtual machine is a straightforward process. In this project we will be using VMware to simulate a secondary instance on our computer, on which we install CentOS 7 minimal. Installing CentOS requires any up-to-date CentOS 7 ISO and some tweaking of the configuration of the instance. This configuration ranges from setting the connection type of the instance to a bridged connection, to allowing the VM to use preferably more than 20 GB of storage so that OpenStack can run smoothly.


3.3.2 INSTALLING OPENSTACK

The first part of installing OpenStack on CentOS is to check whether the IP address of the VM is on the same network as the host computer. Other than that, we must stop, disable and remove unneeded services such as NetworkManager and the firewall, as OpenStack already has its own network management and it does not work well with NetworkManager and the firewall. After that we set the hostname of our system and synchronize the system time with a time server using ntpdate.

The second part is to find and install the missing repositories, such as the RPM distribution of OpenStack. From there we install the CentOS release package of OpenStack and then update all of the packages on our system. After that we install Packstack, an OpenStack utility that uses Puppet modules to help us deploy the OpenStack modules. Before running the installation, we configure the admin password, SSL, server password and many other settings in the Packstack answer file.

The third part is to start the OpenStack installation using the Packstack answer file that we configured earlier, which installs OpenStack automatically. Finally, after the installation we are able to access the OpenStack dashboard from a remote host.
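The three parts above can be summarised as an ordered command list. The sketch below only builds and prints the commands (a dry run) unless run_commands is called with dry_run=False. The hostname, the NTP server, the release name and the answer file path are placeholder assumptions, the two commands that enable the classic network service are an addition not described above, and the exact package names depend on the OpenStack release being installed.

```python
import shlex
import subprocess


def build_install_commands(hostname, ntp_server, release, answer_file):
    """Return the shell commands for a Packstack-based install, in order."""
    return [
        # Part 1: stop, disable and remove the conflicting services.
        "systemctl stop NetworkManager firewalld",
        "systemctl disable NetworkManager firewalld",
        "yum remove -y NetworkManager firewalld",
        # Assumption: the classic network service takes over from NetworkManager.
        "systemctl enable network",
        "systemctl start network",
        f"hostnamectl set-hostname {hostname}",
        f"ntpdate {ntp_server}",
        # Part 2: add the OpenStack repository, update, install Packstack.
        f"yum install -y centos-release-openstack-{release}",
        "yum update -y",
        "yum install -y openstack-packstack",
        f"packstack --gen-answer-file={answer_file}",
        # Part 3: run the installation once the answer file has been edited.
        f"packstack --answer-file={answer_file}",
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(command)
        if not dry_run:
            subprocess.run(shlex.split(command), check=True)


commands = build_install_commands(
    hostname="openstack.example.local",  # placeholder
    ntp_server="pool.ntp.org",           # placeholder
    release="rocky",                     # assumed release name
    answer_file="/root/answers.txt",     # placeholder path
)
run_commands(commands)  # dry run: prints the commands only
```

Keeping the list in one place makes it easy to review the order before anything is executed, which matters because the answer file has to be edited between the last two commands.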


3.3.3 CONFIGURING OPENSTACK SYSTEM

Each part of the OpenStack system is configured in its own configuration file. As such, to configure an OpenStack service we must first enter the directory of the node that contains the configuration file and then open the file to edit the OpenStack configuration manually. As an example, the Nova configuration file is located in the Nova directory (/etc/nova) and is named nova.conf. In the Nova file we will configure its logging section, where we will add settings that make the log file duplicate itself as a backup. Other than that, there is the dashboard, which we will try to simplify to make it more suitable for normal users.
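One possible way to make a service write every record twice is a second file handler in a Python logging configuration, which OpenStack services can load through the log_config_append option. The sketch below shows the idea with the standard logging module only. The logger name, the file names and the helper build_duplicating_logger are hypothetical and are not taken from Nova.

```python
import logging
import os
import tempfile


def build_duplicating_logger(name, primary_path, backup_path):
    """Create a logger that writes each record to both files."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    formatter = logging.Formatter(
        "%(asctime)s %(process)d %(levelname)s %(name)s %(message)s"
    )
    # One handler per destination: the primary log and its duplicate.
    for path in (primary_path, backup_path):
        handler = logging.FileHandler(path)
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


# Hypothetical file names inside a temporary directory.
workdir = tempfile.mkdtemp()
primary = os.path.join(workdir, "nova-compute.log")
backup = os.path.join(workdir, "nova-compute.backup.log")

log = build_duplicating_logger("demo.nova.compute", primary, backup)
log.info("instance started")

# Close the handlers so both files are complete on disk.
for handler in list(log.handlers):
    handler.close()
    log.removeHandler(handler)

with open(primary) as first, open(backup) as second:
    print(first.read() == second.read())
```

Writing the duplicate at logging time means the backup never lags behind the primary file, at the cost of doubling the write load on the node.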

3.3.4 OPENSTACK SYSTEM FLOW DESIGN

The design of the OpenStack system flow in this project is fairly simple. As we know, OpenStack log files are stored on their individual nodes and their configuration is based on the configuration of that node. As such, we will design a system that duplicates a log file whenever it is written and stores the copy in a backup folder. Other than that, we will merge the current data of the log files to create one centralized log file, where all of the data can be viewed in the OpenStack dashboard.
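A minimal sketch of this flow is given below, assuming every log line starts with a sortable timestamp and ends with a newline. The function names, the backup file naming and the sample data are hypothetical. A real deployment would also need to handle multi-line records and log rotation.

```python
import heapq
import os
import shutil
import tempfile
import time


def duplicate_log(log_path, backup_dir):
    """Copy a log file into backup_dir, adding a time suffix to the name."""
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = os.path.join(backup_dir, f"{os.path.basename(log_path)}.{stamp}")
    shutil.copy2(log_path, target)
    return target


def merge_logs(log_paths, merged_path):
    """Merge several time-ordered logs into one file ordered by timestamp."""
    files = [open(path) for path in log_paths]
    try:
        with open(merged_path, "w") as merged:
            # Lines compare correctly as text because each begins with its timestamp.
            merged.writelines(heapq.merge(*files))
    finally:
        for handle in files:
            handle.close()


# Hypothetical sample logs for two services.
workdir = tempfile.mkdtemp()
nova_log = os.path.join(workdir, "nova.log")
keystone_log = os.path.join(workdir, "keystone.log")
with open(nova_log, "w") as f:
    f.write("2018-12-05 10:00:01 nova INFO boot requested\n"
            "2018-12-05 10:00:05 nova ERROR boot failed\n")
with open(keystone_log, "w") as f:
    f.write("2018-12-05 10:00:03 keystone INFO token issued\n")

backup_copy = duplicate_log(nova_log, os.path.join(workdir, "backup"))
central_log = os.path.join(workdir, "central.log")
merge_logs([nova_log, keystone_log], central_log)
with open(central_log) as f:
    print(f.read())
```

heapq.merge reads the input files lazily, so the centralized file can be built without loading every log into memory at once.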


REFERENCE

Hoai Le. (2017). OpenStack and Software Defined Networking: The Enormous Potential of Open Source Software Collaboration.

Ranger, S. (2018, December 13). What is cloud computing? Everything you need to know about the cloud, explained. Retrieved from ZDNet: https://www.zdnet.com/article/what-is-cloud-computing-everything-you-need-to-know-from-public-and-private-cloud-to-software-as-a/

Chandan Kumar. (2018). Cloud-based Log Analyzer. 8 Cloud-based Log Analyzer for IT Operational Insights.

Ben Silverman. (2017). How to explain OpenStack to a complete newcomer.

Miao He, Jin Feng Li, Chang Rui Ren, Bing Shao, Ming Xie, Tian Zhi Zhao. (2017). Generating Important Values from a Variety of Server Log Files.

OpenStack. (2018, December 5). Introduction to OpenStack. Retrieved from OpenStack: https://docs.openstack.org/security-guide/introduction/introduction-to-openstack.html

OpenStack Community. (2018, December 5). Networking Services security best practices. Retrieved from OpenStack: https://docs.openstack.org/security-guide/networking/securing-services.html
