A SIMPLE TOOL FOR CONVERTING STRUCTURED

DATA TO SEMI-STRUCTURED DATA

MUHAMMAD SHAFIQ BIN NOR SAHAIDI

BACHELOR OF COMPUTER SCIENCE

(SOFTWARE DEVELOPMENT) WITH HONOURS

UNIVERSITI SULTAN ZAINAL ABIDIN

2019


DECLARATION

I hereby declare that this report is based on my original work except for quotations and citations, which have been duly acknowledged. I also declare that it has not been previously or concurrently submitted for any other degree at Universiti Sultan Zainal Abidin or other institutions.

___________________________________________

Name: …………………………………………………

Date: ……………………………………….


CONFIRMATION

This is to confirm that the project titled Structured to Semi-Structured Data Converter was prepared and submitted by MUHAMMAD SHAFIQ BIN NOR SAHAIDI, matric number BTAL16043506, and has been found satisfactory in terms of scope, quality and presentation as partial fulfilment of the requirements for the Bachelor of Computer Science (Software Development) at Universiti Sultan Zainal Abidin (UniSZA). The research conducted and the writing of this report were carried out under my supervision.

___________________________________________

Name: …………………………………………………

Date: ……………………………………….


DEDICATION

First of all, thanks to Allah, the Most Merciful and the Most Gracious. I would like to thank my parents, who gave me a great deal of support throughout my studies. My sincere thanks go to my supervisor, PROF MADYA DR NORDIN BIN ABD RAHMAN, for his guidance and support in my academic work. I am very thankful for his patience and invaluable advice, which inspired me to keep pursuing my dream, and I feel honoured by his trust in my capability.

Last but not least, special thanks to my beloved friends for providing moral support during my studies, not forgetting those who have directly or indirectly helped me with this thesis. May Allah reward your kindness in abundance in this world and the next.


ABSTRACT

SQL is a domain-specific language designed for managing data in a database management system (DBMS). It is well suited to structured data, in which the entities or variables of the data are related to one another. The scope of SQL covers data query, data manipulation (insert, update and delete), data definition (creation and modification) and data access control. SQL is a language for querying data that is represented as tables, but it does not specify how those tables are stored. In this project, a conversion tool is developed so that data can be shared between applications in XML and JSON format. XML files are plain text files stored on a machine, and they are well suited to storing configuration settings and passing data between different systems. JSON is similar to XML but has advantages in certain scenarios: it breaks data down cleanly and is easier to pass between servers, where XML is less preferred because of the overhead of its tags. JSON is used primarily to transmit data between a server and a web application as an alternative to XML, and it has been popularized by web services developed using Representational State Transfer (REST) principles. New applications designed for comparing data often need to read databases from other systems. For this purpose, users often prefer semi-structured data to structured data because the data does not reside in fixed fields or records, yet it contains elements that separate the data into hierarchies. This makes it simpler for users to upload data to a new system according to its interface design.

Table of Contents

DECLARATION
CONFIRMATION
DEDICATION
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER 1 INTRODUCTION
1.1 Background
1.2 Problem Statement
1.3 Objectives
1.4 Scope
1.5 Limitation of Work
1.6 Expected Outcome
1.7 Milestone
1.8 Gantt Chart

CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
2.2 Web Service
2.2.1 SOAP (Simple Object Access Protocol)
2.2.2 WSDL (Web Services Description Language)
2.3 Review of Converter Tools in Different Journals

CHAPTER 3 METHODOLOGY
3.1 Introduction
3.2 Project Methodology
3.3 System Requirements
3.4 System Design
3.5 Context Diagram (CD)
3.6 Entity Relationship Diagram (ERD)
3.7 Data Flow Diagram (DFD)

CHAPTER 4 IMPLEMENTATION AND TESTING
4.1 Introduction
4.2 Implementation

CHAPTER 5 CONCLUSION
5.1 Introduction
5.2 Project Contribution
5.3 Overall Conclusions

REFERENCES

LIST OF TABLES

TABLE   TITLE
1.1     Milestone
1.2     Gantt Chart
2.1     Table of Comparison
3.1     Software Requirement
3.2     Hardware Requirement

LIST OF FIGURES

FIGURE  TITLE
3.1     Iterative Model
3.2     Context Diagram
3.3     Entity Relationship Diagram
3.4     Data Flow Diagram
4.1     Homepage
4.2     Sign up Page
4.3     Log in Page
4.4     Task Button
4.5     User Homepage
4.6     Input area
4.7     Output area
4.8     Load from URL Function


LIST OF ABBREVIATIONS

SQL Structured Query Language

XML Extensible Markup Language

JSON JavaScript Object Notation

REST Representational State Transfer

SOAP Simple Object Access Protocol

WSDL Web Services Description Language

UDDI Universal Description, Discovery, and Integration

CD Context Diagram

ERD Entity Relationship Diagram

DFD Data Flow Diagram


CHAPTER 1

INTRODUCTION

1.1 Background

Data today arrives in diverse forms from diverse sources. The rapid decrease in the cost of storing data and the growth of distributed systems have led to an explosion of machine-generated data from applications, sensors, mobile devices and more. Semi-structured formats such as JSON and XML have become the de facto form in which this data is sent and stored, because they are easy for applications to create and capable of representing a wide array of information. When structured data is converted to semi-structured data, there is therefore a choice of target format. In this project, the structured data is held in an SQL database. Before it can be passed from a server to a client, or to a web page for JavaScript to consume and present to the user, it needs to be restructured into a semi-structured form that can be stored as a text file. In addition, data that needs to be shared between applications or systems is often organized in a complex manner that makes it difficult to access and analyse, although it may carry associated information that allows the elements it contains to be addressed. Developing this tool will therefore help users to convert structured data (SQL) to semi-structured data (XML or JSON).
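To make the conversion concrete, the following is a minimal sketch in Python. It is an illustration only and not the project's own implementation. It reads the rows of a relational table and emits them once as JSON and once as XML. The `student` table, its columns, and the helper names `fetch_rows`, `rows_to_json` and `rows_to_xml` are assumptions made for this example, and an in-memory SQLite database stands in for a real SQL server.

```python
import json
import sqlite3
import xml.etree.ElementTree as ET


def fetch_rows(conn, table):
    """Return every row of `table` as a list of dicts keyed by column name."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def rows_to_json(rows):
    """Serialise the rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)


def rows_to_xml(rows, root_tag, row_tag):
    """Serialise the rows as XML: one element per row, one child per column.

    Assumes the column names are valid XML element names.
    """
    root = ET.Element(root_tag)
    for row in rows:
        item = ET.SubElement(root, row_tag)
        for column, value in row.items():
            ET.SubElement(item, column).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data standing in for a real SQL database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ali"), (2, "Siti")])

rows = fetch_rows(conn, "student")
json_text = rows_to_json(rows)
xml_text = rows_to_xml(rows, "students", "student")
print(json_text)
print(xml_text)
```

The same list of row dictionaries feeds both serialisers, which is one simple way to offer XML and JSON as alternative targets for the same source data.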


1.2 Problem Statement

Building a comparison system requires data that is easy to access and process and whose elements can be addressed. Structured data has a logical structure that ensures a clear flow of control and makes the database easier to update or fix. However, when it is used for sharing information, it can lead to:

I. Lack of information hiding
II. Lack of encapsulation
III. Difficulty in tracking data changes: if a single data structure in a program is changed, the effects are hard to follow even in a reasonably sized program.

In addition, semi-structured data is a better way of sharing data between applications because:

I. It is possible to view structured data as semi-structured
II. It is flexible, as the schema can easily be changed
III. The data is not constrained by a fixed schema
IV. It is portable

Therefore, a conversion tool is needed that lets users convert structured data (SQL) to semi-structured data (XML or JSON) in a short time, rather than recreating the same data from scratch in a different file format.


1.3 Objectives

The main objective of this project is to develop a tool that converts structured data (SQL) to semi-structured data (XML/JSON) so that a wide range of information can be represented. To achieve this, the objectives of this project are:

I. To support nested or hierarchical data, which simplifies data models representing complex relationships between entities.
II. To convert structured data to a semi-structured data format for transmitting data between a server and a web application.
III. To develop a tool that converts SQL files to the XML or JSON file format, which are data interchange formats that enable data from one machine to be stored or processed on another machine.
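As an illustration of the first objective, the sketch below shows one way a one-to-many relationship held in two relational tables could be folded into nested JSON objects. It is written in Python for illustration only. The `customer` and `purchase` tables and the function name `nest_purchases` are hypothetical and do not come from the project.

```python
import json
import sqlite3

# Hypothetical two-table schema with a one-to-many relationship.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customer VALUES (1, 'Ali');
    INSERT INTO customer VALUES (2, 'Siti');
    INSERT INTO purchase VALUES (10, 1, 'Book');
    INSERT INTO purchase VALUES (11, 1, 'Pen');
    INSERT INTO purchase VALUES (12, 2, 'Bag');
""")


def nest_purchases(conn):
    """Fold the customer/purchase relation into one nested object per customer."""
    customers = {}
    query = """
        SELECT c.id, c.name, p.id, p.item
        FROM customer AS c
        LEFT JOIN purchase AS p ON p.customer_id = c.id
        ORDER BY c.id, p.id
    """
    for cust_id, name, purchase_id, item in conn.execute(query):
        entry = customers.setdefault(
            cust_id, {"id": cust_id, "name": name, "purchases": []}
        )
        if purchase_id is not None:  # a customer with no purchases keeps an empty list
            entry["purchases"].append({"id": purchase_id, "item": item})
    return list(customers.values())


nested = nest_purchases(conn)
print(json.dumps(nested, indent=2))
```

In the relational form the relationship is expressed through the `customer_id` foreign key. In the nested form each customer object carries its own list of purchases, so a consumer can read it without performing a join.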


1.4 Scope

This project focuses on developing a conversion tool that users can use according to their needs. The tool includes:

I. Target user:
a. User
1) Access the system
2) Sign up and log in
3) Upload a file to convert
4) Load from URL
5) Write SQL database
6) Convert SQL file to XML or JSON format
7) Update converted file
8) Download converted file
9) Save file
10) Log out


1.5 Limitation of Work

Many users can access the system at the same time and convert SQL files to XML or JSON files, but the system converts to only one file format at a time, either XML or JSON.

1.6 Expected Outcome

The following are the expected outcomes of this project:

I. A tool that converts the SQL file format to the XML and JSON file formats.
II. A web service tool that users can access and use according to their needs.


1.7 Milestone

Milestone                                                         Date            Status
Topic discussion and determination                                Week 1          Completed
Project title proposal                                            Week 2          Completed
Proposal writing: Introduction                                    Week 3          Completed
Proposal writing: Literature review                               Week 4 & 5      Completed
Proposal progress presentation & evaluation                       Week 6          Completed
Discussion, proposal correction & proposed solution methodology   Week 7          Completed
Proposed solution methodology (continued)                         Week 8          Completed
Proof of concept                                                  Week 9          Completed
Drafting report of the proposal                                   Week 10 & 11    Completed
Submit draft report to supervisor                                 Week 12         Completed
Seminar presentation                                              Week 13         Completed
Report correction                                                 Week 14         Completed
Final report submission                                           Week 15         Completed

Table 1.1: Milestone


1.8 Gantt Chart

Table 1.2: Gantt chart


CHAPTER 2

LITERATURE REVIEW

2.1 Introduction

This chapter reviews the translation of data using SQL Server Management Studio, the Adventure Works database, and Java. The aim of this review is to give an overview of how the converter tool is built using a Java-based method.

2.2 Web Service

A web service commonly provides an object-oriented, web-based interface to a database server. It is used by another web server, or by a mobile app, that provides a user interface to the end user. The term web services describes a standardized way of integrating web-based applications using the XML, SOAP, WSDL and UDDI open standards over an Internet protocol backbone.

I. XML is used to tag the data.

II. SOAP is used to transfer the data.

III. WSDL is used for describing the services available.

IV. UDDI is used for listing what services are available.

Web services allow different applications from different sources to communicate with

each other without time-consuming custom coding, and because all communication is

in XML, Web services are not tied to any one operating system or programming

language.


2.2.1 SOAP (Simple Object Access Protocol)

SOAP (originally Simple Object Access Protocol) is a messaging protocol specification

for exchanging structured information in the implementation of web services in

computer networks. Its purpose is to induce extensibility, neutrality and independence.

It uses XML Information Set for its message format, and relies on application layer

protocols, most often Hypertext Transfer Protocol (HTTP) or Simple Mail Transfer

Protocol (SMTP), for message negotiation and transmission. SOAP allows processes running on disparate operating systems (such as Windows and Linux) to communicate using Extensible Markup Language (XML). Since Web protocols like HTTP are installed and running on all operating systems, SOAP allows clients to invoke web services and receive responses independent of language and platform.
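The XML message format described above can be illustrated with a short Python sketch that builds a SOAP 1.1 envelope using the standard library. The envelope namespace is the one defined for SOAP 1.1. The service namespace `http://example.com/converter`, the operation name `ConvertTable` and its parameters are hypothetical and serve only as an example.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/converter"  # hypothetical service namespace

# Use the conventional "soap" prefix when the envelope is serialised.
ET.register_namespace("soap", SOAP_NS)


def build_envelope(operation, params):
    """Wrap one operation call and its parameters in a SOAP Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(request, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical request asking a converter service for a table as JSON.
message = build_envelope("ConvertTable", {"table": "student", "format": "json"})
print(message)
```

In practice such a message would be sent as the body of an HTTP POST request to the service endpoint. The sketch stops at building the XML because the transport details depend on the service.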

2.2.2 WSDL (Web Services Description Language)

WSDL is an XML-based interface description language that is used for describing the

functionality offered by a web service. The acronym is also used for any specific WSDL

description of a web service (also referred to as a WSDL file), which provides a

machine-readable description of how the service can be called, what parameters it expects, and what data structures it returns. Therefore, its purpose is roughly similar to

that of a type signature in a programming language.
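The comparison with a type signature can be made concrete with a small Python sketch that reads a heavily trimmed WSDL 1.1 fragment and lists each operation together with its input and output messages. The fragment, the service namespace and the names `ConverterPort` and `ConvertTable` are hypothetical. A real WSDL file also contains types, messages, bindings and service elements that are omitted here.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}  # WSDL 1.1 namespace

# Hypothetical, heavily trimmed WSDL fragment for a converter service.
WSDL_TEXT = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="http://example.com/converter"
             targetNamespace="http://example.com/converter">
  <portType name="ConverterPort">
    <operation name="ConvertTable">
      <input message="tns:ConvertTableRequest"/>
      <output message="tns:ConvertTableResponse"/>
    </operation>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return (operation, input message, output message) for each operation."""
    root = ET.fromstring(wsdl_text.strip())
    signatures = []
    for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
        inp = op.find("wsdl:input", WSDL_NS)
        out = op.find("wsdl:output", WSDL_NS)
        signatures.append((op.get("name"), inp.get("message"), out.get("message")))
    return signatures


operations = list_operations(WSDL_TEXT)
print(operations)
```

Each tuple reads much like a function signature: the operation name, what it takes, and what it returns.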


2.3 Review of Converter Tools in Different Journals

Table 2.1 shows a review of converter tool concepts from different perspectives.

Author / Journal / Year: Markus Forsberg & Aarne Ranta / December 2003
System Name: BNF Converter
Method: Labelled BNF (LBNF)
Description: This system is a multi-lingual compiler. It can compile like any other compiler but can work in various languages without the need to change the code editor and use a different compiler.

Author / Journal / Year: Sanjay Chatterji, Subrangshu Sengupta, Bagadhi Gopal Rao, and Debarghya Banerjee / 2014
System Name: A Tool for Converting Different Data Representation Formats
Method: (not stated)
Description: This system converts an in-memory representation to a textual representation or vice versa.

Author / Journal / Year: Rajasekar Krishnamurthy / 2004
System Name: XML-to-SQL Query Translation
Method: XML-to-Relational mapping
Description: This system uses relational databases to store and query existing XML data. In a second scenario, existing relational data is exported as XML.

Table 2.1: Table of comparison


CHAPTER 3

METHODOLOGY

3.1 Introduction

This chapter describes the process, or methodology, used when developing the tool. There are many models based on the Software Development Life Cycle (SDLC), such as the waterfall model, the spiral model, the iterative model and many more. For this tool, the Iterative model was chosen, which consists of several phases carried out in repeated cycles. Each phase is explained in this chapter.

3.2 Project Methodology

The iterative model was developed to overcome the weaknesses of the waterfall model. It starts with initial planning and ends with deployment, with cyclic interactions in between. The basic idea behind this method is to develop a system through repeated cycles (iterative) and in smaller portions at a time (incremental), allowing software developers to take advantage of what was learned during the development of earlier parts or versions of the system. It can consist of mini waterfalls or mini V-shaped models. This model was chosen because it can accommodate change requests between increments, focuses on customer value more than linear approaches do, and can detect project issues and changes earlier.


Figure 3.1: Iterative model

1) Planning & Requirements: As with most development projects, the first step is to go through an initial planning stage to map out the specification documents, establish software or hardware requirements, and generally prepare for the upcoming stages of the cycle.

2) Analysis & Design: Once planning is complete, an analysis is performed to

nail down the appropriate business logic, database models, and the like that

will be required at this stage in the project. The design stage also occurs

here, establishing any technical requirements (languages, data layers,

services, etc.) that will be utilized in order to meet the needs of the analysis

stage.


3) Implementation: With the planning and analysis out of the way, the actual

implementation and coding process can now begin. All planning,

specification, and design docs up to this point are coded and implemented

into this initial iteration of the project.

4) Testing: Once this current build iteration has been coded and implemented,

the next step is to go through a series of testing procedures to identify and

locate any potential bugs or issues that have cropped up.

5) Evaluation: Once all prior stages have been completed, it is time for a

thorough evaluation of development up to this stage. This allows the entire

team, as well as clients or other outside parties, to examine where the project

is at, where it needs to be, what can or should change, and so on.


3.3 System Requirements

3.3.1 Software Requirements

Table 3.1: Software requirement

SOFTWARE                                        DESCRIPTION
Windows 10                                      The operating system used to house all the applications and tools
Microsoft Office 2013                           Used to prepare the documentation and presentation of the report
XAMPP Server                                    Manages the connection between Apache and MySQL on the local host server
Draw.io                                         Used to create figures and diagrams such as the ERD, DFD and Context Diagram
Web browsers (Google Chrome, Mozilla Firefox)   Software to run and display the system
Notepad++                                       Source code editor used for writing code


3.3.2 Hardware Requirements

HARDWARE                                DESCRIPTION
Acer Aspire F 15                        The computer used to develop the system, with the following specification:
                                        - Intel Core i5-6200U 2.3GHz-2.8GHz
                                        - NVIDIA GeForce 940MX graphics + 4GB VRAM
                                        - Windows 10
                                        - 128GB SSD + 1TB HDD
                                        - 4GB DDR4 memory
Kingston DataTraveler USB 3.0 (16GB)    Used to store and transfer data files
Printer                                 Used to print out the documentation files

Table 3.2: Hardware requirement


3.4 System Design

A design model is a diagram, built to scale, that represents the system in detail in order to explain how it functions. The system design is explained using a Context Diagram (CD), a Data Flow Diagram (DFD) and an Entity Relationship Diagram (ERD).

3.5 Context Diagram (CD)

Figure 3.2: Context Diagram (CD)


3.6 Entity Relationship Diagram (ERD)

Figure 3.3: Entity Relationship Diagram (ERD)


3.7 Data Flow Diagram (DFD)

Figure 3.4: Data Flow Diagram (DFD)


CHAPTER 4

IMPLEMENTATION AND TESTING

4.1 Introduction

This chapter describes the implementation of the Structured to Semi-structured Converter tool. The purpose of implementation and testing is to develop the product as proposed in the previous phase and to ensure that it fully meets the objectives. Both should be completed before the tool is put into full use. Implementation means that the hardware and software components are installed, the selected software is configured and tested, and the software is customized where necessary to meet local functional requirements, so that it becomes a fully operational production system. Depending on the testing method employed, testing can be carried out at any point in the development process. However, most of the testing effort takes place after the requirements have been defined and the coding process has been completed.


4.2 Implementation

The tool was developed as a prototype that is almost fully functional. It was developed using HTML for the interface, PHP as the programming language for its functions, and SQL for writing database queries.

Figure 4.1: Homepage

Figure 4.1 shows the index page of the system, which is the home page for users. It has buttons that lead to the sign up and log in pages.


Figure 4.2: Sign up page

Figure 4.2 shows the sign up page, where first-time users need to register before they can use the tool.

Figure 4.3: Log in page

Figure 4.3 shows the log in form, where users enter their credentials in order to use the tool after completing the registration in Figure 4.2. Users must use the username and password that were registered in the sign up form.


Figure 4.4: Task Button

Figure 4.4 shows the task buttons, which provide the functions that users can explore and use to perform tasks with the tool.


Figure 4.5: User Homepage

Figure 4.5 shows the user homepage after logging in. It shows the username at the top left of the page and a log out link that lets the user log out and return to the homepage in Figure 4.1.


Figure 4.6: Input area

Figure 4.6 shows the input area, where the user writes an SQL query, inserts input using the task button functions, or pastes input, before converting the structured data to semi-structured data with the tool.
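One possible way to process SQL text entered in such an input area is sketched below in Python. This is an illustration under stated assumptions and not the project's PHP code. The function name `sql_script_to_json` and the `book` table are hypothetical, and the SQL is run in a throwaway in-memory SQLite database, so only statements that SQLite accepts will work (MySQL-specific syntax may not).

```python
import json
import sqlite3


def sql_script_to_json(sql_text):
    """Run the SQL text in a throwaway database and dump every table as JSON."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(sql_text)
        # List the user tables created by the script, skipping SQLite internals.
        tables = [
            name
            for (name,) in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        dump = {}
        for table in tables:
            cur = conn.execute(f'SELECT * FROM "{table}"')
            columns = [d[0] for d in cur.description]
            dump[table] = [dict(zip(columns, row)) for row in cur]
        return json.dumps(dump, indent=2)
    finally:
        conn.close()


# Hypothetical text a user might paste into the input area.
pasted_sql = """
CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, price REAL);
INSERT INTO book VALUES (1, 'SQL Basics', 25.5);
INSERT INTO book VALUES (2, 'Learning XML', 30.0);
"""

output_text = sql_script_to_json(pasted_sql)
print(output_text)
```

Executing the statements in an isolated database avoids writing a custom SQL parser, at the cost of supporting only the SQL dialect of the engine used.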


Figure 4.7: Output area

Figure 4.7 shows the output area, where the user receives the output after performing a conversion using the convert function among the task buttons shown in Figure 4.4.


Figure 4.8: Load from URL function

Figure 4.8 shows the form for the Load from URL function shown in Figure 4.4, where the user can paste a link to another page to retrieve data into the input area.
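A function of this kind could be sketched as follows in Python. This is a minimal illustration and not the project's code. The name `load_from_url`, the size limit, the timeout and the restriction to `http` and `https` links are all assumptions chosen for the example.

```python
from urllib.parse import urlparse
from urllib.request import urlopen

ALLOWED_SCHEMES = {"http", "https"}  # assumption: only web links are accepted


def load_from_url(url, opener=urlopen, max_bytes=1_000_000, timeout=10):
    """Fetch text from `url` so that it can be placed in the input area.

    `opener` defaults to urllib's urlopen and can be replaced by a stub in tests.
    """
    scheme = urlparse(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported URL scheme: {scheme!r}")
    with opener(url, timeout=timeout) as response:
        raw = response.read(max_bytes)  # cap the amount of data downloaded
    return raw.decode("utf-8", errors="replace")


# Example call (hypothetical URL, not executed here):
# sql_text = load_from_url("https://example.com/export.sql")
```

Rejecting other schemes such as `file` keeps a pasted link from reading local files on the server, and the byte cap keeps one large download from exhausting memory.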


CHAPTER 5

CONCLUSION

5.1 Introduction

This chapter concludes the documentation of this project in terms of planning, design, implementation and testing.

5.2 Project Contribution

The Structured to Semi-structured Converter tool makes it simple for users to generate a semi-structured data format from structured data, and lets them choose whether to convert SQL files to the XML or the JSON format. On the homepage or user homepage, users see a button menu of functions to choose from, including a choice of how to insert data. The system successfully converts the SQL format to the XML or JSON format and allows the output content to be updated.


5.3 Overall Conclusions

In brief, this project has been developed and carried out in line with the objectives explained in Chapter 1. The tool helps users to convert SQL strings or data to the XML or JSON format, and makes it simple to generate a semi-structured data format from structured data in one click. It also lets users select the target semi-structured format, XML or JSON, based on their needs. Users are also given a choice of functions: uploading, downloading, converting an SQL file to the XML or JSON file format, and saving progress.


REFERENCES

Balog, K. (2013). Semistructured Data Search. University of Stavanger: Promise Winter School 2013.

Bethke, U. (2017, August 19). Which is better for storing data? SQL, JSON, or XML? Retrieved from Quora: https://www.quora.com/Which-is-better-for-storing-data-SQL-JSON-or-XML

Kathirvalavakumar, R. P. (2014). Mining Intelligence and Knowledge Exploration. Cork, Ireland: Second International Conference.

Massimo. (2009, August 6). Where are the differences using XML and MySQL database? Which should i use? Retrieved from stackoverflow: https://stackoverflow.com/questions/1238749/where-are-the-differences-using-xml-and-mysql-database-which-should-i-use

Mburu, D. (2017, November 20). What are the advantages and disadvantages of structured programming? Retrieved from Quora: https://www.quora.com/What-are-the-advantages-and-disadvantages-of-structured-programming

Rouse, M. (2014, November). semi-structured data. Retrieved from WhatIs.com: https://whatis.techtarget.com/definition/semi-structured-data