A SIMPLE TOOL FOR CONVERTING STRUCTURED
DATA TO SEMI-STRUCTURED DATA
MUHAMMAD SHAFIQ BIN NOR SAHAIDI
BACHELOR OF COMPUTER SCIENCE
(SOFTWARE DEVELOPMENT) WITH HONOURS
UNIVERSITI SULTAN ZAINAL ABIDIN
2019
DECLARATION
I hereby declare that this report is based on my original work except for quotations and
citations, which have been duly acknowledged. I also declare that it has not been
previously or concurrently submitted for any other degree at Universiti Sultan Zainal
Abidin or other institutions.
___________________________________________
Name: …………………………………………………
Date: ……………………………………….
CONFIRMATION
This is to confirm that the project titled Structured to Semi-Structured Data Converter
was prepared and submitted by MUHAMMAD SHAFIQ BIN NOR SAHAIDI, matric
number BTAL16043506, and has been found satisfactory in terms of scope, quality and
presentation as partial fulfilment of the requirements for the Bachelor of Computer
Science (Software Development) at Universiti Sultan Zainal Abidin (UniSZA). The
research and the writing of this report were carried out under my supervision.
___________________________________________
Name: …………………………………………………
Date: ……………………………………….
DEDICATION
First of all, thanks to Allah, the most merciful and the most gracious. I would like to
thank my parents, who gave me a great deal of support throughout my studies. My
grateful thanks go to my supervisor, PROF MADYA DR NORDIN BIN ABD
RAHMAN, for his guidance and support in my academic work. I am very thankful for his
patience and invaluable advice, which inspired me to keep working towards my dream,
and I feel honoured by his trust in my capability.
Last but not least, special thanks to my beloved friends for providing moral support
during my studies, not forgetting those who have directly or indirectly helped me with
this thesis. May Allah reward your kindness in abundance in this world and the next.
ABSTRACT
SQL is a domain-specific language designed for managing data in a database management
system (DBMS). SQL handles structured data well, that is, data with defined relations
between its entities or variables. Its scope covers data query, data manipulation
(insert, update and delete), data definition (creation and modification) and data access
control. SQL is a language for querying data that is represented as tables, but it does
not specify how those tables are stored. In this project, a conversion tool is developed
so that data can be shared between applications in XML and JSON formats. XML files are
plain text files stored on a machine, and they are well suited to storing configuration
settings and passing data between different systems. JSON serves much the same purpose
as XML but has advantages in certain scenarios. It breaks data down well and is
convenient for passing data to and from a server, where XML is less preferred because of
its tags. JSON is used primarily to transmit data between a server and a web
application, as an alternative to XML, and has been popularized by web services built on
Representational State Transfer (REST) principles. Currently, new applications are being
designed for comparison purposes, and these need to read databases from other systems.
Hence, users often prefer semi-structured data to structured data, because the data does
not reside in fixed fields or records but still contains elements that separate it into
hierarchies, which makes it simpler for users to upload the data to their new system
according to its interface design.
Table of Contents
DECLARATION
CONFIRMATION
DEDICATION
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER 1
INTRODUCTION
1.1 Background
1.2 Problem Statement
1.3 Objectives
1.4 Scope
1.5 Limitation of Work
1.6 Expected Outcome
1.7 Milestone
1.8 Gantt Chart
CHAPTER 2
LITERATURE REVIEW
2.1 Introduction
2.2 Web Service
2.2.1 SOAP (Simple Object Access Protocol)
2.2.2 WSDL (Web Services Description Language)
2.3 Review of Converter Tools in Different Journals
CHAPTER 3
METHODOLOGY
3.1 Introduction
3.2 Project Methodology
3.3 System Requirements
3.4 System Design
3.5 Context Diagram (CD)
3.6 Entity Relationship Diagram (ERD)
3.7 Data Flow Diagram (DFD)
CHAPTER 4
IMPLEMENTATION AND TESTING
4.1 Introduction
4.2 Implementation
CHAPTER 5
CONCLUSION
5.1 Introduction
5.2 Project Contribution
5.3 Overall Conclusions
REFERENCES
LIST OF TABLES
TABLE TITLE
1.1 Milestone
1.2 Gantt Chart
2.1 Table of Comparison
3.1 Software Requirement
3.2 Hardware Requirement
LIST OF FIGURES
FIGURE TITLE
3.1 Iterative Model
3.2 Context Diagram
3.3 Entity Relationship Diagram
3.4 Data Flow Diagram
4.1 Homepage
4.2 Sign up Page
4.3 Log in Page
4.4 Task Button
4.5 User Homepage
4.6 Input area
4.7 Output area
4.8 Load from URL Function
LIST OF ABBREVIATIONS
SQL Structured Query Language
XML Extensible Markup Language
JSON JavaScript Object Notation
REST Representational State Transfer
SOAP Simple Object Access Protocol
WSDL Web Services Description Language
UDDI Universal Description, Discovery, and Integration
CD Context Diagram
ERD Entity Relationship Diagram
DFD Data Flow Diagram
CHAPTER 1
INTRODUCTION
1.1 Background
When converting structured data to semi-structured data, there is a choice of target
semi-structured format. Today, data arrives in diverse forms from diverse sources. The
rapid decrease in the cost of storing data and the growth in distributed systems have
led to an explosion of machine-generated data. This includes data from applications,
sensors, mobile devices, and more. Semi-structured data formats such as JSON, XML, and
others have become the de facto form in which this data is sent and stored.
Semi-structured data is easy for these applications to create and capable of
representing a wide array of information. In this scenario, structured data is stored in
a relational (SQL) database. To pass that data from a server to a client, or to a web
page for JavaScript to consume and present to the user, it needs to be restructured into
a semi-structured format that can be stored and transmitted as a text file. In addition,
data that needs to be shared between applications or systems is often organized in a
complex manner that makes it difficult to access and analyse. However, semi-structured
data may carry associated information, such as tags or keys, that allows the elements it
contains to be addressed. Thus, this tool is developed to help users convert structured
data (SQL) to semi-structured data (XML or JSON).
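To make the difference between the two kinds of data concrete, the following minimal Python sketch shows one relational row rendered as JSON and as XML. The table name, column names and values are hypothetical and chosen only for illustration; the project itself was implemented in PHP, so this is not the tool's own code.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical relational row: column names plus one tuple of values, as a
# query such as "SELECT id, name, course FROM student" might return them.
columns = ("id", "name", "course")
row = (1, "Aminah", "Software Development")

# Pair each value with its column name.
record = dict(zip(columns, row))

# JSON: the keys travel with the values, so a client can read the record
# without knowing the table layout in advance.
as_json = json.dumps(record)

# XML: each value is wrapped in a tag named after its column.
element = ET.Element("student")
for column, value in record.items():
    ET.SubElement(element, column).text = str(value)
as_xml = ET.tostring(element, encoding="unicode")

print(as_json)  # {"id": 1, "name": "Aminah", "course": "Software Development"}
print(as_xml)   # <student><id>1</id><name>Aminah</name><course>Software Development</course></student>
```

In both outputs the field names are embedded in the text itself, which is what allows each element to be addressed without a separate schema.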
1.2 Problem Statement
Building a comparison system requires data that is easy to access and process and whose
elements can be addressed. Structured data has a logical structure that ensures a clear
flow of control and makes the database easier to update or fix. However, when it is used
for sharing data, it can lead to:
I. Lack of information hiding
II. Lack of encapsulation
III. Difficulty in tracking data changes: if a single data structure in a program
is changed, the effects are hard to trace even in a reasonably sized program.
In contrast, semi-structured data is a better way of sharing data between applications
because:
I. Structured data can be viewed as semi-structured data
II. It is flexible, as the schema can easily be changed
III. The data is not constrained by a fixed schema
IV. It is portable
Therefore, a conversion tool is needed so that users can convert structured SQL data to
semi-structured XML or JSON data in a short time, rather than recreating the same data
in a different file format.
1.3 Objectives
The main objective of this project is to develop a tool that can convert structured data
(SQL) to semi-structured data (XML/JSON) so that a wide range of information can be
represented. To achieve this, the objectives of this project are:
I. To support nested or hierarchical data, which simplifies data models
representing complex relationships between entities.
II. To convert structured data to a semi-structured data format for transmitting data
between a server and a web application.
III. To develop a tool that converts SQL files to the XML or JSON file format, a data
interchange format that enables data from one machine to be stored or processed on
another machine.
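The first objective, support for nested data, can be illustrated with a short Python sketch. It assumes two hypothetical related tables, customers and orders, linked by a foreign key; the names and rows are invented for this example and do not come from the project's database.

```python
import json
from collections import defaultdict

# Hypothetical rows, as they might come from two related SQL tables.
customers = [
    {"id": 1, "name": "Aminah"},
    {"id": 2, "name": "Farid"},
]
orders = [
    {"id": 10, "customer_id": 1, "item": "Keyboard"},
    {"id": 11, "customer_id": 1, "item": "Mouse"},
    {"id": 12, "customer_id": 2, "item": "Monitor"},
]

def nest_orders(customers, orders):
    """Embed each customer's orders inside the customer record."""
    by_customer = defaultdict(list)
    for order in orders:
        # The foreign key is dropped: nesting now expresses the relationship.
        by_customer[order["customer_id"]].append(
            {"id": order["id"], "item": order["item"]}
        )
    return [
        {**customer, "orders": by_customer.get(customer["id"], [])}
        for customer in customers
    ]

nested = nest_orders(customers, orders)
print(json.dumps(nested, indent=2))
```

The one-to-many relationship that needs a join in SQL becomes a list nested inside each customer object, which is the simplification the objective refers to.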
1.4 Scope
This project focuses on developing a conversion tool that users can use according to
their needs. The tool covers:
I. Target user:
a. User
1) Access the system
2) Sign up and log in
3) Upload a file to convert
4) Load from URL
5) Write SQL database statements
6) Convert an SQL file to XML or JSON format
7) Update the converted file
8) Download the converted file
9) Save the file
10) Log out
1.5 Limitation of Work
Many users can access the system at the same time and convert SQL files to XML or JSON
files, but each conversion produces only one file format at a time, either XML or
JSON.
1.6 Expected Outcome
The expected outcomes of this project are:
I. A tool that converts the SQL file format to the XML and JSON file formats.
II. A web-based tool that users can access and use according to their needs.
1.7 Milestone
Milestone | Date | Status
Topic discussion and determination | Week 1 | Completed
Project title proposal | Week 2 | Completed
Proposal writing - Introduction | Week 3 | Completed
Proposal writing - Literature review | Week 4 & 5 | Completed
Proposal progress presentation & evaluation | Week 6 | Completed
Discussion, Correction proposal & Proposed solution methodology | Week 7 | Completed
Proposed solution methodology (continued) | Week 8 | Completed
Proof of concept | Week 9 | Completed
Drafting report of the proposal | Week 10 & 11 | Completed
Submit draft report to supervisor | Week 12 | Completed
Seminar presentation | Week 13 | Completed
Report correction | Week 14 | Completed
Final report submission | Week 15 | Completed
Table 1.1: Milestone
CHAPTER 2
LITERATURE REVIEW
2.1 Introduction
In this chapter, approaches to translating data using SQL Server Management Studio, the
Adventure Works database, and Java are studied. The aim of this review is to give an
overview of how converter tools are built using Java-based methods.
2.2 Web Service
A web service commonly provides an object-oriented, web-based interface to a database
server, which is used by another web server or by a mobile app that in turn provides a
user interface to the end user. The term web services describes a standardized way of
integrating web-based applications using the XML, SOAP, WSDL and UDDI open standards
over an Internet protocol backbone.
I. XML is used to tag the data.
II. SOAP is used to transfer the data.
III. WSDL is used for describing the services available.
IV. UDDI is used for listing what services are available.
Web services allow different applications from different sources to communicate with
each other without time-consuming custom coding, and because all communication is
in XML, web services are not tied to any one operating system or programming
language.
2.2.1 SOAP (Simple Object Access Protocol)
SOAP (originally Simple Object Access Protocol) is a messaging protocol specification
for exchanging structured information in the implementation of web services in
computer networks. It is designed to provide extensibility, neutrality and independence.
It uses the XML Information Set for its message format and relies on application layer
protocols, most often Hypertext Transfer Protocol (HTTP) or Simple Mail Transfer
Protocol (SMTP), for message negotiation and transmission. SOAP allows processes
running on disparate operating systems (such as Windows and Linux) to communicate
using Extensible Markup Language (XML). Since web protocols like HTTP are
available on all major operating systems, SOAP allows clients to invoke web
services and receive responses independent of language and platform.
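As an illustration of the message format, the sketch below builds a SOAP 1.1 request envelope with Python's standard library. The envelope namespace is the standard SOAP 1.1 one; the service namespace `http://example.com/converter`, the `ConvertSql` operation and its `sql` and `format` parameters are hypothetical names invented for this example.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/converter"  # hypothetical service namespace

def build_request(sql_text, target_format):
    """Build a SOAP 1.1 request for a hypothetical ConvertSql operation."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The operation and its parameters live in the application's own namespace.
    call = ET.SubElement(body, f"{{{APP_NS}}}ConvertSql")
    ET.SubElement(call, f"{{{APP_NS}}}sql").text = sql_text
    ET.SubElement(call, f"{{{APP_NS}}}format").text = target_format
    return ET.tostring(envelope, encoding="unicode")

message = build_request("SELECT * FROM student", "json")
print(message)
```

In a real exchange this XML text would typically be sent as the body of an HTTP POST request, and the service would reply with another envelope carrying the result.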
2.2.2 WSDL (Web Services Description Language)
WSDL is an XML-based interface description language that is used for describing the
functionality offered by a web service. The acronym is also used for any specific WSDL
description of a web service (also referred to as a WSDL file), which provides a
machine-readable description of how the service can be called, what parameters it
expects, and what data structures it returns. Its purpose is therefore roughly similar
to that of a type signature in a programming language.
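The type-signature analogy can be sketched in Python. The class and method names below are hypothetical and only mirror the roles that a WSDL description assigns to input messages, output messages and operations; they are not generated from any real WSDL file.

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass
class ConvertRequest:
    """Plays the role of a WSDL input message: what the caller must send."""
    sql: str
    target_format: str  # "xml" or "json"

@dataclass
class ConvertResponse:
    """Plays the role of a WSDL output message: what the service returns."""
    content: str

class ConverterService(Protocol):
    """Plays the role of the operation list: how the service can be called."""
    def convert(self, request: ConvertRequest) -> ConvertResponse: ...

class EchoConverter:
    """Trivial stand-in implementation, used only to show the signature in use."""
    def convert(self, request: ConvertRequest) -> ConvertResponse:
        return ConvertResponse(content=f"[{request.target_format}] {request.sql}")

service: ConverterService = EchoConverter()
print(service.convert(ConvertRequest("SELECT 1", "json")).content)  # [json] SELECT 1
```

A client that knows only `ConverterService` can call any conforming implementation, in the same way that a client reading a WSDL file can call a service without seeing its source code.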
2.3 Review of Converter Tools in Different Journals
Table 2.1 reviews converter tool concepts from different perspectives.
Author/Journal/Year | System Name | Method | Description
Markus Forsberg & Aarne Ranta / December 2003 | BNF Converter | Labelled BNF (LBNF) | This system is a multilingual compiler front-end generator. From a single Labelled BNF grammar it generates front-end components such as a lexer and parser for several implementation languages, so the same grammar can be reused without rewriting it for each language.
Sanjay Chatterji, Subrangshu Sengupta, Bagadhi Gopal Rao, and Debarghya Banerjee / 2014 | A Tool for Converting Different Data Representation Formats | - | This system converts an in-memory representation to a textual representation and vice versa.
Rajasekar Krishnamurthy / 2004 | XML-to-SQL Query Translation | XML-to-Relational mapping | This work addresses two scenarios: using relational databases to store and query existing XML data, and exporting existing relational data as XML.
Table 2.1: Table of comparison
CHAPTER 3
METHODOLOGY
3.1 Introduction
This chapter describes the process and methodology used in developing the tool. Many
models are based on the Software Development Life Cycle (SDLC), such as the waterfall
model, the spiral model, the iterative model and many more. For this tool, the iterative
model was chosen. It consists of several phases carried out in repeated cycles
(iterations). Each phase is explained in this chapter.
3.2 Project Methodology
The iterative model was developed to overcome the weaknesses of the waterfall model. It
starts with initial planning and ends with deployment, with cyclic interactions in
between. The basic idea behind this method is to develop a system through repeated
cycles (iterative) and in smaller portions at a time (incremental), allowing software
developers to take advantage of what was learned during the development of earlier
parts or versions of the system. It can consist of mini waterfalls or mini V-shaped
models. This model was chosen because it can accommodate change requests between
increments, focuses on customer value more than linear approaches do, and can detect
project issues and changes earlier.
Figure 3.1: Iterative model
1) Planning & Requirements: As with almost any development project, the first
step is to go through an initial planning stage to map out the specification
documents, establish software or hardware requirements, and generally
prepare for the upcoming stages of the cycle.
2) Analysis & Design: Once planning is complete, an analysis is performed to
nail down the appropriate business logic, database models, and the like that
will be required at this stage in the project. The design stage also occurs
here, establishing any technical requirements (languages, data layers,
services, etc.) that will be utilized in order to meet the needs of the analysis
stage.
3) Implementation: With the planning and analysis out of the way, the actual
implementation and coding process can now begin. All planning,
specification, and design docs up to this point are coded and implemented
into this initial iteration of the project.
4) Testing: Once this current build iteration has been coded and implemented,
the next step is to go through a series of testing procedures to identify and
locate any potential bugs or issues that have cropped up.
5) Evaluation: Once all prior stages have been completed, it is time for a
thorough evaluation of development up to this stage. This allows the entire
team, as well as clients or other outside parties, to examine where the project
is at, where it needs to be, what can or should change, and so on.
3.3 System Requirements
3.3.1 Software Requirements
Table 3.1: Software requirement
SOFTWARE | DESCRIPTION
Windows 10 | The operating system used to host all the applications and tools
Microsoft Office 2013 | Used to prepare the documentation and presentation of the report
XAMPP Server | Provides and manages the Apache and MySQL servers on the local host
Draw.io | Used to create figures and diagrams such as the ERD, DFD and Context Diagram
Web Browsers (Google Chrome, Mozilla Firefox) | Used to run and display the system
Notepad++ | Source code editor used for writing code
3.3.2 Hardware Requirements
HARDWARE | DESCRIPTION
Acer Aspire F 15 | The computer used to develop the system, with the following specification: Intel Core i5-6200U 2.3GHz-2.8GHz; NVIDIA GeForce 940MX graphics with 4GB VRAM; Windows 10; 128GB SSD + 1TB HDD; 4GB DDR4 memory
Kingston DataTraveler USB 3.0 (16GB) | Used to store and transfer data files
Printer | Used to print out the documentation files
Table 3.2: Hardware requirement
3.4 System Design
System modelling and design are presented as diagrams that describe the system's
functions in detail. The system design is explained using a Context Diagram (CD), a
Data Flow Diagram (DFD), and an Entity Relationship Diagram (ERD).
3.5 Context Diagram (CD)
Figure 3.2: Context Diagram (CD)
CHAPTER 4
IMPLEMENTATION AND TESTING
4.1 Introduction
This chapter describes the implementation of the Structured to Semi-structured Converter
tool. Implementation and testing aim to develop the product as proposed in the previous
phase and to ensure that it fully meets its objectives. Both should be completed before
the tool is put into full use. Implementation means that the hardware and software
components are installed, the selected software is configured and tested, and the
software is customized where necessary to meet local functional requirements, so that it
becomes a fully operational production system. Testing depends on the testing method
employed and can be carried out at any time in the development process. However, most
of the test effort is applied after the requirements have been defined and the coding
process has been completed.
4.2 Implementation
The tool was developed as a prototype that is almost fully functional. It was built
using HTML for the interface, PHP as the programming language for its functions, and
SQL for writing database queries.
Figure 4.1: Homepage
Figure 4.1 shows the index page of the system, which is the home page for users. It has
buttons that lead to the sign up and log in pages.
Figure 4.2: Sign up page
Figure 4.2 shows the sign up page, where first-time users need to register before
they can use the tool.
Figure 4.3: Log in page
Figure 4.3 shows the log in page, where users enter their credentials in order to use
the tool after completing the registration in Figure 4.2. Users must use the username
and password that they registered in the sign up form.
Figure 4.4: Task Button
Figure 4.4 shows the task buttons, which give users access to the functions for
exploring and performing tasks with the tool.
Figure 4.5: User Homepage
Figure 4.5 shows the user homepage after logging in. It shows the username at the top
left of the page and a log out link that takes the user back to the homepage in
Figure 4.1.
Figure 4.6: Input area
Figure 4.6 shows the input area, where users write an SQL query, insert input using the
task button functions, or paste copied input before converting the structured data to
semi-structured data with the tool.
Figure 4.7: Output area
Figure 4.7 shows the output area, where users receive the output after running a
conversion with the convert function among the task buttons shown in
Figure 4.4.
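The report does not describe the conversion routine itself, and the tool was written in PHP, so the following is only a minimal Python sketch of one possible approach. It assumes the input consists of CREATE TABLE and INSERT statements that SQLite accepts, and the output layout (a `database` root with `table`, `row` and `field` elements) is a choice made for this example, not the tool's actual format.

```python
import json
import sqlite3
import xml.etree.ElementTree as ET

def load_tables(sql_text):
    """Run the SQL text in a throwaway in-memory SQLite database and
    return {table_name: [row_dict, ...]}."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows can then be read by column name
    try:
        conn.executescript(sql_text)
        names = [r["name"] for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
            if not r["name"].startswith("sqlite_")]
        tables = {}
        for name in names:
            # Quote the identifier, doubling any embedded double quotes.
            quoted = '"{}"'.format(name.replace('"', '""'))
            tables[name] = [dict(r) for r in conn.execute("SELECT * FROM " + quoted)]
        return tables
    finally:
        conn.close()

def to_json(tables):
    return json.dumps(tables, indent=2)

def to_xml(tables):
    root = ET.Element("database")
    for name, rows in tables.items():
        table_el = ET.SubElement(root, "table", name=name)
        for row in rows:
            row_el = ET.SubElement(table_el, "row")
            for column, value in row.items():
                # Column names go into an attribute so any name stays valid XML.
                field = ET.SubElement(row_el, "field", name=column)
                field.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")

def convert(sql_text, target_format):
    """Convert to exactly one format per call: 'xml' or 'json'."""
    tables = load_tables(sql_text)
    if target_format == "json":
        return to_json(tables)
    if target_format == "xml":
        return to_xml(tables)
    raise ValueError("target_format must be 'xml' or 'json'")

# Hypothetical input, standing in for what a user might type in the input area.
sample = """
CREATE TABLE student (id INTEGER, name TEXT);
INSERT INTO student VALUES (1, 'Aminah'), (2, 'Farid');
"""
print(convert(sample, "json"))
print(convert(sample, "xml"))
```

Letting a database engine execute the statements avoids writing a SQL parser by hand, at the cost of accepting only the SQL dialect that engine understands.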
Figure 4.8: Load from URL function
Figure 4.8 shows the form for the Load from URL function listed in Figure 4.4, where
users can paste a link to another page to retrieve its data into the input area.
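A function of this kind could be sketched in Python as follows. The function name `load_from_url`, the timeout and the size limit are assumptions made for this example; the report does not say how the tool fetches or limits remote content.

```python
from urllib.request import urlopen

def load_from_url(url, timeout=10, max_bytes=1_000_000):
    """Fetch text from a URL so it can be placed in the input area.

    max_bytes is an assumed safety limit, not something the report specifies.
    """
    with urlopen(url, timeout=timeout) as response:
        # Read one byte more than the limit so oversized content is detectable.
        data = response.read(max_bytes + 1)
    if len(data) > max_bytes:
        raise ValueError("remote content is larger than the allowed limit")
    return data.decode("utf-8")

# A data: URL keeps the example self-contained; a real call would use an
# http or https link pasted by the user.
print(load_from_url("data:text/plain,SELECT%20*%20FROM%20student"))
```

Because the server fetches whatever address the user supplies, a deployed version would likely also need to restrict the allowed schemes and hosts.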
CHAPTER 5
CONCLUSION
5.1 Introduction
This chapter concludes the documentation of this project in terms of planning,
design, implementation and testing.
5.2 Project Contribution
The Structured to Semi-structured Converter tool makes it simpler for users to convert
data from a structured format to a semi-structured format, and lets users choose whether
to convert SQL files to the XML format or the JSON format. On the homepage or user
homepage, users see a menu-like set of buttons from which to choose a function. Users
are given a choice of ways to insert data. The system successfully converts the SQL
format to the XML or JSON format and allows the output content to be updated.
5.3 Overall Conclusions
In brief, this project has been developed and carried out in line with the
objectives explained in Chapter 1. The tool helps users convert SQL strings or
data to the XML or JSON format, reducing the generation of semi-structured data
from structured data to a single click. It also lets users select the target
semi-structured format, XML or JSON, based on their needs. Users are also given
a choice of functions: they can upload and download files, convert SQL files to
the XML or JSON file format, and save their progress.
REFERENCES
Balog, K. (2013). Semistructured Data Search. University of Stavanger: Promise Winter School 2013.
Bethke, U. (2017, August 19). Which is better for storing data? SQL, JSON, or XML? Retrieved from Quora: https://www.quora.com/Which-is-better-for-storing-data-SQL-JSON-or-XML
Kathirvalavakumar, R. P. (2014). Mining Intelligence and Knowledge Exploration. Cork, Ireland: Second International Conference.
Massimo. (2009, August 6). Where are the differences using XML and MySQL database? Which should i use? Retrieved from stackoverflow: https://stackoverflow.com/questions/1238749/where-are-the-differences-using-xml-and-mysql-database-which-should-i-use
Mburu, D. (2017, November 20). What are the advantages and disadvantages of structured programming? Retrieved from Quora: https://www.quora.com/What-are-the-advantages-and-disadvantages-of-structured-programming
Rouse, M. (2014, November). semi-structured data. Retrieved from WhatIs.com: https://whatis.techtarget.com/definition/semi-structured-data