PTI - Manajemen Data
TODAY’S LESSONS
Databases, Data, and Information
Definition of Data, Definition of Information, Definition of Database
DBMS Structure: Hierarchical, Network, Relational, Multidimensional, Object-oriented
Database Type: Operational Database, Data Warehouse, Analytical Database, Distributed Database, End-user Database, External Database, Hypermedia Databases
DATA
The term data refers to groups of information that represent the qualitative or quantitative attributes of a variable or set of variables.
Data (plural of "datum") are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables.
Data are often viewed as the lowest level of abstraction from which information and knowledge are derived.
Raw data are an unprocessed collection of numbers, characters, images, or other outputs from devices that convert physical quantities into symbols.
INFORMATION
Information, in its most restricted technical sense, is an ordered sequence of symbols. As a concept, however, information has many meanings. Moreover, the concept of information is closely related to notions of: constraint, communication, control, form, instruction, knowledge, meaning, pattern, mental stimulus, perception, and representation.
DATABASE
A database consists of an organized collection of data for one or more uses, typically in digital form.
One way of classifying databases is by the type of their contents, for example: bibliographic, document-text, statistical.
Digital databases are managed using database management systems (DBMS), which store database contents and allow data creation, maintenance, search, and other access.
HIERARCHICAL DBMS
A DBMS is said to be hierarchical if the relationships among data in the database are established in such a way that one data item is present as the subordinate of another one.
Here subordinate means that items have 'parent-child' relationships among them: each child record has exactly one parent, and a parent may have many children. Direct relationships exist between any two records that are stored consecutively.
The DBMS uses the "tree" data structure to organize the database. Records are reached by navigating from the root down through parents to children; no backward movement is possible in the hierarchical database.
Older DBMSs such as IBM's IMS are hierarchical; they are rarely chosen for new systems nowadays.
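The parent-child tree described for hierarchical databases can be sketched in a few lines of Python. This is only an illustration: the `Record` class, the `find_path` helper, and the sample company data are hypothetical names invented here, not part of any real DBMS.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    """One record in the tree: it hangs under one parent and may own many children."""
    name: str
    children: List["Record"] = field(default_factory=list)

    def add_child(self, name: str) -> "Record":
        child = Record(name)
        self.children.append(child)
        return child


def find_path(root: Record, target: str) -> Optional[List[str]]:
    """Search downwards from the root and return the chain of names leading to `target`."""
    if root.name == target:
        return [root.name]
    for child in root.children:
        sub_path = find_path(child, target)
        if sub_path is not None:
            return [root.name] + sub_path
    return None


# Hypothetical sample data: a company with two departments and one employee each.
company = Record("Company")
sales = company.add_child("Sales")
it = company.add_child("IT")
sales.add_child("Employee: Ani")
it.add_child("Employee: Budi")

path_to_budi = find_path(company, "Employee: Budi")
print(path_to_budi)
```

Every lookup starts at the root and walks downwards, which is why access paths in this model are fixed by the shape of the tree.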
NETWORK DBMS
A DBMS is said to be a network DBMS if the relationships among data in the database are of type many-to-many.
Many-to-many relationships appear in the form of a network. The structure of a network database is therefore extremely complicated, because of these many-to-many relationships in which one record can be used as a key to the entire database.
A network database is structured in the form of a graph, which is also a data structure.
Although the structure of such a DBMS is highly complicated, it has two basic elements, records and sets, to designate many-to-many relationships.
High-level languages such as Pascal, COBOL, and FORTRAN were mainly used to implement the record and set structures.
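The two basic elements named above, records and sets, can be pictured with a small Python sketch. It is a deliberate simplification under stated assumptions: the `NetworkDatabase` class, its method names, and the `ENROLLED` set are hypothetical, and real network DBMSs typically represent a many-to-many relationship through intermediate link records rather than one shared set.

```python
from collections import defaultdict


class NetworkDatabase:
    """Toy graph store: each named set links owner records to member records."""

    def __init__(self):
        # set name -> owner record -> list of member records
        self._sets = defaultdict(lambda: defaultdict(list))

    def connect(self, set_name, owner, member):
        """Add `member` to the occurrence of `set_name` owned by `owner`."""
        members = self._sets[set_name][owner]
        if member not in members:
            members.append(member)

    def members_of(self, set_name, owner):
        """Navigate from an owner to its members."""
        return list(self._sets[set_name].get(owner, []))

    def owners_of(self, set_name, member):
        """Navigate in the other direction, from a member to all of its owners."""
        return [owner for owner, members in self._sets[set_name].items()
                if member in members]


# Hypothetical sample data: courses own the students enrolled in them.
db = NetworkDatabase()
db.connect("ENROLLED", "Databases", "Ani")
db.connect("ENROLLED", "Databases", "Budi")
db.connect("ENROLLED", "Networks", "Ani")

print(db.members_of("ENROLLED", "Databases"))
print(db.owners_of("ENROLLED", "Ani"))
```

Unlike a tree, a record here can be reached from several owners, which is what gives the structure its graph shape.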
RELATIONAL DBMS
A DBMS is said to be a relational DBMS, or RDBMS, if the database relationships are treated in the form of a table.
A static table composed of rows and columns is used to organize the database and its structure; it can be pictured as a two-dimensional array.
A number of RDBMSs are available; among the most popular are Oracle, Sybase, Ingres, Informix, Microsoft SQL Server, and Microsoft Access.
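As an illustration of data organized in rows and columns, the following sketch uses SQLite through Python's standard `sqlite3` module (SQLite is not among the products listed above; it is used here only because it ships with Python). The `student` and `enrollment` tables and their contents are hypothetical.

```python
import sqlite3

# An in-memory database, discarded when the connection is closed.
conn = sqlite3.connect(":memory:")

# Each table is a set of rows sharing the same columns.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE enrollment ("
    "student_id INTEGER REFERENCES student(id), course TEXT NOT NULL)"
)

conn.executemany(
    "INSERT INTO student (id, name) VALUES (?, ?)",
    [(1, "Ani"), (2, "Budi")],
)
conn.executemany(
    "INSERT INTO enrollment (student_id, course) VALUES (?, ?)",
    [(1, "Databases"), (1, "Networks"), (2, "Databases")],
)

# Relationships are expressed by matching column values, not by stored pointers.
rows = conn.execute(
    "SELECT s.name, e.course "
    "FROM student AS s JOIN enrollment AS e ON e.student_id = s.id "
    "ORDER BY s.name, e.course"
).fetchall()
print(rows)
conn.close()
```

The join is computed at query time from the values in the two tables, which is the main contrast with the navigational hierarchical and network models.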
MULTIDIMENSIONAL DBMS
The multidimensional structure is similar to the relational model. The dimensions of the cube-like model have data relating to elements in each cell. This structure gives a spreadsheet-like view of data. It is easy to maintain because records are stored as fundamental attributes, the same way they are viewed, and the structure is easy to understand. Its high performance has made it the most popular database structure for enabling online analytical processing (OLAP).
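The cube idea can be made concrete with a small sketch in which every cell is addressed by one value per dimension. The dimension names, the sample figures, and the `roll_up` and `slice_cube` helpers are assumptions made for illustration; a real OLAP engine stores and indexes cells far more efficiently.

```python
from collections import defaultdict

# Hypothetical cube: each cell is addressed by (product, region, month).
DIMENSIONS = ("product", "region", "month")
cube = {
    ("Tea", "West", "Jan"): 120,
    ("Tea", "East", "Jan"): 80,
    ("Coffee", "West", "Jan"): 200,
    ("Coffee", "West", "Feb"): 150,
}


def roll_up(cells, keep):
    """Aggregate the cube down to the dimensions listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for coords, value in cells.items():
        key = tuple(coords[p] for p in positions)
        totals[key] += value
    return dict(totals)


def slice_cube(cells, dimension, member):
    """Keep only the cells where `dimension` equals `member`."""
    position = DIMENSIONS.index(dimension)
    return {coords: value for coords, value in cells.items()
            if coords[position] == member}


by_product = roll_up(cube, ["product"])
west_only = slice_cube(cube, "region", "West")
print(by_product)
print(west_only)
```

Rolling up and slicing are the kinds of operations that give the spreadsheet-like view mentioned above.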
OBJECT ORIENTED DATABASE
The object-oriented structure can handle graphics, pictures, voice, and text without difficulty, unlike the other database structures.
This structure is popular for multimedia Web-based applications. It was designed to work with object-oriented programming languages such as Java.
OPERATIONAL DATABASE
These databases store detailed data about the operations of an organization. They are typically organized by subject matter and process relatively high volumes of updates using transactions.
Essentially every major organization on earth uses such databases.
Examples include customer databases that record contact, credit, and demographic information about a business's customers; personnel databases that hold information such as salary, benefits, and skills data about employees; manufacturing databases that record details about product components and parts inventory; and financial databases that keep track of the organization's money, accounting, and financial dealings.
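Updates "using transactions" means that a change either takes effect completely or not at all. A minimal sketch with Python's standard `sqlite3` module follows; the `inventory` table, the `withdraw` helper, and the quantities are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    "part TEXT PRIMARY KEY, "
    "quantity INTEGER NOT NULL CHECK (quantity >= 0))"
)
conn.execute("INSERT INTO inventory VALUES ('bolt', 10)")
conn.commit()


def withdraw(connection, part, amount):
    """Take `amount` units of `part` out of stock inside one transaction."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception is raised inside the block.
        with connection:
            connection.execute(
                "UPDATE inventory SET quantity = quantity - ? WHERE part = ?",
                (amount, part),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative quantity; nothing was changed.
        return False


ok = withdraw(conn, "bolt", 4)
rejected = withdraw(conn, "bolt", 100)
remaining = conn.execute(
    "SELECT quantity FROM inventory WHERE part = 'bolt'"
).fetchone()[0]
print(ok, rejected, remaining)
```

The second withdrawal fails as a whole, so the stock level reflects only the first one.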
DATA WAREHOUSE
Data warehouses archive data from operational databases and often from external sources such as market research firms.
Operational data often undergoes transformation on its way into the warehouse, getting summarized, anonymized, reclassified, etc.
The warehouse becomes the central source of data for use by managers and other end-users who may not have access to operational data.
For example, sales data might be aggregated to weekly totals and converted from internal product codes to UPCs so that it can be compared with ACNielsen data.
Basic and essential components of data warehousing include extracting, transforming, loading, and managing data, and retrieving and analyzing it, so that it is available for further use.
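The sales example above (daily rows aggregated to weekly totals, internal product codes converted to UPCs) can be sketched as a small transformation step. The internal codes, the UPC values, the sample rows, and the `to_weekly_totals` function are made-up placeholders, and grouping by ISO week is one assumption among several possible definitions of a "week".

```python
from collections import defaultdict
from datetime import date

# Hypothetical lookup from internal product codes to UPCs (placeholder values).
INTERNAL_TO_UPC = {"P-001": "012345678905", "P-002": "036000291452"}

# Hypothetical operational rows: (sale date, internal product code, units sold).
daily_sales = [
    (date(2024, 1, 1), "P-001", 5),
    (date(2024, 1, 3), "P-001", 7),
    (date(2024, 1, 8), "P-001", 2),
    (date(2024, 1, 2), "P-002", 4),
]


def to_weekly_totals(rows, code_map):
    """Summarize daily rows into totals per (ISO year, ISO week, UPC)."""
    totals = defaultdict(int)
    for day, internal_code, units in rows:
        iso_year, iso_week, _weekday = day.isocalendar()
        upc = code_map[internal_code]  # reclassify: internal code -> UPC
        totals[(iso_year, iso_week, upc)] += units
    return dict(totals)


weekly = to_weekly_totals(daily_sales, INTERNAL_TO_UPC)
for key in sorted(weekly):
    print(key, weekly[key])
```

After this step the warehouse holds fewer, coarser rows keyed by a code that outside data sources also use.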
ANALYTICAL DATABASE
Analysts may do their work directly against a data warehouse, or create a separate analytic database for online analytical processing (OLAP).
For example, a company might extract sales records to analyze the effectiveness of advertising and other sales promotions at an aggregate level.
DISTRIBUTED DATABASE
These are databases of local workgroups and departments at regional offices, branch offices, manufacturing plants, and other work sites.
These databases can include segments of both common operational and common user databases, as well as data generated and used only at a user’s own site.
END-USER DATABASE
These databases consist of data developed by individual end-users. Examples are collections of documents in spreadsheets, word-processing files, and downloaded files, or even a database for managing a personal baseball card collection.
EXTERNAL DATABASE
These databases contain data collected for use across multiple organizations, either freely or via subscription.
The Internet Movie Database is one example.
HYPERMEDIA DATABASES
The World Wide Web can be thought of as a database, albeit one spread across millions of independent computing systems.
Web browsers "process" this data one page at a time, while web crawlers and other software provide the equivalent of database indexes to support search and other activities.
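The "equivalent of database indexes" that crawlers and search software build can be illustrated with a toy inverted index, which maps each word to the pages containing it. The URLs, page texts, and function names below are hypothetical, and real search engines add ranking, stemming, and much more.

```python
from collections import defaultdict

# Hypothetical crawled pages: URL -> page text.
pages = {
    "https://example.org/a": "relational database tables",
    "https://example.org/b": "hypermedia database on the web",
    "https://example.org/c": "web crawlers build indexes",
}


def build_index(documents):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in documents.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = build_index(pages)
print(sorted(search(index, "database")))
print(sorted(search(index, "web database")))
```

A lookup touches only the entries for the query words instead of scanning every page, which is the same benefit a database index gives.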
DATABASE SECURITY
Database security is the system, processes, and procedures that protect a database from unintended activity.
Unintended activity can be categorized as authenticated misuse, malicious attacks or inadvertent mistakes made by authorized individuals or processes.
Database security is also a specialty within the broader discipline of computer security.
Traditionally, databases have been protected from external connections by firewalls or routers on the network perimeter, with the database environment existing on the internal network as opposed to being located within a demilitarized zone.
Additional network security devices that detect and alert on malicious database protocol traffic include network intrusion detection systems along with host-based intrusion detection systems.
Database security has become more critical as networks have become more open.
Databases provide many layers and types of information security, typically specified in the data dictionary, including: access control, auditing, authentication, encryption, and integrity controls.
Database security can begin with the creation and publishing of appropriate security standards for the database environment. The standards may include specific controls for the various relevant database platforms; a set of best practices that cross over the platforms; and linkages of the standards to higher-level policies and governmental regulations.
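Two of the layers listed above, access control and auditing, can be sketched together: every request is checked against a privilege table, and every decision is recorded. The role names, the privilege table, and the `is_allowed` function are hypothetical; production databases implement these controls internally, for example through granted privileges and audit trails.

```python
from datetime import datetime, timezone

# Hypothetical role-to-privilege table.
ROLE_PRIVILEGES = {
    "analyst": {"SELECT"},
    "clerk": {"SELECT", "INSERT", "UPDATE"},
    "dba": {"SELECT", "INSERT", "UPDATE", "DELETE"},
}

# Every decision, allowed or denied, is appended here.
audit_log = []


def is_allowed(user, role, action, table):
    """Check the privilege table and record the decision in the audit log."""
    allowed = action in ROLE_PRIVILEGES.get(role, set())
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "action": action,
        "table": table,
        "allowed": allowed,
    })
    return allowed


first = is_allowed("ani", "analyst", "SELECT", "sales")
second = is_allowed("ani", "analyst", "DELETE", "sales")
print(first, second, len(audit_log))
```

Logging denied requests as well as granted ones is what lets a later review spot misuse by authorized users.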
DATA MANAGEMENT
Definition
The official definition provided by the Data Management Association (DAMA) is: "Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise." This definition is fairly broad and encompasses a number of professions which may not have direct technical contact with lower-level aspects of data management, such as relational database management.
Alternatively, the definition provided in the DAMA Data Management Body of Knowledge (DAMA-DMBOK) is: "Data management is the development, execution and supervision of plans, policies, programs and practices that control, protect, deliver and enhance the value of data and information assets."
Topics
Data Governance: Data asset, Data governance, Data steward
Data Architecture, Analysis and Design: Data analysis, Data architecture, Data modeling
Database Management: Data maintenance, Database administration, Database management system
Data Security Management: Data access, Data erasure, Data privacy, Data security
Data Quality Management: Data cleansing, Data integrity, Data quality, Data quality assurance
Reference and Master Data Management: Data integration, Master Data Management, Reference data
Data Warehousing and Business Intelligence Management: Business intelligence, Data mart, Data mining, Data movement (extract, transform and load), Data warehousing
Document, Record and Content Management: Document management system, Records management
Meta Data Management: Meta-data management, Metadata, Metadata discovery, Metadata publishing, Metadata registry
Contact Data Management: Business continuity planning, Marketing operations, Customer data integration, Identity management, Identity theft, Data theft, ERP software, CRM software, Address (geography), Postal code, Email address, Telephone number
DATA WAREHOUSE
Definition
A data warehouse is a repository (collection of resources that can be accessed to retrieve information) of an organization's electronically stored data, designed to facilitate reporting and analysis. More simply, a data warehouse is a collection of a large amount of data.
Architecture
Architecture, in the context of an organization's data warehousing efforts, is a conceptualization of how the data warehouse is built. There is no right or wrong architecture; rather, multiple architectures exist to support various environments and situations. The worthiness of an architecture can be judged from how its conceptualization aids in the building, maintenance, and usage of the data warehouse.
One possible simple conceptualization of a data warehouse architecture consists of the following interconnected layers:
Operational database layer: The source data for the data warehouse. An organization's Enterprise Resource Planning systems fall into this layer.
Data access layer: The interface between the operational and informational access layers. Tools to extract, transform, and load data into the warehouse fall into this layer.
Metadata layer: The data directory. This is usually more detailed than an operational system data directory. There are dictionaries for the entire warehouse and sometimes dictionaries for the data that can be accessed by a particular reporting and analysis tool.
Informational access layer: The data accessed for reporting and analyzing, and the tools for reporting and analyzing data. Business intelligence tools fall into this layer. The Inmon-Kimball differences about design methodology have to do with this layer.
Benefits
A data warehouse provides a common data model for all data of interest regardless of the data's source. This makes it easier to report and analyze information than it would be if multiple data models were used to retrieve information such as sales invoices, order receipts, general ledger charges, etc.
Prior to loading data into the data warehouse, inconsistencies are identified and resolved. This greatly simplifies reporting and analysis.
Information in the data warehouse is under the control of data warehouse users so that, even if the source system data are purged over time, the information in the warehouse can be stored safely for extended periods of time.
Because they are separate from operational systems, data warehouses provide retrieval of data without slowing down operational systems.
Data warehouses can work in conjunction with and, hence, enhance the value of operational business applications, notably customer relationship management (CRM) systems.
Data warehouses facilitate decision support system applications such as trend reports (e.g., the items with the most sales in a particular area within the last two years), exception reports, and reports that show actual performance versus goals.
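The trend report mentioned above (the items with the most sales in a particular area within a period) could look like the following sketch. The sample rows, the area and item names, and the `top_items` function are hypothetical and stand in for a query against warehouse data.

```python
from collections import Counter
from datetime import date

# Hypothetical warehouse rows: (sale date, area, item, units sold).
sales = [
    (date(2023, 5, 1), "West", "Tea", 30),
    (date(2024, 2, 10), "West", "Coffee", 50),
    (date(2024, 3, 5), "West", "Tea", 40),
    (date(2024, 3, 6), "East", "Tea", 90),
    (date(2020, 1, 1), "West", "Coffee", 500),
]


def top_items(rows, area, start, end, n=3):
    """Rank items by units sold in `area` between `start` and `end`, inclusive."""
    totals = Counter()
    for day, row_area, item, units in rows:
        if row_area == area and start <= day <= end:
            totals[item] += units
    return totals.most_common(n)


# A two-year window ending on 30 June 2024.
report = top_items(sales, "West", date(2022, 7, 1), date(2024, 6, 30))
print(report)
```

Running such a report against the warehouse rather than the operational systems avoids slowing down day-to-day transaction processing.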
Disadvantages
Data warehouses are not the optimal environment for unstructured data.
Because data must be extracted, transformed, and loaded into the warehouse, there is an element of latency in data warehouse data.
Over their life, data warehouses can have high costs.
Data warehouses can get outdated relatively quickly. There is a cost of delivering suboptimal information to the organization.
There is often a fine line between data warehouses and operational systems. Duplicate, expensive functionality may be developed. Or, functionality may be developed in the data warehouse that, in retrospect, should have been developed in the operational systems.