Notes #6 - DBMS / RDBMS Concepts


  • 8/9/2019 Notes #6 - DBMS / RDBMS Concepts


    Security is generally low in a data file environment, and sharability and
    integrity are also low.


    Database Environment

    In a database environment, data is logically stored in tabular form, and
    tables often have relations and connections to other tables.

    In a database environment, all files (databases) are created, opened,
    edited and deleted using the same tool (the DBMS software), so file
    integrity is very high.

    Databases are broken down into smaller data files, which are stored at
    arbitrary locations on the related server's storage device. Such data
    files are logically connected but physically scattered.

    Different usage and access rights are granted to different levels of
    users, which helps keep the database environment secure. It is also highly
    sharable, since most relational database software shares the same core
    language (SQL).

    Data model (Database Models)

    A data model in software engineering is an abstract model that describes
    how data are represented and accessed. Data models formally define data
    elements and relationships among data elements for a domain of interest.
    According to Hoberman (2009), "A data model is a wayfinding tool for both
    business and IT professionals, which uses a set of symbols and text to
    precisely explain a subset of real information to improve communication
    within the organization and thereby lead to a more flexible and stable
    application environment." A data model explicitly determines the structure
    of data or structured data. Typical applications of data models include
    database models, design of information systems, and enabling exchange of
    data. Usually data models are specified in a data modeling language.

    A database model is a theory or specification describing how a database is
    structured and used. Several such models have been suggested. Common
    models include:

    Flat model: This may not strictly qualify as a data model.

    The flat (or table) model consists of a single, two-dimensional array of data elements, where all members of a

    given column are assumed to be similar values, and all

    members of a row are assumed to be related to one another.


    Hierarchical model: In this model data is organized into a

    tree-like structure, implying a single upward link in each

    record to describe the nesting, and a sort field to keep the

    records in a particular order in each same-level list.

    Network model: This model organizes data using two

    fundamental constructs, called records and sets. Records

    contain fields, and sets define one-to-many relationships

    between records: one owner, many members.

    Relational model: is a database model based on first-

    order predicate logic. Its core idea is to describe a

    database as a collection of predicates over a finite set of

    predicate variables, describing constraints on the possible values and

    combinations of values.

    Object-relational model: Similar to a relational database

    model, but objects, classes and inheritance are directly

    supported in database schemas and in the query language.

    Concept Oriented Model: This is the conceptual structuring of a database.
    The real structure may differ from this structuring, since it depends
    widely on the system or database designer, who may conceive the problem
    differently from the way it is actually implemented.

    Star schema is the simplest style of data warehouse schema. The star
    schema consists of a few "fact tables" (possibly only one, justifying the
    name) referencing any number of "dimension tables". The star schema is
    considered an important special case of the snowflake schema.
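    The star schema described above can be sketched with Python's built-in
    sqlite3 module. The table names, columns and sample rows below are
    hypothetical, chosen only to show one fact table referencing two dimension
    tables; this is an illustration, not a recommended warehouse design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the "who/what/where" of each fact.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
    -- The fact table holds the measure plus one foreign key per dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        units      INTEGER
    );
    INSERT INTO dim_product VALUES (1, 'Pen'), (2, 'Book');
    INSERT INTO dim_store   VALUES (10, 'Delhi'), (20, 'Paris');
    INSERT INTO fact_sales  VALUES (1, 10, 5), (2, 10, 3), (1, 20, 7);
""")

# A typical star-schema query: join the fact table to a dimension and aggregate.
units_by_city = conn.execute("""
    SELECT s.city, SUM(f.units)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_id = f.store_id
    GROUP BY s.city
    ORDER BY s.city
""").fetchall()

print(units_by_city)
```

    Every query follows the same shape: start from the fact table and join
    outwards to whichever dimensions the question needs.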

    Properties of Databases (ACID)

    Atomicity

    Atomicity requires that database modifications follow an all-or-nothing
    rule. A transaction is said to be atomic if, when one part of the
    transaction fails, the entire transaction fails and the database state is
    left unchanged. It is critical that the database management system
    maintains the atomic nature of transactions in spite of any application,
    DBMS, operating system or hardware failure. An atomic transaction cannot
    be subdivided, and must be processed in its entirety or not at all.
    Atomicity means that users do not have to worry about the effect of
    incomplete transactions.

    Transactions can fail for several kinds of reasons:

    Hardware failure: A disk drive fails, preventing some of the transaction's database

    changes from taking effect

    System failure: The user loses their connection to the application before providing

    all necessary information

    Database failure: E.g., the database runs out of room to hold additional data

    Application failure: The application attempts to post data that violates a rule that

    the database itself enforces, such as attempting to create a new account without

    supplying an account number
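    As a minimal sketch of the all-or-nothing rule, the example below uses
    Python's built-in sqlite3 module. The account table, the transfer helper
    and the amounts are hypothetical. The second transfer breaks a rule the
    database itself enforces (an "application failure" in the list above), so
    its first update is rolled back too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id TEXT PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A", "B", 30)
failed = transfer(conn, "A", "B", 500)  # second UPDATE violates the CHECK
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

    After the failed transfer, B does not keep the 500 it was credited in the
    first update: the state is exactly what the successful transfer left.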

    Consistency

    The consistency property ensures that the database remains in a consistent state.

    More precisely, it says that any transaction will take the database from one

    consistent state to another consistent state.

    The consistency rule applies only to integrity rules that are within its scope. Thus, if

    a DBMS allows fields of a record to act as references to another record, then

    consistency implies the DBMS must enforce referential integrity: by the time any

    transaction ends, each and every reference in the database must be valid. If a

    transaction consisted of an attempt to delete a record referenced by another, each

    of the following mechanisms would maintain consistency:

    Abort the transaction, rolling back to the consistent, prior state

    Delete all records that reference the deleted record (this is known as cascade

    delete)

    Nullify the relevant fields in all records that point to the deleted record.
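    The three mechanisms above correspond to the referential actions
    RESTRICT, CASCADE and SET NULL in SQL. The sketch below tries each one
    with Python's built-in sqlite3 module; the dept and emp tables and the
    delete_parent helper are hypothetical names used only for illustration.

```python
import sqlite3


def delete_parent(on_delete):
    """Delete a referenced row under the given ON DELETE action.

    Returns the rows left in the referencing (child) table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
    conn.execute("CREATE TABLE dept (name TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE emp (ecode TEXT PRIMARY KEY, "
        f"dept TEXT REFERENCES dept(name) ON DELETE {on_delete})"
    )
    conn.execute("INSERT INTO dept VALUES ('Sales')")
    conn.execute("INSERT INTO emp VALUES ('E305', 'Sales')")
    conn.commit()
    try:
        with conn:
            conn.execute("DELETE FROM dept WHERE name = 'Sales'")
    except sqlite3.IntegrityError:
        pass  # the transaction was aborted and rolled back
    rows = conn.execute("SELECT ecode, dept FROM emp").fetchall()
    conn.close()
    return rows


aborted = delete_parent("RESTRICT")    # abort: the employee row is untouched
cascaded = delete_parent("CASCADE")    # cascade delete: the employee row is removed
nullified = delete_parent("SET NULL")  # nullify: the reference becomes NULL
print(aborted, cascaded, nullified)
```

    In all three cases no employee row is left pointing at a department that
    no longer exists, which is what consistency requires.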

    Isolation

    Isolation refers to the requirement that other operations cannot access or see data

    that has been modified during a transaction that has not yet completed. Each

    transaction must remain unaware of other concurrently executing transactions,

    except that one transaction may be forced to wait for the completion of another

    transaction that has modified data that the waiting transaction requires.
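    A small sketch of this behaviour, assuming SQLite in its default journal
    mode and Python's built-in sqlite3 module: two connections open the same
    database file, and the reader does not see the writer's row until the
    writer commits. The file name, table and helper are hypothetical.

```python
import os
import sqlite3
import tempfile


def count_rows(conn):
    # fetchall() exhausts the cursor, so the read finishes before we go on
    return conn.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    writer = sqlite3.connect(path)
    reader = sqlite3.connect(path)
    try:
        writer.execute("CREATE TABLE t (x INTEGER)")
        writer.commit()

        writer.execute("INSERT INTO t VALUES (1)")  # transaction still open
        seen_before_commit = count_rows(reader)     # uncommitted row is hidden
        writer.commit()
        seen_after_commit = count_rows(reader)      # now visible
    finally:
        writer.close()
        reader.close()

print(seen_before_commit, seen_after_commit)
```

    Other database engines reach the same result through different locking or
    versioning schemes; the sketch only shows the visible effect.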

    Durability

    Durability is the DBMS's guarantee that once the user has been notified of
    a transaction's success, the transaction will not be lost. The
    transaction's data changes will survive system failure, and all integrity
    constraints have been satisfied, so the DBMS won't need to reverse the
    transaction. Many DBMSs implement durability by writing transactions into
    a transaction log that can be reprocessed to recreate the system state
    right before any later failure. A transaction is deemed committed only
    after it is entered in the log.
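    The log-based approach can be illustrated with a toy example in plain
    Python. It is a deliberately simplified sketch, not how any particular
    DBMS formats or replays its log: each committed transaction is one JSON
    line, and the functions commit and recover are hypothetical names.

```python
import json
import os
import tempfile


def commit(log_path, changes):
    """Append one transaction to the log and force it to disk before returning."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(changes) + "\n")
        log.flush()
        os.fsync(log.fileno())  # committed only once the entry is on disk


def recover(log_path):
    """Rebuild the state by replaying every logged transaction in order."""
    state = {}
    if os.path.exists(log_path):
        with open(log_path, encoding="utf-8") as log:
            for line in log:
                state.update(json.loads(line))
    return state


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "txn.log")
    commit(log_path, {"A": 70, "B": 80})
    commit(log_path, {"B": 95})
    # Pretend the in-memory state was lost in a crash and replay the log.
    recovered = recover(log_path)

print(recovered)
```

    Because the caller is told about success only after the log entry has been
    forced to disk, replaying the log reproduces every acknowledged change.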


    Deeper into Database modeling language

    Hierarchical model

    o A hierarchy can link entities either directly or indirectly, and either
    vertically or horizontally. The only direct links in a hierarchy, in so
    far as they are hierarchical, are to one's immediate superior or to one of
    one's subordinates, although a system that is largely hierarchical can
    also incorporate alternative hierarchies. Indirect hierarchical links can
    extend "vertically" upwards or downwards via multiple links in the same
    direction, following a path.

    o Degree of branching

    Degree of branching refers to the number of direct subordinates or
    children an object has (in graph terms, the number of edges leading from a
    node to its children). Hierarchies can be categorized based on the
    "maximum degree", the highest degree present in the system as a whole.
    Categorization in this way yields two broad classes: linear and branching.

    In a linear hierarchy, the maximum degree is 1. In other words, all of the
    objects can be visualized in a lineup, and each object (excluding the top
    and bottom ones) has exactly one direct subordinate and one direct
    superior. Note that this refers to the objects and not the levels; every
    hierarchy has this property with respect to levels, but normally each
    level can hold any number of objects. An example of a linear hierarchy is
    the hierarchy of life.

    In a branching hierarchy, one or more objects have a degree of 2 or more
    (and therefore the maximum degree is 2 or higher). For many people, the
    word "hierarchy" automatically evokes an image of a branching hierarchy.
    Branching hierarchies are present within numerous systems, including
    organizations and classification schemes. The broad category of branching
    hierarchies can be further subdivided based on the degree.

    A flat hierarchy is a branching hierarchy in which the maximum degree
    approaches infinity, i.e., with a wide span. Most often, systems
    intuitively regarded as hierarchical have at most a moderate span.
    Therefore, a flat hierarchy is often not viewed as a hierarchy at all at
    first blush. For example, diamond and graphite are flat hierarchies of
    numerous carbon atoms, which can be further decomposed into subatomic
    particles.

    An overlapping hierarchy is a branching hierarchy in which at least one
    object has two parent objects. For example, a graduate student can have
    two co-supervisors to whom they report directly and equally, and who have
    the same level of authority within the university hierarchy (i.e., they
    have the same position or tenure status).
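    The categories above can be computed directly from a list of
    (superior, subordinate) links. The following Python sketch is only an
    illustration; the classify function and the sample hierarchies are
    hypothetical.

```python
from collections import defaultdict


def classify(links):
    """Classify a hierarchy given (superior, subordinate) pairs.

    Returns (maximum degree, "linear" or "branching", overlapping flag).
    """
    children = defaultdict(set)
    parents = defaultdict(set)
    for superior, subordinate in links:
        children[superior].add(subordinate)
        parents[subordinate].add(superior)
    # Degree of an object = number of its direct subordinates.
    max_degree = max((len(c) for c in children.values()), default=0)
    kind = "linear" if max_degree <= 1 else "branching"
    # Overlapping: at least one object has two (or more) parent objects.
    overlapping = any(len(p) >= 2 for p in parents.values())
    return max_degree, kind, overlapping


chain = [("director", "manager"), ("manager", "clerk")]
tree = [("head", "a"), ("head", "b"), ("a", "c")]
cosupervised = [
    ("dean", "prof1"), ("dean", "prof2"),
    ("prof1", "student"), ("prof2", "student"),
]

print(classify(chain))         # (1, 'linear', False)
print(classify(tree))          # (2, 'branching', False)
print(classify(cosupervised))  # (2, 'branching', True)
```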

    Network model

    o The network model is a database model conceived as a flexible way of
    representing objects and their relationships. Its distinguishing feature
    is that the schema, viewed as a graph in which object types are nodes and
    relationship types are arcs, is not restricted to being a hierarchy or
    lattice.


    Object model

    o A collection of objects or classes through which a program can

    examine and manipulate some specific parts of its world. In other

    words, the object-oriented interface to some service or system. Such

    an interface is said to be the object model of the represented service

    or system.


    Relational model

    o Its central idea is to describe a database as a collection of predicates

    over a finite set of predicate variables, describing constraints on the

    possible values and combinations of values. The content of the

    database at any given time is a finite (logical) model of the database,

    i.e. a set of relations, one per predicate variable, such that all

    predicates are satisfied. A request for information from the database (a

    database query) is also a predicate.

    o The purpose of the relational model is to provide a declarative method
    for specifying data and queries: we directly state what information the
    database contains and what information we want from it, and let the
    database management system software take care of describing data
    structures for storing the data and retrieval procedures for getting
    queries answered.
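    The difference between stating what is wanted and spelling out how to
    retrieve it can be seen in a short sketch using Python's built-in sqlite3
    module. The emp table and its three rows are sample data used only for
    illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ecode TEXT PRIMARY KEY, dept TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("E101", "Systems"), ("E305", "Sales"), ("E104", "Systems")],
)

# Declarative: state which rows are wanted; the DBMS chooses how to find them.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT ecode FROM emp WHERE dept = ? ORDER BY ecode", ("Systems",)
    )
]

# Procedural equivalent: the program itself spells out the scan, filter and sort.
procedural = sorted(
    ecode
    for ecode, dept in conn.execute("SELECT ecode, dept FROM emp")
    if dept == "Systems"
)

print(declarative, procedural)
```

    Both lists contain the same employee codes; only the first leaves the
    choice of storage structures and retrieval procedure to the DBMS.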

    Inverted lists and other methods are also used. A given database management

    system may provide one or more of the four models. The optimal structure depends

    on the natural organization of the application's data, and on the application's

    requirements (which include transaction rate (speed), reliability, maintainability,

    scalability, and cost).


    The dominant model in use today is the ad hoc one embedded in SQL, despite
    the objections of purists who believe this model is a corruption of the
    relational model, since it violates several of its fundamental principles
    for the sake of practicality and performance. Many DBMSs also support the
    Open Database Connectivity (ODBC) API, which provides a standard way for
    programmers to access the DBMS.


    DBMS Concepts

    A relation is a complete table in which data are inserted and maintained.
    One or more such tables may be linked using different types of keys to
    form a database. Such links support referential integrity (all related
    areas are updated when a common field is updated) and data sufficiency
    (low redundancy, so that errors are not multiplied across copies).

    A relation is in turn logically divided into rows and columns. The columns
    represent different attributes of the table, one of which is generally a
    primary key (used to identify each row uniquely, which also helps to
    decrease redundancy). The rows, frequently referred to as tuples in
    database terminology, each hold the complete information on a single item
    of the kind the table was made for.

    Keys in DBMS

    Primary key: The attribute or combination of attributes that uniquely
    identifies a row or record.

    Foreign key: An attribute or combination of attributes in a table whose
    value matches a primary key in another table.

    Composite key: A primary key that consists of two or more attributes.

    Candidate key: A column, or minimal combination of columns, that uniquely
    identifies each row and could therefore be chosen as the primary key.

    Alternate key: Any candidate key that is not selected to be the primary
    key.

    Super key: A super key is defined in the relational model as a set of
    attributes of a relation variable for which it holds that, in all
    relations assigned to that variable, there are no two distinct tuples
    (rows) that have the same values for the attributes in this set.
    Equivalently, a super key can be defined as a set of attributes of a
    relation variable upon which all attributes of the relation are
    functionally dependent.

    Secondary key: An alternative to the primary key (see alternate key).
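    The super key and candidate key definitions above can be tested against
    sample rows, as in the Python sketch below. The helper names and the
    sample rows (including the HOURS values) are hypothetical. Note that a
    check over sample data can only show what holds in that sample; whether a
    set of attributes is a key in general is a design decision about all rows
    the table may ever hold.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, all_attrs):
    """Minimal super keys: super keys that contain no smaller super key."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in found)
            if is_superkey(rows, attrs) and not contains_smaller_key:
                found.append(attrs)
    return found


rows = [
    {"ECODE": "E101", "PROJCODE": "P27", "HOURS": 90},
    {"ECODE": "E101", "PROJCODE": "P51", "HOURS": 90},
    {"ECODE": "E305", "PROJCODE": "P27", "HOURS": 90},
]
attrs = ("ECODE", "PROJCODE", "HOURS")

print(is_superkey(rows, ("ECODE",)))  # False: E101 appears twice
print(is_superkey(rows, attrs))       # True, but not minimal
print(candidate_keys(rows, attrs))    # [('ECODE', 'PROJCODE')]
```

    Here the only candidate key is the composite (ECODE, PROJCODE); choosing
    it as the primary key would make it a composite key in the sense above.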


    DBMS Terminologies

    Database management system (DBMS): Software for establishing, updating,
    and querying (i.e., managing) a database.

    Database: A collection of files organized into related units which are
    then viewed as a single store. The data in the database are generally
    made available to a wide range of users through sharing, with different
    rights and roles assigned to different classes of users.

    SQL (Structured Query Language): This is the core language of most
    relational databases, and it is also the common platform through which
    different database engines interact.

    Data warehouse: This is a physical repository where relational data are
    organized to provide clean, enterprise-wide data in a standardized format.
    A data warehouse is a huge database that stores current and historical
    data of potential interest to decision makers throughout the company.
    These data originate in different transaction processing systems (TPS)
    and through other external entry methods.

    Data marts: These are subsets of a data warehouse in which a summarized
    and highly focused portion of the organization's data is placed in a
    separate database for a specific set of users. Companies often build
    enterprise-wide warehouses, where a central data warehouse serves the
    entire organization, or they create small decentralized warehouses called
    data marts.

    Entity: An entity may be defined as a thing which is recognized as being
    capable of an independent existence and which can be uniquely identified.
    Entities carry attributes by which they are uniquely identified.

    Relationship: Two different entities that have some logical association
    are connected using relationships. Relationships may also have attributes
    attached to them.

    Attributes: These are the features or uniquely identifiable
    characteristics of an element (entity or relationship).

    Relevance of relational design in DSS

    Multidimensional problem solving: In a DSS architecture, problem solving
    requires evaluating the problem in multiple ways and collecting the
    requisite information for each different evaluation.


    Critical queries: DBMS and RDBMS can handle complex queries and
    information searches, which is very useful in DSS.

    Referentially integrated inputs: RDBMS and relational structuring of data
    help in connecting the related fields and information of a single item or
    object.

    Data warehousing support: An RDBMS can connect remotely to different
    servers to fetch data and can span boundaries, creating a centralized data
    access medium which eventually gives rise to data warehouses.

    Data mart support: An RDBMS, through its access rights and different views
    of the same data, can create data marts for high-involvement decision
    making.

    Sharability and scalability of information: Since a database accepts
    concurrent access, multiple users can log on to the same screen from
    different geographical locations or at different decision points.
    Information stored in the database is highly scalable, offering
    flexibility at the information searcher's end.

    Database Normalization

    Normalization is the scientific method of breaking down complex table
    structures into simple table structures using certain rules. This method
    is used to reduce redundancy in tables and to eliminate the problems of
    inconsistency and wasted disk space. Normalization theory is based on the
    fundamental notion of functional dependency: given a relation (table) R,
    attribute A is functionally dependent on attribute B if each value of B in
    R is associated with precisely one value of A.

    E.g., in the following relation each value of Code is associated with
    exactly one Name and one City, so Name and City are functionally dependent
    on Code >>

    Code  Name    City
    E1    Mac     Delhi
    E2    Sandra  CA
    E3    Henry   Paris

    Not Normalized Form

    The relation is kept without any normalization rules and guidelines.
    E.g., >>

    ECODE  DEPT     DEPTHEAD  PROJCODE  HOURS
    E101   Systems  E901      P27       90
                              P51       101
                              P20       60
    E305   Sales    E906      P27       109
                              P22       98
    E508   Admin    E908      P51       NULL
                              P27       72

    First Normal Form (1NF)

    A table is said to be in 1NF if each cell of the table contains precisely
    one value. E.g., >>

    ECODE  DEPT     DEPTHEAD  PROJCODE  HOURS
    E101   Systems  E901      P27       90
    E101   Systems  E901      P51       101
    E101   Systems  E901      P20       60
    E305   Sales    E906      P27       109
    E305   Sales    E906      P22       98
    E508   Admin    E908      P51       NULL
    E508   Admin    E908      P27       72
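    The step from the not-normalized form to 1NF amounts to emitting one row
    per repeated value. A minimal Python sketch using the rows from the
    example above follows; the to_1nf function name is hypothetical, and NULL
    is represented as None.

```python
# Each employee row carries a list of projects and a parallel list of hours.
unnormalized = [
    ("E101", "Systems", "E901", ["P27", "P51", "P20"], [90, 101, 60]),
    ("E305", "Sales", "E906", ["P27", "P22"], [109, 98]),
    ("E508", "Admin", "E908", ["P51", "P27"], [None, 72]),
]


def to_1nf(rows):
    """Emit one row per (project, hours) pair so every cell holds one value."""
    flat = []
    for ecode, dept, depthead, projcodes, hours in rows:
        for projcode, hrs in zip(projcodes, hours):
            flat.append((ecode, dept, depthead, projcode, hrs))
    return flat


first_nf = to_1nf(unnormalized)
for row in first_nf:
    print(row)
```

    The result has seven rows, one per employee-project pair, with the
    department columns repeated on every row; that repetition is the
    redundancy the later normal forms remove.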

    Second Normal Form (2NF)

    A table is said to be in 2NF when it is in 1NF and every non-key attribute
    in the row is functionally dependent on the whole key, and not just on a
    part of the key.

    Guidelines to convert a table to 2NF:

    Find and remove attributes that are functionally dependent on only a part
    of the key and not on the whole key. Place them in a different table.

    Group the remaining attributes.

    E.g., >>

    ECODE  DEPT     DEPTHEAD
    E101   Systems  E901
    E305   Sales    E906
    E508   Admin    E908

    ECODE  PROJCODE  HOURS
    E101   P27       90
    E101   P51       101
    E101   P20       60
    E305   P27       109
    E305   P22       98
    E508   P51       NULL
    E508   P27       72

    Third Normal Form (3NF)

    A table is said to be in 3NF when it is in 2NF and every non-key attribute
    is functionally dependent only on the primary key.

    Guidelines to convert a table to 3NF:

    Find and remove non-key attributes that are functionally dependent on
    attributes that are not the primary key. Place them in a different table
    together with the attributes they depend on.

    Group the remaining attributes.

    E.g., >>

    ECODE  DEPT
    E101   Systems
    E305   Sales
    E402   Finance
    E508   Admin
    E607   Finance
    E608   Finance
    E104   Systems

    DEPT     DEPTHEAD
    Systems  E901
    Sales    E906
    Admin    E908
    Finance  E909

    Boyce Codd Normal Form

    A relation is in BCNF if and only if every determinant is a candidate key.

    Guidelines to convert a table to BCNF:

    Find and remove the overlapping candidate keys. Place the part of the
    candidate key and the attribute it is functionally dependent on in another
    table.

    Group the remaining items into a table.

    E.g., >>

    ECODE  NAME      PROJCODE  HOURS
    E1     Veronica  P2        48
    E2     Anthony   P5        100
    E3     Mac       P6        15
    E4     Susan     P2        250
    E4     Susan     P5        75
    E1     Veronica  P5        40

    After NAME is moved to a separate table:

    ECODE  PROJCODE  HOURS
    E1     P2        48
    E2     P5        100
    E3     P6        15
    E4     P2        250
    E4     P5        75
    E1     P5        40
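    The functional dependencies that drive the 2NF and 3NF steps can be
    checked mechanically on the 1NF example data. The Python sketch below is
    an illustration only: holds and project are hypothetical helper names, and
    a dependency that holds in sample rows is evidence, not proof, that it
    holds for the table in general.

```python
def holds(rows, lhs, rhs):
    """True if lhs -> rhs holds in rows: equal lhs values imply equal rhs values."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True


def project(rows, attrs):
    """Distinct rows restricted to attrs, in first-seen order."""
    out = []
    for row in rows:
        small = {a: row[a] for a in attrs}
        if small not in out:
            out.append(small)
    return out


COLUMNS = ("ECODE", "DEPT", "DEPTHEAD", "PROJCODE", "HOURS")
first_nf = [dict(zip(COLUMNS, r)) for r in [
    ("E101", "Systems", "E901", "P27", 90),
    ("E101", "Systems", "E901", "P51", 101),
    ("E101", "Systems", "E901", "P20", 60),
    ("E305", "Sales", "E906", "P27", 109),
    ("E305", "Sales", "E906", "P22", 98),
    ("E508", "Admin", "E908", "P51", None),
    ("E508", "Admin", "E908", "P27", 72),
]]

# Partial dependency: DEPT and DEPTHEAD depend on ECODE alone,
# which is only part of the key (ECODE, PROJCODE).
partial = holds(first_nf, ["ECODE"], ["DEPT", "DEPTHEAD"])

# 2NF: move the partially dependent attributes into their own table.
employee = project(first_nf, ["ECODE", "DEPT", "DEPTHEAD"])
work = project(first_nf, ["ECODE", "PROJCODE", "HOURS"])

# Transitive dependency: DEPTHEAD depends on DEPT, a non-key attribute.
transitive = holds(employee, ["DEPT"], ["DEPTHEAD"])

# 3NF: move DEPTHEAD into a table keyed by DEPT.
emp_dept = project(employee, ["ECODE", "DEPT"])
dept_head = project(employee, ["DEPT", "DEPTHEAD"])

print(partial, transitive, len(employee), len(work), len(dept_head))
```

    By contrast, holds(first_nf, ["PROJCODE"], ["HOURS"]) is False, because
    project P27 appears with several different HOURS values; HOURS depends on
    the whole key (ECODE, PROJCODE), so it stays in the work table.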