Database Design & System Analysis
Uploaded by hemanatchiyar
Course of Database Design and
System Analysis
Database System
Databases
• Database
– A database is a collection of data, typically describing the activities of one or more related organizations. For example, a university database might contain information about the following:
– Entities such as students, faculty, courses, and classrooms.
– Relationships between entities, such as students' enrollment in courses, faculty teaching courses, and the use of rooms for courses.
• Databases are useful
– Many computing applications deal with large amounts of information.
– Database systems give a set of tools for storing, searching and managing this information.
Introduction
• Data: refers to what is actually stored in the database.
• Field: a group of characters with a specific meaning.
• Record: logically connected fields that describe a person, place, or thing.
• File: a collection of related records.
• Information: refers to the meaning of the data as understood by some user.
• Single-User System: a system in which at most one user can access the database at any given time.
• Multi-User System: a system in which many users can access the database at the same time.
A major objective of multi-user systems is to allow each user to behave as if he or she were working with a single-user system.
The data in the database will be both integrated and shared.
Database Management System(DBMS)
• Collection of interrelated data
• A layer of software between the data as physically stored and the users of the system
• Manages very large amounts of data
• Set of programs to access the data
• A DBMS contains information about a particular enterprise
• Database Applications:
– Banking: all transactions
– Airlines: reservations, schedules
– Universities: registration, grades
– Sales: customers, products, purchases
– Manufacturing: production, inventory, orders, supply chain
– Human resources: employee records, salaries, tax deductions
• Databases touch all aspects of our lives
Database Management System (DBMS)
[Figure (1): Simplified picture of a database system. Labels: End users, Application Programs, Database Management System (DBMS), Database]
Why Use a DBMS?
• Data independence and efficient access.
• Reduced application development time.
• Data integrity and security.
• Uniform data administration.
• Concurrent access, recovery from crashes.
Database Models
• Collection of logical constructs used to represent data structure and relationships within the database
• Implementation Database Models
– Hierarchical
– Network
– Relational
Hierarchical Database Model
• Logically represented by an upside down tree
– Each parent can have many children
– Each child has only one parent
Network Database Model
• Each record can have multiple parents
– Composed of sets
– Each set has an owner record and a member record
– A member may have several owners
Relational Database Model
• Perceived by user as a collection of tables for data storage
• Tables are a series of row/column intersections
• Tables related by sharing common entity characteristic(s)
Data Models
• A data model is a collection of concepts for describing data.
• A schema is a description of a particular collection of data, using the given data model.
• The relational model of data is the most widely used model today.
– Main concept: relation, basically a table with rows and columns.
– Every relation has a schema, which describes the columns, or fields.
Levels of Abstraction
• Many views, single conceptual (logical) schema and physical schema.
– Views describe how users see the data.
– Conceptual schema defines logical structure.
– Physical schema describes the files and indexes used.
[Figure: three levels, top to bottom: View 1, View 2, View 3; Conceptual Schema; Physical Schema]
Example: University Database
• Conceptual schema:
– Students (sid: string, name: string, login: string, age: integer, gpa: real)
– Courses (cid: string, cname: string, credits: integer)
– Enrolled (sid: string, cid: string, grade: string)
• Physical schema:
– Relations stored as unordered files.
– Index on first column of Students.
• External Schema (View):
– Course_info (cid: string, enrollment: integer)
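The three levels above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: SQLite stands in for the DBMS, the index name is invented, the sample Enrolled rows are made up, and defining Course_info as a count of enrolled students per course is one plausible reading of the view, not something the slides spell out.

```python
import sqlite3

# In-memory database; table and column names follow the conceptual schema above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
-- Physical level: an index on the first column of Students (index name is hypothetical).
CREATE INDEX idx_students_sid ON Students (sid);
-- External level: a view exposing only the enrollment count per course (assumed definition).
CREATE VIEW Course_info AS
    SELECT cid, COUNT(sid) AS enrollment FROM Enrolled GROUP BY cid;
""")

# Sample rows are made up for illustration.
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("53666", "CS101", "A"), ("53688", "CS101", "B"), ("53650", "MA201", "A")],
)

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
print(course_info)
```

A user of the view sees only course ids and counts; the Enrolled rows and the index stay hidden behind it.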
Instance of Students Relation
Students (sid: string, name: string, login: string, age: integer, gpa: real)
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@ee    18   3.2
53650  Smith  smith@math  19   3.8
The Entity-Relationship Model
Introduction to Entity-Relationship (E-R) Modeling
• Notation uses three main constructs
– Data entities
– Attributes
– Relationships
• Entity-Relationship (E-R) Diagram
– A detailed, logical representation of the entities, associations and data elements for an organization or business
Entity-Relationship (E-R) Modeling
Key Terms
• Entity
– A person, place, object, event or concept in the user environment about which the organization wishes to maintain data
– Represented by a rectangle in E-R diagrams
• Entity Type
– A collection of entities that share common properties or characteristics
• Attribute
– A named property or characteristic of an entity that is of interest to an organization
• Candidate keys and identifiers
– Each entity type must have an attribute or set of attributes that distinguishes one instance from other instances of the same type
– Candidate key: an attribute (or combination of attributes) that uniquely identifies each instance of an entity type
• Identifier
– A candidate key that has been selected as the unique identifying characteristic for an entity type
– Selection rules for an identifier
1. Choose a candidate key that will not change its value
2. Choose a candidate key that will never be null
3. Avoid using intelligent keys
4. Consider substituting single value surrogate keys for large composite keys
Notation Guide
• ENTITY TYPE
• WEAK ENTITY TYPE
• RELATIONSHIP TYPE
• IDENTIFYING RELATIONSHIP TYPE
• ATTRIBUTE
• KEY ATTRIBUTE
• MULTIVALUED ATTRIBUTE
• DERIVED ATTRIBUTE
• COMPOSITE ATTRIBUTE
• TOTAL PARTICIPATION OF E2 IN R
• CARDINALITY RATIO 1:N FOR E1:E2 IN R
• STRUCTURAL CONSTRAINT (min, max) ON PARTICIPATION OF E IN R (Alternative Notation)
[Figure: the diagram symbol for each item above is not reproduced in this transcript.]
ER Diagram Basics
[Figure: ER diagram labelled with its Entity, Relationship and Attributes parts. Entities: Product, Store. Relationship: Keeps. Attributes: pname, descrip, price, qty, sname, manager, Locations]
Entity
• A real-world object distinguishable from other objects (e.g. a student, car, job, subject, building ...)
• An entity is described using a set of attributes
– In the Company database, an employee's car is of lesser importance
– In the Department of Transportation's registration database, cars may be the most important concept
– In both cases, cars will be represented as entities, but with different levels of detail
Entity Sets
A collection of similar entities (e.g. all employees)
• All entities in an entity set have the same set of attributes
• Each entity set has a key
• Each attribute has a domain
• Can map entity set to a relation easily
EMPLOYEES
SSN          NAME   SAL
321-23-3241  Kim    23,000
645-56-7895  Jones  45,000
Entity Type
Defines a set of entities that have the same attributes (e.g. EMPLOYEE)
• Each Entity Type is described by its NAME and attributes
• The Entity Type describes the “Schema” or “Intension” for a set of entities
• Collection of all entities of a particular entity type at a given point in time is called the “Entity Set” or “Extension” of an Entity Type
• Entity Type and Entity Set are customarily referred to by the same name
[Figure: notation for the entity type EMPLOYEE with attributes SSN, name and sal]
Attributes
• Key Attributes
• Value Sets of Attributes
• Null Valued Attributes
• Attribute Types
– Composite Vs. Simple Attributes
– Single-valued Vs. Multi-valued Attributes
– Derived Vs. Stored Attributes
Key Attributes: Identifier
• Key (or uniqueness) constraints are applied to entity types
• Key attribute’s values are distinct for each individual entity in the entity set
• A key attribute has its name underlined inside the oval
• Key must hold for every possible extension of the entity type
• Multiple keys are possible
[Figure: entity type EMPLOYEE with key attribute SSN]
Composite Vs. Simple Attributes
Composite attributes can be divided into smaller parts which represent simple attributes with independent meaning
• Simple Attribute: Aircraft-Type
• Composite Attribute: Aircraft-Location, which is comprised of:
– Aircraft-Latitude
– Aircraft-Longitude
– Aircraft-Altitude
… There is no formal concept of “composite attribute” in the relational model
Single Vs. Multivalued Attributes
Simple attributes can either be single-valued or multi-valued
• Single-valued: Gender = F
• Multivalued: Degree = {BSc, MInfTech}
… An “attribute” in the relational model is always single valued - Values are atomic!
Derived Vs. Stored Attributes
Some attribute values can be derived from related attribute values:
• Age = Date - B-day
• Y-Sal = 12 * M-Sal
[Figure: EMPLOYEE with attributes M-sal, B-day, Y-sal and Age]
• Some attribute values can be derived from attribute values of related entities
• total-value = sum (qty * price)
[Figure: Order and Item, with attributes price, qty and Total-Value]
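As a small illustration of the total-value rule, the sketch below keeps only qty and price as stored attributes and computes the derived value on demand. The item names and numbers are made up, and the function name total_value is just a label chosen for this example.

```python
# Stored attributes of each order item; the sample values are made up.
order_items = [
    {"item": "pen", "qty": 10, "price": 2},
    {"item": "notebook", "qty": 3, "price": 5},
]


def total_value(items):
    """Derived attribute: computed from the stored qty and price, never stored itself."""
    return sum(i["qty"] * i["price"] for i in items)


print(total_value(order_items))  # 10*2 + 3*5 = 35
```

Because the value is recomputed each time, it cannot drift out of step with the stored quantities and prices.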
Representing Attributes
• Parentheses ( ) for composite attributes
• Braces { } for multi-valued attributes
Assume a person can have more than one residence and each residence can have multiple telephones:
{AddressPhone ({Phone (AreaCode, PhoneNum)}, Address (StreetAddress (Number, Street, AptNo), City, State, PostalCode))}
Example of Elements of E-R Model
• Entity Sets: Departments, Professors, Students, Administrators
• Attributes: Name of Departments, Phone No., Address...; Name, SSN, Address of Professors...
• Relationships: Students and Professors are under a certain department; Admin manage the campus/departments
Key Definitions
• Primary Key
– One attribute whose value can uniquely identify a complete record (one row of data) within an entity.
• Composite Primary Key
– A primary key that consists of two or more attributes within an entity.
• Foreign Key
– A copy of a primary key that exists in another entity for the purpose of forming a relationship between the entities involved.
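The three kinds of key can be sketched in SQLite through Python's sqlite3 module. The Department, Student and Enrolled tables and their rows are hypothetical, chosen only to show one key of each kind; note that SQLite enforces foreign keys only after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Department (
    dept_id INTEGER PRIMARY KEY,                     -- primary key
    dname   TEXT
);
CREATE TABLE Student (
    sid     TEXT PRIMARY KEY,
    dept_id INTEGER REFERENCES Department (dept_id)  -- foreign key
);
CREATE TABLE Enrolled (
    sid   TEXT REFERENCES Student (sid),
    cid   TEXT,
    grade TEXT,
    PRIMARY KEY (sid, cid)                           -- composite primary key
);
""")
conn.execute("INSERT INTO Department VALUES (1, 'Computing')")
conn.execute("INSERT INTO Student VALUES ('53666', 1)")

# A student row pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO Student VALUES ('53688', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(fk_rejected)
```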
Degrees of a Relationship
• One-to-one (1:1): Man and Woman
• One-to-many (1:n): Customer and Order
• Many-to-many (n:m): Course and Subject
NOTE: Every many-to-many relationship consists of two one-to-many relationships working in opposite directions.
[Figure: the same three relationships drawn in two alternative notations]
Notation for optional attributes
A person must own at least one car. A car doesn't have to be owned by a person, but if it is, it is owned by at least one person. A person may own many cars.
[Figure: Person (1) to Car (M), with the mandatory and optional sides of the relationship marked]
A Sample ER Diagram
[Figure: a student record entity diagram with the entities Student, Course and Subject]
Example of the 3 elements in E/R Diagram
[Figure: generic diagram labelled with Entity A, Entity B, Entity C, Relationship and Attribute]
E-R Diagram : Examples
• Add some attributes to entities here
• Courses may have another course as pre-requisite
Relationship Degree
• The degree of a relationship type is the number of participating entity types
– 2 entities: Binary Relationship
– 3 entities: Ternary Relationship
– n entities: N-ary Relationship
– The same entity type could participate in multiple relationship types
[Figure: examples labelled Binary, Ternary and Multiple, using the entity types Part, Supplier, Project, Employees and Departments and the relationship types Supply, Works_In and Assigned_to]
E-R Diagram : Examples
[Figure: E-R diagram with the entities Stars, Movies and Studios, the relationships Stars-in and owns, and the attributes Name, Address, Title, year, length and fileType]
Kinds of Constraints
What kind of constraints can be defined in the ER Model?
• Cardinality Constraints
• Participation Constraints
Together called “Structural Constraints”
Constraints are represented by specific notation in the ER diagram
• The “Cardinality Ratio” for a binary relationship specifies the number of relationship instances that an entity can participate in
– Works-In is a binary relationship
– Participating entities are DEPARTMENT : EMPLOYEE
– One department can have many employees - Cardinality Ratio is 1 : N
[Figure: Employees Works_In Departments]
Possible Cardinality Ratios
• One-to-Many
– A film is directed by at most one director
– A director can direct any number of films
• Many-to-Many
– A film is directed by any number of directors
– A director can direct any number of films
• One-to-One
– A film is directed by at most one director
– A director can direct at most one film
[Figure: for each case, Director (id, name) Directed Film (title)]
Another Example
[Figure: entity Person (id, name, age) with the relationship FatherOf and the roles father and child]
Where would you put the arrow?
Example Cardinality Constraints
• How many Employees can work in a Department? One employee can work in only one department
• How many Employees can be employed by a Department? One department can employ many employees
• How many managers can a department have? One department can have only one manager
• How many departments can an employee manage? One employee can manage only one department
Normalisation
Introduction
• Normalization is the process of efficiently organizing data in a database with two goals in mind.
• First goal: eliminate redundant data
– for example, storing the same data in more than one table
• Second goal: ensure data dependencies make sense
– for example, only storing related data in a table
Benefits of Normalization
• Less storage space
• Quicker updates
• Less data inconsistency
• Clearer data relationships
• Easier to add data
• Flexible Structure
Redundancy and Data Anomalies
Example: We have the following relation that contains staff and department details:
staffNo  job       dept  dname       city
SL10     Salesman  10    Sales       Stratford
SA51     Manager   20    Accounts    Barking
DS40     Clerk     20    Accounts    Barking
OS45     Clerk     30    Operations  Barking
Redundant data is where we have stored the same 'information' more than once, i.e., the redundant data could be removed without the loss of information.
Such 'redundancy' could lead to the following 'anomalies':
Insert Anomaly: We can't insert a dept without inserting a member of staff that works in that department.
Update Anomaly: We could change the name of the dept that SA51 works in without simultaneously changing the dept that DS40 works in.
Deletion Anomaly: By removing employee SL10 we have removed all information pertaining to the Sales dept.
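One way to see why the anomalies come from redundancy is to store each department once and let staff rows refer to it. The sketch below does that in SQLite; the split into Dept and Staff tables is an assumed decomposition for illustration (the slides have not introduced it yet), and department 40 and the new name 'Finance' are invented sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dept  (dept INTEGER PRIMARY KEY, dname TEXT, city TEXT);
CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, job TEXT, dept INTEGER REFERENCES Dept (dept));
INSERT INTO Dept VALUES (10, 'Sales', 'Stratford'), (20, 'Accounts', 'Barking'),
                        (30, 'Operations', 'Barking');
INSERT INTO Staff VALUES ('SL10', 'Salesman', 10), ('SA51', 'Manager', 20),
                         ('DS40', 'Clerk', 20), ('OS45', 'Clerk', 30);
""")

# Insert: a department can exist before any staff work in it (dept 40 is made up).
conn.execute("INSERT INTO Dept VALUES (40, 'Marketing', 'Stratford')")
# Update: the name is stored once, so one change covers both SA51 and DS40.
conn.execute("UPDATE Dept SET dname = 'Finance' WHERE dept = 20")
# Delete: removing SL10 no longer removes what we know about the Sales department.
conn.execute("DELETE FROM Staff WHERE staffNo = 'SL10'")

names = conn.execute(
    "SELECT s.staffNo, d.dname FROM Staff s JOIN Dept d ON s.dept = d.dept "
    "ORDER BY s.staffNo"
).fetchall()
sales_rows = conn.execute("SELECT COUNT(*) FROM Dept WHERE dname = 'Sales'").fetchone()[0]
print(names, sales_rows)
```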
Repeating Groups
A repeating group is an attribute (or set of attributes) that can have more than one value for a primary key value.
Example: We have the following relation that contains staff and department details and a list of telephone contact numbers for each member of staff.
staffNo  job       dept  dname       city       contact number
SL10     Salesman  10    Sales       Stratford  018111777, 018111888, 079311122
SA51     Manager   20    Accounts    Barking    017111777
DS40     Clerk     20    Accounts    Barking
OS45     Clerk     30    Operations  Barking    079311555
Repeating Groups are not allowed in a relational design, since all attributes have to be 'atomic', i.e., there can only be one value per cell in a table!
Functional Dependency
Formal Definition: Attribute B is functionally dependent upon attribute A (or a collection of attributes) if a value of A determines a single value of attribute B at any one time.
Formal Notation: A → B. This should be read as 'A determines B' or 'B is functionally dependent on A'. A is called the determinant and B is called the object of the determinant.
Example:
staffNo  job       dept  dname
SL10     Salesman  10    Sales
SA51     Manager   20    Accounts
DS40     Clerk     20    Accounts
OS45     Clerk     30    Operations
Functional Dependencies:
staffNo → job
staffNo → dept
staffNo → dname
dept → dname
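The definition can be checked mechanically against the sample rows. The helper below (holds is a name invented for this sketch) reports whether any two rows agree on the determinant but disagree on the dependent attributes. A single table instance can only refute a dependency; whether it holds in general is a statement about every legal instance.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs (lhs -> rhs)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same determinant value, different dependent value
    return True


staff = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales"},
    {"staffNo": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts"},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts"},
    {"staffNo": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations"},
]

print(holds(staff, ["dept"], ["dname"]))     # dept -> dname
print(holds(staff, ["staffNo"], ["job"]))    # staffNo -> job
print(holds(staff, ["job"], ["dept"]))       # fails: Clerk appears with dept 20 and 30
```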
Dependencies: Definitions
• Partial Dependency – when a non-key attribute is determined by a part, but not the whole, of a COMPOSITE primary key.
CUSTOMER
Cust_ID  Name   Order_ID
101      AT&T   1234
101      AT&T   156
125      Cisco  1250
• Transitive Dependency – when a non-key attribute determines another non-key attribute.
EMPLOYEE
Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
111     Mary    Jones   1        Acct
122     Sarah   Smith   2        Mktg
Example: Table 1
Title                      Author1               Author2         ISBN        Subject           Pages  Publisher
Database System Concepts   Abraham Silberschatz  Henry F. Korth  0072958863  MySQL, Computers  1168   McGraw-Hill
Operating System Concepts  Abraham Silberschatz  Henry F. Korth  0471694665  Computers         944    McGraw-Hill
Table 1 problems
• This table is not very efficient with storage.
• This design does not protect data integrity.
• This table does not scale well.
First Normal Form
• In our Table 1, we have two violations of First Normal Form:
• First, we have more than one author field.
• Second, our subject field contains more than one piece of information. With more than one value in a single field, it would be very difficult to search for all books on a given subject.
First Normal Table
• Table 2
Title                      Author                ISBN        Subject    Pages  Publisher
Database System Concepts   Abraham Silberschatz  0072958863  MySQL      1168   McGraw-Hill
Database System Concepts   Henry F. Korth        0072958863  Computers  1168   McGraw-Hill
Operating System Concepts  Henry F. Korth        0471694665  Computers  944    McGraw-Hill
Operating System Concepts  Abraham Silberschatz  0471694665  Computers  944    McGraw-Hill
• We now have two rows for a single book. Additionally, we would be violating the Second Normal Form…
• A better solution to our problem would be to separate the data into separate tables, an Author table and a Subject table, removing that information from the Book table:
Subject Table
Subject_ID  Subject
1           MySQL
2           Computers
Author Table
Author_ID  Last Name     First Name
1          Silberschatz  Abraham
2          Korth         Henry
Book Table
ISBN        Title                      Pages  Publisher
0072958863  Database System Concepts   1168   McGraw-Hill
0471694665  Operating System Concepts  944    McGraw-Hill
• Each table has a primary key, used for joining tables together when querying the data. A primary key value must be unique within the table (no two books can have the same ISBN), and a primary key is also indexed, which speeds up data retrieval based on the primary key.
• Now to define relationships between the tables
Relationships
Book_Author Table
ISBN        Author_ID
0072958863  1
0072958863  2
0471694665  1
0471694665  2
Book_Subject Table
ISBN        Subject_ID
0072958863  1
0072958863  2
0471694665  2
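To show how the linking table is used, the sketch below loads the Book, Author and Book_Author rows into SQLite and joins them to list the authors of one title. The column names Last_Name and First_Name (with underscores) and the chosen column types are assumptions made so the statements run; the data is the data from the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Book   (ISBN TEXT PRIMARY KEY, Title TEXT, Pages INTEGER, Publisher TEXT);
CREATE TABLE Author (Author_ID INTEGER PRIMARY KEY, Last_Name TEXT, First_Name TEXT);
CREATE TABLE Book_Author (
    ISBN      TEXT    REFERENCES Book (ISBN),
    Author_ID INTEGER REFERENCES Author (Author_ID),
    PRIMARY KEY (ISBN, Author_ID)
);
INSERT INTO Book VALUES
    ('0072958863', 'Database System Concepts', 1168, 'McGraw-Hill'),
    ('0471694665', 'Operating System Concepts', 944, 'McGraw-Hill');
INSERT INTO Author VALUES (1, 'Silberschatz', 'Abraham'), (2, 'Korth', 'Henry');
INSERT INTO Book_Author VALUES
    ('0072958863', 1), ('0072958863', 2), ('0471694665', 1), ('0471694665', 2);
""")

# Follow Book -> Book_Author -> Author to find who wrote a given title.
authors = conn.execute("""
    SELECT a.Last_Name
    FROM Book b
    JOIN Book_Author ba ON ba.ISBN = b.ISBN
    JOIN Author a ON a.Author_ID = ba.Author_ID
    WHERE b.Title = 'Database System Concepts'
    ORDER BY a.Last_Name
""").fetchall()
print(authors)
```

Each book and each author is stored once; only the pairs of keys are repeated.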
Second Normal Form
• As the First Normal Form deals with redundancy of data across a horizontal row, Second Normal Form (or 2NF) deals with redundancy of data in vertical columns.
• As stated earlier, the normal forms are progressive, so to achieve Second Normal Form, the tables must already be in First Normal Form.
• The Book Table will be used for the 2NF example
2NF Table
Publisher Table
Publisher_ID  Publisher Name
1             McGraw-Hill
Book Table
ISBN        Title                      Pages  Publisher_ID
0072958863  Database System Concepts   1168   1
0471694665  Operating System Concepts  944    1
2NF
• Here we have a one-to-many relationship between the book table and the publisher. A book has only one publisher, and a publisher will publish many books. When we have a one-to-many relationship, we place a foreign key in the Book Table, pointing to the primary key of the Publisher Table.
• The other requirement for Second Normal Form is that you cannot have any data in a table with a composite key that does not relate to all portions of the composite key.
Third Normal Form
• Third normal form (3NF) requires that there are no functional dependencies of non-key attributes on something other than a candidate key.
• A table is in 3NF if all of the non-primary key attributes are mutually independent
• There should not be transitive dependencies
Boyce-Codd Normal Form
• BCNF requires that the table is in 3NF and that the only determinants are candidate keys
Stages of Normalisation
Unnormalised (UDF)
– Remove repeating groups
First normal form (1NF)
– Remove partial dependencies
Second normal form (2NF)
– Remove transitive dependencies
Third normal form (3NF)
– Remove remaining functional dependency anomalies
Boyce-Codd normal form (BCNF)
– Remove multivalued dependencies
Fourth normal form (4NF)
– Remove remaining anomalies
Fifth normal form (5NF)
DISTRIBUTED DATABASE SYSTEM
WHAT IS A DISTRIBUTED DATABASE?
Distributed: Deals with physical distribution of data over multiple sites.
A distributed database system is a collection of logically related databases that co-operate in a transparent manner. It stores a logically related database over physically independent sites.
ADVANTAGES
• Reduced Communication Overhead: Most data access is local, less expensive and performs better.
• Improved Processing Power: Instead of one server handling the full database, we now have a collection of machines handling the same database.
• Removal of Reliance on a Central Site: If a server fails, then the only part of the system that is affected is the relevant local site. The rest of the system remains functional and available.
• Expandability: It is easier to accommodate increasing the size of the global (logical) database.
• Local autonomy: The database is brought nearer to its users. This can effect a cultural change as it allows potentially greater control over local data.
Homogeneous & Heterogeneous DDBMSs
Homogeneous: All sites use the same DBMS product.
• Much easier to design and manage.
• The approach provides incremental growth and allows increased performance.
Heterogeneous: Sites may run different DBMS products, with possibly different underlying data models.
• Sites implemented their own databases; integration was considered later.
• Translations are required to allow for:
– Different hardware.
– Different DBMS products.
– Different hardware and DBMS products.
• Typical solution is to use gateways.
DISTRIBUTED DATABASES: Fragmentation
Why fragment?
• Usage: Apps work with views rather than entire relations.
• Efficiency: It's more efficient if data is close to where it is frequently used.
• Security: Data not required by local applications is not stored at the local site.
• Parallelism: It is possible to run several 'sub-queries' in tandem.
Four types of fragmentation:
1. Horizontal
2. Vertical
3. Mixed
4. Derived
DISTRIBUTED DATABASES: HORIZONTAL DATA FRAGMENTATION
ACCOUNT  CUSTOMER  BRANCH     BALANCE
200      JONES     STRATFORD  1000.00
324      GRAY      BARKING    200.00
345      SMITH     STRATFORD  23.17
350      GREEN     BARKING    340.14
400      ONO       BARKING    500.00
456      KHAN      STRATFORD  333.00
Horizontal Fragmentation: Consists of a Restriction on a Relation.
e.g., σ branch = 'Stratford' (Account)
STRATFORD BRANCH
ACCT NO.  CUSTOMER  BRANCH     BALANCE
200       JONES     STRATFORD  1000.00
345       SMITH     STRATFORD  23.17
456       KHAN      STRATFORD  333.00
BARKING BRANCH
ACCT NO.  CUSTOMER  BRANCH   BALANCE
324       GRAY      BARKING  200.00
350       GREEN     BARKING  340.14
400       ONO       BARKING  500.00
DISTRIBUTED DATABASES: VERTICAL DATA FRAGMENTATION
S#   NAME   SITE       PHONE NO       LOGIN    PASSWORD
200  JONES  STRATFORD  0208-500-9000  JON200T  XXYY22
324  GRAY   BARKING    0208-545-7528  GRA324S  ZZEE56
456  KHAN   STRATFORD  0208-500-5821  KHA456T  KJTR78
Vertical Fragmentation: Consists of a Projection on a Relation.
e.g., π S#, NAME, SITE, PHONE NO (Student)
STUDENT ADMINISTRATION
S#   NAME   SITE       PHONE NO.
200  JONES  STRATFORD  0208-500-9000
324  GRAY   BARKING    0208-545-7528
456  KHAN   STRATFORD  0208-500-5821
NETWORK ADMINISTRATION
S#   LOGIN-ID  PASSWORD
200  JON200T   XXYY22
324  GRA324S   ZZEE56
456  KHA456T   KJTR78
Full relation:
Id   Name  Sal  Dept
100  A     10K  D1
200  B     20K  D2
300  C     30K  D3
• Horizontal Fragmentation
• Rows split: Sal > 20K
Id   Name  Sal  Dept
100  A     10K  D1
200  B     20K  D2

Id   Name  Sal  Dept
300  C     30K  D3
• Vertical Fragmentation
• Columns split: Primary Key retained
Id   Name
100  A
200  B
300  C

Id   Sal  Dept
100  10K  D1
200  20K  D2
300  30K  D3
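The two splits can be written out in a few lines of plain Python. This is only a sketch: salaries are stored as integers (10K as 10000) for comparison, and the variable names are invented. It also shows why the primary key is kept in every vertical fragment, since that is what lets the fragments be joined back together.

```python
employees = [
    {"Id": 100, "Name": "A", "Sal": 10_000, "Dept": "D1"},
    {"Id": 200, "Name": "B", "Sal": 20_000, "Dept": "D2"},
    {"Id": 300, "Name": "C", "Sal": 30_000, "Dept": "D3"},
]

# Horizontal fragmentation: a restriction (row filter) on Sal > 20K.
high = [r for r in employees if r["Sal"] > 20_000]
low = [r for r in employees if r["Sal"] <= 20_000]

# Vertical fragmentation: two projections that both keep the primary key Id.
names = [{"Id": r["Id"], "Name": r["Name"]} for r in employees]
pay = [{"Id": r["Id"], "Sal": r["Sal"], "Dept": r["Dept"]} for r in employees]

# Reconstruction: union the horizontal fragments, join the vertical ones on Id.
from_horizontal = sorted(low + high, key=lambda r: r["Id"])
pay_by_id = {r["Id"]: r for r in pay}
from_vertical = [{**n, **pay_by_id[n["Id"]]} for n in names]

print(from_horizontal == employees, from_vertical == employees)
```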
Structured Query Language SQL
Introduction to SQL
• SQL is a standard language for accessing and manipulating databases
• What is SQL?
– SQL stands for Structured Query Language
– SQL lets you access and manipulate databases
– SQL is an ANSI (American National Standards Institute) standard
What Can SQL do?
• SQL can execute queries against a database
• SQL can retrieve data from a database
• SQL can insert records in a database
• SQL can update records in a database
• SQL can delete records from a database
• SQL can create new databases
• SQL can create new tables in a database
• SQL can create stored procedures in a database
• SQL can create views in a database
• SQL can set permissions on tables, procedures, and views
SQL DML and DDL
• SQL can be divided into two parts: The Data Manipulation Language (DML) and the Data Definition Language (DDL)
• The query and update commands form the DML part of SQL:
– SELECT - extracts data from a database
– UPDATE - updates data in a database
– DELETE - deletes data from a database
– INSERT INTO - inserts new data into a database
• The DDL part of SQL permits database tables to be created or deleted. It also defines indexes (keys), specifies links between tables, and imposes constraints between tables. The most important DDL statements in SQL are:
– CREATE DATABASE - creates a new database
– ALTER DATABASE - modifies a database
– CREATE TABLE - creates a new table
– ALTER TABLE - modifies a table
– DROP TABLE - deletes a table
– CREATE INDEX - creates an index (search key)
– DROP INDEX - deletes an index
SQL SELECT Statement
• The SELECT statement is used to select data from a database.
• The result is stored in a result table, called the result-set.
• SQL SELECT Syntax
SELECT column_name(s)
FROM table_name
and
SELECT * FROM table_name
SQL SELECT Example
The "Persons" table:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
Now we want to select the content of the columns named "LastName" and "FirstName" from the table above.
We use the following SELECT statement:
SELECT LastName, FirstName FROM Persons
The result-set will look like this:
LastName FirstName
Hansen Ola
Svendson Tove
Pettersen Kari
SELECT * Example
• Now we want to select all the columns from the "Persons" table.
• We use the following SELECT statement:
SELECT * FROM Persons
Tip: The asterisk (*) is a quick way of selecting all columns!
• The result-set will look like this:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
SQL SELECT DISTINCT Statement
• In a table, some of the columns may contain duplicate values. This is not a problem; however, sometimes you will want to list only the different (distinct) values in a table.
• The DISTINCT keyword can be used to return only distinct (different) values.
• SQL SELECT DISTINCT Syntax
SELECT DISTINCT column_name(s)
FROM table_name
• SELECT DISTINCT Example
The "Persons" table:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
• Now we want to select only the distinct values from the column named "City" from the table above.
• We use the following SELECT statement:
SELECT DISTINCT City FROM Persons
The result-set will look like this:
City
Sandnes
Stavanger
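The difference between a plain SELECT and SELECT DISTINCT can be tried out with Python's sqlite3 module. SQLite and the column types are assumptions made for this sketch; the rows are the "Persons" rows from the slide, and an ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (P_Id INTEGER, LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany("INSERT INTO Persons VALUES (?, ?, ?, ?, ?)", [
    (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
    (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
    (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
])

# Without DISTINCT every row contributes a value, so Sandnes appears twice.
all_cities = [row[0] for row in conn.execute("SELECT City FROM Persons")]
# With DISTINCT each city is listed once.
distinct_cities = [
    row[0] for row in conn.execute("SELECT DISTINCT City FROM Persons ORDER BY City")
]
print(len(all_cities), distinct_cities)
```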
SQL WHERE Clause
• The WHERE clause is used to filter records.
• The WHERE clause is used to extract only those records that fulfill a specified criterion.
SQL WHERE Syntax
SELECT column_name(s)
FROM table_name
WHERE column_name operator value
WHERE Clause Example
The "Persons" table:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
Now we want to select only the persons living in the city "Sandnes" from the table above.
We use the following SELECT statement:
SELECT * FROM Persons
WHERE City='Sandnes'
The result-set will look like this:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
SQL uses single quotes around text values (most database systems will also accept double quotes). However, numeric values should not be enclosed in quotes.
For text values:
This is correct:
SELECT * FROM Persons WHERE FirstName='Tove'
This is wrong:
SELECT * FROM Persons WHERE FirstName=Tove
For numeric values:
This is correct:
SELECT * FROM Persons WHERE Year=1965
This is wrong:
SELECT * FROM Persons WHERE Year='1965'
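The quoting rule for text values can be observed directly. The sketch below uses SQLite through Python's sqlite3 module, which is an assumption of this example; other database systems may word the error differently. In SQLite an unquoted Tove is read as a column name, so the query fails instead of matching the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (P_Id INTEGER, LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany("INSERT INTO Persons VALUES (?, ?, ?, ?, ?)", [
    (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
    (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
    (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
])

# Quoted text value: matches the two persons living in Sandnes.
sandnes_ids = [
    row[0]
    for row in conn.execute("SELECT P_Id FROM Persons WHERE City='Sandnes' ORDER BY P_Id")
]

# Unquoted text value: SQLite treats Tove as a column name and rejects the query.
try:
    conn.execute("SELECT * FROM Persons WHERE FirstName=Tove")
    unquoted_accepted = True
except sqlite3.OperationalError:
    unquoted_accepted = False

print(sandnes_ids, unquoted_accepted)
```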
Operators Allowed in the WHERE Clause
Operator Description
= Equal
<> Not equal
> Greater than
< Less than
>= Greater than or equal
<= Less than or equal
BETWEEN Between an inclusive range
LIKE Search for a pattern
IN If you know the exact value you want to return for at least one of the columns
SQL AND & OR Operators
• The AND & OR operators are used to filter records based on more than one condition.
• The AND operator displays a record if both the first condition and the second condition are true.
• The OR operator displays a record if either the first condition or the second condition is true.
• AND Operator Example
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
Now we want to select only the persons with the first name equal to "Tove" AND the last name equal to "Svendson":
We use the following SELECT statement:
SELECT * FROM Persons WHERE FirstName='Tove' AND LastName='Svendson'
The result-set will look like this:
P_Id LastName FirstName Address City
2 Svendson Tove Borgvn 23 Sandnes
• OR Operator Example
• Now we want to select only the persons with the first name equal to "Tove" OR the first name equal to "Ola":
• We use the following SELECT statement:
SELECT * FROM Persons
WHERE FirstName='Tove'
OR FirstName='Ola'
The result-set will look like this:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
Combining AND & OR
You can also combine AND and OR (use parentheses to form complex expressions).
Now we want to select only the persons with the last name equal to "Svendson" AND the first name equal to "Tove" OR to "Ola":
We use the following SELECT statement:
SELECT * FROM Persons WHERE
LastName='Svendson'
AND (FirstName='Tove' OR FirstName='Ola')
The result-set will look like this:
P_Id LastName FirstName Address City
2 Svendson Tove Borgvn 23 Sandnes
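The parentheses matter because AND binds more tightly than OR. The sketch below, again using SQLite through Python's sqlite3 as an assumed test bed, runs the query with and without the parentheses on the "Persons" rows so the two results can be compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (P_Id INTEGER, LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany("INSERT INTO Persons VALUES (?, ?, ?, ?, ?)", [
    (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
    (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
    (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
])

# With parentheses: the last name must be Svendson in every match.
with_parens = conn.execute(
    "SELECT P_Id FROM Persons "
    "WHERE LastName='Svendson' AND (FirstName='Tove' OR FirstName='Ola') ORDER BY P_Id"
).fetchall()

# Without parentheses this reads as (LastName AND FirstName='Tove') OR FirstName='Ola',
# so Ola Hansen is returned as well.
without_parens = conn.execute(
    "SELECT P_Id FROM Persons "
    "WHERE LastName='Svendson' AND FirstName='Tove' OR FirstName='Ola' ORDER BY P_Id"
).fetchall()

print(with_parens, without_parens)
```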
SQL ORDER BY Keyword
• The ORDER BY keyword is used to sort the result-set by a specified column.
• The ORDER BY keyword sorts the records in ascending order by default.
• If you want to sort the records in descending order, you can use the DESC keyword.
• SQL ORDER BY Syntax
SELECT column_name(s)
FROM table_name
ORDER BY column_name(s) ASC|DESC
ORDER BY Example
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
4 Nilsen Tom Vingvn 23 Stavanger
Now we want to select all the persons from the table above, however, we want to sort the persons by their last name.
We use the following SELECT statement:
SELECT * FROM Persons ORDER BY LastName
The result-set will look like this:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
4 Nilsen Tom Vingvn 23 Stavanger
3 Pettersen Kari Storgt 20 Stavanger
2 Svendson Tove Borgvn 23 Sandnes
ORDER BY DESC Example
Now we want to select all the persons from the table above, sorted in descending order by their last name.
We use the following SELECT statement:
SELECT * FROM Persons
ORDER BY LastName DESC
The result-set will look like this:
P_Id LastName FirstName Address City
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
4 Nilsen Tom Vingvn 23 Stavanger
1 Hansen Ola Timoteivn 10 Sandnes
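The ascending and descending sorts can be reproduced with a short sketch. It assumes Python's built-in sqlite3 module as the engine; the helper last_names and the final two-column sort (City, then LastName) are illustrative additions, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (P_Id INTEGER, LastName TEXT, FirstName TEXT, "
    "Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
        (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
        (4, "Nilsen", "Tom", "Vingvn 23", "Stavanger"),
    ],
)

def last_names(sql):
    """Hypothetical helper: run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# Ascending is the default, so ASC may be omitted.
ascending = last_names("SELECT LastName FROM Persons ORDER BY LastName")
descending = last_names("SELECT LastName FROM Persons ORDER BY LastName DESC")
# Two sort keys: City descending first, then LastName ascending within each city.
by_city = last_names(
    "SELECT LastName FROM Persons ORDER BY City DESC, LastName ASC"
)

print(ascending)   # ['Hansen', 'Nilsen', 'Pettersen', 'Svendson']
print(descending)  # ['Svendson', 'Pettersen', 'Nilsen', 'Hansen']
print(by_city)     # ['Nilsen', 'Pettersen', 'Hansen', 'Svendson']
```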
SQL INSERT INTO Statement
The INSERT INTO statement is used to insert new records in a table. It can be written in two forms.
The first form doesn't specify the column names where the data will be inserted, only their values:
INSERT INTO table_name VALUES (value1, value2, value3,...)
The second form specifies both the column names and the values to be inserted:
INSERT INTO table_name (column1, column2, column3,...)
VALUES (value1, value2, value3,...)
SQL INSERT INTO Example
We have the following "Persons" table:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
Now we want to insert a new row in the "Persons" table.
We use the following SQL statement:
INSERT INTO Persons
VALUES (4, 'Nilsen', 'Johan', 'Bakken 2', 'Stavanger')
The "Persons" table will now look like this:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
4 Nilsen Johan Bakken 2 Stavanger
Insert Data Only in Specified Columns
• It is also possible to only add data in specific columns.
• The following SQL statement will add a new row, but only add data in the "P_Id", "LastName" and the "FirstName" columns:
INSERT INTO Persons (P_Id, LastName, FirstName)
VALUES (5, 'Tjessem', 'Jakob')
The "Persons" table will now look like this:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
4 Nilsen Johan Bakken 2 Stavanger
5 Tjessem Jakob
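Both forms of INSERT INTO can be tried in a small sketch, assuming Python's built-in sqlite3 module as the engine. It also shows what the empty Address and City cells for row 5 look like when read back: columns left out of the column list hold NULL here (returned as None in Python), because this assumed table defines no default values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (P_Id INTEGER, LastName TEXT, FirstName TEXT, "
    "Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
        (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
    ],
)

# First form: no column list, so one value is required per column, in table order.
conn.execute(
    "INSERT INTO Persons VALUES (4, 'Nilsen', 'Johan', 'Bakken 2', 'Stavanger')"
)

# Second form: only the listed columns receive values.
conn.execute(
    "INSERT INTO Persons (P_Id, LastName, FirstName) "
    "VALUES (5, 'Tjessem', 'Jakob')"
)

row4 = conn.execute("SELECT * FROM Persons WHERE P_Id = 4").fetchone()
row5 = conn.execute("SELECT * FROM Persons WHERE P_Id = 5").fetchone()

print(row4)  # (4, 'Nilsen', 'Johan', 'Bakken 2', 'Stavanger')
print(row5)  # (5, 'Tjessem', 'Jakob', None, None)
```

Naming the columns explicitly is usually the more robust choice, since the statement keeps working if columns are later added to the table.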
SQL UPDATE Statement
• The UPDATE statement is used to update existing records in a table.
• SQL UPDATE Syntax
UPDATE table_name
SET column1=value, column2=value2,...
WHERE some_column=some_value
Note: Notice the WHERE clause in the UPDATE syntax. The WHERE clause specifies which record or records should be updated. If you omit the WHERE clause, all records will be updated!
SQL UPDATE Example
The "Persons" table:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
4 Nilsen Johan Bakken 2 Stavanger
5 Tjessem Jakob
Now we want to update the person "Tjessem, Jakob" in the "Persons" table.
We use the following SQL statement:
UPDATE Persons
SET Address='Nissestien 67', City='Sandnes'
WHERE LastName='Tjessem' AND FirstName='Jakob'
The "Persons" table will now look like this:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
4 Nilsen Johan Bakken 2 Stavanger
5 Tjessem Jakob Nissestien 67 Sandnes
SQL UPDATE Warning
Be careful when updating records. If we had omitted the WHERE clause in the example above, like this:
UPDATE Persons
SET Address='Nissestien 67', City='Sandnes'
The "Persons" table would have looked like this:
P_Id LastName FirstName Address City
1 Hansen Ola Nissestien 67 Sandnes
2 Svendson Tove Nissestien 67 Sandnes
3 Pettersen Kari Nissestien 67 Sandnes
4 Nilsen Johan Nissestien 67 Sandnes
5 Tjessem Jakob Nissestien 67 Sandnes
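The difference between the targeted and the unfiltered UPDATE can be measured by counting affected rows. This is a minimal sketch assuming Python's built-in sqlite3 module, whose cursor attribute rowcount reports how many rows an UPDATE modified; the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (P_Id INTEGER, LastName TEXT, FirstName TEXT, "
    "Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
        (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
        (4, "Nilsen", "Johan", "Bakken 2", "Stavanger"),
        (5, "Tjessem", "Jakob", None, None),
    ],
)

# Targeted update: the WHERE clause restricts the change to one person.
targeted = conn.execute(
    "UPDATE Persons SET Address='Nissestien 67', City='Sandnes' "
    "WHERE LastName='Tjessem' AND FirstName='Jakob'"
).rowcount

# Unfiltered update: with no WHERE clause every row is overwritten.
unfiltered = conn.execute(
    "UPDATE Persons SET Address='Nissestien 67', City='Sandnes'"
).rowcount

distinct_addresses = conn.execute(
    "SELECT COUNT(DISTINCT Address) FROM Persons"
).fetchone()[0]

print(targeted)            # 1
print(unfiltered)          # 5
print(distinct_addresses)  # 1
```

Checking the affected-row count after an UPDATE is a cheap way to notice a missing or overly broad WHERE clause before the change is made permanent.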
SQL DELETE Statement
The DELETE statement is used to delete records in a table.
SQL DELETE Syntax
DELETE FROM table_name
WHERE some_column=some_value
Note: Notice the WHERE clause in the DELETE syntax. The WHERE clause specifies which record or records should be deleted. If you omit the WHERE clause, all records will be deleted!
SQL DELETE Example
The "Persons" table:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
4 Nilsen Johan Bakken 2 Stavanger
5 Tjessem Jakob Nissestien 67 Sandnes
Now we want to delete the person "Tjessem, Jakob" in the "Persons" table.
We use the following SQL statement:
DELETE FROM Persons
WHERE LastName='Tjessem' AND FirstName='Jakob'
The "Persons" table will now look like this:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
4 Nilsen Johan Bakken 2 Stavanger
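Delete All Rows
It is possible to delete all rows in a table without deleting the table. This means that the table structure, attributes, and indexes will be intact:
DELETE FROM table_name
or
DELETE * FROM table_name
(The second form is not standard SQL; it is accepted by some systems, such as Microsoft Access, and rejected by many others.)
Note: Be very careful when deleting records. Once the deletion has been committed, it cannot be undone!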
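A small sketch of both kinds of DELETE, assuming Python's built-in sqlite3 module as the engine. Counting the matching rows with the same WHERE clause before deleting is an illustrative precaution, not something the slides prescribe; sqlite_master is SQLite's own catalog table and is used here only to confirm that the emptied table still exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (P_Id INTEGER, LastName TEXT, FirstName TEXT, "
    "Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
        (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
        (4, "Nilsen", "Johan", "Bakken 2", "Stavanger"),
        (5, "Tjessem", "Jakob", "Nissestien 67", "Sandnes"),
    ],
)

condition = "LastName='Tjessem' AND FirstName='Jakob'"

# Precaution: count what the WHERE clause matches before deleting anything.
matched = conn.execute(
    "SELECT COUNT(*) FROM Persons WHERE " + condition
).fetchone()[0]

conn.execute("DELETE FROM Persons WHERE " + condition)
remaining = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]

# Delete all rows: the table is emptied but not dropped.
conn.execute("DELETE FROM Persons")
after_delete_all = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]
table_still_exists = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' AND name='Persons'"
).fetchone() is not None

print(matched, remaining, after_delete_all, table_still_exists)  # 1 4 0 True
```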
The AVG() Function
The AVG() function returns the average value of a numeric column.
SQL AVG() Syntax
SELECT AVG(column_name) FROM table_name
SQL AVG() Example
We have the following "Orders" table:
O_Id OrderDate OrderPrice Customer
1 2008/11/12 1000 Hansen
2 2008/10/23 1600 Nilsen
3 2008/09/02 700 Hansen
4 2008/09/03 300 Hansen
5 2008/08/30 2000 Jensen
6 2008/10/04 100 Nilsen
Now we want to find the average value of the "OrderPrice" fields.
We use the following SQL statement:
SELECT AVG(OrderPrice) AS OrderAverage FROM Orders
The result-set will look like this:
OrderAverage
950
Now we want to find the customers that have an OrderPrice value higher than the average OrderPrice value.
We use the following SQL statement:
SELECT Customer FROM Orders
WHERE OrderPrice>(SELECT AVG(OrderPrice) FROM Orders)
The result-set will look like this:
Customer
Hansen
Nilsen
Jensen
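The average and the subquery comparison can be checked with a short sketch, assuming Python's built-in sqlite3 module as the engine. The ORDER BY O_Id in the second query is an addition that makes the row order deterministic; the arithmetic is (1000 + 1600 + 700 + 300 + 2000 + 100) / 6 = 950.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (O_Id INTEGER, OrderDate TEXT, "
    "OrderPrice INTEGER, Customer TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        (1, "2008/11/12", 1000, "Hansen"),
        (2, "2008/10/23", 1600, "Nilsen"),
        (3, "2008/09/02", 700, "Hansen"),
        (4, "2008/09/03", 300, "Hansen"),
        (5, "2008/08/30", 2000, "Jensen"),
        (6, "2008/10/04", 100, "Nilsen"),
    ],
)

# AVG() over all six orders.
order_average = conn.execute(
    "SELECT AVG(OrderPrice) AS OrderAverage FROM Orders"
).fetchone()[0]

# The subquery is evaluated first; each row is then compared with its result.
above_average = [
    row[0]
    for row in conn.execute(
        "SELECT Customer FROM Orders "
        "WHERE OrderPrice > (SELECT AVG(OrderPrice) FROM Orders) "
        "ORDER BY O_Id"
    )
]

print(order_average)  # 950.0
print(above_average)  # ['Hansen', 'Nilsen', 'Jensen']
```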
SQL COUNT() Function
The COUNT() function returns the number of rows that match the specified criteria.
(1) SQL COUNT(column_name) Syntax
The COUNT(column_name) function returns the number of values (NULL values will not be counted) of the specified column:
SELECT COUNT(column_name) FROM table_name
(2) SQL COUNT(*) Syntax
The COUNT(*) function returns the number of records in a table:
SELECT COUNT(*) FROM table_name
(3) SQL COUNT(DISTINCT column_name) Syntax
The COUNT(DISTINCT column_name) function returns the number of distinct values of the specified column:
SELECT COUNT(DISTINCT column_name) FROM table_name
SQL COUNT() Function Example
(1) SQL COUNT(column_name) Example
We have the following "Orders" table:
O_Id OrderDate OrderPrice Customer
1 2008/11/12 1000 Hansen
2 2008/10/23 1600 Nilsen
3 2008/09/02 700 Hansen
4 2008/09/03 300 Hansen
5 2008/08/30 2000 Jensen
6 2008/10/04 100 Nilsen
Now we want to count the number of orders from "Customer Nilsen".
We use the following SQL statement:
SELECT COUNT(Customer) AS CustomerNilsen FROM Orders
WHERE Customer='Nilsen'
The result of the SQL statement above will be 2, because the customer Nilsen has made 2 orders in total:
CustomerNilsen
2
(2) SQL COUNT(*) Example
If we omit the WHERE clause, like this:
SELECT COUNT(*) AS NumberOfOrders FROM Orders
The result-set will look like this:
NumberOfOrders
6
which is the total number of rows in the table.
(3) SQL COUNT(DISTINCT column_name) Example
Now we want to count the number of unique customers in the "Orders" table.
We use the following SQL statement:
SELECT COUNT(DISTINCT Customer) AS NumberOfCustomers FROM Orders
The result-set will look like this:
NumberOfCustomers
3
which is the number of unique customers (Hansen, Nilsen, and Jensen) in the "Orders" table.
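The three COUNT variants can be compared side by side in a minimal sketch, assuming Python's built-in sqlite3 module as the engine. The seventh order with a NULL customer is hypothetical: it is not in the "Orders" table above and is added only at the end to show that COUNT(column_name) skips NULL values while COUNT(*) does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (O_Id INTEGER, OrderDate TEXT, "
    "OrderPrice INTEGER, Customer TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        (1, "2008/11/12", 1000, "Hansen"),
        (2, "2008/10/23", 1600, "Nilsen"),
        (3, "2008/09/02", 700, "Hansen"),
        (4, "2008/09/03", 300, "Hansen"),
        (5, "2008/08/30", 2000, "Jensen"),
        (6, "2008/10/04", 100, "Nilsen"),
    ],
)

def scalar(sql):
    """Hypothetical helper: return the single value produced by a query."""
    return conn.execute(sql).fetchone()[0]

customer_nilsen = scalar(
    "SELECT COUNT(Customer) FROM Orders WHERE Customer='Nilsen'"
)
number_of_orders = scalar("SELECT COUNT(*) FROM Orders")
number_of_customers = scalar("SELECT COUNT(DISTINCT Customer) FROM Orders")

# Hypothetical extra row (not in the source table) with no customer recorded.
conn.execute("INSERT INTO Orders VALUES (7, '2008/12/01', 500, NULL)")
rows_with_null = scalar("SELECT COUNT(*) FROM Orders")
customers_with_null = scalar("SELECT COUNT(Customer) FROM Orders")

print(customer_nilsen, number_of_orders, number_of_customers)  # 2 6 3
print(rows_with_null, customers_with_null)                     # 7 6
```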
SQL INNER JOIN Keyword
The INNER JOIN keyword returns rows when there is at least one match in both tables.
SQL INNER JOIN Syntax
SELECT column_name(s)
FROM table_name1
INNER JOIN table_name2
ON table_name1.column_name=table_name2.column_name
SQL INNER JOIN Example
The "Persons" table:
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
The "Orders" table:
O_Id OrderNo P_Id
1 77895 3
2 44678 3
3 22456 1
4 24562 1
5 34764 15
Now we want to list all the persons with any orders.
We use the following SELECT statement:
SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo
FROM Persons
INNER JOIN Orders
ON Persons.P_Id=Orders.P_Id
ORDER BY Persons.LastName
The result-set will look like this:
LastName FirstName OrderNo
Hansen Ola 22456
Hansen Ola 24562
Pettersen Kari 77895
Pettersen Kari 44678
The INNER JOIN keyword returns rows when there is at least one match in both tables. If there are rows in "Persons" that do not have matches in "Orders", those rows will NOT be listed.
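The join can be reproduced with a minimal sketch, assuming Python's built-in sqlite3 module as the engine. One deliberate difference from the statement above: Orders.OrderNo is added as a second sort key, because ORDER BY Persons.LastName alone leaves the order of rows with the same last name unspecified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (P_Id INTEGER, LastName TEXT, FirstName TEXT, "
    "Address TEXT, City TEXT)"
)
conn.execute("CREATE TABLE Orders (O_Id INTEGER, OrderNo INTEGER, P_Id INTEGER)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
        (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
    ],
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(1, 77895, 3), (2, 44678, 3), (3, 22456, 1), (4, 24562, 1), (5, 34764, 15)],
)

# Only pairs of rows whose P_Id values are equal survive the join.
joined = conn.execute(
    "SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo "
    "FROM Persons "
    "INNER JOIN Orders ON Persons.P_Id = Orders.P_Id "
    "ORDER BY Persons.LastName, Orders.OrderNo"
).fetchall()

for row in joined:
    print(row)
# ('Hansen', 'Ola', 22456)
# ('Hansen', 'Ola', 24562)
# ('Pettersen', 'Kari', 44678)
# ('Pettersen', 'Kari', 77895)
```

"Svendson, Tove" has no orders and order 34764 refers to P_Id 15, which has no matching person, so neither appears in the result.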