Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase...
Transcript of Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase...
![Page 1: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/1.jpg)
Databases101
MatthewJ.GrahamCACR
MethodsofComputationalScienceCaltech, 3May2011
matthew graham
![Page 2: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/2.jpg)
fliptrap
matthew graham
![Page 3: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/3.jpg)
fliptrap
matthew graham
![Page 4: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/4.jpg)
fliptrap
matthew graham
![Page 5: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/5.jpg)
fliptrap
matthew graham
![Page 6: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/6.jpg)
whatisadatabase?
A structuredcollectionofdataresidingonacomputersystemthatcanbeeasilyaccessed, managedandupdated
Dataisorganisedaccordingtoadatabasemodel
A DatabaseManagementSystem(DBMS) isasoftwarepackagedesignedtostoreandmanagedatabases
matthew graham
![Page 7: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/7.jpg)
whyuseadbms?
dataindependence
efficientandconcurrentaccess
dataintegrity, securityandsafety
uniformdataadministration
reducedapplicationdevelopmenttime
dataanalysistools
matthew graham
![Page 8: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/8.jpg)
scaleofdatabases
"DBs own the sweet spot of1GB to 100TB" (Gray&Hey,2006)
SQLite
MySQL,PostgreSQL
SQLServer, Oracle
∗Hive, HadoopDB
matthew graham
![Page 9: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/9.jpg)
datamodels
A collectionofconceptsdescribinghowstructureddataisrepresentedandaccessed
Withinadatamodel, the schema isasetofdescriptionsofaparticularcollectionofdata
Theschemaisstoredina datadictionary andcanberepresentedinSQL,XML,RDF,etc.
Insemanticsadatamodelisequivalenttoanontology-"aformal, explicitspecificationofasharedconceptualisation"
matthew graham
![Page 10: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/10.jpg)
fliptrap
matthew graham
![Page 11: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/11.jpg)
flat(file)model
Datafilesthatcontainrecordswithnostructuralrelationships
Additionalinformationisrequiredtointerpretthesefilessuchasthefileformatproperties
Hollerith1889patent"ArtofCompilingStatistics"describeshoweveryUS residentcanberepresentedbyastringof80charactersandnumbers
Examples: delimited-separateddata, HTML table
matthew graham
![Page 12: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/12.jpg)
relationalmodel
Dataisorganizedasrelations, attributesanddomains
A relationisatablewithcolumns(attributes)androws(tuples)
Thedomainisthesetofvaluesthattheattributesareallowedtotake
Withintherelation, eachrowisunique, thecolumnorderisimmaterialandeachrowcontainsasinglevalueforeachofitsattributes
ProposedbyE.F.Coddin1969/70
matthew graham
![Page 13: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/13.jpg)
transactions
Anatomicsequenceofactions(read/write)inthedatabase
Eachtransactionhastobeexecuted completely andmustleavethedatabaseinaconsistentstate
Ifthetransactionfailsorabortsmidway, thedatabaseis"rolledback"toitsinitialconsistentstate
Example:AuthorisePaypaltopay $100formyeBaypurchase:-Debitmyaccount $100-Credittheseller'saccount $100
matthew graham
![Page 14: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/14.jpg)
acid
Bydefinition, adatabasetransactionmustbe:
Atomic: allornothing
Consistent: nointegrityconstraintsviolated
Isolated: doesnot interferewithanyothertransaction
Durable: committed transaction effectspersist
matthew graham
![Page 15: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/15.jpg)
concurrency
DBMS ensuresthatinterleavedtransactionscomingfromdifferentclientsdonotcauseinconsistenciesinthedata
Itconvertstheconcurrenttransactionsetintoanewsetthatcanbeexecutedsequentially
Beforereading/writinganobject, eachtransactionwaitsforalock ontheobject
Eachtransactionreleasesallitslockswhenfinished
matthew graham
![Page 16: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/16.jpg)
locks
DMBS cansetandholdmultiplelockssimultaneouslyondifferentlevelsofthephysicaldatastructure
Granularity: atarowlevel, page(abasicdatablock), extent(multiplearrayofpages)orevenanentiretable
Exclusivevs. shared
Optimisticvs. pessimistic
matthew graham
![Page 17: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/17.jpg)
logs
Ensuresatomicityoftransactions
Recoveringafteracrash, effectsofpartiallyexecutedtransactionsareundoneusingthelog
Logrecord:
-- Header(transactionID,timestamp, ...)-- ItemID-- Type-- Oldandnewvalue
matthew graham
![Page 18: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/18.jpg)
partitions
Horizontal: differentrowsindifferenttables
Vertical: differentcolumnsindifferenttables(normalisation)
Range: rowswherevaluesinaparticularcolumnareinsideacertainrange
List: rowswherevaluesinaparticularcolumnmatchalistofvalues
Hash: rowswhereahashfunctionreturnsaparticularvalue
matthew graham
![Page 19: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/19.jpg)
structuredquerylanguage
Appearedin1974fromIBM
Firststandardpublishedin1986; mostrecentin2008
SQL92istakentobedefaultstandard
Differentflavours:
Microsoft/Sybase Transact-SQLMySQL MySQLOracle PL/SQLPostgreSQL PL/pgSQL
matthew graham
![Page 20: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/20.jpg)
create
CREATE DATABASE databaseNameCREATE TABLE tableName (name1type1, name2type2, . . . )
CREATE TABLE star (name varchar(20), ra float, dec float, vmag float)
Datatypes:• boolean, bit, tinyint, smallint, int, bigint;• real/float, double, decimal;• char, varchar, text, binary, blob, longblob;• date, time, datetime, timestamp
CREATE TABLE star (name varchar(20) not null, ra float default 0, ...)
matthew graham
![Page 21: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/21.jpg)
keysCREATE TABLE star (name varchar(20), ra float, dec float, vmag float,
CONSTRAINT PRIMARY KEY (name))
A primarykeyisauniqueidentifierforarowandisautomaticallynotnull
CREATE TABLE star (name varchar(20), ..., stellarType varchar(8),CONSTRAINT stellarType_fk FOREIGN KEY (stellarType)REFERENCES stellarTypes(id))
A foreignkeyisareferentialconstraintbetweentwotablesidentifyingacolumninonetablethatreferstoacolumninanothertable.
matthew graham
![Page 22: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/22.jpg)
insertINSERT INTO tableName VALUES(val1, val2, . . . )
INSERT INTO star VALUES('Sirius', 101.287, -16.716, -1.47)
INSERT INTO star(name, vmag) VALUES('Canopus', -0.72)
INSERT INTO starSELECT ...
matthew graham
![Page 23: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/23.jpg)
delete
DELETE FROM tableName WHERE conditionTRUNCATE TABLE tableNameDROP TABLE tableName
DELETE FROM star WHERE name = 'Canopus'
DELETE FROM star WHERE name LIKE 'C_n%'
DELETE FROM star WHERE vmag > 0 OR dec < 0
DELETE FROM star WHERE vmag BETWEEN 0 and 5
matthew graham
![Page 24: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/24.jpg)
update
UPDATE tableName SET columnName =val1WHERE condition
UPDATE star SET vmag = vmag + 0.5
UPDATE star SET vmag = -1.47 WHERE name LIKE 'Sirius'
matthew graham
![Page 25: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/25.jpg)
select
SELECT selectionList FROM tableList WHERE conditionORDER BY criteria
SELECT name, constellation FROM star WHERE dec > 0ORDER by vmag
SELECT * FROM star WHERE ra BETWEEN 0 AND 90
SELECT DISTINCT constellation FROM star
SELECT name FROM star LIMIT 5ORDER BY vmag
matthew graham
![Page 26: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/26.jpg)
joins
Innerjoin: combiningrelatedrows
SELECT * FROM star s INNER JOIN stellarTypes t ON s.stellarType = t.id
SELECT * FROM star s, stellarTypes t WHERE s.stellarType = t.id
Outerjoin: eachrowdoesnotneedamatchingrow
SELECT * from star s LEFT OUTER JOIN stellarTypes t ON s.stellarType = t.id
SELECT * from star s RIGHT OUTER JOIN stellarTypes t ON s.stellarType = t.id
SELECT * from star s FULL OUTER JOIN stellarTypes t ON s.stellarType = t.id
matthew graham
![Page 27: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/27.jpg)
aggregatefunctions
COUNT,AVG,MIN,MAX,SUM
SELECT COUNT(*) FROM star
SELECT AVG(vmag) FROM star
SELECT stellarType, MIN(vmag), MAX(vmag) FROM starGROUP BY stellarType
SELECT stellarType, AVG(vmag), COUNT(id) FROM starGROUP BY stellarTypeHAVING vmag > 14
matthew graham
![Page 28: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/28.jpg)
views
CREATE VIEW viewName AS . . .
CREATE VIEW region1View ASSELECT * FROM star WHERE ra BETWEEN 150 AND 170
AND dec BETWEEN -10 AND 10
SELECT id FROM region1View WHERE vmag < 10
CREATE VIEW region2View ASSELECT * FROM star s, stellarTypes t WHERE s.stellarType = t.id
AND ra BETWEEN 150 AND 170 AND dec BETWEEN -10 AND 10
SELECT id FROM regionView2 WHERE vmag < 10 and stellarType LIKE 'A%'
matthew graham
![Page 29: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/29.jpg)
indexes
CREATE INDEX indexName ON tableName(columns)
CREATE INDEX vmagIndex ON star(vmag)
A clusteredindexisoneinwhichtheorderingofdataentriesisthesameastheorderingofdatarecords
Onlyoneclusteredindexpertablebutmultipleunclusteredindexes
TypicallyimplementedasB+treesbutalternatetypessuchasbitmapindexforhighfrequencyrepeateddata
matthew graham
![Page 30: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/30.jpg)
cursorsDECLARE cursorName CURSOR FOR SELECT ...OPEN cursorNameFETCH cursorName INTO ...CLOSE cursorName
A cursorisacontrolstructureforsuccessivetraversalofrecordsinaresultset
Slowestwayofaccessingdata
matthew graham
![Page 31: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/31.jpg)
cursorsexample
Foreachrowintheresultset, updatetherelevantstellarmodel
DECLARE @name varchar(20)DECLARE @mag floatDECLARE starCursor CURSOR FOR
SELECT name, AVG(vmag) FROM starGROUP BY stellarType
OPEN starCursorFETCH starCursor INTO @name, @magEXEC updateStellarModel @name, @mag / CALL updateStellarModel(@name, @mag)
CLOSE starCursor
matthew graham
![Page 32: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/32.jpg)
triggersCREATE TRIGGER triggerName ONtableName ...
A trigger is procedural code that isautomatically executed in response tocertaineventsonaparticulartable:• INSERT• UPDATE• DELETE
CREATE TRIGGER starTrigger ON star FOR UPDATE ASIF @@ROWCOUNT = 0 RETURNIF UPDATE (vmag) EXEC refreshModels
GO
matthew graham
![Page 33: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/33.jpg)
storedprocedures
CREATE PROCEDURE procedureName @param1type, . . .AS ...
CREATE PROCEDURE findNearestNeighbour @starName varchar(20) ASBEGIN
DECLARE @ra, @dec floatDECLARE @name varchar(20)SELECT @ra = ra, @dec = dec FROM star WHERE name LIKE @starNameSELECT name FROM getNearestNeighbour(@ra, @dec)
END
EXEC findNearestNeighbour 'Sirius'
matthew graham
![Page 34: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/34.jpg)
normalisation
Firstnormalform: norepeatingelementsorgroupsofelementstablehasauniquekey(andnonullablecolumns)
Secondnormalform: nocolumnsdependentononlypartofthekey
StarName Constellation Area
Thirdnormalform: nocolumnsdependentonothernon-keycolumns
StarName Magnitude Flux
matthew graham
![Page 35: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/35.jpg)
programming
Java
import java.sql.*...String dbURL = "jdbc:mysql://127.0.0.1:1234/test";Connection conn = DriverManager.getConnection(dbUrl, "mjg", "mjg");Statement stmt = conn.createStatement();ResultSet res = stmt.executeQuery("SELECT * FROM star");...conn.close();
matthew graham
![Page 36: Databases 101 - California Institute of Technologygeorge/aybi199/Graham_DB1.pdf · ADatabase Management System (DBMS)is a software package designed to store and manage databases matthew](https://reader030.fdocuments.us/reader030/viewer/2022040103/5ebe66e692a32811154e018a/html5/thumbnails/36.jpg)
programming
Python:
import MySQLdbCon = MySQLdb.connect(host="127.0.0.1", port=1234, user="mjg",
passwd="mjg", db="test")Cursor = Con.cursor()sql = "SELECT * FROM star"Cursor.execute(sql)Results = Cursor.fetchall()...Con.close()
matthew graham