CS 541 Lecture Slides Sunil Prabhakar CS541 Database Systems.
CS 541 Lecture Slides
Sunil Prabhakar
CS541 Database Systems
April 21, 2023 Sunil Prabhakar 2
Instructor
Sunil Prabhakar
  LWSN 2142C
  Office Hours: catch me or by appointment
  [email protected]
  http://www.cs.purdue.edu/homes/sunil/
Teaching Assistant: Yasin Silva
  [email protected]
  Office hours: TBA
  Assignments and Projects
Course Information
Web page: http://www.cs.purdue.edu/homes/sunil/syllabi/CS541_Fall2004.html
  Projects, Assignments, Solutions, Slides
Email alias
  Announcements: IMPORTANT
  [email protected]
  mailer add me to cs541
WebCT
  Grades
  Check that you can log in
Course Description
Introductory graduate course on databases
Fundamental concepts & internals
Some coverage of use of databases (Oracle projects)
Will not teach use of databases!!!
Focus on Relational Databases
Topics
DBMS Concepts and Architecture
Relational Database Model
Relational Languages (Algebra, Calculus, SQL)
Storage and Indexing
Query Processing
Query Optimization
Transaction Processing
  Concurrency Control
  Recovery
Advanced Topics: TBD (Mining, Indexing, Sensors, …)
Pre-Requisites
Data Structures
  Notions of trees, hashing, linked lists etc.
Operating Systems
  I/O
Java
  Project 3 will be done in Java
  RMI
  Simple GUI
Text
Database System Concepts (4th Edition)
  Silberschatz, Korth, Sudarshan
  ISBN: 0-07-228363-7
  McGraw Hill
Supplemental Text:
  Concurrency Control and Recovery in Database Systems
  Bernstein, Hadzilacos, Goodman.
  Out of Print: Available free on the Internet
  Link from course web page.
Grading Policy
Tentative
  Written Assignments (2)       20%
  Programming Projects (3-4)    40%
  Mid-term Exam                 20%
  Final Exam                    20%
Final not comprehensive
Grading is curved
No extra credit assignments
Academic Integrity
CS Policy
  IMPORTANT: visit, read and accept!!!
  https://portals.cs.purdue.edu/student
  Need CS login and password.
Cheating will be taken very seriously.
  Make sure that you are familiar with what CS considers to be cheating!!
  You may discuss the problems, but the final solution must be your own.
Course Policy
NO LATE SUBMISSIONS
NO LATE SUBMISSIONS
NO EXTENSIONS
NO EXTENSIONS
***
Only on Documented Medical Reasons or Family emergency.
Databases
What is a database?
  Software to manage data.
Why do we need a database?
  Ease of development
  Efficiency
  Concurrency
  Reliability
  Ease of administration
  Data independence
Importance of databases?
  Increasing or decreasing? What is changing?
What is interesting?
Essential to modern applications?
  Data is a valuable commodity.
Is there anything challenging?
  Encompass PL, OS, Logic, Theory, …
  Novel solutions with wider applicability: Transactions, Locking, …
What remains to be done?
  Modern applications: Multimedia, Sensors, Streams, Data Warehouses, Data Mining, Privacy and Security, Knowledge, Data on the Web, XML, …
Abstraction
How to provide a generic, application-independent solution?
Data Models
  Abstract view of data
  Database efficiently supports this model
  Examples: Network, Relational, OO, O-R, …
  Most successful model: RELATIONAL
Users access the database as a black box that supports the model.
Languages are used to interact with this Box:
  Relational Algebra, SQL
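As a rough illustration of the two languages named on this slide, the sketch below runs one query twice: once as a hand-written relational algebra expression (a projection over a selection) on Python tuples, and once as SQL against an in-memory SQLite database. The student relation, its columns, and its rows are invented for this example and are not part of the course material.

```python
import sqlite3

# Hypothetical relation student(name, dept, year); all rows are made up.
rows = [("Ana", "CS", 1), ("Bo", "EE", 2), ("Cy", "CS", 2)]

# Relational algebra by hand: project name over (select dept = 'CS').
# A set is used because relations contain no duplicate tuples.
algebra_result = {name for (name, dept, year) in rows if dept == "CS"}

# The same query in SQL; the database is the "black box".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, dept TEXT, year INTEGER)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", rows)
cursor = conn.execute("SELECT DISTINCT name FROM student WHERE dept = 'CS'")
sql_result = {row[0] for row in cursor}
conn.close()

# Both formulations describe the same set of names.
print(algebra_result == sql_result)
```

Neither formulation says how the rows are stored or scanned; that is left to the system behind the model.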
Independence
Databases allow applications and users to be shielded from the internal details:
Physical data independence
  How data is stored (bits, pages, formats, etc.)
  Compare with Flat file alternative
Logical data independence
  How data is structured logically.
  Allows changes to the logical organization of data without having to rebuild applications
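One common way to get logical data independence is a view, sketched below with SQLite. This is only an assumed illustration, not a mechanism the slides prescribe: the employee table, the split into emp_name and emp_pay, and the app_query function are all hypothetical. The application query stays the same while the underlying tables are reorganized.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original logical organization: a single (hypothetical) employee table.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Ana', 100)")

def app_query(c):
    """The 'application': it only knows the name employee and its columns."""
    return c.execute("SELECT name, salary FROM employee ORDER BY id").fetchall()

before = app_query(conn)

# Reorganize the data into two tables, then expose the old shape as a view.
conn.executescript("""
    CREATE TABLE emp_name (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp_pay  (id INTEGER PRIMARY KEY, salary INTEGER);
    INSERT INTO emp_name SELECT id, name FROM employee;
    INSERT INTO emp_pay  SELECT id, salary FROM employee;
    DROP TABLE employee;
    CREATE VIEW employee AS
        SELECT n.id AS id, n.name AS name, p.salary AS salary
        FROM emp_name n JOIN emp_pay p ON n.id = p.id;
""")

# The unchanged application query still works against the new organization.
after = app_query(conn)
conn.close()
print(before == after)
```

The sketch covers reads only; updates through a view need extra support and are outside what this example shows.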
Concurrency Control & Recovery
Two highly desirable requirements:
  Enable multiple users to access the data at the same time.
  Automatic recovery from crashes.
Challenge:
  How to do this in an application-independent manner?
Solution:
  Transactions
  “Contract” between the DB Black Box and users.
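A minimal sketch of one clause of that contract, atomicity, using Python's sqlite3 module. The account table, the CHECK constraint, and the transfer function are assumptions made up for this example. A transfer either applies both updates or neither: when the debit violates the constraint, the credit that already ran is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: balances may never go negative.
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()

def transfer(c, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on commit."""
    try:
        with c:  # commits if the block succeeds, rolls back if it raises
            c.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            c.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint; the credit was undone too.
        return False

ok1 = transfer(conn, "A", "B", 30)    # fits within A's balance
ok2 = transfer(conn, "A", "B", 500)   # would overdraw A, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
conn.close()
print(ok1, ok2, balances)
```

The application code never undoes the partial credit itself; it relies on the database to do so, which is the application-independent part.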
Performance
Critical for databases
  Research focus for many years
  Must be transparent to the users
Query processing & Optimization
Indexing, storage organization (data independence)
Challenge:
  How to optimize without understanding the semantics of an application?
Solution:
  Relational data model -- clean mathematical abstraction, allows for alternative equivalent evaluations
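To make "alternative equivalent evaluations" concrete, the sketch below evaluates one query with two plans: join first and then select, or select first and then join. The emp and dept relations and the nested-loop join helper are invented for this example; a real optimizer considers many more plans and uses cost estimates.

```python
# Hypothetical relations; all data is made up.
emp = [(i, f"e{i}", i % 3) for i in range(6)]   # (id, name, dept_id)
dept = [(0, "CS"), (1, "EE"), (2, "ME")]        # (dept_id, dept_name)

def join(r, s):
    """Nested-loop equijoin on emp.dept_id = dept.dept_id."""
    return [(e, d) for e in r for d in s if e[2] == d[0]]

# Plan 1: join everything, then keep only the CS rows.
plan1 = [(e[1], d[1]) for (e, d) in join(emp, dept) if d[1] == "CS"]

# Plan 2: select the CS department first, then join the smaller input.
cs_only = [d for d in dept if d[1] == "CS"]
plan2 = [(e[1], d[1]) for (e, d) in join(emp, cs_only)]

# Same answer from both plans; plan 2 compares fewer pairs of tuples.
print(sorted(plan1) == sorted(plan2))
```

The equivalence follows from the algebra alone, so the system can choose the cheaper plan without knowing what the application means by its data.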
This course
Study the relational model, ER model, languages.
Transactions
  Concurrency Control
  Recovery
Storage and File Structures
Indexing and Hashing
Query Processing and Optimization
Advanced Topics
  New data types, applications, multi-dimensional data, data warehousing, data mining, design, …