Database Career Opportunities -...


Page 1: Database Career Opportunities - csuohio.educis.csuohio.edu/~sschung/cis430/DatabaseCareerOpportunities.pdf · Database Career Opportunities ... • Database Performance Tuning ...

Database Career Opportunities: Industry Application Development

Database Career Opportunities: Leading Database Industry Research

• Microsoft

• Oracle

• IBM

• Teradata

• EMC

• SAP

• Vertica Columnar

Database Career Opportunities: Leading Big Data Industry Research

• Google

• Facebook

• Yahoo

• Twitter

• LinkedIn

• Amazon

• eBay

• And So Many Others

What to Study: Advanced Database Topics

• Operational RDBMS Based:

• Modeling and Design

• Normalization, Functional Dependency Theory

• Database Performance Tuning

• Index, Index, Index !!

• Query Optimization

• Database Security

• Data Analytics:

• Parallel Data Warehouse and OLAP

• Star Schema OLAP Query Optimization

• Columnar Database

Advanced Database Research Topics

• How to Build a Database Server

== (How to Build an Expert System of Artificial Intelligence)

• How To Build a Query Processor

• Optimization Techniques of Performance of Data Processing

• Index Strategy Development

• Parallel Data Processing Techniques

• In Memory Database Processing and Optimization

• Columnar Database

• Concurrency Control Techniques

• Database Security

Data Warehousing and OLAP

• Introduction

• Decision Support Technology

• On Line Analytical Processing

• Star Schema

• Relational Aggregation Operators

• Data Cube

• Roll Up, Drill Down

• Papers to Cover

• An Overview of Data Warehousing and OLAP Technology by Surajit Chaudhuri (Microsoft) and Umeshwar Dayal (HP Labs), in ACM SIGMOD Record, 1997

• Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals by Jim Gray (Microsoft), et al., in the proceedings of IEEE ICDE 1996
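
The data cube and roll-up ideas listed above can be illustrated with a minimal Python sketch. It is not taken from the cited papers: the fact rows, the `ALL` marker, and the `data_cube` function are hypothetical names chosen for illustration. The sketch sums one measure for every subset of the dimensions, which is roughly what a CUBE-style aggregation produces.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, sales amount).
FACTS = [
    ("East", "Widget", 10),
    ("East", "Gadget", 20),
    ("West", "Widget", 5),
]

ALL = "ALL"  # placeholder for a dimension that has been rolled up


def data_cube(rows, n_dims):
    """Sum the measure for every subset of the n_dims dimensions."""
    cube = defaultdict(int)
    for *dims, measure in rows:
        # Each row contributes to one cell per subset of kept dimensions.
        for k in range(n_dims + 1):
            for kept in combinations(range(n_dims), k):
                key = tuple(
                    dims[i] if i in kept else ALL for i in range(n_dims)
                )
                cube[key] += measure
    return dict(cube)


cube = data_cube(FACTS, 2)
print(cube[("East", ALL)])  # roll up over product for the East region
print(cube[(ALL, ALL)])     # grand total
```

Rolling up corresponds to moving from a cell such as `("East", "Widget")` to `("East", ALL)`; drilling down is the reverse direction.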

S.Chung-CIS611_Lecture_Notes

What to Study: Advanced Database Topics

Massively Parallel Big Data Processing Systems

• Map Reduce and Hadoop

• NoSQL Database Processing Systems

- MongoDB

- HBase

- Hive

- Pig Latin

Semistructured Data vs Structured Data

• Structured Data

• Relational Database, Data Warehouse

• SQL

• Semistructured Data

• XML, HTML, JSON

• XQuery, XPath

• Unstructured Data

• Text, Web Data (Mix of Text, Image, Audio, Video)

• Problems and Solutions of Processing Semistructured/Unstructured Data

• Introduction to Markup Languages: XML, HTML

Semi Structured Data and Big Data

• Introduction of XML

• XML Schema, Semantics, Protocol

• XQuery, XPath

• XML Query Processing

• XML 1.0 (DOM and SAX2 APIs)

• XQuery 1.0 and XPath 2.0 Semantics

• XSLT

• JSON

Information Retrieval: How Google Search Engine Works

• Web data Processing

• Building Index

• Relevancy Metric for Search Engine

• How Google Search Engine Works
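
As a rough illustration of the "Building Index" and "Relevancy Metric" bullets above, here is a minimal Python sketch of an inverted index with a raw term-frequency score. It is an assumption-laden toy, not how any production search engine ranks pages: the sample documents and the `build_index` and `search` functions are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, freq in Counter(text.lower().split()).items():
            index[term][doc_id] = freq
    return index


def search(index, query):
    """Rank documents by the summed frequency of the query terms."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, freq in index.get(term, {}).items():
            scores[doc_id] += freq
    return [doc_id for doc_id, _ in scores.most_common()]


# Hypothetical miniature document collection.
DOCS = {
    1: "database index tuning",
    2: "index index everywhere",
    3: "big data",
}
INDEX = build_index(DOCS)
results = search(INDEX, "index")
print(results)
```

Real relevance metrics add much more, for example weighting rare terms more heavily and using link structure, but the lookup path from query term to posting list to score is the same basic shape.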

Map Reduce and Apache Hadoop

• Introduction of Google Map Reduce

• Introduction of Apache Hadoop

• HDFS Architecture

• Parallel programming and the MapReduce programming model

• Algorithms of Map Reduce/Apache Hadoop

• Papers to Cover

• MapReduce: Simplified Data Processing on Large Clusters by Jeffrey Dean (Google) and Sanjay Ghemawat (Google), in the proceedings of OSDI 2004

http://labs.google.com/papers/mapreduce-osdi04.pdf

• Lämmel, Ralf. Google's MapReduce Programming Model Revisited. http://www.cs.vu.nl/~ralf/MapReduce/paper.pdf

• Open Source MapReduce http://lucene.apache.org/hadoop/

• Apache Hadoop in White Papers by Apache, Yahoo
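
The MapReduce programming model named above can be sketched in a few lines of single-process Python. This is only an illustration of the map, shuffle, and reduce phases using the classic word-count task; it does not use Hadoop or any distributed runtime, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "Big index"])
print(counts)
```

In a real cluster the map calls run in parallel on separate input splits and the shuffle moves data across the network; the sketch keeps only the data flow.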

Data Warehouse on Parallel Processing

• Architecture of Parallel Processing

• Flow of Data Processing in Data Warehouse on Parallel Processing Architecture

- Retrieval

- Aggregation

• RDBMS For Integration with HDFS

• Examples:

• Teradata Parallel Architecture

• Oracle Parallel Architecture

Data Warehouse and OLAP for Business Intelligence

• What is Business Intelligence?

Big Trend for Every Business for Current and Next Generation

• Data Analytics on Web Transaction Data of Business

• Microsoft Data Analytic Service System, Multi-Dimensional OLAP

• R, MapR, Python, So Many Other Systems and Tools

Data Analytics / Data Mining

• Data Cleaning, Preprocessing, Transformation

• Data Mining Algorithms

• Machine Learning Algorithms

• Optimization Techniques of Data Analytics Algorithm

Big Data: Web Data Processing

• Problems of Big Data Processing

• Solutions

• Processing Web Transaction Data (Unstructured Data) on Data Warehouse (Structured Data Processing Server)

Real World Examples

• Facebook

• Papers to Cover

• Data Warehousing and Analytics Infrastructure at Facebook by Ashish Thusoo (Facebook), et al., in the proceedings of SIGMOD 2010

• eBay

• Data Warehousing and Analytics at eBay

How to Get Access to Study These Advanced Database Topics for Development

• Major Database Vendors (Microsoft, IBM, Oracle, Teradata) Offer Database Administration Exams and Supporting Materials at 3-4 Levels of Proficiency. Many Companies Accept These Exams as Credentials and Encourage (and Pay for) Their Database Employees to Pass Them

• Graduate-Level Advanced Database Courses Cover the Topics on These Exams: Database Server, Query Processing, Data Processing Optimization, Database Tuning

Advanced Database Topics For Research and Advanced Industry Developments

• To Learn How to Build a Database Server:

• How To Build a Query Processor

• Optimization Techniques of Performance of Data Processing

• Index Strategy Development

• Parallel Data Processing Techniques

• In Memory Database Processing and Optimization

• Columnar Database

• Concurrency Control Techniques

• Database Security

• Graduate-Level Advanced Database Courses and Data Analytics Courses Cover All These Advanced Topics:

• CIS 611 Enterprise Database Systems and Data Warehouse

• CIS 612 Big Data and Parallel Database Processing Systems

• CIS 660 Data Mining