Computing for Bioinformatics
Introduction to databases
• What is a database?
• Database system components
• Data types
• DBMS architectures
• DBMS systems available
• Microsoft Access
What databases are not
• unstructured piles of data (including heaps of web pages)
• spreadsheets such as Excel tables
• text files with neatly tabulated data
• data collected for one kind of analysis only
Why are these things not databases?
Spreadsheets versus databases (1)
• A spreadsheet is typically viewed as an entire table of cells which may contain
– numbers (data)
– text (labels)
– formulae (calculations producing results)
• A database may be structured in various ways, usually so that a small subset of the data is presented as the result of a search
Spreadsheets versus databases (2)
Spreadsheets
• Can be used immediately with little preparation (or thought)
• Data is visible
• Data entry is simple
Databases
• Require planning
• Data is hidden
• May require a program to help you enter or retrieve data
Spreadsheets versus databases (3)
Spreadsheets
• Little checking is carried out
• Tables and graphs can be produced
• Single user
Databases
• Extensive integrity checks can be arranged
• Reports can be programmed
• Searches can be made
• Can be multi-user
• Can be put on the Web with a suitable user interface program
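As a minimal sketch of the "integrity checks" point above, the following uses Python's built-in sqlite3 module. The Specimen table, its columns and the helper try_insert are hypothetical, chosen only for illustration; other DBMSs express similar rules with their own syntax.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Specimen (
        id       INTEGER PRIMARY KEY,
        species  TEXT NOT NULL,                          -- a name is required
        quantity INTEGER NOT NULL CHECK (quantity >= 0)  -- no negative counts
    )
    """
)

def try_insert(species, quantity):
    """Return True if the database accepts the row, False if a check fails."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Specimen (species, quantity) VALUES (?, ?)",
                (species, quantity),
            )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("Quercus robur", 3)
rejected_negative = try_insert("Quercus robur", -1)  # violates CHECK
rejected_missing = try_insert(None, 2)               # violates NOT NULL
stored_rows = conn.execute("SELECT COUNT(*) FROM Specimen").fetchone()[0]
```

The rules live in the table definition, so every program that writes to the table is subject to them; a spreadsheet would normally accept all three entries.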
What a database is
• Data is stored separately from any application programs which might use it
• Multiple uses of the data are envisaged
• Designed for retrieval in various anticipated and unanticipated forms
What are they used for?
Biology:
• data about species
• details of publications
Biodiversity:
• data about biological specimens
• data about areas, places, sampling sites, habitats etc. (sometimes in Geographical Information Systems (GIS))
Bioinformatics:
• results of experiments
• molecular sequences, protein structures
• gene frequencies, gene expression data, etc.
DBMS types (database internal structure)
What are the main types of database design? (The internal mechanics, not the information stored or the appearance of the database as seen by the user.)
• “Free text” - records not divided into fields
• “Flat-file” - records have fields (one table with columns like a spreadsheet), common and easy to understand, often inefficient
• Hierarchical, Network - now obsolete
• Relational - several tables, usually the choice of the professional (solid, boring)
• Object-oriented - for the adventurous (cutting edge)
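One way to see why a flat-file layout is "often inefficient" compared with a relational one is the sketch below, written in plain Python with made-up example records. It is only an illustration of the idea, not how any particular DBMS stores data.

```python
# Flat-file style: one table; every record repeats the species details.
flat_records = [
    {"specimen": "S1", "species": "Quercus robur", "family": "Fagaceae"},
    {"specimen": "S2", "species": "Quercus robur", "family": "Fagaceae"},
    {"specimen": "S3", "species": "Pinus sylvestris", "family": "Pinaceae"},
]

# Relational style: species details are stored once, and each specimen
# refers to its species through a linking field (a key).
species_table = {}
specimen_table = []
for rec in flat_records:
    species_table[rec["species"]] = {"family": rec["family"]}
    specimen_table.append({"specimen": rec["specimen"], "species": rec["species"]})

# The flat view can be rebuilt on demand by following the key.
rebuilt = [
    {**row, "family": species_table[row["species"]]["family"]}
    for row in specimen_table
]
```

With the data split this way, correcting a family name means changing one entry in species_table rather than every specimen record that mentions it.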
Database system components (1)
A database management system (DBMS) has the following essential components:
• Data tables (the data itself)
• Database “engine” (stores data to and retrieves data from the tables)
• User interface (for humans to enter, view and edit data)
Some commercial general-purpose DBMSs, such as Microsoft Access, make the engine and the interface appear as one (although Access can use other engines)
Stand-alone and client-server systems
• Some database systems are integrated (“stand-alone”): the engine and the interface are combined (MS Access)
– the data may also be on the same machine
• “Client-server” database systems put the data tables and the storage engine on a remote “server” computer
– the user accesses the remote database server using a local database client program
Accessing the data in the database
• A user can use a built-in user interface to search, edit, etc. (e.g. in Microsoft Access)
• A user can use a separate or even third-party general-purpose client program, especially in the case of client-server systems such as MySQL, Oracle, etc.
• Such clients often use the SQL language (pronounced either “ess-cue-ell” or “sequel”) as a (fairly) standard way to formulate search requests, data editing instructions, etc.
• Special-purpose client programs may also be written (in Perl, Java, PHP, etc.) to perform such access, using SQL “embedded” in the program
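The last point, a special-purpose program with SQL "embedded" in it, might look like the sketch below. It uses Python and its built-in sqlite3 module rather than the Perl, Java or PHP named above, and an in-memory database instead of a remote server, so that it runs on its own. The Sequence table, its columns, the accession values and the function find_sequences are all hypothetical.

```python
import sqlite3

def find_sequences(conn, organism):
    """Return (accession, seq_length) pairs for one organism, shortest first."""
    # The SQL text is embedded in the program; the "?" placeholder lets the
    # driver pass the search value safely instead of pasting it into the string.
    sql = (
        "SELECT accession, seq_length FROM Sequence "
        "WHERE organism = ? ORDER BY seq_length"
    )
    return conn.execute(sql, (organism,)).fetchall()

# Set up a small throwaway database so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sequence ("
    "accession TEXT PRIMARY KEY, organism TEXT, seq_length INTEGER)"
)
conn.executemany(
    "INSERT INTO Sequence VALUES (?, ?, ?)",
    [
        ("SEQ001", "Homo sapiens", 1200),
        ("SEQ002", "Mus musculus", 950),
        ("SEQ003", "Homo sapiens", 800),
    ],
)

human = find_sequences(conn, "Homo sapiens")
```

With a client-server system, the program would typically change mainly in how the connection is opened; the embedded SQL would stay much the same.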
Database system components (2)
A DBMS is usually also associated with:
• Database “drivers”, import & export modules, etc. (for programs to store, retrieve and alter data)
• Application programs (which use drivers to connect to the database, send SQL commands to it and do useful things, sometimes called “business logic”; may be general-purpose or specialised)
• Report writer (a specialised application program)
• Utilities (ditto, for back-ups, integrity checking, etc.)
Smallest ever guide to SQL
• Database table definition: column names, data types, indexes, etc.
• Data records may be inserted, altered or deleted
• Data retrieval is based on the idea of selecting columns and rows to obtain a subset of a larger stored table, e.g.
– SELECT name, salary FROM Employee WHERE name LIKE 'Smith%';
• Data may be retrieved from two or more tables using “joins” on linking data fields (keys)
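The four points of this guide can be tried out with the sketch below, which runs the SQL through Python's built-in sqlite3 module. The Employee table with its name and salary columns comes from the example above; the emp_id and dept_id columns, the Department table, the index name and all the data values are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 1. Table definition: column names, data types, an index.
cur.execute(
    "CREATE TABLE Department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)
cur.execute(
    """
    CREATE TABLE Employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        salary  INTEGER,
        dept_id INTEGER REFERENCES Department(dept_id)
    )
    """
)
cur.execute("CREATE INDEX idx_employee_name ON Employee(name)")

# 2. Records may be inserted, altered or deleted.
cur.executemany(
    "INSERT INTO Department VALUES (?, ?)",
    [(1, "Genomics"), (2, "Proteomics")],
)
cur.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [
        (1, "Smith, A", 30000, 1),
        (2, "Smithson, B", 32000, 2),
        (3, "Jones, C", 28000, 1),
    ],
)
cur.execute("UPDATE Employee SET salary = 31000 WHERE emp_id = 1")
cur.execute("DELETE FROM Employee WHERE emp_id = 3")

# 3. Retrieval: choose columns (name, salary) and rows (names starting "Smith").
smiths = cur.execute(
    "SELECT name, salary FROM Employee WHERE name LIKE 'Smith%' ORDER BY name"
).fetchall()

# 4. Join: combine the two tables on the linking field dept_id.
joined = cur.execute(
    """
    SELECT e.name, d.dept_name
    FROM Employee AS e
    JOIN Department AS d ON e.dept_id = d.dept_id
    ORDER BY e.name
    """
).fetchall()
```

The dept_id column is the key: it is stored in both tables, and the join matches rows whose values agree.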