Topics in Informatics
Spring 2005, SJC

About the Instructor
• Instructor: Dr. Hong Zhou
• Office: McDonough 317
• Office Hours: MWF 10:00 – 11:00am
• Email: [email protected], Phone: 231-5826
• Syllabus
• You can call me any of:
– Hong
– Dr. Hong
– Dr. Zhou

What is Informatics?
• Searching for ‘What is informatics’ at http://www.google.com gives different definitions.
• Basically, it is the study and application of the knowledge and skills of data/information flow and manipulation (including storage, retrieval, analysis, and construction/derivation).

Informatics
• Data acquisition
• Data flow and control
• Data representation (records) and storage
• Data retrieval/mining
• Data analysis
• Data derivation (generating new data from existing data via analysis)

What Will You Learn
• Obtaining reliable data.
• Data management (data storage and representation, retrieval): databases.
• Introduction to Bioinformatics.
• Introduction to Health Informatics.

Part I: Obtaining Reliable Data
• Complex and precise communication is something that distinguishes humans from other species.
• The development of the world is, in a sense, the development of our understanding, i.e. our information about the universe, including our social systems.
• Information and its uses are at the center of such development.

Information vs Data
• What comes to mind when we talk about “INFORMATION”?
• Is information touchable or visible?
• To my understanding, data is the description of information, and information is the interpretation of data.
• So, in this class, let’s deal with the description: data.

What is Data?
• When we talk about data, the first image in our mind might be numbers such as 5, 87, 98.34, etc.
• However, are the numbers 5, 87, 98.34 meaningful/informative?

Data with Context
• Pure numbers are meaningless to us.
• Numbers with context, however, are meaningful.
• For example, 5 pounds of sugar.
• So, in this class, we talk about meaningful data and ignore all meaningless data. (Are we meaningful persons?)

Quick Questions
Are the following ‘data’ meaningful?
• 20
• 20 years
• A 20-year-old girl
• A 20-year-old girl named Amie
• A 20-year-old girl named Amie who is an SJC student.

Data Target
• Data is used to describe a subject.
• For example, age, height, weight, gender, and profession are descriptions of a person.
• A medical record is a description of a patient.

Quick Question
What are the targets of the following two rows of data?
Ford, V6, White, 4 door, Sedan
Amie, Female, 9/1986, CSC, SJC

What is RELIABLE?
• When we talk about reliable data, what does that mean?
• Let’s discuss this issue at two levels:
– Individual level
– Group/population level (statistics)

Individual Level
• Reliable data means that the data is ‘closely’ related to the individual (or event) and ‘precisely’ describes the individual (or event).
• For example: a computer with a 3.2 GHz CPU, 512 MB RAM, 512 KB cache, etc.

Group/Population Level
• ‘Reliable’ is more meaningful at the group level.
• Can a specific medical diagnosis of one patient be representative of all patients with the same symptom?
• Probably not.

Statistical Thinking
• One powerful approach to analyzing data is statistics.
• We measure the reliability (significance) of data in the statistical sense.
• Statistical thinking means using data to build our understanding, gain insights, and draw conclusions or make inferences.
• It does not mean drawing a conclusion from a single incident.

Principles in Statistical Thinking
• Count on data instead of an incident
• Where the data comes from matters
• Lurking variables
• Variation is everywhere
• Conclusions are not absolutely certain

Count on a large amount of data instead of a few incidents
• Famous fortune teller
• The thumb of a monk

Where data is from
• Group data can be collected from surveys or observations, or obtained from experiments.
• When collecting data, where the data comes from is important. For example, a question was once asked: “If you had it to do over again, would you have children?” 70% of the written responses were NO. Is this piece of information reliable?

Lurking Variables
• Does music practice improve test scores?
• What is behind it?

The Importance of RANDOM
• The key factor in data collection is the RANDOM concept, i.e. the data has to be collected randomly, with no bias.
• Suppose that you are surveying 10000 people in the USA to predict the 2004 election. How are you going to pick the 10000 persons? Only in schools? Only in New York? Only women? Avoid as much bias as you can.
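
One way to picture the RANDOM idea is a simple random sample, in which every person in the sampling frame has the same chance of being picked. The sketch below is only an illustration: the `population` list of voter IDs and the function name `draw_simple_random_sample` are hypothetical, and a real survey would also need a frame that actually covers the whole country.

```python
import random


def draw_simple_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(population, sample_size)


# Hypothetical sampling frame: one ID per eligible person.
population = [f"voter-{i}" for i in range(200_000)]

sample = draw_simple_random_sample(population, 10_000, seed=2004)
print(len(sample), "people drawn;", len(set(sample)), "are distinct")
```

Picking only in schools, only in New York, or only women would correspond to shrinking `population` before the draw, which is exactly where the bias would enter.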

Experiments
• Some reliable data can only be produced by experiments, especially in science.
• For example, in biology, to pin down the function of a gene, you have to knock out the gene or suppress it and check for phenotype changes. After that, you have to restore the gene and verify that the phenotype also recovers. Such experiments are very convincing, but expensive.

Another Experiment
• It was once believed that women who take hormones after menopause reduce their risk of heart attack. The belief resulted from studies that simply compared women who were taking hormones with others who were not. Are such study results reliable?
• Such studies lack proper controls, which are essential in all experiments.
• How are you going to design an experiment for this study?

Reliable Data cont’d
• Obtaining reliable data is not a simple task; it requires extensive consideration and design.
• Some experimental results may look convincing at one time, but may lose their reliability over time or when the environment changes. For example, the third stop light of cars.

Discussion
• Is absence of evidence evidence of absence?

Project 1
• Write a paragraph discussing the claim “Absence of evidence is evidence of absence”. Please make your own judgment, as the grading is based on your argument.
• Design a simple survey to collect opinions about ending the death penalty. Be aware of the importance of “RANDOM”. Write a short paragraph arguing that the data collected by your survey is reliable.
• Points: 100.
• Due Date: Feb 1st, 2005.
• Submit your work in the digital drop box in Blackboard.

Part II: Data Storage
• Can all information be recorded as data? Let’s start the discussion.
– Feeling
– Knowledge
– Intelligence

Personal Ideas
• My understanding: Yes, but some of them are too complicated or too difficult to express precisely.
• And that is why we have IQ tests, MQ (motivational quotient), EQ, etc.

Where to store
• Data is stored somewhere.
– Minds
– Books (paper documents)
– Computers
– Etc.
• Let’s compare the three storage methods. Which one do you think is more lasting or appropriate?

Passing Words
• In ancient times, knowledge was passed on by word of mouth, generation by generation.
• Here is a story about passing by words:
– The general called the captain and told him: “Tonight at 7:00pm, Halley’s comet will pass your camp in the sky. Organize your soldiers to watch.”
– The captain informed his lieutenant: “Tonight at 7:00pm, Halley’s comet will pass our camp in the sky and the general is coming to watch with our soldiers.”
– The lieutenant informed the sergeant: “Tonight at 7:00pm, the general will accompany Halley’s comet passing over our camp. Organize the soldiers.”
– The sergeant told the soldiers: “Tonight at 7:00pm, General Halley will pass over our camp in the sky and we are going to watch that.”

Data Storage
• Paper storage:
– Size and cost
– Transportation
• Computer:
– Legal effect of signatures
– Hacking
– What if computers are down?
• However, if data is not organized, it is difficult to make use of. So, a data storage strategy is important.
• In this class, we talk about data storage using computer technology.

Ways to store
• Data storage is a big issue, and probably the largest one, in computer data manipulation.
• Different database structures, different database management systems, online storage, etc.

Chapter 1. File structure
• Hierarchical structure
• Easy to deal with hierarchical relationships.
• For example, the administration is a hierarchical structure.
• Let me use the DOD/NIMA VPF structure as an example
Diagram (levels from top to bottom):
root
Folder 1, Folder 2, files
subfolders, files

VPF Structure
• DOD (Department of Defense) and NIMA (National Imagery and Mapping Agency) sponsored the development of VPF (Vector Product Format). Nickname: very poor format.
• It is used to store ground information about the earth and provide a digital map.

VPF structure
Diagram (levels from top to bottom):
Database
Library, Library
Coverage, Coverage, Coverage
File1, File1, File1, File1, File1

Navigation in Hierarchical Structure
What is the purpose of an index?

Project 2
• Create a hierarchical file structure to store some of your work at SJC.
• This is the way I prefer: organize your work based on the classes you take.
• If you have other ways, that is OK as long as they are well organized.
• Show me in class what you have done.
• Points: 100
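
For anyone who would rather script the folder layout than click it together, here is a minimal sketch of a class-based hierarchy. The course and subfolder names are hypothetical, and the tree is built in a temporary directory so that running the example leaves nothing behind.

```python
import tempfile
from pathlib import Path


def build_course_tree(root, courses):
    """Create root/<course>/<subfolder> for every entry in courses."""
    root = Path(root)
    for course, subfolders in courses.items():
        for subfolder in subfolders:
            (root / course / subfolder).mkdir(parents=True, exist_ok=True)


def list_tree(root):
    """Return every entry below root as a sorted list of relative paths."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


# Hypothetical course and folder names.
courses = {
    "CSC-Informatics": ["projects", "notes"],
    "ENG100": ["essays"],
}

with tempfile.TemporaryDirectory() as tmp:
    build_course_tree(tmp, courses)
    listing = list_tree(tmp)

for entry in listing:
    print(entry)
```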

Chapter 2: XML
• Extensible Markup Language
• Purpose:
– Data transportation
– Data representation
– Data storage
• Why should we talk about it here? Because the data inside an XML file is hierarchical.

What Does XML Promise?
• Data portability
• The programming language Java promises the portability of programs.
• However, programs work on data. Before XML, data was not portable, and communication among systems and agencies was extremely difficult.
• XML allows systems to communicate using a standard means of data representation.

HTML?
• HTML is the portable language for browsers.
• It is a standard.
• However, it governs how information is displayed in a browser, with defined formats and defined tags.

The Difficulties XML Faces
• XML has some defined formats
• But it doesn’t have defined tags.
• User-defined tags
• Unlimited types of data.

Solution (Partially)
• Make the information self-explanatory.
• You have to invent your own tags!

A Simple Example
<person>
  <lastname>Fonship</lastname>
  <firstname>Michele</firstname>
  <gender>female</gender>
  <education type="elementary">
    <start-date>9/1980</start-date>
    <stop-date>5/1985</stop-date>
    <school>Badley school</school>
  </education>
</person>
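
Since XML only describes data, some program still has to read it. As a rough sketch of what that can look like, the snippet below parses the person record with Python's standard `xml.etree.ElementTree` module; the `summary` dictionary and its keys are just an illustrative choice, not part of the course material.

```python
import xml.etree.ElementTree as ET

PERSON_XML = """
<person>
  <lastname>Fonship</lastname>
  <firstname>Michele</firstname>
  <gender>female</gender>
  <education type="elementary">
    <start-date>9/1980</start-date>
    <stop-date>5/1985</stop-date>
    <school>Badley school</school>
  </education>
</person>
"""

person = ET.fromstring(PERSON_XML.strip())  # <person> is the root element
education = person.find("education")        # a nested child element

summary = {
    "name": f"{person.findtext('firstname')} {person.findtext('lastname')}",
    "level": education.get("type"),         # attribute value
    "school": education.findtext("school"),
    "from": education.findtext("start-date"),
    "to": education.findtext("stop-date"),
}
print(summary)
```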

Tips about XML format
• Tags are case sensitive
• A starting tag must have a matching closing tag
• All XML elements must be properly nested.
• All XML documents must have a root element.
• Attribute values must always be quoted.
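
These rules are what a parser checks before it will read a document at all. The sketch below, which assumes Python's standard `xml.etree.ElementTree` parser and uses made-up `<note>` snippets, feeds it one well-formed document and one violation of each rule above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets, one per rule.
checks = {
    "well formed": '<note id="1"><to>Amie</to></note>',
    "tag case mismatch": "<Note><to>Amie</to></note>",
    "missing closing tag": "<note><to>Amie</note>",
    "improper nesting": "<note><to><b>Amie</to></b></note>",
    "two root elements": "<note>a</note><note>b</note>",
    "unquoted attribute": "<note id=1><to>Amie</to></note>",
}

results = {label: is_well_formed(text) for label, text in checks.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```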

Comments in XML
• The syntax for writing comments in XML is similar to that of HTML.
• <!-- This is a comment -->
• A sample XML file.

XML Element Naming
• Names can contain letters, numbers, and other characters
• Names must not start with a number or punctuation character
• Names must not start with the letters xml (or XML or Xml, etc.)
• Names cannot contain spaces.

Is it valid or not?
<students>
  <one student>
    <first name>Rose</first name>
    <last name>Washington</last name>
  </one student>
</students>

Element Content
• An XML element is everything from (and including) the element's start tag to (and including) the element's end tag.
• An element can have element content, mixed content, simple content, or empty content. An element can also have attributes.

Is this valid?
<food>
  <vegetable></vegetable>
  <fruit>apple</fruit>
</food>

Child Elements vs. Attributes
<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>

<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
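
Both records carry the same information, but a program has to look in a different place for it. The following sketch, using Python's standard `xml.etree.ElementTree`, shows the two lookups side by side; the helper name `read_sex` is made up for this example.

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = (
    '<person sex="female">'
    "<firstname>Anna</firstname><lastname>Smith</lastname>"
    "</person>"
)
AS_CHILD = (
    "<person>"
    "<sex>female</sex><firstname>Anna</firstname><lastname>Smith</lastname>"
    "</person>"
)


def read_sex(person):
    """Return the value whether it is stored as an attribute or a child element."""
    value = person.get("sex")           # attribute lookup
    if value is None:
        value = person.findtext("sex")  # child element lookup
    return value


sex_from_attribute = read_sex(ET.fromstring(AS_ATTRIBUTE))
sex_from_child = read_sex(ET.fromstring(AS_CHILD))
print(sex_from_attribute, sex_from_child)
```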

Disadvantages of Attributes
• Attributes cannot contain multiple values (child elements can)
• Attributes are not easily expandable (for future changes)
• Attributes cannot describe structures (child elements can)
• Attributes are more difficult to manipulate by program code
• Attribute values are not easy to test against a Document Type Definition (DTD), which is used to define the legal elements of an XML document

Using Child Elements?
• So, it is a good idea to use child elements rather than attributes.
• Check this out. Tell which way you prefer.
• Can this file work? What is wrong?

A case for Attribute
• What is metadata? Data about data. For example, your SJC student ID is metadata about you, since it does not describe you.
<publisher id="p1">
  <name>O’Reilly</name>
  <address>somewhere</address>
</publisher>

Is this valid?
<teacher>
  <course>Eng100</course>
  <course id=5>Math100</course>
  <office Hour>2:00-3:00pm<office Hour>
  <office>McDonough Hall 211 </Office>
</teacher>
What are the errors?

More about XML
• Now we have the so-called “XML database”, whose basic element is the XML document. It is not very successful yet.
• Remember that XML does not really do anything except describe data.
• We have to interpret whatever it is describing. In terms of computer software, the user has to develop software to interpret it.
• What are DTD and XML Schema?
• What are the disadvantages of XML? Please discuss.

Analyze the XML file
• Example XML file
• Let’s discuss the weaknesses of this file.
• What do you suggest?
• What do you think about my solution?

In-class exercise
• Given the data shown in the Access database, can we store the same data in XML format? Please try it in class. Thanks.
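
One possible way to approach the exercise programmatically is sketched below. The Access table itself is not reproduced in these notes, so the `columns` and `rows` here are hypothetical stand-ins, and the tag names `students` and `student` are an arbitrary choice.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows standing in for the Access table shown in class.
columns = ["studentID", "firstname", "lastname"]
rows = [
    (1, "Amie", "Smith"),
    (2, "Rose", "Washington"),
]


def rows_to_xml(root_tag, record_tag, columns, rows):
    """Build one record element per row and one child element per column."""
    root = ET.Element(root_tag)
    for row in rows:
        record = ET.SubElement(root, record_tag)
        for column, value in zip(columns, row):
            ET.SubElement(record, column).text = str(value)
    return root


students = rows_to_xml("students", "student", columns, rows)
xml_text = ET.tostring(students, encoding="unicode")
print(xml_text)
```

Each table row becomes one element under a single root, which mirrors the "one root element" rule from the XML tips.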

Useful Sites about XML
• http://www.w3schools.com/xml/
• http://www.xml.org

XML in Use?
• BBC topic news is also available online via XML. Example.
• XML at work.
• XML in commerce?
• What are GML and SGML?

Project 3
• Here are the requirements, which are also available in Blackboard.
• Discussion: will XML really be the standard for data transportation or data storage?

Part 3: Database
• It is listed as Part 3 rather than Chapter 3, which shows that this is a big issue.

Chapter 1: Database History
• Hierarchical database
• Network database
• Relational database
• Object-oriented database
• Object-oriented relational database
• XML database
• etc.

Relational Database
• The major type of database in use.
• Based on the relations between data items.
• Key element: tables.
• Available relational databases: Oracle, DB2, Sybase, MS SQL Server, Access, MySQL, etc.
• A site about evaluation.
• The instructor’s database work.

Records and Attributes
• A table has multiple records, and each record has multiple values.
• For example.
• The attributes define the data types. All data in a column must conform to that column’s data type.

Primary Key
• The primary key of a relational table uniquely identifies each record in the table. It can either be a normal attribute that is guaranteed to be unique (such as Social Security Number in a table with no more than one record per person) or it can be generated by the DBMS (such as a globally unique identifier, or GUID, in Microsoft SQL Server). Primary keys may consist of a single attribute or multiple attributes in combination.
• For example, in the table example, the primary key is “Student#”.
• Every table must have a primary key defined.
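
To see the uniqueness guarantee in action, here is a small sketch using Python's built-in `sqlite3` module rather than any of the database products named earlier. The `students` table and its column names are hypothetical; `student_no` plays the role of “Student#”.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    """
    CREATE TABLE students (
        student_no INTEGER PRIMARY KEY,
        lastname   TEXT NOT NULL,
        firstname  TEXT NOT NULL
    )
    """
)

# Two students may share a name as long as their keys differ.
conn.execute("INSERT INTO students VALUES (1, 'Smith', 'Anna')")
conn.execute("INSERT INTO students VALUES (2, 'Smith', 'Anna')")

try:
    # Reusing key 1 violates the primary key, so the DBMS refuses the row.
    conn.execute("INSERT INTO students VALUES (1, 'Washington', 'Rose')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(duplicate_rejected, row_count)
```

The two identical names being accepted also hints at why a name alone would make a poor primary key.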

Primary Key (2)
• Guess what the primary key would be in the SJC database for students.
• Would it be OK to use your name (last name and first name) as the primary key?

Create a table for …
• Smith, Jack, male, 8/15/1989, 421865241, Forrest, Shoplifting, Linda Luke, (860)321-9086, 105.
• Marsa, Rose, female, 7/1/1988, 3245691877, Jones, Dog fighting, Nancy Charles, (860)321-9088, 106.
• Lese, Sam, male, 3/21/1986, 425423785, Hartford, Dwell breaking, Linda Luke, (860)321-9086, 105.
• Haly, Rachel, female, 3/25/1989, 423671841, Hartford, misconduct, Linda Luke, (860)321-9086, 105.
• Horse, James, male, 11/2/1987, 765213456, Lama, misconduct, Nancy Charles, (860)321-9088, 106.
• Lincoln, George, male, 10/5/1988, 324342342, Jones, fighting, Linda Luke, (860)321-9086, 105.
• Doom, Jade, female, 9/9/1988, 423213495, Hartford, misconduct, Nancy Charles, (860)321-9088, 106.

Tables
• We will surely deal with multiple database tables for any complete dataset.
• When dealing with complicated datasets, the first thing is to categorize the data into groups, with each group represented by a table.
• The second thing is to find and build the relationships between the tables.

Analyze the data
• How many categories do we have?
• Let’s use UML to clarify the data relationships!
• UML is the Unified Modeling Language, which arose in the 1990s. It was derived from the work of three of the greatest minds in system modeling.
• It is the standard language used to analyze system design.

Practice
• The UML diagram
• What tables would you construct for the data in the XML file?
• Do this exercise in class.

Relationship
• Now let’s talk about the relationship types:
– One-to-one
– One-to-many vs many-to-one
– Many-to-many

One-One
• SSN – Person
Diagram: SSN (1) to Person (1)

One-Many
• Bank accounts – person (one person can have multiple accounts, but one account belongs to one person/family).
Diagram: Bank account (*) to person (1)

Many-Many
• Course-Student. A student may take multiple courses, and a course may be taken by multiple students.

Foreign Key
• A foreign key is a relationship or link between two tables which ensures that the data stored in a database is consistent.
• The foreign key link is set up by matching columns in one table (the child) to the primary key columns in another table (the parent).
• Referential integrity
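
Referential integrity can be watched at work in a few lines. The sketch below uses Python's built-in `sqlite3` module as a stand-in DBMS; the `students` (parent) and `players` (child) tables and their columns are hypothetical simplifications. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, lastname TEXT)")
conn.execute(
    """
    CREATE TABLE players (
        player_id  INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL UNIQUE REFERENCES students(student_id),
        position   INTEGER
    )
    """
)

conn.execute("INSERT INTO students VALUES (1, 'Smith')")
conn.execute("INSERT INTO players VALUES (10, 1, 23)")  # parent row exists

try:
    conn.execute("INSERT INTO players VALUES (11, 99, 7)")  # there is no student 99
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

try:
    conn.execute("DELETE FROM students WHERE student_id = 1")  # a player still points here
    parent_delete_rejected = False
except sqlite3.IntegrityError:
    parent_delete_rejected = True

print(orphan_rejected, parent_delete_rejected)
```

The `UNIQUE` constraint on the child's `student_id` is what keeps this particular link one-to-one; without it the same tables would describe a one-to-many relationship.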

Foreign Key Example 1
Table Students (parent): studentID (PK), First name, Last name, Major, SSN, Gender, DOB
Table Basketball players (child): playerID (PK), First name, Last name, Position number
Relationship: One-to-one

Foreign Key Example 2
• Given a table about instructors whose columns are ID, first name and last name.
• Suppose the basic information of an offered course is the instructor and the course name.

Cont’d
Instructor table: ID, First name, Last name
Course table: Course name, Instructor
Relationship: instructor (1) to courses (*), One-Many
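
A rough sketch of how this instructor/course pair might look as real tables follows, again using Python's built-in `sqlite3`. The table names, the `instructor_id` column, and the sample rows are assumptions made for illustration; the point is that the "many" side (courses) stores the key of the "one" side (instructors).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE instructors (
        id        INTEGER PRIMARY KEY,
        firstname TEXT,
        lastname  TEXT
    );
    CREATE TABLE courses (
        course_name   TEXT PRIMARY KEY,
        instructor_id INTEGER NOT NULL REFERENCES instructors(id)
    );
    -- Hypothetical sample rows.
    INSERT INTO instructors VALUES (1, 'Linda', 'Luke'), (2, 'Nancy', 'Charles');
    INSERT INTO courses VALUES ('Eng100', 1), ('Math100', 1), ('CSC101', 2);
    """
)

query = """
    SELECT i.lastname, c.course_name
    FROM instructors AS i
    JOIN courses AS c ON c.instructor_id = i.id
    ORDER BY i.lastname, c.course_name
"""

# Group the joined rows: one instructor, possibly many courses.
courses_by_instructor = {}
for lastname, course_name in conn.execute(query):
    courses_by_instructor.setdefault(lastname, []).append(course_name)

print(courses_by_instructor)
```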

Cont’d
• Look at this example.

Exercise in Depth
• UML diagram of the exercise.
• Now, how do we define tables that properly represent the UML diagram?

Common Rules
• One object (entity), one table
• One attribute, one column
• Additional PK: optional in some cases.

How to define Relations between Tables?
• First of all, we have to know that parents come before children. Tables that can be built without referencing other tables/data can be used as parent tables.
• For example, the student table vs the basketball player table.

Relations cont’d
• In the case of a One-One relation, the parent table is the table that can be built without referencing any data in the child table. The child table must be the table that references data in the parent table.

Example
Student table: studentID, First name, Last name, Major, SSN, Gender, DOB
Basketball player table: playerID, First name, Last name, Position number

Relation cont’d
• In the case of One-Many, the One must be the parent table, and the Many must be the child table.
![Page 86: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/86.jpg)
Example
Book table (child): BookID, Title, Author, Publish year, PublisherID
Publisher table (parent): PublisherID, Name, Address
Relation from Book to Publisher: Many-One
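The Book and Publisher example can be tried out in code. What follows is only a minimal sketch, assuming Python's built-in sqlite3 module instead of Access; the snake_case column names, the column types and the sample rows are made up for illustration. It shows the parent table (publisher) being built first and the child table (book) pointing back to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Parent table: can be built without referencing any other table.
CREATE TABLE publisher (
    publisher_id INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    address      TEXT
);
-- Child table: every book must point at an existing publisher.
CREATE TABLE book (
    book_id      INTEGER PRIMARY KEY,
    title        TEXT NOT NULL,
    author       TEXT,
    publish_year INTEGER,
    publisher_id INTEGER NOT NULL REFERENCES publisher(publisher_id)
);
""")

# Hypothetical sample rows: one publisher, many books.
conn.execute("INSERT INTO publisher VALUES (1, 'Example Press', '1 Main St')")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?, ?, ?)",
    [(1, "Book One", "A. Author", 2001, 1),
     (2, "Book Two", "B. Author", 2003, 1)],
)

# A book whose publisher does not exist is rejected by the database.
try:
    conn.execute("INSERT INTO book VALUES (3, 'Orphan', 'C. Author', 2004, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

book_count = len(conn.execute("SELECT * FROM book").fetchall())
print(orphan_rejected, book_count)
```

In Access the same relation would normally be drawn in the Relationships window; the idea of "parent first, child references parent" is the same.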
![Page 87: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/87.jpg)
Many-Many
• It is pretty hard to express a Many-Many relation directly between two tables.
• For example, the relationship between students and courses.
• How are we going to do it?
![Page 88: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/88.jpg)
Solution
• Make use of another table! In this case, we have three tables: one for students only, one for courses only, and one to link students with courses.
![Page 89: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/89.jpg)
Solution Example
Student table: studentID, lastname, firstname, DOB, gender
Course table: Course#, Title, Location
Linking table: StudentID, Course#
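The three-table solution can be sketched as follows. This is an illustration under assumptions, not the class's Access database: it uses Python's sqlite3 module, the names student_course and course_num are invented (Course# is not a convenient column name in SQL), and the rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    lastname   TEXT,
    firstname  TEXT
);
CREATE TABLE course (
    course_num TEXT PRIMARY KEY,
    title      TEXT,
    location   TEXT
);
-- Linking table: one row per (student, course) pair.
CREATE TABLE student_course (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_num TEXT    NOT NULL REFERENCES course(course_num),
    PRIMARY KEY (student_id, course_num)
);
INSERT INTO student VALUES (1, 'Smith', 'Jack'), (2, 'Marsa', 'Rose');
INSERT INTO course VALUES ('Comp200', 'Informatics', 'Room A'),
                          ('Comp210', 'Databases', 'Room B');
INSERT INTO student_course VALUES (1, 'Comp200'), (1, 'Comp210'), (2, 'Comp200');
""")

# One student can take many courses ...
courses_of_student_1 = [row[0] for row in conn.execute(
    "SELECT course_num FROM student_course "
    "WHERE student_id = 1 ORDER BY course_num")]
# ... and one course can have many students.
students_in_comp200 = [row[0] for row in conn.execute(
    "SELECT student_id FROM student_course "
    "WHERE course_num = 'Comp200' ORDER BY student_id")]
print(courses_of_student_1, students_in_comp200)
```

The composite primary key on the linking table is a design choice of this sketch: it stops the same student from being enrolled in the same course twice.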
![Page 90: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/90.jpg)
The full table construction
• Let’s work on this data to build the complete set of tables!
• Now, let’s do Project 4!
![Page 91: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/91.jpg)
Sword, a real application
• Publicly available information about Sword.
• A success story of data representation, storage and management in Mississippi.
• Please form 2 or 3 groups for the coming projects, since they are fairly complicated. Inform me of the group members in the next class. Thanks.
![Page 92: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/92.jpg)
Discussion of Sword in Class
• The Sword data scenario
![Page 93: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/93.jpg)
Chapter 2 Access Basics I
• Please form 2 or 3 groups for the coming projects, since they are fairly complicated. Inform me of the group members in the next class. Thanks.
• Every student is supposed to collect at least 2 restaurant menus from the Hartford area. Keep them for later use.
![Page 94: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/94.jpg)
Basics (1)
• Open and save an Access database.
• Create a table in Design View.
• To create good tables, we need to understand our data first. Let’s have a look at the existing data on the next slide.
![Page 95: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/95.jpg)
Try
• Create a table to hold the information below.
• Smith, Jack, male, 8/15/1989, 421865241, Forrest, Shoplifting, Linda Luke, (860)321-9086, 105.
• Marsa, Rose, female, 7/1/1988, 3245691877, Jones, Dog fighting, Nancy Charles, (860)321-9088, 106.
• Lese, Sam, male, 3/21/1986, 425423785, Hartford, Dwell breaking, Linda Luke, (860)321-9086, 105.
• Haly, Rachel, female, 3/25/1989, 423671841, Hartford, misconduct, Linda Luke, (860)321-9086, 105.
• Horse, James, male, 11/2/1987, 765213456, Lama, misconduct, Nancy Charles, (860)321-9088, 106.
• Lincoln, George, male, 10/5/1988, 324342342, Jones, fighting, Linda Luke, (860)321-9086, 105.
• Doom, Jade, female, 9/9/1988, 423213495, Hartford, misconduct, Nancy Charles, (860)321-9088, 106.
![Page 96: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/96.jpg)
Try cont’d
• First, choose the primary key!
• Continue building one table for all the data.
• When done, save the work and give the table a sensible name.
![Page 97: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/97.jpg)
Create Table: wizard
• Let’s explore the table creation function of Access: we can create a table with the Wizard, i.e. from templates.
![Page 98: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/98.jpg)
Create Multiple Tables
• Based on the UML diagram of the data, let’s create multiple tables.
UML diagram: Student (Name, DOB, Gender, Major); Course (Course #, Location); Instructor (Name, Office, Gender). Student to Course is N : M; Course to Instructor is N : 1.
![Page 99: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/99.jpg)
Normalization
• Normalization in a database means removing redundant data to improve data storage efficiency, data integrity and scalability.
• It is essential.
• Good online explanation
![Page 100: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/100.jpg)
3 Level Normalization
• The first level of normalization removes redundant data horizontally, i.e. no repeated columns; each column holds a single value.
• The second level of normalization removes redundant data vertically, i.e. data repeated across many rows is moved into its own table, so that every non-key column depends on the whole primary key.
• The third level of normalization moves data that does not depend directly on the primary key into another table.
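To see what removing vertically repeated data looks like, here is a minimal sketch using three of the rows from the "Try" slide and Python's sqlite3 module. The slide does not say what the last three fields of each record mean, so the names contact, phone and room are assumptions, as are the table names person and contact.

```python
import sqlite3

# (lastname, firstname, contact name, contact phone, room) taken from the
# "Try" records; the meaning of the last three fields is an assumption.
flat_rows = [
    ("Smith", "Jack", "Linda Luke", "(860)321-9086", 105),
    ("Marsa", "Rose", "Nancy Charles", "(860)321-9088", 106),
    ("Lese", "Sam", "Linda Luke", "(860)321-9086", 105),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact (
    contact_id INTEGER PRIMARY KEY,
    name       TEXT UNIQUE,
    phone      TEXT,
    room       INTEGER
);
CREATE TABLE person (
    person_id  INTEGER PRIMARY KEY,
    lastname   TEXT,
    firstname  TEXT,
    contact_id INTEGER NOT NULL REFERENCES contact(contact_id)
);
""")

for lastname, firstname, c_name, c_phone, room in flat_rows:
    # Store each contact once; repeated contacts are silently skipped.
    conn.execute(
        "INSERT OR IGNORE INTO contact (name, phone, room) VALUES (?, ?, ?)",
        (c_name, c_phone, room),
    )
    (contact_id,) = conn.execute(
        "SELECT contact_id FROM contact WHERE name = ?", (c_name,)
    ).fetchone()
    conn.execute(
        "INSERT INTO person (lastname, firstname, contact_id) VALUES (?, ?, ?)",
        (lastname, firstname, contact_id),
    )

contact_count = len(conn.execute("SELECT * FROM contact").fetchall())
person_count = len(conn.execute("SELECT * FROM person").fetchall())
print(contact_count, person_count)
```

The phone number and room are now stored once per contact instead of once per person, so a changed phone number needs only one update.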
![Page 101: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/101.jpg)
Normalization
• In total there are 5 levels of normalization.
• It is absolutely necessary to apply the 1st and 2nd levels of normalization.
• The 3rd level is applied sometimes.
• Don’t bother with the 4th or 5th levels of normalization.
![Page 102: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/102.jpg)
Exercise
• What is the normalization level of the database constructed?
UML diagram: Student (Name, DOB, Gender, Major); Course (Course #, Location); Instructor (Name, Office, Gender). Student to Course is N : M; Course to Instructor is N : 1.
![Page 103: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/103.jpg)
Basics (2): Simple Query
• Based on the constructed table, let’s have some fun with queries.
• Queries are written in a language called SQL (Structured Query Language).
• SQL is a standard interactive and programming language for getting information from and updating a database.
• Click here to learn more.
![Page 104: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/104.jpg)
![Page 105: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/105.jpg)
Create Query
• Create a query in Design View
• Create a query by using the wizard
• View the result sheet.
![Page 106: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/106.jpg)
Query Syntax
• Though we now know how to create simple queries graphically, we still need to understand the syntax.
• SELECT something FROM somewhere.
Select * from classes;
Select ID from classes;
Select classes.ID, lastname from classes;
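The three SELECT statements above can be run outside Access as well. The sketch below assumes Python's sqlite3 module and a tiny classes table with just the two columns the slide mentions; the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE classes (ID INTEGER PRIMARY KEY, lastname TEXT);
INSERT INTO classes VALUES (1, 'Smith'), (2, 'Marsa');
""")

# * means "all columns".
all_columns = conn.execute("SELECT * FROM classes").fetchall()
# Name a column to get only that column.
ids_only = conn.execute("SELECT ID FROM classes").fetchall()
# A column may be prefixed with its table name (useful once several tables are involved).
qualified = conn.execute("SELECT classes.ID, lastname FROM classes").fetchall()

print(all_columns)
print(ids_only)
print(qualified)
```

Each result is a list of tuples, one tuple per row, with one element per selected column.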
![Page 107: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/107.jpg)
Set Conditions
• SELECT something FROM somewhere WHERE conditions-are-met
– Select * from students where gender=0;
– Select * from students where lastname='Smith';
– Select * from students where DOB between #1/1/1988# and #1/1/1990#;
![Page 108: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/108.jpg)
Set Conditions
• Select * from students where lastname like 'Smi*';
• Select * from students where lastname like "*smi*";
• SELECT * FROM students WHERE gender=0 AND lastname like "*smi*";
• Be aware that standard SQL uses % instead of * as the wildcard: LIKE '%smi%';
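The WHERE conditions from the two slides above can be tried with the following sketch. It assumes Python's sqlite3 module, which follows standard SQL here: the wildcard is % rather than Access's *, and dates are written as ISO text instead of #...# literals. The students table, the 0/1 coding of gender and the rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (
    studentID INTEGER PRIMARY KEY,
    lastname  TEXT,
    gender    INTEGER,
    DOB       TEXT      -- ISO dates (YYYY-MM-DD) compare correctly as text
);
INSERT INTO students VALUES
    (1, 'Smith',     0, '1989-08-15'),
    (2, 'Marsa',     1, '1988-07-01'),
    (3, 'Lese',      0, '1986-03-21'),
    (4, 'Goldsmith', 1, '1989-03-25');
""")

def ids(where_clause):
    """Return the sorted studentIDs of the rows matching a WHERE clause."""
    sql = "SELECT studentID FROM students WHERE " + where_clause
    return sorted(row[0] for row in conn.execute(sql))

by_gender = ids("gender = 0")
by_name = ids("lastname = 'Smith'")
by_dob = ids("DOB BETWEEN '1988-01-01' AND '1990-01-01'")
starts_with = ids("lastname LIKE 'Smi%'")    # begins with Smi
contains = ids("lastname LIKE '%smi%'")      # smi anywhere in the name
combined = ids("gender = 0 AND lastname LIKE '%smi%'")

print(by_gender, by_name, by_dob, starts_with, contains, combined)
```

Note that LIKE in SQLite ignores upper/lower case for plain ASCII letters, which is why '%smi%' also finds 'Smith'.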
![Page 109: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/109.jpg)
JOIN
• In many cases, we need to fetch data from multiple tables. Thus, we need to bind together the data from the tables. The binding is based on some keys, usually the primary key or some other unique data items.
• Good online material (but be aware that this is for standard SQL, not for Access!)
• For Microsoft-specific questions, please go to: http://msdn.microsoft.com/
![Page 110: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/110.jpg)
Join in Access
• Select sth1, sth2 from table1 INNER JOIN table2 ON table1.key1 = table2.key2.
• For example:
Select students.* from (students INNER JOIN studentscourses ON students.ID = studentscourses.studentID) where studentscourses.courseNum='Comp200'
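The join example can be reproduced with a small sketch. It assumes Python's sqlite3 module and made-up rows; the table and column names follow the slide. The parentheses Access puts around the joined tables are left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (ID INTEGER PRIMARY KEY, lastname TEXT);
CREATE TABLE studentscourses (studentID INTEGER, courseNum TEXT);
INSERT INTO students VALUES (1, 'Smith'), (2, 'Marsa'), (3, 'Lese');
-- Student 3 (Lese) is not enrolled in any course.
INSERT INTO studentscourses VALUES (1, 'Comp200'), (2, 'Comp200'), (2, 'Comp210');
""")

# Students taking Comp200: rows are paired where the key values are equal.
comp200 = conn.execute("""
    SELECT students.*
    FROM students INNER JOIN studentscourses
         ON students.ID = studentscourses.studentID
    WHERE studentscourses.courseNum = 'Comp200'
    ORDER BY students.ID
""").fetchall()

# Every (student, course) pair the join produces.
all_pairs = conn.execute("""
    SELECT students.lastname, studentscourses.courseNum
    FROM students INNER JOIN studentscourses
         ON students.ID = studentscourses.studentID
""").fetchall()

print(comp200)
print(all_pairs)
```

Lese does not appear in all_pairs: an INNER JOIN keeps a row only when the other table has a row with the matching key.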
![Page 111: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/111.jpg)
Other JOINS
• There are two different types of JOIN:
– INNER JOIN
– OUTER JOIN
• LEFT JOIN
• RIGHT JOIN
Let’s not deal with OUTER JOIN in this class, to keep it simple.
![Page 112: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/112.jpg)
INNER JOIN
• INNER JOIN only joins the records for which both tables have the corresponding key!
• See the MSDN explanation
![Page 113: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/113.jpg)
Sort the Results
• You can order the results in ascending or descending order.
1. Select * from students order by studentID desc;
2. Select * from students order by lastname; (ascending order is the default, so you don’t need to specify it)
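A quick way to check both sort orders, sketched with Python's sqlite3 module and three made-up students:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (studentID INTEGER PRIMARY KEY, lastname TEXT);
INSERT INTO students VALUES (1, 'Smith'), (2, 'Marsa'), (3, 'Lese');
""")

# DESC must be written out; ASC is the default and may be omitted.
ids_descending = [row[0] for row in conn.execute(
    "SELECT * FROM students ORDER BY studentID DESC")]
names_ascending = [row[1] for row in conn.execute(
    "SELECT * FROM students ORDER BY lastname")]

print(ids_descending)
print(names_ascending)
```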
![Page 114: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/114.jpg)
Subquery
• Inside a query, we can have another query that provides some information for a condition, i.e. we have one or more subqueries inside a query.
Select * from students where studentID in (select studentID from studentscourses where coursenumber="Comp200");
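The subquery above can be run as the following sketch. It assumes Python's sqlite3 module and invented rows; the names follow the slide, and the string is written with single quotes, which is the standard SQL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (studentID INTEGER PRIMARY KEY, lastname TEXT);
CREATE TABLE studentscourses (studentID INTEGER, coursenumber TEXT);
INSERT INTO students VALUES (1, 'Smith'), (2, 'Marsa'), (3, 'Lese');
INSERT INTO studentscourses VALUES (1, 'Comp200'), (2, 'Comp210'), (3, 'Comp200');
""")

# The inner query runs first and yields the IDs of students in Comp200;
# the outer query then keeps only students whose ID is in that list.
in_comp200 = conn.execute("""
    SELECT * FROM students
    WHERE studentID IN (SELECT studentID
                        FROM studentscourses
                        WHERE coursenumber = 'Comp200')
    ORDER BY studentID
""").fetchall()

print(in_comp200)
```

The same result could be obtained with an INNER JOIN; the subquery form simply keeps the two steps visible.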
![Page 115: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/115.jpg)
Functions
• Access queries can use built-in functions, for example MAX, MIN, COUNT, etc. Let’s try COUNT.
Question: how do we find the number of students who are currently taking courses in the school?
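One possible answer to the question, sketched with Python's sqlite3 module and made-up enrollment rows. The point of the sketch is that counting rows in the linking table counts enrollments, not students, so the student IDs have to be made distinct first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (studentID INTEGER PRIMARY KEY, lastname TEXT);
CREATE TABLE studentscourses (studentID INTEGER, coursenumber TEXT);
INSERT INTO students VALUES (1, 'Smith'), (2, 'Marsa'), (3, 'Lese');
-- Student 1 takes two courses, student 2 takes one, student 3 takes none.
INSERT INTO studentscourses VALUES (1, 'Comp200'), (1, 'Comp210'), (2, 'Comp200');
""")

# Counts enrollments: student 1 is counted twice.
enrollments = conn.execute(
    "SELECT COUNT(*) FROM studentscourses").fetchall()[0][0]

# Counts students: remove duplicate IDs in a subquery, then count.
students_taking_courses = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT studentID FROM studentscourses)"
).fetchall()[0][0]

print(enrollments, students_taking_courses)
```

The DISTINCT-in-a-subquery form is used here because it should carry over to Access, where COUNT(DISTINCT ...) is generally not available.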
![Page 116: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/116.jpg)
Others
• So far, we have been dealing with SELECT queries. There are other types:
• CREATE – create tables
• INSERT – insert rows
• DROP – drop tables
• DELETE – delete rows
• ALTER – change the table structures
• Etc.
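The other statement types can be walked through in one short sketch. It assumes Python's sqlite3 module; the restaurants table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define a new table.
conn.execute("CREATE TABLE restaurants (ID INTEGER PRIMARY KEY, name TEXT)")

# INSERT: add rows.
conn.execute("INSERT INTO restaurants (ID, name) VALUES (1, 'Example Diner')")
conn.execute("INSERT INTO restaurants (ID, name) VALUES (2, 'Sample Cafe')")

# ALTER: change the table structure (here, add a column).
conn.execute("ALTER TABLE restaurants ADD COLUMN phone TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(restaurants)")]

# DELETE: remove rows that match a condition.
conn.execute("DELETE FROM restaurants WHERE ID = 2")
remaining = len(conn.execute("SELECT * FROM restaurants").fetchall())

# DROP: remove the whole table, structure and data.
conn.execute("DROP TABLE restaurants")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

print(columns, remaining, tables)
```

DELETE without a WHERE clause would empty the table but keep its structure; DROP removes the table itself.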
![Page 117: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/117.jpg)
Sample Database
• Here is a sample database with some queries constructed. It might be useful as a reference.
• Remember that this class is not only about databases, so we cannot go very deep into database issues. If you are more interested in databases, I may be able to offer a class specifically on databases.
![Page 118: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/118.jpg)
Project 5
• Project 5 asks you to construct a database for a group of restaurants. Please use a UML diagram to analyze the data first, then construct your database. Also, please provide some queries. Imagine that you are providing a hotline service for customer inquiries about food services in the Hartford area.
![Page 119: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/119.jpg)
Part 4 Bioinformatics
• What is bioinformatics?
The study of the application of computer and statistical techniques to the management of biological information.
The science of creating and managing biological databases to keep track of, and eventually simulate, the complexity of living organisms.
There exist different definitions, though.
![Page 120: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/120.jpg)
The Possible Role of Bioinformatics?
• Looking over the history of biology, different approaches have been used over time.
• Initially: “Guessing” → Observation → Dissection.
• Mendel started genetic experiments.
• Biochemists used organic chemistry to work out the metabolic pathways.
• Molecular biology is another approach, now used to decode the secrets of life.
• Is it time for bioinformatics as another approach?
![Page 121: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/121.jpg)
Several Foundations of Bioinformatics
• All life comes from the same ancestors, ‘either evolved or created’. That means that knowledge obtained on one form of life may be applied to other forms. In fact, molecular biology started with bacteria, then yeast, then mammals.
• Publicly available data resources (databases).
• Human Genome Project
![Page 122: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/122.jpg)
Publicly Available Resources
• I am not sure how many biological research laboratories there are in the world, but it must be a great many.
• No other science has an equal or even similar number of research laboratories.
• Biology receives the largest amount of research funds from federal and state governments, private corporations, etc.
![Page 123: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/123.jpg)
Most famous Agencies
• NIH (National Institutes of Health)
• WHO (World Health Organization)
• Others …
![Page 124: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/124.jpg)
Huge Amount of Information
• Scientists all over the world have generated a large amount of scientific information, and it is likely that much of it is repeated.
• Communication among scientists becomes extremely important.
• That is why there are so many publicly available biological resources.
• The Internet plays a critical role in information sharing.
![Page 125: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/125.jpg)
Internet’s Information
1. Access to information for anyone with an Internet browser.
2. The data stored in centralized databases is redundant by a factor of about 2.5, which provides a form of quality control.
3. Information from yeast (for example) could be helpful in finding/understanding homologous genes/pathways in humans (comparative genomics).
![Page 126: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/126.jpg)
Human Genome Project
• HGP.
• Without HGP, there is no real Bioinformatics.
• Bioinformatics took off after a large part of the human genome was decoded. How do we use this DNA information? Computer technologies!
![Page 127: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/127.jpg)
Bioinformatics and Evolution
Diagram (family tree): Ancestor → Child A, Child B, Child C; Child A → Child Aa, Child Ab; Child B → Child Ba, Child Bb; Child Aa → ChildAa-1; Child Ba → ChildBa-1. Label: Mutations.
![Page 128: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/128.jpg)
Mutations
• Mutations that occur in germ cells will be passed on to the next generation, like any other DNA sequences.
• So, as time and generations go by, a DNA sequence will acquire more and more mutations and resemble less and less the original DNA sequence.
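The drift away from the original sequence can be illustrated with a toy simulation. This is a sketch under strong assumptions, not a biological model: every base mutates independently with the same made-up probability per generation, and the starting sequence is arbitrary.

```python
import random

BASES = "ACGT"

def mutate(seq, rate, rng):
    """Copy seq, replacing each base with a different one with probability rate."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def simulate(original, generations, rate, seed=0):
    """Return the identity to the original after 0, 1, ..., generations copies."""
    rng = random.Random(seed)
    current = original
    history = [identity(original, current)]
    for _ in range(generations):
        current = mutate(current, rate, rng)   # each generation copies the previous one
        history.append(identity(original, current))
    return history

original = "AGTTCGATAGCTAAGGTCGG" * 10   # arbitrary 200-base starting sequence
history = simulate(original, generations=50, rate=0.01)
print(history[0], history[10], history[50])
```

The identity starts at 1.0 and tends to fall as generations pass; it can occasionally tick back up when a later mutation happens to restore an original base.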
![Page 129: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/129.jpg)
Need to know where from
• From an evolutionary perspective, we cannot know where we are going unless we know where we have been. Previously, the study of human evolution was largely the province of paleoanthropologists who studied the fossil record.
• However, gene comparison using computer technologies/bioinformatics has now become the major and more accurate technique.
![Page 130: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/130.jpg)
Do you know …
• We all started from Africa?
• Mitochondrial DNA analysis among women from different nations found that African people have larger variations in DNA sequence → the oldest group has the greatest genetic diversity → the African population is the oldest → the ancestor.
![Page 131: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/131.jpg)
Bioinformatics with AIDS
• Analysis of the human genome guides AIDS research. Some persons long infected with HIV have not shown any symptoms of the disease. Studies found that these people possess a variant of the receptor CCR5. The variant is rare in Asian and African populations; one guess is that it became common among Europeans in the 14th century.
![Page 132: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/132.jpg)
Tools of Bioinformatics
• Gene Prediction Software
• Sequence Alignment Software
• Molecular Phylogenetics
• Molecular Modeling and 3-D Visualization.
![Page 133: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/133.jpg)
NCBI
• National Center for Biotechnology Information.
– PubMed (Medline)
– Entrez
– BLAST
– OMIM
– Books
– TaxBrowser
– Structure
![Page 134: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/134.jpg)
PubMed
• Access to the Medline database, the largest biomedical literature source.
• The Medline database contains citations and abstracts from more than 4600 biomedical journals published in the USA and other countries.
• Searches are commonly conducted using keywords, author names, publication dates, and/or journal titles.
![Page 135: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/135.jpg)
Entrez
• A search and retrieval system that integrates all of the databases available at NCBI. These databases include nucleotide sequences, protein sequences, genomes, molecular structure and PubMed.
• GenBank, the DNA Data Bank of Japan and the European Molecular Biology Laboratory make up the International Nucleotide Sequence Database Collaboration. These organizations exchange data every day.
• Search for Bcl2 as an example.
![Page 136: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/136.jpg)
BLAST
• Basic Local Alignment Search Tool.
Sequence 1 …AGTTCGATAGCTAAGGTCGG…
Sequence 2 …AGTTCGATAGCTATGGTCGG…
![Page 137: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/137.jpg)
BLAST
Sequence 3 …AGTTCGATAGCTAAGGTCGG…
Sequence 4 …AGTTCGATAGCTAGGTCGGG…
![Page 138: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/138.jpg)
BLAST – Another Look
Sequence 3 …AGTTCGATAGCTAAGGTCGG…
Sequence 4 …AGTTCGATAGCTA–GGTCGG…
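The point of the three BLAST slides, that a single gap can turn many mismatches into none, can be checked with a toy script. This is only an illustration and not how BLAST itself works: it compares sequences position by position and then tries one gap at every position. The function names are made up; the sequences are the ones on the slides, with "-" used as the gap character.

```python
def count_mismatches(a, b):
    """Count positions where two equal-length sequences differ; gaps ('-') are skipped."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y and "-" not in (x, y))

def best_single_gap(short, long):
    """Insert one gap into short at every possible position; return the best (mismatches, gapped)."""
    if len(short) + 1 != len(long):
        raise ValueError("short must be exactly one base shorter than long")
    candidates = []
    for i in range(len(short) + 1):
        gapped = short[:i] + "-" + short[i:]
        candidates.append((count_mismatches(gapped, long), gapped))
    return min(candidates, key=lambda c: c[0])

seq1 = "AGTTCGATAGCTAAGGTCGG"
seq2 = "AGTTCGATAGCTATGGTCGG"
seq3 = "AGTTCGATAGCTAAGGTCGG"
seq4 = "AGTTCGATAGCTAGGTCGGG"

print(count_mismatches(seq1, seq2))   # a single substitution
print(count_mismatches(seq3, seq4))   # looks like several substitutions ...

# ... but the first 19 bases of sequence 4 match sequence 3 once one gap is allowed.
mismatches, gapped = best_single_gap(seq4[:-1], seq3)
print(mismatches, gapped)
```

Because sequence 3 has two A's in a row at the deletion site, the gap can sit at either of them with the same score; the script reports the first position it finds.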
![Page 139: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/139.jpg)
Use BLAST
• Click here.
• Let’s choose blastn.
• Now, let’s practice using it.
![Page 140: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/140.jpg)
OMIM
• Online Mendelian Inheritance in Man
• It is a database containing information about human genes and genetic disease. This resource is often used by physicians and researchers interested in genetic diseases.
![Page 141: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/141.jpg)
Books
• NCBI collaborates with authors and publishers to create a virtual bookshelf.
![Page 142: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/142.jpg)
TaxBrowser
• The taxonomy site contains a classification of all the organisms that are represented by sequences in the public databases, including model organisms commonly used in molecular biology.
![Page 143: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/143.jpg)
Structure
• The structure site features the Molecular Modeling Database (MMDB), which contains macromolecular 3-D structures as well as tools to analyze them. Included in the MMDB are experimentally determined structures obtained from the Protein Data Bank.
![Page 144: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/144.jpg)
Cn3D4.1
• You can download it.
• It reads MMDB records instead of PDB files. This is because MMDB ensures the correctness of the PDB data it reads.
• The Link
![Page 145: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/145.jpg)
Applications of Bioinformatics
• Forensic Science
• Agriculture
• Medicine
• Pharma/Biotechnology
• Environmental Science
• Ethical, Legal, and Social Issues
![Page 146: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/146.jpg)
Forensic Science
• Minisatellites consist of short DNA sequences that repeat in tandem. The number of repeats and the sequence within each repeat can exhibit wide variation in a population. Techniques based on this were developed to identify individuals. For example, the FBI established the Combined DNA Index System (CODIS), which contains profiles of convicted offenders.
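The idea of telling individuals apart by their number of tandem repeats can be sketched in a few lines. The repeat unit GATA, the flanking bases and the two "persons" below are invented for illustration; real DNA profiling looks at several locations in the genome and uses laboratory measurements rather than string matching.

```python
import re

def max_tandem_repeats(sequence, unit):
    """Return the largest number of back-to-back copies of unit found in sequence."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)

# Two hypothetical individuals carrying different repeat counts at the same location.
person_a = "CCT" + "GATA" * 7 + "TTC"
person_b = "CCT" + "GATA" * 11 + "TTC"

repeats_a = max_tandem_repeats(person_a, "GATA")
repeats_b = max_tandem_repeats(person_b, "GATA")
print(repeats_a, repeats_b)
```

A profile built from the repeat counts at many such locations is what a database search would compare.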
![Page 147: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/147.jpg)
Forensic Science
• DNA testing is now the standard technique for confirming paternity.
• It is also a technique used to identify criminals and victims.
• Computer technology is essential for searching through the database for the identification.
![Page 148: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/148.jpg)
Agriculture
• Genome projects for major crop plants are well underway:
Pest control
Seed quality
Plant micronutrients (golden rice)
Etc.
![Page 149: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/149.jpg)
Medicine
• The ability to correlate genetic data with medical records promises to improve our understanding of disease and improve treatments.
• Microarray cancer classification
• Associating SNPs with disease helps scientists to identify genes that play roles in disease progression.
![Page 150: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/150.jpg)
Pharma/Biotechnology
• Bioinformatics is providing a complete list of candidate genes for drug discovery. The tools of functional genomics are being used to establish the metabolic roles played by the candidate gene products.
• Pharmaceutical companies are using bioinformatics to search for new antibiotics.
![Page 151: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/151.jpg)
Cont’d
• Advances in genomics are expanding the range of drug targets and are shifting the discovery effort from direct screening programs to rational target-based drug designs.
![Page 152: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/152.jpg)
Environmental Sciences
• Global biodiversity.
• Global Biodiversity Information Facility (GBIF)
• How to analyze this diversity and make use of it.
• Computer software to monitor environmental changes via the behavior of birds and other animals.
![Page 153: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/153.jpg)
Ethical, Legal and Social Issues
• Anonymous databases contain non-identifiable genetic data.
• Non-anonymous databases contain data that could be linked to individuals.
• An ethical concern most relevant to non-anonymous databases is Informed Consent.
![Page 154: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/154.jpg)
Informed Consent
• Informed consent is the ethical practice of respecting individual autonomy and protecting an individual from harm. It refers to a process whereby an individual freely and knowingly weighs the risks and benefits of donating a tissue or DNA sample for research purposes.
![Page 155: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/155.jpg)
Privacy & Confidentiality
• Personal privacy is an important aspect of informed consent. Privacy is the right to control access to information about oneself.
• Confidentiality is the obligation for those who obtain information about individuals to protect the privacy of that information.
![Page 156: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/156.jpg)
More
• If society is to gain the most from genomic biology, then the public must be able to rationally consider scientific issues. They should not place a blind trust in scientists, nor should they dismiss new technologies out of hand.
![Page 157: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/157.jpg)
In-Class Exercise
• The human genome was sequenced via the “shotgun” approach, in which human chromosomes were randomly cut into pieces.
• Each DNA piece is sequenced separately.
• Computer technology is then used to find the overlaps and construct the contiguous sequence.
![Page 158: Topics in Informatics](https://reader033.fdocuments.us/reader033/viewer/2022051517/56815578550346895dc34074/html5/thumbnails/158.jpg)
In-Class Exercise
• Group 1
• Group 2
• Group 3
• Each group will assemble two fragments, and all groups will work together on the final sequence.
• For simplicity, we are dealing with only one strand.
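What the computer does in this exercise can be sketched with a simple greedy method: repeatedly find the two fragments with the longest overlap and merge them. This is a classroom-sized illustration under assumptions (one strand, no sequencing errors, no repeated regions), not the software used for the real genome; the fragments are made up by cutting a short example sequence.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def assemble(fragments, min_len=3):
    """Greedily merge the pair of fragments with the longest overlap until none overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break                                  # nothing left that overlaps
        merged = frags[i] + frags[j][n:]           # keep the shared part only once
        frags = [f for k, f in enumerate(frags) if k not in (i, j)] + [merged]
    return frags

# Three overlapping pieces of one strand, given in scrambled order.
fragments = ["CTAAGGTCGG", "AGTTCGATAG", "GATAGCTAAG"]
contigs = assemble(fragments)
print(contigs)
```

Each group's pair of fragments corresponds to one merge step; combining the groups' results corresponds to the later rounds of the loop.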