Bringing Data Science, Xinformatics and Semantic eScience into the Graduate Curriculum (solicited)
EGU2012-11224 (EOS 6 / ESSI2.3), April 25, 2012, Vienna
Peter Fox (RPI), Tetherless World Constellation
tw.rpi.edu
Themes
• Future Web: Web Science, Policy, Social (Hendler)
• Xinformatics: Data Science, Semantic eScience, Data Frameworks (Fox)
• Semantic Foundations: Knowledge Provenance, Ontology Engineering Environments, Inference, Trust (McGuinness)
Multiple departments, schools, and programs; ~35 people (post-docs, staff, grad, undergrad)
Application Themes
• Govt. Data: Open, Linked, Apps (Hendler/Erickson)
• Env. Informatics: Ecosystems, Sea Ice, Ocean imagery, Carbon (Fox)
• Health Care / Life Sciences: Population Science, Translational Med, Health Records (McGuinness/Luciano)
Platforms:
• Bio-nano tech center
• Exp. Media and Perf. Arts Ctr.
• Comp. Ctr. Nano. Innov.
Data Intensive
http://tw.rpi.edu/web/Courses
[Figure: Data → Information → Knowledge continuum. Diagram labels: Context; Presentation, Organization; Integration, Conversation; Creation, Gathering; Experience. Courses placed along the continuum: Data Science, Xinformatics, Semantic eScience, Web Science.]
Also at RPI
• Data Science Research Center and Data Science Education Center
• http://www.rpi.edu/about/inside/issue/v4n17/datacenter.html
– Over 35 research faculty, 5 post-docs, ? grad students
• Data is one of the Rensselaer Plan’s five thrusts
• Other key faculty
– Fran Berman (VPR)
– Jim Myers (Director CCNI)
Curriculum
• Web Science and IT – undergrad, and MSc. and PhD. (with science concentrations)
• Environmental Science with Geoinformatics concentration
• Bio, geo, chem, astro, materials - informatics
• GIS for Science
• Master of Science – Data Science (pending)
• Multi-disciplinary science program (2012) PhD in Data and Web Science
E.g. IT with Env. Sci.
• ERTH-1200 Geology II (4 credits) - spring
• CHEM-2250 Organic Chemistry I (4 credits) - spring
• ERTH-2210 Field Methods (2 credits) - fall
• IENV-1920 Environmental Seminar (2 credits) - spring
• BIOL-2120 Intro. to Cell and Molecular Biology (4 credits) - spring
• IENV-4500 Global Environmental Change (4 credits) - fall
• ERTH-4180 Environmental Geology (4 credits) - spring
• ERTH-4963 Xinformatics (4 credits) - spring
• IENV-4700 One Mile of the Hudson River (4 credits) - fall
Geoinformatics concentration
• CSCI1000 - Computer Science I
• CSCI1200 - Data Structures
• CSCI2300 - Introduction to Algorithms or ERTH 4750 - Geographic Information Systems in the Sciences
• CSCI4380 - Databases
• CSCI4961 - Data Science
• CSCI4960 - Xinformatics
• ERTH 4980 - Senior Thesis
Web Science Learning Objectives
• Students will demonstrate knowledge and be able to explain the three different "named" generations of the web (a/k/a Web 1.0, Web 2.0, and Web 3.0) from mathematical, engineering, and social perspectives
• Students will demonstrate the ability to use the dynamic programming language Python to develop programs relating to Web applications and the analysis of Web data.
• Students will be able to understand and analyze key Web applications including search engines and social networking sites.
• Students will be able to understand and explain the key aspects of Web architecture and why these are important to the continued functioning of the World Wide Web.
• Students will be able to analyze and explain how technical changes affect the social aspects of Web-based computing.
• Students will be able to develop "linked data" applications using Semantic Web technologies.
Data Science Objectives
• To instruct future scientists how to sustainably generate, collect, and use data for their own research as well as for others: data science.
• To instruct future technologists how to understand and support essential data and information needs of a wide variety of producers and consumers
• For both groups to know the tools and requirements needed to handle data and information properly
• Students will learn and be evaluated on the full life-cycle of data and relevant methods, technologies and best practices.
Learning Objectives
• Develop and demonstrate skill in data collection and management
• Know how to develop and apply data models and metadata models
• Demonstrate knowledge of data standards
• Develop and demonstrate the application of skill in data science tool use and evaluation
• Demonstrate the application of data life-cycle principles and data stewardship
• Demonstrate proficiency in data and information product generation
Xinformatics Objectives
• To instruct future information architects how to sustainably generate information models, designs and architectures
• To instruct future technologists how to understand and support essential data and information needs of a wide variety of producers and consumers
• For both groups to know the tools and requirements needed to handle data and information properly
• Students will learn and be evaluated on the underpinnings of informatics, including theoretical methods, technologies and best practices.
Learning Objectives
• Through class lectures, practical sessions, written and oral presentation assignments and projects, students should:
– Develop and demonstrate skill in development and management of multi-skilled teams in the application of informatics
– Demonstrate ability to develop conceptual and logical information models and explain them to non-experts
– Demonstrate knowledge and application of informatics standards
– Demonstrate skill in informatics tool use and evaluation
Modern informatics enables a new scale-free framework approach
Semantic eScience Objectives
• Ontology Development, Merging and Validation
• Semantic Language and Tool Use and Evaluation
• Use Case Development and Elaboration
• Semantic eScience Implementation and Evaluation via Use Cases
• Semantic Application Development and Demonstration
• Group Project and Team Development, Use Case Implementation and Evaluation
Discussion…
• Science and interdisciplinary from the start!
– Not a question of: do we train scientists to be technical/data people, or do we train technical people to learn the science
– It’s a skill/course level approach that is needed
• Education and research semi-coupled
• We must teach methodology and principles over technology *
• Data science must be a skill, and natural like using instruments, writing/using codes
• Team/collaboration aspects are key **
• Foundations and theory must be taught ***
Progression after progression
IT CyberInfrastructure
Cyber Informatics
Core Informatics
Science Informatics
Science, Societal Benefit Areas
Informatics
Example:
• CI = OPeNDAP server running over HTTP/HTTPS
• Cyberinformatics = Data (product) and service ontologies, triple store
• Core informatics = Reasoning engine (Pellet), OWL
• Science (X) informatics = Use cases, science domain terms, concepts in an ontology
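The layering in the example above can be illustrated with a toy sketch in Python. It rests on several assumptions: an in-memory set of triples stands in for the triple store, a transitive subclass closure stands in for the reasoning engine (it is not Pellet and does not implement OWL semantics), and every name in it (ToyTripleStore, the ex: terms, ex:dataset42) is hypothetical. The CI layer (the OPeNDAP server) is not exercised; it appears only as an identifier in the service description.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class ToyTripleStore:
    """Cyberinformatics stand-in: an in-memory set of (subject, predicate, object)."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        self._triples.add((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for t in sorted(self._triples):
            if ((s is None or t[0] == s) and (p is None or t[1] == p)
                    and (o is None or t[2] == o)):
                yield t


def superclasses(store: ToyTripleStore, cls: str) -> Set[str]:
    """Core informatics stand-in: transitive closure over rdfs:subClassOf."""
    seen: Set[str] = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for _, _, parent in store.match(s=current, p="rdfs:subClassOf"):
            if parent not in seen:  # also guards against subclass cycles
                seen.add(parent)
                frontier.append(parent)
    return seen


def instances_of(store: ToyTripleStore, cls: str) -> Set[str]:
    """Subjects typed as cls directly or through an inferred superclass."""
    found: Set[str] = set()
    for subject, _, asserted in store.match(p="rdf:type"):
        if asserted == cls or cls in superclasses(store, asserted):
            found.add(subject)
    return found


def build_example_store() -> ToyTripleStore:
    store = ToyTripleStore()
    # Science (X) informatics: hypothetical domain terms in a small ontology.
    store.add("ex:SeaIceConcentration", "rdfs:subClassOf", "ex:SeaIceProduct")
    store.add("ex:SeaIceProduct", "rdfs:subClassOf", "ex:DataProduct")
    # Cyberinformatics: hypothetical data product and service descriptions.
    store.add("ex:dataset42", "rdf:type", "ex:SeaIceConcentration")
    store.add("ex:dataset42", "ex:servedBy", "ex:opendapService")
    store.add("ex:opendapService", "rdf:type", "ex:DataService")
    return store


if __name__ == "__main__":
    store = build_example_store()
    # The dataset is only asserted to be a SeaIceConcentration; the closure
    # lets a query for the broader class find it.
    print(sorted(instances_of(store, "ex:DataProduct")))  # ['ex:dataset42']
```

The point of the sketch is the separation of concerns: domain terms are added as data, the store only matches patterns, and the inference step is a separate function that could in principle be swapped for a real reasoner.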
Requirements