ORD’s Scientific Data Management Strategy (SDM): First Steps

Lynne Petterson, Ph.D., ORD Office of Science Information Management, Information Management Support Branch (IMSB). EPA Quality Management Conference, May 13, 2009.

Page 1

Office of Research and Development
Office of Science Information Management, Information Management Support Branch (IMSB)

Lynne Petterson, Ph.D.
ORD Office of Science Information Management/IMSB
EPA Quality Management Conference, May 13, 2009

ORD’s Scientific Data Management Strategy (SDM): First Steps

Page 2


ORD Scientific Records Management (RM)

• Create Guidance
  • Retention schedules (contracts vs. scientific records)
  • Formats
  • Data created under Contracts
  • Consistency in storing email records (LAN vs. C drive)
    – Paper req’t => technical direction
  • Management training – “scientific data ARE Records”
  • Newbies and Old Codgers (entrance & exit)
• Develop ORD Scientific Records Training
  – Excitement & interest
  – Annual COR re-certification
• Leverage taxonomy effort
  – Facets => file folder headings
  – Potential pilot test

Page 3


ORD Tools and Processes for Sci Data Mgmt

• Create Guidance
  • Format
  • Access rights
  • Metadata standards and keywords
  • File naming conventions
  • Versioning
  • Location (centralized repository? decentralized?)
  • Stewardship
  • Knowledge capture
    – Post mortem ‘look back’
  • Reusability
    – Role of originator in approving re-use
  • Retirement requirements
• Where to start?
  – Other research entities – Federal
  – Existing policies – holes
  – Talk to researchers

Page 4


ORD’s Taxonomy Effort

• What is ‘Taxonomy’?
  – One or more vocabularies of controlled terms of key concepts organized into loose hierarchies
  – Can provide context to support machine processing
• How will it benefit ORD?
  • Enterprise content management
  • Web content management
  • Scientific records management
  • Automated metadata generation
  • Scientific data management
  • Collaborative research
  • Access and use of scientific data and information
  • Search engines
  • Keywords for project clearance
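
As a rough illustration of the definition above, a taxonomy can be modeled as a controlled vocabulary in which each term has a preferred label, optional variants, and an optional broader term. The sketch below is hypothetical: the `Term` class, the `path_to_root` and `find_keywords` helpers, and the sample terms are assumptions made for illustration, not ORD's actual vocabulary or tooling.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """One controlled term; all field names are illustrative assumptions."""
    label: str                      # preferred label
    broader: Optional[str] = None   # parent term, giving a loose hierarchy
    variants: list = field(default_factory=list)  # synonyms or acronyms


# Hypothetical sample vocabulary, not taken from the ORD taxonomy.
VOCAB = {
    t.label: t
    for t in [
        Term("substance"),
        Term("chemical substance", broader="substance"),
        Term("particulate matter", broader="chemical substance", variants=["PM"]),
    ]
}


def path_to_root(label: str) -> list:
    """Return the chain of labels from a term up to its top-level ancestor."""
    chain = []
    current = label
    while current is not None:
        chain.append(current)
        current = VOCAB[current].broader
    return chain


def find_keywords(text: str) -> list:
    """Suggest controlled terms whose label or variant appears in the text."""
    lowered = text.lower()
    words = set(text.split())
    hits = []
    for term in VOCAB.values():
        if term.label in lowered or any(v in words for v in term.variants):
            hits.append(term.label)
    return sorted(hits)
```

Calling `find_keywords` on a record title gives a crude form of the automated metadata generation listed above, and `path_to_root` supplies the hierarchical context that a search engine or records system could use.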

Page 5


ORD Taxonomy Current Activities

• Mined key terms
  – Research plans
  – ORD websites and National Program Plans
  – Science Inventory
• More than 2,600 unique terms
  – Approximately 1,600 descriptors
  – Approximately 1,000 acronyms, abbreviations, or variants
• Normalized variants (plural vs. singular), separated proper nouns
• Reviewing terms with Subject Matter Experts
  – Add, delete, separate/join
• Identified major categories (‘facets’) for organization of terms
  – ‘Human construct’
    • Ease of review
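
The variant normalization mentioned above (plural vs. singular) might look something like the following sketch. It is only an assumption about one possible approach: the naive suffix rules, the `normalize` and `group_variants` names, and the sample terms are hypothetical, and a real effort would likely lean on Subject Matter Expert review rather than rules alone.

```python
from collections import defaultdict


def normalize(term: str) -> str:
    """Lowercase a term and naively singularize its last word.

    These suffix rules are a deliberately simple assumption; they will
    mishandle irregular plurals and words that merely end in "s".
    """
    words = term.lower().split()
    if not words:
        return ""
    last = words[-1]
    if last.endswith("ies") and len(last) > 4:
        last = last[:-3] + "y"
    elif last.endswith("s") and not last.endswith("ss") and len(last) > 3:
        last = last[:-1]
    words[-1] = last
    return " ".join(words)


def group_variants(terms):
    """Map each normalized form to the sorted raw variants that produced it."""
    groups = defaultdict(set)
    for term in terms:
        groups[normalize(term)].add(term)
    return {key: sorted(values) for key, values in groups.items()}
```

Grouping raw strings under one normalized key keeps the original variants available for reviewers, who can then decide which form becomes the preferred descriptor.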

Page 6


Examples of Potential Facets ‘v19’

• Substance
  – Biological substance
    • Human: Activity (farming), Body part/system (liver), disease (asthma), characteristic (female, smoker)
    • Non-human animal: Activity (migration), characteristic (web feet), role (predator; disease vector)
    • Plant: Characteristic (leaf mold), body part (root), role (food; invasive plant, parasite)
    • Microorganism: Body part (cell membrane/wall), role (pathogen)
  – Chemical substance
• Environmental Event (clean up/remediation, climate change*, disaster preparedness)
• Research
  – Project life cycle (planning, lit review, experiment/obs/modeling, analysis, clearance)
• Applied Research (engineering, environmental technology – AP Control, WW Treatment)
• Built environment (transportation, telecommunication, energy)
• Natural environment (atmosphere, terrestrial, aquatic, oceanic, ecosystem**, nat’l processes)
• Research (life cycle – plan, lit review, experiment/observation, modeling/simulation, analysis, clearance, etc.)
• Research resources (methods, protocols, models & simulation tools)
• EPA Operations (policies, business processes)
• Laws, Regulations, Treaties
• etc.

Page 7


Revelations (or ‘fun facts’)

• ORD uses LOTS of acronyms (AND multiple meanings)
• Definitions – set versus evolving (ecosystem services, sustainability)
• Missing terms & definitions (global climate change)
• Multidisciplinary (terms located in other/multiple locations/categories)
• Examples of limited shelf life (alternative energy sources)
  – Terms (‘alternative energy sources’)
    • Legacy access
  – Proper names, organizational names, laws/regulations
    • Separate management & legacy access
• ORD and EPA are interested in ‘systems’ or framework approaches
  – Multidisciplinary areas => develop frameworks to serve as model for research
    • Provides different views on same terms
    • Ability to pull out terms that relate to frameworks, while maintaining term in ‘native facet’
    • Leads to different ontologies & how to overlay them on the terminology base
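
One way to picture the framework idea above is as an overlay: each term stays in a single native facet, and a framework is just a named selection of existing terms. The sketch below is a hypothetical illustration only. The facet names echo the examples on the facets slide, but the term-to-facet assignments, the "air quality and health" framework, and the `framework_view` helper are assumptions, not part of the ORD taxonomy.

```python
# Native facet for each term; facet names follow the slide's examples, while
# the term-to-facet assignments here are simplified assumptions.
NATIVE_FACET = {
    "asthma": "Biological substance",
    "atmosphere": "Natural environment",
    "transportation": "Built environment",
    "root": "Biological substance",
}

# Hypothetical framework overlays: each is just a named set of existing terms.
FRAMEWORKS = {
    "air quality and health": {"asthma", "atmosphere", "transportation"},
}


def framework_view(name: str) -> dict:
    """Group a framework's terms by native facet without moving any term."""
    view = {}
    for term in sorted(FRAMEWORKS[name]):
        view.setdefault(NATIVE_FACET[term], []).append(term)
    return view
```

Because the overlay only references terms, adding or retiring a framework never changes where a term lives in its native facet.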

Page 8


ORD Taxonomy Near-Term Steps

• Continue refining ‘facets’ or categories
• Subject Matter Expert review of terms within ‘facets’
• Begin organizing terms within facets into high-level hierarchies
• Test and extend terminology
  – Sample of records from Science Inventory
  – Use cases -> paper prototypes
• Not only ‘what’ but ‘how’
  – Duplicative processes/guidance – beginning collaborative teams
  – Open source versus copyrighted

Page 9


ORD Taxonomy Next Steps

• September 30: New contract vehicle (begin January 2010)
• November 30: Draft proposal for terminology maintenance, governance, and stewardship
  – How terms will be added, deleted, modified
  – Roles and responsibilities
  – Final decision authority
• December 15: Analysis of terminology use cases
• December 30: Version 1.0 ORD Taxonomy and Draft Final Paper
  – Summary of issues and proposed resolution
  – Findings from use cases
  – Putting the taxonomy/ontology to use and extending it to deeper


Page 10


Next Steps (continued)

• January 2010: Phase 2
  – Add test cases in additional scientific areas
  – Populate lower levels of hierarchy and verify with Subject Matter Experts
  – Identify proposed replicable process & guidance
  – Draft outreach and training approach, materials, and proposed schedule
    • Publicize documented benefits
    • Test & assess proposed replicable process & guidance

Page 11


Special Thank You

• Gail Hodge, IIA
• Michael Pendleton and Cynthia Dickinson, OEI, Data Standards Branch
• OSIM Branch Chiefs
  – Brenda Young, Branch Chief, Application Support Branch (ASB/OSIM)
  – Lynnann Hitchens, Branch Chief, Information Management Support Branch
• OSIM Collaborators
  – Laura Doyle – ORD Lead RLO (Records Liaison Officer) (IMSB/OSIM)
  – Valerie Brandon – ORD Lead Documentum/ECMS (ASB/OSIM)
  – Jacques Kapuscinski – ORD Lead Science Inventory (IMSB/OSIM)
• ORD managers, researchers, QA staff, and collaborative partners who have provided invaluable input concerning terminology ‘portfolios,’ terminology facets, existing glossaries, and potential sources.