Markup in the mobilization of biodiversity literature

Daniel Mietchen, Museum für Naturkunde Berlin

description

A contribution to the pro-iBiosphere Final Conference, held on June 12, 2014 in Meise, Belgium. More information: http://wiki.pro-ibiosphere.eu/wiki/Final_Conference

Transcript of Markup in the mobilization of biodiversity literature

Page 1: Markup in the mobilization of biodiversity literature

Daniel Mietchen

Markup in the mobilization of biodiversity literature

Museum für Naturkunde Berlin

Page 2: Markup in the mobilization of biodiversity literature

Biodiversity literature

● hundreds of millions of pages
  ○ ca. 20k treatments of new taxa per year
  ○ 50-100k re-descriptions annually
  ○ scattered across thousands of journals and books

Page 3: Markup in the mobilization of biodiversity literature

Legacy literature

● geared towards the human reader
● not machine-readable (scans/PDF)
● accumulated over three centuries
● includes much of what is published today

Page 4: Markup in the mobilization of biodiversity literature

Use & citation

● digital > paper-only
● open access > hidden
● with > without open data
● soon: machine readable > PDF
● these biases may skew analyses

Page 5: Markup in the mobilization of biodiversity literature

Markup

● identifying concepts
● linking them using controlled vocabularies
● integrating with other sources of information
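
The first two steps on this slide can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration: the tiny `VOCABULARY` dictionary, its `example:taxon/...` identifiers, the `<taxon-name>` element, and the `mark_up` function are hypothetical and do not correspond to a particular markup standard or to the workflows presented in the talk.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical controlled vocabulary: name as written in the text -> identifier.
VOCABULARY = {
    "Apis mellifera": "example:taxon/1",
    "Bombus terrestris": "example:taxon/2",
}


def mark_up(text, vocabulary=VOCABULARY):
    """Wrap each known name in a <taxon-name> element carrying its identifier."""
    if not vocabulary:
        return escape(text)
    # Try longer names first so that a long name wins over a shorter prefix.
    names = sorted(vocabulary, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Copy the untouched text before the match, escaped for XML.
        parts.append(escape(text[last:match.start()]))
        name = match.group(0)
        parts.append(
            "<taxon-name ref=%s>%s</taxon-name>"
            % (quoteattr(vocabulary[name]), escape(name))
        )
        last = match.end()
    parts.append(escape(text[last:]))
    return "".join(parts)


print(mark_up("Apis mellifera visits flowers."))
```

Once a name carries an identifier from a shared vocabulary, the third step, integration, amounts to joining on that identifier with other data sources. Real legacy text would need far more than exact string matching (abbreviated genera, synonyms, OCR errors), which this sketch deliberately leaves out.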

Page 7: Markup in the mobilization of biodiversity literature

Scaling up

● automated markup of prospective literature
● crowdsourced markup of legacy literature
● semi-automated markup with expert assistance

Page 8: Markup in the mobilization of biodiversity literature

Recommendations

● mark up taxonomic publications henceforth
● focus on revisionary works (biotas)
● adjust granularity to concrete use cases
● follow standards
● automate workflows