Maximizing the Impact of Institutional Knowledge Using DSpace
Maximizing the impact of institutional knowledge using DSpace
Alan Orth
Nairobi, Kenya - July 28, 2015
Webinar for AIMS@FAO
Overview
● Why we use DSpace
● How we use DSpace
● Organizational tips for using DSpace
● Technical tips for DSpace deployments
DSpace helps make information “F.A.I.R”
● Free: no subscriptions, “paywalls”, etc
● Accessible: is publicly available
● Indexed: can be found in search engines
● Reusable: has a permissive license
Addresses both the moral and legal imperatives… aka the “carrot” and the “stick”.
History of DSpace at ILRI
● Before: InMagic, physical library
● 2009: ILRI launches Mahider (“repository” in Amharic)
● 2010: Other CGIAR research centers and programs join our platform and share hard / soft costs
● 2011: Rebranded as “CGSpace”
● 2015: 9 CGIAR centers, ~50,000 items, ~200k hits/month
“CGSpace” in July, 2015
How we use DSpace
● Primary location for institutional outputs!
● (No posting PDFs on corporate website!)
● Content people embedded in each department help capture results (presentations, papers, brochures, etc)
● Integrate with website and blogs via RSS feeds
● (Direct ALL traffic to DSpace!)
● For data sets, videos, etc we make a metadata-only accession with a link to eg YouTube
DSpace hierarchies
● Communities, sub-communities, and collections
● Tempting to model after organization hierarchy!
● (we did)
● … but organization hierarchies change!
Mostly organized by output type now...
Metadata
● Standard Dublin Core is available
● No AGROVOC!
● You can create custom controlled vocabularies in arbitrary namespaces, eg: cg.subject.ilri
● Display custom fields selectively in the XMLUI item list and view pages
Custom metadata displayed on ILRI item page
“Discovery” facets
● Context-aware metadata summaries
● Great for content people and users alike
● Side effect: helps spot metadata inconsistencies!
● … Open Access, Open access, open Access, etc.
● DSpace 4+, XMLUI
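The inconsistencies that facets expose can also be checked in bulk. A minimal sketch, assuming the metadata values have already been exported to a plain list; the function name and sample values are illustrative and not part of DSpace:

```python
from collections import defaultdict


def find_inconsistent_values(values):
    """Group metadata values that differ only by case or extra whitespace."""
    groups = defaultdict(set)
    for value in values:
        # Normalize whitespace and case so near-duplicates share one key.
        key = " ".join(value.split()).casefold()
        groups[key].add(value)
    # Report only the keys that were spelled more than one way.
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


# Hypothetical sample of values taken from an access-rights field.
sample_values = ["Open Access", "Open access", "open Access", "Limited Access"]
print(find_inconsistent_values(sample_values))
```

Each reported group is a candidate for a one-off metadata cleanup.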
Search engine optimization (SEO)
Help Google Scholar consume your content...
1. XML sitemaps (see DSpace manual)
2. Submit sitemap to Google Webmaster Tools to control indexing, see stats, etc.
3. Single, consistent domain name, ie: cgspace.cgiar.org
4. Persistent links for resources (“Handle”)
5. Website speed and HTTPS both a plus
6. Bing, Yahoo, and Yandex less important
SEO: crawling vs consuming
● Traditionally search engines basically “stumble” upon your content
● Using XML sitemaps they can consume it in a structured way
● Google discontinued the use of OAI for discovering site content in 2008!
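To show what "structured" means here, the sketch below builds a tiny sitemap in the sitemaps.org format. DSpace generates its own sitemaps (see the manual); this is only an illustration of the format, and the item URL pattern and date are assumptions:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in entries:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical entry: the /handle/ path and the date are assumptions.
sitemap_xml = build_sitemap(
    [("https://cgspace.cgiar.org/handle/10568/67073", "2015-07-28")]
)
print(sitemap_xml)
```

A crawler reading this file gets every item URL directly instead of having to follow links to find them.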
Drinking from the firehose!
Sitemap view in Google Webmaster Tools
Meteoric rise in Google’s indexes
Importance of persistent links
● Website addresses change…
● mahider.ilri.org -> cgspace.cgiar.org
● But resources stay the same!
http://hdl.handle.net/10568/67073
● “Handle” service from handle.net
● Everything under prefix 10568 is CGSpace
● Default DSpace handle prefix is 123456789!
dc.identifier.uri: persistent uniform resource identifier
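A handle is just a prefix and a suffix behind a resolver address, so it is easy to work with in scripts. A small sketch, using the example handle above; the helper names are hypothetical:

```python
from urllib.parse import urlparse

HANDLE_RESOLVER = "http://hdl.handle.net"


def handle_to_url(handle):
    """Turn a bare handle such as '10568/67073' into a resolver URL."""
    return f"{HANDLE_RESOLVER}/{handle}"


def split_handle(uri):
    """Split a handle URL into its (prefix, suffix) parts."""
    path = urlparse(uri).path.lstrip("/")
    prefix, _, suffix = path.partition("/")
    return prefix, suffix


def has_prefix(uri, prefix="10568"):
    """Check whether a handle URL belongs to the given prefix."""
    return split_handle(uri)[0] == prefix


print(handle_to_url("10568/67073"))
print(split_handle("http://hdl.handle.net/10568/67073"))
print(has_prefix("http://hdl.handle.net/123456789/1"))
```

A check like `has_prefix` is one way to spot items that still carry the default 123456789 prefix.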
Getting data INTO DSpace
● Day-to-day submission is manual (by a small army of editors)
● One-time batch uploads of items from other systems in CSV format (InMagic!)
● OAI-PMH for metadata only
● OAI-ORE for metadata + bitstreams (eg, from another DSpace, Sharepoint, etc)
● SWORD (haven't tried)
● REST API (DSpace 5+, haven't tried)
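For the one-time CSV batch route, the work is mostly reshaping the old system's export into the columns DSpace expects. A rough sketch, assuming DSpace's batch metadata editing conventions (a `+` in the `id` column for new items, `||` between multiple values); check these against the manual for your version. The legacy records and the collection handle are hypothetical:

```python
import csv
import io


def records_to_dspace_csv(records, collection_handle):
    """Convert legacy records into a CSV shaped for batch metadata import."""
    fieldnames = [
        "id",
        "collection",
        "dc.title",
        "dc.contributor.author",
        "dc.date.issued",
    ]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(
            {
                "id": "+",  # assumed marker for "create a new item"
                "collection": collection_handle,
                "dc.title": record["title"],
                # Assumed separator for repeated values of one field.
                "dc.contributor.author": "||".join(record["authors"]),
                "dc.date.issued": record["year"],
            }
        )
    return buffer.getvalue()


# Hypothetical records exported from an older library system.
legacy_records = [
    {"title": "Sample report", "authors": ["Doe, J.", "Roe, R."], "year": "2008"},
]
csv_text = records_to_dspace_csv(legacy_records, "10568/1")
print(csv_text)
```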
Getting data OUT OF DSpace
● REST API for structured JSON or XML 👍
● OAI-PMH for metadata
● OAI-ORE for metadata + bitstreams (PDFs, etc)
● RSS feeds for websites / blogs
● XML sitemaps for search engines
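As an illustration of the OAI-PMH route, the sketch below builds a `ListRecords` request and pulls titles out of a response. The verb, parameters and namespaces come from the OAI-PMH and Dublin Core specifications; the endpoint path and the embedded sample response are assumptions, and no network call is made:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def extract_titles(xml_text):
    """Collect every dc:title found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(f"{{{DC_NS}}}title")]


# Hypothetical, heavily trimmed response used instead of a live request.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical sample title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

# The /oai/request path is an assumption about the deployment.
print(list_records_url("https://cgspace.cgiar.org/oai/request"))
print(extract_titles(SAMPLE_RESPONSE))
```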
CCAFS website, powered by Drupal + DSpace APIs
“Latest outputs” on ILRI homepage, via DSpace RSS
“Latest outputs” on project blog, via DSpace RSS
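A "latest outputs" box like the ones above needs little more than the titles and links from a feed. A minimal sketch, assuming an RSS 2.0 feed; the sample feed and function name are hypothetical, and a real site would fetch the feed over HTTP first:

```python
import xml.etree.ElementTree as ET


def latest_outputs(rss_text, limit=5):
    """Return (title, link) pairs for the first few items in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    items = root.findall("./channel/item")[:limit]
    return [(item.findtext("title"), item.findtext("link")) for item in items]


# Hypothetical feed content standing in for a real DSpace RSS feed.
SAMPLE_FEED = """
<rss version="2.0">
  <channel>
    <title>Sample collection</title>
    <item>
      <title>First sample output</title>
      <link>http://hdl.handle.net/10568/67073</link>
    </item>
    <item>
      <title>Second sample output</title>
      <link>http://hdl.handle.net/10568/1</link>
    </item>
  </channel>
</rss>
"""

for title, link in latest_outputs(SAMPLE_FEED, limit=1):
    print(title, link)
```

Because the links are handles, the box keeps working even if the repository's domain name changes.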
CGSpace technology stack
- NGINX 1.8 HTTP server
  - TLS termination, SPDY, redirects, virtual hosts
- Tomcat 7 servlet engine
  - runs DSpace, bound to localhost
- Ubuntu 14.04 GNU/Linux OS
  - long-term support release, good mix of stable / new
Skills needed in your organization
Besides content people(!)...
● Prioritize: Linux systems administration experience (Tomcat, httpd, PostgreSQL, DNS, SSH, git)
● General: computer science background
● Web developers a diverse bunch...
● Java development experience doesn't hurt
Extra considerations
● Item mapping
● Maintenance tasks (background batch jobs)
● Backups of assetstore and PostgreSQL!
● Altmetrics tracks social media mentions
● Separate production / development environments
● CGSpace server is $80/month
● ~20GB of PDFs, ~8GB of Solr data
Getting help
● “DSpace Tech” mailing list
● “dspace” tag on StackOverflow website
● [email protected]
This presentation has a Creative Commons licence. You are free to re-use or distribute this work for non-commercial purposes, provided credit is given to ILRI.
better lives through livestock
ilri.org
Box 30709, Nairobi 00100, Kenya
Phone +254 20 422 3000
Fax +254 20 422 3001
Email [email protected]
ILRI is a member of the CGIAR consortium
ILRI has offices in: Central America • East Africa • South Asia • Southeast and East Asia • Southern Africa • West Africa