DSpace Basic Tutorial Stuart Lewis & Chris Yates support@rsp.ac.uk.

Post on 23-Dec-2015


DSpace Basic Tutorial

Stuart Lewis & Chris Yates
support@rsp.ac.uk

Information

• Details:
  – Requires tutorial CD
    • DSpace 1.4.2
    • CD includes DSpace 1.5 alpha
• CD and workbook created by:
  – Chris Yates
    • chris.yates@aber.ac.uk
  – Stuart Lewis
    • stuart.lewis@aber.ac.uk

Information

• Tutorial created by:
  – Repositories Support Project
  – http://www.rsp.ac.uk/
  – support@rsp.ac.uk
• Funded by:
  – JISC as part of the RepositoryNet

Contents

1. Introduction to DSpace

2. The tutorial CD

3. DSpace technical architecture

4. Users and groups

5. Item structure

6. Metadata and item input, workflows

7. Search and browse

8. Import / export / harvest

Introduction to DSpace

• “DSpace captures your data in any format in text, video, audio, and data. It distributes it over the web. It indexes your work, so users can search and retrieve your items. It preserves your digital work over the long term. DSpace provides a way to manage your research materials and publications in a professionally maintained repository to give them greater visibility and accessibility over time.”
  – www.dspace.org

*Introduction to the tutorial CD

• Intended to be used with tutorial
• DSpace version 1.4.2 (includes 1.5 alpha)
• Shouldn’t affect your PC
• Installs no software
• Can be reused
• DOES NOT SAVE DATA!
• Disclaimer…

DSpace Technical Architecture

• Written in Java
  – Can be run on any platform that supports Java
    • Most installations on Unix (Linux* / Solaris)
    • Runs on Windows / Mac OS X
  – Sun JDK (not GNU)
  – 1.4 for <= version 1.4.2
  – 1.5* for >= version 1.5

DSpace Technical Architecture

• Database:
  – Same machine, or database server
    • Postgres*
    • Oracle
• Web application server
  – Tomcat*
  – Jetty
  – Other

DSpace file layout

• Download
  – [dspace-src]
    • Edit config/dspace.cfg
    • Build
• Installed
  – [dspace] (often /dspace/)
    • [dspace]/assetstore/
    • [dspace]/upload/
    • [dspace]/logs/
    • [dspace]/bin/
    • [dspace]/search/

*Create the database

• Create a database user
  – Who will own the database
  – Called dspace
• Create the database
  – Called dspace
  – UNICODE encoding
  – Owned by the dspace database user

Create the database

• Create a database user
  – Double-click on the ‘Terminal’ icon
  – ‘su - postgres’ (password is ‘postgres’)
  – ‘createuser -U postgres -d -P dspace’
  – Password is ‘postgres’
• Create the database
  – ‘createdb -U dspace -E UNICODE dspace’

*Build DSpace

• DSpace needs to be compiled
• Uses ‘ant’ build system
• Inserts default data into the database
  – Table structures
  – Dublin Core metadata schema
  – Bitstream formats
• Builds package for the web server
• Configuration can be changed

Build DSpace

• ‘cd /dspace142-src/’
• ‘gedit config/dspace.cfg’
  – Change dspace.name to your name
  – Save and quit
• ‘ant fresh_install’
• ‘chmod 777 /dspace142/upload’
• ‘chmod 777 /dspace142/assetstore’
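
The dspace.name edit above can also be scripted rather than done in gedit. The following is a minimal Python sketch, not part of the tutorial CD: `set_property` is a hypothetical helper that assumes dspace.cfg consists of simple `key = value` lines, and the sample text is invented for illustration.

```python
import re


def set_property(cfg_text, key, value):
    """Return cfg_text with `key = value` set, replacing the first
    uncommented assignment of `key` or appending a new line."""
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    new_line = f"{key} = {value}"
    if pattern.search(cfg_text):
        # A function replacement avoids backslash handling in `value`.
        return pattern.sub(lambda match: new_line, cfg_text, count=1)
    separator = "" if (not cfg_text or cfg_text.endswith("\n")) else "\n"
    return f"{cfg_text}{separator}{new_line}\n"


if __name__ == "__main__":
    # Hypothetical excerpt, not the real file shipped with DSpace.
    sample = "# Hypothetical excerpt\ndspace.name = DSpace\n"
    print(set_property(sample, "dspace.name", "My Repository"), end="")
```

To apply it to the real file, read /dspace142-src/config/dspace.cfg into a string, pass it through `set_property`, and write the result back before running ‘ant fresh_install’.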

*Deploy to the web server

• .war files are packaged applications

• Tomcat is a Java web application server
  – Tomcat automatically unpacks .war files

• Two web applications
  – DSpace & DSpace OAI interface

• Copy .war files to Tomcat directory

• Start Tomcat

Deploy to the web server

• ‘cp build/*.war /var/lib/tomcat5.5/’

• ‘sudo /etc/init.d/tomcat5.5 start’

• Load Firefox
  – Go to http://localhost:8080/dspace/
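
The copy step above can be sketched in Python as follows. This is an illustration only: `deploy_wars` is a hypothetical helper, the file names dspace.war and dspace-oai.war are assumed from the /dspace/ and /dspace-oai/ URLs used in this tutorial, and the demo uses throwaway directories rather than a real Tomcat installation.

```python
import shutil
import tempfile
from pathlib import Path


def deploy_wars(build_dir, tomcat_dir):
    """Copy every .war file from build_dir into tomcat_dir and
    return the copied file names in sorted order."""
    target = Path(tomcat_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for war in sorted(Path(build_dir).glob("*.war")):
        shutil.copy2(war, target / war.name)
        copied.append(war.name)
    return copied


if __name__ == "__main__":
    # Demonstrate with temporary directories instead of a real Tomcat.
    with tempfile.TemporaryDirectory() as tmp:
        build = Path(tmp, "build")
        build.mkdir()
        for name in ("dspace.war", "dspace-oai.war"):
            (build / name).write_bytes(b"placeholder")
        print(deploy_wars(build, Path(tmp, "tomcat")))
```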

DSpace users and groups

• Administrator
  – bin/create-administrator
• Create more users:
  – Web user interface
  – Administrator
• Other authentication methods:
  – LDAP (LDAP or Active Directory)
  – Pluggable and stackable authentication

DSpace users and groups

• Groups
  – Members can be users or other groups
  – E.g. Dept group made up of research group groups

• User defined or

• Automatically generated for collections
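
Because groups can contain other groups, working out who belongs to a group means following the nesting. The sketch below illustrates the idea only and is not DSpace's implementation: the `groups` dictionary, the group names, and the user names are all hypothetical, and for simplicity any member that is also a key of the dictionary is treated as a group.

```python
def expand_members(group, groups, _seen=None):
    """Return the set of users in `group`, following nested groups.

    `groups` maps a group name to a list of members; a member that is
    itself a key of `groups` is treated as a nested group.
    """
    seen = set() if _seen is None else _seen
    if group in seen:
        # Already visited: avoids infinite recursion on cyclic nesting.
        return set()
    seen.add(group)
    users = set()
    for member in groups.get(group, []):
        if member in groups:
            users |= expand_members(member, groups, seen)
        else:
            users.add(member)
    return users


if __name__ == "__main__":
    groups = {
        "Dept": ["Research A", "Research B"],
        "Research A": ["alice", "bob"],
        "Research B": ["carol"],
    }
    print(sorted(expand_members("Dept", groups)))
```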

*Create DSpace users

• Create an administrator

• Log in

• Log out

• Create a normal user

• Modify a group

Create DSpace users

• Create first administrator
  – ‘bin/create-administrator’
  – Answer questions
• Create another user
  – Administrator pages, ‘E-People’
  – ‘Add EPerson’
• Promote new user to administrator
  – Administrator pages, ‘Groups’
  – Edit ‘Administrator’ group

*Communities & collections

• Communities
  – Often used to represent organisational units
  – Can have sub-communities and collections
  – Can be branded (logo) and have own policies
• Collections
  – Hold items

Communities & collections

• ‘Communities / Collections’ link

• ‘Create Top-Level Community’
  – Enter name, short description
  – Press ‘Create’ button
• Create collection
  – Enter name, short description
  – Add the Administrator group to submitters

DSpace items

• Metadata
  – One or more metadata schemas
  – User-entered
  – System-generated (e.g. accessioned date)
• Files
  – In bundles
  – Special bundles (e.g. extracted text, licences)

DSpace Items

• Items can be mapped across collections
  – E.g. appear in central and departmental e-theses collections
  – Same as a file system symbolic link
• Submissions controlled by input forms
  – config/input-forms.xml
  – Input forms

• Controlled vocabularies

DSpace Items

• input-forms.xml

<form-map>
  <name-map collection-handle="default" form-name="traditional" />
  <name-map collection-handle="2160/56" form-name="etheses" />
  <name-map collection-handle="2160/802" form-name="lawdepartment" />
</form-map>
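
The form-map above pairs collection handles with form names, with "default" covering every collection not listed. A minimal Python sketch of that lookup follows; `form_for_collection` is a hypothetical helper written for illustration, not DSpace code, and the handle 1234/5 in the demo is made up.

```python
import xml.etree.ElementTree as ET

FORM_MAP = """
<form-map>
  <name-map collection-handle="default" form-name="traditional" />
  <name-map collection-handle="2160/56" form-name="etheses" />
  <name-map collection-handle="2160/802" form-name="lawdepartment" />
</form-map>
"""


def form_for_collection(form_map_xml, handle):
    """Return the form name mapped to `handle`, falling back to the
    entry whose collection-handle is "default"."""
    root = ET.fromstring(form_map_xml)
    mapping = {
        entry.get("collection-handle"): entry.get("form-name")
        for entry in root.iter("name-map")
    }
    return mapping.get(handle, mapping.get("default"))


if __name__ == "__main__":
    print(form_for_collection(FORM_MAP, "2160/56"))
    print(form_for_collection(FORM_MAP, "1234/5"))  # unlisted handle
```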

input-forms.xml

<form name="traditional">
  <page number="1">
    <field>
      …
    </field>
    <field>
      …
    </field>
  </page>
  <page number="…">
    …
  </page>
</form>

input-forms.xml

<field>
  <dc-schema>schema</dc-schema>
  <dc-element>element</dc-element>
  <dc-qualifier>qualifier</dc-qualifier>
  <repeatable>true/false</repeatable>
  <label>Text label</label>
  <input-type>name/onebox/date/twobox/textarea/dropdown/qualdrop_value</input-type>
  <hint>Expanded hint</hint>
  <required>Warning to show if not entered</required>
</field>

input-forms.xml

<value-pairs value-pairs-name="name" dc-term="element">
  <pair>
    <displayed-value>English</displayed-value>
    <stored-value>en</stored-value>
  </pair>
  <pair>
    <displayed-value>Welsh</displayed-value>
    <stored-value>cy</stored-value>
  </pair>
</value-pairs>
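
Each pair links the text shown in a dropdown to the value stored in the metadata. As a rough illustration, the Python sketch below reads such a block into a dictionary keyed by stored value; `load_value_pairs` is a hypothetical helper, and the sample reuses the placeholder attribute values from the slide.

```python
import xml.etree.ElementTree as ET

VALUE_PAIRS = """
<value-pairs value-pairs-name="name" dc-term="element">
  <pair>
    <displayed-value>English</displayed-value>
    <stored-value>en</stored-value>
  </pair>
  <pair>
    <displayed-value>Welsh</displayed-value>
    <stored-value>cy</stored-value>
  </pair>
</value-pairs>
"""


def load_value_pairs(xml_text):
    """Map each stored value to the label displayed for it."""
    root = ET.fromstring(xml_text)
    return {
        pair.findtext("stored-value"): pair.findtext("displayed-value")
        for pair in root.iter("pair")
    }


if __name__ == "__main__":
    print(load_value_pairs(VALUE_PAIRS))
```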

input-forms.xml

<field>
  <dc-schema>dc</dc-schema>
  <dc-element>contributor</dc-element>
  <dc-qualifier>author</dc-qualifier>
  <repeatable>true</repeatable>
  <label>Authors</label>
  <input-type>name</input-type>
  <hint>Enter the names of the authors of this item below.</hint>
  <required></required>
</field>

input-forms.xml

<field>
  <dc-schema>dc</dc-schema>
  <dc-element>title</dc-element>
  <dc-qualifier></dc-qualifier>
  <repeatable>false</repeatable>
  <label>Title</label>
  <input-type>onebox</input-type>
  <hint>Enter the main title of the item.</hint>
  <required>You must enter a main title for this item.</required>
</field>
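
To see how these elements fit together, the following Python sketch reads field definitions and reports each field's full metadata name, whether it repeats, and whether it is required (a non-empty `<required>` element). It is an illustration under stated assumptions: `summarise_fields` is a hypothetical helper, and the sample is an abbreviated copy of the two fields above with the input-type and hint elements left out.

```python
import xml.etree.ElementTree as ET

PAGE = """
<page number="1">
  <field>
    <dc-schema>dc</dc-schema>
    <dc-element>contributor</dc-element>
    <dc-qualifier>author</dc-qualifier>
    <repeatable>true</repeatable>
    <label>Authors</label>
    <required></required>
  </field>
  <field>
    <dc-schema>dc</dc-schema>
    <dc-element>title</dc-element>
    <dc-qualifier></dc-qualifier>
    <repeatable>false</repeatable>
    <label>Title</label>
    <required>You must enter a main title for this item.</required>
  </field>
</page>
"""


def summarise_fields(page_xml):
    """Return one dictionary per <field> found in page_xml."""
    summaries = []
    for field in ET.fromstring(page_xml).iter("field"):
        def text(tag, field=field):
            # Missing or empty elements both become an empty string.
            return (field.findtext(tag) or "").strip()

        parts = [text("dc-schema"), text("dc-element"), text("dc-qualifier")]
        summaries.append({
            "name": ".".join(part for part in parts if part),
            "label": text("label"),
            "repeatable": text("repeatable") == "true",
            "required": bool(text("required")),
        })
    return summaries


if __name__ == "__main__":
    for summary in summarise_fields(PAGE):
        print(summary)
```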

*Deposit an item

• Choose collection

• Enter metadata

• Upload file

• Confirm details

• Agree to the licence

Deposit an item

• Enter the collection you created
  – Tick ‘The item has been published or publicly distributed before’ (asks extra questions about the publishing, i.e. date / publisher)
  – Enter metadata
  – Upload file (‘/dspace-docs/RSP.pdf’)
  – Agree to licence
  – Submit item

*Create a workflow

• Three workflow steps
  – Accept/reject step
    • E.g. Head of research
    • “Should item be included in the repository?”
  – Accept/reject/edit metadata step
    • E.g. Repository manager
  – Edit metadata step
    • E.g. Librarian
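
The differences between the steps can be made concrete with a deliberately simplified Python model. It is not DSpace code: the step numbers, the action names, the treatment of "edit" as edit-then-accept, and the `run_workflow` helper are all assumptions made for illustration. The one point taken from the slide is that only the first two steps can reject an item.

```python
# Actions each step allows in this simplified model.
STEP_ACTIONS = {
    1: {"accept", "reject"},          # accept/reject step
    2: {"accept", "reject", "edit"},  # accept/reject/edit metadata step
    3: {"accept", "edit"},            # edit metadata step (no reject)
}


def run_workflow(enabled_steps, decisions):
    """Pass an item through the enabled steps in order.

    `decisions` maps a step number to the reviewer's action; steps
    without an entry default to "accept". Returns "archived" or
    "rejected at step N".
    """
    for step in sorted(enabled_steps):
        action = decisions.get(step, "accept")
        if action not in STEP_ACTIONS[step]:
            raise ValueError(f"step {step} does not allow {action!r}")
        if action == "reject":
            return f"rejected at step {step}"
    return "archived"


if __name__ == "__main__":
    print(run_workflow({2}, {2: "edit"}))
    print(run_workflow({1, 2}, {1: "reject"}))
```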

Create a workflow

• Create new collection
  – Tick ‘This submission will include an accept/reject/edit metadata step’
  – Enter name and short description
  – Add ‘Administrator’ group to workflow
• Submit to the new collection
• Go to ‘My DSpace’ to enter the workflow
  – ‘Edit Metadata’
  – ‘Approve’

Search and browse

• Browse
  – By:
    • Author / title / date
  – Database driven
  – Always up to date
• Search
  – Lucene search engine
  – Define fields to index in dspace.cfg
  – Full texts
  – Not always up to date

*Search system initialisation

• Build indexes
  – Index metadata
    • Extract from database
  – Index full-texts
    • Extract from PDF/Doc files
    • Extra MediaFilters can be written

Search system initialisation

• Search ‘Aberystwyth’
  – No results
• Run:
  – ‘bin/filter-media’
    • Extract full texts
    • (create thumbnails)
    • Build indexes
• Search ‘Aberystwyth’
  – See results!

*Scheduled background jobs

• filter-media
  – Extract texts and build indexes
• sub-daily
  – Send subscription emails
• checker
  – Checks bitstream checksums
• stat-*
  – Statistics
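
The idea behind the checker job is to recompute a file's checksum and compare it with the value recorded earlier. The Python sketch below shows that comparison in isolation; it is not the DSpace checker, the function names are hypothetical, and MD5 is used only as an assumed algorithm (it is a parameter here).

```python
import hashlib
import io


def checksum(stream, algorithm="md5", chunk_size=65536):
    """Return the hex digest of a binary stream, read in chunks."""
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def is_intact(stream, stored_checksum, algorithm="md5"):
    """True if the stream's checksum matches the stored value."""
    return checksum(stream, algorithm) == stored_checksum.lower()


if __name__ == "__main__":
    data = b"example bitstream"
    stored = checksum(io.BytesIO(data))
    print(is_intact(io.BytesIO(data), stored))         # unchanged content
    print(is_intact(io.BytesIO(data + b"!"), stored))  # modified content
```

For files on disk, pass a handle opened with `open(path, "rb")` instead of the in-memory stream.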

*RSS feeds and thumbnails

• Configured in dspace.cfg

• RSS feeds:
  – webui.feed.enable = [false|true]
  – webui.feed.localresolve = [false|true]
• Thumbnails:
  – webui.item.thumbnail.show = [true|false]
  – webui.browse.thumbnail.show = [false|true]
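
To check how these switches are currently set, the settings can be read back from the configuration text. The Python sketch below is illustrative only: it assumes dspace.cfg uses simple `key = value` lines with `#` comments, both helper names are hypothetical, and the default passed to `is_enabled` is chosen by the caller rather than taken from DSpace.

```python
def parse_properties(text):
    """Parse `key = value` lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def is_enabled(props, key, default=False):
    """Interpret a property as a boolean; `default` applies if absent."""
    value = props.get(key)
    if value is None:
        return default
    return value.lower() == "true"


if __name__ == "__main__":
    # Hypothetical excerpt, not the real dspace.cfg.
    cfg = "webui.feed.enable = true\nwebui.feed.localresolve = false\n"
    props = parse_properties(cfg)
    print(is_enabled(props, "webui.feed.enable"))
    print(is_enabled(props, "webui.feed.localresolve"))
```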

RSS feeds and thumbnails

• ‘gedit /dspace142/config/dspace.cfg’
  – Set webui.feed.enable to true
  – Set webui.feed.localresolve to true
• ‘sudo /etc/init.d/tomcat5.5 restart’
• Upload new item with PNG file
  – Upload a PNG from /home/dspace/examples/
  – ‘/dspace142/bin/filter-media’
  – See thumbnail

Import / export

• See docs

• Bulk import command line tool
  – Imports one item per directory
  – Multiple files / metadata file / contents file
• Bulk exporter
  – Writes same file format
  – Adds file containing handle (for re-import)

Import / export

archive_directory/
  item_000/
    dublin_core.xml -- qualified DC metadata
    contents        -- one line per filename
    file_1.doc      -- files to be added
    file_2.pdf

[dspace]/bin/dsrun org.dspace.app.itemimport.ItemImport --add --eperson=joe@user.com --collection=collectionID --source=items_dir --mapfile=mapfile
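
An item directory in the layout above can be generated with a short script. The Python sketch below is a hedged illustration: `write_item` is a hypothetical helper, and the `<dcvalue element="…" qualifier="…">` form it writes into dublin_core.xml is an assumption based on the DSpace item importer documentation, so check it against the docs for your version before importing.

```python
import tempfile
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr


def write_item(archive_dir, item_name, metadata, files):
    """Create one item directory for the bulk importer.

    metadata: list of (element, qualifier, value); qualifier may be None.
    files: dict mapping file name to its bytes content.
    Returns the path of the item directory.
    """
    item_dir = Path(archive_dir) / item_name
    item_dir.mkdir(parents=True, exist_ok=True)

    lines = ["<dublin_core>"]
    for element, qualifier, value in metadata:
        qual = qualifier or "none"  # assumed marker for unqualified fields
        lines.append(
            f"  <dcvalue element={quoteattr(element)} "
            f"qualifier={quoteattr(qual)}>{escape(value)}</dcvalue>"
        )
    lines.append("</dublin_core>")
    (item_dir / "dublin_core.xml").write_text("\n".join(lines) + "\n", encoding="utf-8")

    for name, data in files.items():
        (item_dir / name).write_bytes(data)
    # The contents file lists one file name per line.
    (item_dir / "contents").write_text(
        "".join(f"{name}\n" for name in files), encoding="utf-8"
    )
    return item_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        item = write_item(
            tmp, "item_000",
            [("title", None, "Example item")],
            {"file_1.doc": b"placeholder"},
        )
        print(sorted(p.name for p in item.iterdir()))
```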

Harvesting / OAI-PMH

• OAI-PMH interface
  – Separate web application
  – /dspace-oai/
  – /dspace-oai/request?verb=
  – /dspace-oai/request?verb=Identify
  – /dspace-oai/request?verb=ListSets
  – /dspace-oai/request?verb=GetRecord
  – /dspace-oai/request?verb=ListIdentifiers
  – /dspace-oai/request?verb=ListMetadataFormats
  – /dspace-oai/request?verb=ListRecords
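
Several of these verbs need extra arguments before the server will answer: under the OAI-PMH 2.0 specification, GetRecord requires `identifier` and `metadataPrefix`, and ListIdentifiers and ListRecords require `metadataPrefix` unless a `resumptionToken` is supplied. The Python sketch below builds request URLs with those checks. The `oai_url` helper is hypothetical, and the base URL assumes the OAI application sits beside the web UI at localhost:8080 as on the tutorial CD.

```python
from urllib.parse import urlencode

# Arguments each verb requires, per the OAI-PMH 2.0 specification.
REQUIRED_ARGS = {
    "Identify": (),
    "ListSets": (),
    "ListMetadataFormats": (),
    "ListIdentifiers": ("metadataPrefix",),
    "ListRecords": ("metadataPrefix",),
    "GetRecord": ("identifier", "metadataPrefix"),
}


def oai_url(base, verb, **args):
    """Build an OAI-PMH request URL, checking required arguments."""
    if verb not in REQUIRED_ARGS:
        raise ValueError(f"unknown verb: {verb}")
    if "resumptionToken" not in args:
        # A resumption token replaces the other arguments of list verbs.
        missing = [name for name in REQUIRED_ARGS[verb] if name not in args]
        if missing:
            raise ValueError(f"{verb} requires: {', '.join(missing)}")
    query = {"verb": verb}
    query.update(args)
    return base + "?" + urlencode(query)


if __name__ == "__main__":
    base = "http://localhost:8080/dspace-oai/request"  # assumed location
    print(oai_url(base, "Identify"))
    print(oai_url(base, "ListRecords", metadataPrefix="oai_dc"))
```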

The end…

• Incomplete
  – Lots, lots more!
  – Email support@rsp.ac.uk
  – Email the dspace-tech mailing list
• Advanced tutorial this afternoon
  – Or surgery / open discussion / demos