D. Bretherton, D. A. Smith, J. Lambert, mc schraefel. MusicNet: Aligning Musicology’s Metadata
Music Linked Data Workshop
12 May 2011 • JISC, London
MusicNet: Aligning Musicology’s Metadata
David Bretherton (Music), Daniel Alexander Smith, Joe Lambert and mc schraefel (Electronics and Computer Science)
http://musicnet.mspace.fm
David Bretherton
musicSpace, the precursor to MusicNet
Problem
Digitised data is often ‘siloed’.
Geographical dispersal has been replaced by virtual dispersal on the web. Data is now segregated into countless online repositories by:
– Media type (text, image, audio, video)
– Date of creation/publication
– Subject
– Language
– Copyright holder
– Ad hoc/insecure nature of project funding
Interoperability has generally not been given a high enough priority.
And, because the datasets are ‘mature’, the data isn’t Linked Data.
Solution
‘musicSpace’ is a faceted browser
Demonstration
‘What recordings of works by Cage exist, which performers have recorded a particular work by Cage, and what else by Cage have they recorded?’
Screencast 1:
http://www.youtube.com/watch?v=keTN12OWies&hd=1
How musicSpace provided the motivation for MusicNet
Problem: you can align metadata fields, but this doesn’t align the data in those fields
Schubert Schubert, Franz Schubert, Franz Peter Shu-po-tʻe, ‡d 1797-1828 Schubert ‡d 1797-1828 F. P. Schubert Schubert, ... ‡d 1797-1828 Schubert, F. Schubert, F. ‡d 1797-1828 Schubert, Fr. Schubert, Fr. ‡d 1797-1828 Schubert, Franciszek. Schubert, Franc. ‡d 1797-1828 Schubert, Francois ‡d 1797-1828 Schubert, Franz P. ‡d 1797-1828
Schubert, Franz Peter Schubert, Franz Peter, ‡d 1797-1828 Schubert, Franz Peter ‡d 1797-1828 Schubert, Francois, ‡d 1797-1828 Schubert. Schubert ‡d 1797-1828 Shu-po-tʿe ‡d 1797-1828 Shubert, F. (Frant $s% ) ‡d 1797-1828 Shubert, F. ‡q (Frant $s% ), ‡d 1797-1828 Shubert, Frant $s% , ‡d 1797-1828 Shubert, Frant $s% ‡d 1797-1828 Shūberuto, F. Shūberuto, Furantsu ‡d 1797-1828 Subert, Franc ‡d 1797-1828 Subertas, F. (Francas), ‡d 1797-1828
Subertas, Francas Peteris, 1797-1828‡d Subert, F.
, .Subertas F ‡d 1797-1828 פרנץ, שוברט
シューベルト, F., 1797-1828 シューベルト , フランツ ‡d 1797-1828 舒柏特 , 弗朗茨 Schubert, Francois 1797-1828‡d
, Schubert Franz Peter 1797-1828‡d
Causes of ‘dirty’ data (for names)
Different naming conventions;
– e.g. ‘Bach, Johann Sebastian’ or ‘J. S. Bach’
Inclusion of non-name data in name field;
– e.g. ‘Schubert, Franz, 1797-1828. Songs’, or ‘Allen, Betty (Teresa)’
Different languages (and alphabets);
User input errors.
– e.g. ‘Bach, Johhan Sebastien’
Dirty data degrades the user experience
Searching for compositions by the composer Franz Schubert (1797–1828)...
Screencast 2:
http://www.youtube.com/watch?v=pFsYfz1vlAg&hd=1
MusicNet’s alignment tool
Prototype 1 (musicSpace era)
Used Alignment API & Google Docs
We used Alignment API to compare the names as strings, using WordNet to enable word stemming, synonym support, etc.
Alignment API produces a similarity measure for each possible match.
We planned to set a threshold for automatic approval.
Matches below that threshold would be sent to a Google Docs spreadsheet for expert review.
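The planned triage can be sketched roughly as follows. This is only an illustration: Python's `difflib.SequenceMatcher` stands in for the Alignment API similarity measure (no WordNet support here), and the cut-off value and the function names `similarity` and `triage` are hypothetical, since the slides do not state them.

```python
from difflib import SequenceMatcher

# Hypothetical cut-off; the slides do not give a value.
AUTO_APPROVE_THRESHOLD = 0.9


def similarity(a: str, b: str) -> float:
    """Stand-in string similarity in [0, 1]; not the Alignment API measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def triage(pairs, threshold=AUTO_APPROVE_THRESHOLD):
    """Split candidate name pairs into auto-approved and needs-review lists."""
    approved, review = [], []
    for a, b in pairs:
        score = similarity(a, b)
        # Pairs at or above the threshold are approved automatically;
        # everything else would go to the expert-review spreadsheet.
        (approved if score >= threshold else review).append((a, b, score))
    return approved, review


candidates = [
    ("Schubert, Franz", "Schubert, Franz"),
    ("Schubert, Franz", "Bach, Johann Sebastian"),
]
approved, review = triage(candidates)
```

Here the identical pair is approved automatically and the Schubert/Bach pair is queued for review. Real catalogue names are far less clear-cut.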
Shortcoming: no threshold
False matches with high similarity measures:
True matches with low similarity measures:
Prototype 2 (building a custom tool for MusicNet)
Design considerations
From Prototype 1:
– A completely automated solution is out of the question (for the moment...).
– We needed a custom tool with a human-friendly UI (we also wanted keyboard shortcuts for speed).
– Access to additional metadata (i.e. context), so matches can be researched by the reviewer.
From experience with faceted browsers:
– Alphabetically sorted columns enable one to spot synonymous names at a glance.
  · Normally sources give names surname first; duplication arises from the different representation of given names.
Alignment process
Data*
Suggested groups
Algorithm compares a hash of the alpha-only, lower-case version of each name
No groups suggested
User verified* or rejected*
Synonym groups
Manual grouping (research*)
URIs Alternative names Back links*
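The grouping suggestion step might look something like the sketch below. It assumes that "alpha-only" means dropping every non-letter character, and it uses SHA-256 purely for illustration, since the slides do not name the hash function. The helper names `name_key` and `suggest_groups` are hypothetical.

```python
import hashlib
from collections import defaultdict


def name_key(name: str) -> str:
    """Hash of the name with non-letters removed and letters lower-cased."""
    letters = "".join(ch for ch in name if ch.isalpha()).lower()
    return hashlib.sha256(letters.encode("utf-8")).hexdigest()


def suggest_groups(names):
    """Return groups of distinct names that share a key (two or more members)."""
    buckets = defaultdict(list)
    for name in names:
        bucket = buckets[name_key(name)]
        if name not in bucket:
            bucket.append(name)
    # Names left alone in a bucket get no suggestion and need manual grouping.
    return [group for group in buckets.values() if len(group) > 1]


names = [
    "Schubert, Franz",
    "Schubert, Franz, 1797-1828",
    "Schubert Franz",
    "Schubert, F.",
]
groups = suggest_groups(names)
```

With these inputs the first three names fall into one suggested group, because punctuation and dates are stripped before hashing. ‘Schubert, F.’ gets no suggestion and would be left for manual grouping.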
UI of Prototype 2
Prototype 2 demo
Screencast 3:
http://www.youtube.com/watch?v=5f8iaryZMk0&hd=1
Daniel Alexander Smith
Linked Data
URI for everything
e.g. Beethoven is:
– http://musicnet.mspace.fm/person/367b107e07a7f9db8aed7c72d2ebeab2#id
– http://dbpedia.org/resource/Ludwig_van_Beethoven
– http://www.bbc.co.uk/music/artists/1f9df192-a621-4f54-8850-2c5373b7eac9#artist
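One way such equivalent URIs could be tied together is with one N-Triples statement per link. The sketch below assumes `owl:sameAs` as the linking predicate, which the slides do not confirm, and `same_as_triples` is a hypothetical helper.

```python
# Assumed linking predicate; the slides do not say which one MusicNet uses.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triples(subject: str, equivalents):
    """Build one N-Triples line per equivalent URI."""
    return [f"<{subject}> <{OWL_SAME_AS}> <{uri}> ." for uri in equivalents]


beethoven = "http://musicnet.mspace.fm/person/367b107e07a7f9db8aed7c72d2ebeab2#id"
links = same_as_triples(
    beethoven,
    [
        "http://dbpedia.org/resource/Ludwig_van_Beethoven",
        "http://www.bbc.co.uk/music/artists/1f9df192-a621-4f54-8850-2c5373b7eac9#artist",
    ],
)
for line in links:
    print(line)
```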
Contribution
MusicNet provides links between composers in multiple scholarly repositories
We also link to MusicBrainz and BBC /music
This can be fed back into projects like musicSpace where disambiguation is a problem
MusicNet Published Data
Links between multiple URIs
Representations from each source
Machine-readable and standardised, so that applications can be built over this data
Human searchable and usable too
http://musicspace.mspace.fm
Provenance
Retains source of information
e.g. that Grove say “Schubert, Franz (Peter)” and British Library say “Schubert, Franz” and “Schubert”
When they don’t exist already, MusicNet provides individual URIs for a composer from each source, e.g.:
– http://musicnet.mspace.fm/person/7ca5e11353f11c7d625d9aabb27a6174#blcollection
It then links back to search URLs, e.g.:
– http://catalogue.bl.uk/F/?func=find-b&request=Schubert%2C+Franz&find_code=WNA
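A back link of this shape can be built from the source's own form of the name. The sketch below copies the query parameters from the example URL above. The helper name `bl_search_url` is hypothetical, and whether MusicNet generates its links this way is an assumption.

```python
from urllib.parse import urlencode

# Base address taken from the example search URL on the slide.
BL_CATALOGUE = "http://catalogue.bl.uk/F/"


def bl_search_url(name: str) -> str:
    """Build a British Library catalogue search URL for a name string."""
    # urlencode percent-encodes the comma and turns the space into '+'.
    query = urlencode({"func": "find-b", "request": name, "find_code": "WNA"})
    return f"{BL_CATALOGUE}?{query}"


url = bl_search_url("Schubert, Franz")
print(url)
```

Searching on the name exactly as the source gives it, rather than a normalised form, fits the point made on the Provenance slide that each source's own wording is retained.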
Links from BBC /music
Harvested links from BBC to:
– DBpedia
– New York Times
– IMDb
– PBS
– etc.
Thank you for listening!