Jussi Nuorteva - Power of Open Data in Archives
Power of Open Data in Archives
OpenGLAMLinz
2nd May 2016
Jussi Nuorteva
Outline of the presentation
• National Archives Service of Finland – Brief Introduction
• Building up a Digital Society - Can analogue records be destroyed after digitization?
• Renewal of Research Process through Digital Research Data
• Some Development Projects
Archives Act of Finland 831/1994
4 § ”The duty of the National Archives Service is to ensure the preservation and availability of records belonging to the national cultural heritage, to promote research and to guide, develop and study archives and records administration.”
Helsinki – The National Archives
Hämeenlinna Provincial Archive
Joensuu Provincial Archive
Jyväskylä Provincial Archive
Mikkeli Provincial Archive
Oulu Provincial Archive
Turku Provincial Archive
Vaasa Provincial Archive
Sámi Archive, Inari (National Archives)
National Archives Service of Finland • Inari
Military Archives and Archives of the Prime Minister's Office transferred to the National Archives in 2008
Construction work has started at the site of the new Central Archive of the National Archives in Mikkeli: April 2016 (ready in 2018)
Hämeenlinna provincial archive (2009)
Founded in 1927
SAJOS
Sámi Archives (2012)
http://www.arkisto.fi/en/the-national-archives-service/saamelaisarkisto-3
”The acquisition of private documents aims at creating archives that offer an authentic, balanced and sufficient picture of various sectors of the Finnish society in different eras.”
Archives – a mirror of the society?
Closing party of the AvoinGLAM (OpenGLAM) at the National Archives
Hack4FI 2015
OpenGLAM
Holdings and services in brief
• 213 shelf kilometers of documents
• 10 % of the holdings are private archives (individuals, associations, enterprises)
• 43 million digital units (2 % of the holdings)
• 50 million annual downloads from the Digital Archive (2012)
• 12 000 distinct user IP addresses per month
• 50 000 on-site research visits annually
• 72 000 orders at the service desk
• 21 000 public and private surveys done for customers
Are we really living up to and using all the opportunities of the digital world?
• More than 200 km of analogue holdings still coming in!
• More than 15 million euros spent annually on rents of archival space in the governmental administration!
• Duplicated service systems – analogue and digital!
• Digitization will take decades at today's speed!
• Risk of marginalization of the analogue materials!
• Autonomous municipalities are inefficient in e-management
• How can we better use and share public data throughout the public administration?
Digitization of Public Services – Spearhead of the present Governmental Program
• Adopted in August 2015
• Public information should be collected only once!
• Extensive digitalization to be carried out
• Avoid duplication of processes in public administration
• Open access to public information with regard to the legislation and EU directives protecting the data from misuse (ownership, personal protection, security)
• Freedom of academic research is a constitutional right
Digitization and e-Services in the Strategy of the National Archives of Finland 2020
• Promotes electronic archiving in public administration, and participates actively in the development of storage solutions
• Enhances open-minded utilisation of modern technologies in the provision of services
• Analogue materials transferred for permanent storage will be digitised as part of the transfer process.
• The proposed amendment of the Archives Act will enable the destruction of analogue material without changing the legal probative force of the documents.
How to proceed in practice?
• Reform of the Archives Act
– National Archives to be turned into one legal entity 1.1.2017
– Right to destroy records after digitization without them losing their legal power (ca 90 %), taking into account their value as national cultural heritage. International survey.
– New law on Private Archives 1.1.2017
• Decisions of the State Council (proposal)
– Provider is responsible for transfer costs throughout administration – digitization is interpreted as transfer
– ICT-system for preservation of digitized and born digital records including their operational use - Open Data principle
• Building up a process for mass-digitization
Rocket science for archivists?
NEW SCIENCE
eScience
Grids and Networks
New skills
ePublishing
Infrastructures
Open Access
Preservation and Access
International Standards
Research Data
Renewal of Research Process
”The National Archives and the San Diego Supercomputer Center sign landmark agreement
… The partnering of NARA, SDSC and NSF comes at a time when the nation's scientists and engineers are seeking to increase U.S. competitiveness and leadership--and when the massive amounts of digital data being generated by researchers, educators and practitioners are demanding new and innovative strategies for digital preservation”
OECD 2004
”The rapid development in
computing technology and the Internet have opened up new applications for the basic sources of research — the base material of research data — which has given a major impetus to scientific work in recent years.
Databases are rapidly becoming an essential part of the infrastructure of the global science system.”
”Supporting research and innovation is a key priority of the Agenda” [Digital Agenda for Europe]
”To collect, curate, preserve and make available ever-increasing amounts of scientific data, new types of infrastructures will be needed”
Neelie Kroes, Vice-President of the European Commission, responsible for the Digital Agenda
AAAS 2011, Washington D.C.
• Not only for natural sciences!
• Data-sharing and interoperability
• Linking digital records to scholarly publishing
• Open source and open access (OAIS)
• Trusted digital repositories and IPR questions relating to data
• Licensed data and open access principle
Digital Agenda for Europe 2020
http://linkeddatabook.com/editions/1.0/
Freedom of Information!
Open Science!
Open Access!
Open Data!
Protection of Privacy!
Data anonymization!
Intellectual Property Rights!
Data protection!
Privacy and Research in Personal Data Act (523/1999) of Finland
• Personal Data Act – General prohibition to process sensitive data (state of health, handicap, illness, race or ethnicity, social welfare needs etc)
• BUT: Prohibition does not prevent processing of data for purposes of historical, scientific or statistical research or a health care unit or a health care professional from processing data collected in the course of their operations and relating to the state of health, illness or handicap of the data subject or the treatment or other measures directed at the data subject
http://www.arkisto.fi/en/records
Church Records as Linked Data
Nihattula – Torkkoila. I Aa:1, Rippikirja (communion book) 1732-1739. Hattulan seurakunnan arkisto (Hattula parish archive), HMA.
https://www.finna.fi/?lng=en-gb
http://kronos.narc.fi/frontpage.html
EUROPEANA HERALDICA
HERALDIC DATABASE FOR EUROPE
http://extranet.narc.fi/heraldica/
http://www.sa.dk/media%284667,1030%29/Heraldikrapport.pdf
Epitaph in Tenala Church
Arvid Stålarm and Elin Fleming
Grabbe
Hand
Nils Boije 1564 Måns Sild
Glass Paintings with Heraldry at Tenala Church
Coat of Arms on the Pulpit
Gyllenlood and Jägerhorn families
Löwenhult Coat of Arms
Johan Adolph Löwenhult 1754
Commander's Cross of the Order of the Sword of Sweden
Evert Berent von Jordan
Hans Myyr (Muir)
Johan Göös Coat of Arms with Ancestral Coats of Arms
Johan Göös 1697
Birckholtz Fincke Frille
Göös Wildeman Ållongren
Recognition and Enrichment of Archival Documents (READ)
• Funded by the European Commission for 3.5 years
• 13 partners from across Europe – computer scientists, archivists and scholars
• Handwritten text recognition (HTR)
• Allows users to search and automatically transcribe digitised historical material
Transkribus
• A downloadable platform for the automated recognition, transcription and searching of historical documents.
In Transkribus users can…
• Upload their own documents - keep them private or share them with other users
• Transcribe text for the training of HTR or in order to make digital editions
• Transcriptions can be enhanced with tags
• Documents can be exported in several formats – PAGE XML, TEI, PDF
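To illustrate why an export format such as PAGE XML matters for search, here is a minimal Python sketch that pulls line-level text out of a PAGE XML document and runs a naive keyword lookup over it. The sample document is invented for this example, the element names follow the public PAGE schema (TextLine, TextEquiv, Unicode), and the helper names `text_lines` and `search` are hypothetical. It is not a description of how Transkribus itself indexes or searches documents.

```python
import xml.etree.ElementTree as ET

# Invented sample in the PAGE XML layout; real exports carry coordinates,
# metadata and reading order that are omitted here.
SAMPLE_PAGE_XML = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r1">
      <TextLine id="l1"><TextEquiv><Unicode>first example line</Unicode></TextEquiv></TextLine>
      <TextLine id="l2"><TextEquiv><Unicode>second line with a Keyword</Unicode></TextEquiv></TextLine>
    </TextRegion>
  </Page>
</PcGts>
""".strip()


def local_name(tag: str) -> str:
    """Strip the XML namespace so the code works across PAGE schema versions."""
    return tag.rsplit("}", 1)[-1]


def text_lines(page_xml: str) -> list[tuple[str, str]]:
    """Return (line id, transcribed text) pairs for every TextLine element."""
    root = ET.fromstring(page_xml)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        for equiv in elem:
            if local_name(equiv.tag) != "TextEquiv":
                continue
            for uni in equiv:
                if local_name(uni.tag) == "Unicode":
                    lines.append((elem.get("id", ""), uni.text or ""))
    return lines


def search(lines: list[tuple[str, str]], query: str) -> list[str]:
    """Case-insensitive substring search; returns the ids of matching lines."""
    needle = query.casefold()
    return [line_id for line_id, text in lines if needle in text.casefold()]


lines = text_lines(SAMPLE_PAGE_XML)
hits = search(lines, "keyword")
print(lines)
print(hits)
```

Matching on local element names rather than a fixed namespace is a deliberate simplification, since the namespace URI changes between schema versions.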
HTR
Keyword spotting
READ will revolutionize access to archival collections…
• By offering the possibility of full-text search for handwritten documents (due to HTR)
• By providing tools that will make it easier to index and analyze digitized collections (even automatically)
• By giving a platform where research groups across the world can work together with common data
• By enabling new kinds of research use of extensive handwritten materials
Diplomatarium Fennicum
• Research database of medieval documents concerning Finland
• Consists of 6,700 documents from the 11th to the 16th century
• Based on Finlands Medeltids Urkunder (1910-1935) by Reinhold Hausen
• Beta release in October 2016
DF Database contains
• All existing editions of the documents
• Enriched metadata
• Images of original charters
• Bibliography
• Connection to other medieval databases
• Advanced search mode; all the different editions will be searchable
• Related documents are linked together
• New critical editions that meet demands of linguistics (gradually)
• Possibility for the use of crowdsourcing
Administrative registers as a source for scientific research
• Official data are often in the format of electronic records in Finland
• Registers contain different kinds of individual level data which are needed for administrative purposes (e.g. health services, social services, education, taxation)
• All registers use the same personal identity code (PID) for each Finnish citizen
• Legislation allows data from different registers to be linked by PID for research purposes which makes register data a rich source for scientific research
• The confidentiality of the personal data is a special challenge for access services
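The linking step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used by any Finnish authority: the register names, fields and identifiers are made up, and replacing the PID with a study-specific keyed hash is one common pseudonymization approach assumed here to show how linked data could be handed to researchers without exposing the identity code.

```python
import hashlib
import hmac


def pseudonym(pid: str, study_key: bytes) -> str:
    """Replace a PID with a study-specific keyed hash (assumed approach)."""
    return hmac.new(study_key, pid.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def link_registers(registers: dict[str, list[dict]], study_key: bytes) -> dict[str, dict]:
    """Merge rows from several registers that share the same PID.

    The result is keyed by pseudonym, and the PID itself is dropped.
    """
    linked: dict[str, dict] = {}
    for register_name, rows in registers.items():
        for row in rows:
            record = linked.setdefault(pseudonym(row["pid"], study_key), {})
            for field_name, value in row.items():
                if field_name != "pid":
                    record[f"{register_name}.{field_name}"] = value
    return linked


# Made-up registers and identifiers, for illustration only.
registers = {
    "education": [
        {"pid": "PID-0001", "degree": "MSc"},
        {"pid": "PID-0002", "degree": "BA"},
    ],
    "taxation": [
        {"pid": "PID-0001", "income_band": "C"},
    ],
}

linked = link_registers(registers, study_key=b"example-study-key")
for record in linked.values():
    print(record)
```

Because every register uses the same identifier, the join needs no fuzzy matching on names or birth dates, which is what makes register data unusually easy to combine.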
Aims of the FMAS
The Finnish Microdata Research Services (FMAS) are being developed in order to
1) inform and guide researchers about the possibilities of register research
2) help researchers find available data
3) provide an electronic permit system through which researchers can apply for permits with one application per study and
4) make available a remote desktop service where researchers can securely obtain and analyze unit-level data
Taken together, the new services will support the entire process of register-based research.
Construction project
• In 2013, FMAS was accepted into Finland’s roadmap for research infrastructure.
• The construction project is hosted by the National Archives and Statistics Finland
• Funded so far by the Academy of Finland (2014), the Social Insurance Institution (2015-16), and the host organizations
• More funding has been and will be applied for
• The services will be available in 2017 at the earliest
Joint permit service
• The National Archives is constructing the FMAS joint permit service
• The service will be a centralized digital service through which researchers can apply for permits to use register data
• Only one application per study will be needed; the researcher can apply for a permit to use data from several different organizations simultaneously with the same interactive digital application form
• For the authorities the service would offer the platform to receive and handle the applications and to store the decisions which are official administrative documents
• The access to the system and documents will be controlled by strong authentication methods
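The "one application, several organizations" idea can be pictured as a small data model. The Python sketch below is purely hypothetical: the class and field names, the decision states and the organization labels are assumptions made for illustration and do not describe the actual FMAS system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    PENDING = "pending"
    GRANTED = "granted"
    REFUSED = "refused"


@dataclass
class PermitApplication:
    """One application per study, addressed to several register holders."""
    study_title: str
    applicant: str
    decisions: dict[str, Decision] = field(default_factory=dict)

    def request(self, organization: str) -> None:
        """Add an organization whose data the study needs."""
        self.decisions.setdefault(organization, Decision.PENDING)

    def decide(self, organization: str, decision: Decision) -> None:
        """Store an organization's decision alongside the application."""
        if organization not in self.decisions:
            raise KeyError(f"{organization} was not asked for data in this application")
        self.decisions[organization] = decision

    def fully_granted(self) -> bool:
        """True only when every organization asked has granted a permit."""
        return bool(self.decisions) and all(
            d is Decision.GRANTED for d in self.decisions.values()
        )


# Hypothetical study and register holders.
application = PermitApplication(study_title="Example study", applicant="Example researcher")
application.request("Register holder A")
application.request("Register holder B")
before = application.fully_granted()

application.decide("Register holder A", Decision.GRANTED)
application.decide("Register holder B", Decision.GRANTED)
after = application.fully_granted()
print(before, after)
```

Keeping each organization's decision on the same record reflects the point that the decisions are official administrative documents stored with the application.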
New organizations
New partnerships
New approaches
New possibilities
Think Globally!
Thank you!