Preserving email

The PeDALS approach

Pete Watters, Arizona State Library, project coordinator, pwatters@lib.az.us

Richard Pearce-Moses, Clayton State University, Georgia, principal investigator, rpearcemoses@clayton.edu

Brian Schnackel, Arizona State Library, lead developer, bschnackel@lib.az.us

PeDALS strives for OAIS compliance

Archivists focus on process, not individual records

Business rules…
• generate normalized metadata
• transform SIPs into standardized AIPs
• create DIPs for each record

Suited to the PeDALS methodology
• Born digital
• Potential for historical value
• Message transmission information provides a rich source of metadata

All partners had Outlook PST files

Atomize individual messages
• To store as individual AIPs
• To disseminate as browser-friendly DIPs

Create a database of rich metadata
• From the process: to support administration
• From the email headers: to support discovery
• From BagIt, New Zealand Metadata Extractor, other sources: to support preservation
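As an illustration of the header-derived metadata, the following is a minimal sketch and not the PeDALS code. It assumes a message has already been exported from the PST as RFC 5322-style text; the sample message, the function name `header_metadata`, and the dictionary keys are all hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical sample message; a real one would come from the PST export step.
SAMPLE = """\
From: "Napolitano, Janet" <governor@az.gov>
To: Staff One <staff1@example.org>, staff2@example.org
Subject: Budget meeting
Date: Mon, 01 Mar 2010 09:30:00 -0700
Message-ID: <abc123@example.org>

See you at ten.
"""


def header_metadata(raw: str) -> dict:
    """Collect discovery metadata from the headers of one message."""
    msg = message_from_string(raw)
    date = msg.get("Date")
    return {
        "message_id": msg.get("Message-ID"),
        # getaddresses returns (display name, address) pairs
        "sender": getaddresses(msg.get_all("From", [])),
        "recipients": getaddresses(
            msg.get_all("To", []) + msg.get_all("Cc", [])
        ),
        # A blank subject is stored as None rather than an empty title
        "subject": (msg.get("Subject") or "").strip() or None,
        "sent": parsedate_to_datetime(date).isoformat() if date else None,
    }


if __name__ == "__main__":
    for key, value in header_metadata(SAMPLE).items():
        print(f"{key}: {value}")
```

Keeping the display name and the address as separate values leaves room for later cleanup inside the database.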

PeDALS is intended for permanent records

PeDALS is not a records management system

Deleting files is difficult at best

When negotiating with the originating office, archivists encourage weeding PSTs of non-permanent records

Archivists work with rules rather than records – they don’t have time to weed the collections

If you give us junk, we’ll archive junk.

PSTs plucked from hard drives can work, but are more likely to generate errors during processing.

Metadata taken from headers was surprisingly messy

One response is to learn to cope with a complete lack of authority control

Or possibly correct by “data wrangling” from within the database

Senders and recipients can be an email address or a display name from one or more contact lists:
“Janet Napolitano” or “governor@az.gov” or “jnapolitano@az.gov” or “Napolitano, Janet” or “Janet” or “J Napolitano”?
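One possible form of "data wrangling" is sketched below, as an assumption rather than the approach PeDALS took. The helper names `normalize_name` and `group_variants` are hypothetical. The rules only fold case, whitespace, quoting, and "Last, First" ordering; they cannot decide that "Janet" or "governor@az.gov" refers to the same person, which still needs a human-maintained authority list.

```python
import re
from collections import defaultdict


def normalize_name(raw: str) -> str:
    """Return a lowercase grouping key for one sender or recipient value."""
    name = raw.strip().strip('"').strip()
    if "@" in name:
        # Email addresses are only lowercased, never rewritten
        return name.lower()
    if "," in name:
        # "Last, First" becomes "First Last"
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    # Collapse runs of whitespace
    return re.sub(r"\s+", " ", name).lower()


def group_variants(values):
    """Group raw header values that share the same normalized key."""
    groups = defaultdict(list)
    for value in values:
        groups[normalize_name(value)].append(value)
    return dict(groups)


if __name__ == "__main__":
    seen = ["Janet Napolitano", "Napolitano, Janet ", "governor@az.gov",
            "Janet", "J Napolitano"]
    for key, variants in group_variants(seen).items():
        print(key, "<-", variants)
```

In this sample only the first two values collapse into one group; the remaining three stay separate until someone maps them by hand.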

Subject line not reliable source for titles or abstracts – often blank, repetitive, or a remnant from an unrelated message

Email (and other records) may be open to the public by statute, but some content may be sensitive
• Personally identifying information
• Private information (intimate, of no public interest)

Repositories must develop procedures and policies for aggregates that may have some records with sensitive information

Boucher/Stearns draft legislation for online privacy would require “notice to and consent of an individual prior to the collection and disclosure of certain personal information” such as street and email addresses, phone numbers, aliases, and other common information.

Excludes government agencies, but may include academic libraries.

Possible chilling effect on archives: Keeping such information confidential would effectively block access to email and many other records

PST file structure was proprietary

Considered third-party Outlook plug-ins
• Smithsonian Institution had done research
http://siarchives.si.edu/cerp/RAC_SIA_CERP_tools_V2_CC.pdf

Adopted open-source PST export utility
• No longer supported
• Written in Visual Basic
http://www.genusa.com/utils/pmseu.htm

Could generate human-readable XML of email messages

Was based on code open to public

Did not require understanding of PST structure

It’s more than just email

What to do with tasks, calendar items, contacts?

Need to give the archivist the ability to decide what to keep

What about viruses, corrupt attachments?

What is the record? What are we authenticating?

PST as database; messages are constructs of fields in tables tied together by keys and other tables

XML is best way to preserve these relations and dependencies
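To make the point about relations concrete, here is a small sketch of one message serialized as XML, with recipients and attachments nested under the message they belong to. The element and attribute names are hypothetical and are not the PeDALS schema or the output format of any export utility.

```python
import xml.etree.ElementTree as ET


def message_to_xml(fields: dict, attachments: list) -> str:
    """Serialize one message, keeping its related items nested inside it."""
    msg = ET.Element("message", id=fields["message_id"])
    for key in ("sender", "subject", "sent"):
        ET.SubElement(msg, key).text = fields.get(key, "")
    recipients = ET.SubElement(msg, "recipients")
    for address in fields.get("recipients", []):
        ET.SubElement(recipients, "recipient").text = address
    files = ET.SubElement(msg, "attachments")
    for name in attachments:
        ET.SubElement(files, "attachment", filename=name)
    return ET.tostring(msg, encoding="unicode")


if __name__ == "__main__":
    example = {
        "message_id": "abc123@example.org",
        "sender": "governor@az.gov",
        "subject": "Budget meeting",
        "sent": "2010-03-01T09:30:00-07:00",
        "recipients": ["staff1@example.org", "staff2@example.org"],
    }
    print(message_to_xml(example, ["agenda.doc"]))
```

The nesting carries the same information as the keys that tie rows together in the PST, but in a form that stays readable without the original software.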

Did not use the full record

Had almost no way to handle errors

Tended to break when dealing with large PST files that had not been curated

Required a copy of Outlook

Ran very slowly

In late February, Microsoft released the PST specification

http://msdn.microsoft.com/en-us/library/ff385210.aspx

203 pages of techspeak with some errors and inaccuracies

Based on the spec, we’ve been developing a file-based tool that doesn’t require Outlook.

Generates XML from the entire PST file

Much improved exception handling

Does not require Outlook

Runs much more quickly

File-based processor was slow to develop because of some errors in Microsoft’s documentation.

Test on as many PST samples as possible. Don’t rely on small curated samples.

Discovered differences between Unicode PST files and earlier ANSI-encoded files.
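A file-based tool can tell the two formats apart from the file header before parsing anything else. The sketch below reflects our reading of Microsoft's published PST specification: the file starts with the four bytes "!BDN", and a two-byte little-endian version field (wVer) at offset 10 holds 14 or 15 for ANSI files and 23 or higher for Unicode files. Treat the offsets and values as assumptions to verify against the current specification; the function name `pst_format` is hypothetical.

```python
import struct

PST_MAGIC = b"!BDN"   # dwMagic, as given in the published specification
WVER_OFFSET = 10      # assumed offset of the 2-byte wVer field


def pst_format(header: bytes) -> str:
    """Classify a PST header as 'ansi', 'unicode', or 'unknown'."""
    if len(header) < WVER_OFFSET + 2 or header[:4] != PST_MAGIC:
        raise ValueError("not a PST file (bad magic or truncated header)")
    (w_ver,) = struct.unpack_from("<H", header, WVER_OFFSET)
    if w_ver in (14, 15):
        return "ansi"
    if w_ver >= 23:
        return "unicode"
    return "unknown"


def pst_format_of_file(path: str) -> str:
    """Read only the first bytes of a file and classify it."""
    with open(path, "rb") as handle:
        return pst_format(handle.read(WVER_OFFSET + 2))
```

Running a check like this across every incoming sample is a cheap way to confirm that a test set covers both formats.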

PSTs are not an automatic occurrence in Outlook 2010

But they can be generated manually and can remain part of a scheduled retention routine
