RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR
![Page 1: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/1.jpg)
treating data like data: unifying data processing workflows for datasets in the IR
Steve Van Tuyl - @badgerbouse
Data and Digital Repository Librarian,
Oregon State University
#WorstTalkEver
![Page 2: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/2.jpg)
• introduction
• the setup
• phase 1: new definitions
• phase 2: what to expect
• lessons learned
![Page 3: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/3.jpg)
![Page 4: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/4.jpg)
![Page 5: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/5.jpg)
![Page 6: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/6.jpg)
![Page 7: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/7.jpg)
![Page 8: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/8.jpg)
1.9 GB
![Page 9: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/9.jpg)
![Page 10: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/10.jpg)
data = data
phase 1: new definitions
![Page 11: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/11.jpg)
![Page 12: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/12.jpg)
“I was a little daunted by the documenting mentioned in the last e-mail, as I am starting a new PhD program, and have lots of responsibilities there.”
- The Perpetrator
![Page 13: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/13.jpg)
iterate
![Page 14: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/14.jpg)
At least for the next couple of years, until the RDM community has made such an impact that incoming graduate students know how to manage data from the start
![Page 15: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/15.jpg)
LOL, JK
![Page 16: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/16.jpg)
iterate
encourage
tattle
![Page 17: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/17.jpg)
phase 2: what to expect
![Page 18: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/18.jpg)
![Page 19: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/19.jpg)
93 theses & dissertations
![Page 20: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/20.jpg)
45% excel
![Page 21: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/21.jpg)
22% images
![Page 22: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/22.jpg)
25% documents
![Page 23: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/23.jpg)
25% other “data”
text
database
statistical
![Page 24: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/24.jpg)
23% code
15% executables
![Page 25: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/25.jpg)
12% “metadata”
![Page 26: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/26.jpg)
33% of excel files have:
linked info
charts
macros
![Page 27: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/27.jpg)
30% unknown
unopenable
obsolete
![Page 28: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/28.jpg)
3% missing data
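
A survey like the one summarized on these slides could be tallied with a short script. The sketch below is only an illustration: the slides do not say how the files were classified, so the extension-to-category map, the function names, and the input layout are all assumptions. It reports the share of deposits that contain at least one file of each kind, which is one reading of why the percentages above add up to more than 100%.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical extension-to-category map; the talk does not describe
# how files were actually classified.
CATEGORIES = {
    ".xls": "excel", ".xlsx": "excel",
    ".jpg": "images", ".png": "images", ".tif": "images",
    ".pdf": "documents", ".doc": "documents", ".docx": "documents",
    ".txt": "text", ".csv": "text",
    ".py": "code", ".r": "code", ".m": "code",
    ".exe": "executables",
}


def categorize(filename):
    """Return the category for a file name, or 'unknown' if unmapped."""
    return CATEGORIES.get(PurePath(filename).suffix.lower(), "unknown")


def share_by_category(deposits):
    """Percentage of deposits holding at least one file of each category.

    `deposits` maps a deposit identifier to a list of file names.
    One deposit can hold several kinds of file, so the shares may
    sum to more than 100.
    """
    total = len(deposits)
    if total == 0:
        return {}
    counts = Counter()
    for files in deposits.values():
        # A set counts each category once per deposit.
        counts.update({categorize(name) for name in files})
    return {cat: round(100 * n / total) for cat, n in counts.items()}
```

Files whose extension is not in the map fall into "unknown", which makes that bucket a rough proxy for formats that need a closer look by hand.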
![Page 29: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/29.jpg)
?
![Page 30: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/30.jpg)
definitions
![Page 31: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/31.jpg)
“ScholarsArchive@OSU promises to ensure that the following common file formats (among many others) are useable in the future, using whatever combination of techniques (such as migration, emulation, etc.) is appropriate given the context of need”
- someone 10 years ago
promises, promises
![Page 32: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/32.jpg)
nothing ever changes
![Page 33: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/33.jpg)
new baseline
![Page 34: RDAP 15: Treating data like data: Unifying data processing workflows for datasets in the IR](https://reader030.fdocuments.us/reader030/viewer/2022032421/55a784701a28abbe7a8b45db/html5/thumbnails/34.jpg)