Documentation Unstructured
Transcript of Documentation Unstructured
-
8/18/2019 Documentation Unstructured
Data Management:
Documentation & Metadata
Types of Documentation
Data Life Cycle
[Figure: data life cycle diagram. Stage labels: Proposal Planning and Writing, Project Start Up, Data Collection, Data Analysis, Data Sharing, End of Project, Data Archive, Deposit, Data Discovery, Re-Use, Re-Purpose]
-
Data Documentation (Metadata)
• Informal or formal methods to describe your data
• Important if you want to reuse your own data in the future
• Also necessary when sharing your data
-
You're already documenting your data
• Notebook
– Paper
– Digital
– Lab
• Folders with notes, text files
• Sources, experiments or surveys, procedures, etc.
-
Documentation in Research
Project Documentation
• Context of data collection
• Data collection methods
• Structure, organization of data files
• Data sources used
• Data validation, quality assurance
• Transformations of data from the raw data through analysis
• Information on confidentiality, access and use conditions
Dataset Documentation
• Variable names and descriptions
• Explanation of codes and schemas used
• Algorithms used to transform data
• File format and software (including version) used
-
Types of Documentation
Documentation for understanding & re-use
• Readme File
• Data Dictionary
• Codebook
-
ReadMe
• Describes the core documentation about an investigation and its data files
• Typically a simple text file
• Can describe the individual file(s) and/or data package as a whole
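As an illustration of the points above, here is a minimal sketch of a script that assembles a plain-text ReadMe for a data package. The field names (title, investigator), the file names, and the `build_readme` helper are all hypothetical; a real ReadMe would follow whatever template your repository or discipline recommends.

```python
def build_readme(title, investigator, files):
    """Return plain-text ReadMe content for a data package.

    `files` maps each file name to a one-line description.
    All fields here are illustrative assumptions, not a standard.
    """
    lines = [
        f"Title: {title}",
        f"Investigator: {investigator}",
        "",
        "Files in this package:",
    ]
    # One line per file, so the ReadMe covers individual files
    # as well as the package as a whole.
    for name, description in sorted(files.items()):
        lines.append(f"  {name}: {description}")
    return "\n".join(lines) + "\n"


readme_text = build_readme(
    title="Stream temperature survey (hypothetical)",
    investigator="A. Researcher",
    files={
        "temps.csv": "Hourly water temperature readings",
        "sites.csv": "Site names and coordinates",
    },
)
print(readme_text)
```

Saving `readme_text` as `README.txt` next to the data files keeps the description and the data together.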
-
ReadMe Example - Dataset
-
Data Dictionary
• Provides definitions of the data fields in a data file
• More details on the variables, observations of a file
• Used to understand the data and the databases that contain it
• Identifies data elements and their attributes including names, definitions and units of measure and other information
• Often they are organized as a table
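A data dictionary organized as a table can be sketched in a few lines. The example below is an assumption-laden illustration: the field names, definitions, units, and the two helper functions are hypothetical, chosen only to show one entry per data field and a simple check for undocumented columns.

```python
import csv
import io

# Hypothetical data dictionary: one entry per field in the data file.
DATA_DICTIONARY = [
    {"name": "site_id", "definition": "Identifier of the sampling site",
     "units": "", "type": "str"},
    {"name": "temp_c", "definition": "Water temperature",
     "units": "degrees Celsius", "type": "float"},
    {"name": "depth_m", "definition": "Depth of the sensor",
     "units": "metres", "type": "float"},
]


def dictionary_as_table(entries):
    """Render the dictionary as CSV text (a header row plus one row per field)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["name", "definition", "units", "type"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def undocumented_fields(header, entries):
    """Return the columns of a data file that the dictionary does not define."""
    documented = {entry["name"] for entry in entries}
    return [field for field in header if field not in documented]


table = dictionary_as_table(DATA_DICTIONARY)
missing = undocumented_fields(["site_id", "temp_c", "salinity"], DATA_DICTIONARY)
print(table)
print(missing)
```

Checking a data file's header against the dictionary is one way to notice when a variable has been added without being described.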
-
Data Dictionary Example
-
What is a Codebook?
• Typical in social sciences research
• Includes elements similar to readme and dictionary
– Project level information (e.g. survey design and methodology)
– Response codes for each variable
– Codes used to indicate nonresponse and missing data
http://www.icpsr.umich.edu/icpsrweb/ICPSR/support/faqs/2006/0/what-is-codebook
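To make the idea of response codes and missing-data codes concrete, here is a minimal sketch of a codebook entry for a single survey variable and a function that decodes raw values with it. The variable name, question wording, and code values (including -8 and -9 for missing data) are hypothetical, not taken from any real survey.

```python
# Hypothetical codebook entry for one survey variable.
CODEBOOK = {
    "q1_satisfaction": {
        "question": "How satisfied are you with the service?",
        "codes": {
            1: "Very dissatisfied",
            2: "Dissatisfied",
            3: "Neutral",
            4: "Satisfied",
            5: "Very satisfied",
        },
        # Codes that indicate nonresponse or missing data.
        "missing": {-8: "Don't know", -9: "No response"},
    }
}


def decode(variable, value):
    """Translate a raw code into its label; missing-data codes become None."""
    entry = CODEBOOK[variable]
    if value in entry["missing"]:
        return None
    return entry["codes"][value]


raw = [5, -9, 3, -8, 4]
decoded = [decode("q1_satisfaction", value) for value in raw]
print(decoded)
```

Without the codebook, a value such as -9 could easily be mistaken for a real measurement, which is why missing-data codes are documented alongside the response codes.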
-
What is a Codebook?
• Additionally, codebooks may also contain:
– A copy of the survey questionnaire (if applicable)
– Exact questions and skip patterns used in a survey
– Frequencies of response
• Quite long
http://www.icpsr.umich.edu/icpsrweb/ICPSR/support/faqs/2006/0/what-is-codebook
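Frequencies of response are simple to compute once the response codes are known. The sketch below assumes a hypothetical list of coded answers in which -9 marks nonresponse; it tallies every code and reports how many valid answers remain.

```python
from collections import Counter

MISSING_CODES = {-9}  # hypothetical nonresponse code
responses = [1, 2, 2, 3, -9, 2, 1, -9]  # hypothetical coded answers

# Count how often each code occurs, including the missing-data code.
frequencies = Counter(responses)

# Number of answers that are not flagged as missing.
valid_total = sum(
    count for code, count in frequencies.items() if code not in MISSING_CODES
)

for code in sorted(frequencies):
    print(code, frequencies[code])
print("valid responses:", valid_total)
```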
-
Other Examples of Data Documentation
• Lab notebooks
• Software syntax
• Programming code
• Instrument settings and/or calibration
• Provenance of sources of data
• Embedded metadata (e.g. EXIF, FITS)