What’s Up With ReCAP?
Updates and Data Analysis
Zack Lane, ReCAP Coordinator
January 24, 2012
ReCAP, Columbia University
Zack’s Ulterior Motives
Excuse to share
Find forum to share, examine, discuss and publish
Share what I do and how I do it
Review of ReCAP operations
◦ Physical plant and new modules
◦ Mellon Project
◦ Websites
Data
◦ Part 1: Why (bother) and how
◦ Part 2: Boring data everybody’s seen followed by amazing breakthrough
◦ Part 3: New leads from Department presentations
◦ Part 4: Trailblazing
Feedback!
Outline
9.4 million books: CUL 3.9, NYPL 3.5, PUL 2.0
5 modules complete
2 planned for construction: Modules 8 & 9
CUL manages transfers with quotas
Tours conducted once or twice every year
ReCAP: Physical Plant
Columbia will occupy 5 aisles of new modules
NYPL plans to transfer 5.3 million books
Quotas will expand to 250,000 accessions per year (FY15)
ReCAP: Physical Plant
2 new modules, 12 aisles each, same length as aisle 5
Groundbreaking in spring 2012; completion in summer 2013
Wow
Mellon Project
Mellon Foundation grant to run for 12 months, April 2012 start
“From Discovery to Delivery”
Making all general collections at ReCAP available to partners
Impetus: NYPL’s 5.3 million transfers
Issues:
◦ Collection management
◦ Collection development
◦ Technological change
◦ Governance
◦ Policy
◦ Cost-Sharing
◦ Discovery/Delivery
Mellon Project
Lizanne Payne, Planning Consultant
“Library Storage Facilities and the Future of Print Collections in North America,” 2007
Process:
◦ Strategic planning
◦ Holdings analysis
◦ Policy changes
◦ Cost analysis
◦ Cost-sharing
◦ Workflow and tech requirements
◦ Recommendations
New ReCAP Website!!
New ReCAP Website!
All data sets are publicly accessible
All ReCAP sites are publicly accessible
Squeaky wheels get the grease
New ReCAP Website
Data…always data
What have we been looking at?
What else is there to look at?
What can it tell us?
What do we think it tells us?
How do we manipulate it?
Circulation Data
Publicly available, all charges in Voyager, both on campus and offsite
Summer 2010 Internship: “Understanding the Effects of Policy Changes by Evaluating Circulation Activity Data at Columbia University Libraries” Serials Librarian, v.64:no.1, 2012.
Policy changes and user behavior
CHARGES: Offsite vs. On Campus by Fiscal Year
Fiscal Year    Offsite    On Campus
FY03/04         13,326      557,501
FY04/05         14,951      552,788
FY05/06         17,615      509,556
FY06/07         21,383      483,905
FY07/08         25,083      449,135
FY08/09         28,029      441,188
FY09/10         27,524      437,064
FY10/11         28,735      415,987
FY11/12         14,228      207,069
Zack’s Crystal Ball (extrapolation)
Drop in total on campus charges: 399,208
Rise in total offsite charges: 29,282
◦ (July-Dec = 51.87% of on campus, 48.59% of offsite)
User behavior: Faculty request offsite collections more and renew in greater proportion
Adaptation to technology takes time
Caveat: Causation is hard to prove
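The extrapolation appears to scale the half-year FY11/12 counts (207,069 on campus and 14,228 offsite) by the share of a year's charges that falls in July-December. A minimal Python sketch under that assumption follows; the function name `project_full_year` is illustrative and does not come from the slides.

```python
def project_full_year(partial_count: int, share_of_year: float) -> int:
    """Scale a partial-year count up to a full-year estimate."""
    if not 0 < share_of_year <= 1:
        raise ValueError("share_of_year must be in (0, 1]")
    return round(partial_count / share_of_year)

# FY11/12 charges recorded so far, and the July-Dec shares quoted on the slide
on_campus_projection = project_full_year(207_069, 0.5187)
offsite_projection = project_full_year(14_228, 0.4859)

print(on_campus_projection)  # 399208
print(offsite_projection)    # 29282
```

Both results match the figures on the slide, which supports (but does not prove) this reading of how the projection was made.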
[Chart: Circulation of ON CAMPUS Collections: Charges vs Renewals, FY03/04 to FY11/12. Series: On Campus sum of charges, On Campus sum of renewals]
[Chart: Circulation of OFFSITE Collections: Charges vs Renewals, FY03/04 to FY11/12. Series: Offsite sum of charges, Offsite sum of renewals]
[Chart: RECALLS: OPAC vs Everywhere Else, FY03/04 to FY10/11. Series: OPAC, Everywhere Else]
To Analyze Data is to Suffer
Translating and defining
Quantity of data
Software/hardware limitations
Automation
Time
File types:
◦ .txt
◦ .xlsx
◦ .accdb
◦ .ppt and .pptx
Dynamic links
Changeable data sets
Delicate balance
Data Management
Circulation
Request Timing
TIME_SUM Calculation in Access
Critical F(x)s
PivotTable/PivotChart
vlookup
Data: The Usual Suspects
General categories
Illustrate history of facility
Lacks detail and nuance
Provides big metrics we like
Low hanging knowables
How can the data be mined?
◦ Key: Accurate calculation of request rate
System-wide Accessions by Fiscal Year
Fiscal Year    Accessions
FY01/02           455,734
FY02/03           712,183
FY03/04           542,056
FY04/05           302,364
FY05/06           267,327
FY06/07           315,345
FY07/08           256,629
FY08/09           308,145
FY09/10           209,630
FY10/11           183,324
FY11/12            83,479
Phases marked on chart: Load In, Middle Phase, Restrained
System-wide Requests by Fiscal Year
Fiscal Year    Requests
FY01/02           2,147
FY02/03          14,520
FY03/04          26,564
FY04/05          35,757
FY05/06          42,866
FY06/07          50,009
FY07/08          59,755
FY08/09          69,060
FY09/10          71,119
FY10/11          71,582
FY11/12          34,536
Phases marked on chart: Continuous Increase, Level
[Chart: System-wide Requests by Hour, hours 0 to 23]
[Chart: System-wide Requests by Month, January to December]
[Chart: System-wide Request by Calendar Day, 01.01 to 12.26]
Yawn
Boring
“Zack, we know, we know. Show us something new.”
Ok
Request rate calculated by “TIME_SUM”
Item-by-item analysis
Allows focus on facets of bibliographic data
Begin by looking again at what we’ve seen before…
TIME_SUM Calculation in Access
Request Rate
Primary metric of collection use
Percent of collection requested per year
Target: 2.00%
Target set to estimate staffing and costs
2.06% as of January 2012
Does not include some staff, Google Book Project, Law, and other small outliers
Grotesque distinction between request and retrieval
Google Project is the Woodstock of ReCAP
What is the CUL Request Rate?
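The slides do not spell out the TIME_SUM formula. One plausible reading, sketched here in Python as an assumption rather than a description of the actual Access query, is that TIME_SUM is the total time items have spent on the shelf (in item-years) and that the request rate is total requests divided by that sum. The `Item` type, its fields, and the sample dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Item:
    accessioned: date  # date the item arrived at the facility (hypothetical field)
    requests: int      # number of requests since accession (hypothetical field)

def request_rate(items, as_of: date) -> float:
    """Requests per item-year: the share of the collection requested per year."""
    # Assumed meaning of TIME_SUM: years on shelf, summed over all items
    time_sum = sum((as_of - it.accessioned).days / 365.25 for it in items)
    total_requests = sum(it.requests for it in items)
    return total_requests / time_sum if time_sum else 0.0

# Three made-up items, one of which was requested once
sample = [
    Item(date(2002, 1, 1), 1),
    Item(date(2007, 1, 1), 0),
    Item(date(2010, 1, 1), 0),
]
rate = request_rate(sample, as_of=date(2012, 1, 1))
print(f"{rate:.2%}")
```

Under this reading, a rate at the 2.00% target corresponds to about one request for every 50 item-years on the shelf.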
[Chart: Request Rate of Monographs by Publication Date: 1991-2011]
Request Rate by Publication Date
Calculated for monographs with known publication dates between 1850 and 2010
Volume of pre-1850 holdings is small in comparison and considered outliers
1850-1990: 1.74%
1991-2011: 6.73%
Increased overall rate attributed to new acquisitions sent directly to ReCAP
[Chart: Request Rate of Serials by Publication Date: 1991-2011]
[Chart: Request Rate of Serials by Publication Date: 1850-2011]
Breakthrough
Analyze any point in time, not just summary
Calculate changing rates over time
Faceted, item-by-item analysis
Combination of TIME_SUM and new formula
“Can the data get more specific?” Yes
Refined version of old news…
Request Rate by Fiscal Year
Fiscal Year    Request Rate
FY01/02               1.65%
FY02/03               1.71%
FY03/04               1.79%
FY04/05               1.84%
FY05/06               1.89%
FY06/07               1.93%
FY07/08               1.99%
FY08/09               2.06%
FY09/10               2.08%
FY10/11               2.08%
FY11/12               2.06%
[Chart: Request Rate by Format, FY01/02 to FY11/12. Series: Monograph, Serial, Everything Else]
Breakthrough
Remember chart of request rate by publication date?
It will always look the same - high rate for most recent publications
Conceptually, there is something else going on
Track the request rate of each publication date by fiscal year
[Chart: Request Rate of Monographs by Publication Date: 1991-2011]
[Chart: Request Rates of 1991-2011 Imprints by Fiscal Year, FY01/02 to FY11/12. One series per publication year, 1991 to 2011]
…lead to innovation
Examples:
◦ Sciences: roll-out of subject analysis
◦ Area Studies: focus on language, place of publication and high use serials
Opportunities:
◦ Institutional knowledge
◦ Variety of interest
◦ Refining data
Department Presentations…
Request Rate by LC Class at Sciences
Iterative process of refining reports and data
◦ Call number added July 2009
◦ Parsed call numbers added Nov. 2010
◦ LC/Non-LC (852 i1) added Jan. 2012
For Avery it was a mess
Hard to analyze departments with non-LC alphanumeric call numbers
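Parsing the call number is what makes a class-level request rate possible. The Python sketch below shows one way to pull the leading LC class letters from a call number. The regular expression and the sample call numbers are illustrative assumptions; the actual reports may parse differently, for example by relying on the 852 first indicator to separate LC from non-LC numbers.

```python
import re
from typing import Optional

# One to three capital letters followed by a digit, e.g. "QA76.73" -> "QA"
LC_CLASS = re.compile(r"^\s*([A-Z]{1,3})\s*\d")

def lc_class(call_number: str) -> Optional[str]:
    """Return the leading class letters, or None if the pattern does not match."""
    match = LC_CLASS.match(call_number)
    return match.group(1) if match else None

print(lc_class("QA76.73 .P98 L36 2012"))  # QA
print(lc_class("HV6432 .S54"))            # HV
print(lc_class("AA737 M69"))              # AA (hypothetical non-LC number that still matches)
print(lc_class("812.54 M61"))             # None
```

The third example illustrates why non-LC alphanumeric call numbers are hard to analyze: a pattern match alone cannot tell them apart from genuine LC classes, so a separate LC/non-LC flag is needed.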
[Chart: Request Rate by LC Subject Heading. Classes shown: B, BF, E, G, GC, GF, H, HC, HE, HM, HQ, HV, JK, L, LD, P, QA, QC, QE, QK, QM, QR, RA, RC, RM, S, SD, SH, T, TC, TE, TJ, TL, TP, TS, V, Z]
What is Area Studies?
◦ Oops, I didn’t exactly know (it’s not “Lehman”)
Language/subject based selection
Focus on two CLIO locations:
◦ “Area Studies Collection” = off,glx and off,leh
Pamela helped focus topic: language, high-use titles and request rate
New challenge: language codes
Area Studies and ReCAP
[Chart: Accession of Area Studies Collections, FY01/02 to FY11/12]
System-wide Accession by Department
Department            Share
Avery                  6.4%
Burke                  4.7%
Business               6.1%
Butler                36.1%
East Asian            13.1%
Engineering            4.8%
Geology/Geoscience     1.4%
Health Science         4.5%
Journalism             0.0%
Lehman                 7.2%
Mathematics            0.0%
Music                  2.6%
RBML                   4.3%
Sciences off,bio       1.6%
Sciences off,che       1.4%
Sciences off,phy       1.3%
Sciences off,psy       0.8%
Sciences off,sci       1.6%
Social Work            1.1%
(blank)                1.0%
TOTAL: 1,482,420 / 181,195 (Request Rate 1.90%)
off,glx: 1,267,596 / 155,674 (1.81%)
off,leh: 214,824 / 25,521 (2.74%)
Accessions / Requests by CLIO Location
“Libraries” consist of multiple CLIO locations (collections)
For several years there has been a steady decline in EDD requests
Likely due to e-access to current titles and purchase of backfiles
Individual patrons can have a large effect on data (e.g. early 2009)
Pagination now required (August 2009)
Patrons must have active borrowing privileges to request EDD (August 2011)
EDD (Electronic Document Delivery)
[Chart: Monthly EDD Requests: Health Sciences vs All Other Departments, 3/1/2002 to 12/1/2011. Series: All Other Departments, HSL]
A high-use title is any title that has been requested 5 or more times since accession
Includes both physical delivery and EDD
Desire by staff to study high-use titles
Initial purpose of ReCAP was to shelve low-use collections
Due to space need, selector decision and patron trends, some titles may be considered higher- or high-use
Excludes: eng, fre, ger and ita
High-Use Titles at ReCAP
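The high-use definition can be applied mechanically to a request log. A minimal Python sketch, assuming the log is a sequence of (title ID, language code) pairs with one entry per request; the log format, the IDs, and the function name are hypothetical, while the threshold of 5 and the excluded language codes come from the slide.

```python
from collections import Counter

HIGH_USE_THRESHOLD = 5                           # "5 or more times since accession"
EXCLUDED_LANGUAGES = {"eng", "fre", "ger", "ita"}

def high_use_titles(request_log):
    """Return {title_id: request_count} for titles meeting the high-use threshold.

    request_log: iterable of (title_id, language_code) tuples, one per request
    (physical delivery or EDD). Requests in excluded languages are ignored.
    """
    counts = Counter(
        title_id for title_id, lang in request_log
        if lang not in EXCLUDED_LANGUAGES
    )
    return {t: n for t, n in counts.items() if n >= HIGH_USE_THRESHOLD}

# Made-up log: t1 and t4 qualify, t2 falls short, t3 is in an excluded language
log = [("t1", "spa")] * 6 + [("t2", "spa")] * 4 + [("t3", "eng")] * 9 + [("t4", "rus")] * 5
result = high_use_titles(log)
print(result)  # {'t1': 6, 't4': 5}
```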
High(est) Use Titles: Area Studies
Monographs Serials
Analysis of HU Spanish Serials
[Chart: High Use Spanish Language Serials in Area Studies by Month]
High Use Spanish Language Serials in Area Studies by Delivery Location
Butler             80.39%
Lehman              8.96%
ILL                 8.28%
EDD                 1.25%
Everywhere Else     1.13%
High Use Spanish Language Serials in Area Studies by Patron Group
REG                53.97%
GRD                33.22%
OFF                10.54%
VIS                 2.27%
[Chart: Distribution of HU spa Serials Requests by Publication Date, 1900 to 2011]
Compare to e-HLDGs
[Chart: Distribution of HU spa Serials Requests by Publication Date, 1900 to 2011]
Breakdown of Area Studies Requests by Language
English            78.3%
Spanish             7.4%
French              3.8%
Everything Else     2.5%
Russian             2.2%
German              1.8%
Arabic              1.3%
Italian             1.0%
Portuguese          0.9%
Persian             0.8%
More Data Needed
Language is good and can be tailored…but not the whole story
Place of publication important
Place of publication added Jan. 2012
Metadata must often be created to analyze
Will give Area Studies better data for decisions
But also creates new possibilities…
Place of Publication
Trailblazing
This whole thing was actually leading somewhere…
Impact of Mass Digitization
Elements:
◦ Request rate over time
◦ Faceted analysis
◦ New facet: place of publication
New data and methods allow posing the question: How has the request rate of pre-1923 USA imprints changed over the past 10 years?
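The question combines three facets: place of publication, publication era, and fiscal year. The Python sketch below shows one way such a faceted rate could be tabulated. It is an illustration under stated assumptions, not the method behind the charts: the record layout, the country codes, the sample data, and the simplification of dividing one fiscal year's requests by the number of matching items on the shelf are all hypothetical. The era boundaries (pre-1923, 1923-1963, post-1963) follow the chart legends.

```python
from collections import defaultdict

def era(pub_year: int) -> str:
    """Bucket a publication year into the eras used in the chart legends."""
    if pub_year < 1923:
        return "Pre-1923"
    if pub_year <= 1963:
        return "1923-1963"
    return "Post-1963"

def rate_by_era(items, requests, fiscal_year, country):
    """Request rate per era for one fiscal year and one place of publication.

    items:    {item_id: (pub_year, country_code)} for items on the shelf
    requests: iterable of (item_id, fiscal_year) request events
    """
    selected = {i: year for i, (year, place) in items.items() if place == country}
    on_shelf = defaultdict(int)
    requested = defaultdict(int)
    for year in selected.values():
        on_shelf[era(year)] += 1
    for item_id, fy in requests:
        if fy == fiscal_year and item_id in selected:
            requested[era(selected[item_id])] += 1
    return {e: requested[e] / n for e, n in on_shelf.items()}

# Made-up holdings and request events
holdings = {
    1: (1901, "USA"), 2: (1915, "USA"), 3: (1950, "USA"),
    4: (1999, "USA"), 5: (2005, "USA"), 6: (1900, "FRA"),
}
events = [(1, "FY10/11"), (4, "FY10/11"), (3, "FY09/10"), (6, "FY10/11")]
rates = rate_by_era(holdings, events, "FY10/11", "USA")
print(rates)  # {'Pre-1923': 0.5, '1923-1963': 0.0, 'Post-1963': 0.5}
```

Running the same function once per fiscal year would give the kind of time series plotted for each era.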
[Chart: Request Rate of Pre-1923 USA Imprints, FY01/02 to FY11/12]
[Chart: Request Rate of USA Imprints by Era, FY01/02 to FY11/12. Series: Pre-1923, 1923-1963, Post-1963]
[Chart: Request Rate of Foreign Imprints by Era, FY01/02 to FY11/12. Series: Pre-1923, 1923-1963, Post-1963]
Plato’s Cave
Every chart is a shadow on the cave wall: none is the full picture, and each is often difficult to discern
How can we pursue useful analysis?
How can results be shared?
More information about data sets can be found on the ReCAP Data Center
Primary data categories include: accession, retrieval, delivery and circulation
Tailored data sets and analysis will be provided to staff via the ReCAP Coordinator
Please see the main ReCAP website for general information about CUL procedures and systems
More Data Available