Adding data sources to the reporter
rogan-hamby
Adding Data Sources to the Reporter
Evergreen International
Conference 2015
Amount of Stuff
• As of 4/14/15 – 696 tables in my production database
• 5,505 columns
• Over 30 million theoretical relationships
• Theory, not reality
• Still, there are a lot there
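The slide does not say how the figure of over 30 million was reached. One reading that reproduces it, offered here only as an assumption, is that any of the 5,505 columns could in theory be paired with any column:

```python
# Figures quoted above for the production database as of 4/14/15.
tables = 696
columns = 5505

# Assumption (not stated in the talk): a "theoretical relationship" is any
# pairing of one column with any column, so the count is columns squared.
theoretical_relationships = columns * columns
average_columns_per_table = columns / tables

print(f"{theoretical_relationships:,} theoretical pairings")
print(f"{average_columns_per_table:.1f} columns per table on average")
```

Almost none of those pairings are meaningful joins, which is the "theory, not reality" point.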
Do you want to do it?
Two Types of Change
Addition vs. Alteration
Overhead
Additions. Keeping track of your changes and adding them back in after releases.
Alterations. Keeping track of your changes and Evergreen’s, resolving conflicts, and adding them back in.
“Was the population of British citizens in
India, in the 1800s, a significant part of
the empire’s population?”
How to Answer a Reference Question in SQL (and Stretch a Metaphor)
1. Phase 1 – Ask the question.
2. Phase 2 – Allocate resources.
3. Phase 3 – Model the data (and query).
4. Phase 4 – Implement new data sources.
5. Phase 5 – Profit (or at least rejoice).
An Example
Phase 1 – Ask the question.
The Question
“How many Hugo and Nebula award winning novels do I have circulating?”
Is that the right question?
Broadening the scope of the question helps. You don’t want to end up down a never-ending road of new questions, but you do want to answer the question this time in a way that can easily be reused. Too narrow a scope can create a solution that only answers one question.
Phase 2 – Allocate resources.
Allocating
• Gather external data: reference assistant
• Rough prototype of data structure for SQL table, AKA Google Sheet: me
• Someone in IT or a data-savvy librarian to design tables and upload data
• Someone to write queries
Short vs. Long Term
Internal vs. External Data
Internal data is usually a new view, or otherwise uses only data internal to Evergreen. You can usually make this a self-perpetuating source.
External data will have to be maintained. Do you have the resources and commitment to do this?
What I did.
• I started a Google Sheet with the columns I thought I needed: award, category, author, title, year and notes.
• I pointed a reference assistant at it and asked them to fill it in for novels from the last thirty years.
• Then I reviewed it.
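The review step can be partly automated. The following is a minimal sketch, assuming the sheet is exported as CSV with the column names listed above; the function name and the sample rows are hypothetical placeholders, not real award data.

```python
import csv
import io

# Columns from the sheet that should always be filled in; notes may be blank.
REQUIRED = ["award", "category", "author", "title", "year"]


def review_rows(csv_text):
    """Return a list of (row_number, problem) for rows that need a second look."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for number, row in enumerate(reader, start=2):  # row 1 is the header
        for column in REQUIRED:
            if not (row.get(column) or "").strip():
                problems.append((number, f"missing {column}"))
        year = (row.get("year") or "").strip()
        if year and not year.isdigit():
            problems.append((number, f"year is not a number: {year!r}"))
    return problems


# Placeholder rows, not real award data.
sample = """award,category,author,title,year,notes
Example Award,Novel,Example Author,Example Title,1999,
Example Award,Novel,,Another Title,19x9,check this
"""

for row_number, problem in review_rows(sample):
    print(row_number, problem)
```

A check like this only catches gaps and malformed years. Whether the right book is listed for the right year still needs a person.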
Phase 3 – Model the data (and query).
Everyone must work together at this stage to understand what can be stored and what
can be queried.
1 Table or 3
I could have created a single table but I instead decided to go with three.
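As a rough illustration of what going from one flat sheet to three tables involves, here is a sketch that assumes the three are the awards, awards_agency and awards_category tables created later in the talk, and that the sheet's "award" column holds the agency name. The function name and sample rows are hypothetical.

```python
def split_into_three(rows):
    """Split flat sheet rows into agency, category, and award records."""
    agencies = {}    # agency name -> id
    categories = {}  # category name -> id
    awards = []
    for row in rows:
        # setdefault hands back the existing id, or assigns the next one.
        agency_id = agencies.setdefault(row["award"], len(agencies) + 1)
        category_id = categories.setdefault(row["category"], len(categories) + 1)
        awards.append({
            "agency": agency_id,
            "category": category_id,
            "year": int(row["year"]),
            "author": row["author"],
            "title": row["title"],
            "notes": row["notes"] or None,
        })
    return agencies, categories, awards


# Placeholder rows, not real award data.
rows = [
    {"award": "Award A", "category": "Novel", "author": "Author One",
     "title": "Title One", "year": "1990", "notes": ""},
    {"award": "Award B", "category": "Novel", "author": "Author One",
     "title": "Title One", "year": "1991", "notes": "also won Award A"},
    {"award": "Award A", "category": "Novel", "author": "Author Two",
     "title": "Title Two", "year": "1991", "notes": ""},
]

agencies, categories, awards = split_into_three(rows)
print(agencies)
print(categories)
print(len(awards), "award rows")
```

Each agency and category name is stored once and referenced by id, so a spelling fix happens in one place and staff pick from a fixed list rather than matching free text.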
Awards Table: Author, Title
Reporter SSR: Author, Title, Record
Call Numbers: Record
Copies: Call Number, Circulating Library
Org Unit
Testing the Query
• Now I’m going to prototype in SQL what I want to do later in the reporter.
• It’s faster for prototyping than the reporter, and it lets me know if there’s anything critical to change before I have more to change.
• Or if you decide to scrap it, it’s that much sooner you start catching up on House of Cards.
A Quick SQL Query
select era.year, era.author, era.title, era.notes, cat.category, agency.agency,
       array_agg(aou.shortname) as reporterstuff
from extend_reporter.awards era
join extend_reporter.awards_agency agency on agency.id = era.agency
left join extend_reporter.awards_category cat on cat.id = era.category
join reporter.super_simple_record ssr on ssr.author = era.author
-- call numbers point at bibliographic records through acn.record, not acn.id
join asset.call_number acn on acn.record = ssr.id
join asset.copy ac on ac.call_number = acn.id
join actor.org_unit aou on aou.id = ac.circ_lib
where agency.agency ilike '%hugo%' and ssr.title = era.title
group by 1, 2, 3, 4, 5, 6;
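To try the shape of this query without touching a production database, the join path can be rehearsed on a throwaway in-memory database. This is only a sketch under several assumptions: SQLite stands in for PostgreSQL, the table names drop their schema prefixes, group_concat replaces array_agg, call numbers are joined to records through their record column as in the diagram (Call Numbers: Record), and every row is a made-up placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table awards_agency (id integer primary key, agency text);
create table awards_category (id integer primary key, category text);
create table awards (id integer primary key, agency int, category int,
                     year int, author text, title text, notes text);
create table super_simple_record (id integer primary key, author text, title text);
create table call_number (id integer primary key, record int);
create table copy (id integer primary key, call_number int, circ_lib int);
create table org_unit (id integer primary key, shortname text);

-- Placeholder rows only; none of this is real award or catalog data.
insert into awards_agency values (1, 'Hugo Award'), (2, 'Other Award');
insert into awards_category values (1, 'Novel');
insert into awards values
    (1, 1, 1, 1999, 'Author One', 'Title One', null),
    (2, 2, 1, 1999, 'Author Two', 'Title Two', null);
insert into super_simple_record values
    (500, 'Author One', 'Title One'), (501, 'Author Two', 'Title Two');
insert into call_number values (10, 500), (11, 501);
insert into copy values (100, 10, 4), (101, 10, 5), (102, 11, 4);
insert into org_unit values (4, 'BR1'), (5, 'BR2');
""")

rows = conn.execute("""
select era.year, era.author, era.title, cat.category, agency.agency,
       group_concat(aou.shortname) as libraries
from awards era
join awards_agency agency on agency.id = era.agency
left join awards_category cat on cat.id = era.category
join super_simple_record ssr
     on ssr.author = era.author and ssr.title = era.title
join call_number acn on acn.record = ssr.id
join copy ac on ac.call_number = acn.id
join org_unit aou on aou.id = ac.circ_lib
where agency.agency like '%hugo%'
group by era.year, era.author, era.title, cat.category, agency.agency
""").fetchall()

for row in rows:
    print(row)
```

Only the placeholder Hugo row comes back, with both circulating libraries aggregated into one column. Note that the match on author and title is an exact string comparison, so differences in punctuation or name order between the sheet and the catalog will silently drop rows.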
Results
Phase 4 – Implement new data sources.
Remember when I said this wouldn’t be about the technology so much?
create table extend_reporter.awards_agency (
    id serial PRIMARY KEY,
    agency text
);
create table extend_reporter.awards_category (
    id serial PRIMARY KEY,
    category text
);
create table extend_reporter.awards (
    id serial,
    agency INT NOT NULL REFERENCES extend_reporter.awards_agency (id),
    category INT REFERENCES extend_reporter.awards_category (id),
    year integer NOT NULL,
    author text NOT NULL,
    title text NOT NULL,
    notes text
);
1) Create tables.
2) Upload data.
3) Create field mapper entries (fm_IDL.xml).
4) Update IDLs on server (both).
5) Run autogen
6) Restart services, Apache and clark-kent.
fm_IDL.xml
• Aka the Field Mapper
• /openils/var/web/reports/fm_IDL.xml
• /openils/conf/fm_IDL.xml
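Step 4 above says to update both IDLs, and it is easy to edit one copy and forget the other. The sketch below assumes the two files listed here are meant to have identical contents, which the talk implies but does not state outright; the helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def copies_match(first, second):
    """True when both files exist and have identical contents."""
    first, second = Path(first), Path(second)
    if not (first.is_file() and second.is_file()):
        return False
    return file_digest(first) == file_digest(second)


if __name__ == "__main__":
    # The two locations listed above.
    web_copy = "/openils/var/web/reports/fm_IDL.xml"
    conf_copy = "/openils/conf/fm_IDL.xml"
    print("in sync" if copies_match(web_copy, conf_copy) else "differ or missing")
```

Running a check like this before restarting services can catch a half-applied change.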
What does it do?
1. Controls how source information
appears in the reporter.
2. Controls how sources connect in
the reporter.
<class id="eraw" controller="open-ils.cstore open-ils.pcrud"
oils_obj:fieldmapper="extend_reporter::awards"
oils_persist:tablename="extend_reporter.awards"
reporter:core="false" reporter:label="Award Winners">
<fields oils_persist:primary="id">
<field reporter:label="All Award Agencies" name="agency"
oils_persist:virtual="true" reporter:datatype="link"/>
<field reporter:label="All Award Categories" name="category"
oils_persist:virtual="true" reporter:datatype="link"/>
<field reporter:label="Year of Award" name="year" reporter:datatype="int"/>
<field reporter:label="Author" name="author" reporter:datatype="text"/>
<field reporter:label="Title" name="title" reporter:datatype="text"/>
<field reporter:label="Notes" name="notes" reporter:datatype="text"/>
</fields>
<links>
<link field="agency" reltype="has_a" key="id" map="" class="erawagency"/>
<link field="category" reltype="might_have" key="id" map="" class="erawcat"/>
</links>
</class>
fm_IDL.xml part
Two Tips
• As your group is working, be careful which collaboration tools you use. Formatting can be evil: SQL can suffer from straight single quotes being changed to curly ones, and the fieldmapper XML doesn’t like its capitalization being changed.
• If you have trouble with your field mapper entries look for similar examples in the stock one.
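The quote problem in the first tip is easy to scan for before anything is pasted onto the server. A minimal sketch, with hypothetical function names, that reports curly quotes in a block of SQL or XML and can swap them for plain ones:

```python
# Typographic characters that word processors and slide tools often
# substitute for the plain ASCII quotes that SQL and XML expect.
SMART_QUOTES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}


def find_smart_quotes(text):
    """Return (line_number, column, character) for each curly quote found."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for column, character in enumerate(line, start=1):
            if character in SMART_QUOTES:
                hits.append((line_number, column, character))
    return hits


def straighten(text):
    """Replace curly quotes with their plain ASCII equivalents."""
    return text.translate(str.maketrans(SMART_QUOTES))


# One attribute with a curly closing quote, as a collaboration tool might leave it.
snippet = 'oils_persist:tablename="extend_reporter.awards\u201d\n'
print(find_smart_quotes(snippet))
print(straighten(snippet))
```

Reporting is safer than blindly straightening, since a curly apostrophe inside a title or a notes value may be intentional data rather than damage.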
Phase 5 – Profit (or at least rejoice).
The Aftermath
• Document and write manual
• Back up changes (git?)
• Write reports
Now Catch Up On House of Cards
Two Other Examples
Purchase Alert
• Calculating thresholds, combining different kinds of holds
• Make easier for staff
Weeding Reports
• Unions of never-checked-out and not-recently-checked-out items
• Takes a long time to run
• Make easier for staff
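The weeding report is described as a union of two groups. The logic can be sketched as below, assuming each copy has either no checkout on record or a last checkout date, and that "not recent" means before some cutoff a library would choose for itself. The function name, copy ids and dates are all hypothetical.

```python
from datetime import date


def weeding_candidates(last_checkout_by_copy, cutoff):
    """Union of copies never checked out and copies not checked out since cutoff."""
    never = {
        copy_id
        for copy_id, last in last_checkout_by_copy.items()
        if last is None
    }
    not_recent = {
        copy_id
        for copy_id, last in last_checkout_by_copy.items()
        if last is not None and last < cutoff
    }
    return never | not_recent


# Placeholder copy ids and dates, not real circulation data.
copies = {
    101: None,                # never checked out
    102: date(2009, 5, 1),    # last checked out before the cutoff
    103: date(2015, 1, 15),   # checked out recently
}

print(sorted(weeding_candidates(copies, cutoff=date(2012, 1, 1))))
```

In the database the two groups come from different sources, which is part of why the real report takes a long time to run and why packaging it as a single reporter source is easier for staff.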