Adding data sources to the reporter

Post on 21-Jul-2015


Adding Data Sources to the Reporter

Evergreen International

Conference 2015

Amount of Stuff

• As of 4/14/15 – 696 tables in my production database

• 5,505 columns

• Over 30 million theoretical relationships

• Theory, not reality

• Still, there is a lot there
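One way to arrive at a figure like "over 30 million" is to pair every column with every other column. This is an assumption about where the number comes from, not something the slides state. A quick check of the arithmetic:

```python
# Assumption: a "theoretical relationship" is any ordered pairing of two
# columns, whether or not a real foreign key connects them.
columns = 5505  # column count from the slide

ordered_pairs = columns * columns         # every column paired with every column
distinct_pairs = columns * (columns - 1)  # excluding a column paired with itself

print(f"{ordered_pairs:,}")   # 30,305,025
print(f"{distinct_pairs:,}")  # 30,299,520
```

Either way of counting lands just over 30 million, which is why the number is theory and not reality: almost none of those pairings are meaningful joins.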

Do you want to do it?

Two Types of Change

Addition vs. Alteration

Overhead

Additions. Keeping track of your changes and adding them back in after releases.

Alterations. Keeping track of your changes and Evergreen’s, resolving conflicts, and adding them back in.

“Was the population of British citizens in India, in the 1800s, a significant part of the empire’s population?”

How to Answer a Reference Question in SQL (and Stretch a Metaphor)

1. Phase 1 – Ask the question.

2. Phase 2 – Allocate resources.

3. Phase 3 – Model the data (and query).

4. Phase 4 – Implement new data sources.

5. Phase 5 – Profit (or at least rejoice).

An Example

Phase 1 – Ask the question.

The Question

“How many Hugo and Nebula award winning novels do I have circulating?”

Is that the right question?

Broadening the scope of the question helps. You don’t want to end up on a never-ending road of new questions, but you do want to answer the question this time in a way that can easily be reused. Too narrow a scope can produce a solution that answers only one question.

Phase 2 – Allocate resources.

Allocating

• Gather external data: reference assistant

• Rough prototype of the data structure for the SQL table, AKA a Google Sheet: me

• Someone in IT or a data-savvy librarian to design tables and upload data

• Someone to write queries

Short vs. Long Term: Internal vs. External Data

Internal data is usually a new view, or otherwise uses only data internal to Evergreen. You can usually make this a self-perpetuating source.

External data will have to be maintained. Do you have the resources and commitment to do this?

What I did.

• I started a Google Sheet with the columns I thought I needed: award, category, author, title, year and notes.

• I pointed a reference assistant at it and asked them to fill it in for novels from the last thirty years.

• Then I reviewed it.
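Part of the review step can be automated before anything goes near the database. The sketch below assumes the sheet has been exported as CSV with the column names listed above; the `review_sheet` helper and the sample rows are hypothetical placeholders, not real award data.

```python
import csv
import io

# Column names taken from the slide; everything else here is illustrative.
EXPECTED_COLUMNS = ["award", "category", "author", "title", "year", "notes"]


def review_sheet(csv_text):
    """Return a list of human-readable problems found in the exported sheet."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for required in ("award", "author", "title"):
            if not (row[required] or "").strip():
                problems.append(f"line {line_no}: empty {required}")
        if not (row["year"] or "").strip().isdigit():
            problems.append(f"line {line_no}: year is not a number")
    return problems


# Placeholder rows, not real award data.
sample = (
    "award,category,author,title,year,notes\n"
    "Hugo,Novel,Example Author,Example Title,1999,\n"
    "Nebula,Novel,,Another Title,20x1,\n"
)
for problem in review_sheet(sample):
    print(problem)
```

A check like this does not replace a human read-through, but it catches blank cells and mistyped years before they become rows that silently fail to match anything.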

Phase 3 – Model the data (and query).

Everyone must work together at this stage to understand what can be stored and what can be queried.

1 Table or 3?

I could have created a single table, but I decided to go with three instead.

Awards Table: Author, Title

Reporter SSR: Author, Title, Record

Call Numbers: Record

Copies: Call Number, Circulating Library

Org Unit

Testing the Query

• Now I’m going to prototype in SQL what I want to do later in the reporter.

• It’s faster for prototyping than the reporter, and it lets me know if anything critical needs to change before I have more to change.

• Or, if you decide to scrap it, you start catching up on House of Cards that much sooner.

A Quick SQL Query

select era.year, era.author, era.title, era.notes, cat.category, agency.agency,
       array_agg(aou.shortname) as reporterstuff
from extend_reporter.awards era
join extend_reporter.awards_agency agency on agency.id = era.agency
left join extend_reporter.awards_category cat on cat.id = era.category
join reporter.super_simple_record ssr on ssr.author = era.author
-- call numbers point at their bib record through acn.record
join asset.call_number acn on acn.record = ssr.id
join asset.copy ac on ac.call_number = acn.id
join actor.org_unit aou on aou.id = ac.circ_lib
where agency.agency ilike '%hugo%' and ssr.title = era.title
group by 1, 2, 3, 4, 5, 6;
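To try the shape of this join chain without touching a production database, here is a minimal sketch using Python's built-in sqlite3 module. It rests on several assumptions: SQLite stands in for PostgreSQL, the tables are trimmed to the columns the joins need, schema prefixes are dropped, `group_concat` replaces `array_agg`, `LIKE` replaces `ILIKE`, and call numbers are joined to records through the call number's record column. All rows are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE awards_agency (id INTEGER PRIMARY KEY, agency TEXT);
    CREATE TABLE awards (id INTEGER PRIMARY KEY, agency INTEGER, year INTEGER,
                         author TEXT, title TEXT);
    CREATE TABLE super_simple_record (id INTEGER PRIMARY KEY, author TEXT, title TEXT);
    CREATE TABLE call_number (id INTEGER PRIMARY KEY, record INTEGER);
    CREATE TABLE copy (id INTEGER PRIMARY KEY, call_number INTEGER, circ_lib INTEGER);
    CREATE TABLE org_unit (id INTEGER PRIMARY KEY, shortname TEXT);

    -- Placeholder rows, not real catalog or award data.
    INSERT INTO awards_agency VALUES (1, 'Hugo Award'), (2, 'Nebula Award');
    INSERT INTO awards VALUES (1, 1, 1999, 'Example Author', 'Example Title'),
                              (2, 2, 2001, 'Other Author', 'Other Title');
    INSERT INTO super_simple_record VALUES (10, 'Example Author', 'Example Title'),
                                           (11, 'Other Author', 'Other Title');
    INSERT INTO call_number VALUES (100, 10), (101, 11);
    INSERT INTO copy VALUES (1000, 100, 1), (1001, 100, 2), (1002, 101, 1);
    INSERT INTO org_unit VALUES (1, 'BR1'), (2, 'BR2');
""")

# Same chain as the slide: award -> record -> call number -> copy -> library.
rows = conn.execute("""
    SELECT era.year, era.author, era.title, agency.agency,
           group_concat(aou.shortname) AS libraries
    FROM awards era
    JOIN awards_agency agency ON agency.id = era.agency
    JOIN super_simple_record ssr
         ON ssr.author = era.author AND ssr.title = era.title
    JOIN call_number acn ON acn.record = ssr.id
    JOIN copy ac ON ac.call_number = acn.id
    JOIN org_unit aou ON aou.id = ac.circ_lib
    WHERE agency.agency LIKE '%hugo%'
    GROUP BY era.year, era.author, era.title, agency.agency
""").fetchall()

for row in rows:
    print(row)
```

Matching on exact author and title strings is the fragile part of this design: any difference in punctuation or name order between the awards sheet and the catalog drops the row, so it is worth spot-checking titles you know you own.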

Results

Phase 4 – Implement new data sources.

Remember when I said this wouldn’t be about the technology so much?

create table extend_reporter.awards_agency (
    id serial PRIMARY KEY,
    agency text
);

create table extend_reporter.awards_category (
    id serial PRIMARY KEY,
    category text
);

create table extend_reporter.awards (
    id serial,
    agency INT NOT NULL REFERENCES extend_reporter.awards_agency (id),
    category INT REFERENCES extend_reporter.awards_category (id),
    year integer NOT NULL,
    author text NOT NULL,
    title text NOT NULL,
    notes text
);

1) Create tables.

2) Upload data.

3) Create field mapper entries (fm_IDL.xml).

4) Update IDLs on server (both).

5) Run autogen.

6) Restart services, Apache and clark-kent.
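Step 2, uploading data, means turning the flat sheet into rows across the three tables. The sketch below shows one way that split could work, under stated assumptions: SQLite stands in for PostgreSQL, schema prefixes are dropped, and the `lookup_id` and `load_awards` helpers and all rows are hypothetical. It is not the loader used in the talk.

```python
import sqlite3


def lookup_id(conn, table, column, value):
    """Return the id for value in a lookup table, inserting it if needed."""
    # table and column are fixed strings supplied by load_awards, never user input.
    row = conn.execute(
        f"SELECT id FROM {table} WHERE {column} = ?", (value,)
    ).fetchone()
    if row:
        return row[0]
    cursor = conn.execute(f"INSERT INTO {table} ({column}) VALUES (?)", (value,))
    return cursor.lastrowid


def load_awards(conn, sheet_rows):
    """Split flat sheet rows into the agency, category and awards tables."""
    for r in sheet_rows:
        agency_id = lookup_id(conn, "awards_agency", "agency", r["award"])
        category_id = None  # category is nullable in the table definition
        if r["category"]:
            category_id = lookup_id(conn, "awards_category", "category", r["category"])
        conn.execute(
            "INSERT INTO awards (agency, category, year, author, title, notes) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (agency_id, category_id, int(r["year"]), r["author"], r["title"],
             r["notes"] or None),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE awards_agency (id INTEGER PRIMARY KEY, agency TEXT);
    CREATE TABLE awards_category (id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE awards (
        id INTEGER PRIMARY KEY,
        agency INTEGER NOT NULL REFERENCES awards_agency (id),
        category INTEGER REFERENCES awards_category (id),
        year INTEGER NOT NULL, author TEXT NOT NULL,
        title TEXT NOT NULL, notes TEXT);
""")

# Placeholder rows, not real award data.
sheet_rows = [
    {"award": "Hugo", "category": "Novel", "author": "Example Author",
     "title": "Example Title", "year": "1999", "notes": ""},
    {"award": "Nebula", "category": "Novel", "author": "Other Author",
     "title": "Other Title", "year": "2001", "notes": ""},
    {"award": "Hugo", "category": "", "author": "Third Author",
     "title": "Third Title", "year": "2005", "notes": "category unknown"},
]
load_awards(conn, sheet_rows)
print(conn.execute("SELECT COUNT(*) FROM awards").fetchone()[0])
```

Resolving agency and category names to ids at load time keeps each name stored once, which is the practical payoff of choosing three tables over one.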

fm_IDL.xml

• AKA the Field Mapper

• /openils/var/web/reports/fm_IDL.xml

• /openils/conf/fm_IDL.xml

What does it do?

1. Controls how source information appears in the reporter.

2. Controls how sources connect in the reporter.

<class id="eraw" controller="open-ils.cstore open-ils.pcrud"
       oils_obj:fieldmapper="extend_reporter::awards"
       oils_persist:tablename="extend_reporter.awards"
       reporter:core="false" reporter:label="Award Winners">
  <fields oils_persist:primary="id">
    <field reporter:label="All Award Agencies" name="agency"
           oils_persist:virtual="true" reporter:datatype="link"/>
    <field reporter:label="All Award Categories" name="category"
           oils_persist:virtual="true" reporter:datatype="link"/>
    <field reporter:label="Year of Award" name="year" reporter:datatype="int"/>
    <field reporter:label="Author" name="author" reporter:datatype="text"/>
    <field reporter:label="Title" name="title" reporter:datatype="text"/>
    <field reporter:label="Notes" name="notes" reporter:datatype="text"/>
  </fields>
  <links>
    <link field="agency" reltype="has_a" key="id" map="" class="erawagency"/>
    <link field="category" reltype="might_have" key="id" map="" class="erawcat"/>
  </links>
</class>

fm_IDL.xml part

Two Tips

• As your group is working, be careful which collaboration tools you use. Formatting can be evil: SQL can suffer from single quotes being changed, and the fieldmapper XML doesn’t like capitalization.

• If you have trouble with your field mapper entries look for similar examples in the stock one.
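The quote problem in the first tip is easy to check for mechanically. This is a minimal sketch, assuming the fragment is pasted in as a string. The namespace URIs in the wrapper are placeholders chosen only so the prefixed attributes parse; they are not the ones the real fm_IDL.xml declares, and both helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Left/right single and double curly quotes that editors like to substitute.
CURLY_QUOTES = "\u2018\u2019\u201c\u201d"


def find_curly_quotes(text):
    """Return (line_number, character) for each curly quote in text."""
    return [
        (line_no, ch)
        for line_no, line in enumerate(text.splitlines(), start=1)
        for ch in line
        if ch in CURLY_QUOTES
    ]


def is_well_formed(fragment):
    """Report whether an fm_IDL-style fragment parses as XML."""
    # Placeholder namespace URIs so that prefixed attributes are accepted.
    wrapped = (
        '<IDL xmlns:reporter="urn:example:reporter" '
        'xmlns:oils_persist="urn:example:persist">' + fragment + "</IDL>"
    )
    try:
        ET.fromstring(wrapped)
        return True
    except ET.ParseError:
        return False


good = '<field reporter:label="Title" name="title" reporter:datatype="text"/>'
# Same field with one closing quote turned curly by a word processor.
bad = '<field reporter:label="Title" name="title\u201d reporter:datatype="text"/>'

print(is_well_formed(good), find_curly_quotes(good))
print(is_well_formed(bad), find_curly_quotes(bad))
```

The same `find_curly_quotes` check works on SQL files, where a curly quote will not break parsing of the file itself but will break the statement when it runs.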

Phase 5 – Profit (or at least rejoice).

The Aftermath

• Document and write manual

• Back up changes (git?)

• Write reports

Now Catch Up On House of Cards

Two Other Examples

Purchase Alert

• Calculating thresholds, combining different kinds of holds

• Make easier for staff

Weeding Reports

• Unions of never-checked-out and not-recently-checked-out items

• Takes a long time to run

• Make easier for staff