Adding data sources to the reporter

Adding Data Sources to the Reporter Evergreen International Conference 2015

Amount of Stuff

• As of 4/14/15 – 696 tables in my production database

• 5,505 columns

• Over 30 million theoretical relationships

• Theory, not reality

• Still, there is a lot there
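
Counts like these can be pulled from PostgreSQL's information_schema. The following is a minimal sketch, not necessarily how the numbers on this slide were produced. It assumes that only base tables (not views) outside the system schemas are counted, and that "theoretical relationships" means every column paired with every column (5,505 × 5,505 = 30,305,025).

```sql
-- Sketch: count user tables and columns, then square the column count.
-- Assumption: views and the system schemas are excluded; the slide's
-- figures may have been computed differently.
WITH user_tables AS (
    SELECT table_schema, table_name
    FROM information_schema.tables
    WHERE table_type = 'BASE TABLE'
      AND table_schema NOT IN ('pg_catalog', 'information_schema')
),
counts AS (
    SELECT
        (SELECT count(*) FROM user_tables) AS table_count,
        (SELECT count(*)
           FROM information_schema.columns c
           JOIN user_tables t
             ON t.table_schema = c.table_schema
            AND t.table_name = c.table_name) AS column_count
)
SELECT table_count,
       column_count,
       column_count * column_count AS theoretical_relationships
FROM counts;
```

Note that information_schema only lists objects the connected role has privileges on, so the totals can differ between users.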

Do you want to do it?

Two Types of Change

Addition vs. Alteration

Overhead

Additions. Keeping track of your changes and adding them back in after releases.

Alterations. Keeping track of both your changes and Evergreen’s, resolving conflicts, and adding them back in.

“Was the population of British citizens in India, in the 1800s, a significant part of the empire’s population?”

How to Answer a Reference Question in SQL (and Stretch a Metaphor)

1. Phase 1 – Ask the question.

2. Phase 2 – Allocate resources.

3. Phase 3 – Model the data (and query).

4. Phase 4 – Implement new data sources.

5. Phase 5 – Profit (or at least rejoice).

An Example

Phase 1 – Ask the question.

The Question

“How many Hugo and Nebula award winning novels do I have circulating?”

Is that the right question?

Broadening the scope of the question helps. You don’t want to end up down a never-ending road of new questions, but you do want to answer the question this time in a way that can easily be reused. Too narrow a scope can create a solution that only answers one question.

Phase 2 – Allocate resources.

Allocating

• Gather external data: reference assistant

• Rough prototype of the data structure for the SQL table (AKA a Google Sheet): me

• Someone in IT or a data-savvy librarian to design tables and upload data

• Someone to write queries

Short vs. Long Term

Internal vs. External Data

Internal data usually means a new view, or some other source that uses only data already inside Evergreen. You can usually make this a self-perpetuating source.

External data will have to be maintained. Do you have the resources and commitment to do this?

What I did.

• I started a Google Sheet with the columns I thought I needed: award, category, author, title, year and notes.

• I pointed a reference assistant at it and asked them to fill it in for novels from the last thirty years.

• Then I reviewed it.

Phase 3 – Model the data (and query).

Everyone must work together at this stage to understand what can be stored and what can be queried.

1 Table or 3

I could have created a single table, but I decided to go with three instead.

Awards Table: Author, Title

Reporter SSR: Author, Title, Record

Call Numbers: Record

Copies: Call Number, Circulating Library

Org Unit

Testing the Query

• Now I’m going to prototype in SQL what I want to do later in the reporter.

• SQL is faster for prototyping than the reporter, and it lets me know if there’s anything critical to change before I have more to change.

• Or, if you decide to scrap it, that’s that much sooner you can start catching up on House of Cards.

A Quick SQL Query

select era.year, era.author, era.title, era.notes, cat.category, agency.agency,
       array_agg(aou.shortname) as reporterstuff
from extend_reporter.awards era
join extend_reporter.awards_agency agency on agency.id = era.agency
left join extend_reporter.awards_category cat on cat.id = era.category
join reporter.super_simple_record ssr on ssr.author = era.author
-- call numbers link to bibliographic records through acn.record
join asset.call_number acn on acn.record = ssr.id
join asset.copy ac on ac.call_number = acn.id
join actor.org_unit aou on aou.id = ac.circ_lib
where agency.agency ilike '%hugo%' and ssr.title = era.title
group by 1, 2, 3, 4, 5, 6;
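
To get closer to the original question (a count rather than a list), the same joins can be collapsed into one row per award. This is a sketch under assumptions: it joins call numbers to records on acn.record, skips rows flagged as deleted in asset.call_number and asset.copy, and does not filter by category because the spreadsheet only held novels. It also relies on exact author and title matches, so it will likely miss records whose catalogued form differs from the spreadsheet.

```sql
-- Sketch: how many award-winning titles have at least one non-deleted copy?
SELECT agency.agency,
       count(DISTINCT era.id) AS winners_held
FROM extend_reporter.awards era
JOIN extend_reporter.awards_agency agency ON agency.id = era.agency
JOIN reporter.super_simple_record ssr
  ON ssr.author = era.author
 AND ssr.title = era.title
JOIN asset.call_number acn
  ON acn.record = ssr.id
 AND NOT acn.deleted
JOIN asset.copy ac
  ON ac.call_number = acn.id
 AND NOT ac.deleted
WHERE agency.agency ILIKE '%hugo%'
   OR agency.agency ILIKE '%nebula%'
GROUP BY agency.agency
ORDER BY agency.agency;
```

Counting distinct award rows rather than copies keeps a title held at many branches from being counted more than once.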

Results

Phase 4 – Implement new data sources.

Remember when I said this wouldn’t be about the technology so much?

create table extend_reporter.awards_agency (
    id serial PRIMARY KEY,
    agency text
);

create table extend_reporter.awards_category (
    id serial PRIMARY KEY,
    category text
);

create table extend_reporter.awards (
    id serial PRIMARY KEY,
    agency INT NOT NULL REFERENCES extend_reporter.awards_agency (id),
    category INT REFERENCES extend_reporter.awards_category (id),
    year integer NOT NULL,
    author text NOT NULL,
    title text NOT NULL,
    notes text
);

1) Create tables.

2) Upload data.

3) Create field mapper entries (fm_IDL.xml).

4) Update both copies of the IDL on the server.

5) Run autogen.

6) Restart services: Apache and clark-kent.
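
Step 2 can be as simple as a few INSERT statements once the spreadsheet has been reviewed. The sketch below loads a single row by hand and assumes the three extend_reporter tables created above already exist. The agency and category labels and the "Last, First" author form are illustrative assumptions; they should match however the names actually appear in reporter.super_simple_record.

```sql
-- Sketch of step 2: load one reviewed spreadsheet row.
-- Assumption: labels and author format are illustrative, not from the talk.
BEGIN;

INSERT INTO extend_reporter.awards_agency (agency) VALUES ('Hugo Award');
INSERT INTO extend_reporter.awards_category (category) VALUES ('Best Novel');

-- Look the foreign keys up by label rather than hard-coding serial ids.
INSERT INTO extend_reporter.awards (agency, category, year, author, title, notes)
SELECT a.id, c.id, 2014, 'Leckie, Ann', 'Ancillary Justice', NULL::text
FROM extend_reporter.awards_agency a
CROSS JOIN extend_reporter.awards_category c
WHERE a.agency = 'Hugo Award'
  AND c.category = 'Best Novel';

COMMIT;
```

For a longer sheet, exporting to CSV and loading it with COPY into a staging table is a common alternative to hand-written inserts.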

fm_IDL.xml

• AKA the Field Mapper

• /openils/var/web/reports/fm_IDL.xml

• /openils/conf/fm_IDL.xml

What does it do?

1. Controls how source information appears in the reporter.

2. Controls how sources connect in the reporter.

fm_IDL.xml part

<class id="eraw" controller="open-ils.cstore open-ils.pcrud"
       oils_obj:fieldmapper="extend_reporter::awards"
       oils_persist:tablename="extend_reporter.awards"
       reporter:core="false" reporter:label="Award Winners">
    <fields oils_persist:primary="id">
        <field reporter:label="All Award Agencies" name="agency"
               oils_persist:virtual="true" reporter:datatype="link"/>
        <field reporter:label="All Award Categories" name="category"
               oils_persist:virtual="true" reporter:datatype="link"/>
        <field reporter:label="Year of Award" name="year" reporter:datatype="int"/>
        <field reporter:label="Author" name="author" reporter:datatype="text"/>
        <field reporter:label="Title" name="title" reporter:datatype="text"/>
        <field reporter:label="Notes" name="notes" reporter:datatype="text"/>
    </fields>
    <links>
        <link field="agency" reltype="has_a" key="id" map="" class="erawagency"/>
        <link field="category" reltype="might_have" key="id" map="" class="erawcat"/>
    </links>
</class>

Two Tips

• As your group works, be careful which collaboration tools you use. Formatting can be evil: SQL can suffer from straight single quotes being changed to curly ones, and the fieldmapper XML doesn’t like stray capitalization.

• If you have trouble with your field mapper entries, look for similar examples in the stock one.

Phase 5 – Profit (or at least rejoice).

The Aftermath

• Document and write manual

• Back up changes (git?)

• Write reports

Now Catch Up On House of Cards

Two Other Examples

Purchase Alert

• Calculating thresholds, combining different kinds of holds

• Make it easier for staff

Weeding Reports

• Unions of never-checked-out items and items with no recent checkouts

• Takes a long time to run

• Make it easier for staff
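
A weeding source of this kind could be built as a view, so staff see one reporter source instead of assembling the union themselves. The following is a rough sketch, not the view used in the talk. The view name extend_reporter.weeding_candidates and the three-year cutoff are hypothetical, and only action.circulation is consulted, so checkouts that have already been aged out of that table are ignored.

```sql
-- Sketch: union of never-circulated copies and copies with no recent checkout.
-- Hypothetical view name and cutoff; adjust both to local policy.
CREATE OR REPLACE VIEW extend_reporter.weeding_candidates AS
SELECT ac.id AS copy_id,
       ac.circ_lib,
       NULL::timestamptz AS last_checkout,
       'never checked out'::text AS reason
FROM asset.copy ac
WHERE NOT ac.deleted
  AND NOT EXISTS (
      SELECT 1
      FROM action.circulation circ
      WHERE circ.target_copy = ac.id
  )
UNION ALL
SELECT ac.id AS copy_id,
       ac.circ_lib,
       max(circ.xact_start) AS last_checkout,
       'no recent checkout'::text AS reason
FROM asset.copy ac
JOIN action.circulation circ ON circ.target_copy = ac.id
WHERE NOT ac.deleted
GROUP BY ac.id, ac.circ_lib
HAVING max(circ.xact_start) < now() - interval '3 years';
```

Like the awards tables, a view like this still needs its own fieldmapper entry before it shows up in the reporter.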
