UKSG Conference 2016 Breakout Session - Who’s reading your valuable content and did they really...

UKSG PRESENTATION PUBLISHER SOLUTIONS INTERNATIONAL, KEITH ABBOTT AND CHARLIE WHITE APRIL 2016 BOURNEMOUTH

Page 1: UKSG Conference 2016 Breakout Session - Who’s reading your valuable content and did they really pay for it?, Keith Abbott, Charles White and Andrew Pitts

Page 2

Publisher Solutions International

• Established in 2005

• Initial focus on the identification, case development, and remediation efforts relating to subscription abuse.

• Specifically created to serve as an independent third party, enabling the STM publishing industry to benefit from the aggregation and analysis of confidential data without competitive or anti-trust concerns.

Page 3

Transition to IP Address Verification Work

• PSI customers asked us to expand fraud identification work to include IP Address/Site License business.

• As part of this effort, PSI and Wiley created a JV to conduct a global clean-up of IP address data.

• Proprietary database of >50k institutions & >1 billion IPs

• Data from 150+ publishers

• US data (last significant territory) to be completed by March 2016

Page 4

Page 5

Key Takeaway

• State of IP Address data, and management of same, within the STM industry is poor.

• ca. 58% of IP Address data requires further investigation, e.g.:

Territory    Lines of Data   Red   Amber   Green
France              63,071    4%     58%     38%
Germany             58,145    5%     43%     52%
China              149,435    1%     64%     35%
Avg/Total          270,651    3%     58%     39%
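The Avg/Total row appears to be weighted by lines of data rather than a simple mean of the three territory percentages. The sketch below is only a check of the slide's arithmetic; the `rows` layout and the `weighted_share` helper are illustrative names, not anything taken from PSI's tooling.

```python
# Territory figures as transcribed from the slide:
# (lines of data, red %, amber %, green %)
rows = {
    "France": (63_071, 4, 58, 38),
    "Germany": (58_145, 5, 43, 52),
    "China": (149_435, 1, 64, 35),
}

total_lines = sum(r[0] for r in rows.values())


def weighted_share(col: int) -> float:
    """Percentage for one colour column, weighted by lines of data."""
    return sum(r[0] * r[col] for r in rows.values()) / total_lines


red, amber, green = (round(weighted_share(c)) for c in (1, 2, 3))
print(total_lines, red, amber, green)
```

China contributes more than half of the lines, which is why the combined Amber share sits closer to China's 64% than a plain average of the three territories would.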

Page 6

Takeaways from completion of IP Address Clean Up

• Publisher and even library data is universally poor.

• Poor IP Address data extends far beyond the initial expectation that problems would be primarily attributed to fraud.

• Neither publishers nor libraries are equipped to address the problem and maintain a long-term solution.

• Resource requirements for publishers and libraries alike are overwhelming for current systems, processes, and budgets – even at existing levels of inaccuracy.

• Keeping IP Address data clean provides no competitive advantage – but not doing so presents significant risk on many levels.

Page 7

Associated Risks/Problems

• Easy to insert false IP addresses into systems with no inherent checks

• Wrong IP addresses on accounts result in false usage reporting

• Incorrect usage reporting carries significant implications for pricing and widely used marketing metrics across industry

• Fraud can go undetected for years

• IP data errors create “openings” for illegal proxy/downloading

• Open Access publishers have little or no idea where usage is coming from

• Data gets dirty as fast as it is cleaned

Page 8

CURRENT STATE: UNVETTED IP ADDRESS CHANGES/ADDITIONS (Largely Manual Data Entry)

[Diagram labels: Institution A, Institution B, Institution C; Publisher 1, Publisher 2, Publisher 3, Publisher 4]

Institutions   Changes   Publishers   Unvetted Changes
70K            1         5.5K         385M Annual
70K            5         5.5K         1.93B Annual
70K            10        5.5K         3.85B Annual
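The figures in this table are consistent with a simple product: institutions × changes per institution per year × publishers. The sketch below assumes, as the table seems to, that every change has to be entered separately at every publisher; `unvetted_changes` is a hypothetical helper name.

```python
def unvetted_changes(institutions: int, changes_per_year: int, publishers: int) -> int:
    """Manual entries per year if each change is keyed in at each publisher."""
    return institutions * changes_per_year * publishers


for changes in (1, 5, 10):
    total = unvetted_changes(70_000, changes, 5_500)
    print(f"{changes:>2} change(s) per institution: {total:,} entries per year")
```

On these assumptions a single change per institution already works out to 385 million entries a year, and ten changes to 3.85 billion.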

Page 9

IP Address Segmentation at Wiley

Page 10

Brief Introduction

• Keith Abbott, 25 years in industry from a journals fulfilment background

• Current emphasis is on content licensing and underlying data supporting access to content

• Team of two people checking licenses and IP address data

• My focus is on IP address data issues confronting industry

• Working with PSI for eleven years to audit IP addresses

Page 11

Print was difficult to handle

Page 12

Is online access any better?

• University of Kiel

• GEOMAR

• IPN

• ZBW (Kiel)

• ZBW (Hamburg)

• Christian Albrechts Universität zu Kiel

• UKSH

• Helmholtz-Zentrum für Ozeanforschung Kiel

• German National Library of Economics

• University Hospital Schleswig Holstein (Kiel)

• Institut für die Pädagogik der Naturwissenschaften und Mathematik an der Universität Kiel

• HWWA

• And they all have the same IP address range

• 134.245.*.*

Page 13

Getting Better – we have got it down to six!

• University of Kiel

• University Hospital Schleswig Holstein (Kiel)

• GEOMAR

• IPN

• German National Library of Economics (Hamburg)

• German National Library of Economics (Kiel)

• But they are all still sharing the same IP address

• 134.245.*.*

Page 14

IP addresses must be split out per location

University Hospital Schleswig Holstein (Kiel):    134.245.121-255.*
German National Library of Economics (Kiel):      134.245.101-110.*
GEOMAR:                                           134.245.1-50.*
IPN:                                              134.245.51-60.*
German National Library of Economics (Hamburg):   134.245.110-120.*
University of Kiel:                               134.245.61-100.*
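As a minimal sketch (not Wiley's or PSI's actual tooling), the split above can be written down as third-octet ranges and checked mechanically. The `SEGMENTS` table copies the ranges as transcribed from the slide; the `owners` helper is a hypothetical name. It returns a list because a cleanly segmented table should yield exactly one owner per address, and anything else is worth a second look.

```python
import ipaddress

# Third-octet ranges within 134.245.*.*, as transcribed from the slide.
SEGMENTS = {
    "GEOMAR": (1, 50),
    "IPN": (51, 60),
    "University of Kiel": (61, 100),
    "German National Library of Economics (Kiel)": (101, 110),
    "German National Library of Economics (Hamburg)": (110, 120),
    "University Hospital Schleswig Holstein (Kiel)": (121, 255),
}


def owners(ip: str) -> list[str]:
    """Return every institution whose segment contains the given address."""
    first, second, third, _ = ipaddress.IPv4Address(ip).packed
    if (first, second) != (134, 245):
        return []
    return [name for name, (lo, hi) in SEGMENTS.items() if lo <= third <= hi]


print(owners("134.245.75.9"))    # one owner expected
print(owners("134.245.110.1"))   # third octet 110 sits in two ranges as transcribed
print(owners("134.245.0.1"))     # third octet 0 is not assigned to anyone
```

With the ranges exactly as they appear on the slide, third octet 110 falls in both library segments and third octet 0 in none, which is the kind of boundary case a segmentation check is meant to surface.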

Page 15

What can we learn from this example?

• Data is complex and confusing, with multiple names, acronyms and English/native language variants

• IP addresses in addition to database accounts must be accurately segmented

• Failure to maintain correct IP address information could lead to access being inappropriately shared or customers losing access

• A publisher must check that their underlying data matches their license agreements

• Bad IP address data will lead to incorrect usage statistics

Page 16

Introduction

• Charlie White, Senior Customer Service Advisor.

• Working on a day-to-day basis with Institutions, Individuals and Agents

• SAGE has been working with PSI on both Print and IP Fraud investigations for the past 7 years.

• I will be focusing on IP Fraud

Page 17

What is IP Fraud?

• Fraud definition: wrongful or criminal deception intended to result in financial or personal gain.

• How is it achieved in Publishing? It starts with data.

• Publisher contacted by agent with a list of IP ranges for a mutual customer.

• Publisher trusts the IP ranges are correct and uploads them onto their system.

• Hidden in the customer’s genuine IPs is a range owned by the agent.

• Publisher has unknowingly opened all the customer’s content to the agent.

• Back to our definition. What does the agent gain?
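The weak point in the sequence above is the step where the publisher trusts the submitted list. A minimal sketch of a check at that step follows, assuming the publisher holds a set of ranges already verified as belonging to the customer. The function name and all ranges are hypothetical; the addresses come from the blocks reserved for documentation, not from any real case.

```python
import ipaddress


def unverified_ranges(submitted: list[str], verified: list[str]) -> list[str]:
    """Return submitted ranges not fully contained in any verified customer range."""
    verified_nets = [ipaddress.ip_network(v) for v in verified]
    flagged = []
    for cidr in submitted:
        net = ipaddress.ip_network(cidr)
        if not any(net.subnet_of(v) for v in verified_nets):
            flagged.append(cidr)
    return flagged


# Ranges already verified for the customer (hypothetical).
customer_verified = ["192.0.2.0/24", "198.51.100.0/24"]

# List supplied by the agent: two genuine ranges plus one that is not the customer's.
agent_list = ["192.0.2.0/25", "198.51.100.0/24", "203.0.113.0/28"]

print(unverified_ranges(agent_list, customer_verified))
```

Anything the function returns would be held back for manual verification rather than uploaded. The check is only as good as the verified list behind it.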

Page 18

Case Study

• “Agent X” was a large subscription agent based in Thailand.

• Agent investigated by PSI initially for Print Fraud, leading to publishers stopping all business with the agent.

• Agent X attempts to get around the ban.

• Despite many negotiations, Agent X fails to settle and their accounts are put on hold for good.

• PSI approached by a ‘whistleblower’ with information concerning the agent’s business practices.

• The agent was also involved in IP Fraud.

Page 19

Case Study

Page 20

Case Study

Page 21

What can we do to prevent this?

• IP Audits.

• Stop ‘rogue’ Subscription Agents from placing orders with us.

• A greater understanding within the Industry as a whole of IP abuse and the importance of keeping accurate and up-to-date information.

Page 22

Moving Forward

• Industry needs a practical, economically viable, and effective solution for managing IP Address data and enabling publishers to gain a better understanding of who their customers are:

– Institutions accessing data

– Potential customers visiting publishers

– Authors contributing to publishers

Page 23

On-line IP Register

Unrestricted internet users

Registered Publisher or Agent

Publisher or Agent

Registered Institution

Institution

Basic lookup

Request to be added to DB

Register themselves

Detailed lookup of their data

Request change to their data

Detailed lookup of any data

Request to register themselves

Request to add an institution

Request change to any data

All requests

Page 24

LONG-TERM SOLUTION: CENTRALIZED IP ADDRESS REGISTRY

• Create a global IP address database for all Publishers to use and establish long-term industry standard

• Clean up all publisher authentication databases

• Verify all new IP additions and changes

• Check Publisher Log Files against IP database for abuse detection and usage anomalies

• Enlist support from library community to keep the IP address database current and accurate
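The log-file check in the list above could look roughly like the following sketch. It assumes a registry that maps vetted ranges to institutions and a publisher log reduced to a list of client IPs. `REGISTRY`, `attribute_usage` and the sample addresses are all hypothetical and use documentation-only address blocks.

```python
import ipaddress
from collections import Counter

# Hypothetical registry: vetted range -> institution.
REGISTRY = {
    "192.0.2.0/24": "Institution A",
    "198.51.100.0/24": "Institution B",
}


def attribute_usage(log_ips: list[str], registry: dict[str, str]) -> Counter:
    """Count requests per institution; IPs outside every vetted range are pooled."""
    nets = [(ipaddress.ip_network(cidr), name) for cidr, name in registry.items()]
    usage = Counter()
    for ip in log_ips:
        addr = ipaddress.ip_address(ip)
        owner = next((name for net, name in nets if addr in net), "UNREGISTERED")
        usage[owner] += 1
    return usage


sample_log = ["192.0.2.10", "192.0.2.11", "198.51.100.7", "203.0.113.5"]
usage = attribute_usage(sample_log, REGISTRY)
print(usage)
```

Requests counted under "UNREGISTERED", or a sudden jump in one institution's count, are the sort of anomalies that would then be investigated by hand.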

Page 25

Institution

e.g. University of Oxford

New IP address

Delete IP address

PSI Verify IP

Publisher 1

Publisher 2

Publisher 3

Publisher 4

API/unique IP

API/unique IP

API/unique IP

API/unique IP

PSI “Cube” IP Registry

PSI-PROACTIVE/CENTRALIZED VETTED IP ADDRESS VALIDATION (Largely Automated Process)

Institutions   Changes   Publishers   Transactions
70K            1         5.5K         70K
70K            5         5.5K         350K
70K            10        5.5K         700K
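The transaction counts in this table drop the publisher factor: each change is submitted and verified once at the registry, and publishers pick it up through the API. The sketch below compares that with per-publisher manual entry, on the assumption that in the manual model every publisher would have to key in every change. The function names are illustrative only.

```python
INSTITUTIONS = 70_000
PUBLISHERS = 5_500


def registry_transactions(changes_per_institution: int) -> int:
    """Each change is vetted once, centrally."""
    return INSTITUTIONS * changes_per_institution


def manual_entries(changes_per_institution: int) -> int:
    """Each change is entered separately at every publisher."""
    return INSTITUTIONS * changes_per_institution * PUBLISHERS


for changes in (1, 5, 10):
    central = registry_transactions(changes)
    manual = manual_entries(changes)
    print(f"{changes:>2} change(s): {central:,} vetted transactions "
          f"vs {manual:,} manual entries ({manual // central:,}x)")
```

Under these assumptions the ratio between the two models is always the number of publishers, 5,500, regardless of how often institutions change their addresses.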

Page 26

Any Questions?

Andrew Pitts: [email protected]

Charlie White: [email protected]

Keith Abbott: [email protected]