Google book settlement olita sept 2009

The Google Book Settlement: Where are we now, and how did we get here? Sian Meikle & Tony Horava University of Toronto Sept. 25, 2009


Page 1: Google book settlement olita sept 2009

The Google Book Settlement: Where are we now, and how did we get here?

Sian Meikle & Tony Horava

University of Toronto

Sept. 25, 2009

Page 2: Google book settlement olita sept 2009

Outline

Overview of settlement
Access to Google Books
Copyright issues
Marketplace impacts
Integration/curation of content
Competition issues
Privacy matters
Academic freedom
Future business models

Page 3: Google book settlement olita sept 2009

Google Book Search

Started in 2004
42 Library Partners, many publishing partners
Google-funded
In-copyright and out-of-copyright material
Google and selected library partner servers only
10 million books to date:
  2 million public domain (20%)
  7.5 million in copyright, out of print (75%)
  0.5 million in copyright, in print (5%)
Eventual aim: 30 million books

Page 4: Google book settlement olita sept 2009

Google digital products

Metadata
Scanned (back files, library partner scans):
  Scanned images: TIFFs
  Access derivatives:
    JPEGs
    Image-based PDFs (one per page or one per book)
    Uncorrected OCR
Born-digital (front file, new content):
  Digital text, XML format

Page 5: Google book settlement olita sept 2009

Proposed Google Book Settlement

2005: US class-action lawsuit against Google
  Association of American Publishers
  Authors Guild
October 28 2008: proposed settlement announced
Oct 7 2009 (moved from June 11 2009): originally the date of the (final) Court Fairness Hearing
  Possible outcomes: accept, reject, court oversight
  Out of scope: change agreement
Sept 18: US Department of Justice advises Court not to accept settlement but to encourage further discussion
Sept 24: Court accepts motion to delay final hearing; will hold status conference on Oct 7 instead

Page 6: Google book settlement olita sept 2009

Google Book Settlement Outline

Covers online access in US for books:
  published before January 5 2009
  covered by Berne copyright convention
Google pays $125 million:
  $34.5 million to establish Book Registry
  $45 million to rights holders for books scanned prior to May 2009 ($60 per title)
  $45.5 million for legal fees
Split of future revenues:
  63% to copyright holders
  37% to Google
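
As a quick arithmetic check on the figures on this slide, the Python sketch below confirms that the three payment components add up to the $125 million total, and shows how the 63/37 split would divide a given amount of future revenue. The function and variable names, and the $1,000 example amount, are illustrative assumptions and do not come from the settlement.

```python
from fractions import Fraction

# Payment components quoted on the slide, in millions of US dollars.
# Fractions keep the arithmetic exact (no floating-point rounding).
PAYMENTS_MILLIONS = {
    "book_registry": Fraction("34.5"),
    "rights_holders": Fraction("45"),
    "legal_fees": Fraction("45.5"),
}

# Split of future revenues quoted on the slide.
RIGHTS_HOLDER_SHARE = Fraction(63, 100)
GOOGLE_SHARE = Fraction(37, 100)


def total_payment_millions():
    """Sum the payment components (in millions of dollars)."""
    return sum(PAYMENTS_MILLIONS.values())


def split_revenue(amount):
    """Return (rights holders' share, Google's share) of a revenue amount."""
    amount = Fraction(amount)
    return amount * RIGHTS_HOLDER_SHARE, amount * GOOGLE_SHARE


# The components should add up to the $125 million total on the slide.
print(float(total_payment_millions()))

# Hypothetical example: splitting $1,000 of future revenue.
holders, google = split_revenue(1000)
print(float(holders), float(google))
```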

Page 7: Google book settlement olita sept 2009

Google Book Settlement: Products

Display uses (saleable products):
  Access, preview, snippets, book records
Non display uses (free and research products):
  Display of metadata only; full-text and geographic indexing without display of text; analytical research across corpus; and Google R&D
Inclusion:
  In print books: opt-in to display uses
  Out of print books: opt-out of display uses
  In print = commercially available in USA and Europe
Products:
  Individuals: sale of perpetual access per title via Google server
  Institutions: sale of annual access to ISD

Page 8: Google book settlement olita sept 2009

Book Registry

Non-profit independent agency representing plaintiff interests to:
  Manage rights database: book status, contact info
  Negotiate terms and prices of online book uses on behalf of rights holders
  Distribute share of revenue to rights holders

Page 9: Google book settlement olita sept 2009

Institutional Subscription Licensing models

Libraries that contribute books for scanning:
  Fully participating libraries: give in-copyright books; get digital copies, must meet security requirements
  Cooperating libraries: give in-copyright books; get no digital copies
Libraries that do not contribute books for scanning:
  May subscribe to ISD (either whole or discipline based)
  Pricing model set by Google and Book Registry
Benefits of partnering:
  Ability to challenge institutional pricing model
  Some subscription discounts
  Information about inclusion / exclusion of books

Page 10: Google book settlement olita sept 2009

Google Research Corpus

All Google books except in-copyright works whose rights holders have removed their works
Hosted at Google, up to two other sites
Non-consumptive research:
  linguistic analysis, automated translation, book relationships, index/search techniques
Qualified users:
  approved research agenda
  letter of support from participating library, Book Registry, Google, or corpus host

Page 11: Google book settlement olita sept 2009

How many orphans?

One possible calculation

3.5 million out of print books scanned prior to May 2009
$45 million (rights claims) ÷ $60 per book = 750,000 claimed books, and 2.75 million orphans

Out-of-print titles: Claimed 21%, Orphans 79%

But $45 million is the minimum payment, so numbers may vary.
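
The calculation on this slide can be reproduced in a few lines of Python. This is only a sketch of the slide's own back-of-envelope estimate: the function name and parameter names are illustrative, and it inherits the slide's assumption that the full $45 million is paid out at exactly $60 per claimed book.

```python
def estimate_orphans(scanned_out_of_print, claims_fund_dollars, payment_per_book):
    """Estimate claimed vs. orphan titles from the rights-claims fund.

    Assumes, as the slide does, that every claimed book receives the same
    per-book payment and that the fund is fully used.
    """
    claimed = claims_fund_dollars // payment_per_book
    orphans = scanned_out_of_print - claimed
    claimed_pct = round(100 * claimed / scanned_out_of_print)
    orphan_pct = round(100 * orphans / scanned_out_of_print)
    return claimed, orphans, claimed_pct, orphan_pct


# Figures from the slide: 3.5 million books, $45 million fund, $60 per book.
claimed, orphans, claimed_pct, orphan_pct = estimate_orphans(
    3_500_000, 45_000_000, 60
)
print(claimed, orphans, claimed_pct, orphan_pct)  # prints: 750000 2750000 21 79
```

Because $45 million is a minimum payment, a larger fund would raise the claimed count and lower the orphan share; passing a different `claims_fund_dollars` value shows how sensitive the 21%/79% split is to that assumption.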

Page 12: Google book settlement olita sept 2009

Google Book Settlement: Reactions

Positive:
  huge corpus available to wide audience
  bypassed orphan works log jam
Concerns:
  de facto monopoly
  user privacy
  intellectual freedom
  transparency
  equity of access
  long-term security of data

Page 13: Google book settlement olita sept 2009

Copyright challenges

Does the settlement erode statutory rights under copyright law?

Is ‘fair use’ doctrine affected by the settlement? (NB: ‘fair use’ is much broader than Canadian ‘fair dealing’)

Many argue that it does not: it is a private settlement among three parties, and the ‘fair use’ legislative provisions haven’t changed.

Many argue that it is more restrictive than ‘fair use’.

General view that the settlement will be influential in setting de facto standards for ‘fair use’, such as the number of pages that can be displayed or printed, and the conditions for archiving and indexing of text for discovery purposes.

Contractual licensing is supplanting copyright legislation as the driver for reproducing and disseminating in-copyright books (in a commercial model), and thus for access to our collective cultural heritage: an enormous risk for the stewardship role of libraries.

Page 14: Google book settlement olita sept 2009

Copyright challenges (2)

The Registry will not be available to the public - a key tool is being developed privately

The board of the Registry will have no librarian or reader representation… will a balanced approach to copyright, access, and pricing issues be possible?

The settlement is silent on how agreements between libraries and the Registry would ensure that users can exercise their rights under the US Copyright Act

Will have a dampening effect on Open Access and Creative Commons licensing. How will it affect the long-term plans of the Open Content Alliance, for example?

Page 15: Google book settlement olita sept 2009

Pricing and market impacts

No other provider can be offered license terms better than Google’s for ten years – creates a virtual monopoly and enormous lead time advantage.

Upward pressure on pricing: affordability in a limited market.

Pricing to be determined by: the pricing of similar products & services; the scope of books available; the quality of the scan; and features offered via the subscription.

The settlement refers to two goals: 1) market realization of revenues for rights holders, and 2) broad access by the public including institutions of higher education…to be based on “comparable products and services”. But which ones?

Only Google can license ‘orphan works’ (in the absence of legislation).

Opting out for orphan works would be very problematic. Creates a huge locked-in pool of books.

Will have huge influence on market pricing for out-of-print books.

Rights holder can set price, or the Registry will use a pricing algorithm based on similar books to determine price.

Page 16: Google book settlement olita sept 2009

Integration & curation issues

Google has opened Book Search via APIs: libraries can embed book images, previews, and links to Book Search within discovery layers and catalogues

Book Search won’t allow downloading of in-copyright books to mobile devices.

How can we leverage SFX link resolver to obtain maximum benefit from enriched content in Book Search?

Libraries can work more closely with Google and OCLC to make it very easy to move from search results to purchased content (eg ‘Find in a Library’ link)

Could lead to better partnership arrangements with libraries for developing finding aids and user tools

Preservation is based on a commercial model, not on certifiable standards in a non-profit, research environment – what guarantee of permanence do we have? What if Google’s business model changes?

Page 17: Google book settlement olita sept 2009

Integration & curation (2)

Book Search offers a very limited form of collaboration (eg shared annotation among small & predefined groups) but:
  Doesn’t permit enhancements of texts;
  Doesn’t permit the layering of new services upon texts;
  Doesn’t permit use of texts in digital mash-ups.

Compare this with dynamic developments in ebook interfaces for searching, sharing, storing, and managing ebook content

Page 18: Google book settlement olita sept 2009

The Competition….

The sheer size and scope of Book Search will invite comparisons with established commercial products, such as EEBO, ECCO, and the backlists of major publishers like Oxford or Taylor & Francis.

How will the pricing for the Institutional Subscription Database (ISD) affect pricing models in the academic marketplace?

There will be much pressure on US libraries to acquire the ISD. If and when it is available in Canada, there will be pressure on libraries to acquire it.

How will the ebook aggregators (eg NetLibrary, ebrary) be affected?

Google has very deep pockets for R&D

Turf wars - Google is providing public domain titles in ePub format to the Sony ebook reader and recently announced that “it would let anyone resell the millions of out-of-print books it has scanned from the nation’s libraries.”

What was Amazon’s response? “an Amazon executive immediately rejected the idea of becoming Google’s affiliate”

Page 19: Google book settlement olita sept 2009

The Competition (2)

Comparison of DRM systems will be important for access to material.

How will Google propose to integrate Book Search into the researcher’s workflow?

Book Search won’t be able to offer the range of functionality and tools on Scholars Portal as a discovery environment for researchers

Can we deem Book Search ‘a collection’ analogous to a library collection, eg on Scholars Portal?

Page 20: Google book settlement olita sept 2009

Privacy matters

Concerns over user privacy…will these be addressed?

Google has an unprecedented opportunity to monitor and track user reading habits. For example, when a user prints out pages from a book in the ISD, there will be a visible watermark displaying encrypted session information “which could be used to identify the authorized user that printed the material or the access point from which the material was printed” (art 4.1)

“For purchases of online e-book access or access via institutional subscriptions, Google will have the technical ability to track every page that one views, even recording how long is spent on a page.” (Alan Inouye, ALA)

What will privacy look like in a Google environment?

Page 21: Google book settlement olita sept 2009

Google’s response re privacy concerns

Federal Trade Commission (consumer protection) letters and statements

“..because the settlement agreement has not yet been approved by the court, and the services authorized by the agreement have not been built or even designed yet, it's not possible to draft a final privacy policy that covers details of the settlement's anticipated services and features. Our privacy policies are usually based on detailed review of a final product -- and on weeks, months or years of careful work engineering the product itself to protect privacy. In this case, we've planned in advance for the protections that will later be built, and we've described some of those in the Google Books policy” – Jane Horvath, Global Privacy Counsel, Google

“The Bureau [of Consumer Protection] asks Google to commit to a continuing dialogue regarding consumer privacy policies for Google products and services…I believe such a commitment would require Google to adhere to the concept of privacy by design,..” – Commissioner Pamela Jones Harbour

Center for Democracy and Technology Privacy Recommendations for the Google Book Settlement

Page 22: Google book settlement olita sept 2009

Intellectual Freedom

If qualified users want to search the Research Corpus for ‘non consumptive’ research, e.g. textual or linguistic analysis, their research agenda needs to be approved by the host institution

“‘Research Agenda’ means a document that describes a research project in sufficient detail to demonstrate that it will be Non-Consumptive Research” (p. 17 of Settlement)

Host institution is responsible. What will the criteria be?

This will certainly conflict with academic freedom…fundamental values will be at play

Google can exclude a book for editorial reasons: on what basis?

Pressure from governments, powerful interest groups could have an important impact, e.g. Google saving itself from embarrassment or bad PR by suppressing a controversial book

The Settlement requires Google to provide public access and the ISD for only 85% of the in-copyright, not commercially available books (so potentially 1M books could be left out)

Censorship & freedom of expression – another conflict with library values

Page 23: Google book settlement olita sept 2009

Equity of Access

Works within works (eg an essay, a poem, a chart or a table) might be excluded if their rights holder exercises rights independently

The Settlement doesn’t include pictorial works: photographs and illustrations, for example, will be blacked out.

Momentum driving supply & demand: “…it is possible that faculty and students at institutions of higher education will come to view the institutional subscription as an indispensable only because research libraries have invested significant resources in preserving out of print books.. They might insist that their institution’s library purchase such a subscription. The institution’s administration might also insist that the library purchase an institutional subscription so that the institution can remain competitive with other institutions of higher education in terms of the recruitment and retention of faculty and students.” ALA-ACRL-ARL Brief

This can exacerbate inequalities among libraries, based on budget realities.

Page 24: Google book settlement olita sept 2009

Future Business opportunities under the settlement…
  Print on Demand
  Custom Publishing
  PDF downloads
  Consumer subscriptions
  Summaries, abstracts, compilations

To compete, publishers will need to focus more on metadata, rights-management and new logically structured units (ie not pages) using an XML-based content architecture and workflow.

Announcement last week: Google will provide the public domain books to On Demand Books for print-on-demand publishing using the Espresso Book Machine.

Page 25: Google book settlement olita sept 2009

Conclusion

We need to monitor developments closely, and engage in vigorous, balanced advocacy with our stakeholders, and show support for US libraries & organizations that are raising serious concerns

What will be the future impact on our libraries?

“Google is a behemoth, and the Google Settlement, if approved, will make it the behemoth of the book….Will the restraints of the Book Rights Registry be enough to keep it from abusing such a position, or will they be like the ropes of the Lilliputians around the sleeping Gulliver? This story is surely only in chapter one” - Grace Westcott, Globe & Mail Feb 20, 2009