Presentation to AAP Digital Working Group

A Million Little Pieces: Identifying Digital Book Content

Transcript of Presentation to AAP Digital Working Group

A Million Little Pieces

Identifying Digital Book Content

Ceci n’est pas un livre (This is not a book)

• Different formats
  – Ebooks: EPUB, Kindle, PDF, downloadable audio
  – Physical books: hardcover, trade paper, mass market, large print, POD, audiobook
• Chapter-by-chapter: does the customer need/want the whole book?
• Some books make great websites
  – Travel (Frommers.com)
  – Recipes (Epicurious)

Identifying ebooks

• One ISBN per format?
  – Maintaining metadata on all these formats is confusing and repetitive
  – ISBNs are expensive
• ISBN for EPUB only?
  – Ebook distributors will assign ISBNs (or other identifiers), so publishers are not in control
  – How to track sales for different formats?
• New identifier?
  – Reinventing the wheel
  – How to get the book industry to use it consistently?
  – Standards take a long time to get through ISO
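To make the "one ISBN per format" trade-off concrete, here is a minimal sketch of how a single work fans out into one record per format. The class names, field names, and placeholder identifiers are all assumptions made for illustration; they do not come from the presentation or from any real metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class FormatRecord:
    """One sellable format of a work, carrying its own identifier."""
    isbn: str  # placeholder string, not a real ISBN
    fmt: str   # e.g. "EPUB", "Kindle", "PDF"


@dataclass
class Work:
    """The underlying book, independent of any one format (hypothetical model)."""
    title: str
    author: str
    formats: list = field(default_factory=list)

    def add_format(self, isbn: str, fmt: str) -> None:
        self.formats.append(FormatRecord(isbn, fmt))

    def flat_records(self) -> list:
        """Return one row per ISBN, as a flat per-ISBN database would store it.

        Title and author are copied into every row, which is the
        repetition the slide complains about.
        """
        return [
            {"isbn": f.isbn, "format": f.fmt,
             "title": self.title, "author": self.author}
            for f in self.formats
        ]


# Illustrative data only: the identifiers below are placeholders.
work = Work(title="Example Title", author="Example Author")
work.add_format("PLACEHOLDER-EPUB", "EPUB")
work.add_format("PLACEHOLDER-KINDLE", "Kindle")
work.add_format("PLACEHOLDER-PDF", "PDF")

rows = work.flat_records()
for row in rows:
    print(row)
```

Every shared field (title, author, and in practice many more) has to be kept in sync across all of those rows. That is the maintenance burden behind the first bullet.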

Identifying chapters

• ISBN bloat in databases (Caravan project)
• Exponentially more complicated than one-ISBN-per-ebook-format
• Cost of ISBNs for something like this is prohibitive for many publishers
• Cost to industry of inventing a new identifier for chapters
• DOI is useful – but is a pointer to content location – content still needs to be identified
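A rough back-of-the-envelope sketch shows how quickly the identifier count grows once chapters are identified separately. The function name and all input numbers are hypothetical, chosen only for illustration. The model simply assumes every chapter needs its own identifier in every format, which is one possible reading of the slide.

```python
def identifiers_needed(titles: int, formats_per_title: int,
                       chapters_per_title: int = 0) -> int:
    """Count identifiers under a simple multiplicative model (an assumption).

    Each title needs one identifier per format, plus one identifier
    per chapter per format when chapters are identified separately.
    """
    whole_books = titles * formats_per_title
    chapters = titles * formats_per_title * chapters_per_title
    return whole_books + chapters


# Hypothetical publisher: 1,000 titles, 4 ebook formats, 20 chapters each.
books_only = identifiers_needed(1000, 4)
with_chapters = identifiers_needed(1000, 4, 20)

print(books_only)     # 4000
print(with_chapters)  # 84000
```

Under these assumed numbers, chapter-level identification multiplies the identifier count by 21. That multiplier applies equally to the database size and to any per-identifier cost.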

Identifying “chunks”

• Exponentially complex
• For sale or display or license?
• How to locate relevant chunks in GBS and Internet Archive?

The Lady or the Tiger?

• Assign ISBNs to everything in sight (and deal with the metadata pandemonium)

• Invent new identifiers (which will take many years)

• Let the marketplace decide (and cede discoverability to third parties)

Useful resources

• BISG/BISAC Identifiers Subcommittee – www.bisg.org

• US ISBN Agency – www.isbn.org

• LJNDawson – www.ljndawson.com

• Persona Non Data – personanondata.blogspot.com