Ebooks without Vendors: Using Open Source Software to Create and Share Meaningful Ebook Collections...
EBOOKS WITHOUT VENDORS
Using Open Source Tools to Create and Share Meaningful Ebook Collections
Who am I?
Matt Weaver
Systems Medical Librarian – Cleveland Clinic
Former IT Manager - Westlake Porter Public Library
This talk is not about ebooks as products
Not an alternative to Overdrive, ebrary, 3M, etc.
EBOOKS AS TOOLS
To be created by:
• the library
• the community
Opportunities for:
• collaboration
• connection
An Experiment: Library as publisher
USAGE: late Oct. 2013 through Jul. 2015
More than 2,000 ebook downloads
More than 60,000 recipes downloaded/printed
15% by cardholders
Costs:
Content: $0
Software licensing: $0
Staff time: 4-7 hours per ebook (estimated)
Mostly editing
SECURING ACCESS TO CONTENT
DIY: Copyright
Disclaimer: I am not now, nor have I ever been, a lawyer.
I am not a copyright expert.
tap shoes
DIY: Copyright - Resources
Stanford Copyright Renewal Database
http://collections.stanford.edu/copyrightrenewals/bin/page?forward=home
Digital Copyright Slider
http://librarycopyright.net/resources/digitalslider/
Section 108 Spinner
http://librarycopyright.net/resources/spinner/
Copyright Genie
http://librarycopyright.net/resources/genie/
Orphan Works
http://bit.ly/1WRC8ck
“…the Copyright Office rejects the idea that fair use can provide an adequate solution [to the problem of orphan works]”… Krista Cox, Association of Research Libraries
DIY: Copyright
Because of digital distribution, and because the library does not own titles to be digitized…
• no Fair Use case
• no Section 108 protections
DIY: Copyright
Documentation of copyright research
Content permission agreements
EBOOKS DISSECTED & DIGITIZED
ePub as zip file
ebook markup
HTML & CSS
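To make the "ePub as zip file" point concrete, here is a minimal Python sketch that builds a tiny EPUB-like archive in memory and then opens it like any other zip. The file names under OEBPS/ and the helper names are hypothetical, and the sample archive is deliberately incomplete, so it would not pass a validator. It only illustrates the layout: a mimetype entry, a META-INF/container.xml that points to the package file, and HTML and CSS files for the content.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in the EPUB container format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_sample_epub():
    """Build a tiny, incomplete EPUB-like archive (hypothetical file names)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry comes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("OEBPS/content.opf", "<package/>")
        archive.writestr("OEBPS/recipe.xhtml",
                         "<html><body><h1>Pie</h1></body></html>")
        archive.writestr("OEBPS/style.css", "h1 { font-family: serif; }")
    return buffer.getvalue()


def describe_epub(epub_bytes):
    """Return the archive's file list and the package path from container.xml."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        names = archive.namelist()
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        return names, rootfile.get("full-path")


if __name__ == "__main__":
    names, package_path = describe_epub(build_sample_epub())
    for name in names:
        print(name)
    print("package file:", package_path)
```

The same describe_epub function should work on a real .epub file read from disk as bytes, since nothing in it depends on the sample data.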
Everything has been digitized, right?
Bad OCR: hours, fractions
Scanned ≠ Digitized
Corrected WPPL ePub page
Curation/editing takes time.
Who else would invest such time?
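One way to shorten the editing pass is to have a script point out lines that probably need a human look. The sketch below is an assumption, not part of the project's workflow: it flags fraction-shaped tokens in OCR text that contain characters often confused with digits (l, I, O, S, Z) or a stray space. The patterns and the function name are hypothetical and would need tuning against real scans. It flags lines for review and does not try to correct them.

```python
import re

# Hypothetical pattern: something shaped like a fraction, allowing letters
# that OCR commonly confuses with digits and an optional stray space.
SUSPECT_FRACTION = re.compile(r"\b[0-9lIOSZ]{1,2}\s?/\s?[0-9lIOSZ]{1,2}\b")

# A fraction made only of digits with no spaces is treated as clean.
CLEAN_FRACTION = re.compile(r"^[0-9]{1,2}/[0-9]{1,2}$")


def flag_suspect_lines(text):
    """Return (line number, token, line) for fraction-like tokens needing review."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in SUSPECT_FRACTION.finditer(line):
            token = match.group()
            if not CLEAN_FRACTION.match(token):
                flagged.append((number, token, line.strip()))
    return flagged


if __name__ == "__main__":
    sample = "1/2 cup sugar\nl/2 tsp salt\nBake 1 /2 hour\n2 eggs"
    for number, token, line in flag_suspect_lines(sample):
        print(f"line {number}: check {token!r} in {line!r}")
```

A person still makes every correction. The script only narrows down where to look.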
PRODUCING EBOOK FILES
Homer ebook project
http://bookscanner.pbworks.com/w/page/40965440/FrontPage
Homer
The following tools are installed as part of the Homer Project:
• ImageMagick (for manipulating images)
• Jpegtran (lossless JPEG transformation)
• JBIG2 encoder (compression tool for bi-level images)
• Tesseract-OCR (optical character recognition)
• RubyInstaller (installs the Ruby programming language)
• Hpricot (HTML parser)
• RMagick (interface between the Ruby programming language and ImageMagick)
• Pdfbeads (to create searchable PDF)
• Cmdow.exe (command-line utility used in Homer)
• ScanTailor (post-processing tool)
• Homer (command-line bash script)
Ebook Production Workflow
Homer: ScanTailor
Preprocess tiff-format images of book pages
Deskewing
De-speckling
Correcting warp
Right-to-left language support
Outputs images for Homer
OCR challenges
HOMER BASH SCRIPT
It looks like command-line…
but it’s drag-and-drop!!!
Homer: tesseract-ocr
Optical Character Recognition
Multilingual support: from Afrikaans to Vietnamese
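Homer's script drives Tesseract for you, but it can help to see what a direct call looks like. The sketch below is not how Homer invokes it. It is a minimal Python wrapper around the standard command form, tesseract IMAGE OUTPUTBASE -l LANG, where the language is a Tesseract code such as eng, afr (Afrikaans) or vie (Vietnamese). The function names and the page file names are hypothetical, and the matching language data must already be installed.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image, output_base, lang="eng"):
    """Build the argument list for: tesseract IMAGE OUTPUTBASE -l LANG."""
    return ["tesseract", str(image), str(output_base), "-l", lang]


def run_ocr(image, output_base, lang="eng"):
    """Run Tesseract on one page image and return the path of the text file."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    subprocess.run(tesseract_command(image, output_base, lang), check=True)
    # By default Tesseract writes plain text to OUTPUTBASE.txt.
    return Path(str(output_base) + ".txt")


if __name__ == "__main__":
    # Hypothetical page image; only the command is printed here.
    print(" ".join(tesseract_command("page_001.tif", "page_001", "vie")))
```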
Homer: pdfbeads
Outputs a searchable PDF
Sigil
https://code.google.com/p/sigil/
Epub Validator
http://validator.idpf.org/
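Before uploading a file to the validator, a quick local check can catch the most common packaging mistakes. The sketch below is a rough pre-flight test only and is no substitute for the validator. It assumes the EPUB container rules that the mimetype entry must be first in the archive, stored uncompressed, and contain exactly application/epub+zip, and that META-INF/container.xml must be present. The function name is hypothetical.

```python
import io
import zipfile


def container_problems(epub_bytes):
    """Return a list of basic container problems found in an EPUB archive."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        entries = archive.infolist()
        if not entries or entries[0].filename != "mimetype":
            problems.append("mimetype must be the first entry in the archive")
        else:
            if entries[0].compress_type != zipfile.ZIP_STORED:
                problems.append("mimetype must be stored uncompressed")
            if archive.read("mimetype") != b"application/epub+zip":
                problems.append("mimetype must contain exactly application/epub+zip")
        if "META-INF/container.xml" not in archive.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_epub.py book.epub
    with open(sys.argv[1], "rb") as handle:
        found = container_problems(handle.read())
    print("\n".join(found) if found else "container looks OK")
```

A frequent cause of the first two problems is re-zipping an unpacked ebook with a general-purpose zip tool, which may compress or reorder the mimetype entry.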
Calibre
http://calibre-ebook.com/
COMMUNITY COOKBOOK
COOKING.WESTLAKELIBRARY.ORG
Drupal
Open source content management system
drupal.org
Ability to create custom fields for metadata – can be hidden from users
Drupal – Controlling Access
Private files vs. public files
Drupal Controlling Access – ILS authentication module
Drupal – Controlling Access
Taxonomy Access Control Lite module: permissions based on taxonomy terms
Drupal – Recipe module
3 content types:
• recipe
• ebook
• organization
Drupal 7 “Responsive” layout
Drupal - Omega 3 Responsive Theme
PHASE TWO (WHAT MIGHT HAVE BEEN?)
Drupal – ePub module
Drupal – PDF module
Drupal – HTML import module
Merging Content
The Community Cookbook – mapping
Bonus: Capturing Original Content
…with one more open-source tool, we can even help them design print versions:
We can do everything but the printing.
Further Reading
http://journal.code4lib.org/articles/9911
Jarret Buse, EPUB from the Ground Up: A Hands-On Guide to EPUB 2 and EPUB 3
Excellent guide to the guts of ebooks
Features many of the open-source programs I have discussed
http://www.worldcat.org/oclc/837954536
Stanford University: Copyright & Fair Use – Charts and Tools
http://fairuse.stanford.edu/charts-and-tools/
mattrweaver
Image credits
Open Source Sign - Timothy Appnel - https://www.flickr.com/photos/tappnel/5798812875/
“Librarian from Turn of the Century” - http://www.moyak.com/researcher/Clients/male_librarians/index.html?id=34
Ereaders - Michael Porter - https://www.flickr.com/photos/libraryman/5052936803/
Apples & oranges - http://mrg.bz/n1xLHg
Techno_background2.jpg (ones and zeroes) - http://www.morguefile.com/creative/Grafixar
Ricoh Copier - http://www.itinstock.com/ekmps/shops/itinstock/images/ricoh-aficio-mp-4001-fast-photocopier-copier-printer-scan-fax-5598-p.jpg