
Hackfest 3: THIS TIME IT'S PERSONAL

(or)

Can I Get A

Metasearch?

A Cast of Dozens
And the O'Reilly Lithographic Spirit Gods
Access 2004 – Halifax – 13 October 2004

First, A Poem

"Guinness, Murphy’s, Harp
Hops is bitter fruit. All
Good from Dublin Core"

-A Librarian (Unknown)
Found in a bar in Windsor, ONT, in 2002

A Definition

• Hack \Hack\. noun:
– “A quick job that produces what is needed, but not well.” (Jargon File 4.3.0)
– “One who works hard at boring tasks [syn: drudge, hacker.]” (WordNet (r) 1.7)

• Hack \Hack\. verb:
– “To use frequently and indiscriminately, so as to render commonplace.” (Webster’s, 1913)

• Fest \Fest\, Feste \Fes”te\, noun:
– “A feast. [Obs.] –Chaucer.” (Webster’s 1913)

Objectives

• Solving problems or developing new ideas
• Sharing a temporary, non-competitive, non-work, no-pressure, no-strings-attached, collaborative, social, and educational environment
• Learning about contemporary issues in libraries and tech
• Learning from each other
• Having fun

The Event

• 40+ signups, 35+ attendees, many newcomers
• Held at St. Mary's University
• Sixteen project ideas suggested in advance (but kept private until Hackfest)
• One big lab, one big meeting room
• Two dedicated remote hackfest servers
• 30+ minutes of discussing projects
• Group up and go!

The People

The Suggestions

• Sixteen (16!) project suggestions
• Several metasearch ideas (framework, training)
• Several personal library ideas (RSS from repositories, tables of contents, integrating with external systems)
• Others: cobrowsing, harvesting, conversions, "bitter date" normalization

Project: Tables of Contents

• Who: Richard Baer, John Dobson, Grant Gelinas-Brown, Todd Holbrook, Sherri Vokey, William Wueppelmann

• What: To discover a method for harvesting e-journal table of contents information from freely-accessible publisher web sites (without having to enter into negotiations with the publishers). Library users would be able to save a list of favorite journals in a web-based personal account and receive notification of updates. The ability to link to the full-text (with an active subscription) would be an additional feature, as would searching within the personal database.

Project: Tables of Contents

• Screen scraping: recovery of a document's underlying data structure by parsing its source code
– inference of boundaries between records and fields through examination of patterns in the tag structure
– inference of what data elements are represented through examination of table headings, field labels, other clues contained within tags such as name or class attributes
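The two inferences above can be sketched in a few lines. This is a minimal illustration, not the project's actual scraper: the sample markup, the `TocScraper` class, and the `scrape_toc` helper are all hypothetical, and real publisher pages will need their own patterns. Record boundaries are inferred from `<tr>` tags, and field names from the `<th>` table headings.

```python
from html.parser import HTMLParser

# Hypothetical publisher markup; real sites will differ.
SAMPLE_HTML = """
<table class="toc">
  <tr><th>Title</th><th>Authors</th><th>Pages</th></tr>
  <tr><td>Metasearch in Practice</td><td>A. Reader</td><td>1-12</td></tr>
  <tr><td>Harvesting Tables of Contents</td><td>B. Writer</td><td>13-20</td></tr>
</table>
"""


class TocScraper(HTMLParser):
    """Infer records from <tr> boundaries and field names from <th> headings."""

    def __init__(self):
        super().__init__()
        self.headings = []        # field names inferred from <th> cells
        self.records = []         # one dict per data row
        self._row = None          # cell values of the row being read
        self._cell = None         # text fragments of the cell being read
        self._is_heading = False  # True while inside a <th> cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
            self._is_heading = tag == "th"

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            text = "".join(self._cell).strip()
            if self._is_heading:
                self.headings.append(text.lower())
            else:
                self._row.append(text)
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # the heading row has no <td> cells, so it is skipped
                self.records.append(dict(zip(self.headings, self._row)))
            self._row = None


def scrape_toc(html):
    """Return a list of article records found in the page source."""
    parser = TocScraper()
    parser.feed(html)
    return parser.records
```

Calling `scrape_toc(SAMPLE_HTML)` yields one dictionary per article, keyed by the lower-cased heading text. Pages that label fields with `class` or `name` attributes instead of headings would need the `attrs` argument inspected in `handle_starttag`.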

Project: Personal Library

• Who: Nancy Hoebelheinrich, Tracy Seneca, Brandon Uhlman, Lisa Yeo

• What: To come up with the ways and means that our library systems can talk to personal library systems from Apple, Google, etc.

Project: Personal Library

• What is a personal library?
– More than a simple list of the bibliographic (or sales) info about items I own or have read.
– I should have access to the full text, and to related full text wherever feasible.
  • Reviews of the work
  • Materials by same author (or auto-link to federated search)
  • Recommended (related) reading (see project 8)

Project: Personal Library

• Not limited to items documented in databases, but can include scanned items, my own personal papers.

• I should be able to navigate by methods meaningful to me, not just info about the item. (personal timeline, categories I create).

• A personal library should grow more rich over the years, not just because I add items, but by learning how I use its contents. Not just the data about the item, but also data about how I used it.

Project: Personal Library

• Our bookshelves
– The books
– Our ephemera: folders, notepads, papers, photocopied articles
• Bibliographic Management (ProCite, RefWorks, Citation Manager, Online Portfolios)
• Browser bookmarks, our own web pages/sites, blogs
• Accounts: Amazon, Netflix, AllMusic, iTunes
• Hard drive: downloaded articles, directories for classes, projects

Project: Personal Library

• LibDB
– Emphasis on different interfaces for different user types
• Citation Manager (Simon Fraser University)
– Integration with research sources; data entry is not a separate task
• Delicious Library
– Visual interface; easy to enter and gather related information with the physical item in hand.
• Library Lookup
– Gather information from your library catalog while browsing Amazon
• Project 8: Blog book recommendation activity
• Stuff I've Seen (Susan Dumais, Microsoft)
– Automatically index items you’re interested in; emphasis and ranking based on how you interact with the item, not just the item itself.
• Federated Searching
– Enhance the information you have by pulling in related material. Your personal library should grow on its own.
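To make "ranking based on how you interact with the item" concrete, here is a minimal sketch. It is not how Stuff I've Seen works internally; the interaction kinds, the `WEIGHTS` table, and the `Item` class are assumptions chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights: how much each kind of interaction counts.
WEIGHTS = {"opened": 1.0, "annotated": 3.0, "cited": 5.0}


@dataclass
class Item:
    title: str
    interactions: dict  # interaction kind -> count, e.g. {"opened": 4, "cited": 1}


def usage_score(item):
    """Score an item by what its owner did with it, ignoring its metadata."""
    return sum(WEIGHTS.get(kind, 0.0) * count
               for kind, count in item.interactions.items())


def rank_by_usage(items):
    """Most-used items first."""
    return sorted(items, key=usage_score, reverse=True)
```

With these weights an item cited three times (score 15) outranks one merely opened ten times (score 10), which is the kind of emphasis the slide describes.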

Project: Personal Library

• Interesting issues:

– Copyright / Authorization: this can’t “belong to” an institution – it shouldn’t go away when you finish school, etc.

– How does your access to related materials change as you move from place to place?

Project: Metasearch Considerations

• Who: Julie Arie, Roy Tennant, Kent Weaver
• What: Document to "highlight issues to consider when reviewing metasearch software applications."

• Definition: "an application that performs simultaneous searching of two or more different types of resources and effectively presents results, with appropriate machine-level communication between related applications."
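The "simultaneous searching of two or more different types of resources" in this definition can be sketched with a thread pool. This is only an illustration under stated assumptions: `search_catalog` and `search_repository` are hypothetical stand-ins for real targets (a Z39.50 server, an HTTP search form), and `metasearch` is not taken from any reviewed product.

```python
from concurrent.futures import ThreadPoolExecutor


def search_catalog(query):
    # Hypothetical stand-in for a library catalog target.
    return [{"title": f"{query} handbook", "source": "catalog"}]


def search_repository(query):
    # Hypothetical stand-in for a digital repository target.
    return [{"title": f"{query} preprint", "source": "repository"}]


def metasearch(query, targets, timeout=5.0):
    """Send one query to every target at once and merge the results.

    targets maps a target name to a callable taking the query string.
    A target that fails or times out contributes an error entry
    instead of aborting the whole search.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in targets.items()}
        for name, future in futures.items():
            try:
                results.extend(future.result(timeout=timeout))
            except Exception as exc:
                results.append({"title": None, "source": name, "error": str(exc)})
    return results
```

A call such as `metasearch("xml", {"catalog": search_catalog, "repository": search_repository})` returns one merged list. Presenting those results "effectively" (ranking, deduplication, branding) is the harder part and is deliberately left out here.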

Project: Metasearch Considerations

• Local: configure/control, compatibility, licensing, political/privacy/administrative

• Application: protocols, syntax, authentication, configuration, target parameters (presentation, configuration, technical), deployment, results, interface, management, consortial support, hardware, interoperability

• Vendor: implementation costs, maintenance costs, support, roadmap/vision, selection process

Project: Metasearch Architecture

• Who: Walter Lewis, calvin mah, Art Rhyno

• What: How do you design an architecture for metasearch that can be used in different environments?

• Artifact: design docs, sample profiles

Project: Metasearch Architecture

• Design Layers:
– Targets
– Instances
– Branding
– Application space

• How do you model/define schemas for each?

• SETH: "Search Everything 'Till it Hurts"

Project: Metasearch Arch.: Target

<?xml version="1.0"?>
<targets>
  <target>
    <user_agent>
      <default/>
    </user_agent>
    <host/>
    <HTTP_parms>
      <steps no="">
        <base_HREF />
        <method> GET|POST </method>
        <nvp name=""/>
        <extract name="" type="regexp/Xpath"/>
        <cookie name="" path="" domain="" age="" secure=""> PASS_FORWARD COLLECT </cookie>
      </steps>
    </HTTP_parms>
    <Z_parms>
      <database_name/>
      <port/>
      <result_set_naming_required> </result_set_naming_required>
      <Z_RecordSyntaxes>
        <Z_Syntax/>
        <Z_Syntax/>
      </Z_RecordSyntaxes>
    </Z_parms>
    <wsdl>URI</wsdl>
    <target_URL/>
    <last_updated/>
  </target>
</targets>
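As a rough idea of how an application might consume a target profile like the one above, the sketch below reads the Z39.50 part of a filled-in profile with Python's standard `xml.etree.ElementTree`. The element names come from the slide; every value in `SAMPLE_TARGET` (host, database, port, syntaxes) and the `load_targets` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# A filled-in profile using the slide's element names; all values are made up.
SAMPLE_TARGET = """<?xml version="1.0"?>
<targets>
  <target>
    <host>catalog.example.org</host>
    <Z_parms>
      <database_name>books</database_name>
      <port>210</port>
      <Z_RecordSyntaxes>
        <Z_Syntax>USMARC</Z_Syntax>
        <Z_Syntax>SUTRS</Z_Syntax>
      </Z_RecordSyntaxes>
    </Z_parms>
    <target_URL>http://catalog.example.org/</target_URL>
  </target>
</targets>
"""


def load_targets(xml_text):
    """Return one dict of connection settings per <target> element."""
    root = ET.fromstring(xml_text)
    targets = []
    for node in root.findall("target"):
        port_text = node.findtext("Z_parms/port", default="").strip()
        targets.append({
            "host": node.findtext("host", default="").strip(),
            "database": node.findtext("Z_parms/database_name", default="").strip(),
            "port": int(port_text) if port_text else None,
            # Empty <Z_Syntax/> placeholders are skipped.
            "syntaxes": [s.text.strip()
                         for s in node.findall("Z_parms/Z_RecordSyntaxes/Z_Syntax")
                         if s.text],
        })
    return targets
```

The HTTP side of the profile (`<HTTP_parms>`, with its steps, name/value pairs, and cookies) would need a similar loader; it is omitted to keep the sketch short.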

Project: Metasearch Arch.: Instance

<?xml version="1.0"?>
<target_sets>
  <target>
    <search_label />
    <search_descriptor />
    <result_label />
    <preferred_record_syntax> mergeability criteria dedup </preferred_record_syntax>
    <result_ranking />
    <transformer />
    <timeout />
    <authentication>
      <user>machine login/pass</user>
      <target> SAML? referrer apache_style form certificates</target>
    </authentication>
    <resolver_url />
    <search_types>
      <search_type> SUBJECT|TITLE|
        <transform type="HTML|SCREEN|PDF" />
      </search_type>
    </search_types>
    <generator />
    <target_hints> <!-- hand off when search fails --> URI|text </target_hints>
    <!-- if failed -->
    <alt_target/>
    <meta>
      <instance_name />
      <form_type/>
      <form_label />
      <help_files />
    </meta>
  </target>
</target_sets>
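The instance profile's "if failed" comment and `<alt_target/>` element suggest a fallback path when a search fails. One way that behavior might look is sketched below; the `TargetFailed` exception, the dictionary layout of `instance`, and `search_with_fallback` are assumptions for illustration, not part of the group's design documents.

```python
class TargetFailed(Exception):
    """Raised by a searcher when its target cannot answer (hypothetical)."""


def search_with_fallback(query, instance, searchers):
    """Try the instance's primary target, then its alternate if that fails.

    instance  -- dict with a 'target' name and an optional 'alt_target' name
    searchers -- dict mapping target names to callables taking the query
    Returns (name_of_target_that_answered, results).
    """
    tried = []
    for key in ("target", "alt_target"):
        name = instance.get(key)
        if not name:
            continue
        tried.append(name)
        try:
            return name, searchers[name](query)
        except TargetFailed:
            continue  # hand off to the next configured target
    raise TargetFailed(f"all targets failed: {tried}")
```

A fuller version would also honor the profile's `<timeout />` value and surface the `<target_hints>` text to the user when every target fails.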

Project: Metasearch Guide

• Who: Joyce Wong, Simon Lloyd, Lissa Potter

• What: An interactive tool that incorporates critical thinking processes from library tutorials and help guides with meta-search functions

• Artifact: A new prototype user interface

Project: Metasearch Guide

• Major changes between prototype and new draft include:
– a more structured approach in which the tool is presented as a series of steps.
– emphasis is on examples to guide students through the critical thinking
– user is asked for backup search words near the beginning of the tool. The backup search words are then included as alternate search strategies later on.
– demos in the form of videos
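The backup-search-word idea can be illustrated with a small sketch that turns the user's main terms and backup words into alternate search statements. The `search_strategies` function and the `AND` syntax are assumptions; the prototype's actual query format is not described in the slides.

```python
def search_strategies(main_terms, backup_terms):
    """Build the original search statement plus one alternate per backup word.

    main_terms   -- list of the user's chosen search words
    backup_terms -- dict mapping a main term to its backup words
    """
    strategies = [" AND ".join(main_terms)]
    for i, term in enumerate(main_terms):
        for backup in backup_terms.get(term, []):
            # Swap one main term for one of its backup words.
            swapped = main_terms[:i] + [backup] + main_terms[i + 1:]
            strategies.append(" AND ".join(swapped))
    return strategies
```

For example, main terms `["teenagers", "smoking"]` with backups `{"teenagers": ["adolescents", "youth"]}` produce the original statement followed by `adolescents AND smoking` and `youth AND smoking`.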

Project: Metasearch Guide

• The team also discussed more advanced features such as:
– "test run" options by which users can test their search statements by a preliminary result screen that provides both qualitative and quantitative evaluations.

– the option to save their search history in some form of personalized

Project: Rakoon

• Who: Peter Binkley, Corey Davis, John Durno, Kenton Good, Michael Hohner, Ross Singer, Steve Zinck

• What: a co-browser for RAKIM (virtual reference tool) using Cocoon

• Artifact: a working prototype

Project: Rakoon

• Co-browsing:
– Bandwidth-intensive session w/shared screen, mouse, etc. (e.g. QuestionPoint)
– Co-proxy with regular screen refreshes from shared cache (e.g. 24x7)

• Used latter as model, creating proxy using Cocoon, built on Art Rhyno's Hackfest I project
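The co-proxy model (one shared cache, regular refreshes by both parties) can be sketched independently of Cocoon. The class below is a plain-Python illustration, not the Rakoon code: `CoBrowseSession`, its method names, and the injected `fetch` callable are all hypothetical.

```python
class CoBrowseSession:
    """One shared cache entry per session; librarian and patron read the same copy."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: url -> HTML string (injected for the sketch)
        self._url = None
        self._html = None
        self._version = 0     # bumped every time the shared page changes

    def navigate(self, url):
        """Either party follows a link: fetch once, cache the page for both."""
        self._html = self._fetch(url)
        self._url = url
        self._version += 1
        return self._version

    def refresh(self, seen_version):
        """Periodic refresh: return the cached page, or None if nothing changed."""
        if seen_version == self._version:
            return None
        return {"version": self._version, "url": self._url, "html": self._html}
```

Each browser would call `refresh` on a timer with the last version it displayed, so the page is fetched from the remote site only once per navigation, however many times the participants poll.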

Project: Rakoon

• Future development:
– Proxies HTML well, but not other media types
– Security audit
– PATRIOT Act issues
– Actual integration w/RAKIM (!)

Project: RSS From Repositories

• Who: Kristina Aston, Cameron Metcalf, Pat Moore, Miles Poindexter

• What:
– getting an RSS feed out of a digital repository for items classified in a user's field of study
– alerting users when new additions arrive
– an RSS feed of new book acquisitions on a library's homepage instead of statically-generated HTML pages.

• Artifact: Prototypes! See http://rockies.med.yale.edu/~group8/
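For readers unfamiliar with what such a feed involves, here is a minimal sketch of building an RSS 2.0 document from a list of newly added items. It is not the group's prototype: `build_feed`, the item dictionary keys, and any URLs passed in are assumptions.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title, link, description, items):
    """Return an RSS 2.0 document as a string, newest items first.

    items -- list of dicts with 'title', 'link', and 'added'
             (a timezone-aware datetime) keys
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # title, link, and description are the required channel elements.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in sorted(items, key=lambda i: i["added"], reverse=True):
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        # RSS dates use the RFC 822 style that format_datetime produces.
        ET.SubElement(node, "pubDate").text = format_datetime(item["added"])
    return ET.tostring(rss, encoding="unicode")
```

A repository would call this with the results of a "recently added in subject X" query and serve the string at a stable URL; a library homepage could do the same for new acquisitions.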

Project: Mirroring Weblogs

• Who: Dan Chudnov, Brian Tingle

• What: How to enable a LOCKSS-like "lots of copies" of weblog data? If an "important" weblog "goes dark," how can we re-light it elsewhere?

• Artifact: Simple design, diagram
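One way to picture "re-lighting" a dark weblog from several copies is a simple majority check among the surviving mirrors. The sketch below is only loosely inspired by the "lots of copies" idea; it is not the LOCKSS protocol and not the group's design, and the `relight` function and mirror names are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content):
    """Fingerprint a copy so mirrors can be compared cheaply."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def relight(copies):
    """Pick the version of a post that most surviving mirrors agree on.

    copies -- dict mapping a mirror name to its archived text,
              or None if that mirror has gone dark
    Returns the agreed text, or None if no copy survives.
    """
    surviving = [text for text in copies.values() if text is not None]
    if not surviving:
        return None
    votes = Counter(digest(text) for text in surviving)
    winner, _ = votes.most_common(1)[0]
    return next(text for text in surviving if digest(text) == winner)
```

If the origin is dark and two of three mirrors hold identical text, that text wins and the third, damaged copy can be repaired from it.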

Summary

• 30+ people
• 8+ projects
• Wide range of activities:
– Focus on metasearch and personal library
– New service models, new models for existing services
– Working demos
– Whiteboard-only hacking
– Building on previous years' work

Thoughts on Process

• Still not sure whether to share suggestions beforehand; good reasons for/against

• Quick re-assessment of ideas, skill balance soon after project groups assemble

• Post-lunch-ish reassembly, quick reports
• Whole-day pre-conf, single location, format works
• Wiki helpful for organizing projects, perhaps we can do more with it

New for 2004: Hackfest Awards

The "There's More Than One Way To Hack It" Duct Tape

Hackfest Awards

Art Rhyno: Access Pimp

(self-proclaimed!) (honest!!)

Hackfest Awards

2002

2003

2004

Peter Binkley: Lifetime Achievement

Acknowledgements

• Tamsin, Steven, Peter, et al. at Acadia, Saint Mary's for everything, esp. logistics, support

• Saint Mary's for facilities

• John for co-coordinating

• SFU (calvin), YCMI for servers

• Roy for the hype

• All the participants!