Small-world connectors across academic web spaces
Lennart Björneborn
Royal School of Library and Information Science, Copenhagen
AoIR-ASIST Workshop on Web Science Research Methods
Association of Internet Researchers Conference, Brighton, UK
19 September 2004
M.C. Escher: House of Stairs, 1951
WWW = document network = collaborative weaving
Wood et al. (1995)
web characteristics
www = new type of document system
= no central control / coordination
= bottom-up construction
www = distributed knowledge organisation = ’3D’ = distributed + diversified + dynamic
www = individual input in collective medium
= collaborative weaving
www = self-organized macro-level aggregations (clusters) of micro-level interactions
www = local actions → global consequences (e.g. small-world phenomena)
small-world networks
small-world = highly clustered + short paths
– short distances through shortcuts between nodes in network
– small-world = short local + short global distances
– efficient diffusion of signals, contacts, ideas, viruses, etc. in networks
social network analysis in 1960s: ‘six degrees of separation’
– today: ‘small worlds’ in biological, chemical, technical, social networks
– brains, ecological food webs, scientific collaboration networks, etc.
Watts & Strogatz 1998
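The two properties named on this slide (high clustering, short paths) can both be measured on a graph. A minimal Python sketch, assuming an undirected graph stored as a dict of neighbor sets; the function names and the 20-node ring lattice are illustrative assumptions, not taken from the talk.

```python
from collections import deque
from itertools import combinations

def clustering_coefficient(graph):
    """Mean fraction of each node's neighbor pairs that are themselves linked."""
    total = 0.0
    for node, nbrs in graph.items():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += linked / (k * (k - 1) / 2)
    return total / len(graph)

def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    dist_sum = 0
    pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1
    return dist_sum / pairs

def ring_lattice(n, k):
    """Hypothetical test graph: each node linked to k neighbors on each side."""
    graph = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            graph[i].add((i + step) % n)
            graph[(i + step) % n].add(i)
    return graph

lattice = ring_lattice(20, 2)

# Same lattice plus one long-range shortcut between opposite nodes.
shortcut = ring_lattice(20, 2)
shortcut[0].add(10)
shortcut[10].add(0)
```

Comparing `average_path_length(lattice)` with `average_path_length(shortcut)` shows the effect the slide describes: a single shortcut lowers the average distance while clustering stays high.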
scale-free link distribution
power law = # in-neighbors / subsite
[Figure: in-neighbors per subsite; x-axis: Subsites (0 to 8000), y-axis: In-neighbors (0 to 450)]
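The quantity plotted here, in-neighbors per subsite, can be counted from a list of links. A minimal Python sketch, assuming links are (source, target) pairs between subsites; the toy data and function names are hypothetical, and reading the figure as a rank-ordered plot is an assumption.

```python
def in_neighbor_counts(links):
    """Number of distinct in-neighbors per target; selflinks are ignored."""
    in_nbrs = {}
    for source, target in links:
        if source != target:
            in_nbrs.setdefault(target, set()).add(source)
    return {node: len(sources) for node, sources in in_nbrs.items()}

def rank_ordered(counts):
    """Counts sorted from most- to least-linked subsite, as in a rank plot."""
    return sorted(counts.values(), reverse=True)

# Hypothetical toy data: one heavily linked subsite and a few lightly linked ones.
links = [
    ("a", "hub"), ("b", "hub"), ("c", "hub"), ("d", "hub"),
    ("a", "b"), ("c", "b"), ("d", "c"), ("hub", "hub"),
]
counts = in_neighbor_counts(links)
ranked = rank_ordered(counts)
```

In a scale-free distribution, a few subsites have very many in-neighbors and most have few, so the ranked counts fall off steeply.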
research motivation
distributed knowledge organization small world structures
exploratory capabilities (accessibility + navigability)
– core issues in LIS (library and information science)
– short link paths: human web surfers + digital web crawlers can reach and retrieve web pages
what micro-level web activities contribute to small-world link structures?
– how do academic link creators actually connect documents, topics, genres, and sites across the Web?
main research question
what types of web links, web pages and web sites function as cross-topic connectors in small-world link structures across an academic web space?
webometrics
the study of quantitative aspects of the construction and use of information resources, structures and technologies on the Web, drawing on bibliometric and informetric approaches
informetrics
bibliometrics
scientometrics
webometrics
cybermetrics
© Björneborn 2004
basic link terminology
B has an inlink from A : ~ citation
B has an outlink to C : ~ reference
B has a selflink : ~ self-citation
E and F are reciprocally linked
H is reachable from A by a link path
A has a transversal link to G : shortcut
co-links
C and D have co-inlinks from B : ~ co-citation
B and E have co-outlinks to D : ~ bibliographic coupling
[Figure: link diagram with nodes A to H]
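The terms above map onto simple set operations over a directed link graph. A minimal Python sketch: the edges follow the statements on the slide, except `G -> H`, which is an assumption added so that H is reachable from A (the slide does not say which path leads to H). All function names are illustrative.

```python
# Directed links as (source, target) pairs.
# ("G", "H") is an assumed edge; the others are stated on the slide.
links = {
    ("A", "B"), ("B", "C"), ("B", "D"), ("B", "B"),
    ("E", "D"), ("E", "F"), ("F", "E"), ("A", "G"), ("G", "H"),
}

def inlinks(node):
    """Sources linking to node (selflinks excluded)."""
    return {s for s, t in links if t == node and s != node}

def outlinks(node):
    """Targets linked from node (selflinks excluded)."""
    return {t for s, t in links if s == node and t != node}

def has_selflink(node):
    return (node, node) in links

def reciprocally_linked(a, b):
    return (a, b) in links and (b, a) in links

def co_inlinks(a, b):
    """Sources linking to both a and b (analogous to co-citation)."""
    return inlinks(a) & inlinks(b)

def co_outlinks(a, b):
    """Targets linked from both a and b (analogous to bibliographic coupling)."""
    return outlinks(a) & outlinks(b)

def reachable(start, goal):
    """True if a directed link path leads from start to goal."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        if u == goal:
            return True
        for v in outlinks(u) - seen:
            seen.add(v)
            stack.append(v)
    return False
```

For example, `co_inlinks("C", "D")` returns the set containing B, and `co_outlinks("B", "E")` returns the set containing D, matching the two co-link statements.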
UK link data 2001
109 UK universities
7669 subsites
– www.hum.port.ac.uk
– www.atm.ox.ac.uk
– ...
3.4 million web pages
39.3 million page outlinks
– 34.4 million site selflinks
– 4.9 million site outlinks
delimited data set
– 105 817 web pages
– 207 865 links between 7669 subsites
5-step methodology
A. Graph model of 7669 UK academic subsites;
B. 189 random subsites in SCC (Strongest Connected Component);
C. 10 path nets with all shortest paths between five pairs of topically dissimilar SCC subsites;
D. Source and target pages along shortest link paths in 10 path nets;
E. Links, pages and subsites providing transversal (cross-topic) connections in 10 path nets.
[Figure: diagram of steps A to E]
corona model
1893 SCC: Strongest Connected Component
96 IN-Tendrils: connected from IN
2660 OUT: reachable from SCC
626 IN: traversable to SCC
55 OUT-Tendrils: connected to OUT
7 Tube: connecting IN to OUT
2332 Disconnected
bow-tie model
Broder et al. 2000
[Figure labels: .ac.uk, .uk]
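The core of this partition (SCC, IN, OUT) can be derived from two reachability searches. A minimal Python sketch, assuming a directed graph as a dict of target sets and a seed node already known to lie in the central component; it lumps tendrils, tubes and disconnected nodes together as `OTHER`, and it is not necessarily how the figures on this slide were computed.

```python
def reach(graph, start):
    """Nodes reachable from start by following links (includes start)."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def reverse(graph):
    """Same graph with every link direction flipped."""
    rev = {}
    for u, targets in graph.items():
        rev.setdefault(u, set())
        for v in targets:
            rev.setdefault(v, set()).add(u)
    return rev

def bow_tie(graph, seed):
    """Split nodes into SCC, IN, OUT and OTHER relative to the seed's component."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    forward = reach(graph, seed)            # everything the seed can reach
    backward = reach(reverse(graph), seed)  # everything that can reach the seed
    scc = forward & backward
    return {
        "SCC": scc,
        "IN": backward - scc,
        "OUT": forward - scc,
        "OTHER": nodes - forward - backward,
    }

# Hypothetical toy graph: i -> (s1 <-> s2) -> o, plus an unrelated pair x -> y.
toy = {"i": {"s1"}, "s1": {"s2"}, "s2": {"s1", "o"}, "x": {"y"}}
parts = bow_tie(toy, "s1")
```

The seven component sizes listed on the slide sum to 7669, the total number of subsites in the data set.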
shortest link path
cfd.me.umist.ac.uk
ercoftac.mech.surrey.ac.uk
cajun.cs.nott.ac.uk
ukoln.bath.ac.uk
cs.man.ac.uk
ashmol.ox.ac.uk
collections.ucl.ac.uk
vlmp.museophile.sbu.ac.uk
path net = ‘mini’ small world
transversal link
path net = all shortest link paths between two given nodes (subsites)
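A path net as defined above can be collected with a breadth-first search that records every shortest-path predecessor of each node. A minimal Python sketch, assuming a directed graph as a dict of target lists; the node names and the function name `path_net` are hypothetical.

```python
from collections import deque

def path_net(graph, source, target):
    """All shortest directed link paths from source to target."""
    dist = {source: 0}
    parents = {source: []}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in dist:
                # First visit: v lies one step beyond u.
                dist[v] = dist[u] + 1
                parents[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                # Another equally short route into v.
                parents[v].append(u)
    if target not in dist:
        return []

    def unwind(node):
        """Rebuild paths by walking predecessors back to the source."""
        if node == source:
            return [[source]]
        return [path + [node] for p in parents[node] for path in unwind(p)]

    return unwind(target)

# Hypothetical toy graph: two 2-step routes and one 3-step route from s to t,
# plus a direct link back from t to s.
toy = {
    "s": ["a", "b", "c"],
    "a": ["t"],
    "b": ["t"],
    "c": ["d"],
    "d": ["t"],
    "t": ["s"],
}
forward_net = path_net(toy, "s", "t")
backward_net = path_net(toy, "t", "s")
```

Because links are directed, the path net from s to t differs from the one from t to s, which is why each pair of subsites yields two path nets.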
10 path nets
5 pairs of topically dissimilar subsites + both directions = 10 path nets with all shortest paths
– hum.port.ac.uk (Faculty of Humanities and Social Sciences, Portsmouth) / atm.ox.ac.uk (Atmospheric, Oceanic and Planetary Physics, Oxford)
– economics.soton.ac.uk (Economics Dept, Southampton) / chem.gla.ac.uk (Chemistry Dept, Glasgow)
– psy.man.ac.uk (Psychology Dept, Manchester) / maths.gcal.ac.uk (Mathematics Dept, Glasgow Caledonian)
– speech.essex.ac.uk (Speech Research Group, Linguistics Dept, Essex) / palaeo.gly.bris.ac.uk (Palaeontology Research Group, Earth Sciences Dept, Bristol)
– geog.plym.ac.uk (Geography Dept, Plymouth) / eye.ox.ac.uk (Ophthalmology Dept [eye research], Oxford)
indicative findings
no generalizable findings – indicative only
– national + sectoral + institutional delimitation = UK academic subsites
– temporal delimitation = 2001 snapshot : do not cover dynamic changes
– small stratified sample of 10 path nets
may however be fruitful for future large-scale investigations
– computer-science sites may be important transversal (cross-topic) connectors across academic web spaces
– personal link creators may be important connectors across sites and topics in academic web spaces – especially personal link lists
– over 80% of transversal links may be academic (research, teaching)
– close relation: hubs / authorities and betweenness centrality
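Betweenness centrality, mentioned in the last finding, counts how many shortest paths between other nodes pass through a given node. A minimal Python sketch using Brandes' algorithm for unweighted directed graphs; the talk does not say how betweenness was computed, so the algorithm choice and the toy graph are assumptions.

```python
from collections import deque

def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm, directed)."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    score = dict.fromkeys(nodes, 0.0)
    for s in nodes:
        order = []                      # nodes in order of BFS discovery
        preds = {v: [] for v in nodes}  # shortest-path predecessors
        sigma = dict.fromkeys(nodes, 0) # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in graph.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Accumulate dependencies from the farthest nodes back toward s.
        delta = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Hypothetical toy graph: m is the only connector between {a, b} and {c, d}.
toy = {"a": {"m"}, "b": {"m"}, "m": {"c", "d"}}
scores = betweenness(toy)
```

In the toy graph every path from a or b to c or d runs through m, so m is the only node with a non-zero score, the kind of connector role the findings attribute to some subsites.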
web of genres & genre drift
possible small-world implications/applications
library and information science
– also focus on distributed knowledge organization (www)
– also focus on exploratory capabilities in distributed info. systems
– convergent (goal-directed) and divergent (serendipitous) info. behavior
web sociology / cyberscience
– small-world links > cross-social / cross-domain weak ties
– counteract balkanization into disconnected / unreachable insularities
– small-world ‘gate-keepers’ with betweenness centrality in networks
– tracking interdisciplinary boundary crossings
– web mining of fertile areas for cross-disciplinary exploration and cross-pollination
search engines
– better coverage in web traversal + harvesting
– zoomable maps of web clusters + small-world shortcuts
Five ‘laws’ of web connectivity
– Links are for use – the very essence of hypertext;
– Every surfer his or her link – the rich diversity of links across topics and genres;
– Every link its surfer – ditto;
– Save the time of the surfer – by visualizing web clusters and small-world shortcuts;
– The Web is a growing organism.
Inspired by Ranganathan (1931). The five laws of library science:
“Books are for use.
Every reader his or her book.
Every book its reader.
Save the time of the reader.
The Library is a growing organism.”