What is Webometrics? Mike Thelwall Statistical Cybermetrics Research Group University of...

Post on 14-Jan-2016


What is Webometrics?

Mike Thelwall

Statistical Cybermetrics Research Group

University of Wolverhampton, UK

Virtual Knowledge Studio (VKS)

Information Studies

1. Introduction

□ Webometrics is concerned with gathering data on and measuring aspects of the Web
  □ web sites
  □ web pages
  □ hyperlinks
  □ web search engine results
  □ YouTube video commenter networks
  □ MySpace Friend networks

□…for very varied social science purposes

New problems: Web-based phenomena

□ Webometrics can be applied to understanding web-based phenomena
  □ Why do web sites interlink?
  □ Which web sites interlink?
  □ What interlinking patterns exist?
  □ What topics are frequently blogged about?

Old problems: Offline phenomena reflected online

□ Some offline phenomena have measurable online reflections
  □ International communication
  □ Inter-university collaboration
  □ University-business collaboration
  □ The impact or spread of ideas
  □ Public opinion

2. Examples

Blog searching - blogpulse.com

Example: Identifying and tracking public science concerns in blogs

□ Over 100,000 blogs and other sources tracked daily via RSS feeds
□ Objective: to identify and track public concerns about science
□ E.g., “Schiavo” identified and tracked as a potential public science concern
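A minimal sketch of how a term such as “Schiavo” might be tracked over time: count, for each day, the share of collected posts that mention it. The sample posts, their dates, and the function name `daily_share` are invented for illustration; the slides do not describe the actual processing used, and real input would come from parsed RSS feed entries.

```python
from collections import Counter
from datetime import date

# Hypothetical (publication date, post text) pairs standing in for
# entries harvested from RSS feeds. All values are made up.
posts = [
    (date(2005, 3, 18), "The Schiavo case raises hard questions"),
    (date(2005, 3, 18), "Weekend football roundup"),
    (date(2005, 3, 19), "Schiavo ruling: what doctors say"),
    (date(2005, 3, 19), "More on Schiavo and medical ethics"),
    (date(2005, 3, 19), "New phone review"),
]


def daily_share(posts, keyword):
    """Return {day: fraction of that day's posts mentioning keyword}."""
    total = Counter()
    hits = Counter()
    needle = keyword.lower()
    for day, text in posts:
        total[day] += 1
        if needle in text.lower():  # simple case-insensitive substring match
            hits[day] += 1
    return {day: hits[day] / total[day] for day in sorted(total)}


shares = daily_share(posts, "schiavo")
for day, share in shares.items():
    print(day.isoformat(), f"{share:.2f}")
```

Reporting a share rather than a raw count keeps days with different numbers of collected posts comparable. A sudden rise in the share is a signal to read the posts, not evidence of a public concern on its own.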

Example: The online impact of research groups (NetReAct)

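Example: Links between EU universities

[Figure: network diagram of normalised linking between EU universities by country, smallest countries removed. Annotation: "Geopolitical connected". Countries shown: Sweden, Finland, Norway, UK, Germany, Austria, Switzerland, Poland, Italy, Belgium, Spain, France, NL.]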

International biofuels research network

Example: MySpace age profiles

Percentage of profiles containing swearing

Group               Moderate   Strong   Very strong   Sample size
US males 16-19      10%        47%      2%            1,530
US females 16-19    11%        38%      2%            1,287
UK males 16-19      33%        33%      8%            171
UK females 16-19    18%        38%      3%            130

(typical sample size 20-148 for non-web swearing research)
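The sample sizes matter because they set how precisely each percentage is estimated. As a hedged illustration (this calculation is not in the slides), the sketch below computes approximate 95% Wilson score intervals for the "strong" column of two rows. It treats each sample as a simple random sample and uses the rounded percentages from the table, so the intervals are only indicative.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Approximate Wilson score interval for a proportion p_hat from n cases."""
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# (group, proportion with strong swearing, sample size), taken from the table.
rows = [
    ("US males 16-19", 0.47, 1530),
    ("UK males 16-19", 0.33, 171),
]

intervals = {group: wilson_interval(p, n) for group, p, n in rows}
for group, (low, high) in intervals.items():
    print(f"{group}: {low:.3f} to {high:.3f}")
```

The interval for the smaller UK sample is considerably wider than the one for the US sample, which is worth keeping in mind when comparing the two countries.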

emphatic adverb/adjective OR adverbial booster OR premodifying intensifying negative adjective (36% of swearing)

□ and we r guna go to town again n make a ryt fuckin nyt of it again lol
□ see look i'm fucking commenting u back
□ lol and stop fucking tickleing me!!
□ Thanks for the party last night it was fucking good and you are great hosts.
□ That 50's rock and roll weekender was fucking mint!
□ Fuckin my space, my arse
□ 1/2 d ppl cudnt even speak fuckin english!
□ yeah so me and sarah broke up and everythings fucking shit

YouTube – Video poster ages

YouTube friend network

Online impact - Keywords in web pages mentioning IWRM

Data Gathering/Processing Tools

□Blogpulse.com – blog network diagrams

□LexiURL Searcher – links, web text, YouTube, Flickr, Technorati

□Issue Crawler, Google TouchGraph - links

Discussion points for online data

□ Validity – is the underlying meaning of the text/video/picture readily apparent to the researcher?
  □ Possibly not to any great degree for teenagers’ MySpace comments or very personal YouTube videos
□ Reliability – are search engines accurate/good at returning the correct results?
  □ Google blog search shows unreliability – very variable over time
  □ Researchers can triangulate different similar search engines or over time to test reliability
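One simple way to check reliability over time is to repeat the same query on several days and see how much the reported hit counts vary. The sketch below uses the coefficient of variation (standard deviation divided by mean) for that purpose. The engine names and counts are hypothetical, and this measure is an assumption chosen for illustration rather than a method stated in the slides.

```python
from statistics import mean, pstdev

# Hypothetical hit counts for one query, recorded on consecutive days.
counts_by_engine = {
    "engine_a": [1200, 1180, 1230, 1210],
    "engine_b": [900, 2400, 300, 1500],
}


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation undefined")
    return pstdev(values) / m


cv = {name: coefficient_of_variation(v) for name, v in counts_by_engine.items()}
for name, value in cv.items():
    print(f"{name}: {value:.3f}")
```

A low value suggests stable results for that query, while a high value suggests the counts should not be trusted from a single day's search. Comparing the same queries across similar engines gives a second check.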

Discussion points for online data

□ Coverage – to what extent are all the phenomena of interest covered by the source (e.g., search engine) used?

□Sample bias – are certain types of people over-represented? (e.g., the more literate, the more vocal, the more politically active, youth, educated, creative types…)

Summary

□The web contains a wide variety of interesting web and “web 2.0” content posted by many different people in many different formats

□Webometric methods can give insights into this data

Books

□Thelwall, M. (2009). Introduction to webometrics: Quantitative web research for the social sciences. New York: Morgan & Claypool.

□Rogers, R. (2005). Information politics on the Web. Massachusetts: MIT Press.

□ http://lexiurl.wlv.ac.uk
□ http://webometrics.wlv.ac.uk
□ http://www.issuecrawler.net

Important considerations

□ Data accuracy
□ Data cleaning
□ Context to help interpret results
□ Report results carefully

Example: Analysis of the accuracy of search engine results

Live Search results analysis