Peter Andras, School of Computing Science, Newcastle University, [email protected]
RESEARCH: METRICS, QUALITY, AND MANAGEMENT
Peter Andras
School of Computing Science
Newcastle University
Overview
- Introduction
- Research metrics
- Science domains and the evolution of science
- Quality and performance
- Science management
Introduction – 1
- RAE – performance and funding
- REF – new metrics-based performance measurement system (2014)
- Scientometrics, bibliometrics
- Other countries: Netherlands, Australia, US
Introduction – 2
- What do we measure by metrics?
- What can be measured by metrics?
- How are metrics related to performance and quality?
- How can higher education and research management use science metrics?
Origins of research metrics
- Derek de Solla Price – 1960s
- Measuring 'big' science – physics
- Measuring national science output – papers, citations, patents
- Bibliometrics and scientometrics – 1970s to today
- Early: counts
- Recent: networks, distributions and conditional counts
What is measured – 1
- Publication counts within a time window – at least a year, and usually within a range of publication venues (conferences, journals, books)
- Citation counts of the counted publications – citation counts per year may grow for a while and then shrink to zero over time
- Links – co-authorship, co-citation, citing – cf. Google PageRank
What is measured – 2
Indices:
- h-index – the largest n such that n publications each have at least n citations
- g-index – the largest m such that the top m publications have at least m citations on average (i.e. at least m² citations in total)
Journal metrics:
- Impact factor – average (expected) citations per paper within a time window (2 years)
- Immediacy index – the ratio of papers in a journal that are cited within a year
- Citation half-life – the median age of the papers that are currently cited
- Journal h-index
Metrics
Metrics in academia:
- Individual
- Group
- Departmental
- University level
Larger integration implies more stability and less variability over (shorter) time in the metrics
Domain comparability
- Different science domains have different practices of doing research, and of publishing and discussing research results
- E.g. computer science – software engineering and software applications – predominantly uses conferences to present new results
- E.g. biology – predominantly uses journals to present new results
Normalisation
- Considering most sciences, the underlying generic pattern of publication and citation metrics is very similar – e.g. all follow power-law distributions
- Normalised metrics: the Leiden research metrics centre divides raw metrics by domain-specific parameters to make them comparable across subjects
- Domain-specific parameters are calculated by analysing domain-specific journals and papers (from high-impact multi-domain journals) – journal classification
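A minimal sketch of this kind of normalisation (not the Leiden centre's exact method; the field baselines and data below are invented for illustration): divide each paper's citation count by the expected count for its domain, then average.

```python
def mean_normalised_citations(papers, field_baseline):
    """Average of citations / expected-citations-for-the-field.
    A score of 1.0 means 'cited exactly as much as the field average'."""
    ratios = [cites / field_baseline[field] for field, cites in papers]
    return sum(ratios) / len(ratios)

# hypothetical domain-specific parameters: average citations per paper
baseline = {"biology": 20.0, "computer science": 8.0}
group = [("biology", 30), ("computer science", 12), ("computer science", 4)]
print(round(mean_normalised_citations(group, baseline), 2))  # 1.17
```

With this normalisation the biology paper and the first computer science paper get the same score despite very different raw counts, which is the point of cross-domain comparability.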
Domain nature
Domain-specific public opinion:
- How are new results presented?
- How are they discussed?
- How are they referenced?
- What are the primary and preferred venues?
- How is reputation established and maintained – for individuals, groups and institutes?
- What are the socio-dynamics of the domain?
- What are the ways of 'experimental' validation?
Distortion factors
- Review papers – very highly cited
- Methods papers – most highly cited
- Self-citations – important at low counts, ignorable above medium counts (domain-specific counts)
- Networking – citation cliques
- Spinning – relatively many highly cited papers contain high claims that are refuted in a relatively short time
- 'Authority' – unjustified benefits from already established authority status
Variability and noise
- The metrics data may be highly variable – especially at the individual or small-group level, and also if too-short time periods are considered
- Structural noise caused by the mechanisms of science – e.g. time to the citations peak (domain specific)
Common research metrics
- Publication and citation counts considering a time window
- Individual and group level h-index / g-index
- Patent counts and patent revenue value
- Research funding value
- Number of PhD students satisfying some quality minimum (e.g. submission of the thesis on time, publication of a journal paper)
- Science esteem evaluation – a measure of long-term reputation management and networking ability
Metrics and quality
- How are metrics related to the research quality of an individual / group / department / university?
- Are metrics purely retrospective, or do they have predictive value as well with respect to research quality?
Evolution of science
- Karl Popper – continual additions to science, all additions being falsifiable and valid until falsified; revisions are caused by the falsification of previously accepted scientific knowledge
- Thomas Kuhn – normal science and revolutionary science; normal science grows by many little additions filling gaps in the current scientific knowledge, while revolutionary science reorganises science fundamentally and grows rapidly until it matures into normal science
Progress in normal science
- Use existing knowledge to make highly likely predictions
- Check the validity of such predictions by appropriately designed 'experimental' tests
- E.g. demonstrating a statement or a new minor theorem in a well-established branch of mathematics
Progress in revolutionary science
- Use existing knowledge to discover fundamental inconsistencies or relationships in previously unquestioned or believed-to-be-unrelated pieces of scientific knowledge
- See what cannot be seen by most others, and re-shape the related science according to the new discovery
- E.g. non-Euclidean geometry (Bolyai, Lobachevsky), alternatives to the continuum hypothesis (Gödel, Cohen), alternatives to the axiom of choice (Zermelo, Gödel, Cohen)
Predictability of success – 1
Normal science:
- New hypotheses are about highly predictable outcomes
- The predictability of success is lower if the hypothesis is in an area where the experimental validation of earlier hypotheses is shaky – such areas are at the boundary of normal science, and are avoided by success-hungry researchers
- Most research is successful in filling a gap and adding new bits of knowledge to the science
- Most research is published and is referenced by further publications – being more mainstream in the current science implies more expected citations
Predictability of success – 2
Revolutionary science:
- Most hypotheses turn out to be wrong
- Successful revolutionary science is built on a large volume of unsuccessful tries and takes a very long time (see the time between Euclid and non-Euclidean geometry, or between axiomatic set theory and the alternatives to the axiom of choice)
- It may take a long time to be noticed, due to established authorities and authoritative views of the science
- High risk of achieving no success, and of being largely ignored for a good while even if success is achieved
- Many times it does not lead to publications, citations or reputation enhancement
Extent of impact
- The expected impact is inversely correlated with the predictability of success
- Normal science has impact – however, spinning may create the impression of higher impact than the actual; usually the impact is small, since it is about filling a usually small gap in the science
- Revolutionary science has major impact if it is successful; in this case it is likely to change (rewrite) a large part of the existing science, or to initiate a fundamentally new branch of science with rapid initial growth potential
Domain maturity
- Maturity is indicated by the proportion of normal science progress within the science
- Mature science: most progress is made by small additions following normal science research; revolutionary events are rare and relatively marginal – e.g. physics
- Early-stage science: most progress follows the pattern of revolutionary science discovery; normal science progress happens for relatively short periods, and normal science components are regularly undermined by new revolutionary science; discovering the same thing several times is not uncommon, and spinning may play an important role in establishing reputation – e.g. management science
Domain specific progress & impact
- Progress and impact depend on the maturity of the science
- More mature science implies more work following the pattern of normal science
- More mature science also implies relatively little impact for most new discoveries
Domain public opinion – 1
- Science maturity determines the nature of the public opinion of the science
- Mature science is based on a stable and unchallenged body of scientific knowledge, which allows relatively few major novelties, implies well-established and standardised experimental validations, and implies relatively low impact for most discoveries
- Implication: archival-style publication in journals, the use of a highly standardised language, large integrated scientific societies, and the use of spin to create the impression of revolutionary-ness and to achieve visibility
Domain public opinion – 2
- Early-stage or less mature science allows regular 'revolutionary' novelties, does not have an unquestionable body of established knowledge, allows experimentation with validation methods, allows repeated re-discoveries, and implies relatively high impact for the truly successful revolutionary innovations
- This environment may facilitate more informal ways of publishing results; it also implies fewer citations for most results, which turn out to be less revolutionary than expected; the high impact may find its way via alternative routes – i.e. avoiding formal citations – and spinning may play an important role
Measurability of normal science
- Normal science produces a large volume of output in well-established venues (journals, conference proceedings)
- Normal science papers are regularly cited by other such papers – papers in the currently most fashionable mainstream areas can get many citations
- Judging the quality of usual normal science publications is relatively straightforward; spinning usually has short-term effects
- Research grants are frequently available for normal science projects with highly predictable objectives
Measurability of revolutionary science
- Revolutionary science output is rare; most 'revolutionary'-intended publications are rejected or end up in obscure journals or conferences, and scientists working on revolutionary topics may refrain from publishing half-baked ideas
- Revolutionary science papers may stay as a 'sleeping beauty' for many years; they may get few citations for a long time, and may never get many citations (e.g. if follower papers collect most of the citations)
- It is hard to judge the quality of revolutionary science work – it is outside or at the boundaries of normal science, there are no standardised methods or well-established referential knowledge base, and the impact is highly unpredictable
- A very large part of science intended to be revolutionary leads to dead ends, although the rare truly revolutionary works have extremely high impact on their science
- Few research grants are available for revolutionary science, due to its unpredictable success and out-of-the-ordinary methods and objectives; getting such grants often requires excessive spinning
Metrics measure normal science
Research metrics measure almost exclusively normal science:
- Publications
- Citations
- Research funding
- PhD students
Measuring revolutionary science
- It is very difficult to measure revolutionary science
- It becomes obvious that a work is part of revolutionary science usually well after the work is published – 5–25 or more years later, depending on the science domain
- Revolutionary science work is confirmed to be truly revolutionary at, or close to, the point when the new science knowledge based on this work becomes normal science
- It is very difficult to measure the expected impact of scientific work meant to be revolutionary
- Revolutionary science is not measured by the usual research metrics (or at most it is measured to a very small extent before it becomes normal science)
Quality and metrics – 1
- Quality in the case of normal science can be evaluated and predicted reliably – the methods are standardised and it is relatively easy to check that they are followed correctly, the expected results are highly predictable, and their impact is also well predictable
- Research metrics are well correlated with research quality in the case of normal science
Quality and metrics – 2
- It is difficult to compose valid short- and medium-term metrics for revolutionary science – long-term metrics work in the sense of evaluating whether it led to the emergence of new normal science
- Quality is hard to assess because results and impact are not easily predictable – an indicator can be the extent of the gap that would be bridged by the new intended discovery, but spinning may blur the reliability of the estimate of the extent of that gap
- Quality may be judged by others who are active in the same area and have a similarly wide range of knowledge, but even these judgements can easily be wrong
- There is not much relation between any common research metric and the quality of revolutionary research and its output
Output and funding
In the short-to-medium term (3–5 years):
- Funding by QR and research council grants (in the UK) follows performance in output generation (primarily citation counts) – significant correlation
- Output generation (publication and citation counts) depends on earlier output performance (primarily citation counts), and much less on changes in the level of research funding
- Research funding performance reflects earlier output performance (primarily normal science)
- Extra research funding does not necessarily imply extra output above the level predicted from earlier output
Metrics for management
- How can we use research metrics in the context of management?
- What kind of research metrics can be used?
- What can be achieved by using metrics?
Management objectives
- Management objectives should be set in clear terms, should be achievable, and the progress towards achievement should be measurable
- Metrics can help in formulating such research management objectives
- Objectives should be set after sufficiently clear measurement and evaluation of the current status
Measuring progress and achievement – 1
- Research metrics are key to setting objectives with measurable progress in normal science
- Normal-science-based long-term retrospective measurement of revolutionary science work is the main way of measuring progress and achievement in revolutionary science
Measuring progress and achievement – 2
- Measurement should consider a large part of the spectrum of possible outcomes (in quality terms) at sufficient resolution (e.g. papers published in a wide range of venues, grant applications with all levels of rating)
- The context and methods of measurement should be established clearly and transparently (e.g. the set of considered journals, conferences, time windows, exclusion rules)
Metrics as a management tool – 1
Individual metrics:
- Publication and citation counts and distributions (yearly counting) for 3 / 5 / 7 years (depending on status) – aim: increasing distribution graphs, and preferably an exponentially increasing citation graph
- Discounted publication and citation metrics – elimination of reviews, editorials, database reports, excessively-many-author publications, etc. – aim: similar to the above
- List of highly cited excluded publications for 3 / 5 / 7 years – aim depends on the individual
- h-index and/or g-index – with and without excluded publications – change in the index over the last 3 / 5 / 7 years – aim: steady increase
- Research grants – count and yearly distribution of applications, distribution of evaluations, amount and yearly distribution of awarded grants for the last 3 / 5 / 7 years – aim: steady level or increase
- For top-flyers – comparison with the metrics results of a set of agreed top-flyers elsewhere – this applies to metrics that can be calculated using common and accessible data sources
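The yearly counting over a 3 / 5 / 7 year window can be sketched as follows; this is a hypothetical helper, not a prescribed implementation, and the sample years are invented:

```python
from collections import Counter

def yearly_distribution(event_years, window, end_year):
    """Counts per year (publications, citations, applications, ...)
    within the last `window` years ending at `end_year`."""
    start = end_year - window + 1
    counts = Counter(y for y in event_years if start <= y <= end_year)
    return [counts.get(y, 0) for y in range(start, end_year + 1)]

# hypothetical publication years for one researcher, over a 5-year window
pubs = [2010, 2012, 2012, 2013, 2014, 2014, 2014]
print(yearly_distribution(pubs, 5, 2014))  # [1, 0, 2, 1, 3] for 2010..2014
```

A list that rises from left to right is the 'increasing distribution graph' aimed for above.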
Group & department metrics
- Same set of metrics as for individuals, but counted for the group or department
- Distributions of the individual metrics descriptors (e.g. publication and citation count) over the group or department members – aim: a balanced distribution with a set of high-flyers
- Change in the above distributions over time – aim: steady or improved balance, with a steady or increasing count of high-flyers
- Cohesion metrics: member co-authorships of counted papers and grant applications, and the proportion of these within the whole measured sets – aim: sufficient but not excessive co-working relations
- Comparison with the metrics results for a set of departments considered close competitors – this applies to metrics that can be calculated using common and accessible data sources
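The cohesion metric can be sketched as the proportion of counted outputs that involve at least two group members as co-authors; the names and data below are invented for illustration:

```python
def cohesion(papers, members):
    """Fraction of papers with at least two group members among the authors."""
    members = set(members)
    internal = sum(1 for authors in papers if len(members & set(authors)) >= 2)
    return internal / len(papers)

group = ["ana", "ben", "cara"]
papers = [["ana", "ben"], ["ana", "ext1"], ["cara", "ben", "ext2"], ["ben"]]
print(cohesion(papers, group))  # 0.5: two of the four papers are internal co-authorships
```

The 'sufficient but not excessive' aim above would translate into keeping this fraction within an agreed band rather than maximising it.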
Metrics as a management tool – 2
University metrics:
- Same metrics as for groups & departments
- Group distribution and cohesion metrics should be calculated by considering departments instead of individuals
- Comparison with the metrics results for a set of universities considered close competitors – this applies to metrics that can be calculated using common and accessible data sources
Metrics as a management tool – 3
Setting targets:
- Targets are short-term objectives – time limits/windows should be set very clearly
- Targets should be clearly defined, based on an evaluation of the current standing, achievable, and clearly measurable
- While targets apply to single years, they should be considered in the context of multi-year objectives
Domain specific impact
- The impact of the work is domain specific
- This should be considered when setting citation and related (e.g. h-index) targets at the individual, group & department, and university levels
Clarity and predictability
- It is critical to use metrics in management with a clear setting of their context, methods, and evaluations
- It is also critical to have predictable career-related outcomes associated with metrics-based management evaluation
- It should be clear what gains/losses or rewards/punishments are associated with delivering the target/objective, and with its non-delivery, sub-delivery or over-delivery
Environment
- Work time allocations should explicitly consider the time related to work towards the achievement of research performance targets
- Metrics, evaluations, and the related gains/losses should be generally accepted as valid
Competition and transparency
- Competition in achievement should be openly supported
- The performance evaluation process should be open and transparent
- Individuals / groups / departments should be able to compare themselves with others on the basis of openly available, timely data (considering normalisation due to science domain)
Hype and spin
- The requirement for hype and spin is induced by the larger market of research results (e.g. publishing papers, getting patents)
- Ideally, hype and spin should be discounted in internal evaluations, but without damaging the enthusiasm of researchers who hype and spin their work
- More successful and higher-reputation departments can afford stricter hype and spin discounting – the negative subjective effects of discounting are compensated by the shared reputation dividend
Choice: normal vs. revolutionary science
- A choice about the extent of normal and revolutionary science work should be made at all levels (individual, group, department, university)
- Early stage: the preference should be for normal science, to establish reputation
- Revolutionary science may be very exciting and challenging, but too much of it may lead to a drop in, or lack of, reputation
Funding and impact expectation
- Funding and impact expectations depend on the normal / revolutionary science mix
- The normal science part makes the expectations more predictable, and also limits these expectations
- The revolutionary science part makes the expectations less predictable, but may also imply very high longer-term expectations
Unit size – 1
- Revolutionary science implies high risk
- High risk is affordable only if it can be spread sufficiently widely – an implication for research department size
- A small department may easily disappear by taking too much risk
- Doing revolutionary science needs a sufficiently large department that can compensate for the risk by also doing a sufficient amount of low-risk normal science work
Unit size – 2
- Phase transition: revolutionary science needs to be embedded in very large departments that also do a large amount of normal science; departments not doing revolutionary science need to be smaller, otherwise they have too much inefficiency (e.g. unable to provide a sufficient career perspective for many members, too much overlap, too much hype-driven competition)
- Transition between the two phases is possible
Managing normal science
- Managing normal science work is possible using metrics
- Since most of the well-established sciences contain mostly normal science, individuals / groups / departments / universities can use research metrics-based management to a large extent
Managing revolutionary science – 1
- Revolutionary science is very difficult to measure, and can be measured reliably only by assessing its normal science impact over the long or very long term
- Investment in revolutionary science is very risky, and most of it has no return
- Managing revolutionary science work may be possible to some extent on the basis of the domain knowledge of established leaders of the domain – they may choose more junior colleagues deemed able and ready to get onto the tracks of revolutionary science; this is also a high-risk decision for these younger scientists (they may end up with very little, and may realise this only after a long time)
Managing revolutionary science – 2
- Revolutionary science investment may be based on the long-term revolutionary science track record of a department or university, but it should be noted that any such track record is not a guarantee of successful continuation – however, consecutive generations of revolutionary scientists may indicate that at least the previous generations knew how to select their followers
HR implications – normal science
- Mature science domains with sustained growth potential need many (possibly increasingly many) researchers who deliver results for normal science progress
- The ability distribution of the pool of potential researchers does not change in a short time; this implies that if more potential researchers find work as scientists, an increasing number of less able researchers work on normal science topics, and the average ability of researchers working in the science domain is lowered
- The increasing standardisation of normal science research compensates for the drop in average ability, making it possible for the less able researchers to achieve good results as well
HR implications – revolutionary science – 1
- As the body of normal science grows, a longer time is required for a researcher to show proficiency in the production of normal science, which is required for establishing a reputation
- The benefits of doing normal science may attract many who in principle would be able to do more than that in terms of revolutionary science discoveries – as the science grows, the likelihood of finding such significant discoveries reduces, increasing the chance that the ability to make revolutionary discoveries will be wasted in the end
HR implications – revolutionary science – 2
- Researchers working on revolutionary topics may become the funding and impact ballast of groups and departments, which may lead to pressure on them to switch to topics more rewarding in funding and impact terms
- Finding the right candidates for doing revolutionary work becomes increasingly difficult as delivering results in normal science becomes easier and the spin-noise of pretended 'revolutionary'-ness becomes louder
Summary – 1
- Most of what can be measured to compile research metrics belongs to normal science
- Revolutionary science work disappears in the flood of normal science until around the point in time when it starts to turn into normal science
- Research metrics can be used to measure quality in the context of normal science performance
- In the case of revolutionary science performance, the metrics work only in the long term and in retrospect
Summary – 2
- Clarity, predictability, transparency, appropriate hype discounting, and competition are critical for the management of research
- Normal science is easier to practise, more researchers are available to do it, and its impact and expected funding are more predictable than in the case of revolutionary science
- Revolutionary science becomes increasingly risky as the body of normal science grows, but its potential impact also becomes increasingly large, while finding the right researchers to do it is increasingly difficult
Summary – 3
- Normal science research is required in all sciences, and to an increasing extent as the science matures
- Having a sufficient amount of good-quality normal science research provides predictability and safety (in funding, impact and reputation terms) for a research department
Summary – 4
- To cultivate revolutionary science, the management of the institution should make clear long-term decisions about getting into such a risky venture – size matters
- The expectations about impact, funding and personnel should be clear, and should be set in realistic time frames (depending on the science domain)
- The associated risk of non-delivery, the lack of ability to monitor progress in any close sense, and the autonomous funding requirements should be clear and acceptable to the management