Manifestation of Research Specialty Processes in Collections of Journal Papers

Steven A. Morris

Oklahoma State University

Summary

• Discuss research specialties
• Model collections of papers as systems of coupled bipartite networks
• Discuss entities, links, and entity groups as manifestations of research specialties in collections of papers
• Discuss visual presentation to reveal structural and dynamic information about a specialty

Goals

• Visualize structure and dynamics of a research specialty through a collection of papers
  – Social organization
  – Knowledge organization

• Present to subject matter experts for technology forecasting

Research specialty definitions

• A research specialty is a self-organized social organization whose members tend to study a common research topic, attend the same conferences, publish in the same journals, cite each other's work, and belong to the same social networks that are known as invisible colleges.

• Specialties create their own literature, i.e., a body of journal papers and books that broadly focus on the specialty's research topic.

• Define a collection of papers as a list of journal papers that constitutes a comprehensive sample of a specialty's journal literature.

Model of a research specialty

[Figure: model of a research specialty, linking a Kuhnian paradigm, researchers, and a body of knowledge.]

Kuhnian paradigm
• Symbolic generalizations
• Metaphysical paradigms
• Validation standards
• Exemplars

Researchers
• Researcher local organization (Researcher team processes)
• Researcher global self-organization (Research global communication processes)
• Researcher education & training (Researcher entrance processes)
• Researcher retirement/out-migration (Researcher exit processes)

Body of knowledge
• Journal literature
• Conference literature
• Educational theses & dissertations
• Institutional reports
• Books

Labels on links: funding; technical communication through journal literature and conferences; base knowledge; research reports.

Generated knowledge adopted as base knowledge produces ‘paradigm creep’.

Size of specialties

• Specialties are usually small: fewer than 100 core members, according to Kuhn.

• Collections of papers usually contain fewer than 5000 papers.

• Scaling is not a big problem.

Static information sought about a specialty

• Identification and ranking of individual entities
  – Experts
  – Productive researchers
  – “Rising stars”
  – Centers of excellence
  – Exemplar references
  – Key journals

Static information sought about a specialty

• Structural mapping (groups and their relations)
  – Terms (subtopic ‘vocabularies’)
  – Papers (‘research fronts’: papers grouped by subtopic)
  – References (exemplar reference groups, ‘paradigms’)
  – Paper authors (‘research teams’)
  – Reference authors (‘schools of thought’)
  – Paper journals (research report ‘libraries’)
  – Reference journals (base knowledge ‘libraries’)

Dynamic information sought about a specialty

• Monitoring
  – Trends
    • Growth/decline of the specialty
    • Obsolescence of knowledge
    • Geographic migration of research activity
  – Discontinuous events
    • Discoveries
    • External events

• Forecasting
  – Extrapolate trends
  – Predict risk of discontinuous events

Why use journal papers to investigate a specialty?

• Vetted through review process

• Public record of researcher communication

• Permanent record

• Formatted, structured information available (through abstract services)

Gathering collections of papers to ‘cover’ a specialty

• Gathered from Science Citation Index
• Using seed references:
  – Find all papers that cite a collection of key references in the specialty
• Query of terms:
  – Find all papers associated with keyword terms that are related to the specialty
  – Index terms, title terms, abstract terms
• Query of reference authors:
  – Find all papers that reference key authors in the specialty
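The three gathering strategies above can be sketched as a filter over paper records. This is only an illustration: the `PaperRecord` fields, the `gather_collection` function, and the sample records are hypothetical and do not reflect the Science Citation Index data format. Combining the three queries with a union is a choice made here, not something the slide specifies.

```python
from dataclasses import dataclass, field


@dataclass
class PaperRecord:
    """Hypothetical paper record; not the actual Science Citation Index schema."""
    title: str
    index_terms: set = field(default_factory=set)
    references: set = field(default_factory=set)         # cited reference strings
    reference_authors: set = field(default_factory=set)  # cited author strings


def gather_collection(records, seed_references=(), query_terms=(), key_authors=()):
    """Keep a paper if it cites a seed reference, matches a query term,
    or cites a key reference author (union of the three queries)."""
    seeds = set(seed_references)
    terms = {t.lower() for t in query_terms}
    authors = set(key_authors)
    collection = []
    for rec in records:
        # Index terms plus title words; abstract terms could be added the same way.
        paper_terms = {t.lower() for t in rec.index_terms} | set(rec.title.lower().split())
        if (rec.references & seeds) or (paper_terms & terms) or (rec.reference_authors & authors):
            collection.append(rec)
    return collection


# Made-up records with placeholder reference and author strings.
records = [
    PaperRecord("Toxin structure", {"toxin"}, {"REF-A"}, {"AUTHOR-X"}),
    PaperRecord("Vaccine trial", {"vaccine"}, {"REF-B"}, {"AUTHOR-Y"}),
    PaperRecord("Unrelated topic", {"geology"}, {"REF-C"}, {"AUTHOR-Z"}),
]
collection = gather_collection(records, seed_references={"REF-A"}, query_terms={"Vaccine"})
print([rec.title for rec in collection])
```

In practice each query would probably be run separately and the results reviewed before merging, since a broad term query can pull in papers from outside the specialty.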

Entity-Relation model of a collection of journal papers

• 6 direct bipartite networks
• 15 indirect bipartite networks formed from cascading bipartite networks

[Figure: network of entity-types: authors, papers, index terms, paper journals, reference journals, references, reference authors.]
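The counts of 6 direct and 15 indirect networks can be checked with a little arithmetic: seven entity-types give 7·6/2 = 21 possible pairs. The sketch below assumes the six direct links are the ones described on the "bibliographic links" slide later in the deck (papers to authors, index terms, paper journals, and references; references to reference authors and reference journals). Institutions, which appear in the entity-relationship diagram, are not counted here.

```python
from itertools import combinations

# The seven entity-types named in the figure.
ENTITY_TYPES = [
    "authors", "papers", "index terms", "paper journals",
    "reference journals", "references", "reference authors",
]

# Assumed direct links (taken from the link descriptions elsewhere in the deck).
DIRECT_LINKS = {
    frozenset({"papers", "authors"}),
    frozenset({"papers", "index terms"}),
    frozenset({"papers", "paper journals"}),
    frozenset({"papers", "references"}),
    frozenset({"references", "reference authors"}),
    frozenset({"references", "reference journals"}),
}

# Every unordered pair of distinct entity-types is a possible bipartite network.
all_pairs = {frozenset(pair) for pair in combinations(ENTITY_TYPES, 2)}
indirect_pairs = all_pairs - DIRECT_LINKS

print(len(all_pairs), len(DIRECT_LINKS), len(indirect_pairs))
```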

Entity-relationship model of a collection of journal papers

[Figure: entity-relationship diagram. Entities: papers, paper authors, paper journals, references, reference authors, reference journals, index terms, institutions, arranged as citing entities and cited entities. Links are annotated with cardinalities such as "has many unique", "has one", "appears once in multiple", "appears in one", and "contains multiple unique".]

Other entities: paper year, reference year

Bibliometric entities vs. physical entities

• Paper author: H. G. Small
• Reference author: Small HG
• Reference author: Small H
• Physical author: Henry Small

Bibliometric entities are objects in the paper collection and acquire separate meaning.

Physical entities are objects in the ‘real world’ that correspond to bibliometric entities.
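One way to relate bibliometric author strings to a candidate physical author is to reduce each string to a crude key. The sketch below is an assumption made for illustration, not a method described in the deck: `author_key` is a hypothetical helper that keeps the surname and first initial, so it will also merge distinct people who share both.

```python
import re
from collections import defaultdict


def author_key(name):
    """Reduce 'H. G. Small', 'Small HG', or 'Henry Small' to (surname, first initial)."""
    tokens = re.findall(r"[A-Za-z]+", name)
    if not tokens:
        return ("", "")
    # 'Small HG' style: surname first, then a short all-caps block of initials.
    if (len(tokens) >= 2 and tokens[-1].isupper()
            and len(tokens[-1]) <= 3 and len(tokens[0]) > 1):
        surname, initials = tokens[0], tokens[-1]
    else:
        # 'H. G. Small' or 'Henry Small' style: surname last.
        surname = tokens[-1]
        initials = "".join(t[0] for t in tokens[:-1])
    return (surname.lower(), initials[:1].upper())


# Bibliometric entities from the slide, grouped by candidate physical author.
bibliometric_names = ["H. G. Small", "Small HG", "Small H"]
groups = defaultdict(list)
for name in bibliometric_names:
    groups[author_key(name)].append(name)

print(dict(groups))
```

All three strings map to the same key as `author_key("Henry Small")`, which is the correspondence the slide illustrates. A real disambiguation step would need more evidence, such as affiliations or co-authors.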

[Figure: entity-relationship diagram showing relation of physical entities to bibliometric entities. Bibliometric entities: papers, paper authors, paper journals, references, reference authors, reference journals. Physical entities: physical papers, physical authors, physical journals. Links between the two sets are annotated "corresponds to one", "corresponds to one or none", or "corresponds to multiple".]

Papers citing papers networks

[Figure: paper citation networks: papers citing papers, authors citing authors, journals citing journals.]

• Papers are reports, references are concept symbols: apples citing oranges
• Typically 20 times more references than papers: how to handle?

Bibliographic entities as tokens of research specialty objects

• Paper authors: researchers
• Papers: research reports
• Paper journals: report archives
• References: concept symbols, base knowledge
• Reference authors: base knowledge generators, experts
• Reference journals: base knowledge archives
• Terms: concept symbols

Bibliographic links as tokens of research specialty relations

• Term to paper: term associated with research reported by paper
• Paper journal to paper: journal archives research reported by paper
• Reference author to reference: researcher generated base knowledge represented by reference
• Reference journal to reference: journal archives base knowledge represented by reference
• Paper to reference: research reported used base knowledge represented by reference
• Paper author to paper: researcher participated in research reported by paper

Networks in collections of journal papers

[Figure: two example networks with nodes numbered 1 to 34: a unipartite co-occurrence network and a bipartite network (authors and papers).]

Like entities: entities of same entity-type

Unlike entities: entities drawn from more than one entity-type
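The link between the two kinds of network can be illustrated by projecting a bipartite network of unlike entities (papers and authors) onto a unipartite co-occurrence network of like entities (authors). This is a minimal sketch with made-up data; `project` is a hypothetical helper, and counting shared papers as the link weight is an assumption.

```python
from collections import Counter
from itertools import combinations

# Hypothetical bipartite network: each paper is linked to its authors.
paper_authors = {
    "p1": ["a1", "a2"],
    "p2": ["a2", "a3"],
    "p3": ["a1", "a2", "a3"],
}


def project(bipartite):
    """Return co-occurrence weights between like entities (here, authors).

    Two authors are linked once for every paper on which they both appear.
    """
    weights = Counter()
    for members in bipartite.values():
        for u, v in combinations(sorted(set(members)), 2):
            weights[(u, v)] += 1
    return weights


coauthorship = project(paper_authors)
print(dict(coauthorship))
```

Here authors `a1` and `a2` share two papers, so their co-authorship link has weight 2, while `a1` and `a3` share only `p3`.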

[Figure: network of entity-types: authors, papers, terms, paper journals, reference journals, references, reference authors.]

Networks in collections of journal papers

Cascaded bipartite networks

[Figure: chain of bipartite networks linking reference authors (ar1 to ar4), references (r1 to r7), papers (p1 to p8), and paper authors (ap1 to ap8).]
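One common way to cascade bipartite networks is to multiply their occurrence matrices, so that each entry counts the paths between two entities through the intermediate entity-type. The sketch below assumes that convention; the deck does not state how link weights are computed for indirect networks, and the small matrices are made up.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical direct networks as occurrence matrices.
O_ap_p = [[1, 1, 0],   # paper author ap1 wrote p1 and p2
          [0, 1, 1]]   # paper author ap2 wrote p2 and p3
O_p_r = [[1, 0],       # p1 cites r1
         [1, 1],       # p2 cites r1 and r2
         [0, 1]]       # p3 cites r2
O_r_ar = [[1, 0],      # r1 was written by ar1
          [1, 1]]      # r2 was written by ar1 and ar2

# Indirect networks obtained by cascading the direct ones.
O_ap_r = matmul(O_ap_p, O_p_r)     # paper authors to references
O_ap_ar = matmul(O_ap_r, O_r_ar)   # paper authors to reference authors

print(O_ap_r)
print(O_ap_ar)
```

Entry `O_ap_r[0][0]` is 2 because paper author ap1 cites reference r1 in two papers.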

Occurrence and co-occurrence matrices

[Figure: example matrices for papers and references. O[p;r]: paper-reference occurrence matrix. C[p;r]: bibliographic coupling matrix. C[r;p]: co-citation matrix.]

Each occurrence matrix has two associated co-occurrence matrices.
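The two co-occurrence matrices associated with an occurrence matrix can be sketched as products of the matrix with its transpose. This uses the conventional unnormalized definitions, C[p;r] = O Oᵀ and C[r;p] = Oᵀ O for O = O[p;r]; whether the deck applies any normalization is not stated, so treat the formulas as an assumption. The data are made up.

```python
def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical paper-reference occurrence matrix O[p;r]: 3 papers by 4 references.
O_p_r = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]

# C[p;r]: bibliographic coupling (papers linked by the references they share).
C_p_r = matmul(O_p_r, transpose(O_p_r))
# C[r;p]: co-citation (references linked by the papers that cite both).
C_r_p = matmul(transpose(O_p_r), O_p_r)

print(C_p_r)
print(C_r_p)
```

The diagonal of `C_p_r` holds the number of references in each paper, and the diagonal of `C_r_p` holds the number of times each reference is cited.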

Co-occurrence relations

[Figure: co-occurrence relations among entity-types. Key: primary entity, relative entity, co-occurrence relation; for example papers, references, bibliographic coupling.]

• Bibliographic coupling (papers, by reference)
• Reference co-citation (references, by paper)
• Author co-citation (reference authors, by paper)
• Journal co-citation (reference journals, by paper)
• Co-authorship (paper authors, by paper)
• Term co-occurrence (terms, by paper)
• Paper coupling by term
• Paper coupling by paper journal
• Paper coupling by reference journal
• Paper coupling by paper author
• Paper coupling by reference author

Co-occurrence relations are used to map the structure of a scientific specialty by providing a means to find entity groups through clustering.

Bibliographic co-occurrence links as tokens of unipartite research specialty relations

• Terms: two terms both associated with research reported by a paper
• Papers: two papers associated with similar research
• References: two pieces of base knowledge both used in research reported by a paper
• Papers: two papers’ research both used base knowledge represented by a reference
• Paper authors: two researchers worked together on research reported by a paper
• Reference authors: two researchers both generated base knowledge used by research reported by a paper

Entity feature vectors

[Figure: paper author i highlighted in the paper author to reference author matrix, O[ap;ar], and in the paper author co-occurrence matrix, C[ap;ar], giving the occurrence feature vector Oi[ap;ar] and the co-occurrence feature vector Ci[ap;ar].]

Features are measurable quantities used to characterize entities for pattern recognition and clustering purposes. A feature vector is an array of features used for pattern recognition and clustering.

Each entity has two feature vectors per occurrence matrix.
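A minimal sketch of the two feature vectors for one entity, assuming that the feature vector of entity i is row i of the corresponding matrix and that the co-occurrence matrix is the unnormalized product O Oᵀ. The cosine similarity at the end is one possible way to compare entities for clustering; it is an illustrative choice, not one named in the deck. All values are made up.

```python
import math

# Hypothetical O[ap;ar]: rows are paper authors, columns are reference authors,
# entries count how often the paper author cites the reference author.
O_ap_ar = [
    [2, 1, 0],
    [1, 1, 0],
    [0, 0, 3],
]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def occurrence_vector(occurrence, i):
    """Oi: row i of the occurrence matrix."""
    return list(occurrence[i])


def cooccurrence_vector(occurrence, i):
    """Ci: row i of O times O-transpose, computed without forming the full matrix."""
    return [dot(occurrence[i], row) for row in occurrence]


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


O_0 = occurrence_vector(O_ap_ar, 0)
C_0 = cooccurrence_vector(O_ap_ar, 0)
print(O_0, C_0)
print(cosine(O_ap_ar[0], O_ap_ar[1]), cosine(O_ap_ar[0], O_ap_ar[2]))
```

Paper authors 0 and 1 cite the same reference authors and come out as similar, while paper author 2 shares no sources with either.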

Interpretation of occurrence feature vectors

Examples of occurrence feature vectors for entities in a collection of papers.

Primary entity-type x1 | Secondary entity-type x2 | Feature vector for entity i | Interpretation
paper | reference | Oi[p;r] | a) The concept symbols used by a paper (Small, 1978). b) The knowledge sources used by a paper.
reference | paper | Oi[r;p] | The papers using a reference as a concept symbol.
paper author | paper | Oi[ap;p] | A paper author’s oeuvre.
paper author | reference author | Oi[ap;ar] | The reference authors whose work a paper author reads and uses. An author’s identity (White, 2001).
reference author | paper author | Oi[ar;ap] | The paper authors that read and use a reference author’s work.
paper journal | reference journal | Oi[jp;jr] | The reference journals holding source knowledge used by papers in a paper journal.
reference journal | paper journal | Oi[jr;jp] | The paper journals whose papers draw knowledge from a reference journal.
paper | terms | Oi[p;t] | A paper’s research vocabulary.

Interpretation of co-occurrence feature vectors

Examples of co-occurrence feature vectors for entities in a collection of papers.

Primary entity-type x1 | Secondary entity-type x2 | Feature vector for entity i | Interpretation
paper | reference | Ci[p;r] | The papers that use the same concept symbols as paper i. (Papers covering the same topic as paper i.)
reference | paper | Ci[r;p] | The references being used by the same papers that use reference i. (Exemplar references for the same Kuhnian paradigm as reference i.)
paper author | paper | Ci[ap;p] | The collaborators of paper author i.
paper author | reference author | Ci[ap;ar] | The paper authors using the same knowledge sources as paper author i. Paper author i’s invisible college.
reference author | paper author | Ci[ar;ap] | The reference authors used as knowledge sources by the same paper authors as reference author i. The image of reference author i (White, 2001).
paper journal | reference journal | Ci[jp;jr] | The paper journals using the same sources of knowledge as paper journal i.
reference journal | paper journal | Ci[jr;jp] | The reference journals (sources of knowledge) being used by the same paper journals as reference journal i.
paper | terms | Ci[p;t] | Papers using the same research vocabulary as paper i. (Papers covering the same topic as paper i.)

Bibliographic co-occurrence clusters as tokens of research specialty group objects

[Figure: entity-types (paper authors, paper journals, references, reference journals, terms, papers, reference authors) with cluster labels: research subtopic vocabularies; base knowledge groups, “paradigms”; research front, “papers by topic”; research teams; base knowledge generator groups, “schools of thought”; research team oeuvres; research front library; base knowledge libraries.]

Visualization of matrices

Research front timeline

• Papers clustered by common references to form a hierarchical collection of research fronts
• Papers plotted as circles in track by research front. Circle size is proportional to total times cited; redness is proportional to times cited in the last year.
• Labels manually generated by browsing titles in paper clusters for themes

[Figure: research front timeline. Labels: major sub-specialties; modern toxin research; vaccines and genetics; 1950’s to 1970’s research; detection of anthrax; bioterrorism; historical development.]
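Clustering papers by common references can be sketched as agglomerative clustering on the overlap of their reference lists. The similarity measure (Jaccard), the linkage rule (average linkage), and the stopping threshold below are all assumptions chosen for illustration; the deck does not say which were used. `cluster_papers` and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Share of references two papers have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_papers(paper_refs, threshold):
    """Average-linkage agglomerative clustering of papers by shared references.

    Returns the clusters left when no pair reaches `threshold`, plus the
    merge history, which records the hierarchy.
    """
    clusters = [frozenset([p]) for p in sorted(paper_refs)]
    merges = []

    def similarity(c1, c2):
        pairs = [(p, q) for p in c1 for q in c2]
        return sum(jaccard(paper_refs[p], paper_refs[q]) for p, q in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        sim, i, j = max(
            (similarity(a, b), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters)
            if i < j
        )
        if sim < threshold:
            break
        merges.append((sorted(clusters[i]), sorted(clusters[j]), sim))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges


# Made-up papers and their reference lists.
paper_refs = {
    "p1": {"r1", "r2"},
    "p2": {"r1", "r2", "r3"},
    "p3": {"r7", "r8"},
    "p4": {"r8", "r9"},
}
fronts, merges = cluster_papers(paper_refs, threshold=0.3)
print(sorted(sorted(front) for front in fronts))
```

Each resulting cluster would become one track of the timeline, with the merge history giving the hierarchy among research fronts.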

Research front to reference crossmap

[Figure: crossmap of research fronts against references. Labels: current toxin research; current vaccine research; bioterror; anthrax detection; 1980’s & early ’90’s toxin and vaccine research; early toxin research; early research toxin and vaccine; Dixon reference.]

Reference usage plot

[Figure: reference usage plot. Labels: breakthrough in toxin research (Leppla & Friedlander); key references in toxin research; Brachman study of vaccine efficacy; obsolete early research; old references still current; bioterror detection; early inhalation anthrax research.]

Paper author usage plot

[Figure: paper author usage plot. Labeled authors: Friedlander, Leppla, Turnbull, Wright, Thorne, Collier, Mock. Two groups are marked "no longer active".]

Research front to index terms crossmap

[Figure: crossmap of research fronts against index terms. Labels: toxin terms; toxin expression terms; vaccine terms; bioterror terms.]

Questions

Proposed entity-relationship model for terrorist incident reports

[Figure: entity-relationship diagram. Entities: event report, linguistic terms, incident type, town, district, country, government official, terrorist, personal names, terrorist group, victim, law enforcement officer. Legend: links supplied by analysts or inference; direct links from entity extraction; entities in yellow to be implemented.]

Other entities:
• Event date
• Report date

[Figure: network of numbered group-name strings (1 to 45), including spelling variants and abbreviations such as Lashkar-e-Toiba, Laskhar-e-Toiba, Lashkar; Hizb-ul-Mujahideen, Hizbul Mujahideen, HM; Jaish-e-Mohammed, Jaish-e-Mohammad, Jaish-e-Muhammad, JeM; Harkat-ul-Mujahideen, HuM; All Parties Hurriyat Conference, All Party Hurriyat Conference, APHC. Cluster labels: Lashkar-e-Toiba; Hizb-ul-Mujahideen; Harkat-ul-Mujahideen; Jaish-e-Mohammed; HM; HUJI; All Parties Hurriyat Conf.]