Post on 20-Sep-2020

(c) Jason H. Moore 1

Accessible Artificial Intelligence for Data Science

Jason H. Moore, PhD, FACMI

Edward Rose Professor of Informatics

Director, Institute for Biomedical Informatics

Senior Associate Dean for Informatics

Perelman School of Medicine

University of Pennsylvania

Philadelphia, PA, USA

@moorejh epistasis.org jhmoore@upenn.edu

Atari and the University of Utah

DEC PDP-1 ~ 1960

Spacewar! by Steve Russell (MIT) ~ 1962

Pong by Nolan Bushnell (Atari) ~ 1972

Biomedical Informatics

Basic Science

• Bioinformatics

• Computational Biology

Clinical

• Clinical Informatics

• Clinical Research Informatics

• Consumer Health Informatics

Population

• Public Health Informatics

American Medical Informatics Association (AMIA)

Golden Era of Biomedical Informatics

Moore and Holmes, BioData Mining (2016)

Why?

• Big data

• High-performance computing

• Talented trainees

• Government recognition

• Industry recognition

• Patient recognition

• University investment

Golden Era of Biomedical Informatics

What next?

• Artificial intelligence

• Biomedical devices

• Data integration

• Data science

• Informatician scientists

• No-boundary thinking

• Visual analytics

Artificial Intelligence

History

Artificial Intelligence in Medicine

[Figure: timeline by decade, 1960 to 2020]

Artificial Intelligence

• Computers that plan, solve problems, and reason

• 1950s – Alan Turing – “Can machines think?”

Turing Test

Artificial Intelligence

• Top-down AI: Build a machine that mimics the mind

• Bottom-up AI: Neural networks, cellular systems as building blocks for intelligent machines

• The term "artificial intelligence" was coined at Dartmouth College in 1956

• "Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it."

Shortliffe’s MYCIN – 1970s

[Figure: MYCIN architecture. Components: Patient Data, Rules, Consultation System, Explanation System, Knowledge Acquisition]

Shortliffe’s MYCIN

Never used in clinical practice

• Ethical and legal issues

• Standalone system

• Required physician entry (>30 mins)

Shortliffe’s MYCIN Book (1976)

Artificial Intelligence

Today

IBM Watson – 2010s

• Natural language processing

• Information retrieval

• Knowledge representation

• Automated reasoning

• Machine learning

• 200M pages of content (4 TB)

• 2,880 processor threads

• 16 TB RAM

• No internet connection

2010

IBM Watson – 2010s

Artificial Intelligence

For Data Analytics

Data Analytics Pipeline

[Figure: pipeline stages D, DI, FS, FC, ML, I, V, A]

Big Data

D

http://www.kdnuggets.com/

Data Integration

DI

Relational Database, Graph Database

Michael Hunger – Neo4j

Feature Selection

FS

Ritchie – PLoS Genetics (2013)

Sohangir – J Soft Engin App (2013)

Feature Construction

FC

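[Figure: feature construction. A 3 x 3 grid of cells numbered 1 to 9 appears alongside the row 1 2 3 4 5 6 7 8 9. Pairs of features coded 0, 1, 2 (X1 and X2, X3 and X4, X5 and X6) are each mapped through a 3 x 3 grid of 0/1 cell values to a constructed feature (Z1, Z2, Z3). Grids shown: 0 0 0 / 0 1 1 / 0 1 1 and 1 0 1 / 0 1 0 / 1 0 1.]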
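
The feature construction idea on this slide, where two three-level features are combined into one new feature through a lookup grid, could be sketched roughly as below. This is only an illustration: the function name `construct_feature`, the sample values, and the pairing of this particular grid with X1 and X2 are assumptions, not details taken from the talk.

```python
# Hypothetical 3 x 3 lookup grid (an assumption for illustration).
# Rows index the first feature (0, 1, 2); columns index the second.
GRID = [
    [0, 0, 0],
    [0, 1, 1],
    [0, 1, 1],
]


def construct_feature(xa, xb, grid):
    """Map each (xa, xb) pair of 0/1/2 codes to the grid cell it falls in."""
    if len(xa) != len(xb):
        raise ValueError("both features must have the same number of samples")
    return [grid[a][b] for a, b in zip(xa, xb)]


# Made-up sample values for two three-level features.
x1 = [0, 1, 2, 1, 2]
x2 = [0, 0, 1, 2, 2]

z1 = construct_feature(x1, x2, GRID)
print(z1)  # [0, 0, 1, 1, 1]
```

The constructed feature can then be passed to any downstream learner in place of the two original features.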

Machine Learning

ML

http://suanfazu.com/

Statistical and Biological Interpretation

I

Biological Validation

V

Talbot, Zebrafish (2014) dev.biologists.org

Clinical Application

A

M.D. Anderson

Why Artificial Intelligence?

[Figure: candidate analysis pipelines built from different combinations of Importance, PCA, Polynomial, DT, RF, and LR steps]
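
The point of this slide, that many different pipelines can be assembled from the same building blocks and someone has to choose among them, can be illustrated with a small scikit-learn sketch. Everything specific here is an assumption made for illustration: the synthetic data, the choice of `SelectKBest` and `PolynomialFeatures` as the selection and construction steps, and the reading of LR and RF as logistic regression and random forest.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data standing in for a real biomedical data set (assumption).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)


def build_pipeline(model):
    """Feature selection, then feature construction, then a classifier."""
    return Pipeline([
        ("select", SelectKBest(f_classif, k=5)),
        ("construct", PolynomialFeatures(degree=2, include_bias=False)),
        ("model", model),
    ])


# Two candidate learners; each yields a different pipeline to evaluate.
candidates = {
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Compare the candidate pipelines by 5-fold cross-validated accuracy.
scores = {
    name: cross_val_score(build_pipeline(model), X, y, cv=5).mean()
    for name, model in candidates.items()
}

for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

Even with only two learners and one choice per step, the analyst already has to compare alternatives by hand; the number of options grows quickly as more selection, construction, and modeling methods are added.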

Accessible Artificial Intelligence

PennAI

AI should be open, easy, and accessible

http://scikit-learn.org/

https://kaixhin.github.io/FGLab/

Controller: Future Gadget Lab

Database: MongoDB

ML Results -> Knowledge

Penn Machine Learning Benchmarks (PMLB)

BioData Mining 10:36 (2017)

Penn Machine Learning Benchmarks (PMLB)

Pacific Symposium on Biocomputing (2018)

Acknowledgments

• PennAI Team: Josh Cohen, Weixuan Fu, Paul Kopec, Bill La Cava, Randy Olson, Moshe Sipper, Sharon Tartarone, Heather Williams

• NIH grants R01s AI11679, LM012601, LM010098, UC4 DK112217

• jhmoore@upenn.edu

• epistasis.org, epistasisblog.org

• twitter.com: @moorejh

• PennAI.org