Ben Goertzel AIs, Superflies and the Path to Immortality - singsum au 2011

Transcript of Ben Goertzel AIs, Superflies and the Path to Immortality - singsum au 2011

Dr. Ben Goertzel

CEO, Novamente LLC and Biomind LLC

CTO, Genescient Corp

Adjunct Research Professor, Xiamen University, China

Vice Chairman, Humanity+

Advisor, Singularity University and Singularity Institute

AIs, Superflies and the Path to Immortality

For an earlier, textual treatment of some of these themes, see the article

“AIs, Superflies and the Path to Immortality”

in H+ Magazine, hplusmagazine.com

Also check out:

• genescient.com

• biomind.com

• http://code.google.com/p/openbiomind/

• opencog.org

1. Why Biology, Biopharma & Longevity Research Need AI & AGI

2. OpenBiomind: OSS Machine Learning for Genomics

3. Understanding Longevity via AI-based Analysis of Genescient’s Long-lived Flies

4. OpenCog and the Path to Advanced AGI

1. Why Biology, Biopharma & Longevity Research Need AI & AGI

why AI?

the human body can be effectively understood, for many purposes, as a very complex machine

genomics and experimental evolution, together, give us fantastic data about the operation of this machine

human minds struggle to understand this data

with the help of AI we can do better -- and more rapidly and dramatically improve human health and increase human healthspan

as well as analyzing data already obtained, AI can help pose new experiments, leading to the generation of new and better data -- a virtuous cycle

biology’s big challenge

• biological systems operate based on complex, multi-level, self-organizing networks

• as modern instrumentation probes these networks ever more thoroughly, the collective intelligence of human scientists proves ever less adequate to understand the data collected

• human brains are adapted for analyzing the sense-data relevant to “caveman” goals – not for analyzing complex biological datasets

• the amount of quantitative, relational and textual biological data currently online far exceeds the capacity of any human to comprehend

pharma’s big challenge

• new therapeutics are badly needed, but development is too costly

• each new drug approved by the US FDA is estimated to cost between $802 million (Tufts) and $1.2 billion (Bain) to develop

• current pharma research methods, focused on specific single targets, are poorly suited to address the complexity of the biological networks underlying complex diseases like those related to longevity

longevity research’s big challenges

• specific mechanisms like the Hayflick limit or recursively accelerating DNA damage only account for a small percentage of age-associated disease and death

• damage repair approaches like SENS may struggle to cope with side-effects ensuing from biological complexity

• cross-species data analysis strongly suggests that most age-associated disease and death is due to “antagonistic pleiotropy” – destructive interference between adaptations specialized for different age ranges. The result is that death rate increases through old age, and then stabilizes at a high constant rate in late life

• increased healthspan relies on thoroughgoing changes in multiple interlocked networks, not centrally on any specific genes or pathways

What does “AI” Really Mean?

AGI: “the ability to achieve complex goals in complex environments using limited computational resources”

• autonomy

• understanding of self and others

• solving new types of problem, unanticipated by the system’s programmers

Narrow AI: “software that can solve particular problems whose solutions humans consider to require intelligence”

• example: machine learning bioinformatic data analysis software like OpenBiomind, which can see data patterns no human can

• more examples: Google, Deep Blue, DARPA Grand Challenge

Artificial General Intelligence versus Narrow AI

OpenCog: An Open Source Software Framework & A Design & Vision for Advanced AGI

• 2011-2012: A Proto-AGI Virtual Agent in a Videogame type world

• 2013-2014: A Complete, Integrated Proto-AGI Mind ... virtual world + humanoid robot

• 2015-2016: Advanced Learning and Reasoning

• 2017-2018: AGI Experts: biology, finance, service robotics,???

• 2019-2021: Full-On Human Level AGI

• 2021-2023: Advanced Self-Improvement

Extremely tentative schedule, assuming the design/theory is basically right and funding is adequate

Biomind LLC: advanced “narrow AI” for postgenomic bioinformatics

• leveraging and extending a large, relevant academic literature

• extending standard “machine learning” approaches via ensemble methods that find multiple patterns in biological datasets and study the meta-patterns connecting them

• Biomind’s machine learning approach found the first evidence of a genetic basis for Chronic Fatigue Syndrome, in mutations in genes associated with neural function (collaboration with CDC)

• near-100% accurate diagnostics for Parkinson’s and Alzheimer’s based on heteroplasmic mutations in mitochondrial DNA (collaboration with UVA Health System)

both narrow AI and AGI have massive potential to help biology and pharma

AGI scientists will one day put human scientists out of business

… but until that day, our best strategy is to allow AI and bioinformatics to advance hand in hand

• applying best-of-breed narrow AI and proto-AGI technology to understand biological data

• allowing bioscience requirements to help guide the path to human-level AGI and beyond

How general is human general intelligence?

Truly general intelligence requires infeasibly much computing power (e.g. AIXI)

Real-world intelligence is biased toward certain classes of goals and environments

Intelligent agents embodied in everyday human situations are more likely to have humanlike intelligence (biases)

“Artificial scientists” may benefit from having different biases & capabilities than humans

2. OpenBiomind: OSS Machine Learning for Genomics

openbiomind – open-source AI for postgenomic bioinformatics

Finding nonlinear combinations of genes, mutations or clinical indicators that are associated with diseases, toxic reactions, symptoms, or other phenotypic qualities

• unique ensemble-based machine learning methods, focused on genetic programming (GP)

• portions integrated into NIH-NIAID’s ImmPort portal for immunological data analysis

• customized for microarray and SNP data, also more broadly applicable

supervised classification for bio data analysis

Often more statistically meaningful than clustering – and allows one to do clustering of features based on whether they’re used in the same categorization models

The researcher must divide the data into two or more categories, e.g.

– Case vs. Control

– Early vs. Late (in a time series experiment)

– Multiclass categorization: which kind of cancer?

Algorithms learn rules (“models”) that predict which category a microarray gene expression profile falls into, via combining expression values in an automatically learned mathematical formula

Many supervised categorization algorithms exist, each with strengths and weaknesses

Unlike with clustering, a choice may be made largely based on rigorous validation methodology

• Decision trees

• Neural networks

• Logistic regression

• Support vector machines

• Genetic programming

• Etc.
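One way to make "rigorous validation methodology" concrete is k-fold cross-validation: each candidate algorithm is trained on part of the data and scored on the held-out remainder, and the mean held-out accuracy is compared. The sketch below is illustrative only and is not taken from OpenBiomind; the two toy classifiers, the function names and the synthetic Case/Control data are all assumptions made for the example.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(train_fn, X, y, k=5, seed=0):
    """Return the mean held-out accuracy of models built by train_fn."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [x for i, x in enumerate(X) if i not in held]
        train_y = [t for i, t in enumerate(y) if i not in held]
        model = train_fn(train_X, train_y)
        correct = sum(model(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores)

def train_majority(X, y):
    """Baseline: always predict the most common training label."""
    majority = max(set(y), key=y.count)
    return lambda x: majority

def train_threshold(X, y, feature=0):
    """Toy classifier: cut one feature at the midpoint of the class means."""
    case = [x[feature] for x, t in zip(X, y) if t == "Case"]
    ctrl = [x[feature] for x, t in zip(X, y) if t == "Control"]
    cut = (mean(case) + mean(ctrl)) / 2
    high = "Case" if mean(case) > mean(ctrl) else "Control"
    low = "Control" if high == "Case" else "Case"
    return lambda x: high if x[feature] > cut else low

# Synthetic, well-separated data (hypothetical values, one feature per sample).
X = [[5.0 + i / 10] for i in range(10)] + [[1.0 + i / 10] for i in range(10)]
y = ["Case"] * 10 + ["Control"] * 10

acc_threshold = cross_validate(train_threshold, X, y)
acc_baseline = cross_validate(train_majority, X, y)
print("threshold:", acc_threshold, "baseline:", acc_baseline)
```

Comparing every algorithm against a trivial baseline on the same folds helps show whether a high accuracy reflects real structure in the data.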

Classification models may be used as diagnostic rules

Classification models may be studied to yield intuitive insight – particularly interesting in the case of model ensembles

if (NM_005110 + NM_001614) / NM_002230 - 0.3 * NM_002297 > 1

then Case

else Control

Example classification model learned via genetic programming algorithm from gene expression data
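The example rule above can be applied directly to an expression profile. A minimal sketch, assuming the profile is a dictionary keyed by the transcript IDs in the rule; the function name and the expression values are made up for illustration.

```python
def classify(expr):
    """Apply the example rule to a dict mapping transcript ID -> expression value."""
    score = (
        (expr["NM_005110"] + expr["NM_001614"]) / expr["NM_002230"]
        - 0.3 * expr["NM_002297"]
    )
    return "Case" if score > 1 else "Control"

# Hypothetical expression values, for illustration only.
sample_a = {"NM_005110": 4.0, "NM_001614": 2.0, "NM_002230": 2.0, "NM_002297": 1.0}
sample_b = {"NM_005110": 1.0, "NM_001614": 1.0, "NM_002230": 4.0, "NM_002297": 2.0}

print(classify(sample_a), classify(sample_b))
```

Because the rule combines four transcripts nonlinearly (a ratio plus a weighted term), no single transcript needs to separate the classes on its own.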

inference and ontologies for enhancing feature vectors

example: high accuracy model predicting if a human has prostate cancer

“important features” analysis

Given a classification model ensemble, one can list the features that occur in the greatest number of models

These are NOT necessarily the same features that provide the greatest differentiation between the two categories, considered individually
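The counting step can be sketched in a few lines. This is an illustrative sketch rather than OpenBiomind's implementation: each model is represented simply as the set of features it uses, and the gene names are hypothetical.

```python
from collections import Counter

# Hypothetical ensemble: each model is the set of features it uses.
ensemble = [
    {"gene_A", "gene_B"},
    {"gene_A", "gene_C"},
    {"gene_A", "gene_B", "gene_D"},
    {"gene_C", "gene_E"},
]

def important_features(models, top=3):
    """Rank features by the number of models they occur in (ties broken by name)."""
    counts = Counter(feature for model in models for feature in model)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top]

ranked = important_features(ensemble)
print(ranked)
```

A feature can rank highly here because it is useful in combination with others, even if it separates the two categories poorly by itself.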

clustering based on category model utilization

3. Understanding Longevity via AI-based Analysis of Genescient’s Long-lived Flies

The “Holy Trinity” of 21st Century Medicine: Genomics, Experimental Evolution, AI

Genescient, Biomind, UCI -- collaborative work combining experimental evolution, genomics and AI

Michael Rose’s lab at UCI has evolved a host of fly populations selected for various phenotypic characters

A subset of these flies have been spun out to Genescient Corp. -- these are “Methuselah flies” that live 4-5 times as long as normal flies of the same species

The capability is in place to rapidly evolve new fly populations selected for phenotypic characters identified with the aid of AI analysis of the genomics of the existing populations

Methuselah flies

• These super-flies have greater total fecundity, much longer sex lives, increased athletic performance (flying), and increased ability to survive acute stresses (starvation, desiccation, toxins), & normal metabolic

• As such, we take them to be an appropriate model for extended healthspan; they now live nearly 5x as long as their controls

Methuselah flies have extremely strong hearts

• Running current through fly body to accelerate heart-rate, often to the point of failure

• After a 2-minute recovery time, Methuselah (O) populations had a significantly lower percentage of heart failure than controls (B) (p < 0.05, χ² test)

Genescient’s proposed solution to pharma’s big problem

• use fundamental understanding of the genomics underlying aging and aging-associated diseases, to arrive at a rational understanding of which substances are most worthy of test

• rapid substance testing in the fly model, followed by testing in mouse and human

• longer-term: initiate a “virtuous cycle” involving repeated cycles of experimental-evolution experiments and advanced AI data analysis

some of our discoveries about the Methuselah flies using Biomind AI technology

• Biomind’s machine learning algorithms applied to Genescient (expression and SNP) data, indicate that there are a few dozen key aging-associated genes that affect lifespan dramatically – with many other genes also playing significant roles

• hubs of the genetics underlying longevity have been isolated and their interactions limned

• multiple drugs and GRAS substances have been identified, acting on gene combinations associated with longevity and various age-associated diseases

gene expression analysis: 2009-present

• Samples of Methuselah (O) and ordinary (B) flies compared using Affymetrix gene expression profiling

• Genes with significantly differential expression identified

• Comparison with databases to determine human orthologs

• Comparison with WTCCC human SNP dataset

• Machine learning data analysis to discover combinations, networks

Genetic programming for classification; mutual information to find network hubs

• Comparison with DrugBank database of gene/substance mappings
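The "mutual information to find network hubs" step can be illustrated as follows. This is a sketch under stated assumptions, not Biomind's actual pipeline: expression is assumed to be discretized to high (1) / low (0), a gene's hub score is taken to be the sum of its mutual information with every other gene, and the gene names and profiles are invented.

```python
from collections import Counter
from itertools import combinations
from math import log2

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def hub_scores(profiles):
    """Sum each gene's mutual information with every other gene."""
    scores = {gene: 0.0 for gene in profiles}
    for a, b in combinations(profiles, 2):
        mi = mutual_information(profiles[a], profiles[b])
        scores[a] += mi
        scores[b] += mi
    return scores

# Hypothetical discretized expression across 8 samples (1 = high, 0 = low).
profiles = {
    "gene_A": [1, 1, 0, 0, 1, 0, 1, 0],
    "gene_B": [1, 1, 0, 0, 1, 0, 1, 0],  # identical to gene_A
    "gene_C": [0, 0, 1, 1, 0, 1, 0, 1],  # exact inverse of gene_A
    "gene_D": [1, 0, 1, 0, 1, 0, 1, 0],  # only weakly related to the others
}

scores = hub_scores(profiles)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

Mutual information treats the inverse profile (gene_C) as just as informative as the identical one (gene_B), which is why it can pick up relationships that a simple positive correlation would miss.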

novel gene/disease relationships, obtained by correlating Methuselah fly genetics with public human SNP dataset

some existing drugs related to longevity, based on correlating Methuselah fly genetics with DrugBank

some existing supplements that act on proteins identified by our analysis as particularly important for Methuselah fly longevity

• selenium

• vitamin E

• estradiol

• sodium selenite

• valproic acid

• quercetin

• calcitriol

• genistein

• resveratrol

• zinc

• folic acid

• isoflavones

sequence data from the Methuselah fly genome is currently undergoing AI analysis

• seems to be giving us dramatically more insight into the genetic patterns underlying longevity

• Illumina whole genome resequencing of genomic DNA from the 5 long-lived (O) and 5 control (B) populations

• We can accurately estimate allele frequencies in each population and identify SNPs that are highly diverged in allele frequency. This allows us to identify the SNPs that make the O flies live a long time.

~150 B alleles

~2 million SNPs

sequencing of the Methuselah flies and controls

NYT article discussing a recent Nature paper (co-authored by Molly Burke, Michael Rose and Anthony Long), based on gene sequence analysis of a similar fly population. Genescient’s Methuselah flies preliminarily appear to display qualitatively similar phenomena.

The panels (top to bottom) are chromosomes: X, 2L, 2R, 3L, 3R, tiny 4. The "x" axis is position along the chromosome. The "y" axis is -log10(p-value).

The three lines are: black -- a Fisher exact test differentiation between {pooled} B's (control) and O's (Methuselah); red -- chi-square test for allele frequency differentiation with the B's; green -- like the red, but for O's.
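For a single SNP, the black line's quantity can be sketched as a two-sided Fisher exact test on a 2x2 table of allele counts (pooled B vs. pooled O, reference vs. alternate allele), reported as -log10(p). The code below is a minimal standard-library sketch; the allele counts are invented, and the actual analysis pipeline is not described in the slides.

```python
from math import comb, log10

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of seeing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)

# Hypothetical allele counts: rows are pooled B and pooled O, columns are ref/alt.
p_diverged = fisher_exact_two_sided(18, 2, 3, 17)   # frequencies differ strongly
p_same = fisher_exact_two_sided(10, 10, 10, 10)     # identical frequencies

print("diverged SNP: -log10(p) =", -log10(p_diverged))
print("undiverged SNP: p =", p_same)
```

Plotting -log10(p) against chromosome position, one point per SNP, gives a plot of the kind the caption describes; tall peaks mark regions where allele frequencies differ most between B and O.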

preliminary results from sequence analysis

• “Soft sweep” phenomenon is observed -- there are many, many changes in SNP frequency, all across the genome

• But still: some frequency changes are more important than others!

• Can find (using Genetic Programming) dozens of rules distinguishing Methuselah from ordinary flies with 100% accuracy, each rule using SNPs in the close vicinity of 2-3 genes

• Many of these genes closely interact, hinting at a central “influence network” underlying aging & longevity

• Can find rules distinguishing Methuselah from ordinary flies with >90% accuracy using only SNPs near a handful of genes with human homologues and known relationship to neurological function

... or, alternately, cardio function

... or, alternately, immune function

potential implications for drug discovery

Understanding of Networks

Novel Combinational Therapies
