Introduction of New Paradigms to Materials Science ... · The Novel Materials Discovery (NOMAD)...

7/5/2017

Introduction of New Paradigms to Materials Science

1st paradigm: Empirical science. Experiments.

2nd paradigm: Model-based theoretical science. Laws of classical mechanics, electrodynamics, thermodynamics, quantum mechanics.

3rd paradigm: Computational science (simulations). Density-functional theory, molecular dynamics.

4th paradigm: Big-Data driven science. Machine learning, compressed sensing, relationship mining, anomaly detection.

[Figure: scientific knowledge versus starting times of the paradigms; year axis marked 1600, 1960, 2010.]

ΔU = Q - W (change in internal energy = heat added to system - work done by system)

The Novel Materials Discovery (NOMAD) Laboratory maintains the largest Repository for input and output files of all important computational materials science codes.

From its open-access data it builds several Big-Data Services helping to advance materials science and engineering.

Watch a 3-minute summary on the NOMAD Laboratory CoE

NOMAD Scope and Overview

Data is the raw material of the 21st century.

Surprisingly, extreme-scale aspects of Big-Data are very much underexplored in materials science and engineering, one reason being that ‘towards exascale’ computing initiatives typically focus on standard hardware and software challenges. Clearly, much of the value of high-throughput calculations is wasted without deeper Big-Data driven analysis of the results. This is the extreme-scale computing challenge addressed by NOMAD.


https://youtu.be/yawM2ThVlGw

The NOMAD Scope

Currently, NOMAD holds more than 30 million total-energy calculations, and the amount is rapidly increasing.

Additional high-throughput calculations are needed to create important “missing data”.


https://repository.nomad-coe.eu also described at youtube.com

The NOMAD Laboratory: A European Centre of Excellence

There are 30-40 important codes used in computational materials science.

Nomenclature, data representation, and file formats of the input and output files of these codes are different. The heterogeneity could hardly be worse.

https://repository.nomad-coe.eu/

https://youtu.be/UcnHGokl2Nc

The NOMAD Repository accepts and requests input and output files of all important codes. Currently, the NOMAD Repository contains more than 30 million total-energy calculations.


The NOMAD Archive stores - in a code-independent format - calculations performed with all the most important and widely used electronic-structure and force-field codes.

Summary statistics of the Archive content, updated June 1st, 2017:

[Figure: Archive counts by category: total-energy calculations, bulk crystals, surfaces, molecules, different geometries, chemical compositions, band structures, phonon calculations.]


The NOMAD Archive

90% of the VASP files are from AFLOWlib and OQMD.

[Figure: number of geometries per composition (axis ticks 1, 100, 10,000) and number of compositions (axis ticks 1, 1,000, 1 Mio).]

The Big-Data Challenge

Until recently: query and read out what was stored; high-throughput screening.

The four V's:

Volume (amount of data),

Variety (heterogeneity of form and meaning of data),

Velocity at which data may change or new data arrive,

Veracity (uncertainty of quality).


The Big-Data Challenge

(Big) Data of materials not only provides direct information; the data is also structured.

How can we find (and understand) this structure?

The Big-Data Challenge of Computational Materials Science

Code-independent representation of the computed properties

• Unique classification and labeling of the data

• Data normalization

Generic and code-specific metadata

Veracity

Analyze the influence of xc functionals, basis sets, pseudopotentials, force-fields, various other approximations …

Define error bars and confidence levels


Metadata for Computational Materials Science

Purposes of metadata
• When storing “key”-“value” pairs, metadata is the “key”. E.g. “program_name” is a metadata and “VASP” is a possible “value”.
• Describe and organize data
• Keep track of relations between data

section_run
    program_name      FHI-aims
    program_version   081912
    section_system
        simulation_cell   [[1.4e-9 ...]]
        atom_positions    [[0.0,....]...]
        atom_labels       ["Cu",...]
    section_method
        basis_set         fhi_aims_tight
        XC_method         DFT_GGA_PBE
    section_single_configuration_calculation
        ...

Structures and names: Metadata
Values: Data

SI Units:
• length: m
• energy: J
• …

Nested sections (hierarchical tree structure)
References to other sections


At this point, NOMAD has defined 434 metadata for the code-independent representation and 1,736 code-specific “keys”.
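The key-value idea with nested sections can be pictured with a plain nested Python dict. This is only a minimal sketch: the section and key names come from the slide, while the nesting of the sub-sections under section_run, the atom position value, and the helper name `flatten` are illustrative assumptions and not NOMAD's actual implementation.

```python
# Hypothetical in-memory picture: dict keys play the role of metadata,
# the leaves hold the data. Values marked "illustrative" are made up.
archive = {
    "section_run": {
        "program_name": "FHI-aims",
        "program_version": "081912",
        "section_system": {
            "atom_positions": [[0.0, 0.0, 0.0]],  # illustrative, in metres (SI)
            "atom_labels": ["Cu"],
        },
        "section_method": {
            "basis_set": "fhi_aims_tight",
            "XC_method": "DFT_GGA_PBE",
        },
    }
}


def flatten(section, prefix=""):
    """Walk the nested sections and return {path: value} for every leaf."""
    flat = {}
    for key, value in section.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):  # a nested section: descend
            flat.update(flatten(value, path))
        else:  # a key-value pair holding data
            flat[path] = value
    return flat


flat = flatten(archive)
```

Each path in `flat`, such as "section_run/section_method/XC_method", makes the position of a key in the hierarchical tree explicit.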


Storing and Exchanging Data

With the NOMAD Metadata we have a conceptual model, flexible enough to be useful and support several file formats:

JSON: human readable & web ready

HDF5: indexed and optimized for multidimensional arrays

Parquet: a modern format for Big-Data storage (immutable and efficient columnar storage for hierarchical data)
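As a rough illustration of the JSON option only, the sketch below writes a small nested record with Python's standard json module and reads it back. The record content is a placeholder built from key names on the metadata slide; HDF5 and Parquet need third-party libraries and are not shown.

```python
import json

# Illustrative record; the key names follow the metadata example,
# the selection of keys is an assumption made for brevity.
record = {
    "section_run": {
        "program_name": "FHI-aims",
        "section_method": {"XC_method": "DFT_GGA_PBE"},
    }
}

text = json.dumps(record, indent=2, sort_keys=True)  # human-readable text
restored = json.loads(text)  # parse back into nested dicts
```

The round trip returns the same nested structure, which is what makes a text format like JSON convenient for exchange between tools.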


Si Lattice Parameter Calculated with DFT-PBE

Fig. 1. Historical evolution of the predicted equilibrium lattice parameter for silicon. All calculations within the DFT-PBE framework. Values from … (15, 16, …) are compared with (i) predictions from the different codes used in this study (2016 data points, magnified in … or calculations with lower numerical settings) and (ii) the … 0 K and zero-point effects (red line).

Science 351, aad3000 (2016)


Δ-Value Project in Materials Science

Science 351, aad3000 (2016)

Just the first step: bulk crystals and PBE xc functional

An exhaustive test set (71 elemental solids)

Test methods/codes (pseudopotentials, relativistic treatment, etc.) and numerical accuracy (basis sets, k-points, etc.)


Comparing Solid-State DFT Codes, Potentials, and Basis Sets (K. Lejaeghere et al.)

Code              Version              Basis                      Electron treatment                               Δ-value (meV/atom)
WIEN2k            13.1                 LAPW/APW+lo                all-electron                                     0
FHI-aims          081213               tier2 numerical orbitals   all-electron (relativistic atomic_zora scalar)   0.2
Exciting          development version  LAPW+xlo                   all-electron                                     0.2
Quantum ESPRESSO  5.1                  plane waves                SSSP Accuracy (mixed NC/US/PAW potential lib.)   0.3
Elk               3.1.5                APW+lo                     all-electron                                     0.3
VASP              5.2.12               plane waves                PAW 2015                                         0.4
FHI-aims          081213               tier2 numerical orbitals   all-electron (relativistic zora scalar 1e-12)    0.4
CASTEP            9.0                  plane waves                OTFG CASTEP 9.0                                  0.5

https://molmod.ugent.be/deltacodesdft

Science 351, aad3000 (2016)

slide from S. Cottenier
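The slides do not spell out how a Δ-value is computed. As commonly described for the Δ-gauge, it is the root-mean-square difference between two energy-versus-volume curves E(V) over an interval of ±6 % around the equilibrium volume V0. The sketch below follows that reading under stated assumptions: the two harmonic toy curves and all numbers are made up, a single fixed V0 is used for simplicity, and the function name `delta_value` is hypothetical.

```python
import numpy as np


def delta_value(energy_a, energy_b, v0, n_points=2001):
    """RMS difference of two E(V) curves over the interval V0 +/- 6 %."""
    volumes = np.linspace(0.94 * v0, 1.06 * v0, n_points)
    diff_sq = (energy_b(volumes) - energy_a(volumes)) ** 2
    # Trapezoidal rule, written out explicitly.
    integral = np.sum(0.5 * (diff_sq[1:] + diff_sq[:-1]) * np.diff(volumes))
    return np.sqrt(integral / (0.12 * v0))


# Toy harmonic curves E(V) = k/2 * (V - V_min)^2, energies in eV/atom,
# volumes in cubic angstrom per atom. All numbers are illustrative.
def curve_a(v):
    return 0.5 * 0.010 * (v - 20.0) ** 2


def curve_b(v):
    return 0.5 * 0.010 * (v - 20.1) ** 2


delta_mev = 1000.0 * delta_value(curve_a, curve_b, v0=20.0)  # in meV/atom
```

Two codes that produce the same E(V) curve give Δ = 0; a small shift of the equilibrium volume, as in the toy curves, already produces a Δ of a fraction of a meV/atom, which is the scale of the values in the table.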


https://nomad-coe.eu

Request by colleagues from industry

Give guidelines about …

Date for special issues missing

Analyzing and estimating error bars from high-accuracy references


(Big)-Data Analytics

(Big) Data of materials not only provides direct information; the data is also structured.

How can we find (and understand) this structure?

(*) Work performed in collaboration with

Luca Ghiringhelli, Jan Vybiral, Claudia Draxl, Mario Boley,

Bryan Goldsmith, Runhai Ouyang, et al.

o crystal-structure prediction

o figure of merit of thermoelectrics (as function of T)

o turn-over frequency of catalytic materials (as function of T and p)

o efficiency of photovoltaic systems

o etc.

Dmitri Mendeleev (1834-1907)

From the periodic table of the elements to a chart (a map) of materials: Organize materials according to their properties and functions.

Learning Descriptors for Materials-Science (Big) Data


A map of materials

Crystal-Structure Prediction

Classification: “Zincblende/Wurtzite or Rocksalt?”

82 octet AB binary compounds

Only DFT-LDA: Can we predict not-yet-calculated LDA structures from Z_A and Z_B?

We need to arrange the data such that statistical learning is efficient. We need a good set of descriptive parameters.

[Figure: structure map spanned by the descriptors d1 and d2, with a region labeled RS and a point marked “?”.]

How to find d1, d2? In reality the representation will be higher than 2-dimensional.


Find Structure in Big Data That Is A Priori “Not Visible”

Data Fitting, Statistical Learning, Machine Learning

[Figure: two panels, target property versus calculation number and target property versus descriptor D.]

Arrange/organize materials with respect to a property and a set of simple descriptive parameters (a descriptor).

The descriptor can be designed: Rupp, von Lilienfeld, Behler, Csanyi, Seko, Tsuda, Tanaka, …

The descriptor can be selected out of a large set of candidates: Ozolins, Ghiringhelli, Ouyang.

More data means a better representation. Will we ever have enough data?


Kernel Regression

We have data {P_i} at “coordinates” {x_i}, where x_i is a set of descriptive parameters (descriptor).

P_i = P(x_i) = \sum_{k=1}^{N} c_k K(x_i, x_k)

Linear regression: K(x_i, x_k) = x_i \cdot x_k, so that P(x_i) = x_i \cdot c^*

Polynomial kernel: K(x_i, x_k) = (x_i \cdot x_k + c)^d

Gaussian kernel: K(x_i, x_k) = \exp( -\sum_j (x_{i,j} - x_{k,j})^2 / (2\sigma_j^2) )
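A minimal NumPy sketch of kernel regression with the Gaussian kernel, P(x) = sum_k c_k K(x, x_k), may make the formulas concrete. The toy data (one descriptive parameter, target sin(x)), the kernel width, the function names, and the small ridge term added for numerical stability are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(xa, xb, sigma):
    """K(x_i, x_k) = exp(-sum_j (x_ij - x_kj)^2 / (2 sigma_j^2))."""
    diff = (xa[:, None, :] - xb[None, :, :]) / sigma
    return np.exp(-0.5 * np.sum(diff ** 2, axis=2))


def fit_coefficients(x_train, p_train, sigma, ridge=1e-8):
    """Solve (K + ridge * I) c = P for the expansion coefficients c_k."""
    k = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(k + ridge * np.eye(len(x_train)), p_train)


def predict(x_new, x_train, coeffs, sigma):
    """Evaluate P(x) = sum_k c_k K(x, x_k) at the new points."""
    return gaussian_kernel(x_new, x_train, sigma) @ coeffs


# Toy data: N = 10 points, one descriptive parameter, target P = sin(x).
x_train = np.linspace(0.0, 3.0, 10)[:, None]
p_train = np.sin(x_train[:, 0])
sigma = np.array([0.5])

coeffs = fit_coefficients(x_train, p_train, sigma)
p_fit = predict(x_train, x_train, coeffs, sigma)
p_mid = predict(np.array([[1.5]]), x_train, coeffs, sigma)
```

With one Gaussian per data point the fit passes essentially through every training value, which is exactly why the quality of the descriptor, not the flexibility of the fit, decides whether predictions away from the data are meaningful.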


… the Gaussian Kernel

“With five Gaussians you can fit an elephant, and if you use a sixth one, the animal will wave its tail.”

1975 (probably even earlier): Experimentalists used “Gaussian least-squares fits” to fit their results (to fit any curve they produced); photoemission, vibrational spectroscopy, etc.

Now we are using a sum over thousands of Gaussians to fit our data (KRR with Gaussian kernels; used in nearly all ML work).


For successful learning, we need a “good” descriptor: P(x_i) \to P(d_i)


For materials science and quantum mechanics we need knowledge-based, domain-specific approaches. Even fitting with >10,000 parameters may not capture the enormous variety and intricate nature of materials phenomena.

Statistical Learning (Machine Learning) vs. Compressed Sensing

Fit and/or interpolate known data points {P_i} and build a function P(d).

The key scientific challenge: find a reliable, low-dimensional descriptor d.

[Equations: minimization problems for kernel ridge regression and for the linear model, each a sum of two terms.]


least absolute shrinkage and selection operator (LASSO) for feature selection

R. Tibshirani, J. Royal Statist. Soc. B 58, 267 (1996)


l2 norm of the vector (x_1, y_1): \sqrt{x_1^2 + y_1^2}

l1 norm of the vector (x_1, y_1): |x_1| + |y_1|, the Manhattan (taxi cab) distance
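To illustrate how an l1 penalty selects features, here is a small LASSO sketch. It minimizes (1/2n) ||P - D c||_2^2 + lam ||c||_1 by cyclic coordinate descent with soft thresholding, which is one standard way to solve the problem and not necessarily the solver used in the work presented here. The toy data, the value of lam, and the function names are assumptions.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the l1 proximal step)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def lasso(features, target, lam, n_sweeps=200):
    """Minimize (1/2n) * ||target - features @ c||_2^2 + lam * ||c||_1
    by cyclic coordinate descent."""
    n_samples, n_features = features.shape
    coeffs = np.zeros(n_features)
    col_norm_sq = np.sum(features ** 2, axis=0) / n_samples
    for _ in range(n_sweeps):
        for j in range(n_features):
            # Residual with the current contribution of feature j removed.
            partial = target - features @ coeffs + features[:, j] * coeffs[j]
            rho = features[:, j] @ partial / n_samples
            coeffs[j] = soft_threshold(rho, lam) / col_norm_sq[j]
    return coeffs


# Toy problem: 5 candidate features, only features 0 and 3 matter.
rng = np.random.default_rng(0)
features = rng.normal(size=(60, 5))
target = 2.0 * features[:, 0] - 1.5 * features[:, 3]

coeffs = lasso(features, target, lam=0.1)
selected = [j for j, c in enumerate(coeffs) if abs(c) > 1e-8]
```

The l1 term drives the coefficients of irrelevant features exactly to zero, so the nonzero entries of `coeffs` act as the selected features; an l2 penalty would only shrink them.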


Primary Features for the Rock Salt/Zincblende Structure Prediction

[Figure: primary features taken from free atoms and free dimers.]

Enabling Feature Spaces with Billions of Elements by Sure Independence Screening

Runhai Ouyang

1. Systematically construct a huge feature space (10^11 elements) from \Phi_0 (23 primary features) with the operator set R = {+, -, \cdot, ^{-1}, ^2, ^3, \sqrt{\ }, \exp, \log, |\ |}.

2. Select top-ranked features using Sure Independence Screening (SIS) [1] (correlation learning): select the n features with the n largest projections on the target property, i.e. the largest components of the vector D^T y.

D: matrix of the feature space (82 x 100 billion elements)

y: target property (here: rock salt-zincblende energy differences; 82 elements)

[1] J. Fan and J. Lv, J. R. Statist. Soc. B 70, 849 (2008)
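The two steps can be sketched at toy scale as follows. This is an illustration only: the two primary features `a` and `b`, the nine candidate features, and the synthetic target are made up, and standardizing the columns of D before taking the projections is an assumption of this sketch.

```python
import numpy as np


def sure_independence_screening(feature_matrix, target, n_keep):
    """Return indices of the n_keep columns with the largest |D^T y|,
    after standardizing each column of D and centering y."""
    d = feature_matrix - feature_matrix.mean(axis=0)
    d = d / np.linalg.norm(d, axis=0)
    y = target - target.mean()
    scores = np.abs(d.T @ y)
    return np.argsort(scores)[::-1][:n_keep]


# Toy primary features (hypothetical names a, b) for 82 "materials".
rng = np.random.default_rng(1)
a = rng.uniform(1.0, 2.0, size=82)
b = rng.uniform(1.0, 2.0, size=82)

# Step 1: a tiny feature space built with a few of the operators in R.
candidates = {
    "a": a, "b": b, "a+b": a + b, "a-b": a - b, "a*b": a * b,
    "a^2": a ** 2, "b^-1": 1.0 / b, "exp(a)": np.exp(a), "|a-b|": np.abs(a - b),
}
names = list(candidates)
D = np.column_stack([candidates[name] for name in names])

# Step 2: rank the features by their projection on the target property.
y = 3.0 * (a - b) + 0.5  # synthetic target, exactly linear in "a-b"
top = [names[i] for i in sure_independence_screening(D, y, n_keep=3)]
```

Because the screening needs only one matrix-vector product, it stays affordable even when D has billions of columns, which is the point of using it before any more expensive selection step.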


Iterative Application of Sure Independence Screening and l0

Huge feature space (100 billion elements) constructed from \Phi_0 (23 elements)

1-dim. descriptor: S_1 = SIS(y) (~1,000 elements), then l0 on S_1

2-dim. descriptor: S_2 = SIS(R_1), then l0 on S_2 \cup S_1

n-dim. descriptor: S_n = SIS(R_{n-1}), then l0 on S_n \cup \bigcup_{i=1}^{n-1} S_i

SIS = sure independence screening; S_i = feature subspace

R_n: residual of the n-dim. descriptor w.r.t. y; e.g. R_1 = y - (c_1 d_1 + c_0)
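The iteration can be sketched in a few lines at toy scale. The sketch assumes that the l0 step is an exhaustive least-squares search over all feature tuples in the pooled subspaces, uses a random 82 x 50 matrix in place of the real feature space, and uses hypothetical function names; it illustrates the scheme and is not the authors' code.

```python
from itertools import combinations

import numpy as np


def sis(features, vector, n_keep, exclude=()):
    """Indices of the n_keep features most correlated with vector."""
    centered = features - features.mean(axis=0)
    centered = centered / np.linalg.norm(centered, axis=0)
    scores = np.abs(centered.T @ (vector - vector.mean()))
    if exclude:
        scores[list(exclude)] = -1.0  # never reselect pooled features
    return list(np.argsort(scores)[::-1][:n_keep])


def least_squares_residual(features, columns, target):
    """Residual of the best linear fit (with intercept) on the columns."""
    design = np.column_stack([features[:, list(columns)], np.ones(len(target))])
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coeffs


def sis_l0(features, target, max_dim, n_keep):
    """Iterate SIS on the residual, then exhaustive l0 search on the pool."""
    pool, residual, best = [], target, ()
    for dim in range(1, max_dim + 1):
        pool += sis(features, residual, n_keep, exclude=pool)
        best = min(
            combinations(pool, dim),
            key=lambda cols: np.sum(
                least_squares_residual(features, cols, target) ** 2
            ),
        )
        residual = least_squares_residual(features, best, target)
    return best


# Toy data: 82 samples, 50 candidate features, 2 of them build the target.
rng = np.random.default_rng(2)
features = rng.normal(size=(82, 50))
target = 1.0 * features[:, 7] + 0.3 * features[:, 21] + 2.0

descriptor = sorted(int(i) for i in sis_l0(features, target, max_dim=2, n_keep=5))
```

Screening the residual rather than y in the second round is what lets the weaker feature (index 21 here) enter the pool even though it is only weakly correlated with the target itself.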

“The Map”: Compressed Sensing, 2-Dimensional Descriptor

The complexity and science is in the descriptor, identified from >10,000 features.

Using no information on BN and C we would have predicted the existence and unusual stability of these materials.

L.M. Ghiringhelli, J. Vybiral, S.V. Levchenko, C. Draxl, and M. Scheffler, PRL 114 (2015).


https://nomad-coe.eu

Request by colleagues from industry

Give guidelines about …

Date for special issues missing

Crystal structure prediction (probably the most fundamental and important challenge in materials science)

Predicting energy differences between crystal structures

Building structure maps for crystal-structure classification

Topological Insulators (Quantum Spin Hall Systems)

[Figure: band diagrams (energy versus momentum, Fermi level marked) of a trivial insulator and of a topological insulator with surface states; the topological transition between them is driven by spin-orbit coupling, electric field, or strain.]

2D topological insulators: no backscattering in edge states.

Promising materials for spintronics applications.

Characterized by a Z2 topological index.


Towards an Understanding of Topological Insulators

Functionalized Honeycomb (2D) Systems

ABC2 Compounds

Calculate geometry, band structure, and Z2 for 220 materials.


ABC2 compounds functionalized with F, Cl, Br, I are represented by diamonds, squares, circles, and triangles, respectively.

Metals: green

Functionalization-dependent quantum spin Hall insulators (QSHIs): red

Functionalization-independent QSHIs: blue

Trivial insulators: white/grey

Colored background: compressed sensing


The identified descriptors only include properties of the free atoms!


The descriptor identifies materials that do not follow commonly believed rules of thumb.

Success Stories and Ongoing Projects (at the NOMAD home page)


Catalysis

[Figure: energy versus reaction coordinate, showing the energy of the free reactants (e.g. CO and O2), the energy of the products (e.g. CO2), and their difference ΔE.]

Jöns Jakob Berzelius (1779-1848)

Wilhelm Ostwald (1853-1932), Nobel prize for chemistry 1909

A catalyst is a material that accelerates the kinetics of a chemical reaction but is not part of the final chemical product.

Transform CO2 or CO Into Something Useful

Carbonaceous Acid

Formaldehyde

Methanol

Methane

We need better catalysts!

Our understanding of the dynamics and kinetics of heterogeneous catalysis is very shallow. The structure and composition of the material that is catalytically active (at realistic p and high T conditions) are largely unknown.


Significant Progress … But Our Knowledge Is Still Close to Zero

About 240,000 inorganic compounds have been synthesized so far. Many more are possible.

And what do we know?

Elastic constants: ≈ 200 compounds

Superconductors: ≈ 1000 compounds

Dielectric constant: ≈ 300-400 compounds

For almost every property we are below 1% in coverage ….
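The coverage claim follows from simple arithmetic on the numbers quoted above. A short sketch, taking the upper figure of each range and treating the 240,000 synthesized compounds as the reference set:

```python
known_compounds = 240_000  # inorganic compounds synthesized so far

# Approximate number of compounds with known values, as quoted on the slide
# (upper end of each range).
measured = {
    "elastic constants": 200,
    "superconductors": 1000,
    "dielectric constant": 400,
}

coverage_percent = {
    prop: 100.0 * count / known_compounds for prop, count in measured.items()
}
```

Even the best-covered entry, 1000 out of 240,000, is roughly 0.4 %, and elastic constants come to less than 0.1 %.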


The NOMAD Laboratory: A European Centre of Excellence

We need to develop domain-specific compressed-sensing and other machine-learning tools to identify causal descriptors (physical models).

The number of different materials is huge. However, the number of materials that exhibit a certain function is rather small, i.e. the space of chemical and structural compounds is sparsely populated.

[Figure: relevance of a new technology versus time, with one curve for perception and one for reality; big-data analytics in materials science is marked “we are probably here”.]